=> EXPLAIN SELECT f.flight_id, bp.seat_no FROM flights f JOIN boarding_passes bp ON bp.flight_id = f.flight_id;
QUERY PLAN ----------------------------------------------------------------------------------- Hash Join (cost=7985.51..225367.71 rows=7925845 width=7) Hash Cond: (bp.flight_id = f.flight_id) -> Seq Scan on boarding_passes bp (cost=0.00..133919.45 rows=7925845 width=7) -> Hash (cost=4564.67..4564.67 rows=214867 width=4) -> Seq Scan on flights f (cost=0.00..4564.67 rows=214867 width=4) (5 rows)
Использовано соединение хешированием.
=> EXPLAIN (COSTS OFF, ANALYZE) SELECT f.flight_id, bp.seat_no FROM flights f JOIN boarding_passes bp ON bp.flight_id = f.flight_id;
QUERY PLAN ----------------------------------------------------------------------------------------- Hash Join (actual time=220.598..10245.637 rows=7925812 loops=1) Hash Cond: (bp.flight_id = f.flight_id) -> Seq Scan on boarding_passes bp (actual time=0.499..3678.391 rows=7925812 loops=1) -> Hash (actual time=219.576..219.577 rows=214867 loops=1) Buckets: 262144 Batches: 2 Memory Usage: 3553kB -> Seq Scan on flights f (actual time=0.482..110.486 rows=214867 loops=1) Planning time: 0.248 ms Execution time: 12249.401 ms (8 rows)
Хеш-таблица не поместилась целиком в память, потребовалось 2 пакета.
=> EXPLAIN SELECT count(*) FROM flights f JOIN boarding_passes bp ON bp.flight_id = f.flight_id;
QUERY PLAN ------------------------------------------------------------------------------------------------------------- Finalize Aggregate (cost=136907.22..136907.23 rows=1 width=8) -> Gather (cost=136907.01..136907.22 rows=2 width=8) Workers Planned: 2 -> Partial Aggregate (cost=135907.01..135907.02 rows=1 width=8) -> Hash Join (cost=7985.51..127650.92 rows=3302435 width=0) Hash Cond: (bp.flight_id = f.flight_id) -> Parallel Seq Scan on boarding_passes bp (cost=0.00..87685.35 rows=3302435 width=4) -> Hash (cost=4564.67..4564.67 rows=214867 width=4) -> Seq Scan on flights f (cost=0.00..4564.67 rows=214867 width=4) (9 rows)
Здесь планировщик использовал параллельный план. В предыдущем запросе это не было оправдано из-за высокой стоимости пересылки данных между процессами, а в данном случае передается только одно число.
=> EXPLAIN (COSTS OFF, ANALYZE) SELECT count(*) FROM flights f JOIN boarding_passes bp ON bp.flight_id = f.flight_id;
QUERY PLAN -------------------------------------------------------------------------------------------------------------------- Finalize Aggregate (actual time=11603.584..11603.585 rows=1 loops=1) -> Gather (actual time=11594.714..11614.365 rows=3 loops=1) Workers Planned: 2 Workers Launched: 2 -> Partial Aggregate (actual time=11583.669..11583.670 rows=1 loops=3) -> Hash Join (actual time=518.764..9210.885 rows=2641937 loops=3) Hash Cond: (bp.flight_id = f.flight_id) -> Parallel Seq Scan on boarding_passes bp (actual time=0.025..2976.753 rows=2641937 loops=3) -> Hash (actual time=514.073..514.074 rows=214867 loops=3) Buckets: 262144 Batches: 2 Memory Usage: 3553kB -> Seq Scan on flights f (actual time=0.022..254.237 rows=214867 loops=3) Planning time: 0.266 ms Execution time: 11614.454 ms (13 rows)
Обратите внимание на поле loops в узлах выше и ниже Gather - оно соответствует реальному числу процессов, работавших над запросом.