=> EXPLAIN SELECT f.flight_id, bp.seat_no FROM flights f JOIN boarding_passes bp ON bp.flight_id = f.flight_id;
QUERY PLAN
-----------------------------------------------------------------------------------
Hash Join (cost=7985.51..225367.71 rows=7925845 width=7)
Hash Cond: (bp.flight_id = f.flight_id)
-> Seq Scan on boarding_passes bp (cost=0.00..133919.45 rows=7925845 width=7)
-> Hash (cost=4564.67..4564.67 rows=214867 width=4)
-> Seq Scan on flights f (cost=0.00..4564.67 rows=214867 width=4)
(5 rows)
Использовано соединение хешированием.
=> EXPLAIN (COSTS OFF, ANALYZE) SELECT f.flight_id, bp.seat_no FROM flights f JOIN boarding_passes bp ON bp.flight_id = f.flight_id;
QUERY PLAN
-----------------------------------------------------------------------------------------
Hash Join (actual time=220.598..10245.637 rows=7925812 loops=1)
Hash Cond: (bp.flight_id = f.flight_id)
-> Seq Scan on boarding_passes bp (actual time=0.499..3678.391 rows=7925812 loops=1)
-> Hash (actual time=219.576..219.577 rows=214867 loops=1)
Buckets: 262144 Batches: 2 Memory Usage: 3553kB
-> Seq Scan on flights f (actual time=0.482..110.486 rows=214867 loops=1)
Planning time: 0.248 ms
Execution time: 12249.401 ms
(8 rows)
Хеш-таблица не поместилась целиком в память, потребовалось 2 пакета.
=> EXPLAIN SELECT count(*) FROM flights f JOIN boarding_passes bp ON bp.flight_id = f.flight_id;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=136907.22..136907.23 rows=1 width=8)
-> Gather (cost=136907.01..136907.22 rows=2 width=8)
Workers Planned: 2
-> Partial Aggregate (cost=135907.01..135907.02 rows=1 width=8)
-> Hash Join (cost=7985.51..127650.92 rows=3302435 width=0)
Hash Cond: (bp.flight_id = f.flight_id)
-> Parallel Seq Scan on boarding_passes bp (cost=0.00..87685.35 rows=3302435 width=4)
-> Hash (cost=4564.67..4564.67 rows=214867 width=4)
-> Seq Scan on flights f (cost=0.00..4564.67 rows=214867 width=4)
(9 rows)
Здесь планировщик использовал параллельный план. В предыдущем запросе это не было оправдано из-за высокой стоимости пересылки данных между процессами, а в данном случае передается только одно число.
=> EXPLAIN (COSTS OFF, ANALYZE) SELECT count(*) FROM flights f JOIN boarding_passes bp ON bp.flight_id = f.flight_id;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (actual time=11603.584..11603.585 rows=1 loops=1)
-> Gather (actual time=11594.714..11614.365 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (actual time=11583.669..11583.670 rows=1 loops=3)
-> Hash Join (actual time=518.764..9210.885 rows=2641937 loops=3)
Hash Cond: (bp.flight_id = f.flight_id)
-> Parallel Seq Scan on boarding_passes bp (actual time=0.025..2976.753 rows=2641937 loops=3)
-> Hash (actual time=514.073..514.074 rows=214867 loops=3)
Buckets: 262144 Batches: 2 Memory Usage: 3553kB
-> Seq Scan on flights f (actual time=0.022..254.237 rows=214867 loops=3)
Planning time: 0.266 ms
Execution time: 11614.454 ms
(13 rows)
Обратите внимание на поле loops в узлах выше и ниже Gather - оно соответствует реальному числу процессов, работавших над запросом.