PostgreSQL 9.6: Parallel Query Execution and Scale-Out Architecture

Unleashing multi-core performance. We explore parallel scans, multi-worker joins, and vacuuming options.

VP
SHIVAM ITCS
·22 August 2016·10 min read·1 views

Technical Overview & Strategic Context

Relational database engines have traditionally executed single SQL queries on a single CPU thread. In an era of multi-core servers, this execution constraint limits performance for heavy analytical queries on large datasets. The release of PostgreSQL 9.6 addresses this bottleneck by introducing a parallel query execution engine. This engine allows PostgreSQL to distribute query workloads across multiple background worker threads, utilizing multi-core hardware to accelerate query speeds.

Architectural Principle: Design database schemas to leverage parallel query planners. Scale query execution by distributing large table scans across multiple CPU worker threads.

Core Concepts & Architectural Blueprint

PostgreSQL 9.6 parallelizes query execution by launching background worker processes. The query planner identifies portions of execution plans (like sequential scans, aggregations, and joins) that can be run in parallel. Each worker thread processes a partition of the table, and the results are aggregated by a coordinator process, speeding up analytical queries.

Performance & Capability Comparison

Query OperationPre-PostgreSQL 9.6PostgreSQL 9.6 Parallel ModePerformance Scale
Sequential ScansExecuted on a single CPU threadDistributed across CPU worker threadsAggregates large tables faster
Hash JoinsSingle-threaded hash generationWorkers compile hash tables concurrentlySpeeds up join execution times
AggregationsSingle-threaded count/sum loopsWorkers calculate partial aggregationsReduces analytical query delays

Implementation & Code Pattern

To configure PostgreSQL 9.6 for parallel query execution, check these database settings:

  • Set max_worker_processes to allocate background threads for queries.
  • Configure max_parallel_workers_per_gather to limit threads per query.
  • Set min_parallel_relation_size to enable parallel queries on larger tables.
  • Verify query parallelization using SQL EXPLAIN commands.
sqlcode
-- Configuring and verifying parallel execution settings in PostgreSQL 9.6
-- Enable parallel workers on the session
SET max_parallel_workers_per_gather = 4;
SET force_parallel_mode = on;

-- Analyze query execution plans for large transaction tables
EXPLAIN ANALYZE
SELECT category, SUM(amount), COUNT(*)
FROM sales_records
GROUP BY category;

-- Query output displays parallel worker allocations:
-- ->  Gather  (cost=1000.00..50000.00 rows=1000000 width=40)
--       Workers Planned: 3
--       ->  Partial HashAggregate  (cost=10000.00..12000.00 rows=333333 width=40)
--             ->  Parallel Seq Scan on sales_records  (cost=0.00..8000.00 rows=333333 width=20)

Operational Governance & Future Outlook

PostgreSQL 9.6's introduction of parallel query execution enabled multi-core database optimizations. Distributing query workloads across worker threads allows PostgreSQL to support high-performance analytical queries.

VP
Vijay Paliwal
Founder, SHIVAM ITCS · 18+ years enterprise & AI engineering
MCA · Ex-HiveGPT USA · Ex-Social27 Seattle
PostgreSQL 9.6: Parallel Query Execution and Scale-Out Architecture | SHIVAM ITCS Blog | SHIVAM ITCS