High-Performance Columnstore Indexes in SQL Server 2012

Rethinking data warehouse storage. We analyze column-oriented page compressions and batch-mode query processing.

VP
SHIVAM ITCS
·25 February 2012·10 min read·1 views

Row-Oriented Storage vs. Columnar Storage

Traditional database engines store table records as sequential rows on disk pages. While ideal for transaction processing (OLTP) where you read and write a single user row at a time, row-oriented layouts are highly inefficient for analytics (OLAP) queries.

If an analytics query calculates the average purchase value across 50,000,000 orders, the database must read all columns of all rows from disk, consuming massive I/O bandwidth.

SQL Server 2012 addresses this with xVelocity Columnstore Indexes.

The Columnstore Architecture

Instead of storing data row-by-row, columnstore indexes group and compress data column-by-column:

  • Data Compression: Because values within a column share similar data types and values, algorithms can compress columnar data by up to 10x compared to row-based equivalents.
  • Reduced Disk I/O: The query engine only reads the columns specified in the SQL select statement, reducing disk read requirements.
sqlcode
-- Creating a non-clustered columnstore index in SQL Server 2012
CREATE NONCLUSTERED COLUMNSTORE INDEX CSI_OrderFact
ON OrderFact (OrderDateKey, ProductKey, Quantity, TotalAmount);

Batch Mode Processing

In addition to storage layout changes, SQL Server 2012 introduces Batch Mode Execution. Instead of processing rows one-by-one, the query optimizer processes batches of roughly 900 rows at a time, maximizing CPU cache utilization and speed.

VP
Vijay Paliwal
Founder, SHIVAM ITCS · 18+ years enterprise & AI engineering
MCA · Ex-HiveGPT USA · Ex-Social27 Seattle
High-Performance Columnstore Indexes in SQL Server 2012 | SHIVAM ITCS Blog | SHIVAM ITCS