Row-Oriented Storage vs. Columnar Storage
Traditional database engines store table records as sequential rows on disk pages. While ideal for transaction processing (OLTP) where you read and write a single user row at a time, row-oriented layouts are highly inefficient for analytics (OLAP) queries.
If an analytics query calculates the average purchase value across 50,000,000 orders, the database must read all columns of all rows from disk, consuming massive I/O bandwidth.
SQL Server 2012 addresses this with xVelocity Columnstore Indexes.
The Columnstore Architecture
Instead of storing data row-by-row, columnstore indexes group and compress data column-by-column:
- ◆Data Compression: Because values within a column share similar data types and values, algorithms can compress columnar data by up to 10x compared to row-based equivalents.
- ◆Reduced Disk I/O: The query engine only reads the columns specified in the SQL select statement, reducing disk read requirements.
-- Creating a non-clustered columnstore index in SQL Server 2012
CREATE NONCLUSTERED COLUMNSTORE INDEX CSI_OrderFact
ON OrderFact (OrderDateKey, ProductKey, Quantity, TotalAmount);Batch Mode Processing
In addition to storage layout changes, SQL Server 2012 introduces Batch Mode Execution. Instead of processing rows one-by-one, the query optimizer processes batches of roughly 900 rows at a time, maximizing CPU cache utilization and speed.