The Cost of Table Scans
As databases grow from thousands to millions of rows, query performance degrades rapidly if the database engine must perform full table scans to find records. Creating indexes is the most effective way to optimize relational queries.
However, poorly designed indexes can slow down write performance (INSERT, UPDATE, DELETE) and fail to speed up reads.
Understanding B-Tree Index Structure
Most relational DBMS engines (such as Microsoft SQL Server, PostgreSQL, and Oracle) utilize B-Tree (Balanced Tree) structures for indexing.
- ◆Root Node: The entry point for the index traversal.
- ◆Intermediate Nodes: Direct the search navigation to the correct sub-branches.
- ◆Leaf Nodes: Contain the actual data pages (Clustered Index) or pointers to data rows (Non-Clustered Index).
Traversing a B-Tree takes logarithmic time—meaning finding a single record in a 1,000,000-row table takes only a few page reads instead of scanning 1,000,000 rows.
Clustered vs. Non-Clustered Indexes
| Index Type | Storage Behavior | Maximum per Table |
|---|---|---|
| Clustered | Determines the physical order of rows on disk. Data pages are at leaf level. | 1 |
| Non-Clustered | Contains a separate index structure with pointers to the data rows. | Multiple (e.g. 250+) |
Best Practice: Keep your clustered index key short, unique, and sequential (like an IDENTITY or SERIAL integer). Avoid using GUIDs as clustered index keys, as they cause massive page splits and fragment indexes.
Structuring Composite Indexes
When writing queries with multiple filters, a composite (multi-column) index is essential:
CREATE INDEX IX_Users_Status_Country ON Users (Status, Country);The order of columns matters! Place the column with the highest selectivity (most unique values) or the columns filtered with exact equality checks first.
Reviewing Execution Plans
Before deploying an index, always review the Execution Plan in your database client. Look for:
- ◆Index Seek: The engine traverses the B-Tree directly (Fast).
- ◆Index Scan: The engine reads the entire index table (Slow).
- ◆Key Lookup: The engine found the pointer in the non-clustered index but had to fetch additional columns from the clustered index (causes random I/O overhead).