Data Engineering
Viewing all posts categorized under Data Engineering.
Data Mesh Adoption in Real Enterprises
Adoption metrics and architectural patterns for domain-driven Data Mesh in enterprises.
PostgreSQL 9.6: Parallel Query Execution and Scale-Out Architecture
Analyzing the PostgreSQL 9.6 release, detailing its multi-core parallel query execution engine, parallel sequential scans, and index joins.
TensorFlow Open Source: Computation Graphs and Declarative Machine Learning Pipelines
Analyzing Google's open-source release of TensorFlow in late 2015. We detail the mechanics of declarative computational graphs and tensors.
Apache Spark 1.4: Introducing DataFrames and Spark SQL for Distributed Datasets
Analyzing the release of Apache Spark 1.4 in mid-2015. We break down the new DataFrame API, the Catalyst optimizer, and Spark SQL query execution.
Database Index Fragmentation: Diagnosing and Rebuilding SQL Server B-Trees
An engineering guide to database maintenance in late 2014, detailing index fragmentation analysis and rebuild commands.
IndexedDB vs. LocalStorage: Choosing Client-Side Databases for Web Apps
A comparative review of client-side storage options in late 2014, comparing synchronous LocalStorage against transactional IndexedDB.
SQL Server Auditing: Monitoring Database Access and Tracking Audit Trails
An engineering guide to configuring database-level auditing in SQL Server 2014, detailing audit specifications and compliance tracking.
SQL Server 2014 In-Memory OLTP: Speeding Up Writes with Hekaton Tables
An architectural review of SQL Server 2014's Hekaton In-Memory OLTP engine, analyzing lock-free index structures and compiled queries.
Database Sharding Patterns: Architecting Horizontal Scale-Out for Web SaaS
An architectural guide to database sharding for SaaS platforms in late 2013, detailing sharding algorithms and query coordination.
Clustered Columnstore Indexes in SQL Server 2014: Columnar Storage for OLAP Databases
Exploring the announced Clustered Columnstore Indexes in SQL Server 2014, detailing read-write optimizations and data compression.
Hadoop 2.0 YARN: Splitting Resource Management from MapReduce Computation
An architectural review of YARN in Hadoop 2.0, detailing how splitting resource allocation from execution enables multi-tenant clusters.
MongoDB Replica Set Elections: Maintaining High Availability During Primary Node Failures
An architectural review of MongoDB replica set election mechanics in early 2013, analyzing heartbeat windows, quorum calculations, and write concerns.
Real-time Data: Why Apache Spark is Replacing MapReduce Batching
An architectural review of Apache Spark in late 2012, analyzing Resilient Distributed Datasets (RDD) and in-memory processing speeds.
Redis 2.6: Lua Scripting, Server-Side Scripts, and Commands
Exploring the release of Redis 2.6 in late 2012, detailing server-side Lua scripting, transaction security, and cache performance.
Apache Cassandra 1.1: Multi-Data Center Replication for Enterprise
Reviewing Apache Cassandra 1.1 release features, examining gossip communication protocol configurations and multi-datacenter data replication.
High-Performance Columnstore Indexes in SQL Server 2012
A deep dive into SQL Server 2012 Columnstore Indexes, evaluating column storage structures and batch processing for data warehousing.
MongoDB 2.0: Concurrency, Indexing, and Enterprise Readiness
Reviewing the release of MongoDB 2.0 in late 2011, examining improvements to concurrency locks, index sizes, and automatic failovers.
SQL Server 2012 AlwaysOn: Rethinking Database Disaster Recovery
An evaluation of high availability features in the SQL Server 2012 release previews, focusing on AlwaysOn Availability Groups.
Entity Framework 4.1: Code-First Development and DbContext API
Exploring the landmark release of Entity Framework 4.1 in March 2011, detailing the transition to Code-First development and the simplified DbContext object.