Data Engineering
Viewing all posts categorized under Data Engineering.
Timeline
Database Normalization vs. Denormalization for Web-Scale Performance
Balancing database normalization (3NF), which prevents update anomalies, against denormalization, which accelerates high-volume queries.
Entity Framework 4.0: Resolving the Object-Relational Impedance Mismatch
A late-2010 look at Entity Framework 4.0. We examine POCO support, lazy loading, and code-first ORM patterns.
Hadoop and MapReduce: Demystifying Big Data Processing for the Enterprise
An architectural guide to Apache Hadoop in mid-2010. We discuss HDFS clusters, MapReduce job execution, and structured big data parsing.
Designing High-Performance SQL Indexes: A Masterclass in Query Optimization
A deep dive into database internals, examining B-Tree layouts, page reads, and how to structure composite indexes to satisfy query execution plans.
The Rise of NoSQL: Evaluating MongoDB and Cassandra for Scale-Out Architectures
A look at the late-2009 / early-2010 buzz around NoSQL databases. We analyze document stores (MongoDB) vs. wide-column stores (Cassandra) for horizontal scalability.
The Oracle-Sun Merger: What the Future Holds for Java and Enterprise Databases
An in-depth look at the monumental Oracle-Sun acquisition of January 2010, its ramifications for the Java community, OpenJDK governance, and the future of open-source MySQL.