Practice
Techniques and patterns for warehouse delivery.
Focused how-to articles on the specific moves that determine whether a warehouse holds up in production. The Foundations pillars cover the model; Practice covers the work.
14 entries
Technique
Advanced dimensional modeling: bridge tables, inferred members, multi-timezone, and the awkward cases
How to model the dimensional cases the textbook example never quite covers: multivalued dimensions and bridge tables, inferred members for late-arriving dimensions, free-text comments, and facts that span multiple time zones.
Technique
Building a data warehouse: a four-phase practitioner's playbook
How warehouse projects actually get built, organized as discovery, design, development, and deployment, with the Kimball-versus-Inmon design choice treated as a concrete decision rather than an academic debate.
Technique
Change data capture: implementation strategies
How log-based, timestamp-based, and trigger-based change data capture actually work in production, including the initial snapshot handoff, schema evolution failure modes, and the operational disciplines that keep CDC pipelines correct.
Technique
Data cleansing in the warehouse: where it belongs and what it does
Where data cleansing sits in a modern warehouse load: the staging-to-curated boundary, the rule categories that catch real defects, the test-at-the-transform-layer pattern, and the observability that catches the drift the rules miss.
Technique
Data extraction models: full, incremental, log-based, query-based, file-based, API, and streaming
The seven data extraction patterns a warehouse encounters in practice, what each one assumes about the source, where each one fails, and how the modern connector stack (Fivetran, Airbyte, Estuary, Debezium, Kafka) decides between them.
Technique
Data masking in the data warehouse
How static, dynamic, and on-the-fly data masking actually work in a cloud warehouse, including the mask-before-load versus mask-in-warehouse axis, column-level masking policies on Snowflake, BigQuery, and Databricks, and the trade-offs between tokenization, encryption, and hashing under GDPR, CCPA, and HIPAA.
Technique
Data modeling phases: conceptual, logical, and physical
How conceptual, logical, and physical data models actually divide warehouse design work in 2026, including where data contracts and dbt fit, and the handoffs that determine whether the model survives production.
Technique
Data virtualization: federated query in modern warehouse stacks
How data virtualization works as a technique, where it overlaps with and differs from federated query and the logical data warehouse, where it fits in cloud warehouse stacks, and the failure modes that determine whether virtualization holds up in production.
Technique
Data warehouse metadata: catalogs, lineage, and the metadata repository in 2026
How technical, business, and operational metadata get organized in a modern warehouse stack, including the shift from monolithic metadata repositories to federated data catalogs, dbt-driven lineage, and OpenLineage as the cross-tool standard.
Technique
Data warehouse testing: validation, regression, and performance
What to test in a production warehouse pipeline, where each kind of test lives, and how dbt tests, Great Expectations, and contract patterns fit together without producing a green dashboard over wrong data.
Technique
Logical data warehouse: the architectural pattern
The logical data warehouse unifies a physical warehouse with lakehouses, operational stores, and SaaS sources behind a single query layer. How the pattern actually works in 2026, where it fits, and where it quietly breaks.
Technique
Normalization and denormalization in data warehousing
Normalization versus denormalization for analytical workloads: where 3NF still belongs in a 2026 warehouse, why columnar engines have made denormalization the default for query layers, and how to think about the trade-off layer by layer.
Technique
Slowly changing dimensions: implementation strategies
How SCD Type 1, 2, 3, and the hybrid types actually work in a production warehouse, including active row identification, fact loading under Type 2, and the edge cases that bite teams in practice.
Technique
Surrogate key management: generation, lookup, and the cases that bite
How to generate and manage surrogate keys in a 2026 cloud warehouse: integer sequences, hash-based deterministic keys, UUID v7, the fact-loading lookup under Type 2 SCD, and the edge cases that produce silent errors.
