Topic
Warehouse Fundamentals
What a data warehouse is, the architectural patterns, the relationship to lakes and lakehouses, and where the warehouse sits in the modern data stack.
22 entries
Techniques
7
Technique
Building a data warehouse: a four-phase practitioner's playbook
How warehouse projects actually get built, organized as discovery, design, development, and deployment, with the Kimball-versus-Inmon design choice treated as a concrete decision rather than an academic debate.
Technique
Data extraction models: full, incremental, log-based, query-based, file-based, API, and streaming
The seven data extraction patterns a warehouse encounters in practice, what each one assumes about the source, where each one fails, and how the modern connector stack (Fivetran, Airbyte, Estuary, Debezium, Kafka) decides between them.
Technique
Data masking in the data warehouse
How static, dynamic, and on-the-fly data masking actually work in a cloud warehouse, including the mask-before-load versus mask-in-warehouse axis, column-level masking policies on Snowflake, BigQuery, and Databricks, and the trade-offs between tokenization, encryption, and hashing under GDPR, CCPA, and HIPAA.
Technique
Data modeling phases: conceptual, logical, and physical
How conceptual, logical, and physical data models actually divide warehouse design work in 2026, including where data contracts and dbt fit, and the handoffs that determine whether the model survives production.
Technique
Data virtualization: federated query in modern warehouse stacks
How data virtualization works as a technique, what it shares with and how it differs from federated query and the logical data warehouse, where it fits in cloud warehouse stacks, and the failure modes that determine when virtualization holds up in production.
Technique
Data warehouse metadata: catalogs, lineage, and the metadata repository in 2026
How technical, business, and operational metadata get organized in a modern warehouse stack, including the shift from monolithic metadata repositories to federated data catalogs, dbt-driven lineage, and OpenLineage as the cross-tool standard.
Technique
Logical data warehouse: the architectural pattern
The logical data warehouse unifies a physical warehouse with lakehouses, operational stores, and SaaS sources behind a single query layer. How the pattern actually works in 2026, where it fits, and where it quietly breaks.
Comparisons
3
Comparison
Data warehouse vs data lake vs data mart vs lakehouse
Data warehouse vs data lake vs data mart vs lakehouse: four distinct architectural commitments, what each one actually is, how they compare on storage, governance, query engine, and workload, and when each is the right choice in a 2026 stack.
Comparison
ETL vs ELT
ETL vs ELT: what the order of operations actually changes, why cloud columnar warehouses shifted the default from ETL to ELT, the trade-offs that determine which pattern fits a given workload, and a note on where reverse ETL fits.
Comparison
OLTP vs OLAP
OLTP vs OLAP: what the two database categories are actually optimized for, where the workload boundary used to be sharp, how columnar warehouses and HTAP systems have blurred it, and the trade-offs that determine which side a given workload belongs on.
Glossary
10
Glossary
Abstraction layer
In warehouse architecture, a layer that hides physical or implementation detail so the layer above can address data in business terms. Most often refers to the semantic layer between warehouse tables and BI tools.
Glossary
Data catalog
A searchable index over the metadata of the data assets in an analytics platform: tables, columns, dashboards, models, owners, descriptions, and lineage, federated from the upstream tools that produce each piece.
Glossary
Data fabric
A metadata-driven architecture that unifies heterogeneous data sources through an active catalog, automated governance, and a federated query layer.
Glossary
Data lake
Object storage of raw, varied, schema-on-read data. The storage layer for analytical workloads that don't fit the warehouse's structured model.
Glossary
Data lakehouse
Object storage plus an open table format (Iceberg, Delta, or Hudi), exposing lake-style data through a warehouse-style table abstraction with ACID transactions, schema enforcement, and time travel.
Glossary
Data lineage
The recorded graph of how a data value flows from source to destination across the pipeline: which sources fed which models fed which dashboards, at table or column granularity, derived from build artifacts and runtime events rather than maintained by hand.
Glossary
Data mart
A curated subset of a data warehouse, organized around a single department or subject area. Not a different architectural pattern; a deployment style of warehouse content.
Glossary
Enterprise data warehouse (EDW)
An integrated, organization-wide data warehouse that consolidates analytical data across business units, in contrast to a single departmental mart. In modern cloud deployments the qualifier is mostly redundant.
Glossary
Federated query
A query that executes across multiple underlying data stores through a single engine, with the engine pushing predicates down to each source and combining results.
Glossary
Semantic layer
A modeling abstraction between physical warehouse tables and BI tools that defines business entities, metrics, and dimensions once, so downstream consumers query consistent definitions rather than rebuilding them per report.
