Different Data Warehouse Architectures

Solutions Review’s Tim King created this introductory resource to highlight the different data warehouse architectures for your reference.
Enterprise Data Warehouse (EDW) architectures define the structure and components of an organization’s data warehousing environment. These architectures are designed to support the storage, management, and analysis of data from various sources. Here are some of the different types of EDW architectures:
- Single-Tier (Single-Layer) Architecture:
- In a single-tier EDW architecture, all components, including data extraction, transformation, loading (ETL), storage, and querying, are hosted on a single server or system.
- This architecture is suitable for small-scale data warehousing needs, but it lacks scalability and fault tolerance.
- Two-Tier (Two-Layer) Architecture:
- A two-tier EDW architecture separates the ETL processes from the data storage and querying components.
- ETL processes are typically executed on one server, while the data storage and querying components reside on another server or set of servers.
- This separation allows for better scalability and performance.
- Three-Tier (Three-Layer) Architecture:
- A three-tier EDW architecture divides the components into three layers:
- Data Sources Layer: This layer includes the source systems and databases where raw data originates.
- Data Warehouse Layer: This is the central repository for cleaned, transformed, and integrated data. It includes the data warehouse database.
- Front-End Layer: The front-end layer provides tools and interfaces for querying and reporting on the data.
- This architecture enhances scalability, maintainability, and security.
- A three-tier EDW architecture divides the components into three layers:
- Hub-and-Spoke (Hub-and-Spoke with Satellite) Architecture:
- The hub-and-spoke architecture, also known as the hub-and-spoke with satellite architecture, involves a central hub and multiple spokes.
- The central hub serves as the primary data repository, while the spokes represent data marts or department-specific warehouses.
- Data from the hub is selectively replicated to the spokes as needed.
- This architecture allows for both centralized data control and decentralized department-specific reporting.
- Data Warehouse Bus Architecture:
- The data warehouse bus architecture extends the three-tier architecture by incorporating a bus structure connecting data marts.
- Each data mart represents a specific subject area, and they share common dimensions and facts from the central data warehouse.
- This architecture supports both enterprise-wide reporting and departmental analytics.
- Federated (Virtual) Architecture:
- A federated EDW architecture integrates data from various source systems on an as-needed basis, without physically centralizing the data.
- It provides a unified view of data across distributed sources, enabling real-time access to data without data duplication.
- Federated architectures are suitable when centralization is not feasible or necessary.
- Data Lake Integration:
- Many organizations incorporate a data lake into their data warehousing architecture.
- Data lakes store raw and unprocessed data from a variety of sources, which can then be ingested, processed, and integrated into the data warehouse as needed.
- This approach allows for flexibility in handling diverse data types and supports big data and advanced analytics use cases.
The choice of EDW architecture depends on factors such as the organization’s data management strategy, scalability requirements, data integration complexity, and the need for departmental or centralized analytics. Many enterprises may adopt a hybrid approach that combines elements of multiple architectures to meet their specific needs.