Different Data Warehouse Models

Solutions Review’s Tim King created this introductory resource to highlight the different data warehouse models for your reference.
Enterprise data warehouse (EDW) models are designed to facilitate the storage, organization, and retrieval of large volumes of data from various sources within an organization. These models help organizations consolidate and analyze data to support business intelligence and decision-making. There are several common types of EDW models:
- Centralized Data Warehouse:
- In a centralized EDW model, all data from various source systems is extracted, transformed, and loaded (ETL) into a single, centralized repository.
- This model provides a unified and standardized view of the organization’s data, making it easier to manage and analyze.
- It is well-suited for organizations with a relatively simple data environment and a single source of truth.
- Data Mart:
- A data mart is a subset of a data warehouse that focuses on specific business areas or departments.
- Data marts are typically designed to serve the needs of a particular group within the organization, such as sales, marketing, or finance.
- This model offers more focused and tailored data storage and reporting capabilities.
- Data Warehouse Bus Architecture:
- The bus architecture is an extension of the centralized data warehouse model.
- It involves creating multiple data marts, each focused on specific subject areas, while still maintaining a centralized repository for shared data.
- The data marts are connected through a common set of dimensions and facts, forming a “bus” structure.
- This model offers a balance between centralization and decentralization, allowing for both enterprise-wide reporting and department-specific analysis.
- Hub-and-Spoke Architecture:
- In the hub-and-spoke architecture, a central data warehouse (the hub) serves as the main repository for shared data.
- Data marts (the spokes) are connected to the central hub, and each spoke is responsible for a specific business area.
- This model allows for centralized control of core data while accommodating department-specific needs.
- Federated Data Warehouse:
- The federated data warehouse model does not physically consolidate data into a single repository.
- Instead, it integrates data from various source systems on an as-needed basis, presenting a unified view without storing all data centrally.
- This model is useful when data is distributed across multiple locations or when a true centralization of data is impractical.
- Virtual Data Warehouse:
- A virtual data warehouse does not store data physically but provides a logical layer for querying and accessing data from disparate sources.
- It allows users to query and analyze data from various sources as if it were part of a single, centralized warehouse.
- This model is suitable for organizations that prefer to avoid the overhead of physically consolidating data.
- Data Lake:
- While not strictly an EDW model, a data lake is a storage repository that can hold vast amounts of structured and unstructured data.
- Data lakes are often used in conjunction with data warehouses to store raw data before it is processed and integrated into the warehouse.
- They provide flexibility in handling diverse data types and are commonly associated with big data and advanced analytics.
The choice of EDW model depends on the organization’s data management goals, the complexity of its data environment, and the specific needs of various business units. Many organizations implement a combination of these models to create a data architecture that best suits their requirements.