Four Essentials for Effective Cloud Data Architecture Governance

- by Philip Russom, Expert in Data Management

As with many practical applications of data governance, governing a cloud data architecture involves four broad areas, each guided by specific principles:

Business Control for Accessing Cloud Data

This rich area of data governance involves multiple approaches to three practices: security (directories, roles, cybersecurity), privacy (respecting personally identifiable information (PII) and your corporate privacy policy), and regulatory compliance for data usage (HIPAA, GDPR). On cloud, data security is the highest concern, simply because some people still don’t believe clouds can be secured. (They can!) Satisfying privacy and regulatory requirements becomes difficult when sensitive data associated with one geography is stored in a cloud data center in a different geography.
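The geography problem described above can be checked mechanically once each dataset records where its subjects reside and where it is stored. The following is a minimal sketch under stated assumptions: the `Dataset` fields, the `ALLOWED_REGIONS` mapping, and the region names are all hypothetical illustrations, and real residency rules would come from legal and compliance teams.

```python
from dataclasses import dataclass

# Hypothetical mapping from data-subject geography to the cloud regions
# where that data may be stored. Illustrative values only.
ALLOWED_REGIONS = {
    "EU": {"eu-west-1", "eu-central-1"},
    "US": {"us-east-1", "us-west-2"},
}


@dataclass(frozen=True)
class Dataset:
    name: str
    contains_pii: bool
    subject_geography: str  # where the people described by the data reside
    storage_region: str     # cloud region where the data is stored


def residency_violations(datasets):
    """Return names of PII datasets stored outside their allowed regions."""
    violations = []
    for ds in datasets:
        if not ds.contains_pii:
            continue  # this sketch only checks sensitive data
        allowed = ALLOWED_REGIONS.get(ds.subject_geography, set())
        if ds.storage_region not in allowed:
            violations.append(ds.name)
    return violations


# Hypothetical catalog entries for illustration.
catalog = [
    Dataset("eu_customers", True, "EU", "us-east-1"),
    Dataset("us_orders", True, "US", "us-east-1"),
    Dataset("web_logs", False, "EU", "us-west-2"),
]
print(residency_violations(catalog))  # ['eu_customers']
```

A geography with no entry in the mapping is treated as having no allowed regions, so unknown cases are flagged for review rather than silently passed.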

Controls for Data Condition and Data Content on Cloud

Data stewardship’s role is to identify quality problems with cloud data and to ensure the remediation of those problems. Curation depends on a curator who creates curation principles for a cloud data architecture. The principles determine which datasets or data elements are allowed into the cloud data architecture; without curation, a data architecture can become a dumping ground or “swamp.” The most problematic area within a cloud data architecture is the data lake; many lakes have failed because a lack of curation turned them into irreversible swamps.
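One way to picture curation principles is as an admission gate that every candidate dataset must pass before it enters the lake. The sketch below assumes two hypothetical principles (required metadata fields and a list of approved source systems); the actual principles are whatever the curator defines.

```python
# Hypothetical curation principle: every dataset must carry these fields.
REQUIRED_METADATA = {"owner", "description", "source_system"}


def admit_dataset(metadata, approved_sources):
    """Return (admitted, reasons) for a candidate dataset's metadata dict."""
    reasons = []
    missing = REQUIRED_METADATA - set(metadata)
    if missing:
        reasons.append("missing metadata: " + ", ".join(sorted(missing)))
    source = metadata.get("source_system")
    if source is not None and source not in approved_sources:
        reasons.append(f"unapproved source: {source}")
    return (not reasons, reasons)


# Hypothetical source systems and candidates for illustration.
approved = {"crm", "erp"}
good = {"owner": "sales-ops", "description": "Accounts", "source_system": "crm"}
bad = {"owner": "unknown", "source_system": "scraper"}

print(admit_dataset(good, approved))
print(admit_dataset(bad, approved))
```

Returning the reasons alongside the decision gives the data steward something concrete to remediate, rather than a bare rejection.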

Enterprise Data Standards for Cloud Data Architecture

Standards range widely, including data quality metrics, required design patterns for data models, preferred interfaces, coding standards, naming conventions, and metadata and catalog rules. In recent years, as data architects have risen in prominence, they have taken enterprise data standards under their wings, on the assumption that a good architecture demands consistent standards. As more enterprises migrate their data and analytics architectures to cloud, they find they must reengineer existing solutions to achieve the speed, scale, and low administration that are possible on cloud. In turn, this causes them to revisit their data standards. For example, data that was modeled oddly on premises to gain speed can be modeled more simply on cloud and still achieve performant query responses. Likewise, practices that lacked standards in the past (metadata, design patterns for ETL and pipelines) are now being governed more closely. As data scientists move deeper into analytics programming on cloud, standards are needed there as well.
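Naming conventions are among the easiest standards to enforce automatically. As a small sketch, assume a hypothetical standard that requires lowercase snake_case column names; the pattern and the sample names are illustrative, not a recommendation from this article.

```python
import re

# Hypothetical standard: lowercase snake_case, starting with a letter.
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(?:_[a-z0-9]+)*")


def nonconforming_names(names):
    """Return the names that do not match the snake_case standard."""
    return [name for name in names if not SNAKE_CASE.fullmatch(name)]


columns = ["customer_id", "OrderDate", "order_total", "2nd_address"]
print(nonconforming_names(columns))  # ['OrderDate', '2nd_address']
```

A check like this could run whenever a data model changes, so that deviations surface before they spread through pipelines and catalogs.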

Financial Governance

This is a special case within Cloud Governance. Managing and analyzing data on cloud is very similar to what enterprises have been doing on premises. However, Financial Governance is an area where costs are measured and billed differently. For example, even when you migrate a known data workload to cloud, it will consume resources very differently there. And most licenses on cloud are consumption-based, instead of the peak-oriented licenses on premises. Hence, when an organization migrates a solution to cloud (or builds a new one there), it must monitor the system daily to gain an understanding of the price tags that result from its workloads. Many DBAs are being assigned this task, so they must adjust their roles and learn new skills for financial governance for data and analytics on cloud.
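Daily monitoring of consumption-based costs can start very simply: convert each day's usage into a cost, then flag days that exceed a recent baseline. The sketch below assumes a single flat rate per usage unit and a trailing-average threshold; the usage figures, rate, window, and threshold are hypothetical, and real cloud billing is typically more granular.

```python
from statistics import mean


def daily_costs(usage_by_day, rate_per_unit):
    """Convert usage units per day into cost per day at a flat rate."""
    return {day: units * rate_per_unit for day, units in usage_by_day.items()}


def flag_spikes(costs, window=3, threshold=1.5):
    """Return days whose cost exceeds threshold x the trailing-window average."""
    days = sorted(costs)  # ISO dates sort chronologically as strings
    flagged = []
    for i in range(window, len(days)):
        baseline = mean(costs[d] for d in days[i - window:i])
        if baseline > 0 and costs[days[i]] > threshold * baseline:
            flagged.append(days[i])
    return flagged


# Hypothetical usage units per day for one workload.
usage = {
    "2024-01-01": 100,
    "2024-01-02": 110,
    "2024-01-03": 90,
    "2024-01-04": 105,
    "2024-01-05": 240,
}
costs = daily_costs(usage, rate_per_unit=2.0)
print(flag_spikes(costs))  # ['2024-01-05']
```

Comparing against a trailing average rather than a fixed budget lets the check adapt as a newly migrated workload settles into its typical consumption pattern.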