
90 Data Management Predictions from 55 Experts for 2024

For our 5th annual Insight Jam LIVE!, Solutions Review editors sourced this resource guide of data management predictions for 2024 from Insight Jam, its new community of enterprise tech experts.

Note: Data management predictions are listed in the order we received them.

Data Management Predictions from Experts for 2024


Rahul Pradhan, Vice President of Product and Strategy at Couchbase

Real-time data will become the standard for businesses to power generative experiences with AI; data layers should support both transactional and real-time analytics

“The explosive growth of generative AI in 2023 will continue strong into 2024. Even more enterprises will integrate generative AI to power real-time data applications and create dynamic and adaptive AI-powered solutions. As AI becomes business critical, organizations need to ensure the data underpinning AI models is grounded in truth and reality by leveraging data that is as fresh as possible.”

“Just like food, gift cards and medicine, data also has an expiration date. For generative AI to truly be effective, accurate and provide contextually relevant results, it needs to be built on real-time, continually updated data. The growing appetite for real-time insights will drive the adoption of technologies that enable real-time data processing and analytics. In 2024 and beyond, businesses will increasingly leverage a data layer that supports both transactional and real-time analytics to make timely decisions and respond to market dynamics instantaneously.”

Expect a paradigm shift from model-centric to data-centric AI

“Data is key in modern-day machine learning, but it needs to be addressed and handled properly in AI projects. Because today’s AI takes a model-centric approach, hundreds of hours are wasted on tuning a model built on low-quality data.”

“As AI models mature, evolve and increase, the focus will shift to bringing models closer to the data rather than the other way around. Data-centric AI will enable organizations to deliver both generative and predictive experiences that are grounded in the freshest data. This will significantly improve the output of the models while reducing hallucinations.”

Multimodal LLMs and databases will enable a new frontier of AI apps across industries

“One of the most exciting trends for 2024 will be the rise of multimodal LLMs. With this emergence, the need for multimodal databases that can store, manage and allow efficient querying across diverse data types has grown. However, the size and complexity of multimodal datasets pose a challenge for traditional databases, which are typically designed to store and query a single type of data, such as text or images.”

“Multimodal databases, on the other hand, are much more versatile and powerful. They represent a natural progression in the evolution of LLMs to incorporate the different aspects of processing and understanding information using multiple modalities such as text, images, audio and video. There will be a number of use cases and industries that will benefit directly from the multimodal approach including healthcare, robotics, e-commerce, education, retail and gaming. Multimodal databases will see significant growth and investments in 2024 and beyond — so businesses can continue to drive AI-powered applications.”
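
To make the idea concrete, here is a toy sketch (ours, not Couchbase's) of what a multimodal record might look like: one entity carrying payloads plus a separate embedding per modality, so a single store can answer nearest-neighbor queries across data types. The field names and tiny fixed-size vectors are illustrative stand-ins for real encoder outputs.

```python
# A toy multimodal record: payloads plus one embedding per modality
# (illustrative 3-d vectors standing in for real encoder outputs).
product = {
    "id": "sku-100",
    "text": "waterproof hiking boot",
    "image_uri": "s3://assets/sku-100.jpg",
    "embeddings": {
        "text":  [0.12, 0.80, 0.05],
        "image": [0.33, 0.10, 0.71],
    },
}

def search(records, modality, query_vec):
    # Cross-modal retrieval reduces to nearest-neighbor search per modality.
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return max(records, key=lambda r: dot(r["embeddings"][modality], query_vec))

print(search([product], "image", [0.3, 0.1, 0.7])["id"])  # sku-100
```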

Anil Mahadev, Senior Solutions Architect at IDERA Software

“The role of the database professional is ever-changing. My prediction is that DBAs and developers alike will utilize AI to help elevate their profession while focusing on strategic initiatives with their data platforms of choice.”

Nima Negahban, CEO and Co-Founder at Kinetica

Generative AI turns its focus towards structured, enterprise data

“Businesses will embrace the use of generative AI for extracting insights from structured numeric data, enhancing generative AI’s conventional applications in producing original content from images, video, text and audio. Generative AI will persist in automating data analysis, streamlining the rapid identification of patterns, anomalies, and trends, particularly in sensor and machine data use cases. This automation will bolster predictive analytics, enabling businesses to proactively respond to changing conditions, optimizing operations, and improving customer experiences.”

English will replace SQL as the lingua franca of business analysts

“We can anticipate a significant mainstream adoption of language-to-SQL technology, following successful efforts to address its accuracy, performance, and security concerns. Moreover, LLMs for language-to-SQL will move in-database to protect sensitive data, addressing one of the primary concerns surrounding data privacy and security. The maturation of language-to-SQL technology will open doors to a broader audience, democratizing access to data and database management tools, and furthering the integration of natural language processing into everyday data-related tasks.”
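
As a rough illustration of the accuracy and security concerns Negahban raises, here is a minimal language-to-SQL sketch. `complete()` is a placeholder for whatever LLM completion call is available (hypothetical here), and the read-only guard is one simple policy that a real system would enforce far more rigorously.

```python
# A minimal sketch of a language-to-SQL flow. `complete()` stands in for any
# LLM completion call (hypothetical); the schema prompt and read-only guard
# illustrate the accuracy and security concerns mentioned above.

SCHEMA = """
CREATE TABLE orders (id INT, customer_id INT, total DECIMAL, ordered_at TIMESTAMP);
CREATE TABLE customers (id INT, name TEXT, region TEXT);
"""

def question_to_sql(question: str, complete) -> str:
    prompt = (
        "Given this schema:\n" + SCHEMA +
        "\nWrite one read-only SQL query answering: " + question +
        "\nReturn only SQL."
    )
    sql = complete(prompt).strip().rstrip(";")
    # Guardrail: reject anything that is not a plain SELECT.
    if not sql.lower().startswith("select"):
        raise ValueError("generated statement is not read-only")
    return sql
```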

Location-enriched time-series data will outpace conventional time-series data

“Location-enriched time-series data will outpace conventional time-series data in automotive, telco and logistics industries. The widespread deployment of GPS devices, location-aware chips, and advanced satellite constellations will fuel this shift. Organizations will begin to adapt their data architectures to harness the full potential of location-based sensor data, marking a pivotal transformation in these sectors.”
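
A small sketch of what "location-enriched" means in practice, using only Python's standard library: every reading carries a timestamp and coordinates, so one query can filter on both a time window and a bounding box. Table and column names are illustrative; a production system would use a purpose-built time-series or geospatial engine.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE readings (
    vehicle_id TEXT, ts TEXT, lat REAL, lon REAL, speed_kmh REAL)""")
con.executemany(
    "INSERT INTO readings VALUES (?, ?, ?, ?, ?)",
    [("truck-1", "2024-01-05T10:00:00", 52.52, 13.40, 78.0),
     ("truck-1", "2024-01-05T10:05:00", 52.55, 13.45, 81.0)])

# Time window AND bounding box in one query: the location-enriched part.
rows = con.execute("""
    SELECT vehicle_id, ts, speed_kmh FROM readings
    WHERE ts BETWEEN '2024-01-05T00:00:00' AND '2024-01-05T23:59:59'
      AND lat BETWEEN 52.0 AND 53.0 AND lon BETWEEN 13.0 AND 14.0
""").fetchall()
print(rows)
```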

Vasu Sattenapalli, CEO at RightData

Generative AI Will Move to Modern Data Management

“Historically, data management has been a bit of a black box, requiring highly technical skills to create a strategy and manage data efficiently. With the help of LLMs, modern data management will change its framework, allowing users to participate in the entire data stack in a fully governed and compliant manner.”

Shawn Rogers, CEO and Fellow at BARC

AI is driving innovation in data management, especially through automation and speed

“Having strength at this core level of your data stack is critical for AI success. NLP and conversational UIs will open the door for the true democratization of analytics. It’s an exciting time for data and insights.”

Naren Narendran, Chief Scientist at Aerospike

Data to drive hyper-personalization in eCommerce

“Data will drive hyper-personalized user experiences in diverse eCommerce applications. Rather than platforms serving content based on aggregate statistics or the behavior from a buyer’s journey in the past six months, for example, they’ll react based on a search from three hours ago — or even a click from two minutes ago. As ML systems are fed more and more data to boost application performance, we’ll see generalized statistical predictions funnel into hyper-personalized ones at the individual level for a more tailored user experience in retail and eCommerce.”
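
One simple way to picture the shift Narendran describes, from six-month aggregates to a click from two minutes ago, is exponential recency weighting, sketched below. The half-life is an assumed tuning parameter for illustration, not anything Aerospike prescribes.

```python
import time

HALF_LIFE_S = 6 * 3600  # assumed: events lose half their weight every 6 hours

def decayed_scores(events, now=None):
    """events: iterable of (item_id, unix_ts, weight). Returns item -> score."""
    now = now or time.time()
    scores = {}
    for item, ts, weight in events:
        age_s = now - ts
        scores[item] = scores.get(item, 0.0) + weight * 0.5 ** (age_s / HALF_LIFE_S)
    return scores

now = time.time()
events = [
    ("sneakers", now - 120, 1.0),           # click two minutes ago
    ("umbrella", now - 180 * 86400, 1.0),   # purchase six months ago
]
print(decayed_scores(events, now))  # the recent click dominates
```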

Jeremy Burton, CEO at Observe

Observability is recognized as a Data Problem

“Despite pouring $17 billion into observability and monitoring tools each year, enterprises are seeing a negligible impact on mean-time-to-resolution (MTTR); in fact, MTTRs are increasing. Why? Modern distributed applications are complex and change multiple times a day, which leads to DevOps teams seeing ‘unknown’ problems in production every day.

When troubleshooting an ‘unknown’ problem, DevOps teams must triangulate on data points to determine where the problem may be occurring. That’s where the problems start: some data points are in a logging tool, others in a monitoring tool or an APM tool. The best practice is often to screenshot what each tool is showing and post it in a Slack channel so the final decision-maker can correlate.

This is not sustainable. For observability to deliver on its promise, the observability data must be in one place, not in several silos. If the data is in one place, it’s easier to navigate, find relevant context for the incident being investigated, and collaborate as a DevOps team in one consistent interface (that’s not Slack!)”

Mav Turner, CTO at Tricentis

“As the volume and flow of data continues to expand, data integrity is the determining factor to success for organizations implementing data-driven decision making. In fact, data creation is predicted to double over the next year, so successfully labeling, standardizing, delivering, and testing to maintain data integrity will become even more essential for organizations. With this, organizations should look to solutions that leverage AI and automation to manage the volume of data at accelerated speeds, all while reducing the risk of human error. Automated testing solutions which incorporate AI are more precise, enable smoother workflows, and minimize the amount of expertise that developers need to possess.”

Jim Liddle, Chief Innovation Officer at Nasuni

2024 will be a make-or-break year for data intelligence  

“Following the booming interest in AI in 2023, enterprises will face increased pressure from their boards to leverage AI to gain a competitive edge. That rush for an AI advantage is surfacing deeper data infrastructure issues that have been mounting for years. Before they can integrate AI effectively, organizations will first have to address how they collect, store, and manage their unstructured data, particularly at the edge.

AI doesn’t work in a vacuum and it’s just one part of the broader data intelligence umbrella. Many organizations have already implemented data analytics, machine learning and AI into their sales, customer support, and similar low-hanging initiatives, but struggle to integrate the technology in more sophisticated, high-value applications. 

Visibility, for example, is a crucial and often-overlooked first step towards data intelligence. A shocking number of companies store massive volumes of data simply because they don’t know what’s in it or whether they need it. Is the data accurate and up-to-date? Is it properly classified and ‘searchable’? Is it compliant? Does it contain personally identifiable information (PII), protected health information (PHI), or other sensitive information? Is it available on-demand or archived?

In the coming year, companies across the board will be forced to come to terms with the data quality, governance, access, and storage requirements of AI before they can move forward with digital transformation or improvement programs to give them the desired competitive edge.” 

Russ Kennedy, Chief Product Officer at Nasuni

Organizations will continue to grapple with data infrastructure to support hybrid work long after the pandemic 

“The genie is out of the bottle and hybrid and remote work are here to stay. Though the greatest economic upheavals have hopefully passed, we’re seeing the residual effects. Many companies are still trying to design or optimize infrastructure to accommodate hybrid work and reconfigured supply chains.

Though organizations worked quickly to spin up the necessary systems, those systems simply weren’t designed to support thousands of remote workers. Inevitably, workers started using whatever tools were necessary to collaborate, and many businesses saw a significant increase in shadow IT tools outside of sanctioned corporate IT programs. As we enter 2024, IT organizations are still grappling with the effects of remote work on top of mounting pressure to reduce costs and regain control of their disparate and sprawling corporate data assets.

Some have tried to remedy the issue by mandating employees back into the office, but to attract and retain appropriate talent, businesses will need to provide enhanced multi-team collaboration options and the data infrastructure to scale it. Those that have the right data access solutions in place to streamline processes and remote collaboration will succeed in the hybrid work economy.” 

Rex Ahlstrom, CTO and EVP at Syniti

Increased adoption of generative AI will drive need for clean data

“The foundation of generative AI is data. That is, data provides the basis for this new technology to function as desired. However, that data also needs to be clean. Regardless of where you’re pulling the data from – whether you’re using something like modeling or a warehouse of your choice – quality data will be essential. Bad data can lead to bad recommendations, inaccuracies, bias, etc. Having a strong data governance strategy will become more important as more organizations seek to leverage the power of generative AI. Ensuring your data stewards can access and control this data will also be key.”

The shift to data fabric will accelerate thanks to AI

“When I surveyed the landscape at the end of last year, I anticipated that more organizations would move from a data mesh approach to a data fabric to help break down information silos and make data available to business users more quickly. We haven’t quite seen as fast a transition, but this trend is certainly accelerating and in 2024, this trajectory will be driven largely by increased adoption of AI and other self-discovering technologies. While there’s been a lot of discussion around data fabric in recent years, it will become a bigger goal for organizations thanks to the emergence of more advanced AI.”

Data quality will start to become an executive-level topic

“Ownership of data and data quality are core to business success but are still too often ignored or overlooked by the executive and board levels of most organizations. We can see this in the disconnects between perception and reality. Over 80 percent of executives we surveyed say they trust their data, but the reality is that many people are still doing a great deal of work to get data quality to a level where the data can be consumed and used. As data quality takes on greater importance, it will escalate to an executive-level conversation.”

Justin Borgman, Co-Founder and CEO at Starburst

Companies will prioritize minding the gap between data foundations and AI innovation

“There is no AI strategy without a data strategy and companies will need to prioritize closing gaps in their data strategy; specifically, the foundational elements of more efficiently accessing more accurate data securely.”

Two hot topics, data products & data sharing, will converge in 2024

“Data sharing was already on the rise as companies sought to uncover monetization opportunities, but a refined method to curate the shared experience was still missing. As the lasting legacy of data mesh hype, data products will emerge as that method. Incorporating Gen AI features to streamline data product creation and enable seamless sharing of these products marks the pivotal trifecta moving data value realization forward.”

Open formats are poised to deal the final blow to the data warehouse model

“While many anticipate the data lakehouse model supplanting warehouses, the true disruptors are open formats and data stacks. They free companies from vendor lock-in, a constraint that affects both lakehouse and warehouse architectures.”

Cloudera misses the cloud era

“Hadoop lives on, and companies are finding more modern solutions, including more efficient on-prem storage, while building connectivity to cloud systems for analytics workloads.”

All things make a comeback and on-prem storage is having a resurgence

“Companies including Dell have heavily invested in their EMC portfolio. Enterprise customers will continue to recognize that enhancing on-premise storage hardware presents the faster path to mitigating rising cloud expenses. This modernization will allow companies to manage data gravity for on-premise data that cannot be easily relocated, ensuring a more efficient approach.”

Avthar Sewrathan, GM for AI and Vector at Timescale

One-way Ticket to Vector-town

“As new applications get built from the ground up with AI, and as LLMs become integrated into existing applications, vector databases will play an increasingly important role in the tech stack, just as application databases have in the past. Teams will need scalable, easy to use, and operationally simple vector data storage as they seek to create AI-enabled products with new LLM-powered capabilities.”
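
For readers new to the category, the core operation a vector database scales up is nearest-neighbor search over embeddings. The sketch below shows that operation in plain NumPy; a real vector store adds approximate-nearest-neighbor indexing, metadata filtering, and persistence. The 4-dimensional vectors are toy stand-ins for actual LLM embeddings.

```python
import numpy as np

docs = ["refund policy", "shipping times", "gift cards"]
vectors = np.array([
    [0.9, 0.1, 0.0, 0.2],
    [0.1, 0.8, 0.3, 0.0],
    [0.2, 0.1, 0.9, 0.1],
])

def top_k(query_vec, k=2):
    # Cosine similarity: normalize rows and the query, then take dot products.
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    q = query_vec / np.linalg.norm(query_vec)
    sims = v @ q
    idx = np.argsort(-sims)[:k]
    return [(docs[i], float(sims[i])) for i in idx]

print(top_k(np.array([0.85, 0.15, 0.05, 0.1])))  # "refund policy" ranks first
```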

Say Hello to Your New Coworker (or Copilot)

“AI will give individual engineers far more leverage and shrink the optimal size of development teams. Products like coding co-pilots and code generation tools have gained popularity, but they will become even more entrenched in the software development process in the coming year. Small development teams of engineers with AI copilots and AI-integrated devtools contributing what in the past would’ve taken dozens of engineers will become the norm.”

See You Later, Serverless

“More developers will move away from fully serverless architectures given the growing discontent with their lack of transparency, scalability and cost predictability. Developers will increasingly choose other paradigms that offer more transparency, such as provisioned architectures and hybrid architectures that blend elements of serverless and traditional provisioned paradigms together.”

Haoyuan Li, Founder and CEO at Alluxio

Overcoming Data Silo Challenges

“Data silos remain a challenge for organizations – many analytics and AI systems spread across regions, clouds, and platforms, resulting in a vast amount of data duplication and separate governance models. In 2024, to accelerate time-to-insights and scale analytics and AI initiatives, organizations will increasingly need to manage distributed data. More will develop data strategies for unified management of scattered data through flexible orchestration, abstraction, and virtualization.”

Sean Knapp, Founder and CEO of Ascend.io

CIOs will make structural changes in 2024 as a result of AI

“2023 saw an explosion of interest in AI. In 2024, companies will enact sweeping top-down AI adoption mandates. We expect to see goals such as reducing Opex by 20 percent, boosting CSAT/NRR by 10 percent, and generating 10 percent in top-line revenue from AI-based products all on the table. The organizations that succeed here will make significant structural changes similar to the ones we saw during the digital transformations of the 2010s. We are already starting to see powerful roles like the Chief AI Officer assuming some core responsibilities of the CIO. It will be interesting to see if CIOs can deploy enough infrastructure automation to carve out a strong focus on AI or ultimately cede that territory to this newcomer in the C-suite.”

Austin Parker, Head of Open Source at Honeycomb

Future of observability 

“As more and more organizations try to understand the value they’re getting from their existing monitoring strategies, observability will be put under a microscope in the coming year. We’ll begin to see cracks in the traditional model of observability vendors as more companies look to consolidate their OpenTelemetry data, with an eye toward powering predictive AI models. With OpenTelemetry being the crucial piece where generative AI and observability meet, its data can power next-generation models for system reliability and performance.”

Future of open-source 

“Open source will continue to power the world, and projects like OpenTelemetry will continue to gain mass adoption as they stabilize and the ecosystem matures. We’ll also see more growth within open-source development as developer productivity and software development lifecycle tools move to foundation-based governance, as OpenTofu has. The future is, ultimately, decentralized, and we’ll start to see open source rise as the best way and place to build protocols for the many tools and applications being used.”

Brian Peterson, Co-Founder and Chief Technology Officer at Dialpad

Influx of data talent/AI skills 

“As businesses continue to embrace AI, we’re going to see not only an increase in productivity but also an increase in the need for data talent. From data scientists to data analysts, this knowledge will be necessary in order to sort through all the data needed to train these AI models. While recent AI advancements are helping people comb through data faster, there will always be a need for human oversight; employees who can review and organize data in a way that’s helpful for each model will be a competitive advantage. Companies will continue looking to hire more data-specific specialists to help them develop and maintain their AI offerings. And those who can’t hire and retain top talent – or don’t have the relevant data to train on to begin with – won’t be able to compete.

Just like we all had to learn how to incorporate computers into our jobs years ago, non-technical employees will now have to learn how to use and master AI tools in their jobs. And, just like with the computer, I don’t believe AI will eliminate jobs; rather, it will shift job functions around the use of the technology. It will make everyone faster at their jobs, and will pose a disadvantage to those who don’t learn how to use it.”

The commoditization of data to train AI

“As specialized AI models become more prevalent, the proprietary data used to train and refine them will be critical. For this reason, we’re going to see an explosion of data commoditization across all industries. Companies that collect data that could be used to train chatbots – take Reddit, for example – sit on an immensely valuable resource. Companies will start competitively pricing and selling this data.”

Wayne Eckerson, President at Eckerson Group

“Within five years, most large companies will implement a data product platform (DPP), otherwise known as an internal data marketplace, to facilitate the publication, sharing, consumption, and distribution of data products.”

Helena Schwenk, VP, Chief Data & Analytics Officer at Exasol

FinOps becomes a business priority, as CIOs analyze price / performance across the tech stack

“Last year, we predicted that CFOs would become more cloud-savvy amidst recession fears, and we watched this unfold as organizations shifted to a “do more with less” mentality. In 2024, FinOps, the financial governance of cloud IT operations, becomes a business priority as the business takes aim at preventing unpredictable, sometimes chaotic, cloud spend and gains assurance from the CIO that cloud investments are aligned with business objectives.

As IT budgetary headwinds prevail, the ability to save on cloud spend represents a real opportunity for cost optimization for the CIO. One of the most important metrics for achieving this goal is price/performance, as it provides a comparative gauge of resource efficiency in the data tech stack. Given most FinOps practices are immature, we expect CIOs to spearhead these efforts and start to perform regular price/performance reviews. 

FinOps will become even more important against the backdrop of organizations reporting on ESG and sustainability initiatives. Beyond its role in forecasting, monitoring, and optimizing resource usage, FinOps practices will become more integral to driving carbon efficiencies to align with the sustainability goals of the organization.” 

AI governance becomes C-level imperative, causing CDOs to reach their breaking point

“The practice of AI governance will become a C-level imperative as businesses seek to leverage the game-changing opportunities it presents while balancing responsible and compliant use. This challenge is further emphasized by the emergence of generative AI, adding complexity to the landscape. 

AI governance is a collective effort, demanding collaborative efforts across functions to address the ethical, legal, social, and operational implications of AI. Nonetheless, for CDOs, the responsibility squarely rests on their shoulders. The impending introduction of new AI regulations adds an additional layer of complexity, as CDOs grapple with an evolving regulatory landscape that threatens substantial fines for non-compliance, potentially costing millions.

This pressure will push certain CDOs to their breaking point. For others, it will underscore the importance of establishing a fully-resourced AI governance capability, coupled with C-level oversight. This strategic approach not only addresses immediate challenges, but strengthens the overall case for proactive and well-supported AI governance going forward.”

Florian Wenzel, Global Head of Solution Engineering at Exasol

Expect AI backlash, as organizations waste more time and money trying to ‘get it right’

“As organizations dive deeper into AI, experimentation is bound to be a key theme in the first half of 2024. Those responsible for AI implementation must lead with a mindset of “try fast, fail fast,” but too often, the people in these roles do not understand the variables they are targeting, lack clear expected outcomes, and struggle to ask the right questions of AI. The most successful organizations will fail fast and quickly rebound from lessons learned. Enterprises should anticipate spending extra time and money on AI experimentation, given that most of these practices are not rooted in a scientific approach. At the end of the year, clear winners of AI will emerge if the right conclusions are drawn.

With failure also comes greater questioning around the data fueling AI’s potential. For example, data analysts and C-suite leaders will both raise questions such as: How clean is the data we’re using? What’s our legal right to this data, specifically if used in any new models? What about our customers’ legal rights? With any new technology comes greater questioning, and in turn, more involvement across the entire enterprise.”

Chad Thompson, CMO at Exasol

Data silos come crashing down as data democratization finally happens, creating a need to train the workforce on insights-driven skill sets

“The year 2024 is when data democratization will shift from a topic of discussion to action within organizations. More people, across various departments, will finally have access to meaningful insights, alleviating the traditional bottlenecks caused by data analytics teams. As these traditional silos come crashing down, organizations will realize just how wide and deep the need is for their teams and individuals to use data. Even people who don’t currently think of themselves as end users of data will be pulled in to work with it, with 2024 being the catalyst for such change.

With this shift comes a major challenge to anticipate in the coming years: the workforce will need to be upskilled so that every employee gains the proper skill set to effectively use data and insights to make business decisions. Today’s workforce won’t know the right questions to ask of its data feed, or of the automation powering it. The ability to articulate precise, probing, business-tethered questions has just increased in value, creating a dire need to train the workforce on this capability.”

Mathias Golombek, Chief Technology Officer at Exasol

AI shifts from reactionary to intentional, unlocking opportunity while eliminating data collection-based roles

“The year 2023 thrust AI into the mainstream, causing knee-jerk reactions from organizations that ultimately spawned countless poorly designed and executed automation experiments. In 2024, AI will shift from reactionary to strategic, rooted in purposeful proofs of concept that bring more clarity and focus on business objectives. We’ll see more business benefit-driven use cases leveraging AI and ML than ever before.

As AI is paired with other technologies, like open source, we’ll see new models emerge to solve traditional business problems. Generative AI, like ChatGPT, will also merge with more traditional AI technology, such as descriptive or predictive analytics, to open new opportunities for organizations and streamline traditionally cumbersome processes.

As a result, AI will continue to eliminate redundant job roles that involve high levels of repetition, data collection and data processing, with customer service, retail sales, manufacturing production and office support expected to be most impacted by the end of 2024.”

Mike Scott, CISO at Immuta

Data visibility will help to drive regulatory compliance across organizations

“As privacy laws continue to evolve and new regulations are introduced, it’s becoming much more challenging to determine which regulations apply to you, which apply to each data set, and how to adhere to each one. Despite the proliferation of policies, most companies still don’t have a dedicated privacy team or leader responsible for this. Looking ahead, the ability to easily look across your data and understand what kind of data you have, where it resides, and which policies apply to it will be incredibly important to ensure compliance.”

Continued transition to the cloud creates more third-party risk

“Third-party risk will evolve as a big data-security-related challenge in the coming year as organizations of all sizes continue their transition to the cloud. It’s clear teams can’t accomplish the same amount of work at scale with on-prem solutions as they can in the cloud, but with this transition comes a pressing need to understand the risks of integrating with a third party and monitor that third party on an ongoing basis. Organizations tend to want to move quickly, but it’s important that business leaders take the time to evaluate and compare the security capabilities of these vendors to ensure they are not introducing more risk to their data.”

Claude Zwicker, Senior Product Manager & Data Mesh SME at Immuta

Data mesh becomes more business-driven vs. IT-driven

“As AI continues to increase the appetite for more data distributed everywhere, business professionals will continue to explore conceptual solutions like data mesh that enable data democratization. But as money gets tighter and investments become more targeted, IT leaders must shift the way they look at data mesh adoption to turn it from a buzzword into a reality, focusing more on the milestones they can hit to show business value, rather than being solely fixated on reaching the perfect state of implementation.”

Arina Curtis, CEO and Co-Founder at DataGPT

Data and Business Teams Will Lock Horns Onboarding AI Products

“While business user demand for AI products like ChatGPT has already taken off, data teams will still impose a huge checklist before allowing access to corporate data. This tail-wagging-the-dog scenario may be a forcing function to strike a balance, and adoption could come sooner than later as AI proves itself reliable and secure.”

Businesses Big and Small Will Prioritize Clean Data Sets

“As companies realize the power of AI-driven data analysis, they’ll want to jump on the bandwagon – but won’t get far without consolidated, clean data sets, as the effectiveness of AI algorithms is heavily dependent on the quality and cleanliness of data. Clean data sets will serve as the foundation for successful AI implementation, enabling businesses to derive valuable insights and stay competitive.”

Giorgio Regni, CTO at Scality

End users will discover the value of unstructured data for AI

“The meteoric rise of large language models (LLMs) over the past year highlights the incredible potential they hold for organizations of all sizes and industries. They primarily leverage structured, or text-based, training data. In the coming year, businesses will discover the value of their vast troves of unstructured data, in the form of images and other media.

This unstructured data will become a useful source of insights through AI/ML tooling for image recognition applications in healthcare, surveillance, transportation, and other business domains. Organizations will store petabytes of unstructured data in scalable “lakehouses” that can feed this unstructured data to AI-optimized services in the core, edge and public cloud as needed to gain insights faster.”

Molly Presley, SVP of Marketing at Hammerspace

Unstructured Data Sets: The Missing Link to Successful AI Data Pipelines

“Organizations will put distributed unstructured data sets to work to fortify their AI strategies and AI data pipelines while simultaneously achieving the performance and scale not found in traditional enterprise solutions. It is critical that a data pipeline is designed to use all available compute power and can make data available to cloud models such as those found in Databricks and Snowflake. In 2024, high-performance local read/write access to data that is orchestrated globally in real time will become indispensable and ubiquitous.”

Data Orchestration Takes Center Stage

“Organizations will start moving away from “store and copy” to a world of data orchestration. Driven by AI advancements, robust tools now exist to analyze data and tease out actionable insights. However, file storage infrastructure has not kept pace with these advancements. Unlike solutions that try to manage storage silos and distributed environments by moving file copies from one place to another, data orchestration helps organizations integrate data into a single namespace from different silos and locations and automates the placement of data when and where it’s most valuable, making it easier to analyze and derive insights. IT organizations need the flexibility to use all of their data – structured, semi-structured and unstructured – for iteration and may need to move different data sets to different models. The data orchestration model allows organizations to realize the benefits of eliminating copying data to new files and repositories – including reducing the time to inference from weeks to hours for large data environments.”

Data Teams Embrace the Value of Metadata to Automate Data Management

“In 2024, data teams will increasingly use rich, actionable metadata to derive value from data. With the continued growth and business value of unstructured data across all industries, IT organizations must cope with increasing operational complexity when they manage digital assets that span multiple storage types, locations, and clouds. Wrangling data services across silos in a hybrid environment can be an extremely manual and risk-prone process, made more difficult by incompatibilities between different storage types. Metadata has the power to enable customers to solve these problems. Machine-generated metadata and data orchestration are crucial to data insights.”

Amer Deeba, CEO and Co-Founder at Normalyze

SEC Regulations Will Impact The One Area We Don’t Want to Talk About: Your Data 

“As we know, the new SEC transparency requirements and ruling now require public companies to disclose cybersecurity posture annually and cyber incidents within four days of determining an incident was material. In 2024, this major policy shift will have a significant effect on one key area: data, forcing businesses to think about security with data at the forefront. In response, enterprises will dedicate both effort and budget to support the SEC’s data-first strategy – implementing best practices that assure shareholders that their company’s most valuable asset – data – is protected. In 2024, companies will need to discover where their data resides and who can access it, while proactively remediating the risks that would have the highest monetary impact in the event of a breach. When faced with this dilemma, companies will lean on automation, specifically end-to-end, automated solutions that center on a holistic approach.

The recent ALPHV/BlackCat and MeridianLink breach underscores the importance for businesses of understanding exactly what data they have, where it lives, and how it is protected. In order to answer critical questions with confidence in the event of a breach, and to lower the probability of a breach, companies need to build better defenses. The risk of exposure is not novel, but with these new disclosure requirements, securing the target of such attacks – the data – has gone from a good first practice to an absolute necessity. Being proactive means that if a breach does occur, you can respond quickly, answer these critical questions, and be in compliance with the SEC requirements. To summarize, in 2024 we’ll see organizations separated by their approach to data security. With these regulations, there is no alternative: organizations must effectively remediate risks to lucrative sensitive data before breaches occur. Only this will allow organizations to respond decisively and confidently if an incident occurs.”

To Address the Influx of Data, Security Teams Must Approach Data Security Like a Team Sport

“As AI booms, the industry is facing increasing complexity and an influx of data, and companies are grappling with how to keep it all secure. At the height of AI technology adoption, companies will need to refocus in 2024 on what matters most: protecting their data as it gets used by machine learning models and new AI technologies. Businesses need to change their approach, and the success of the coming year for organizations big and small will come down to how they do so. The challenges this brings require the depth and efficiency of AI and automated processes to ensure protection of cloud-resident sensitive data. As demands around data change in 2024, organizations will need to invest in their security and cloud ops teams, approaching data security like a team sport and building more efficient shared-responsibility models to better protect data. These teams can then regain visibility of all data stores within an enterprise’s cloud or on-premises environment and trace possible attack paths, overprovisioned access, and risks that can lead to data exposure. Only by identifying the approach to data, ensuring proper permissions and privileges, and efficiently implementing AI will companies enable their teams to be successful in 2024.”

Kurt Markley, Managing Director, Americas at Apricorn

Data Management Within Security Policy

“Data is no longer a byproduct of what an organization’s users create; it is the most valuable asset organizations have. Businesses, agencies and organizations have invested billions of dollars over the past decade to move their data assets to the cloud; the demand is so high that Gartner expects that public-cloud end user spending will reach $600B this year. These organizations made the move to the cloud, at least in part, because of a perception that the cloud was more secure than traditional on-prem options.

It’s estimated that 30 percent of cloud data assets contain sensitive information. All that data makes the cloud a juicy target and we expect that 2024 will continue to show that bad actors are cunning, clever and hard-working when it comes to pursuing data. The industry has seen triple the number of hacking groups attacking the cloud, with high-profile successes against VMware servers and the U.S. Pentagon taking place this year.

As IT teams spend more on moving and storing data in the cloud, organizations must spend the next 12-24 months auditing, categorizing and storing it accordingly. They need to gain deeper visibility into what data they have stored in the cloud, how data sets relate to each other, and whether the data is still meaningful to the operations of the organization. In doing so, they are advised to create specific security policies about how, where and for how long they store their data. These policies, when actively enforced, will help organizations better protect their most valuable asset: their data.”

Tanya Bragin, VP Product at ClickHouse

Competition will drive open platform support

“Proliferation of new vendors innovating in the data space will force incumbents to support competing technologies – or risk losing customers. For instance, Snowflake is under pressure from Databricks to support data lakes, a more open data architecture.”

A new class of data warehousing will emerge

“Snowflake, BigQuery, and Redshift brought enterprise data to the cloud. In 2024 we’ll see a new generation of databases steal workload from these monolithic data warehouses. These real-time data warehouses will do so by offering faster and more efficient handling of real-time data-driven applications that power products in observability and analytics.”

A fresh opening for challengers

“Despite the bear market for nascent companies in 2023, next year I expect to see operationally efficient startups across all sectors pose a real challenge to incumbents. 

That’s because many organizations run all of their data operations on platforms like Snowflake, which leads to performance challenges in non-data warehouse specific use cases such as real-time analytics. 

A degraded user experience and significant runaway costs are then opening up the field for challenger companies that build on better infrastructure from the start.”

Doug Kimball, CMO at Ontotext

Shift from How to Why: Enter the Year of Outcome-based Decision Making

“In 2024, data management conversations will experience a transformative shift and pivot from “how” to “why.” Rather than focusing on technical requirements, discussions next year will shift to a greater emphasis on the “why” and the strategic value data can bring to the business. Manufacturers recognize that data, once viewed as a technical asset, is a major driver of business success. Solution providers that deal with these needs are also seeing this change, and would be wise to respond accordingly.

In the coming year, data strategy and planning will increasingly revolve around outcomes and the value/benefit of effective data management, as leaders better understand the key role data plays in achieving overarching business objectives. Manufacturers will also reflect on their technology spend, particularly investments that have yielded questionable results or none at all. Instead of technical deep dives into intricacies like data storage and processing, crafting comprehensive data strategies that drive lasting results will be the priority.

Next year, manufacturers will move beyond technical deep-dives and focus on the big picture. This strategic shift signals a major change in the data management mindset for 2024 and beyond, ideally aligning technology with the broader objectives of the business such as driving growth, enhancing customer experiences, and guiding informed decision-making.”

Atanas Kiryakov, CEO at Ontotext

Manufacturers (finally) Manage the Hype Around AI

“As the deafening noise around GenAI reaches a crescendo, manufacturers will be forced to temper the hype and foster a realistic and responsible approach to this disruptive technology. Whether it’s a crisis around the shortage of GPUs, the climate effects of training large language models (LLMs), or concerns around privacy, ethics, bias, and/or governance, these challenges will worsen before they get better, leading many to wonder if it’s worth applying GenAI in the first place.

While corporate pressures may prompt manufacturers and supply chain organizations to do something with AI, being data driven must come first and remain top priority. After all, ensuring data is organized, shareable, and interconnected is just as critical as asking whether GenAI models are trusted, reliable, deterministic, explainable, ethical, and free from bias. 


Before deploying GenAI solutions to production, manufacturers must be sure to protect their intellectual property and plan for potential liability issues. This is because while GenAI can replace people in some cases, there is no professional liability insurance for LLMs. As a result, business processes that involve GenAI will still require extensive “humans-in-the-loop” involvement which can offset any efficiency gains. 


In 2024, expect to see vendors accelerate enhancements to their product offerings by adding new interfaces focused on meeting the GenAI market trend. However, organizations need to be aware that these may be nothing more than bolted-on band aids. Addressing challenges like data quality and ensuring unified, semantically consistent access to accurate, trustworthy data will require setting a clear data strategy, as well as taking a realistic, business driven approach. Without this, manufacturers will continue to pay the bad data tax as AI/ML models will struggle to get past a proof of concept and ultimately fail to deliver on the hype.”

Knowledge Graph Adoption Accelerates Due to LLMs and Technology Convergence

“A key factor slowing down knowledge graph (KG) adoption is the extensive (and expensive) process of developing the necessary domain models. LLMs can optimize several tasks, ranging from the evolution of taxonomies, to classifying entities, to extracting new properties and relationships from unstructured data. Done correctly, LLMs could lower information extraction costs, as the proper tools and methodology can manage the quality of text analysis pipelines and bootstrap/evolve KGs at a fraction of the effort currently required. LLMs will also make it easier to consume KGs by applying natural language querying and summarization.

Labeled Property Graphs (LPG) and the Resource Description Framework (RDF) will also help propel KG adoption, as each is a powerful data model, with strong synergies when combined. So while RDF and LPG are optimized for different things, data managers and technology vendors are realizing that together they provide a comprehensive and flexible approach to data modeling and integration. The combination of these graph technology stacks will enable manufacturers to create better data management practices, where data analytics, reference data and metadata management, and data sharing and reuse are handled in an efficient and future-proof manner. Once an effective graph foundation is built, it can be reused and repurposed across organizations to deliver enterprise-level results, instead of being limited to disconnected KG implementations.


As innovative and emerging technologies such as digital twins, IoT, AI, and ML gain further mind-share, managing data will become even more important. Using LPG and RDF’s capabilities together, manufacturers can represent complex data relationships between AI and ML models, as well as track IoT data to support these new use cases. Additionally, with both the scale and diversity of data increasing, this combination will also address the need for better performance. As a result, expect knowledge graph adoption to continue to grow as manufacturers look to connect, process, analyze, and query the large data sets currently in use.”
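
To ground the first of Kiryakov's points, here is a minimal sketch of LLM-assisted triple extraction for bootstrapping a KG. `complete()` is again a placeholder for any LLM call (hypothetical), and the relation whitelist stands in for the "proper tools and methodology" that keep generated triples inside the domain model; all names are illustrative.

```python
import json

ALLOWED_RELATIONS = {"manufactures", "supplies", "located_in"}  # illustrative domain model

def extract_triples(text: str, complete):
    prompt = (
        "Extract (subject, relation, object) triples from the text below as "
        f"a JSON list of 3-element lists. Use only these relations: "
        f"{sorted(ALLOWED_RELATIONS)}.\n\n" + text
    )
    candidates = json.loads(complete(prompt))
    # Keep only triples that conform to the domain model.
    return [t for t in candidates
            if len(t) == 3 and t[1] in ALLOWED_RELATIONS]
```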

Data Fabric Comes of Age and Will Employ Semantic Metadata

“Good decisions rely on shared data, especially the right data at the right time. Sometimes the challenge is that the data itself raises more questions than it answers. This trend will continue to worsen before it improves, as disjointed data ecosystems with disparate tools, platforms, and disconnected data silos become increasingly challenging for manufacturers and supply chain organizations. This is why the concept of a data fabric has emerged as a method for these organizations to better manage and share their data.

A data fabric’s holistic goal is the culmination of data management tools designed to identify, access, cleanse, enrich, transform, govern, and analyze data. It is a tall order, and one that will take several years to mature before adoption happens across businesses.

Current solutions were not fully developed to deliver on all the promises of a data fabric. In the coming year, manufacturers will incorporate knowledge graphs and artificial intelligence for metadata management, a key criterion for making today’s offerings more effective. Semantic metadata will serve as an enabling factor for decentralized data management, following the data mesh paradigm. It will also provide formal context about the meaning of data elements that are governed independently, serve different business functions, and embody different business logic and assumptions. Additionally, these solutions will evolve to incorporate self-learning metadata analytics, identifying data-utilization patterns to optimize, automate, and provide access to domain-specific data through data products.

Data security, access, governance, and bias issues continue to routinely impact daily business processes, and with Generative AI getting so much attention, manufacturers will look to leverage a data fabric powered by semantic technologies to lower cost of ownership and operating costs, while improving data sharing and trust.”

Matt Watts, CTO at NetApp

Reimagining Your Data in 2024

“In today’s business climate, massive amounts of data and related metadata are created from every web click, swipe, and customer conversation, then gathered and analyzed to gain insights that shape how companies compete in the market. It seems everyone is looking for those rare data nuggets that can drive efficiency and positively impact bottom lines. So, as we wrap up 2023, I wanted to look ahead and share some areas where data-driven companies can focus their energy and investments to move the needle in 2024.”

Generative AI Needs Steady Data Pipelines

“Lately it seems everyone is looking for ways to use generative AI, whether in business or for personal use. This near ubiquity has presented potential uses that may be limited only by one’s imagination. One particularly valuable development is the use of large language models that can process text and images, reacting and responding when prompted. You may have spoken to an AI-based customer service representative in a voice call or even interacted, unknowingly, with an AI avatar that was more human than robot. As AI gives us new insights and helps us transform the way we do business, we need these models to move faster, be more intuitive, and become more like us. But they are limited by the pre-set parameters of developers and data scientists. The answer? Data pipelines that can constantly feed AI-driven applications so they can constantly learn and become more complex. Companies that look toward data pipelines for AI in 2024 will see greater innovation and become more agile.”

Free the Data!

“In 2024, we’ll see companies remove the silos between data types stored within their organizations through unified data storage, so data can be used by data-hungry AI and analytics applications. Currently, when companies dive into their data for deeper, more actionable business insights, they often find that their existing architecture and infrastructure deliver siloed data, pulled separately depending on data type (product, customer, partner, employee). Users may feel hamstrung by outdated platforms, computing models, and storage systems and locations that limit flexibility and prevent quick adaptability when business needs or market conditions demand it. We’ll see more consolidation of data silos into single streams so they can be managed as a single source. Through unified data storage, companies can make data more accessible and analytics more comprehensive as they find connections that were previously missed.”

James Beecham, Founder and CEO at ALTR

Data governance will ‘Shift Left’

“2024 will be the year when data access governance and security “shift left.” Here, organizations will implement data governance and security measures earlier in the data journey, to the left of the cloud data warehouse, which will not only protect sensitive information but also improve the overall quality of the data collected. With the increasing number of regulations regarding data privacy and security, companies that prioritize data governance and security early on will be better equipped to comply. In 2024, expect to see a surge of companies prioritizing shift-left data governance and security, taking the strong data access governance and data security capabilities available on cloud data warehouses and lakehouses and extending them back to the data as it leaves source systems.

By applying data access governance and data security practices further left in the data stream, data users ensure that the proper policies are attached and applied throughout the data’s journey to the cloud and to the data users themselves. This will be critically important in 2024 because sensitive and regulated data left unprotected before it reaches the cloud data warehouse is at high risk of exposure.”
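
A minimal sketch of what "shifting left" can look like in code: classify and tokenize sensitive columns before records leave the source system, so the policy travels with the data instead of being bolted on at the warehouse. The column names and hash-based masking rule are our illustration, not ALTR's implementation.

```python
import hashlib

SENSITIVE = {"email", "ssn"}  # assumed classification from a policy catalog

def mask(value: str) -> str:
    # Deterministic tokenization: joins still work, the raw value does not leak.
    return hashlib.sha256(value.encode()).hexdigest()[:12]

def apply_policy(record: dict) -> dict:
    # Applied at extraction time, before the record heads to the warehouse.
    return {k: (mask(v) if k in SENSITIVE else v) for k, v in record.items()}

row = {"id": 42, "email": "pat@example.com", "region": "EU"}
print(apply_policy(row))  # the email field is tokenized before load
```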

Do-it-yourself data governance will accelerate security breaches & compliance errors

“In 2023, many companies turned to do-it-yourself (DIY) data governance to manage their data. Yet, without seeking the help of data governance experts or professionals, this proved to be insufficient due to compliance gaps and the data security errors it leaves in its wake.  

While do-it-yourself data governance seemed like a cost-effective solution, it has serious consequences for companies, leaving them exposed to data breaches and other security threats. This is because DIY data governance often lacks the comprehensive security protocols and expertise that professional data governance provides. Worse, the approach often involves piecemeal solutions that do not integrate well with each other, creating security gaps and leaving data vulnerable to attack. As a result, DIY data governance may not be able to keep up with the constantly evolving data privacy landscape, including new regulations and compliance requirements. 

Companies that rely on do-it-yourself data governance are exposing themselves to significant risks and will see the repercussions of this in 2024. The importance of well implemented data governance practices in the coming year cannot be overstated. Next year, more and more companies will engage in SaaS-based data governance solutions that are able to scale with their business to ensure the security and integrity of their data.”

With increased data sharing comes increased risk

“Two things will drive an increased need for governance and security in 2024. First, the need to share sensitive data outside of traditional on-premise systems means that businesses need increased real-time auditing and protection. It’s no surprise that sharing sensitive data outside the traditional four walls creates additional risks that need to be mitigated, so next year, businesses need – and want – to ensure that they have the right governance policies in place to protect it. 

The other issue is that new data sets are starting to move to the cloud and need to be shared. The cloud is an increasingly popular platform for this, as it provides a highly scalable and cost-effective way to store and share data. However, as data moves to the cloud, businesses need to ensure that they have the right security policies in place to protect data, and that these policies are being followed. This includes ensuring that data is encrypted both at rest and in transit, and that the right access controls are in place to ensure that only authorized users can access the data. 

In 2024, to reduce these security risks, businesses will make even more of an effort to protect their data no matter where it resides.”
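
As a concrete footnote to the encryption point above, here is a minimal encryption-at-rest sketch using the `cryptography` package's Fernet recipe (symmetric, authenticated). In practice, key management is the hard part: keys belong in a KMS, not next to the data.

```python
# Minimal encryption at rest with an authenticated symmetric cipher.
# pip install cryptography
from cryptography.fernet import Fernet

key = Fernet.generate_key()  # in production: fetched from a KMS, never stored with the data
f = Fernet(key)

record = b"customer_id=42,email=pat@example.com"
ciphertext = f.encrypt(record)          # what lands on disk / in the cloud store
assert f.decrypt(ciphertext) == record  # only key holders can read it back
```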

The increase in data consumption will create the need for centralized and streamlined data access

“The anticipated growth in data consumption and data sources in 2024 will create a greater need for centralized and streamlined data access. The rapid expansion of data sources and the volume of data being generated will make it even more difficult for organizations to manage and utilize data effectively, making data access governance and streamlined security a priority.  

In the coming year, organizations will centralize data access to improve their decision-making processes, providing users with accurate and up-to-date information on the data they have stored. As the amount of data increases, they will also use this to keep tabs on their stored data and understand who is accessing it, and how frequently. Using a system where data can be easily searched, sorted, and accessed enables business users to quickly find the information they need and ensures that data access is granted only to those approved to receive it.”

Phillip Merrick, Co-Founder and CEO at pgEdge

“Regulatory pressures and the recognition of cloud concentration risk mean more business-critical applications will need to be truly multi-cloud, able to operate across — and failover between — multiple clouds.”

“We will see distributed databases simplify a lot of messy, complicated and costly architectures that rely on middleware and spaghetti API connections to just move data around. (And that’s coming from someone who sold literally billions of dollars’ worth of middleware software!)”

“In 2024 more organizations will realize the only sensible approach to data residency and data sovereignty issues is to use selective replication to keep local data local while sharing data globally as permitted.”
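For readers who want to see what selective replication can look like in practice, here is a hedged sketch using PostgreSQL 15+ row-filtered logical replication (pgEdge builds on Postgres, though its actual mechanism may differ; the connection string, table, and region column are illustrative):

```python
# Sketch: publish only EU rows to the EU replica, so US rows never leave
# the region. Requires PostgreSQL 15+ (pip install psycopg2-binary).
import psycopg2

conn = psycopg2.connect("dbname=app host=eu-primary.example.com")  # hypothetical DSN
with conn, conn.cursor() as cur:
    cur.execute("""
        CREATE PUBLICATION eu_only FOR TABLE customers
        WHERE (region = 'EU');
    """)
# On the EU subscriber, a matching subscription pulls just those rows:
#   CREATE SUBSCRIPTION eu_sub
#     CONNECTION 'host=eu-primary.example.com dbname=app'
#     PUBLICATION eu_only;
```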

Yeshwant Mummaneni, Chief Engineer, Cloud at Altair

Blockchain Plays the Hero in Securing Data Lineage

“As AI/ML models play key roles in critical decision-making, whether supervised by humans or operating fully autonomously, model provenance and lineage become crucial. The cryptographic foundations that give blockchain its immutability of records, digital identities, signatures, and verifications will become a key aspect of enterprise AI, providing tamper-proof model provenance.”
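As a rough illustration of the idea, the following standard-library Python sketch chains provenance records with SHA-256 hashes, blockchain-style, so any later edit to a model's lineage is detectable; the record fields are illustrative assumptions:

```python
import hashlib, json, time

def add_record(chain: list, event: dict) -> None:
    # Each record commits to the previous one via its hash.
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    body = {"event": event, "ts": time.time(), "prev": prev_hash}
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    chain.append({**body, "hash": digest})

def verify(chain: list) -> bool:
    """Recompute every hash; any edit to an earlier record breaks the chain."""
    for i, rec in enumerate(chain):
        body = {k: rec[k] for k in ("event", "ts", "prev")}
        if hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest() != rec["hash"]:
            return False
        if i and rec["prev"] != chain[i - 1]["hash"]:
            return False
    return True

lineage = []
add_record(lineage, {"step": "trained", "model": "fraud-v3", "data": "tx_2023_q4"})
add_record(lineage, {"step": "deployed", "model": "fraud-v3", "env": "prod"})
print(verify(lineage))                      # True
lineage[0]["event"]["data"] = "tampered"
print(verify(lineage))                      # False: the edit is detectable
```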

Angel Vina, CEO & Founder at Denodo

Data Anti-Gravity Will Prevail

“The notion of data gravity, an analogy for the tendency of data to attract additional applications and services, no longer holds. Every organization with a modern data strategy needs a data warehouse alongside a data lake, if not multiple ones, to fulfill its business needs. In the last two decades, data warehouses and data lakes became popular as solutions to enterprise data silo problems, yet they created even bigger ones. This is because data warehouses and data lakes span both on-premises and cloud systems, and they are often geographically dispersed. Also, even though every cloud service provider tries to solve many data and analytics problems independently, most organizations run their data and analytics in a multi-cloud environment, cherry-picking products and services from two or more cloud service providers.

This is why data anti-gravity, where data and applications remain distributed across regional and cloud boundaries, will be the new norm in 2024 and beyond. Other factors contributing to data anti-gravity will be the rising costs of data replication, data sovereignty, local data governance laws and regulations, and the requirement for accelerated speed-to-insight. As the data anti-gravity trend continues, data management leaders should invest in technologies that are built on the premise of distributed data management.”

Data Products Will Rise in Importance

“2024 will be a pivotal year for the ascent of data mesh, which embraces the inherently distributed nature of data. In contrast with traditional, centralized paradigms in which data is stored and managed by a central data team that delivers data projects to business users, data mesh is organized around multiple data domains, each of which is managed by the primary business consumers of that data. In a data mesh, the role of IT shifts to providing the foundation for data domains to do their work, i.e., the creation and distribution of data products throughout the enterprise.   

The turning point will be the realization that data products should be treated with the same level of importance as any other product offering. Take, for instance, a Tylenol capsule: its value is not just in the capsule itself but in the comprehensive package that earns consumer trust—from the description and intended use to the ingredient list and safety measures. Similarly, data catalogs act as the crucial “packaging” that turns raw data into reliable, consumable assets.

In this data-centric era, it is not enough to merely package data attractively; organizations need to enhance the entire end-user experience. Echoing the best practices of e-commerce giants, contemporary data platforms must offer features like personalized recommendations and popular product highlights, while also building confidence through user endorsements and data lineage visibility. Moreover, these platforms should facilitate real-time queries directly from the data catalog and maintain an interactive feedback loop for user inquiries, data requests, and modifications. Just as timely delivery is essential in e-commerce, quick and dependable access to data is becoming indispensable for organizations.”

Data Security and Governance Will Need To Be Simplified

“Poorly integrated data impacts the agility of an organization on many levels, but this impact is perhaps felt most strongly in data security and governance. Because it takes time to update myriad siloed systems individually, it is impossible to secure or govern all enterprise systems simultaneously.

To meet this challenge, organizations are leveraging global policies for data security and governance. Global data security policies can be based not only on user roles but also on location, so that a person on vacation might not be able to access the data they normally could from the main office. Global data governance policies, too, can automatically standardize the spelling of certain terms across the different systems within a company.

However, in order to synchronize the application of global policies in real time, such data security and governance implementations require the foundation of a logical approach to data management, and such an approach is covered in the next section.”
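As a toy illustration of such global, attribute-based policies, the following sketch evaluates role and location together in one place rather than per silo; the rule set is an invented example, not any product's policy language:

```python
from dataclasses import dataclass

@dataclass
class Request:
    user: str
    role: str
    location: str   # e.g. office network vs. elsewhere
    resource: str

POLICY = [
    # (role, allowed_locations, resource_prefix) - illustrative rules
    ("analyst", {"hq", "vpn"}, "sales/"),
    ("steward", {"hq"},        "pii/"),
]

def allowed(req: Request) -> bool:
    # One global check replaces per-system, per-silo configuration.
    return any(
        req.role == role
        and req.location in locations
        and req.resource.startswith(prefix)
        for role, locations, prefix in POLICY
    )

print(allowed(Request("ana", "analyst", "vpn",   "sales/q4.csv")))  # True
print(allowed(Request("ana", "analyst", "beach", "sales/q4.csv")))  # False: on vacation
```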

Sunny Bains, Software Architect at PingCAP

Shrinking IT budgets force data management reckoning 

“With IT budgets shrinking, many high-flying companies will find their early data management decisions coming back to haunt them. For years, fast-growing businesses have tended to develop their own custom data infrastructures — sometimes by choice, other times by necessity. The trouble is, those kinds of bespoke systems are extremely complex, and the cost to maintain them is very high. You need specialists in Kafka, Redis, MySQL and 20 other services to make them all work together. The more successful you become, the greater the challenge.

If you have unlimited resources, you can just throw more people and servers at the problem. But when the flow of easy money stops, you need to find more sustainable solutions and try to consolidate to reduce cost and simplify management. Already you see companies like Pinterest adopting technologies like distributed SQL to simplify their management load while improving their ability to scale their services at a lower cost. That’s just the tip of the iceberg. In 2024, we’ll see more rocket-ship companies make similar moves as they confront the need to do more with less. There is a clarion call for greater efficiency and simpler operations — for example, to consolidate databases and reduce operational complexity by using more modern, multi-tenant-capable distributed SQL databases.”

Barzan Mozafari, CEO at Keebo

2024 will be the year data teams must take action on Snowflake costs

“In 2023, Instacart, one of Snowflake’s biggest customers, revealed they saved 50 percent on their Snowflake costs. This led to a lot of discussion online and an unprecedented amount of information from Snowflake, including:

  • Many customer sessions at Snowflake Summit on cost savings
  • The release of guides on optimizing costs
  • A new set of budget reporting tools

It will be natural, then, for the finance officers of Snowflake customers to expect their data teams to replicate Instacart’s success. And with so much information from Snowflake on how to do it, there will be little room for excuses. The only problem is that data teams are already overburdened, with no time to optimize. This will lead to one of three outcomes: failing to reduce costs, hiring additional data engineers, or turning to automated solutions.”
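For teams starting down that path, here is a hedged sketch of a first cost report built on Snowflake's documented ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY view; the connection parameters and warehouse name are placeholders:

```python
# Credits burned per warehouse over the last 30 days, costliest first.
# Requires: pip install snowflake-connector-python
import snowflake.connector

conn = snowflake.connector.connect(
    account="myorg-myaccount", user="cost_bot", password="...",  # placeholders
)
cur = conn.cursor()
cur.execute("""
    SELECT warehouse_name, SUM(credits_used) AS credits
    FROM snowflake.account_usage.warehouse_metering_history
    WHERE start_time >= DATEADD('day', -30, CURRENT_TIMESTAMP())
    GROUP BY warehouse_name
    ORDER BY credits DESC
""")
for name, credits in cur.fetchall():
    print(f"{name}: {credits:.1f} credits")

# A common quick win is aggressive auto-suspend on idle warehouses:
# cur.execute("ALTER WAREHOUSE reporting_wh SET AUTO_SUSPEND = 60")
```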

Ciaran Dynes, Chief Product Officer at Matillion

“The role of the data engineer has radically expanded over the past decade. We even used to call them ETL developers; then it was big data engineers. The skills required of the modern data engineer range from ETL designer to data modeler, SQL developer, site reliability engineer, and security analyst, and now, with the advent of AI and machine learning, to Python programmer and data scientist.”

“There are new technologies to learn, from vector databases to large language models to train and tune, whether it’s ChatGPT, BERT, Claude, Lambda or beyond, plus new AI tools to use, from AWS Bedrock, Azure OpenAI, Anthropic, and Databricks: it is all linguistic soup.”

“The next 12 months will be the year that tech companies make life simpler for data engineers.”

“Tools will come to market and be integrated into existing platforms, enabling teams to add generative AI to existing data pipelines and to deploy these models internally, so that users can interact with them live, just as they already do with ChatGPT.”

“Regardless of the tools that come to market, the next year will also see huge demand for data engineers to retrain: to master prompt engineering, to learn how to fine-tune these models, and to massively increase their productivity. The next year will see data engineers’ lives get so much more interesting.”

Chandler Hoisington, CPO at EDB

“We’ll see shifts in customer experiences, products, and operational efficiencies: Customer experiences will become more automated, personalized and contextualized; these experiences should be as close to a “natural human interface” as possible. Additionally, products will be delivered as-a-service innovations driven through software-defined, data-driven, AI-infused functionality. We will also continue to see an increase in rich edge data analytics and generative AI to power operational efficiencies.” 

“Although customers are at different stages of their open source and data ecosystem journeys, their common challenges are leveraging the full potential of their data and ensuring they have the reliable, secure, available data needed for AI or intelligent applications. In order to address these challenges, organizations will look to simplify their application stacks to realize the value of technologies like AI and LLMs. Given this, businesses should focus on two key factors when assessing whether to adopt new technologies: need and manageability.” 

Jozef de Vries, Chief Product Engineering Officer at EDB

“While AI/Generative AI is a significant trend, the biggest shift has been towards “Data Democratization,” enabling wider access to data across organizations without compromising security. As we move into 2024, this trend will evolve with enhanced focus on data privacy and governance, balancing accessibility with compliance.”

“In the next year, the database space is expected to evolve significantly. The integration of AI and ML will make data language more akin to natural language, enhancing accessibility and supporting decision-making for a broader range of users with varying skill levels.”

“There will also be a greater focus on the data ecosystem, with increased partnerships between ecosystems and Cloud Service Providers (CSPs). Furthermore, the balance between operational databases and data warehouses is shifting, with analytics poised to surpass transactional workloads in importance and usage.”

Rohit Choudhary, CEO and Co-Founder at Acceldata

Real-Time AI Monitoring: A Data-Driven Future

“2024 will witness the rise of real-time AI monitoring systems, capable of detecting and resolving data anomalies instantaneously. This transformative technology will ensure data reliability and accessibility, especially for the ever-growing volume of unstructured data.”
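A rolling z-score over a data-quality metric is one minimal way to picture such real-time monitoring; the thresholds and metric stream below are illustrative assumptions:

```python
# Flag anomalies in a streaming metric (e.g., rows ingested per minute)
# using a rolling z-score over the most recent observations.
from collections import deque
import statistics

window = deque(maxlen=60)   # last 60 observations

def check(value: float, z_threshold: float = 3.0) -> bool:
    """Return True if `value` is anomalous relative to the recent window."""
    anomalous = False
    if len(window) >= 10:                       # need a baseline first
        mean = statistics.fmean(window)
        stdev = statistics.pstdev(window) or 1e-9
        anomalous = abs(value - mean) / stdev > z_threshold
    window.append(value)
    return anomalous

stream = [1000, 1020, 990, 1010, 1005, 995, 1015, 1000, 1008, 992, 1003, 15]
for v in stream:
    if check(v):
        print(f"anomaly: {v} rows/min")   # fires on the sudden drop to 15
```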

Compliance Experts: Navigating the Data Regulatory Landscape

“In 2024, compliance gurus will emerge as essential players in ensuring data compliance amidst a complex regulatory landscape. Their expertise will be crucial in managing diverse data types, particularly unstructured data, which often poses unique compliance challenges.”

Collaboration: Bridging the Data Silos

“The traditional silos between IT, legal, and business departments will crumble in 2024, giving way to a more collaborative approach. This cross-functional synergy will ensure that data, including unstructured data, is not just collected but strategically utilized to drive business value.”

AI and ML Specialists: Demystifying Unstructured Data

“AI and ML will play a pivotal role in deciphering the unstructured data puzzle in 2024. Experts who can harness the power of AI and ML to extract insights from unstructured data, such as social media posts, videos, and customer reviews, will be highly sought after.”

Data Management: The Key to Unlocking Insights

“Data management will evolve beyond mere data storage in 2024. Organizations will recognize the strategic value of weaving data, including unstructured data, into their business strategies. This shift will unlock a wealth of insights and redefine business decision-making.”

Molham Aref, Founder and CEO at RelationalAI

Knowledge Graphs will Help Users Eliminate Data Silos

“As enterprises continue to move more data into a data cloud, they are collecting hundreds, thousands, and sometimes even tens of thousands of data silos in their clouds. Knowledge graphs can guide language models across all of these silos by leveraging the relationships between the various data sources. In the new year, we will see a variety of established and novel AI techniques emerge that support the development of intelligent applications.”
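As a toy illustration, the sketch below models a knowledge graph whose edges cross silo boundaries, so one traversal (or one LLM tool call) can hop from a CRM record to an ERP order to a document store; all node names and silos are invented:

```python
graph = {
    # node: list of (relation, neighbor) edges spanning different silos
    "customer:42": [("placed", "order:9001"), ("stored_in", "silo:crm")],
    "order:9001":  [("contains", "sku:ABC"), ("stored_in", "silo:erp")],
    "sku:ABC":     [("described_by", "doc:spec-ABC"), ("stored_in", "silo:dms")],
}

def neighborhood(node: str, depth: int = 2) -> set:
    """Collect everything reachable from `node`, across silos, in `depth` hops."""
    seen, frontier = {node}, [node]
    for _ in range(depth):
        frontier = [nbr for n in frontier
                    for _, nbr in graph.get(n, []) if nbr not in seen]
        seen.update(frontier)
    return seen

# One traversal spans the CRM, ERP, and document-store silos:
print(neighborhood("customer:42"))
```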

Peter Shafton, CTO at ngrok

Organizations will Take Back Control of Their Data

“Data management in 2024 will significantly shift towards greater accessibility and control. While the past decade witnessed a rush towards cloud-based data solutions, the pendulum is swinging back towards more self-management. The reasons behind this shift are twofold: privacy and cost-effectiveness. The constant threat of data breaches and the need for more stringent access control have made businesses wary of relying solely on external cloud platforms. Additionally, the unpredictability of cloud data storage and processing costs has led organizations to seek more predictable and cost-effective solutions. This trend is also facilitated by a proliferation of accessible and user-friendly data management tools, often originating from open-source solutions pioneered by tech giants like Uber, Netflix, and Airbnb.”

Brett Hansen, Chief Growth Officer at Semarchy

Government Regulations Continue to Drive Adoption of New Data Initiatives

“As federal governments become more data-savvy, they’ll realize the potential data holds to answer burning questions about consumer health and safety. For example, we’ve all read headlines about foodborne illness outbreaks at popular fast-food chains. Tracing these outbreaks is often arduous because many organizations — even enterprises — don’t understand the provenance of their ingredients. Remember, localized supply chains are incredibly interconnected and complicated. But with excellent data hygiene, this becomes a non-issue. Imagine a data solution that tracks your restaurant locations and categorizes which of your ingredients come from where. As more retailers adopt data solutions with these capabilities, governments will catch on, and regulations will dictate data provenance strategies as a must-have.”
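A tiny sketch of what such provenance tracking could look like as data structures, with every name and record invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class Lot:
    sku: str
    lot_id: str
    supplier: str
    farm_region: str

@dataclass
class Shipment:
    lot_id: str
    restaurant: str

lots = [Lot("romaine", "L-17", "GreenCo", "Salinas Valley"),
        Lot("romaine", "L-18", "FreshFarms", "Yuma")]
shipments = [Shipment("L-17", "store-004"), Shipment("L-18", "store-009")]

def trace(restaurant: str) -> list[Lot]:
    """Which lots (and therefore suppliers and regions) did this location receive?"""
    received = {s.lot_id for s in shipments if s.restaurant == restaurant}
    return [l for l in lots if l.lot_id in received]

print(trace("store-004"))   # traces back to the GreenCo lot from Salinas Valley
```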

Data Governance Evolves into Data Intelligence

“Data loss prevention and protection strategies ruled the roost during the early days of data governance. Although still useful for meeting governmental requirements, these tools may impede the effective exploitation of data. When data is locked away tightly, stewards can’t understand how their data is used, moved or accessed, so they cannot effectively improve their data storage and implementation practices. But I foresee a change coming soon.

Yes, data governance will remain vital for maintaining compliance. However, evolved data intelligence capabilities have now emerged, allowing practitioners to not only control data but also understand it — and these capabilities are a must in the modern business world. Mining metadata to comprehend its lifecycle will allow teams to more effectively support their business requirements. These enlightened governance strategies will help organizations achieve mutual goals of data compliance while also uncovering granular data insights.”

Bruce Kirchner, CTO at EvolveWare

“Expect data sourcing and legal challenges with using GenAI for code generation next year. In 2024, the challenges associated with using GenAI to generate code will surface more explicitly. These include difficulty sourcing the appropriate types and amounts of data to produce sound results, driven by fear of proprietary information being leaked. Navigating the legal landscape and mitigating intellectual property violations will also be a factor, as organizations encounter issues around authorship and sourcing, permissions, open-source licenses and other considerations. From a technology standpoint, we’ll see more time spent on R&D to improve models, as opposed to the focus on building models that we saw in 2023.

Modernizing to serverless and microservices-based architectures will bring new security challenges. Enterprises are increasingly seeking to modernize monolithic applications to new architectures such as microservices and serverless, and we’ll see this accelerate next year. However, alongside the benefits of these modern approaches, this trend will come with a new set of security challenges. This includes the need to track potential cybersecurity issues across multiple microservices, challenges ensuring that sensitive data is secure in the various states associated with serverless configurations, and the potential for data exposure that comes with sharing servers across multiple parties. These concerns will compel organizations to make security a core focus of their modernization initiatives.”

David Boskovic, Founder and CEO at Flatfile

Data Will Be the Top Rate Limiter on Innovation

“The rapid expansion of data will continue to be a dominant trend in 2024. However, the ability to efficiently gather, process, and utilize this data will become the critical factor limiting or accelerating innovation within organizations. The challenge will lie in developing methods to quickly and securely assimilate this growing data influx, converting it into actionable insights. Companies that can effectively manage this data deluge, turning it into a strategic asset for innovation, will gain a competitive edge in the increasingly data-driven business landscape.”

Mark Van de Wiel, Field CTO at Fivetran

The pendulum for build versus buy is going to swing back to buy in 2024

“For most of 2022 and 2023, organizations felt they could save money with a build strategy. But guess what: engineering effort is not free, and it does not come with the security or availability of a bought solution. Hence the build strategy hurts productivity big time.”

RAG (Retrieval Augmented Generation) will be the rage in 2024

“Companies will take foundational LLMs (Large Language Models), train and tune them with their own data, and churn out chatbots, copilots and similar utilities for external and internal productivity gains.”
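As a minimal sketch of that RAG pattern, the example below uses TF-IDF retrieval as a stand-in for a real embedding model and elides the final LLM call; the documents and the llm.generate call are illustrative assumptions:

```python
# Retrieve the most relevant internal documents, then ground the prompt in them.
# Requires: pip install scikit-learn
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "Refunds are processed within 5 business days of approval.",
    "Enterprise SLAs guarantee 99.9 percent uptime.",
    "Support tickets are triaged by severity, then first-in first-out.",
]

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(docs)

def retrieve(question: str, k: int = 2) -> list[str]:
    q_vec = vectorizer.transform([question])
    scores = cosine_similarity(q_vec, doc_vectors)[0]
    return [docs[i] for i in scores.argsort()[::-1][:k]]

question = "How long do refunds take?"
context = "\n".join(retrieve(question))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
# response = llm.generate(prompt)   # hypothetical call to the company's tuned model
print(prompt)
```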

Satyen Sangani, CEO and Co-Founder at Alation

Trusted data will become the most critical asset in the world

“The critical role of trusted data in AI systems is becoming a cornerstone for the future of technology. Ensuring the information and data that come out of the AI system are trustworthy is just as critical. In a world that’s inching closer and closer to artificial general intelligence (AGI), knowing what to trust and who to trust will be critical to everything we learn and everything we think we know.

Highlighting this shift, Forrester predicts that domain-specific, Large Language Model (LLM)-infused digital coworkers will soon assist 1 in 10 operational tasks. When tailored to specific business needs, these LLMs promise substantial investment returns. This trend has led organizations to focus more on finding, understanding, and governing high-quality, dependable data, which is vital for training AI models tailored to specific business requirements.

The result is that AI governance is going to gain importance quickly. It involves more than just managing data; it’s about understanding the entire lifecycle of information and models. The analogy of data as the new oil now seems insufficient in the era of generative AI and the challenges hallucinations bring. Merely amassing and analyzing large data sets is no longer adequate in today’s business environment.

In 2024 and beyond, trusted data – and all the tools associated with building trust in data – will be the number one commodity for organizations.”

2024 will be the year of the AI and data C-Suite leader

“If 2023 is the year that enterprise AI burst onto the scene, then 2024 will be a year of consolidation as businesses look to understand how to use it to gain a competitive advantage and comply with inevitable future regulations. To future-proof AI deployments, organizations will increasingly look to build out a role at the C-Suite level to oversee both AI innovation and compliance, but that won’t necessarily be in the form of a Chief AI Officer. Instead, AI will likely create a new generation of Chief Data Officers where existing data leaders develop new skill sets. Just as we’ve seen the rise of Chief Data and Analytics Officers, we could be about to see the start of a fresh generation of Chief Data and Artificial Intelligence Officers focused on ensuring the data foundations of AI models are compliant with new legislation and of a high enough quality to gain the business a competitive advantage. What’s certain is the rise of AI Governance committees, taking cross-functional roles in ensuring safe and efficient enterprise AI and partnering with Legal, Ethics, Security, and Privacy constituencies in the same way that Data officers have in years past.”

Jonathan Bruce, VP of Product Management at Alation

Building a strong data culture is no longer an option, but a strategic imperative

“In 2023, enterprises faced unparalleled challenges. As business leaders eagerly integrated cutting-edge technologies like generative AI into their strategies and raced to empower people with trusted data, they also had to contend with the pervasive influence of macroeconomic factors on every decision. Organizations with strong data leaders who successfully nurtured a data-centric mindset drove innovation and agility and gained a competitive edge, even amid today’s challenging market. 89 percent of organizations with strong data leadership say their organizations met or exceeded revenue goals in the past year. In 2024, data literacy (often coming as consumable data products) – a pillar of data culture – will be the defining factor between business success and failure. Organizations that empower technical and non-technical folks alike to understand data better and use trusted information for daily decisions will innovate faster, exceed revenue goals, and surpass the competition.”

AI and cloud synergy will usher in a new era of innovation and efficiency

“While the pandemic fueled the cloud computing frenzy, this year’s macroeconomic instability and tighter customer budgets were headwinds for cloud adoption. In the cyclical nature of any market, LLMs are emerging as the cloud’s greatest ally. The cloud’s scalability, adaptability, and cost-efficiency complement AI’s data-driven capabilities, offering a vast data storage and processing reservoir crucial for AI’s transformative power. Innovations are gathering pace, too, and commodity cloud infrastructure isn’t about to be left behind, reducing accessibility barriers for many. Organizations leveraging cloud technology gain access to extensive resources for training AI models and analyzing large datasets, benefiting from the cloud’s ability to handle diverse workloads efficiently. This democratization of AI, offered through cloud services, empowers organizations of all sizes, enabling seamless collaboration and rapid deployment of AI applications. Since this makes the cloud the leading conduit for organizations to usher in a new era of AI-driven innovation and efficiency, 2024 will bring a resurgence of cloud computing and transformation.”

Discoverable value will be synonymous with adoption

“Data literacy will be driven by the ability to package value in consumable components that deliver it directly to those who need it. Increasingly, enterprises will view the output of GenAI models as critical IP, giving them the competitive edge to differentiate and provide value and innovations to their customers in entirely new ways. The advent of highly specialized governance capabilities, enabling these models to be trusted and consumed as data products, will maximize their utility.”

Diby Malakar, VP of Product Management at Alation

Organizations that can’t communicate data ROI will be left behind

“As data, analytics, and AI present vast opportunities, organizations can no longer tolerate uncoordinated data efforts that lack a direct link to value creation. Harvard Business Review highlights that most Chief Data Officers (CDOs) struggle to quantify the business outcomes derived from data and analytics. Despite the growing importance of data management across industries, many teams find it challenging to define the true Return on Investment (ROI) of data initiatives. There is an urgent need for a comprehensive framework that helps organizations benchmark and improve their data management capabilities, now considered vital strategic undertakings. A significant obstacle remains: The lack of effective tools to measure the business impact of data initiatives. Moreover, the diverse data analysis methods across teams call for a unified system to assess usage statistics and nurture a consistent data culture. For organizations focused on uncovering new business opportunities, scaling innovative initiatives, and boosting the ROI of operational activities across various functions, mastering data ROI is not just beneficial, but essential.”

2024: The year of the data product

“Data mesh represents a transformative shift in modern data architecture, moving from centralized models to decentralized approaches. This paradigm shift empowers domain-specific teams with the autonomy to manage and govern their data independently, enhancing agility and understanding. Alongside this shift is the rise of ‘data products,’ akin to physical products, which redefine the focus of data and analytics. Instead of being project-based or one-time deliverables, these data products embody a continuous lifecycle, paralleling tangible goods. This will be a rising trend in 2024, with more organizations adopting data products to deliver value and drive significant business outcomes.”

Steve Santamaria, CEO at Folio Photonics

A Leap Towards Greener Data Management

“2024 is poised to bring significant strides in eco-friendly data handling. We can expect to see companies putting a greater emphasis on making data storage more efficient. This could involve streamlining how data is stored, embracing server virtualization, and moving towards advanced yet less power-hungry cooling systems. The goal here is twofold: to trim down data centers’ environmental impact and to shave off some of their operating expenses. There’s also going to be a tilt towards long-lasting, energy-smart storage options, such as optical storage, for a more sustainable data management approach.”

Sendur Sellakumar, CEO at Dremio

Generative AI hype train will continue to grow exponentially 

“I think we’re still in a GenAI hype cycle, and I tend to be very practical. Things around GenAI have been very compelling. We hardly talked about GenAI a year ago; now we do, which is excellent.

Generative AI will be the future of user interfaces. All applications will embed generative AI to drive user interaction and guide user productivity. Companies are embedding GenAI to do semantic searching to solve some of those old data problems – discovery becomes easier, and creating pipelines becomes more accessible.”

Read Maloney, CMO at Dremio

Practice makes perfect

“In 2024, we’re going to see development best practices, in terms of code, make their way into data. The concepts of data uptime and data downtime, which are related to data observability and part of data operations, will come in. 

Users will have different ways to ensure quality, and, if something goes wrong, to pinpoint where it went wrong, saving precious time. The increased complexity and costs are taking engineers away from what they want to do: much higher-value projects than shoveling data all day long.”

A new sheriff in town 

“The adoption of Apache Iceberg will be huge in 2024 – it’s going to be the open format for interoperability and the customer’s choice for flexibility. For us, it comes back to the customer: they want to own their own data, and we think Apache Iceberg is the format for that. It’s going to take over as the open standard in 2024.”
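For the curious, here is a hedged sketch of creating an Iceberg table from Spark; it assumes the iceberg-spark-runtime jar is on the classpath, and the catalog name and warehouse path are illustrative:

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    .config("spark.sql.catalog.local", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.local.type", "hadoop")
    .config("spark.sql.catalog.local.warehouse", "/tmp/iceberg-warehouse")
    .getOrCreate()
)

spark.sql("CREATE TABLE IF NOT EXISTS local.db.events (id BIGINT, ts TIMESTAMP) USING iceberg")
spark.sql("INSERT INTO local.db.events VALUES (1, current_timestamp())")

# Any Iceberg-aware engine (Spark, Trino, Flink, Dremio, ...) can now read
# local.db.events; Iceberg's snapshot history enables time travel:
spark.sql("SELECT * FROM local.db.events.snapshots").show()
```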


Register for Insight Jam (free) to gain exclusive access to best practices resources, DEMO SLAM, leading enterprise tech experts, and more!
