Calculating the Damage of a Data Breach

By Dan Draper, Founder and CEO at CipherStash

Solutions Review’s Contributed Content Series is a collection of contributed articles written by thought leaders in enterprise software categories. Dan Draper of CipherStash puts a number on the real cost of a data breach, and walks us through best practices to prevent them.

As more people engage with the internet, and as more businesses collect user data to gain insights and stay connected to potential customers, the number of data compromises — and the overall cost of those breaches — has exploded.

In 2021, a record 1,862 data compromises were recorded – up from just 157 in 2005. Although the number of hacking incidents dipped a year later, the total number of people affected by those breaches – more than 422 million – increased. The average total cost of a data compromise for a business hit a record high in 2022 – only to be topped in 2023, at $9.48 million.

Calculating these exact numbers requires a degree of speculation, but suffice it to say that falling short in protecting sensitive user information is extremely bad business. And it may be even worse for consumer-facing brands.

Calculating the Damage of a Data Breach

The largest confirmed data compromise of 2023 (through September) was marked by the exposure of 3.8 billion user records, responsibility for which fell to a company called DarkBeam. It’s an objectively awful look – one with undeniable consequences – for an organization whose business is monitoring and managing supply chain cybersecurity risk. But ask yourself: Does the average consumer have a clue what DarkBeam does? Have they even heard of the company?

Compare that scenario with that of Yum! Brands. The parent company of fast food chains KFC, Taco Bell and Pizza Hut, Yum! revealed in the spring that a January incident had exposed certain corporate data, and upon further review found that some employee data had also been compromised. Although it seems no customer information was involved in the breach, and even though Yum! only handles sensitive data as a matter of course – not as a core area of its business – the implications could be nightmarish for a company featuring so many recognizable pillars.

After all, the average consumer likely eats at Taco Bell more often than they have ever interacted with, say, DarkBeam. Yum! reportedly closed almost 300 locations in the U.K. in the wake of its data breach, and it has cost the company dearly to clean up the mess. But given the organization’s profile, as well as the vast landscape of alternatives from which consumers have to choose, the real damage to its brands may be far greater than the math suggests.

Why Hacks Happen – and How to Avoid Them

The uncomfortable irony is that many, and perhaps most, data-breach incidents today might have been prevented by looking at data protection through a different lens. Among the most common sources of modern data leaks are:

Misconfigured software settings
Social engineering
Recycled or leaked passwords

These issues are typically considered as part of an end-user security program. Educating staff about good password hygiene or how to avoid falling prey to phishing scams are important strategies but put significant onus on individuals to maintain data security. Even if the majority of employees do an excellent job maintaining best practices, there will always be some who fall victim to an account compromise or scam.

When assessing the data security risk within an organization, we must ask two fundamental questions:

Is the data available strictly on a need-to-know basis?
Is every data access recorded accurately?

How an organization answers these questions can have a profound impact on its ability to defend against data breaches.

Limiting Access

Seventy percent of employees have access to data they shouldn’t, so the goal is to limit data access only to those who genuinely need it to do their job. This means that if an employee is compromised, the likelihood an attacker will get something of value is greatly diminished.

However, eliminating or even minimizing data access in practice can be a challenging task. Data is like sand after a day at the beach. Somehow, even after thoroughly washing your feet, a little sand seems to end up everywhere from in your sneakers to the floor of your car. While the database is often where data’s journey starts, you’ll almost certainly find some in a spreadsheet, on a file server or in multiple forms across a dozen different systems.

The Core Issue of Data Security

Herein lies the core issue: traditional approaches to protecting data don’t actually protect data at all! They protect the systems where data resides. But strong access controls on the database provide no protection for data that has found its way into a spreadsheet on an employee laptop.

Encryption-in-use, traditionally known as row-level or field-level encryption, promises to protect data directly. Newly collected data can be encrypted immediately and remain so from that point forward, no matter where it might end up. Such an approach can even cryptographically tag data so that access and retention policies can be applied when access is eventually needed by an authorized user. The encryption becomes a universal, deny-by-default layer that works anywhere and everywhere.

Despite its promise, encryption-in-use has had little adoption outside of specific use cases like protecting payment data. The reason is simple: encrypting individual data records means standard data tooling like SQL no longer functions correctly.

Take a database storing customer names and email addresses. Common queries might include fetching a customer by their email address or performing a partial match on name (a search for “Dan Dr” might match “Dan Draper” and “Daniella Dresden”). It is also often desirable to sort results alphabetically or by update recency. Such tasks are a breeze for any modern database, but encrypt the records with encryption-in-use and everything stops. None of the queries mentioned above will work any longer.
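To make the problem concrete, here is a minimal sketch of why those queries stop working. It assumes a toy randomized cipher built from the Python standard library purely for illustration; `toy_encrypt`, `toy_decrypt` and `KEY` are hypothetical names, and the cipher is not secure and not what any real product uses. Because each value is encrypted with fresh randomness, the database can no longer compare, prefix-match or order the stored bytes.

```python
import hashlib
import secrets

KEY = secrets.token_bytes(32)  # hypothetical demo key


def toy_encrypt(plaintext: str, key: bytes = KEY) -> bytes:
    """Toy randomized cipher for illustration only. NOT secure for real use."""
    nonce = secrets.token_bytes(16)
    data = plaintext.encode("utf-8")
    # Derive a keystream from key + nonce, then XOR it with the data.
    keystream = hashlib.shake_256(key + nonce).digest(len(data))
    return nonce + bytes(a ^ b for a, b in zip(data, keystream))


def toy_decrypt(ciphertext: bytes, key: bytes = KEY) -> str:
    """Reverse toy_encrypt using the nonce stored in the first 16 bytes."""
    nonce, body = ciphertext[:16], ciphertext[16:]
    keystream = hashlib.shake_256(key + nonce).digest(len(body))
    return bytes(a ^ b for a, b in zip(body, keystream)).decode("utf-8")


names = ["Dan Draper", "Daniella Dresden", "Alice Ng"]
encrypted = [toy_encrypt(n) for n in names]

# Exact match fails: the same name encrypts to different bytes every time.
exact_hits = [c for c in encrypted if c == toy_encrypt("Dan Draper")]

# Partial match fails: the ciphertext of "Dan Dr" is not a prefix of anything stored.
prefix_hits = [c for c in encrypted if c.startswith(toy_encrypt("Dan Dr"))]

# Sorting "works" but is meaningless: the order follows the random nonces, not the names.
ciphertext_order = sorted(encrypted)
```

Both `exact_hits` and `prefix_hits` come back empty even though matching records exist, which is exactly the failure described above.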

Fully Homomorphic Encryption

A recent attempt to address this, Fully Homomorphic Encryption (FHE), makes standard operations, like those used by a database server, work on data even when it is encrypted. But FHE has failed to live up to the hype due to woeful performance and eye-watering costs. For perspective, a query that might take a standard database a tenth of a second over unencrypted data would take several hours if the data were encrypted using FHE. Even the most paranoid and cashed-up companies have given up on it.

However, another approach is not only showing great promise, it’s now being actively used in real-world applications. After a number of recent advancements, searchable encryption is finally allowing encryption-in-use to deliver on its promise: to protect data directly rather than just the systems in which it resides. Vendors like MongoDB, Vaultree and my own company, CipherStash, are making encryption-in-use possible using the technology available today.

Unlike general-purpose FHE, which supports all mathematical operations at the expense of efficiency, searchable encryption specializes in just the kinds of operations needed to retain query functionality in the database. That trade-off makes it over 100,000x faster than FHE. Database queries using searchable encryption are so fast that end-users don’t even realize the data is encrypted (that is, until they try to access something they’re not supposed to!).
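As a rough illustration of the specialization idea, the sketch below shows one simple and widely known building block for exact-match lookups: a keyed "blind index" computed with HMAC. This is an assumption chosen for illustration, not a description of how MongoDB, Vaultree or CipherStash implement searchable encryption, and names such as `INDEX_KEY`, `store` and `find_by_email` are hypothetical.

```python
import hashlib
import hmac
import secrets

# Hypothetical index key, held by the application and never given to the database.
INDEX_KEY = secrets.token_bytes(32)

# Stand-in for a database table: blind index -> opaque encrypted record.
table = {}


def blind_index(value: str, key: bytes = INDEX_KEY) -> str:
    """Return a keyed, deterministic token for a normalized value."""
    normalized = value.strip().lower()
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def store(email: str, encrypted_record: bytes) -> None:
    """Store an already-encrypted record under the blind index of its email."""
    table[blind_index(email)] = encrypted_record


def find_by_email(email: str):
    """Look up a record by email without the table ever holding the plaintext email."""
    return table.get(blind_index(email))


store("dan@example.com", b"<ciphertext for Dan's record>")
match = find_by_email("  Dan@Example.com ")
miss = find_by_email("someone.else@example.com")
```

The lookup costs a single hash, which hints at why a purpose-built scheme can be so much faster than general-purpose FHE. The trade-off is that this particular technique only answers "is this value equal to that one?" and reveals when two stored values are equal; partial matches and ordering need additional, more sophisticated constructions.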

Data Access Logs

Reflecting on the two fundamental questions I asked earlier, we’ve so far answered one: by protecting data directly, the problem of limiting access to data finally becomes tractable. How do we tackle the issue of recording data access reliably? Believe it or not, encryption-in-use has an answer for that as well.

Consider the challenge of reliably recording every data access using traditional technology. Recording data accesses in a database is not too difficult (though logging who accessed data is complicated by the fact that users often don’t access the database directly but instead do so via applications that are connected to the database). But what if the company has hundreds or thousands of databases? Or what if sensitive data is accessed in a spreadsheet? How then do you identify, log and centrally store audit information for every access?

Before answering this question, allow me to convince you why data access logging is so important. If a breach occurs, isn’t a record of what was accessed after the fact useless? The data is already gone! While the first priority should always be prevention, early detection is very nearly as good.

Most data breaches, particularly those of any scale, don’t happen in an instant. They play out over hours, days or even weeks. In over 40 percent of cases, data is siphoned out of a data warehouse or file server using the credentials of a legitimate user. Ensuring that every access is logged, along with exactly what was accessed, where from and the identity of the accessing user, makes unusual behavior easily detectable.

Regulating Access with Data Keys

Consider the digital patient records of a hospital. An oncologist working in the cancer ward suddenly accesses hundreds of patient records in the burn unit. Certainly, this unusual activity warrants further investigation. In the age of AI, not only are such anomalies easy to detect, but access can be revoked and a password reset sent to the doctor within the blink of an eye, well before a breach can take hold. The critical part of making all this possible is reliable and timely access data.
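The hospital scenario can be sketched as a very simple rule over access logs. This is a minimal illustration only: the `AccessEvent` fields, the `flag_unusual_users` helper and the threshold of 20 are all assumptions, and a production system would more likely learn each user's normal behavior than rely on a fixed cutoff.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessEvent:
    """One logged access: who, what and where from (hypothetical schema)."""
    user: str
    user_unit: str
    record_id: str
    record_unit: str
    source_ip: str


def flag_unusual_users(events, max_out_of_unit=20):
    """Return users who accessed more records outside their own unit than allowed."""
    out_of_unit = Counter(e.user for e in events if e.user_unit != e.record_unit)
    return sorted(user for user, count in out_of_unit.items() if count > max_out_of_unit)


# An oncologist reads 3 oncology records, then 300 burn-unit records.
events = [
    AccessEvent("dr_lee", "oncology", f"onc-{i}", "oncology", "10.0.0.5") for i in range(3)
] + [
    AccessEvent("dr_lee", "oncology", f"burn-{i}", "burn_unit", "203.0.113.9") for i in range(300)
] + [
    AccessEvent("dr_kim", "burn_unit", f"burn-{i}", "burn_unit", "10.0.0.8") for i in range(50)
]

suspicious = flag_unusual_users(events)
```

Here only `dr_lee` is flagged, since `dr_kim` stays within the burn unit. The rule is only as good as the log behind it, which is why reliable and timely access data matters so much.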

As data regulators in the US and around the world increase their demands to not just be made aware of data breaches, but for detailed and timely information about their materiality, reliable access data is also fundamental in maintaining compliance.

When a record protected with encryption-in-use is accessed, a decryption request is sent to a service called a key server. The requesting user must identify themselves to the server and, if the decryption is permitted, they receive a data key that decrypts only the exact record in question. Because the data key uniquely identifies the record, the key server can log what was accessed without ever seeing any sensitive data.
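The flow above can be sketched in a few lines. Everything here is a simplified assumption for illustration: the `KeyServer` class, its in-memory permission table, and the choice to derive each record's data key from a master key with HMAC are hypothetical, and real key servers authenticate users and manage keys far more carefully.

```python
import hashlib
import hmac
import secrets
from datetime import datetime, timezone


class KeyServer:
    """Toy key server: deny by default, one data key per record, every request logged."""

    def __init__(self, permissions):
        self._master_key = secrets.token_bytes(32)  # never leaves the server
        self._permissions = permissions  # user -> set of record ids they may decrypt
        self.audit_log = []

    def request_data_key(self, user: str, record_id: str) -> bytes:
        allowed = record_id in self._permissions.get(user, set())
        # Log the request either way; note that no record contents are involved.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "record_id": record_id,
            "allowed": allowed,
        })
        if not allowed:
            raise PermissionError(f"{user} may not decrypt {record_id}")
        # Derive a key that is unique to this record (an illustrative choice).
        return hmac.new(self._master_key, record_id.encode("utf-8"), hashlib.sha256).digest()


server = KeyServer({"alice": {"patient-17", "patient-18"}})
key_17 = server.request_data_key("alice", "patient-17")
key_18 = server.request_data_key("alice", "patient-18")

try:
    server.request_data_key("mallory", "patient-17")
    denied = False
except PermissionError:
    denied = True
```

After these three requests, `server.audit_log` holds three entries, including the refused one, and at no point did the server handle the patient data itself.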

Data Breach: Final Thoughts and Best Practices

Airtight encryption practices can stave off much of the danger from the above scenarios, as well as other common risks. Reining in data access around an organization, and limiting the exposure of information to third parties, is another important step – one that is grounded as much in old-school standard operating procedure as it is in cybersecurity. And when a company handles data, comprehensive and clear logging of what is being accessed, by whom and for what reasons is an absolute must. If all else fails, records that paint a clear picture of the source of a data breach can help an organization more swiftly report an incident to the proper officials, offer greater transparency to reassure customers and begin filling the cracks in its cybersecurity foundation.

There is no such thing as fool-proof data-compromise prevention. Humans are, and always will be, at the heart of business operations – which means the corporate protection of sensitive data will always be subject to human error. But there are basic principles of cybersecurity that can – and should – be followed by even the most spartan of operations. A small investment of financial resources, supported by a healthy dose of care and concern for your customers and brand, goes a very long way in building a firewall between bad actors and the sensitive information that belongs to the people who keep your business thriving.

This article was written by Dan Draper on January 22, 2024


Dan Draper is the Founder and CEO of CipherStash. Dan is a lifelong coder and self-taught cryptographer, passionate about developing cutting-edge technology rooted in academic research. He previously worked as VP of Engineering at Medical Director and Expert360, and is a member of Australia’s Cyber Security Working Group, an organization that prioritizes changes in data security regulation.
