Security researchers Vinny Troia and Bob Diachenko discovered an unsecured server containing an unprecedented amount of personally-identifying information. Indeed, the server contained 4 terabytes of personal data—1.2 billion records in total. That is 1.2 billion records exposed and easily accessible on the internet.
Hundreds of millions of people were affected by this data exposure. The exposed data includes home and mobile phone numbers and social media profiles for Facebook, Twitter, LinkedIn, and GitHub. It also includes work histories, possibly pulled from LinkedIn, as well as 50 million unique phone numbers and 622 million email addresses.
Thankfully, the data did not include sensitive information such as credit card numbers, Social Security numbers, or passwords. However, it still represents an unprecedented leak of social media information.
One crucial mystery lingers around the exposure of 1.2 billion records: who is responsible. The security researchers traced the server's IP address, but only as far as Google Cloud Services. While many of the datasets on the server were linked to San Francisco data broker People Data Labs, the company denied owning the server; security researcher Troia agreed with its claim.
The researchers could not determine who created the server or why. Further, they could not determine whether anyone found the server before they did or downloaded the data on it. The server proved easy to find and to access.
Adding to the mystery, the server disappeared within hours of security researchers informing the FBI.
Experts Weigh in on 1.2 Billion Records Exposed
Sudhakar Ramakrishna is CEO of Pulse Secure
“This type of data breach is alarming due to the sheer amount of personal information exposed and the potential fidelity added to social media attack vectors. There should be little comfort in the fact that credit card or SSN numbers were not exposed, given the massive volume of profiles and contact information of hundreds of millions of people.”
“The harsh reality of today’s evolving threat landscape and threat actor marketplace is this new data will be bought and sold on the dark web and can easily be combined with other exposed PII from one of the many data breaches in 2019 to create more comprehensive identity exploits. This highlights exactly why enterprises need to revisit auditing their data, access, controls and protection obligations. A zero-trust framework with orchestrated data protection mechanisms is necessary. Servers and storage, whether being serviced, repurposed or sold, should never have this type of data in the clear.”
Anurag Kahol is the CTO of Bitglass
“This unsecured database is one for the record books. Impacting 1.2 billion records, it is one of the largest leaks we have ever seen. Names, email addresses, and phone numbers, along with other social media profile information, were left public-facing. It is currently unknown who owns this database; however, they will surely face significant repercussions from regulatory bodies as well as the general public. There is no excuse for negligent security practices such as leaving databases exposed.
While the server was discovered by security researchers, there are tools designed to detect abusable misconfigurations within IT assets like ElasticSearch databases. While there have not yet been any reports of a breach stemming from this particular incident, it is unknown if any malicious parties found the data before the researchers did. Organizations must have full visibility and control over their customer data in order to prevent breaches. To do so, they should look for security solutions that remediate misconfigurations, enforce real-time access control, encrypt sensitive data at rest, manage the sharing of data with external parties, and prevent the leakage of sensitive information.”
Dvir Babila is Head of Product Management at CyCognito
“This is a massive breach and a major open question is who owned the server behind the breach. Troia noted in the original blog, ‘all we can tell from the IP address (220.127.116.11) is that it is (or was) hosted with Google Cloud.’ Determining the ownership of IT assets that exist in the shadows like this requires a lot of fingerprints, and you have to associate those fingerprints with other IT assets exposed on the internet to build a complete picture.
Doing this manually with tons of raw threat intelligence data is very challenging. Applying mathematical techniques, such as a graph data model, works well. With more of every organization’s IT assets living in cloud environments than ever, a new level of automation has to be applied to threat intelligence both for assessing risk and for dealing with post-incident forensics.”
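To make the graph idea in Babila's comment concrete, here is a minimal sketch of how shared fingerprints could link an unattributed server to known assets. It is an illustration only, not CyCognito's method or the researchers' actual analysis. The asset names, the fingerprint strings, and the function `group_assets_by_fingerprint` are all hypothetical. Assets are treated as graph nodes, a shared fingerprint as an edge, and connected groups are found with a simple union-find.

```python
from collections import defaultdict


def group_assets_by_fingerprint(assets):
    """Group assets that share a fingerprint, directly or through a chain.

    `assets` maps a (hypothetical) asset name to a set of fingerprint strings.
    Returns a sorted list of sorted groups of asset names.
    """
    # Union-find parent table: every asset starts as its own group.
    parent = {name: name for name in assets}

    def find(name):
        # Follow parent links to the group root, shortening the path as we go.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # The first asset seen with a fingerprint is linked to every later one.
    first_seen = {}
    for name, fingerprints in assets.items():
        for fingerprint in fingerprints:
            if fingerprint in first_seen:
                union(first_seen[fingerprint], name)
            else:
                first_seen[fingerprint] = name

    groups = defaultdict(set)
    for name in assets:
        groups[find(name)].add(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical example data: none of these values come from the incident.
example_assets = {
    "unknown-server": {"tls-cert:abc", "banner:search-node"},
    "corp-web": {"tls-cert:abc", "favicon:123"},
    "corp-mail": {"favicon:123"},
    "unrelated-host": {"tls-cert:zzz"},
}

if __name__ == "__main__":
    for group in group_assets_by_fingerprint(example_assets):
        print(group)
```

In this made-up data, the unattributed server lands in the same group as the two corporate assets because it shares a certificate fingerprint with one of them, which in turn shares a favicon fingerprint with the other. Real attribution would need many more signals and careful handling of fingerprints that are shared by coincidence, such as default certificates.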