The Myth of Perfect Data

- by Samir Sharma, Expert in Data Analytics & BI

In the last week I’ve spoken to a number of data management professionals about data and LLMs.

Most say we can’t build LLMs because the data isn’t perfect! I have a question about that, but it might be more reflective than those folks need right now!

So, what I want to do is talk about the elephant in the room: the myth of 100% perfect data for building out LLMs.

The reality is that perfect data doesn’t exist! There, I said it! It never has!

In the realm of data management, the pursuit of flawless data is akin to chasing unicorns.

The truth is, perfect data is a myth. 😬

Humans, the architects of data, don’t always have the complete set of information themselves, which leads to occasional “hallucinations” in our understanding of reality.

So, why should we expect our models to be infallible?

Just as we humans draw conclusions from incomplete information, LLMs can also “hallucinate” when faced with data gaps. It’s a natural byproduct of the learning process. Instead of viewing this as a flaw, we should see it as an opportunity for improvement and iteration.

Yes, the buzz around AI is rampant, and LLMs are powerful tools, but right now they are not a substitute for human judgment. The key lies in acknowledging the role of humans in the loop. We need experts who can critically assess and validate the outputs, filling in the gaps where the model may falter.

It’s time to dispel the notion that all data management professionals demand perfection. Rather than fixating on an unattainable ideal, let’s focus on the practical aspects of data, understanding that imperfections are part of the game.

This may sound strange to you, and I’m still trying to get my head around it myself. Let’s flip the script: instead of obsessing over an unattainable perfect data utopia, let’s revel in the quirks and charms of imperfect data. It’s the imperfections that give character to our models, making them more real and relatable.

Do they, Samir? I think you are crazy!

Collaboration, fact-checking and similar practices will be key to working with the true potential of LLMs: a joint effort between business teams, data professionals and, of course, the AI itself.

The world we should be trying to create is a safe space that capitalizes on the strengths of both human intuition and machine efficiency.

In the ever-evolving landscape of AI, it’s essential to debunk myths and embrace the reality that perfection is an illusion. Let’s celebrate imperfections, learn from them, and work together to build more robust and reliable systems.

That all starts with the right data & AI strategy that leads the business strategy.

Do you think we can do that?