You don't need great data to get wins with AI

Dawid Naude, Director, Pathfindr

The myth of delaying AI until you have perfect data

I don’t know whether it’s simply a convenient answer when there are hundreds of other competing priorities, or a misinformed opinion, but “our data isn’t ready yet to take advantage of AI” is something we hear regularly. So are variants of it, such as needing to spend 2024 getting the data ready and then reassessing AI in 2025.


The Large Language Model (LLM), the incredible tech that sits behind ChatGPT, made it possible to get tremendous benefits without needing to do anything particularly custom.

Even on the free version of ChatGPT you could have your résumé critiqued and improved, or copy/paste a complex document and have it explained to you in layman’s terms, even creating analogies that make something as incomprehensible as quantum mechanics easy for an 8-year-old to grasp.

The Large Language Model changed lots of things. One in particular is the skillset you need to get busy with AI. You can now interact with it in a chat interface as you would with a person, and organisations can take advantage of AI with a software engineer instead of a data scientist. That matters because for every 50 software engineers a company has, it might have 1 data scientist, if it has any at all.

One irony of Large Language Models is that they make sense of previously unusable data. You could feed an entire database through an LLM and ask it what’s going on, and it’ll reason and take a stab: ‘It looks like these records are duplicated, a fifth of the data doesn’t have a last name, and it seems the format of the date/time field changed to date only a few years ago’. Combining that with visualisation tools means you’ve bypassed the need for several manual steps. It’s not perfect, but it’s closer to usable now than it once was.
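To make the "feed a database through an LLM" idea concrete, here is a minimal sketch rather than a definitive implementation. It assumes a SQLite database and a function called `ask_llm` that takes a prompt and returns text. That function, the helper names and the prompt wording are all placeholders of mine, so swap in whichever model client you actually use.

```python
import sqlite3
from typing import Callable


def sample_rows_as_text(conn: sqlite3.Connection, table: str, limit: int = 50) -> str:
    """Render the first `limit` rows of a table as pipe-separated text."""
    # Table names cannot be bound as SQL parameters, so accept plain identifiers only.
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")
    cursor = conn.execute(f"SELECT * FROM {table} LIMIT ?", (limit,))
    header = " | ".join(col[0] for col in cursor.description)
    lines = [" | ".join("" if value is None else str(value) for value in row) for row in cursor]
    return "\n".join([header, *lines])


def build_profile_prompt(table: str, sample: str) -> str:
    """Wrap the sampled rows in a plain-English request (the wording is illustrative)."""
    return (
        f"Below is a sample of rows from the table '{table}'.\n"
        "Describe any data quality issues you notice, such as duplicates, "
        "missing values, or columns whose format changes.\n\n"
        f"{sample}"
    )


def profile_table(
    conn: sqlite3.Connection,
    table: str,
    ask_llm: Callable[[str], str],  # hypothetical: any function that sends a prompt to a model
    limit: int = 50,
) -> str:
    """Sample the table, build the prompt and return whatever the model says."""
    return ask_llm(build_profile_prompt(table, sample_rows_as_text(conn, table, limit)))
```

Sampling a few dozen rows instead of sending the whole table keeps the prompt small. Whether a sample is representative enough is a judgment call for your own data.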

So what sorts of activity can you do without needing to get your data right? It’s anything where you’re looking at one particular block of text and making sense of it or creating something new from it. We’ve always called this unstructured data - artefacts like emails, contracts, presentations and marketing material. It seems structured to a person, but to a machine it’s highly unstructured: there are no neat columns and rows. This is the type of data that LLMs can make valuable.

You can put in examples of your 10 best contracts, then let users submit a draft contract, and it’ll compare the draft to those contracts and provide recommendations. You can take a customer record at the moment a contact centre agent answers a call and summarise all the information into a paragraph. If I called in, instead of clicking through different tabs and areas of a Customer Relationship Management tool to get my full context, the agent would just get ‘Dawid is a gold frequent flyer, most of his flights are on time but recently we lost his luggage, and he’s also got a flight booked for next week’.

We can make troves of unstructured data such as IT Support articles, HR policies, and finance processes and procedures easily accessible. Take a question like ‘can I take sick leave whilst I’m on annual leave’ and you can get a simple answer. It seems like a simple question, but it’s an edge case, and the answer may be in a sub-policy or even in an FAQ on a SharePoint site, whilst the rest of the policy is in Workday. An LLM can answer this as if you’d asked your assistant to go through everything and come up with an answer.
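A rough sketch of how that "assistant going through everything" might work is below. It assumes you have already pulled passages out of the different sources into a list, and it picks the relevant ones with a deliberately naive keyword overlap. A real system would more likely use a search index or embeddings. The `ask_llm` function and the passage format are placeholders of mine, not a specific product’s API.

```python
import re
from typing import Callable


def tokenize(text: str) -> set[str]:
    """Lower-case the text and keep alphabetic words only."""
    return set(re.findall(r"[a-z]+", text.lower()))


def top_passages(question: str, passages: list[dict], k: int = 2) -> list[dict]:
    """Rank passages by how many words they share with the question (naive on purpose)."""
    question_words = tokenize(question)
    ranked = sorted(
        passages,
        key=lambda p: len(question_words & tokenize(p["text"])),
        reverse=True,
    )
    return ranked[:k]


def build_answer_prompt(question: str, passages: list[dict]) -> str:
    """Put the chosen passages, labelled by source, in front of the question."""
    context = "\n\n".join(f"[{p['source']}] {p['text']}" for p in passages)
    return (
        "Answer the question using only the passages below and name the source "
        "you relied on. If they do not contain the answer, say so.\n\n"
        f"{context}\n\nQuestion: {question}"
    )


def answer(
    question: str,
    passages: list[dict],
    ask_llm: Callable[[str], str],  # hypothetical: any function that sends a prompt to a model
    k: int = 2,
) -> str:
    """Retrieve the k most relevant passages and ask the model to answer from them."""
    return ask_llm(build_answer_prompt(question, top_passages(question, passages, k)))
```

Each passage here is a dict with a `source` (say, the Workday policy or the SharePoint FAQ) and its `text`. Keeping the source label in the prompt lets the answer point back to where it came from.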

One final note on data: whilst I’ve argued that you don’t need good data to get started, you certainly can get a huge competitive benefit from having your unique data in order. If you’re a national chain of pet stores, you have unique buying insight, supply chain intelligence and knowledge of consumer behaviour. If you’re able to get this into a dataset and integrated with an LLM, you now have an engine that can personalise every marketing moment and e-commerce interaction. You could even commercialise this information through an API enabled by an LLM. All another system would need to ask is “what colour cat harness is most popular”, and it would get a structured response back without needing to know anything about the schema.
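One way such an API could work, sketched under some loud assumptions: the data sits in SQLite, and a function called `ask_llm` (a placeholder for your model client) turns the plain-English question into a single SELECT statement. The calling system only ever sees the question going in and JSON coming out. The guard below is illustrative only and is not a substitute for proper read-only database permissions.

```python
import json
import sqlite3
from typing import Callable


def describe_schema(conn: sqlite3.Connection) -> str:
    """Collect the CREATE TABLE statements so the model, not the caller, sees the schema."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND sql IS NOT NULL"
    )
    return "\n".join(row[0] for row in rows)


def answer_question(
    conn: sqlite3.Connection,
    question: str,
    ask_llm: Callable[[str], str],  # hypothetical: any function that sends a prompt to a model
) -> str:
    """Translate a question into SQL via the model, run it, and return JSON."""
    prompt = (
        "Write one SQLite SELECT statement that answers the question. "
        "Reply with SQL only.\n\n"
        f"Schema:\n{describe_schema(conn)}\n\nQuestion: {question}"
    )
    sql = ask_llm(prompt).strip().rstrip(";")
    # Illustrative guard only: refuse anything that is not a single SELECT.
    if not sql.lower().startswith("select") or ";" in sql:
        raise ValueError("only a single SELECT statement is allowed")
    cursor = conn.execute(sql)
    columns = [col[0] for col in cursor.description]
    results = [dict(zip(columns, row)) for row in cursor.fetchall()]
    return json.dumps({"question": question, "results": results})
```

The design choice is that the schema lives only in the prompt. If a column gets renamed next year, the other system’s question stays the same.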

It’s the perfect time to get started. If you’re prioritising an AI strategy first, or essentially spending your time and money on PPT slides instead of AI experiments, you’re not going about it as well as you could.

Put simply, if you took 2 companies today, and one spent the next 3 months on its AI strategy and the other on experimentation without any mention of strategy, which would have a better idea of its strategy at the end of it?

Other Blogs from Dawid


5 No Brainers

For all the hype around AI, it’s often not clear to businesses where they should get started. Some convince themselves they have a plan. You’re in this category if you say “We’ve given some people access to MS Copilot”. Let me guess - they seem to like the MS Teams meeting summarisation, right?

How to think about copilots

Bots, like people, have two types of usefulness - knowledge and skills: things they know, and work they can do. In this Blog, Dawid Naude of Pathfindr provides practical advice to help you roll out or review your deployment of co-pilots and chatbots within your organisation, to generate both short-term impact and longer-term strategic value.