OpenAI released GPT-3 in June. The demos are genuinely impressive. It writes coherent essays. It generates working code from natural language descriptions. It can summarise documents, translate between languages, and answer questions with a fluency that feels uncanny. The tech community is excited. I am too. But I keep noticing that the conversation is almost entirely about what it can do, and almost nobody is asking the questions that matter for enterprise use.
The Demo vs the Deployment
Let me be clear: GPT-3 is a real technical achievement. The scale alone is significant: 175 billion parameters trained on a massive corpus of internet text. The outputs are often remarkable. In a demo environment, with curated prompts and cherry-picked examples, it looks like the future of computing.
In an enterprise environment, the picture changes.
175B
parameters in GPT-3, the largest language model to date
Source: OpenAI, Language Models are Few-Shot Learners, 2020
The Data Question
Enterprise software handles sensitive data. Customer records. Financial transactions. Health information. Intellectual property. The current GPT-3 API sends your input to OpenAI's servers for processing. For most enterprise use cases, that's a non-starter.
The question isn't "can GPT-3 summarise our legal contracts?" It can probably produce something plausible. The question is "can we send our legal contracts to a third-party API and accept the data handling implications?" For regulated industries (health, finance, government), the answer right now is almost certainly no.
The Accuracy Question
GPT-3 produces fluent text. Fluent text is not the same as accurate text. It can generate a perfectly grammatical paragraph about a legal concept that is substantively wrong. It can write code that looks correct but has subtle bugs. It can summarise a document and miss the critical detail.
For creative applications, this is acceptable. For enterprise applications where accuracy matters, it's a problem. You'd need a human reviewing every output, which raises the question of how much time you're actually saving.
The gap between "this produces impressive text" and "this produces text we can stake the business on" is enormous. Most enterprise value lives in the second.
Mak Khan
Chief AI Officer
The Integration Question
GPT-3 is an API. Using it in an enterprise context means integrating it with existing systems, handling authentication, managing rate limits, dealing with response variability, and building a UX that accounts for the fact that the same input can produce different outputs each time.
That integration work is non-trivial. And the total cost (API fees, development time, ongoing maintenance, human review) needs to be weighed against the value generated. For most enterprise use cases I can think of right now, the economics don't yet work.
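To make the integration concerns concrete, here is a minimal sketch in Python. `call_model` is a hypothetical stand-in for the actual API request, not OpenAI's client; the two real points are pinning temperature to zero to reduce output variability, and backing off exponentially when the API rate-limits you.

```python
import random
import time

# Hypothetical stand-in for a GPT-3 API call. The real call would be an
# authenticated HTTPS request; here we simulate intermittent rate limiting.
def call_model(prompt, temperature=0.0):
    if random.random() < 0.3:          # simulate a 429 rate-limit response
        raise RuntimeError("429: rate limit exceeded")
    return f"summary of: {prompt[:20]}"

def complete_with_retry(prompt, retries=5, base_delay=0.01):
    """Retry with exponential backoff; temperature=0 keeps repeated calls
    with the same prompt as stable as the model allows."""
    for attempt in range(retries):
        try:
            return call_model(prompt, temperature=0.0)
        except RuntimeError:
            time.sleep(base_delay * 2 ** attempt)  # back off before retrying
    raise RuntimeError(f"gave up after {retries} attempts")

random.seed(0)  # seed only so this demo run is reproducible
print(complete_with_retry("Summarise the attached contract."))
```

Even this toy version hints at the hidden costs: every caller needs retry logic, timeout budgets, and a plan for the case where all retries fail.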
What's Actually Interesting
I don't want to dismiss GPT-3. The underlying capability is real, and the trajectory is worth paying attention to.
Knowledge work assistance is the most promising near-term application. Not replacement. Assistance. A tool that generates first drafts for a human to refine. A tool that suggests code completions for a developer to evaluate. The human stays in the loop. The tool accelerates parts of their workflow.
Information retrieval could be transformed by models like this. Instead of searching through documents with keywords, you ask a question in natural language and get a synthesised answer. The accuracy problem needs solving first, but the interaction model is genuinely better.
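A toy sketch of that contrast, with everything invented for illustration: `keyword_search` is today's interaction model, and `synthesise_answer` is a hypothetical stand-in for what a GPT-3-style system would add, composing a direct answer from retrieved passages (here faked by concatenation, since the synthesis is precisely the part the model would do).

```python
# Two tiny internal documents standing in for an enterprise corpus.
docs = {
    "hr-policy": "Employees accrue 25 days of annual leave per year.",
    "expenses": "Claims over 50 GBP require a receipt and manager approval.",
}

def keyword_search(query):
    """Today's model: return the IDs of documents sharing any word with the query."""
    words = set(query.lower().split())
    return [doc_id for doc_id, text in docs.items()
            if words & set(text.lower().split())]

def synthesise_answer(question):
    """The question-style interface: retrieve, then compose a direct answer.
    The composition step is a stand-in for what the language model would do."""
    hits = keyword_search(question)
    context = " ".join(docs[h] for h in hits)
    return f"Based on {', '.join(hits)}: {context}"

print(synthesise_answer("How many days of annual leave do employees get?"))
```

The keyword step hands back whole documents; the question step hands back an answer. The interaction model is better precisely when the synthesis is trustworthy, which is the accuracy problem again.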
Structured data extraction from unstructured text is a real enterprise need. Pulling key terms from contracts, extracting data from emails, categorising support tickets. These are tasks that current NLP handles poorly and that a model like GPT-3 might handle well enough to be useful.
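A hedged sketch of the guardrail such a pipeline would need. `fake_model_reply` is a hypothetical stand-in for a GPT-3 completion prompted to emit JSON; the substance is on the consuming side, where enterprise code parses and validates the output rather than trusting fluent text.

```python
import json

# Fields we require before the extraction is allowed into any system of record.
REQUIRED_FIELDS = {"party_a", "party_b", "effective_date", "termination_notice_days"}

def fake_model_reply(contract_text):
    """Hypothetical stand-in for a GPT-3 completion prompted to return
    key contract terms as JSON."""
    return json.dumps({
        "party_a": "Acme Ltd",
        "party_b": "Globex plc",
        "effective_date": "2020-09-01",
        "termination_notice_days": 30,
    })

def extract_terms(contract_text):
    """Parse and validate the model's output; reject anything malformed
    rather than silently storing a fluent-but-wrong answer."""
    try:
        data = json.loads(fake_model_reply(contract_text))
    except json.JSONDecodeError:
        return None                     # model returned prose, not JSON
    if not REQUIRED_FIELDS <= data.keys():
        return None                     # missing fields: route to human review
    return data

terms = extract_terms("This agreement between Acme Ltd and Globex plc ...")
```

Returning `None` and routing to a human reviewer is the honest failure mode: "useful enough" here means the model handles the easy majority and a person handles the rest.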
What I'd Tell Enterprise Leaders
Don't ignore this. The capability is real and the technology will improve. But don't buy the hype either.
Watch, don't invest. This technology is moving fast. The right move for most enterprises is to understand the capability, experiment internally with non-sensitive data, and wait for the data handling and accuracy problems to be addressed before committing budget.
Think about your data strategy. Whatever AI tools emerge in the next few years, they'll all need data to work with. The organisations that have clean, structured, accessible data will be able to adopt these tools when they're ready. The organisations with data scattered across spreadsheets and legacy systems won't, regardless of how good the AI is.
Be sceptical of vendors. Anyone selling you a GPT-3-powered enterprise solution in late 2020 is selling you a demo, not a product. The technology isn't mature enough for enterprise-grade reliability. That will change. It hasn't changed yet.
The question nobody's asking is the right question: not "what can it do?" but "can we trust it with what matters?" When the answer to that second question becomes yes, the enterprise implications will be significant. We're not there yet. We're watching.
