The AI hype cycle has produced more jargon than insight. Here's what the terms actually mean, stripped of marketing, explained for enterprise leaders, not computer scientists.
Generative AI
AI that creates new content (text, images, code, audio) rather than just classifying or predicting from existing data. ChatGPT, DALL-E, and Midjourney are the consumer-facing examples. In an enterprise context, generative AI produces draft documents, extracts structured data from unstructured sources, generates code, summarises meetings, and powers conversational interfaces.
What it is: A category of AI models that produce outputs, not just decisions.
What it isn't: A single product or vendor. "Generative AI" is as broad as "database." The value depends entirely on what you build with it.
Large Language Model (LLM)
The technology behind tools like ChatGPT. An LLM is a neural network trained on massive amounts of text data, enabling it to process and generate human language. GPT-4, Claude, and PaLM are examples. They're "large" because they have billions of parameters (learned patterns) and are trained on datasets that span a significant portion of the internet.
Enterprise relevance: LLMs are the engine. What matters is what you build around them: the data you feed them, the systems you connect them to, and the governance you wrap around them. Choosing between GPT-4 and Claude matters far less than the quality of your AI foundation.
Prompt Engineering
The practice of crafting inputs (prompts) to get useful outputs from an AI model. A well-structured prompt produces dramatically better results than a vague one. In an enterprise context, prompt engineering means designing the instructions and context that make AI tools reliable for specific business tasks.
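The difference is easy to see side by side. A minimal sketch: the task, audience, constraints, and failure behaviour are hypothetical examples, and `build_prompt` is an illustrative helper, not any vendor's API.

```python
# A vague prompt vs. a structured one for the same task.
vague_prompt = "Summarise this contract."

# The structured version specifies role, audience, constraints,
# and what to do when the model is unsure.
structured_template = """You are a commercial contracts analyst.
Summarise the contract below for a non-legal executive audience.

Requirements:
- Maximum 150 words.
- Flag any termination, liability, or auto-renewal clauses.
- If a clause is ambiguous, say so rather than guessing.

Contract:
{contract_text}
"""

def build_prompt(contract_text: str) -> str:
    """Fill the structured template with the document to analyse."""
    return structured_template.format(contract_text=contract_text)
```

The structured version does most of the work before the model is ever called: it constrains length, names the audience, and tells the model how to fail safely.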
Reality check: Prompt engineering is real and valuable, but it's not a career path or a replacement for proper system design. A well-engineered prompt can't compensate for bad data, missing context, or a poorly defined problem.
Hallucination
When an AI model generates confident, plausible, and completely wrong output. It's not lying. It's pattern-matching without understanding. A model might cite a paper that doesn't exist, invent statistics, or confidently state something that contradicts its source material.
Enterprise impact: This is the single biggest risk in enterprise AI deployment. Hallucination means you can't trust AI output without verification, which means your systems need human-in-the-loop review, confidence scoring, and source attribution. Any vendor claiming their AI "doesn't hallucinate" is, ironically, hallucinating.
Fine-Tuning
Taking a pre-trained model (like GPT-4) and training it further on your specific data to improve performance on your specific tasks. Think of it as teaching a generally smart model your organisation's particular language, processes, and standards.
When it matters: Fine-tuning is powerful but expensive and complex. Most enterprise use cases are better served by RAG (giving the model your data at query time) rather than fine-tuning (permanently changing the model). Fine-tuning makes sense when you need the model to consistently behave in a specific way across thousands of interactions.
RAG (Retrieval-Augmented Generation)
A technique where the AI model retrieves relevant documents or data before generating a response, instead of relying solely on what it learned during training. In practice: the user asks a question, the system searches your knowledge base, feeds the relevant documents to the model alongside the question, and the model generates an answer grounded in your data.
Why enterprise cares: RAG lets you build AI systems that use your organisation's knowledge without retraining the model. It's the foundation pattern for most enterprise AI applications: claims intelligence, advisory tools, knowledge retrieval, document processing. See our introduction to RAG for the full picture.
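The retrieve-then-generate flow can be sketched in a few lines. This is a toy: real systems rank documents by embedding similarity in a vector database, whereas this sketch uses word overlap as a stand-in, and the final prompt would be sent to whatever LLM API you use.

```python
def retrieve(query: str, knowledge_base: list[str], k: int = 2) -> list[str]:
    """Rank documents by shared words with the query (toy scorer
    standing in for embedding-based similarity search)."""
    q_words = set(query.lower().split())
    scored = sorted(
        knowledge_base,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_grounded_prompt(query: str, knowledge_base: list[str]) -> str:
    """Feed the retrieved documents to the model alongside the question."""
    context = "\n---\n".join(retrieve(query, knowledge_base))
    return (
        "Answer using ONLY the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
```

The key design point is the instruction to answer only from the supplied context: that is what "grounded in your data" means in practice, and it is the main lever for reducing hallucination.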
Embedding
A way of converting text (or images, or any data) into numerical representations that capture meaning. Similar concepts produce similar numbers. This enables AI systems to find semantically related content, searching not just for matching words but for matching intent.
Enterprise application: Embeddings power search, recommendation, and retrieval systems. When an AI tool finds the right policy document for a claims question, it's using embeddings to match the question's meaning against the document's meaning, not just matching keywords.
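"Similar concepts produce similar numbers" is usually measured with cosine similarity. A minimal sketch, assuming toy 3-dimensional vectors: real embedding models output hundreds or thousands of dimensions, and the example vectors and labels below are invented for illustration.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Similarity of two embedding vectors: closer to 1.0 means
    the vectors point in more similar directions."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy embeddings; a real model would produce these from text.
claim  = [0.9, 0.1, 0.2]   # "water damage claim"
policy = [0.85, 0.15, 0.1] # "flood cover policy section"
lunch  = [0.1, 0.9, 0.4]   # "canteen lunch menu"

# The claim is far more similar to the policy than to the menu,
# even though the texts share no keywords.
assert cosine_similarity(claim, policy) > cosine_similarity(claim, lunch)
```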
Token
The basic unit of text that an LLM processes. Roughly, one token equals about three-quarters of a word. Tokens matter because LLMs have context limits (how much text they can process at once) and pricing models based on tokens consumed.
What to know: GPT-4's context window is 8K or 32K tokens (roughly 6,000-24,000 words). This limits how much information you can feed the model in a single interaction, which is why RAG, chunking strategies, and context management matter for enterprise applications.
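The three-quarters rule gives a useful back-of-envelope check before you design around a context window. A rough sketch only: the 25% headroom figure is an illustrative assumption, and a real tokeniser library will give exact counts that vary by model.

```python
def estimate_tokens(text: str) -> int:
    """Rule of thumb from above: one token is roughly 0.75 words.
    A real tokeniser gives exact, model-specific counts."""
    return round(len(text.split()) / 0.75)

def fits_context(text: str, context_window: int = 8_000) -> bool:
    """Check whether a document fits a single model call, leaving
    ~25% of the window free for instructions and the response
    (the 25% headroom is an assumption, not a standard)."""
    return estimate_tokens(text) <= context_window * 0.75
```

A 7,000-word report already fails this check against an 8K window, which is exactly why chunking and RAG exist.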
AI Foundation
Shared infrastructure that multiple AI capabilities build upon: document processing pipelines, knowledge bases, integration frameworks, and governance patterns. Instead of building each AI tool from scratch, an AI foundation lets each new capability reuse what came before.
Why it matters: Without a foundation, capability #4 costs as much as capability #1. With a foundation, capability #4 costs a fraction and ships in weeks instead of months. This is the compound value argument, and it's the single most important architectural decision in enterprise AI.
Vector Database
A database optimised for storing and searching embeddings. Traditional databases search by exact matches or ranges. Vector databases search by similarity: "find the ten documents most similar to this query." Essential infrastructure for RAG-based AI applications.
Enterprise context: If you're building AI that needs to search your organisation's knowledge, you'll need a vector database. Pinecone, Weaviate, and pgvector (PostgreSQL extension) are common choices. The choice matters less than the quality of what you put in it.
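What "search by similarity" means is easiest to see as brute force. A dedicated vector database does conceptually this, but over millions of vectors using approximate-nearest-neighbour indexes; the store layout and document IDs below are invented for illustration.

```python
import math

def top_k(query_vec: list[float], store: list[dict], k: int = 3) -> list[dict]:
    """Return the k stored items most similar to the query vector.
    Brute-force scan; vector databases index this for scale."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.sqrt(sum(x * x for x in a)) *
                      math.sqrt(sum(y * y for y in b)))
    return sorted(store, key=lambda item: cos(query_vec, item["vector"]),
                  reverse=True)[:k]

# Hypothetical store: each entry pairs a document ID with its embedding.
store = [
    {"id": "policy-017", "vector": [0.9, 0.1]},
    {"id": "claims-faq", "vector": [0.8, 0.3]},
    {"id": "menu",       "vector": [0.1, 0.9]},
]
```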
Inference
The process of running a trained AI model to generate output. When ChatGPT answers your question, that's inference. In enterprise terms, inference is the operational cost of AI. Every query, every document processed, every response generated costs compute.
Cost consideration: Inference costs are falling rapidly but aren't zero. For high-volume enterprise applications (processing thousands of documents daily), inference costs and latency need to be factored into architecture decisions.
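Factoring inference cost into architecture decisions starts with simple arithmetic. A back-of-envelope sketch: the per-token rate, document volume, and token count below are illustrative assumptions, not any vendor's actual pricing.

```python
def monthly_inference_cost(
    docs_per_day: int,
    tokens_per_doc: int,
    price_per_1k_tokens: float,  # assumed rate; check your vendor's pricing
    days: int = 30,
) -> float:
    """Back-of-envelope inference cost for a document pipeline."""
    total_tokens = docs_per_day * tokens_per_doc * days
    return total_tokens / 1000 * price_per_1k_tokens

# Example: 5,000 docs/day at 2,000 tokens each, at an assumed
# $0.03 per 1K tokens -> roughly $9,000 per month.
cost = monthly_inference_cost(5000, 2000, 0.03)
```

Run the same numbers at ten times the volume and the case for caching, smaller models, or batching makes itself.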
- Which AI model should our enterprise use?
- It depends on your use case, not the leaderboard. GPT-4 is the most capable general-purpose model as of mid-2023, but Claude, PaLM, and open-source alternatives each have strengths. The model matters less than the data, integration, and governance you build around it.
- Do we need to understand all these terms to start with AI?
- No. You need to understand what problem you're solving, whether your data is accessible, and who will govern the system. The technical terms help you evaluate vendors and have informed conversations, but the strategy conversation comes first.
