The periodic table of AI: how to judge whether an AI system really works

You get a demo. Impressive. The AI answers questions about your own documents. The salesperson calls it “RAG-powered.” You nod. But do you know what’s under the hood? And more importantly: do you know what’s missing?

That is the problem with AI in 2026. Everything sounds good. The demos are convincing. But 95% of AI pilots deliver no measurable impact (MIT, 2025), and 42% of companies have already abandoned AI initiatives (S&P Global). Not because AI doesn’t work, but because no one assesses whether the right system is being deployed for the right problem.

The AI Periodic Table in brief:

AI systems consist of seventeen recognizable building blocks
From simple prompts to multi-agent systems: each row is more complex
The model works as an assessment framework: what’s in it, what’s missing, is the complexity right?
Four red flags you can check immediately at any AI demo

Seventeen building blocks of AI

Martin Keen, Master Inventor at IBM with more than 400 patents, developed the AI Periodic Table. The idea is surprisingly simple: just as the real periodic table orders all chemical elements, this model orders all building blocks of AI. Every AI system you come across, from ChatGPT to the “AI agent” your vendor sells, is built from a combination of these elements.

The AI Periodic Table: seventeen building blocks of AI organized into four rows and five families

Four rows, four complexity levels. The higher the row number, the more advanced the system.

Row	Level	Elements	In brief
1	Primitives	Prompts, Embeddings, LLMs, Tokens, Attention	You type something, you get something back. This is ChatGPT.
2	Compositions	Function Calling, Vector Databases, RAG, Guardrails, Multi-modal	The AI uses your data. This is where it gets interesting for businesses.
3	Deployment	Agents, Fine Tuning, Frameworks, Red Teaming, Small Models	The AI acts independently. Production.
4	Emerging	Multi-Agent, Synthetic Data, Interpretability, Thinking Models	Frontier technology. Experimental.

Row 1 is where everyone starts. You ask a question, you get an answer. No connection to your business data, no action in other systems. Just a smart chat window.

Row 2 is where it gets interesting for businesses. RAG (Retrieval Augmented Generation) means the AI searches your documents before answering. Guardrails prevent the AI from saying or doing things it shouldn’t. Function Calling lets the AI perform actions in other systems. This is the level at which most business AI applications run.
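To make the RAG idea concrete, here is a minimal sketch in Python. It is a toy under stated assumptions: `embed` is a word-count stand-in for a real embedding model, the two documents are invented, and no LLM is called. It only shows the retrieve-then-prompt step that happens before the model answers.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': word counts. Real systems use a learned embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the question."""
    q = embed(question)
    return sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(question: str, documents: list[str]) -> str:
    """Put the retrieved text in front of the question; this is what the LLM would receive."""
    context = "\n".join(retrieve(question, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Hypothetical knowledge base, for illustration only.
DOCS = [
    "Refunds are processed within 14 days of the return.",
    "Our office is closed on public holidays.",
]

print(build_prompt("How long do refunds take?", DOCS))
```

In a production system the sorted list would be replaced by a query against a vector database, which is why the two elements usually appear together.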

Row 3 is about production and autonomy. Agents think, act, observe and repeat, independently. Fine Tuning adapts a model to your specific domain. Red Teaming tests the system by actively trying to break it. We wrote earlier about the difference between vibe coding and agentic coding; agents are precisely that intersection.

Row 4 is frontier. Multi-agent systems in which multiple AIs collaborate. Thinking Models that reason step by step before they answer. Interpretability that makes clear why an AI made a particular decision. Most companies don’t need this. But it is sold as if they do.

74.4% of Dutch organizations now use AI. But only 8% do so organization-wide. The rest experiment, often without knowing what level they are at.

Four ways to assess an AI system

The periodic table is not decoration. It is a tool. Here is how you use it.

1. Architecture mapping: what’s in it?

Ask the provider which elements are in their system. Not in marketing language, but concretely. Does it use RAG? Then there must be a vector database. Is there an LLM in it? Which model, which version? Are guardrails applied?

A concrete answer sounds like this: “We use GPT-4o as the LLM, Pinecone as the vector database, and we have input and output guardrails for hallucination detection.” That is row 2, clearly described.

A vague answer sounds like this: “We use the latest AI technology.” That is a red flag.
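Architecture mapping can be sketched as a simple lookup. The Python below is an illustration, not an official tool: `ROW_OF` holds only a subset of the table as described in this article, and the function name `map_architecture` is our own. It sorts what a provider says into rows and sets aside anything that is not a recognizable element.

```python
# Subset of the table as described in this article (element -> row).
ROW_OF = {
    "Prompts": 1, "LLMs": 1,
    "Function Calling": 2, "Vector Databases": 2, "RAG": 2, "Guardrails": 2,
    "Agents": 3, "Fine Tuning": 3, "Red Teaming": 3,
    "Multi-Agent": 4, "Interpretability": 4,
}

def map_architecture(stated: list[str]) -> tuple[dict[int, list[str]], list[str]]:
    """Group the provider's stated components by row; collect anything unrecognized."""
    by_row: dict[int, list[str]] = {}
    unrecognized: list[str] = []
    for item in stated:
        row = ROW_OF.get(item)
        if row is None:
            unrecognized.append(item)
        else:
            by_row.setdefault(row, []).append(item)
    return by_row, unrecognized

# The concrete answer from the text: an LLM, a vector database and guardrails.
concrete, _ = map_architecture(["LLMs", "Vector Databases", "Guardrails"])
print(max(concrete))  # highest row present: 2

# The vague answer maps to nothing at all, which is the red flag.
vague, unknown = map_architecture(["the latest AI technology"])
print(vague, unknown)
```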

2. Gap analysis: what’s missing?

This is at least as important as what’s in it. A RAG system without guardrails means the AI can say anything, including things that are wrong, harmful or legally risky. No red teaming? Then you don’t know how the system responds to misuse.

We described earlier how 45% of AI-generated code is vulnerable due to missing checks. The same applies to AI systems: what isn’t in it determines the risk.
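A gap analysis is essentially a checklist with a consequence attached to each empty box. The sketch below assumes just the two gaps named above; the dictionary `RISK_IF_MISSING` and its wording are ours, and a real audit would cover more elements.

```python
# Element -> what you are exposed to when it is absent (taken from the text above).
RISK_IF_MISSING = {
    "Guardrails": "the AI can say things that are wrong, harmful or legally risky",
    "Red Teaming": "no one knows how the system responds to misuse",
}

def gap_analysis(present: set[str]) -> dict[str, str]:
    """Return every checked element that is absent, with the risk it leaves open."""
    return {
        element: risk
        for element, risk in RISK_IF_MISSING.items()
        if element not in present
    }

# Example: a RAG system that was sold without guardrails or red teaming.
for element, risk in gap_analysis({"LLMs", "RAG", "Vector Databases"}).items():
    print(f"Missing {element}: {risk}")
```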

3. Complexity check: does it fit the problem?

A common mistake: selling a row 4 system for a row 2 problem. Companies pay for multi-agent architectures while a simple RAG application would suffice.

The question is not “how advanced is it?” but “does the complexity fit what I need?” More complexity means higher costs, more maintenance and more that can go wrong.

Concretely: an organization that wants a FAQ bot for customer service is well served by row 2. An LLM, RAG on the knowledge base, and guardrails that prevent the bot from talking nonsense. You don’t need a multi-agent system or fine-tuning for that. Don’t pay for it either.
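The FAQ bot example comes down to a set difference: what is offered minus what is needed. A minimal sketch, assuming you have already listed both sides (the function name `surplus_elements` and the vendor offer are hypothetical):

```python
def surplus_elements(needed: set[str], offered: set[str]) -> list[str]:
    """Elements in the offer that the problem does not call for."""
    return sorted(offered - needed)

# What the FAQ bot from the text needs, versus a hypothetical vendor offer.
faq_bot_needs = {"LLMs", "RAG", "Guardrails"}
vendor_offer = {"LLMs", "RAG", "Guardrails", "Multi-Agent", "Fine Tuning"}

print(surplus_elements(faq_bot_needs, vendor_offer))
```

Everything in that list is complexity you would pay for and maintain without needing it.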

4. Reaction check: do the combinations add up?

Some combinations are logically mandatory. RAG without a vector database is not real RAG; then the AI doesn’t search your documents, it just pretends to. An agent without guardrails is an autonomous AI without a brake. Function Calling without authentication gives the AI access to systems without control.

If a provider claims to deliver row 3 but the essentials of row 2 are missing, something isn’t right.
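The mandatory combinations can be written down as dependency rules. The sketch below encodes only the three pairs named above. Note that “Authentication” is not an element of the table; we include it because the text names it as a requirement for Function Calling. The names `REQUIRES` and `reaction_check` are our own.

```python
# Element -> what must accompany it, per the three examples in the text.
REQUIRES = {
    "RAG": "Vector Databases",
    "Agents": "Guardrails",
    "Function Calling": "Authentication",  # not a table element; named in the text
}

def reaction_check(present: set[str]) -> list[str]:
    """List every element that is present while its required companion is not."""
    return [
        f"{element} without {dependency}"
        for element, dependency in REQUIRES.items()
        if element in present and dependency not in present
    ]

# Example: a system claiming RAG and agents, with guardrails but no vector database.
print(reaction_check({"LLMs", "RAG", "Agents", "Guardrails"}))
```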

Three AI claims you can debunk right now

No abstract theory. These are three situations you could encounter tomorrow. For each claim we show what it means in periodic-table terms, and which question to ask to test the claim.

“Our AI learns from your business data”

Sounds good. But this can mean three very different things:

RAG (row 2): The AI searches your documents and uses them as context. Your data stays where it is.
Fine-tuning (row 3): The AI model is actually adapted to your data. More expensive, more complex, and your data goes into the model.
System prompt with context: The simplest variant. Your data is passed as text with each question. No durable “learning” capacity.

These three differ sharply in privacy, cost and effectiveness. With RAG, your data stays under your own control. With fine-tuning, your data becomes part of the model, and you can no longer take it back out. With a system prompt, the context disappears after each conversation.

The question to ask: “Does my data stay with you, or does it go into the AI model?”

“We have an AI agent that handles this automatically”

The word “agent” is used at every turn. A real agent (row 3) thinks, acts, observes the result and repeats. It can make decisions independently.

Many “agents” are chatbots with if-then rules. Nothing wrong with that, but it is row 1, not row 3. The difference lies in autonomy: can the system independently take on a task, adjust along the way and evaluate the result? Or does it follow a fixed script?
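The difference between a fixed script and an agent loop fits in a few lines. This is a deliberately abstract sketch: in a real agent an LLM would choose each action and the actions would be tool calls, whereas here the “task” is simply moving a number toward a goal. Both functions and the `max_steps` brake are illustrative assumptions.

```python
def scripted_bot(message: str) -> str:
    """If-then rules: the same input always follows the same fixed path."""
    if "refund" in message.lower():
        return "Please fill in the refund form."
    return "Sorry, I did not understand that."

def agent_loop(goal: int, start: int, max_steps: int = 10) -> list[int]:
    """Think, act, observe, repeat on a toy task: move `start` toward `goal`."""
    state = start
    trace = [state]
    for _ in range(max_steps):            # the brake: a hard limit on autonomy
        if state == goal:                 # observe: has the task been completed?
            break
        step = 1 if state < goal else -1  # think: pick the next action
        state += step                     # act
        trace.append(state)
    return trace

print(scripted_bot("I want a refund"))
print(agent_loop(goal=3, start=0))
print(agent_loop(goal=5, start=0, max_steps=2))  # stopped by the brake before the goal
```

The scripted bot never looks at the result of what it did. The loop does, and it is the step limit that keeps it from running on indefinitely.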

Carnegie Mellon’s TheAgentCompany benchmark shows how hard real agents are: the best AI agent completes only 24% of complex tasks independently. Agents are powerful, but they are also unpredictable when clear boundaries are missing. Separate research shows that 39% of companies report that AI agents accessed unintended systems.

The question to ask: “What can the agent decide independently, and where is the brake?”

“Our system is completely secure”

This is the claim that deserves the most scrutiny. Check three things:

Guardrails (row 2): Are there rules that determine what the AI may and may not do?
Red Teaming (row 3): Has the system been tested by someone who tried to break it?
Interpretability (row 4): Can you see afterward why the AI gave a particular answer?

If none of the three is present, “completely secure” is a marketing claim. Not a technical truth. No AI company scores higher than C+ on the AI Safety Index (Future of Life Institute, 2025).

The examples are painfully concrete. A Chevrolet dealership’s chatbot agreed to sell a $70,000 car for $1 because it had no guardrails. Air Canada’s chatbot gave wrong travel advice and the company was held legally liable. 12% of AI skills are malicious, designed to steal data.

The question to ask: “What did you do to break the system before it went live?”

The FTC and SEC are now actively enforcing against AI-washing: companies promising more than their technology delivers. In the Netherlands, the ACM is the supervisor. The direction is clear: anyone who claims something is “secure” must be able to prove it.

How we use the periodic table

We use the periodic table in two ways. And the second is perhaps the more valuable.

Assessing: testing afterward. When we audit an existing AI system, we map it onto the periodic table. Which elements are present? Which are missing? Does the complexity fit the problem? What we often find: systems sold as row 3 or 4 that are actually row 1 or 2. An “AI agent” that, on closer inspection, is a chatbot with three if-then rules. A “RAG system” that has no vector database but just a system prompt with some text in it.

Designing: specifying in advance. The periodic table is at least as valuable at the front end. When we sit down with a client to design an AI application, we put the model on the table. Which elements do you need for this problem? Where shouldn’t you even start? Do you need to be at row 2 or is row 1 enough? That prevents over-engineering before a single line of code is written. And it gives the client the same language to think along with. No mysterious abbreviations, but an overview you can fill in together.

We also see patterns that keep recurring. Systems at row 2 that miss essentials: RAG without guardrails, function calling without authentication. At 97% of organizations with AI breaches, basic access controls were missing (IBM, 2025). If you fill in the periodic table in advance, you see those gaps before they become problems.

The model makes conversations concrete. No opinions, but a checklist. Becoming AI-native starts with understanding what you build, and what you deliberately leave out.

Five questions for your next AI demo

Cut these out. Print them. Take them to your next conversation with an AI provider. You don’t need a technical background to ask these questions. The answers tell you enough.

“Which elements from the AI Periodic Table are in your product?” A good provider can answer this concretely. “We use an LLM with RAG, vector database and guardrails.” Not: “We use the latest AI technology.”
“What is deliberately missing, and why?” Every system makes choices. Fine-tuning is not always necessary. Neither is multi-agent. But the provider must be able to explain why they don’t do something.
“What happens when the AI gets it wrong?” This tests whether there are guardrails. Can the AI answer anything at all, or are there limits built in? What happens after a wrong answer?
“Has the system been tested by someone who tried to break it?” Red teaming. If the answer is no, no one knows how the system responds to misuse. And that is exactly the kind of negligence that regulators are now watching.
“Where does my data go?” Privacy check. Does your data stay local? Does it go to a cloud provider? Is it used to train the model? The EU AI Act makes risk management for AI mandatory from August 2026. Know where you stand now.

Need help?

Want to know what is really inside an AI system? Let’s look together. No sales pitch, just an honest conversation.

Sources: IBM Technology, The AI Periodic Table (Martin Keen); Fortune, 95% of AI Pilots Fail (MIT, Aug 2025); Carnegie Mellon, TheAgentCompany benchmark; IBM, Cost of a Data Breach Report 2025; S&P Global, 42% of AI Initiatives Discontinued; Future of Life Institute, AI Safety Index; EU AI Act