Knowledge Graphs and Symbolic AI

In Short

Symbolic AI encodes knowledge as explicit rules and logic that humans can read and inspect, in contrast to neural networks that learn implicit patterns from data. Knowledge graphs extend this idea by representing facts as structured networks of entities and relations, and they are now being fused with neural models to build more reliable, explainable AI systems.

01. What It Is

Symbolic AI, sometimes called Good Old-Fashioned AI (GOFAI), is an approach to artificial intelligence based on explicit, human-readable representations of knowledge. Rather than learning from examples, symbolic systems operate on symbols and rules: if condition A and condition B are true, then conclude C. This style of reasoning dominated AI research from the 1950s through the 1980s.

A knowledge graph is a particular form of symbolic knowledge representation. It stores facts as a collection of triples: (subject, predicate, object), for example (Paris, capitalOf, France) or (Marie Curie, wonPrize, Nobel Prize in Physics). Entities are nodes in the graph, and relations are labeled edges connecting them. The result is a machine-readable network of facts that can be queried, traversed, and reasoned over.
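As a minimal sketch of this data model, the two example facts can be held as Python tuples and indexed by subject. The names `Triple`, `triples`, and `outgoing` are illustrative choices, not part of any standard or library.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# The two facts from the text, stored as (subject, predicate, object).
triples = [
    Triple("Paris", "capitalOf", "France"),
    Triple("Marie Curie", "wonPrize", "Nobel Prize in Physics"),
]

# Index edges by subject so a node's outgoing edges can be looked up directly.
outgoing = defaultdict(list)
for t in triples:
    outgoing[t.subject].append((t.predicate, t.obj))

print(outgoing["Paris"])
```

Indexing by subject is only one option; a real graph store would typically also index by predicate and object so that traversal works in every direction.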

02. Why It Matters

Neural networks are powerful pattern recognizers, but they are opaque. You cannot easily ask a language model why it believes something, and its knowledge is frozen at training time. Knowledge graphs are the opposite: they are transparent, auditable, and can be updated without retraining. Combining the two approaches, under the label neuro-symbolic AI, is an active area of research as of 2025-2026.

Knowledge graphs also underpin major products people use every day. Google's Knowledge Graph powers the information panels that appear when you search for a person, place, or organization. Wikidata, the structured data layer behind Wikipedia, is a public knowledge graph with over 100 million items. Amazon, LinkedIn, and the BBC all use knowledge graphs to power recommendations and navigation.

Understanding knowledge graphs is also important for grasping how Retrieval-Augmented Generation (RAG) works at its most structured level. GraphRAG, a variant developed by Microsoft, builds a knowledge graph from a document corpus and retrieves structured subgraphs rather than raw text chunks, producing more coherent answers on multi-hop questions (see Retrieval-Augmented Generation (RAG)).

03. How It Works

Expert systems

The first practical application of symbolic AI at scale was the expert system. An expert system consists of two parts: a knowledge base of domain-specific facts and rules, typically encoded by interviewing human experts, and an inference engine that applies those rules to new inputs.

MYCIN (developed at Stanford in the early 1970s) diagnosed bacterial infections and recommended antibiotics. In a blinded evaluation, its recommendations were rated on par with or better than those of infectious disease specialists, although it was never used in routine clinical practice. DENDRAL analyzed mass spectra to identify chemical structures. R1/XCON at Digital Equipment Corporation configured computer orders, saving millions of dollars annually by the mid-1980s.

Expert systems hit a wall because maintaining the knowledge base was expensive, rules could conflict, uncertainty was handled only through ad hoc schemes such as MYCIN's certainty factors, and the systems could not learn from new data. When a rule was wrong, someone had to find it and fix it by hand.

Rule-based reasoning and formal logic

Symbolic AI expresses knowledge in formal logic. First-order predicate logic (a restricted form of which underlies Prolog) allows statements like "all birds can fly" and "Tweety is a bird" to be combined to conclude "Tweety can fly." The inference engine searches for chains of rules that connect premises to conclusions, using either forward chaining (start from facts, derive conclusions) or backward chaining (start from a goal, find facts that prove it).
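The two search strategies can be sketched on the Tweety example. This is a toy illustration, not a real logic engine: facts are simplified to (predicate, entity) pairs, each rule has a single premise, and the rule set is assumed to be acyclic so that the recursive backward search terminates. The function names are hypothetical.

```python
# Facts are (predicate, entity) pairs. Each rule (premise, conclusion) reads:
# "if X has the premise predicate, then X has the conclusion predicate".
known_facts = {("bird", "Tweety")}
rules = [("bird", "can_fly")]  # all birds can fly


def forward_chain(facts, rules):
    """Start from the facts and apply rules until nothing new is derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            for predicate, entity in list(derived):
                new_fact = (conclusion, entity)
                if predicate == premise and new_fact not in derived:
                    derived.add(new_fact)
                    changed = True
    return derived


def backward_chain(goal, facts, rules):
    """Start from the goal and look for a fact, or a rule whose premise can be proved.

    Assumes acyclic rules; a cyclic rule set would recurse forever.
    """
    if goal in facts:
        return True
    predicate, entity = goal
    return any(
        backward_chain((premise, entity), facts, rules)
        for premise, conclusion in rules
        if conclusion == predicate
    )


print(forward_chain(known_facts, rules))
print(backward_chain(("can_fly", "Tweety"), known_facts, rules))
```

Forward chaining computes everything that follows from the facts, which suits monitoring-style systems; backward chaining only explores rules relevant to one question, which is the strategy Prolog uses.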

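A fundamental limitation is that everything must be spelled out. The classic form of this is the frame problem: when describing an action, a logic-based system must state not only what changes but also everything that stays the same. More generally, a rule-based system must be told explicitly about every relevant fact and every exception, and real-world domains have too many of both.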

Ontologies

An ontology is a formal specification of the concepts in a domain and the relationships among them. In AI and the semantic web, ontologies are typically expressed in OWL (Web Ontology Language) or RDFS (RDF Schema). An ontology for medicine might specify that "MRI" is a subtype of "DiagnosticProcedure," that the property "performedOn" links a "Procedure" to a "Patient" (its domain and range), and that "Drug" and "Procedure" are disjoint classes.

Ontologies make implicit knowledge explicit and machine-readable. They allow automated reasoning: if you know something is a Mammal, and your ontology states that all Mammals are warm-blooded, you can infer warm-bloodedness without storing it explicitly for every mammal.
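The mammal example can be sketched as a walk up a class hierarchy. This is a simplified stand-in for what an OWL or RDFS reasoner does, assuming each class has at most one direct superclass; the class names, the individual "Rex", and the dictionaries below are hypothetical.

```python
# Hypothetical mini-ontology.
subclass_of = {"Dog": "Mammal", "Mammal": "Animal"}  # class -> direct superclass
class_properties = {"Mammal": {"warm-blooded"}, "Animal": {"living"}}
instance_of = {"Rex": "Dog"}  # individual -> asserted class


def inferred_properties(individual):
    """Collect properties from the individual's class and every superclass above it."""
    props = set()
    cls = instance_of.get(individual)
    while cls is not None:
        props |= class_properties.get(cls, set())
        cls = subclass_of.get(cls)  # None once the top of the hierarchy is reached
    return props


print(inferred_properties("Rex"))
```

Nothing about warm-bloodedness is stored for "Rex" or even for "Dog"; the property is derived from the single statement attached to "Mammal".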

Knowledge graphs and RDF

The Resource Description Framework (RDF), standardized by the W3C, is the primary data model for web-scale knowledge graphs. Every fact is a triple: (subject, predicate, object), where the predicate is an IRI, the subject is an IRI or a blank node, and the object is an IRI, a blank node, or a literal. IRIs (Internationalized Resource Identifiers) are globally unique identifiers, so triples from different sources can be merged without naming conflicts.

For example:

<http://dbpedia.org/resource/Paris>
    <http://dbpedia.org/ontology/country>
    <http://dbpedia.org/resource/France> .
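Because every term is a full IRI, merging two sources reduces to a set union. The sketch below assumes two hypothetical sources that both use DBpedia-style IRIs; the second triple is included only for illustration.

```python
# Namespace prefixes, expanded by plain string concatenation.
DBR = "http://dbpedia.org/resource/"
DBO = "http://dbpedia.org/ontology/"

# Two hypothetical sources that happen to share one fact.
source_a = {(DBR + "Paris", DBO + "country", DBR + "France")}
source_b = {
    (DBR + "Paris", DBO + "country", DBR + "France"),  # same fact as in source_a
    (DBR + "France", DBO + "capital", DBR + "Paris"),
}

# Identical triples collapse into one; no renaming or key mapping is needed.
merged = source_a | source_b
print(len(merged))
```

IRIs are compared as exact strings, so the same resource written with a different scheme or spelling would count as a different node and would not merge.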

SPARQL is the query language for RDF graphs, analogous to SQL for relational databases. A SPARQL query can find all cities in France with a population over one million by traversing the graph's edges.
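Running SPARQL requires an RDF engine, so what follows is only a plain-Python sketch of what such a query does conceptually: match triple patterns, join them on a shared variable, then filter. The `ex:` identifiers, the helper names, and the rounded population figures are illustrative assumptions, not data from a real dataset. The SPARQL text is kept as a string for comparison and is not executed.

```python
# For comparison only: roughly how the query might look in SPARQL (not executed here).
SPARQL_EQUIVALENT = """
PREFIX ex: <http://example.org/>
SELECT ?city WHERE {
  ?city ex:country ex:France ;
        ex:population ?pop .
  FILTER(?pop > 1000000)
}
"""

# Toy graph; identifiers and population figures are illustrative only.
graph = [
    ("ex:Paris", "ex:country", "ex:France"),
    ("ex:Paris", "ex:population", 2_100_000),
    ("ex:Lyon", "ex:country", "ex:France"),
    ("ex:Lyon", "ex:population", 500_000),
    ("ex:Berlin", "ex:country", "ex:Germany"),
    ("ex:Berlin", "ex:population", 3_600_000),
]


def match(graph, s=None, p=None, o=None):
    """Return triples matching a pattern; None is a wildcard, like a SPARQL variable."""
    return [
        (ts, tp, to)
        for ts, tp, to in graph
        if (s is None or ts == s) and (p is None or tp == p) and (o is None or to == o)
    ]


def large_cities_in(graph, country, min_population):
    """Join two patterns on the city, then filter on population."""
    cities = [s for s, _, _ in match(graph, p="ex:country", o=country)]
    return [
        city
        for city in cities
        for _, _, pop in match(graph, s=city, p="ex:population")
        if pop > min_population
    ]


print(large_cities_in(graph, "ex:France", 1_000_000))
```

A real triple store would use indexes rather than scanning the whole list for every pattern, but the pattern-join-filter structure is the same.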

How Google Knowledge Graph works

Google launched its Knowledge Graph in 2012 to move beyond matching keywords to connecting real-world entities. It ingests structured data from sources such as Wikipedia, Wikidata, and the CIA World Factbook, then combines them with signals from Google's own crawl. The graph stores entities (people, places, organizations, films, etc.) with attributes and relationships. When you search for "Paris," Google can infer that you most likely mean the city, display its population, mayor, weather, and nearby sights, and link related entities like the Eiffel Tower, all without having to parse unstructured text at query time.

The Knowledge Graph is also used by Google Search to understand queries: "What is the capital of the country where the Eiffel Tower is?" requires understanding that the Eiffel Tower is in France and that France's capital is Paris.
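The two-hop reasoning in that query can be sketched as a path traversal over a toy graph. This illustrates the idea only and says nothing about how Google implements it; the predicate names `locatedIn` and `capital` and the function `follow` are hypothetical.

```python
# Toy graph keyed by (entity, predicate); predicate names are illustrative.
edges = {
    ("Eiffel Tower", "locatedIn"): "France",
    ("France", "capital"): "Paris",
}


def follow(start, path):
    """Follow a sequence of predicates from a start entity; None if any hop is missing."""
    node = start
    for predicate in path:
        node = edges.get((node, predicate))
        if node is None:
            return None
    return node


# "What is the capital of the country where the Eiffel Tower is?"
print(follow("Eiffel Tower", ["locatedIn", "capital"]))
```

The hard part in practice is turning the natural-language question into the path `["locatedIn", "capital"]`; once that is done, answering is a lookup per hop.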

Neuro-symbolic AI

Neuro-symbolic AI attempts to combine the pattern-recognition strength of neural networks with the logical reasoning and transparency of symbolic systems. Approaches include:

Neural theorem provers:
Use neural networks to guide symbolic proof search, learning which inference steps are productive.
Knowledge graph embeddings:
Represent entities and relations as vectors (TransE, RotatE, ComplEx), then use neural similarity to predict missing triples. This lets you ask "what relation most likely holds between these two entities?" by computing vector distances.
LLM plus knowledge graph:
Retrieve relevant subgraphs from a knowledge graph and provide them as structured context to a language model at inference time, grounding the model's answers in verified facts.
GraphRAG:
Microsoft's approach builds a knowledge graph from a document corpus, extracts communities of related entities, and synthesizes answers from structured subgraph summaries rather than raw text chunks.

The intuition is that neural models are good at perception and language but poor at multi-step logical deduction, while symbolic systems are good at deduction but cannot handle noisy real-world inputs. Combining them aims to get the strengths of both.
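To make the knowledge graph embedding idea concrete, here is a minimal sketch of TransE-style scoring, in which a plausible triple satisfies head + relation ≈ tail. The 2-D vectors are hand-picked for illustration; in a real system they would be learned from the graph, and the function names are hypothetical.

```python
import math

# Hand-picked 2-D vectors for illustration; real embeddings are learned from data.
entity_vec = {
    "Paris": (1.0, 0.0),
    "France": (1.0, 1.0),
    "Berlin": (3.0, 0.0),
    "Germany": (3.0, 1.0),
}
relation_vec = {"capitalOf": (0.0, 1.0)}


def transe_distance(head, relation, tail):
    """Euclidean distance between (head + relation) and tail; smaller means more plausible."""
    h, r, t = entity_vec[head], relation_vec[relation], entity_vec[tail]
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def predict_tail(head, relation):
    """Rank every entity as a candidate tail and return the closest one."""
    return min(entity_vec, key=lambda tail: transe_distance(head, relation, tail))


print(predict_tail("Berlin", "capitalOf"))
```

Predicting a missing tail this way is the basic operation behind knowledge graph completion. RotatE and ComplEx follow the same pattern with different scoring functions.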

04. Key Terms / Milestones

Term	Definition
GOFAI	Good Old-Fashioned AI. Symbolic, rule-based AI as practiced from the 1950s through the 1980s.
Inference engine	The component of an expert system that applies rules to facts to derive new conclusions.
Triple (RDF triple)	The atomic unit of a knowledge graph: (subject, predicate, object).
OWL	Web Ontology Language. A W3C standard for expressing rich ontologies in machine-readable form.
SPARQL	The standard query language for RDF knowledge graphs.
Knowledge graph embedding	A technique that maps entities and relations to dense vectors so that structural patterns can be learned by neural models.
GraphRAG	A retrieval-augmented generation variant that uses a knowledge graph as the retrieval index rather than raw text chunks.
Wikidata	A free, collaborative knowledge graph containing over 100 million items, hosted by the Wikimedia Foundation and edited by volunteers.

05. Examples

Google Search panels:
When you search "Barack Obama," the right-side panel showing his birth date, spouse, books, and related politicians is drawn from the Google Knowledge Graph.

Drug interaction databases:
Pharmaceutical knowledge graphs encode drug-drug interactions, contraindications, and mechanism-of-action pathways, allowing automated safety checks for prescriptions.

Fraud detection:
Financial institutions build knowledge graphs of account holders, transactions, devices, and IP addresses. Anomalous connections (a device associated with many accounts that each have suspicious transaction patterns) become visible as graph structures.

Semantic search in enterprise:
Vendors such as Palantir and Stardog offer knowledge graph platforms that connect disparate internal data sources, allowing analysts to ask multi-hop questions across systems that were never designed to talk to each other.
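The fraud detection pattern above (one device tied to many accounts) can be sketched as a grouping over account-device edges. The account and device identifiers, the threshold of three, and the function name are hypothetical; a production system would combine this signal with transaction patterns and other evidence.

```python
from collections import defaultdict

# Hypothetical (account, device) login edges.
logins = [
    ("acct-1", "device-A"),
    ("acct-2", "device-A"),
    ("acct-3", "device-A"),
    ("acct-4", "device-B"),
]


def shared_devices(logins, min_accounts=3):
    """Return devices linked to at least min_accounts distinct accounts."""
    accounts_by_device = defaultdict(set)
    for account, device in logins:
        accounts_by_device[device].add(account)
    return {
        device: sorted(accounts)
        for device, accounts in accounts_by_device.items()
        if len(accounts) >= min_accounts
    }


print(shared_devices(logins))
```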

06. Common Pitfalls / Misconceptions

Knowledge graphs are not just databases:
A relational database stores rows and columns. A knowledge graph stores entities and named, typed relationships between them, enabling graph traversal and logical inference that SQL was not designed for.

Symbolic AI did not fail; it was superseded for certain tasks:
Rule-based expert systems are still deployed in medicine, finance, and law. They were not wrong; they simply did not scale to open-ended perception and language tasks. Those limitations are what neural networks address.

Neuro-symbolic is not one thing:
It is an umbrella label for many architectures with very different designs. Claims that a system is "neuro-symbolic" need to be examined for what exactly is being combined and how.

Knowledge graphs have a coverage problem:
Even large knowledge graphs like Wikidata are incomplete. The ratio of what is represented to what is known can be very low in specialized domains. This motivates knowledge graph completion research, which tries to predict missing facts from the ones already present.