Glossary of AI Terms | BasicsOf.AI

A

Abliteration: Editing a model's internal weights to erase the single direction that controls refusals, without retraining.
Academic integrity: Doing your own work honestly and crediting any help you used.
Agent (agentic mode): A mode where the AI takes actions, creating and editing files, running commands, and fixing its own errors across many steps.
Agent-computer interface (ACI): The set of tools and interaction format a harness gives a model, the agent equivalent of the screen and keyboard a person uses to work a computer.
Agent framework: Software that coordinates multiple AI agents working together on a task.
Agent harness: The software layer wrapped around a language model that runs the agent loop, hands the model its tools, executes the actions it asks for, and manages the context window. The harness is what turns a model into an agent.
Agentic browser: A browser, extension, or mode where the AI acts on websites for you, not just answers.
Agent loop: The repeating cycle the harness drives. Send the current context to the model, run whatever tool the model asks for, feed the result back, and repeat until the task is done or a stop condition is hit.
Agent mode: The setting that lets the assistant take multi-step actions, unlike a sidebar that only chats.
AGENTS.md: A plain Markdown README for agents giving build steps, tests, and conventions to coding tools.
Agent skill: A folder of plain-language instructions plus optional bundled files an agent loads when a task matches.
AGI: A system that generalises knowledge across domains and solves tasks it was not explicitly trained for.
AI-text detector: Software that guesses if writing was AI-generated, giving a probability, not a verdict.
AI-use disclosure: A short statement naming which AI tools you used and how.
AI agent: A system where a language model directs its own actions in a loop, using tools and feedback to reach a goal.
AI coding assistant: A tool that writes or completes software from plain-language instructions, instead of you typing every line.
AI companion: An app or character built for ongoing personal or romantic conversation. Designed for attachment.
AI governance: The rules, norms, and institutions for how AI systems are developed, deployed, and audited.
AI Overview: Google's AI-written summary box on top of a normal results page, above the blue links.
AI RMF: NIST's voluntary AI Risk Management Framework, increasingly cited by regulators as a reference.
AI winter: A collapse in AI funding after overpromising fails to match actual capabilities.
Algorithm: In machine learning, a procedure that, given data, adjusts a model's parameters to minimize error or maximize reward.
Algorithmic bias: A systematic, repeatable AI error producing unfair outcomes against a protected group.
Alignment: Making an AI reliably pursue goals humans actually want.
Answer engine: A tool that searches the web and writes one cited answer instead of a list of links.
API: A way for software to send text to a model and pay per use.
ARIMA: A classic statistical forecasting model.
ASI: A hypothetical system whose cognitive performance greatly exceeds the most capable human in virtually all domains.
Assistive technology (AT): Any product or system that helps a person function, from a wheelchair to a screen reader.
Automatic speech recognition: Transcribing spoken words into text (ASR).
Automatic speech recognition (ASR): AI turning spoken words into text in real time.
Autoregressive image generation: A newer class of image generators that builds image content sequentially, one piece after another.
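
As an illustration of the Agent loop entry in this section, here is a minimal sketch of the cycle a harness drives. Everything in it is hypothetical: `fake_model` stands in for a real model call, the `TOOLS` registry holds one toy tool, and the message format is invented for the example.

```python
from typing import Callable

# Hypothetical tool registry; the tool name and behavior are illustrative only.
TOOLS: dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split())),
}

def fake_model(context: list[str]) -> dict:
    """Stand-in for a real model call: asks for one tool, then finishes."""
    if not any(line.startswith("tool_result:") for line in context):
        return {"type": "tool_call", "tool": "add", "arg": "2 3"}
    return {"type": "final", "text": "The sum is " + context[-1].split(": ")[1]}

def run_agent(task: str, max_steps: int = 5) -> str:
    context = [f"task: {task}"]
    for _ in range(max_steps):                       # stop condition: step budget
        reply = fake_model(context)                  # send the current context to the model
        if reply["type"] == "final":                 # stop condition: task is done
            return reply["text"]
        result = TOOLS[reply["tool"]](reply["arg"])  # run the tool the model asked for
        context.append(f"tool_result: {result}")     # feed the result back
    return "stopped: step limit reached"

print(run_agent("add 2 and 3"))
```

A real harness would also validate tool arguments and manage the context window, which this sketch leaves out.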

B

Batch API: Non-real-time processing at a 50% discount, with results delivered within 24 hours.
Benchmark: A standardized, shared eval with a fixed test set and scoring.
BF16 and INT4: A 2-byte default format versus a 4-bit format that quarters memory.
Bias-variance tradeoff: More model complexity lowers bias but raises variance. The goal is the sweet spot.
Bounding box: A rectangle (x, y, width, height) that localizes an object in an image.
Bring-your-own-LLM: You connect the language model you choose, such as Claude, GPT, or a local one, rather than being tied to a single provider.
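
The arithmetic behind the BF16 and INT4 entry can be sketched as follows. This is a rough estimate for the weights alone, assuming a hypothetical 7-billion-parameter model; real quantized files carry some extra overhead (for example scaling factors), so actual sizes will differ somewhat.

```python
def weight_memory_gb(params: float, bits_per_weight: int) -> float:
    """Memory needed for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9

params = 7e9                          # assumed model size, for illustration
bf16 = weight_memory_gb(params, 16)   # 2 bytes per weight
int4 = weight_memory_gb(params, 4)    # half a byte per weight

print(f"BF16: {bf16:.1f} GB, INT4: {int4:.1f} GB, ratio: {bf16 / int4:.0f}x")
```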

C

C2PA Content Credential: A cryptographically signed provenance record embedded in a digital asset.
Capability axis: Sorting AI by how broad its intelligence is: ANI, AGI, ASI.
CE marking: A mark high-risk EU AI systems require as evidence of compliance, like product safety marking.
Chain-of-thought: Prompting the model to reason step by step before answering.
Chain-of-thought (CoT) prompting: Eliciting a language model's intermediate reasoning steps before its final answer.
Chat app: A finished product like ChatGPT run on the company's servers, zero setup.
Chunking: Splitting documents into smaller segments before embedding them for retrieval.
Circle to Search: The Android gesture that searches whatever is on your screen without leaving the app.
Citation / source link: The numbered footnote an answer attaches to a claim. A pointer to verify, not a guarantee.
Closed-weight model: Proprietary weights accessed only via API or a product.
Cloud / hosted: The model runs on a company's servers. You reach it over the internet.
Collaborative filtering: Recommending from patterns in many users' behavior.
Commercially safe / indemnified model: A generator trained only on licensed or public-domain data, sold with a promise to defend the user.
Compounding error: An early mistake that cascades into increasingly wrong subsequent actions.
Compute concentration: Degree to which training compute is held by a small number of actors.
Computer use: The model looks at a screenshot and replies with an action like click, type, or scroll.
Conformity assessment: A formal evaluation that a high-risk AI system meets requirements before deployment.
Content Credentials (C2PA): A tamper-evident record attached to a file describing how it was made, including any AI use.
Context compaction: Condensing accumulated context to reclaim token budget without losing task continuity.
Context engineering: The discipline of designing and managing the full context window a model sees at inference.
Context management: The harness keeping the model's limited context window useful. One common technique is compaction, which summarizes a near-full window and starts a fresh one from the summary.
Context window: The max number of tokens a language model can process in one request-response cycle.
Cost asymmetry: Training is a huge one-time cost. Inference dominates ongoing cost.
Cross-device: Millions of consumer devices each training on a small local dataset.
Cross-silo: A few institutions collaborating without sharing raw data.

D

Data-centric AI: Improving model performance by improving data quality rather than model architecture.
Data augmentation: Transforming existing samples to expand a dataset.
Data leakage: Including info in training data not available at prediction time, inflating apparent performance.
Data retention: How long the company keeps your conversation after you delete it.
Decentralized training: Fine-tuning a model across many separate computers over the open internet rather than in one data center. Nous Research coordinates this with its Psyche network and DisTrO optimizer.
Deepfake: AI-altered or synthesized media made to look authentic, often of a real person.
Deep Learning: A subset of ML that uses neural networks with many hidden layers.
Deep research: A slower mode that searches many sources and writes a longer, cited answer.
Dense retrieval: Semantic matching via neural vector embeddings.
Description: The line in a skill's SKILL.md that the assistant uses to decide when to reach for the skill.
Diffusion model: A model that learns to reverse a noise-adding process to make images from text.
Direct injection: A user message that tries to override the system prompt or jailbreak.
Discriminative AI: Earlier systems that classify or predict rather than create.
Disparate impact: A neutral policy that disproportionately harms a protected group.
Disparate treatment: Explicit use of a protected characteristic in a decision.

E

Edge AI: Running ML models on the device where data is generated, not in the cloud.
Effective context: The portion of the window a model reliably attends to, smaller than the technical max.
Embedded AI: AI built into an app you already use rather than a separate chatbot, met as a button or icon.
Embedding: A dense numeric vector that represents the meaning of text or other data.
Embodied AI: Intelligence arising from a body interacting with a physical environment.
Emotional over-attachment: When a child turns to an always-available, always-agreeable AI for comfort in place of people.
Ensemble: A collection of models whose predictions are combined, like random forests and gradient boosting.
Enterprise no-training tier: An approved tier that does not train on your data.
Eval: Measuring how well a model performs on a defined set of tasks.
Explainability: Describing why a model produced a specific output in actionable human terms.
Explainability / XAI: Making AI decisions interpretable to affected parties and auditors.
Export controls: US restrictions limiting export of advanced AI chips to China and other countries.

F

Fact-checking an AI answer: Judging which parts of a response to trust and which to confirm first.
Failure mode: A predictable way AI fails, following from how the technology works.
Fair use: US doctrine allowing limited use of protected works without permission.
False positive: When a detector flags human-written work as AI.
Feature map: The output of applying a convolutional filter to an image or a previous layer's output.
Federated learning: Training a shared model across devices without raw data ever leaving them.
Few-shot: Prompting with a few examples of the desired output.
Few-shot CoT: Including a handful of worked examples with step-by-step solutions (the original paper used eight) so the model copies the pattern.
Fine-tuning: Continuing training on a curated dataset to specialize a base model.
FLOP: Floating-point operation. The standard unit for measuring training compute.
Follow-up refinement: One-line tweaks like "shorter" or "as a table" that refine the output.
Forecasting foundation models: Pre-trained models like TimesFM for zero-shot forecasting.
Foundation Model: A large model trained on broad data at scale, adaptable to many downstream tasks.
Four freedoms: The OSI criteria a system must meet to count as open source.
Free tier: The no-cost version of a chat app, with daily message and feature limits.
Frontier model: The most capable models available, run in the cloud by the big providers.
Function calling: OpenAI's term for tool use. The concept is identical.

G

Gateway: OpenClaw's always-on local background program that holds your chat connections and runs the agent. The single control point.
Generalist chat app: A multimodal generalist that covers most everyday jobs.
Generative AI: Models that learn data's structure and produce new content resembling it.
Generative AI tool (text-to-X): Software that makes text, an image, audio, or video from a prompt or reference.
Generative editing: AI editing of existing media, like removing part of an image, extending a frame, or dubbing.
Generative photo edit: Editing a photo by adding or removing content with AI, so the result is partly invented.
GEO (generative engine optimization): Shaping your content so AI answers cite you. The AI-era version of SEO.
GGUF: A dominant format for deploying quantized models.
Greedy decoding: Always picking the single highest-probability token.
Grounding: Anchoring output to real sources, especially via RAG.
Guardrails: Input and output filtering or moderation layers around a model.
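
To make the Greedy decoding entry concrete, here is a minimal sketch. The probability table is a toy bigram lookup invented for the example; a real model would compute the next-token distribution from the whole context.

```python
from typing import Callable

# Toy next-token probabilities, keyed by the previous token (illustrative values).
TABLE = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "sat": {"<eos>": 0.9, "down": 0.1},
}

def greedy_decode(next_probs: Callable[[str], dict], start: str, max_tokens: int = 10) -> list[str]:
    tokens = [start]
    for _ in range(max_tokens):
        probs = next_probs(tokens[-1])
        if not probs:                     # no known continuation
            break
        best = max(probs, key=probs.get)  # always take the single most likely token
        tokens.append(best)
        if best == "<eos>":               # end-of-sequence marker
            break
    return tokens

print(greedy_decode(lambda tok: TABLE.get(tok, {}), "the"))
```

Because the choice is always the top token, the same input yields the same output every time.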

H

Hallucination: Fluent, confident text that is factually wrong or unsupported by any source.
High-risk claims: Names, numbers, quotes, citations, dates, and legal or medical claims to verify.
Human review: A sample of chats read by trained people to rate quality or check for abuse.
Hybrid reasoning: A single model that can answer instantly or first work through a visible step-by-step <think> monologue, toggled on or off by the user or the model.
Hybrid search and reranking: Combining sparse and dense retrieval, then running a final precision pass with a cross-encoder.
Hyperparameter: A parameter set before training, as opposed to one learned during training.

I

ILSVRC: ImageNet Large Scale Visual Recognition Challenge, the annual 2010-2017 image classification competition.
ImageNet: A dataset of 14 million labeled images across 20,000+ categories. A 1,000-category subset is used for the ILSVRC benchmark.
Imposter scam: The FTC's most-reported category: pretending to be someone you trust to extract money.
Indirect injection: Hidden instructions in content the model later retrieves.
Inference: Running the finished model to generate output, with no learning.
Instrumental convergence: Almost any goal plus enough intelligence leads an agent to pursue self-preservation, resources, and goal preservation.
Intent matching: The old command-and-control approach needing near-exact phrasing.
Interpretability: A property of a model whose decision process is directly readable.

K

k-nearest neighbor search: Finding the k closest vectors by geometric proximity.
Knowledge cutoff: A fixed date past which the model has no built-in knowledge.
Knowledge distillation: Training a small student model to replicate a large teacher model's behavior.
Knowledge graph: Facts stored as a network of entities and labeled relations.
KV cache: Stored key and value representations for all tokens in context, for efficient generation.
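
A brute-force sketch of the k-nearest neighbor search entry, assuming cosine similarity as the measure of closeness (other distance measures are also common). The three-dimensional vectors are made up for the example; real embeddings have hundreds or thousands of dimensions, and production systems usually use approximate indexes instead of scanning every vector.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms

def k_nearest(query: list[float], vectors: dict[str, list[float]], k: int) -> list[str]:
    """Return the names of the k stored vectors most similar to the query."""
    ranked = sorted(vectors, key=lambda name: cosine_similarity(query, vectors[name]), reverse=True)
    return ranked[:k]

# Illustrative vectors, not real embeddings.
VECTORS = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "car": [0.1, 0.9, 0.2],
    "truck": [0.0, 0.8, 0.3],
}

print(k_nearest([1.0, 0.0, 0.0], VECTORS, k=2))
```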

L

Label: The correct output for a supervised learning example. Also called annotation or ground truth.
LAION: German nonprofit behind large image-text datasets used to train image models.
Large language model: A neural network trained on text to predict the next token.
Least privilege: Giving the agent the minimum sites and actions it needs, not access to everything you are logged into.
LLM: A deep learning model built on the transformer architecture, trained on text via next-token prediction.
LLMOps: MLOps extended for LLMs, adding prompt versioning, evals, and cost monitoring.
LLM rebuild: The shift to LLM-based assistants that converse and chain steps.
Local / on-device: The model runs on your own computer, offline, with nothing sent out.
Local install: Running an open model on your own computer, offline.
Local model: A model that runs on your own device, so prompts never leave it.
Local model runner: A tool that serves and runs AI models locally instead of via a hosted API.
LoRA: A parameter-efficient method that updates only a small fraction of weights.
Lost in the middle: Models recall info buried mid-context worse than at the start or end.

M

Machine learning: AI where systems learn patterns from data instead of hand-coded rules.
Markov decision process: The states-actions-rewards framework underlying most RL.
MCP host: An AI application, like Claude Desktop, that connects to MCP servers.
MCP server: A program that wraps a tool, database, or API for any MCP client.
Mean Average Precision (mAP): The standard metric for evaluating object detection, averaging precision across recall levels and categories.
Memory (stateful vs stateless): Whether the app carries what it knows about you between conversations.
Memory poisoning: When hidden instructions get written into memory and quietly affect later answers. A prompt-injection attack.
Metadata: Fields attached to each chunk at ingestion enabling hybrid filtering and citations.
Mixture-of-experts: An architecture where only part of the model runs per token.
MLOps: DevOps rigor applied to the full machine learning lifecycle.
Modality: A type of data such as text, image, audio, or video.
Model: The trained artifact produced by running a learning algorithm on a specific dataset.
Model-based RL: Reinforcement learning that plans using a learned model of dynamics.
Model collapse: Degradation from training recursively on a model's own outputs.
Model Context Protocol: An open standard for connecting AI apps to external tools and data.
Model drift: Silent performance decay as data shifts after deployment.
Model hub: A platform for discovering, accessing, and deploying open-weight models.
Model routing: Directing requests to different models based on complexity to optimize cost.
Multi-agent orchestration: Coordinating several agents on one task instead of relying on one.
Multimodal model: A model that processes or generates more than one type of data.

N

Name the task first: Identify the job, then start simple and specialize only if needed.
Narrow AI (ANI): AI that performs only within a specific domain. All AI today is narrow.
Natural language processing: The AI subfield for understanding and generating human language.
Neural network: Layers of simple units that weight inputs, sum them, and decide.
Neuron: A unit that weights its inputs, sums them, and applies an activation.
Neutral alignment: Nous Research's design goal of building models that follow instructions and refuse far less often than mainstream assistants, while still declining a few defined categories.
Next-token prediction: The single training objective that, at scale, yields broad ability.
No-code / app builder: A tool where you chat in plain language and it builds a running app, often handling hosting, login, and the database.
Non-parametric memory: Retrieved external data, swappable without retraining.

O

Ollama and LM Studio: Free programs that download and run local models.
On-device assistant: An SLM at or below ~4B parameters running on a phone.
On-device inference: A local model processes data with no network round-trip.
On-device processing: Running the AI on the phone itself rather than sending data to a server.
On-device vs cloud: Whether the AI runs on your phone (private, offline) or sends data to company servers (more capable).
Open-source AI: A stronger standard than open-weight, adding the data, code, and recipe needed to rebuild the model.
Open-weight: You can download and run a model's weights, but not its data or recipe.
Open-weight model: Publicly downloadable weights you can self-host.
Operator vs request layer: The system prompt sets behavior. The user turn carries the request.
Opt-out mechanism: A way for rights holders to signal their content should not train AI.
Orchestration framework: A library that structures how an app calls models, routes data, and chains operations.
Orchestrator: An LLM that decomposes tasks and directs worker agents.
Orchestrator-worker pattern: A lead agent splits work among subagents and combines results.
Orthogonality thesis: Intelligence and goals are independent. A highly intelligent system can pursue mundane or destructive goals.
Orthogonal lenses: Four independent axes: capability, functionality, approach, function.
Overfitting: The model memorizes noise in the training data and performs poorly on new data.
Overlap: Tokens repeated between the end of one chunk and the start of the next to preserve boundary context.
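
The Overlap entry can be illustrated with a fixed-size chunker. This is a minimal sketch that treats whitespace-separated words as tokens, which is an assumption made for simplicity; a real pipeline would count tokens with the model's own tokenizer.

```python
def chunk_with_overlap(tokens: list[str], chunk_size: int, overlap: int) -> list[list[str]]:
    """Split tokens into chunks where each chunk repeats the last `overlap` tokens of the previous one."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be at least 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):  # this chunk reached the end of the text
            break
    return chunks

words = "a b c d e f g h i j".split()
for chunk in chunk_with_overlap(words, chunk_size=4, overlap=1):
    print(chunk)
```

With these settings, the last token of each chunk reappears as the first token of the next, so a sentence cut at a boundary keeps some of its context.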

P

Paid tier / subscription: A flat monthly fee that raises limits and unlocks the best models and features.
Parallelization: Processing all tokens at once, which enabled massive scaling.
Parameter: A single learned number (weight) inside a neural network.
Parameter count: The 7B or 70B in a model name, the total number of weights.
Parameters: The billions of learned values that separate an LLM from earlier models.
Parametric memory: Knowledge baked into model weights during training.
Parent-document retrieval: Index small chunks for precision, return the larger parent chunk to the LLM for context.
Parental controls: Settings to link a teen's account, set limits, disable features, and opt out of training. Bypassable.
Perplexity: The predictability of writing that detectors measure. Plain, predictable wording scores as machine-like.
Persona: Who the AI should act as in the prompt.
Personal AI agent: An AI assistant you run yourself that can take actions on your computer and accounts, not just answer questions.
Personalization / personal context: The umbrella term for tailoring answers to you, such as Gemini's Personal Intelligence.
pgvector: A Postgres extension adding vector search to a relational database.
Pig butchering: A long-game romance-and-investment scam luring a victim into a fake crypto investment over weeks.
Planning without physical cost: Exploring possible futures mentally before acting.
Policy: A strategy for choosing actions that maximizes cumulative reward.
Power Usage Effectiveness (PUE): Ratio of total data center energy to IT equipment energy. 1.0 is perfect.
Precedence: System-level constraints outrank user messages.
Progressive disclosure: The agent sees only a one-line summary of each skill and opens full files only when needed.
Prompt: The full text you send a model before it generates a reply.
Prompt caching: Storing the computed KV-cache for repeated prompt prefixes to avoid reprocessing.
Prompt engineering: Shaping output by changing the input at inference. The model is unchanged.
Prompt injection: Crafted input that makes an LLM override its instructions.
Prompt injection (indirect): A hidden instruction in a webpage or email that hijacks the agent into something you did not ask for.
Prompt shape: A simple template: persona, task, context, and format.
Provenance: Verifying where media came from, the durable defense against fakes.

Q

Quantization: Storing weights at lower precision to cut memory and speed inference.

R

RAG: Retrieval-augmented generation, where retrieved data is fed to a model to answer queries.
RAG (retrieval-augmented generation): Retrieving relevant documents and inserting them into context at query time.
ReAct: A pattern interleaving reasoning with tool actions.
Reasoning model: A model trained to generate extended internal thinking before answering.
Recommender system: A filter that surfaces the items most relevant to a user.
Red teaming: Adversarial testing that tries to break a model's safety.
RefusalBench: Nous's own benchmark that scores how often a model answers requests other assistants tend to decline. A higher score means fewer refusals, not better or safer answers.
Refusal direction: A one-dimensional pattern inside a model's activations that, when present, makes it refuse a request.
Regularization: Adding a penalty term to the loss to discourage large coefficients and reduce overfitting.
Reinforcement learning: Learning by taking actions and receiving rewards, with no labels.
Responsible AI: Practices and governance for fair, transparent, accountable, and safe AI.
Retrieval-Augmented Generation: Connecting a model to an external knowledge source at inference.
Retrieval and ranking: Two phases: narrow to candidates, then score and order them.
RLHF: Reinforcement learning from human feedback, a training method that uses human preference ratings to make a model helpful, harmless, and honest.
RLHF and DPO: Two ways to align a model to human preferences. DPO is the simpler one.
Running locally: The model lives on your computer and runs offline, no server.

S

Saturation and contamination: When benchmarks max out or leak into training, they stop predicting real performance.
Saved memory / memory summary: The stored profile of facts and preferences about you, which works like your notepad in ChatGPT.
Scaffold: Another word for the harness, common in benchmarking. The combination of a model and the scaffold around it is what people loosely call an agent.
Scratchpad: Any intermediate computation generated before a final answer.
Screen reader: Software that reads what is on a screen aloud, or sends it to a braille display.
Self-attention: Letting every token directly consider every other token at once.
Self-consistency: Sampling multiple reasoning chains and taking a majority vote on the final answers.
Self-hosted / local-first agent: An agent that runs on your own machine and keeps your data there, instead of on a company's servers.
Semantic Chunking: Uses an embedding model to place boundaries where cosine similarity between adjacent sentences drops.
Semantic search: Finding results by meaning rather than exact keywords.
SHAP and LIME: Post-hoc methods that approximate why any model made a prediction.
Sim-to-real gap: The drop in performance moving from simulation to the real world.
Size-to-capability curve: Each year's small models approach the prior era's large ones.
Skill: A folder with a SKILL.md file that teaches an AI assistant one repeatable task.
SKILL.md: The one required file in a skill: a name, a description, then plain-language instructions.
Small language model: A compact model, usually under ~15B parameters, that runs locally.
Social engineering: Manipulating a person into sending money or sharing a code by exploiting trust and urgency, not hacking.
Soft labels: The teacher's full probability distribution, encoding which concepts are similar.
Sparse retrieval: Keyword matching via term-frequency vectors (BM25).
Specialized tool: Use one only when it wins outright: voice, video, heavy data, or code.
Specification gaming: A model finding unintended ways to maximize its reward.
Subagent: A worker agent with its own context window handling part of the task.
Summarize: A one-tap feature that shortens a long thread, page, recording, or notifications. A preview, not a replacement.
Sycophancy: A chatbot's tendency to agree with and flatter the user to stay engaging.
Symbolic AI: AI based on explicit, human-readable rules and logic (GOFAI).
Synthetic data: Machine-generated data that mimics real data's statistics without real records.
Synthetic media: Any image, audio, video, or text produced or substantially altered by AI.
Systemic risk: Under the EU AI Act, general-purpose AI (GPAI) models trained with more than 10^25 FLOPs are presumed to pose systemic risk and face the heaviest rules.
System prompt: Operator-level instructions sent before any user message, lasting all session.
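
The voting step in the Self-consistency entry can be sketched in a few lines. The sampled answers below are invented for the example; in practice each would be the final answer extracted from a separately sampled reasoning chain.

```python
from collections import Counter

def self_consistent_answer(final_answers: list[str]) -> str:
    """Return the most common final answer across sampled reasoning chains."""
    return Counter(final_answers).most_common(1)[0][0]

# Hypothetical final answers from five sampled chains for the same question.
sampled = ["42", "42", "41", "42", "40"]
print(self_consistent_answer(sampled))
```

Only the final answers are compared, so chains that reason differently but land on the same result still count as agreeing.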

T

Teacher and student: The large model (teacher) whose outputs train the smaller, cheaper model (student).
Technical debt: Messy or duplicated code that works now but gets harder and costlier to change later.
Temperature: The main dial trading coherence against creativity in sampling.
Temporary / Incognito chat: A no-memory mode: not saved to history, doesn't use or update memory, not used for training.
Test-time compute: Spending more compute at inference so the model thinks longer.
Test set: Data held out for final evaluation. Touched only once.
Text-to-image: Producing a new image from a natural-language prompt, not retrieval.
Text-to-speech: Generating spoken audio from text.
Text-to-speech (TTS) and voice banking: TTS reads text aloud. Voice banking recreates a person's own voice from recordings.
TF-IDF: A classic method weighting words by frequency and rarity.
The do-not-paste rule: Never paste confidential or personal data into a consumer AI account.
The edge: Any compute outside a central data center: a phone, sensor, or browser tab.
The nesting: Every deep learning system is ML, and every ML system is AI, but not the reverse.
Thinking tokens: Intermediate reasoning steps generated before the final answer.
Time series: A sequence of observations indexed in chronological order.
Token: The basic chunk of text a model reads and writes, not a word or character.
Token-by-token generation: Producing novel output from a learned probability distribution, not retrieval.
Token budget: The explicit allocation of context space to different components.
Tokenization: Converting raw text into integer token IDs before processing.
Tokens per second: The measure of how fast a model generates text.
Tool definition: A name, description, and JSON schema the model reads to decide when to call.
Tool use: A model requesting that the application run an external function.
Top-p and top-k: Filters that limit which tokens are eligible before sampling.
Training: Adjusting a model's parameters over many examples to build it.
Training data license: A contractual right to use specific datasets for training.
Training opt-out: A per-app setting controlling whether your chats are used to improve the model. Training is usually on by default.
Transformer: The neural network architecture behind virtually every LLM.
Triple: A subject-predicate-object fact such as (Paris, capitalOf, France).
TTFT (Time to First Token): How long until the model starts outputting. Key latency metric for streaming.
Turing test: If a human cannot reliably tell a machine from a person in text chat, it can be said to think.
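
The Temperature and Top-p and top-k entries in this section describe steps that run just before a token is sampled. Here is a minimal sketch of all three, using made-up logits and probabilities; the function names are chosen for this example and do not come from any particular library.

```python
import math
import random

def softmax_with_temperature(logits: dict[str, float], temperature: float) -> dict[str, float]:
    """Lower temperature sharpens the distribution; higher temperature flattens it."""
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

def renormalize(probs: dict[str, float]) -> dict[str, float]:
    total = sum(probs.values())
    return {tok: p / total for tok, p in probs.items()}

def top_k_filter(probs: dict[str, float], k: int) -> dict[str, float]:
    """Keep only the k most likely tokens."""
    keep = sorted(probs, key=probs.get, reverse=True)[:k]
    return renormalize({tok: probs[tok] for tok in keep})

def top_p_filter(probs: dict[str, float], p: float) -> dict[str, float]:
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    kept, cumulative = {}, 0.0
    for tok in sorted(probs, key=probs.get, reverse=True):
        kept[tok] = probs[tok]
        cumulative += probs[tok]
        if cumulative >= p:
            break
    return renormalize(kept)

def sample(probs: dict[str, float]) -> str:
    return random.choices(list(probs), weights=list(probs.values()))[0]

probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}  # illustrative distribution
print(sorted(top_k_filter(probs, 2)))    # tokens that survive top-k
print(sorted(top_p_filter(probs, 0.9)))  # tokens that survive top-p
print(sample(top_p_filter(probs, 0.9)))  # one sampled token
```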

U

Uncensored model: An open-weight model whose safety and refusal training was removed or never added, so it rarely declines a request.
Unified memory: Apple Silicon's shared memory used to hold the model.
Usage cap / limit: A ceiling on messages or heavy requests in a period before you wait or upgrade.

V

Validation set: Data used to tune hyperparameters and compare models. Also called dev set.
Vector database: Storage that returns the stored vectors most similar to a query vector.
Vector space: A high-dimensional space where similar items land close together.
Verify, do not forward: AI output is a draft to check, not a fact to pass on.
Verify checkable claims: Confirm specific AI claims against an outside source.
Vibe coding: Describing what you want and letting the AI write the code, accepting it without reading or fully understanding it.
Vision-Language-Action model: A model mapping perception and language to robot actions.
Vision-language model: A model that reasons across text and images together (VLM).
Vocabulary: The fixed set of tokens, typically 32,000 to 256,000 entries.
Voice assistant: Software you talk to on a speaker, phone, or display.
Voice cloning: Using AI to copy a person's voice from a short audio sample.
Voice cloning and AI dubbing: Recreating a voice from a sample to read new lines, and translating speech while keeping the original delivery.
VRAM: Graphics-card memory that must hold the whole model to run it.

W

Weights and biases: The numbers a network learns, where all its knowledge lives.
Word embeddings: The predecessor of the contextual embeddings transformers produce.
Workflow: A system where LLMs and tools are orchestrated through predefined code paths.
World model: A learned internal model of an environment an agent can simulate.

Z

Zero-shot: Prompting with no examples included.
Zero-shot CoT: Appending a trigger phrase like "Let's think step by step" to get reasoning with no examples.
Zero Data Retention: Business API mode where inputs and outputs are not logged or stored, apart from a narrow child-safety exception.

#

"Nudify" deepfake: A tool that fabricates a fake nude from a clothed photo. Against a minor it is CSAM under US law.