A context window is the total number of tokens a language model can "see" and reason about in a single request. It acts like worki...
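To make the idea of a fixed token budget concrete, here is a minimal sketch of trimming a conversation so it fits inside a context window. The function name `fit_to_context`, the per-message token counts, and the choice to reserve part of the window for the model's reply are all assumptions made for illustration; real systems count tokens with the model's own tokenizer and may use other trimming strategies.

```python
def fit_to_context(messages, token_counts, context_window, reserved_for_output):
    """Keep the most recent messages whose combined token count fits the budget.

    All names and numbers here are hypothetical; token_counts would normally
    come from the model's tokenizer.
    """
    budget = context_window - reserved_for_output
    kept = []
    used = 0
    # Walk backwards so the newest messages are kept first.
    for message, count in zip(reversed(messages), reversed(token_counts)):
        if used + count > budget:
            break
        kept.append(message)
        used += count
    kept.reverse()  # restore chronological order
    return kept, used


messages = ["system prompt", "old question", "old answer", "new question"]
token_counts = [50, 400, 600, 120]  # made-up counts for illustration

kept, used = fit_to_context(
    messages, token_counts, context_window=1000, reserved_for_output=200
)
print(kept, used)
```

Anything that does not fit is simply invisible to the model for that request, which is why older parts of a long conversation can drop out.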
An embedding is a list of numbers (a vector) that represents the meaning of a piece of text, an image, or other data. Items with s...
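One common way to compare two embeddings is cosine similarity, which measures how closely two vectors point in the same direction. The sketch below uses made-up three-dimensional vectors purely for illustration; real embeddings come from a trained model and typically have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, invented for this example.
cat = [0.9, 0.8, 0.1]
kitten = [0.85, 0.75, 0.2]
car = [0.1, 0.2, 0.95]

print(cosine_similarity(cat, kitten))  # close to 1: similar direction
print(cosine_similarity(cat, car))     # much lower: different direction
```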
Parameters are the billions of learned numerical values that make up a neural network's "knowledge." When you see "7B" or "70B" in...
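A parameter count translates fairly directly into memory. As a rough sketch, assuming "7B" means about 7 billion parameters and that each one is stored in 16 bits (2 bytes), the weights alone take roughly 14 GB. The helper name `weight_memory_gb` is hypothetical, and the figure ignores everything else a running model needs, such as activations and caches.

```python
def weight_memory_gb(num_parameters, bytes_per_parameter):
    """Approximate memory needed just to hold the weights, in gigabytes (1e9 bytes)."""
    return num_parameters * bytes_per_parameter / 1e9


seven_b = 7_000_000_000
seventy_b = 70_000_000_000

# Assumed storage formats: 2 bytes per parameter (16-bit), 1 byte (8-bit).
print(weight_memory_gb(seven_b, 2))    # 14.0
print(weight_memory_gb(seventy_b, 2))  # 140.0
print(weight_memory_gb(seven_b, 1))    # 7.0
```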
A token is the basic unit that a language model reads and writes -- not a word, not a character, but a chunk of text determined by...
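To show what "a chunk of text" can look like in practice, here is a toy tokenizer that greedily matches the longest piece it knows from a small, made-up vocabulary. This is only an illustration: production tokenizers learn their vocabularies from data and use more involved algorithms, so the function `tokenize` and the vocabulary below are assumptions, not any real model's behavior.

```python
def tokenize(text, vocab):
    """Split text into tokens by greedy longest-match against vocab.

    Characters not covered by the vocabulary become single-character tokens.
    """
    tokens = []
    max_len = max(len(piece) for piece in vocab)
    i = 0
    while i < len(text):
        # Try the longest possible piece first, then shrink.
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if piece in vocab:
                tokens.append(piece)
                i += length
                break
        else:
            tokens.append(text[i])
            i += 1
    return tokens


# Hypothetical vocabulary, invented for this example.
vocab = {"un", "believ", "able", " token", "ization", " is"}
text = "unbelievable tokenization"
tokens = tokenize(text, vocab)
print(tokens)  # ['un', 'believ', 'able', ' token', 'ization']
```

Two words become five tokens here, which is why token counts and word counts rarely match.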
Training is the process of building a model by adjusting its parameters over billions of examples, costing tens of millions of dol...
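The phrase "adjusting its parameters" can be illustrated at a tiny scale. The sketch below trains a model with a single parameter `w` to fit the rule y = 3x using gradient descent, a standard technique; the data, learning rate, and function name `train` are assumptions chosen for illustration. Real training applies the same basic loop to billions of parameters and examples.

```python
def train(examples, learning_rate=0.01, epochs=200):
    """Fit y = w * x by nudging w to reduce squared error on each example."""
    w = 0.0  # start with an uninformed parameter
    for _ in range(epochs):
        for x, y in examples:
            error = w * x - y          # how far off the prediction is
            gradient = 2 * error * x   # derivative of error**2 with respect to w
            w -= learning_rate * gradient
    return w


# Made-up examples that follow y = 3x.
examples = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]
w = train(examples)
print(w)  # approaches 3.0
```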
The transformer is the neural network architecture that underlies virtually every large language model in existence. Its core inno...
A large language model (LLM) is a neural network trained on massive amounts of text to predict the next word (token) in a sequence...
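Next-token prediction can be demonstrated without a neural network at all. The sketch below swaps in a much simpler technique, counting which word follows which in a tiny made-up corpus, and predicts the most frequent follower. It is not how an LLM works internally, but it shows the same task an LLM is trained on: given what came before, guess what comes next. The function names and the corpus are hypothetical.

```python
from collections import Counter, defaultdict


def build_bigram_counts(words):
    """Count, for each word, how often each other word follows it."""
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of word, or None if it was never seen."""
    if word not in counts:
        return None
    return counts[word].most_common(1)[0][0]


corpus = "the cat sat on the mat and the cat slept".split()
counts = build_bigram_counts(corpus)

print(predict_next(counts, "the"))  # cat
print(predict_next(counts, "sat"))  # on
```

An LLM replaces the lookup table with a neural network that conditions on the whole preceding context rather than a single word.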