01. What It Is
A context window is the maximum number of tokens a language model can process in one request-response cycle. Every token the model reads or generates counts against this limit: the system prompt, the full conversation history, any documents you paste in, and the model's reply.
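The accounting above can be sketched as simple arithmetic. This is a minimal illustration, not any provider's API: the function name `remaining_for_reply` and its parameters are hypothetical, and the token counts are assumed to come from whatever tokenizer your model uses.

```python
def remaining_for_reply(
    window: int,
    system_tokens: int,
    history_tokens: int,
    document_tokens: int,
) -> int:
    """Return how many tokens are left for the model's reply.

    All inputs share one budget: whatever the prompt side consumes
    is no longer available for generation.
    """
    used = system_tokens + history_tokens + document_tokens
    # Never report a negative budget; 0 means the prompt alone fills the window.
    return max(0, window - used)


# Example with made-up numbers: a 100K window, mostly consumed by the prompt.
print(remaining_for_reply(100_000, 1_000, 30_000, 60_000))
```

With these illustrative numbers the prompt uses 91,000 tokens, leaving 9,000 for the reply.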
Models measure text in tokens, not words. In English, one token is roughly 0.75 words or four characters, so 100K tokens is approximately 75,000 words, or about 150 printed pages at 500 words per page. Chinese, Japanese, and Korean tokenize differently: a single CJK character often maps to one or two tokens, so CJK text yields fewer characters per token than English. However, each CJK character carries higher semantic density, so the information conveyed per token can be roughly comparable.
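The English rule of thumb can be turned into a quick estimator. This is a rough sketch based only on the ratios stated above (about four characters and 0.75 words per token); it is not a real tokenizer, the function names are hypothetical, and the 500 words per page figure is an assumption. It should not be used for CJK text, where the ratios differ.

```python
CHARS_PER_TOKEN = 4      # English heuristic from the text, not exact
WORDS_PER_TOKEN = 0.75   # English heuristic from the text, not exact
WORDS_PER_PAGE = 500     # assumed page density for illustration


def estimate_english_tokens(text: str) -> int:
    """Estimate token count for English text, rounding up."""
    # Ceiling division: any partial group of characters counts as one token.
    return -(-len(text) // CHARS_PER_TOKEN)


def tokens_to_words(tokens: int) -> int:
    """Convert a token count to an approximate English word count."""
    return round(tokens * WORDS_PER_TOKEN)


def tokens_to_pages(tokens: int) -> float:
    """Convert a token count to approximate printed pages."""
    return tokens_to_words(tokens) / WORDS_PER_PAGE


print(tokens_to_words(100_000))   # approximate words in 100K tokens
print(tokens_to_pages(100_000))   # approximate pages in 100K tokens
```

For exact counts, use the tokenizer that ships with the model you are calling; a character-based estimate like this is only good for rough budgeting.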
Once the context window is full, the model cannot see anything beyond it. Depending on the provider, the overflow is either silently dropped (usually the oldest messages go first) or the request is refused outright.
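The "drop the oldest messages" behavior can be sketched as follows. This is one plausible way to do it under stated assumptions, not how any particular provider implements it: `trim_to_budget` is a hypothetical helper, messages are plain strings ordered oldest to newest, and `count_tokens` is a caller-supplied function standing in for a real tokenizer.

```python
from typing import Callable


def trim_to_budget(
    messages: list[str],
    budget: int,
    count_tokens: Callable[[str], int],
) -> list[str]:
    """Keep the newest messages that fit within `budget` tokens.

    Messages are assumed to be ordered oldest first. The result
    preserves that order and drops from the oldest end.
    """
    kept: list[str] = []
    total = 0
    # Walk from newest to oldest, stopping at the first message that does not fit.
    for message in reversed(messages):
        cost = count_tokens(message)
        if total + cost > budget:
            break
        kept.append(message)
        total += cost
    kept.reverse()  # restore oldest-to-newest order
    return kept


# Toy token counter for illustration only: one token per whitespace-separated word.
def count_words(text: str) -> int:
    return len(text.split())


history = ["first message here", "second one", "third"]
print(trim_to_budget(history, 3, count_words))
```

In this toy run the oldest message is dropped because keeping it would exceed the three-token budget. A real application would usually pin the system prompt and trim only the conversation history.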