03. How It Works: The Three Paths Compared
1. Chat app (subscription)
You open a browser tab or install an app, sign in, and start typing. The model runs in the provider's data centre on hardware you could never fit at home. You get the company's strongest models, a polished interface, and extra features bolted on (web search, file uploads, image generation, voice).
- Cost: a flat monthly fee. Most providers offer a limited free tier and a paid tier at around 20 US dollars a month. See chat-apps-and-subscriptions.
- Setup: none. Works on any device with a browser.
- Privacy: your conversations are sent to the company. Depending on the provider, they may be used to improve its models unless you opt out in settings.
- Capability: the highest available. Chat apps run the frontier models.
This is the path the vast majority of people should use.
2. API (for developers and tools)
An API is plumbing, not a product. Instead of a chat window, a piece of software makes a request to the provider's servers, includes your text, and receives the model's answer. You need either some code or a third-party tool that is built to talk to APIs.
- Cost: pay-as-you-go, measured in tokens (chunks of text, each roughly three-quarters of a word). You pay for both the text you send and the text you get back. Cheap at low volume, but bills can grow large at high volume. See tokens-and-tokenization and cost-latency-deployment.
- Setup: requires technical work or a tool built on top of it.
- Privacy: data still goes to the provider's servers, though business and "zero-retention" tiers exist.
- Capability: the same frontier models as the chat apps, plus full control over how they are used.
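To make the pay-as-you-go arithmetic concrete, here is a minimal sketch of a cost estimate. It uses the three-quarters-of-a-word rule of thumb from the Cost bullet. The prices are hypothetical placeholders, not any provider's real rates, and the function names are invented for illustration.

```python
# Rough API cost estimate. All prices here are made-up placeholders.
WORDS_PER_TOKEN = 0.75  # rule of thumb: one token is about three-quarters of a word


def estimate_tokens(word_count: int) -> int:
    """Convert a word count into an approximate token count."""
    return round(word_count / WORDS_PER_TOKEN)


def estimate_cost(
    input_words: int,
    output_words: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Approximate cost in dollars of one request: text sent plus text received."""
    input_tokens = estimate_tokens(input_words)
    output_tokens = estimate_tokens(output_words)
    total = (
        input_tokens * input_price_per_million
        + output_tokens * output_price_per_million
    )
    return total / 1_000_000


# Hypothetical prices: 3 dollars per million input tokens, 15 per million output tokens.
per_request = estimate_cost(750, 300, 3.0, 15.0)
print(f"One request: about {per_request:.4f} dollars")
print(f"10,000 requests: about {per_request * 10_000:.2f} dollars")
```

Under these assumed prices a single request costs under one cent, while ten thousand of them cost about 90 dollars, which is the "cheap at low volume, large at high volume" pattern in practice.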
A non-coder rarely touches an API directly. It is named here so the landscape is complete, and because many of the tools people do use are API clients underneath.
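For the curious, the "software makes a request" step can be sketched in a few lines. This is an illustration only: the URL, the model name, and the JSON field names are hypothetical, and each real provider documents its own. The sketch builds the request but deliberately does not send it.

```python
import json
import urllib.request

# Hypothetical values: real providers publish their own URL, model names, and fields.
API_URL = "https://api.example.com/v1/chat"
API_KEY = "YOUR_API_KEY"


def build_request(prompt: str, model: str = "example-model") -> urllib.request.Request:
    """Package a prompt as an HTTP POST request, ready to send."""
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",  # the key identifies who gets billed
        },
        method="POST",
    )


request = build_request("Summarise this paragraph in one sentence.")
print(request.get_method(), request.full_url)
# Sending it would be urllib.request.urlopen(request); the response carries the answer.
```

Tools built on an API do essentially this behind their interface: wrap your text in a request, attach a key, and display what comes back.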
3. Local install (run it on your own machine)
You download an open-weight model (one whose trained weights are published for anyone to download) and run it with a free program like Ollama or LM Studio. The model lives on your hard drive and runs on your own processor or graphics card.
- Cost: no usage fees. You pay only for the hardware, which you may already own, and for electricity.
- Setup: install a program, download a model file (several gigabytes), and you are running. Modern tools have made this much easier than it used to be. See running-llms-locally.
- Privacy: the strongest. Your prompts never leave your computer. It works with no internet once the model is downloaded.
- Capability: lower than the cloud frontier. You can run small and mid-size models, not the hundreds-of-billions-parameter giants. See hardware-and-performance.
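A back-of-envelope calculation shows why model size limits what runs at home. The sketch below estimates the size of the weights alone as parameters times bits per weight, ignoring the extra memory a running model needs. The parameter counts and bit widths are illustrative assumptions, not figures from any particular model.

```python
def weights_size_gb(parameters_billion: float, bits_per_weight: int) -> float:
    """Approximate size of a model's weights in gigabytes (weights only)."""
    total_bytes = parameters_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Illustrative cases: (parameters in billions, bits stored per weight).
examples = [(7, 4), (7, 16), (70, 4), (400, 16)]
for params, bits in examples:
    size = weights_size_gb(params, bits)
    print(f"{params}B parameters at {bits} bits: about {size:.1f} GB")
```

By this estimate a 7-billion-parameter model stored at 4 bits per weight is about 3.5 GB, in line with the "several gigabytes" download mentioned above, while a model with hundreds of billions of parameters runs to hundreds of gigabytes, well beyond a typical home machine.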
This path suits people who want privacy, offline use, room to tinker, or freedom from subscription fees. The trade-off is a capable but not top-tier model.