What is a token in AI?

Length: 3 min

Published: June 9, 2026

What is a token?

A token is the basic unit of text that a language model processes. Models do not read whole words or letters; they read tokens, which are short pieces of text. A token can be a whole word, part of a word, a single character, or a punctuation mark. As a rough rule for English, one token is about four characters, and 100 tokens land near 75 words.
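The rule of thumb above can be turned into a quick estimator. This is only a rough sketch: the two ratios come straight from the paragraph above, the function names are made up for illustration, and a real tokenizer will give different counts, especially for code or non-English text.

```python
import math

# Rough ratios for English text, taken from the rule of thumb above.
CHARS_PER_TOKEN = 4        # one token is about four characters
WORDS_PER_100_TOKENS = 75  # 100 tokens land near 75 words


def estimate_tokens_from_chars(text: str) -> int:
    """Estimate the token count from the number of characters."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_tokens_from_words(text: str) -> int:
    """Estimate the token count from the number of whitespace-separated words."""
    word_count = len(text.split())
    return math.ceil(word_count * 100 / WORDS_PER_100_TOKENS)


sample = "Tokens are the basic units of text that a language model processes."
print(estimate_tokens_from_chars(sample))
print(estimate_tokens_from_words(sample))
```

The two estimates will rarely agree exactly, which is a useful reminder that both are approximations rather than real token counts.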

When you send a prompt, the model first splits your text into tokens. It then predicts the next token, one at a time, until it has built a full answer. Both your input and the model's output are measured in tokens.
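The one-token-at-a-time loop can be sketched with a toy stand-in for the model. Everything here is hypothetical: `NEXT_TOKEN` is a hard-coded lookup table, not a real model, and a real model scores every token in its vocabulary using the whole context rather than only the last token.

```python
# Hypothetical lookup table standing in for a model's "most likely next token".
NEXT_TOKEN = {
    "The": " cat",
    " cat": " sat",
    " sat": " down",
    " down": ".",
    ".": "<end>",
}


def generate(prompt_tokens, max_new_tokens=10):
    """Append one predicted token at a time until an end marker or the limit."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        if not tokens:
            break  # nothing to condition on in this toy setup
        next_token = NEXT_TOKEN.get(tokens[-1], "<end>")
        if next_token == "<end>":
            break
        tokens.append(next_token)
    return tokens


prompt = ["The"]
output = generate(prompt)
print("".join(output))                      # The cat sat down.
print("input tokens:", len(prompt))         # input tokens: 1
print("output tokens:", len(output) - len(prompt))  # output tokens: 4
```

Note that the input and the output are counted separately, which mirrors how usage is measured.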

In plain words

Think of tokens as the LEGO bricks of text. A long word like "unbelievable" might break into a few bricks ("un", "believ", "able"), while a common word like "cat" is a single brick. The model never sees the finished sentence as you do. It sees a pile of bricks and decides which brick most likely comes next.
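The brick analogy can be made concrete with a tiny greedy tokenizer. This is a minimal sketch, assuming a made-up four-piece vocabulary (`TOY_VOCAB`) and a simple longest-match rule; production tokenizers learn their vocabularies from data and may split the same words differently.

```python
# Hypothetical vocabulary chosen to match the example in the text.
TOY_VOCAB = {"un", "believ", "able", "cat"}


def toy_tokenize(word, vocab=TOY_VOCAB):
    """Split a word into the longest known pieces, left to right.

    Characters that match no vocabulary piece become single-character tokens.
    """
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest candidate piece first, then shrink it.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            # No known piece starts here: fall back to one character.
            tokens.append(word[i])
            i += 1
    return tokens


print(toy_tokenize("unbelievable"))  # ['un', 'believ', 'able']
print(toy_tokenize("cat"))           # ['cat']
print(toy_tokenize("dog"))           # ['d', 'o', 'g']
```

The last call shows why unfamiliar text gets expensive: a word the vocabulary does not cover falls apart into many small bricks.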

Why it matters

  • Cost. AI providers typically bill per token for both input and output, often at different rates. A longer prompt and a longer answer both cost more, so token count is your price tag.
  • Context limits. Every model has a maximum number of tokens it can hold at once, called the context window. A long document may simply not fit, which forces you to trim or split it.
  • Speed. The model generates one token at a time, so longer answers take longer to appear.
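The cost and context-limit points above come down to simple arithmetic. The sketch below is illustrative only: the prices and the 8,000-token window are invented numbers, not any provider's real figures, and it assumes that input and output share one context window, which you should check against your model's documentation.

```python
def estimate_cost(input_tokens, output_tokens,
                  input_price_per_million, output_price_per_million):
    """Return the estimated cost of one request, in the same currency as the prices."""
    total = (input_tokens * input_price_per_million
             + output_tokens * output_price_per_million)
    return total / 1_000_000


def fits_in_context(input_tokens, max_output_tokens, context_window):
    """Check whether the prompt plus the reserved answer space fits the window.

    Assumes input and output tokens count against the same limit.
    """
    return input_tokens + max_output_tokens <= context_window


# Hypothetical prices: 2.00 per million input tokens, 10.00 per million output tokens.
print(estimate_cost(1_000_000, 500_000, 2.0, 10.0))  # 7.0

# Hypothetical 8,000-token context window.
print(fits_in_context(7_000, 1_000, 8_000))  # True
print(fits_in_context(7_500, 1_000, 8_000))  # False
```

Reserving room for the answer matters: a prompt that fills the window by itself leaves the model no space to reply.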

Common pitfalls

  • Tokens are not words. Counting words underestimates the real total, especially for code, numbers, or languages other than English, where a word can split into many tokens.
  • Hidden tokens add up. System instructions, examples, and chat history all sit in the context window and count toward your limit, even when you do not see them.
  • Other languages cost more. Czech, with its diacritics and word forms, often needs more tokens than English for the same meaning, which raises cost and fills the window faster.
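To see how hidden tokens add up, it helps to total everything that is actually sent with a request. The sketch below is an assumption-heavy illustration: `rough_token_count` uses the four-characters-per-token rule of thumb instead of a real tokenizer, and the function names and the drop-oldest-first strategy are hypothetical choices, not a standard API.

```python
import math


def rough_token_count(text):
    """Very rough estimate: about four characters per token (English only)."""
    return math.ceil(len(text) / 4)


def total_context_tokens(system_prompt, examples, history, new_message,
                         count=rough_token_count):
    """Sum the tokens of every piece that occupies the context window."""
    parts = [system_prompt, *examples, *history, new_message]
    return sum(count(part) for part in parts)


def trim_history(system_prompt, examples, history, new_message, budget,
                 count=rough_token_count):
    """Drop the oldest chat messages until the total fits the token budget."""
    kept = list(history)
    while kept and total_context_tokens(
            system_prompt, examples, kept, new_message, count) > budget:
        kept.pop(0)  # remove the oldest message first
    return kept


system_prompt = "You are a helpful assistant that answers briefly."
examples = ["Q: What is 2 + 2? A: 4"]
history = ["Hi, can you help me with tokens?", "Sure, what would you like to know?"]
new_message = "How many tokens is this conversation?"

print(total_context_tokens(system_prompt, examples, history, new_message))
print(trim_history(system_prompt, examples, history, new_message, budget=40))
```

Passing a real tokenizer's counting function as `count` would give more faithful numbers, particularly for languages such as Czech where the four-character rule is likely to undercount.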

Related articles:

  • What is an LLM? - The model that reads and writes those tokens one at a time.
  • What is a prompt? - The instruction you send, and why its length is measured in tokens.
  • What are embeddings? - How models turn tokens into numbers they can work with.
