How to save up tokens while using Claude Code (or any other AI agent)

People keep complaining about Anthropic's token restrictions. I get it. The limits feel tight, and the company hasn't exactly been transparent about changes.

But here's the thing: most users don't manage context properly. They're torching tokens without realizing it.

I've run the 20x plan for five months now. Never hit the cap. And I'm using Claude Code constantly with three or four terminals running at once, minimum.

Let me share what actually works.

Ten Ways to Save Tokens in Claude Code

Stop letting conversations compress. This eats tokens like crazy, especially with thinking mode or Opus 4.1. Start fresh each time instead.
Don't default to thinking mode. More thinking doesn't guarantee better results. It does guarantee higher token usage. Save ultrathink for when you really need it.
Trim your MCP servers. Too many servers sitting idle will drain your budget fast. Supabase, GitHub, and Chrome DevTools together? That's nearly 75,000 tokens even when you're not touching them. Delete what you don't use.
Keep CLAUDE.md files short. These load into memory constantly. Bloated documentation files become silent token vampires.
Use specialized agents. When you invoke an agent, it runs on separate tokens. Your main session stays untouched.
Drop images into the CLI. Claude Code accepts them. Drag and drop. One screenshot beats a thousand words, especially for frontend bugs or visual explanations.
Limit reasoning MCPs. Sequential thinker and code reasoner tools chew through tokens. Deploy them strategically, not automatically.
Block unnecessary documentation. Claude Code loves generating markdown files. Add a rule to your CLAUDE.md file or start sessions with the # command to prevent this behavior.
Reserve Opus 4.1 for complex work. This model burns tokens faster than anything else. Only pull it out when the task actually demands that level of power.
Request concise responses. Tell Claude Code upfront to keep answers direct and relevant. Skip the fluff. You'll shave tokens off every interaction.

The Real Problem

Token management isn't sexy. It's boring housekeeping that most people ignore until they slam into rate limits.

But if you're hitting caps regularly while others cruise through unlimited usage, the issue isn't just Anthropic's restrictions. It's how you're using the tool.

Clean up your setup. Think before invoking heavy features. Start treating tokens like the finite resource they are.

Your productivity will thank you.