Understanding Context
How AI context windows work and why they matter for your agent interactions
Context is the information an AI agent has access to when processing your request. Understanding how context works helps you get better results from your agents and explains why they sometimes miss things or need reminders.
What is a Context Window?
Every AI model has a context window — the maximum amount of text it can consider at once. Think of it like working memory. The context window includes:
- The agent's system prompt (its personality and instructions)
- Relevant workspace context (channel info, user details)
- Recent conversation history
- Any documents, task details, or data loaded for the current request
When the total information exceeds the context window, older or less relevant information must be dropped.
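The list above can be sketched as a simple assembly loop. This is an illustrative model, not Saltare's actual implementation: fixed parts (system prompt, workspace context) are kept, then messages are added newest-first until the window is full, and anything older is dropped. The one-token-per-word count is a deliberate simplification.

```python
def count_tokens(text: str) -> int:
    # Rough stand-in for a real tokenizer: one token per word.
    return len(text.split())

def build_context(system_prompt, workspace_info, messages, window_limit):
    """Keep the fixed parts, then add messages newest-first until the
    window is full; everything older is dropped."""
    fixed = [system_prompt, workspace_info]
    budget = window_limit - sum(count_tokens(p) for p in fixed)
    kept = []
    for msg in reversed(messages):       # newest first
        cost = count_tokens(msg)
        if cost > budget:
            break                        # older messages no longer fit
        kept.append(msg)
        budget -= cost
    return fixed + list(reversed(kept))  # restore chronological order
```

With a tiny window, only the most recent message survives alongside the fixed parts, which is exactly the "older information must be dropped" behavior described above.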
How Saltare Manages Context
Saltare uses a tiered context system to make the most of each agent's context window:
Tier L0 — Metadata
Always loaded. Lightweight information about:
- Who is making the request
- Which channel and workspace
- Current date and time
- Available tools
Tier L1 — Summaries
Loaded when relevant:
- Agent memories (saved facts and preferences)
- Discussion summaries from previous conversations
- Document and task summaries
Tier L2 — Full Data
Loaded on demand:
- Complete message history
- Full document content
- Detailed task information with comments and activity
This tiered approach means agents start with a bird's-eye view and dive deeper only when needed, keeping token usage efficient.
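The three tiers can be pictured as a small loading policy. The function and source names below are hypothetical, not Saltare's API: L0 always loads, L1 loads when a relevance check passes, and L2 loads only when the request explicitly demands it.

```python
# Illustrative tier table; the sources mirror the lists above.
TIERS = {
    "L0": ["requester", "channel", "datetime", "tools"],
    "L1": ["memories", "discussion_summaries", "doc_summaries"],
    "L2": ["full_messages", "full_documents", "task_details"],
}

def load_context(relevant: set, demanded: set) -> list:
    loaded = list(TIERS["L0"])                          # always loaded
    loaded += [s for s in TIERS["L1"] if s in relevant]  # when relevant
    loaded += [s for s in TIERS["L2"] if s in demanded]  # on demand
    return loaded
```

A request that only touches saved memories loads four L0 items plus `memories`, leaving the expensive L2 sources out of the window entirely.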
Why Context Matters
Better Prompts
When you provide clear, specific context in your message, the agent spends less of its context window figuring out what you mean:
- Low context: @Assistant write a report
- High context: @Assistant write a report on our Q2 sales performance, comparing against Q1. Focus on the enterprise segment. Use the data in the Sales Dashboard database.
Agent Memory
Agents can save important information to memory, which persists across conversations. This is how agents learn your preferences over time:
- Your preferred writing style
- Project-specific terminology
- Team conventions and processes
- Key decisions and their rationale
Memory is loaded at L1 (summary tier), so agents always have access to what they've learned.
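A minimal sketch of what "persists across conversations" means in practice, assuming a simple key-value store on disk (the class name and storage format are illustrative, not Saltare's internals):

```python
import json
import os

class AgentMemory:
    """Facts saved here outlive any single conversation, because they
    live on disk rather than in the chat's context window."""

    def __init__(self, path):
        self.path = path

    def save(self, key, fact):
        data = self.load_all()
        data[key] = fact
        with open(self.path, "w") as f:
            json.dump(data, f)

    def load_all(self):
        if not os.path.exists(self.path):
            return {}
        with open(self.path) as f:
            return json.load(f)
```

Opening a new `AgentMemory` on the same path in a later conversation still sees everything saved earlier, which is why preferences learned in one thread carry over to the next.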
Conversation Length
Very long conversations eventually push earlier messages out of the context window. For extended work:
- Start a new thread for each distinct topic
- Ask the agent to summarize progress before switching topics
- Use documents to store intermediate results (they persist independently of chat context)
Token Budgets
Saltare manages a token budget for each agent request (approximately 8,000 tokens of context by default). This budget is allocated across tiers:
- System prompt and metadata (fixed cost)
- Agent memories (loaded by relevance)
- Recent messages (most recent first)
- Referenced documents and tasks (loaded on demand)
You don't need to manage token budgets directly — Saltare handles this automatically. But understanding that this budget exists helps explain why agents sometimes ask for clarification or miss details from much earlier in a conversation.
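As a rough mental model of the allocation above, fixed costs come off the top and the remainder goes to recent messages and on-demand references. The specific numbers here are assumptions for illustration, not Saltare's actual split:

```python
def allocate_budget(total=8000, system_cost=1200, memory_cost=600):
    """Hypothetical split of a per-request token budget across tiers."""
    remaining = total - system_cost - memory_cost
    return {
        "system_and_metadata": system_cost,    # fixed cost
        "memories": memory_cost,               # loaded by relevance
        "messages_and_references": remaining,  # fills whatever is left
    }
```

The takeaway is structural, not numeric: because the fixed portions are spent first, a long conversation or a large referenced document competes for the same leftover budget, which is why agents can miss details from much earlier in a thread.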
Key Takeaways
- Context is finite — Agents can't remember everything at once, similar to human working memory
- Be specific — Clear requests use context more efficiently
- Use memory — Important facts saved to memory persist across conversations
- Start fresh for new topics — New threads give agents a clean context to work with
- Documents are persistent — Store important outputs in documents, not just chat messages