Context Chunking
The Context Chunking tool enables processing of large documents by intelligently splitting them into manageable pieces. When content exceeds a model’s context window limit, this tool automatically chunks the document and orchestrates a multi-step analysis process.
Key Features
- Automatic chunk size optimization based on model context windows
- Multiple chunking strategies (semantic, fixed, hierarchical)
- Two-step processing flow for efficient large document analysis
- Smart overlap between chunks to preserve context
- Temporary storage with automatic cleanup (1-hour TTL)
- Token estimation and safety limits
How It Works
The Context Chunking tool works automatically under the hood to handle large documents:
- Automatic Chunking: When your content exceeds the model’s context window, the tool intelligently splits it into smaller chunks that fit within the model’s limits.
- Smart Processing: Each chunk is processed individually with your provided prompt (or defaults to analyzing and summarizing the content if no specific prompt is given).
- Seamless Integration: The tool automatically selects the optimal chunking strategy based on your content and model, making individual API calls with smaller chunks that fit perfectly into your selected model’s context window.
- Result Synthesis: After processing all chunks, results are combined into a comprehensive answer.
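The trigger for this flow is a simple token-budget check. Below is a minimal sketch of that rule, using the 80% safety margin and the 4-characters-per-token heuristic described under Technical Details; the function name is illustrative, not part of the tool's API:

```python
def needs_chunking(content: str, context_window_tokens: int) -> bool:
    """Chunk when estimated tokens exceed 80% of the model's context window."""
    estimated_tokens = len(content) // 4  # heuristic: 1 token ~= 4 characters
    return estimated_tokens > int(context_window_tokens * 0.8)

# A ~600,000-character document against a 128k-token model:
print(needs_chunking("x" * 600_000, 128_000))  # True (~150k tokens > 102,400 budget)
```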
Chunking Strategies
By default, the tool automatically selects the optimal strategy for your content. However, you can configure specific strategies when needed:
Semantic Chunking (Default)
- Respects natural document boundaries (paragraphs, sentences)
- Maintains context and readability
- Best for: General documents, articles, reports
Fixed Chunking
- Creates equal-sized chunks based on token count
- Predictable chunk sizes
- Best for: Structured data, when uniform processing is needed
Hierarchical Chunking
- Starts with semantic boundaries
- Creates nested structure for complex documents
- Best for: Long technical documents with clear structure
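To make the strategies concrete, the sketch below contrasts fixed and semantic splitting on plain text. The tool's actual internals are not documented here, so treat this purely as an illustration:

```python
def fixed_chunks(text: str, chunk_chars: int, overlap_chars: int) -> list[str]:
    """Fixed strategy: equal-sized windows with a constant overlap."""
    step = chunk_chars - overlap_chars
    return [text[i:i + chunk_chars] for i in range(0, len(text), step)]

def semantic_chunks(text: str, chunk_chars: int) -> list[str]:
    """Semantic strategy: greedily pack whole paragraphs up to the size limit."""
    chunks: list[str] = []
    current = ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) > chunk_chars:
            chunks.append(current)
            current = ""
        current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks

doc = ("First paragraph about methods.\n\n"
       "Second paragraph about results.\n\n"
       "Third paragraph about conclusions.")
print(fixed_chunks(doc, chunk_chars=40, overlap_chars=8))  # cuts mid-sentence, uniform sizes
print(semantic_chunks(doc, chunk_chars=70))                # splits only at paragraph boundaries
```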
Parameters
create_chunks Operation
| Parameter | Type | Required | Description |
|---|---|---|---|
| operation | string | Yes | Set to "create_chunks" |
| content | string | Yes | The large text or document to chunk |
| query | string | No | The specific question or analysis task |
| chunking_strategy | string | No | Strategy: "semantic", "fixed", or "hierarchical" (default: "semantic") |
| chunk_size | integer | No | Target tokens per chunk (auto-calculated if not provided) |
| chunk_overlap | integer | No | Tokens to overlap between chunks (auto-calculated if not provided) |
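Assuming a JSON-style tool call, a create_chunks request might look like the following. The field names come from the table above; the surrounding shape is an assumption, not a documented wire format:

```python
large_document_text = "..."  # stands in for a document that exceeds the context window

create_chunks_request = {
    "operation": "create_chunks",           # required
    "content": large_document_text,         # required: the text to chunk
    "query": "What are the key findings?",  # optional: keeps each chunk's analysis focused
    "chunking_strategy": "semantic",        # optional: "semantic" | "fixed" | "hierarchical"
    # chunk_size and chunk_overlap omitted so the tool auto-calculates them
}
```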
process_chunk Operation
| Parameter | Type | Required | Description |
|---|---|---|---|
| operation | string | Yes | Set to "process_chunk" |
| chunk_id | string | Yes | The ID of the chunk to retrieve and process |
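A follow-up process_chunk call then retrieves and processes a single stored chunk by ID. The ID value shown here is made up:

```python
process_chunk_request = {
    "operation": "process_chunk",  # required
    "chunk_id": "chunk_abc123",    # required: hypothetical ID returned by create_chunks
}
```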
Configuration Options
The tool can be configured in multiple ways:
During Assistant Creation
When creating or configuring an assistant, you can customize the Context Chunking tool with:
- chunking_strategy: Choose between semantic, fixed, or hierarchical
- chunk_size: Set custom chunk size in tokens
- chunk_overlap: Define overlap between chunks
- max_chunks: Set maximum number of chunks
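For example, a tool configuration set during assistant creation might look like this. Only the four option names are documented on this page; the surrounding structure is an illustrative assumption:

```python
assistant_tool_config = {
    "chunking_strategy": "hierarchical",  # "semantic" | "fixed" | "hierarchical"
    "chunk_size": 50_000,                 # custom chunk size in tokens
    "chunk_overlap": 5_000,               # tokens shared between adjacent chunks
    "max_chunks": 200,                    # raise the safety limit for very large documents
}
```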
Organization Defaults
Default values apply when not explicitly configured:
- default_chunk_size: 100,000 tokens (uses 80% of model’s context window)
- chunk_overlap: 1,000 tokens (10% of chunk size)
- max_chunks: 100 chunks (safety limit)
Use Cases
Large Document Analysis
Process lengthy reports, research papers, or documentation that exceed context limits.
Multi-file Processing
When analyzing multiple large files in a workflow, combine them into a single request (see the sketch below).
Token-Limited Models
Optimize usage of models with smaller context windows.
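One way to handle the multi-file case is to concatenate the files with clear separators before a single create_chunks call, so semantic boundaries survive the merge. File names and contents below are illustrative stand-ins:

```python
files = {  # in-memory stand-ins for three large files
    "report_q1.txt": "Q1 revenue grew 5% on strong subscriptions...",
    "report_q2.txt": "Q2 revenue grew 7% despite rising costs...",
    "report_q3.txt": "Q3 revenue fell 2% amid churn...",
}
combined = "\n\n".join(f"=== {name} ===\n{text}" for name, text in files.items())

request = {
    "operation": "create_chunks",
    "content": combined,
    "query": "Compare revenue trends across quarters",
}
```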
Best Practices
1. Let Auto-Optimization Work
   - Don’t specify chunk_size unless you have specific requirements
   - The tool automatically calculates optimal sizes based on your model
2. Provide Clear Queries
   - Include your analysis question in the query parameter
   - This helps maintain focus across all chunks
3. Choose the Right Strategy
   - Use semantic chunking for most documents
   - Use fixed chunking for structured data
   - Use hierarchical chunking for complex technical documents
4. Synthesize Results
   - After processing all chunks, always synthesize the findings
   - Look for patterns and connections across chunks
   - Provide a comprehensive final answer
5. Monitor Chunk Count
   - Very large documents may generate many chunks
   - Consider the max_chunks limit (default: 100)
   - If you hit the limit, increase chunk_size
Example Workflow
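A hedged end-to-end sketch of the two-step flow: create the chunks, process each one, then synthesize. The call_tool helper and the response field names (chunk_ids, result) are assumptions rather than documented API:

```python
def call_tool(payload: dict) -> dict:
    """Stand-in for the real tool invocation; replace with an actual API call."""
    if payload["operation"] == "create_chunks":
        return {"chunk_ids": ["chunk_0", "chunk_1"]}  # assumed response shape
    return {"result": f"findings from {payload['chunk_id']}"}

large_document_text = "..."  # stands in for a document that exceeds the context window

# Step 1: split the document and collect the chunk IDs
created = call_tool({
    "operation": "create_chunks",
    "content": large_document_text,
    "query": "Summarize the main arguments",
})

# Step 2: process every chunk individually with the same query
partial_results = [
    call_tool({"operation": "process_chunk", "chunk_id": cid})["result"]
    for cid in created["chunk_ids"]
]

# Step 3: synthesize per-chunk findings into one comprehensive answer
final_answer = "\n".join(partial_results)
print(final_answer)
```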
Technical Details
- Token Estimation: Uses approximate token counting (1 token ≈ 4 characters)
- Context Window Safety: Uses 80% of model’s context window to ensure reliability
- Overlap Handling: 10% overlap between chunks preserves context at boundaries
- Automatic Orchestration: Handles all API calls and result synthesis automatically
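These figures compose into a simple calculation. An illustrative sketch of how chunk parameters could be derived for a given model, matching the 80% and 10% figures above:

```python
def chunk_parameters(context_window_tokens: int) -> tuple[int, int]:
    """Derive chunk size (80% of the window) and overlap (10% of chunk size)."""
    chunk_size = int(context_window_tokens * 0.8)
    chunk_overlap = chunk_size // 10
    return chunk_size, chunk_overlap

# A 128k-token model yields 102,400-token chunks with 10,240-token overlap:
print(chunk_parameters(128_000))  # (102400, 10240)
```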
When to Use Context Chunking
✅ Use when:
- Content exceeds model’s context window
- Processing very large documents (>50,000 tokens)
- You receive context limit errors
- You need systematic analysis of lengthy content
❌ Avoid when:
- Content fits within model limits
- You need real-time processing
- Document structure must remain intact
- Quick responses are priority over thoroughness
Related Tools
- Add Data: For including complete file contents in workflows (use before Context Chunking)
- Model Selector: For choosing appropriate models for chunk processing
- Space Search: For finding and retrieving documents to chunk
