# 🚀 Release v0.6.0: Provider-aware Token Limits
## ✨ Features
- Provider-aware chunking: intelligent chunking that respects the embedding model's token limit
- Auto-configuration: the cache automatically detects and sets MaxTokens from the embedding provider (sketched after this list)
- Efficient processing: text is only chunked when it exceeds the model's token capacity, avoiding unnecessary overhead
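A minimal sketch of what the auto-configuration could look like; the `Cache` struct, its fields, and the `NewCache` constructor are illustrative assumptions rather than the library's actual API (the `EmbeddingProvider` interface is sketched under API Changes below):

```go
// Illustrative sketch: the cache reads the provider's token limit once
// at construction, so callers no longer configure MaxTokens by hand.
// Cache, NewCache, and the field names are assumptions for this example.
type Cache struct {
	provider  EmbeddingProvider
	maxTokens int // auto-detected from the provider
}

func NewCache(provider EmbeddingProvider) *Cache {
	return &Cache{
		provider:  provider,
		maxTokens: provider.GetMaxTokens(), // e.g. 8191 for supported OpenAI models
	}
}
```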
## 🔧 API Changes
- Added `GetMaxTokens()` method to the `EmbeddingProvider` interface (see the sketch below)
- OpenAI provider now exposes model-specific token limits (8191 tokens for supported models)
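The updated interface might look like the following; `GetMaxTokens()` is the method named above, while `Embed` and its signature are placeholders for whatever embedding method the interface already declares:

```go
import "context"

// EmbeddingProvider, extended in v0.6.0 with GetMaxTokens().
type EmbeddingProvider interface {
	// Embed is a placeholder for the provider's existing embedding method.
	Embed(ctx context.Context, text string) ([]float32, error)
	// GetMaxTokens returns the model-specific token limit,
	// e.g. 8191 for supported OpenAI embedding models.
	GetMaxTokens() int
}
```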
## 🧪 Testing
- Added comprehensive tests for chunking behavior with MaxTokens thresholds
- Updated mock providers to implement the new interface methods (example below)
- All tests pass with 90%+ coverage
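A mock satisfying the interface above can be as small as this; the `Embed` stub and field names are illustrative:

```go
// mockProvider implements EmbeddingProvider for tests.
type mockProvider struct {
	maxTokens int // threshold under test
}

// Embed returns a fixed-size dummy vector.
func (m *mockProvider) Embed(ctx context.Context, text string) ([]float32, error) {
	return make([]float32, 8), nil
}

// GetMaxTokens satisfies the new interface method.
func (m *mockProvider) GetMaxTokens() int {
	return m.maxTokens
}
```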
## 🔄 Changes
- Modified `cache.Set()` to count tokens before making chunking decisions (sketched below)
- Updated chunker logic to use MaxTokens instead of ChunkSize for threshold checks
- Enhanced error handling and validation for token counting operations
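Putting the pieces together, the new decision order in `Set` might look like this; the method signature is assumed, and `countTokens`, `chunk`, and `embedAndStore` are hypothetical helpers standing in for the library's internals:

```go
import (
	"context"
	"fmt"
)

// Set counts tokens first and only chunks when the text exceeds
// the provider's limit. All helper names here are illustrative.
func (c *Cache) Set(ctx context.Context, key, text string) error {
	n, err := c.countTokens(text)
	if err != nil {
		return fmt.Errorf("counting tokens: %w", err)
	}
	if n <= c.maxTokens {
		// Fits within the model's limit: embed directly, no chunking.
		return c.embedAndStore(ctx, key, text)
	}
	// Over the limit: fall back to chunked embedding.
	for i, chunk := range c.chunk(text) {
		// The indexed sub-key is an illustrative way to store chunks.
		if err := c.embedAndStore(ctx, fmt.Sprintf("%s#%d", key, i), chunk); err != nil {
			return err
		}
	}
	return nil
}
```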
## 📝 Migration Notes
This is a minor version bump with backward-compatible changes. Existing code will continue to work, and chunking is now more efficient: texts that already fit within the model's token limit are no longer chunked.