πŸš€ Release v0.6.0: Provider-aware Token Limits

✨ Features

  • Provider-aware chunking: chunking now respects the embedding model's token limit
  • Auto-configuration: the cache automatically detects and sets MaxTokens from the embedding provider (see the sketch after this list)
  • Efficient processing: text is chunked only when it exceeds the model's token capacity, avoiding unnecessary overhead
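
A minimal sketch of the auto-configuration flow; `NewCache`, the `Cache` fields, and `fakeProvider` are illustrative assumptions, not the library's actual API:

```go
package main

import "fmt"

// Illustrative sketch only: the real library's constructor and
// field names may differ. The interface is abbreviated to the one
// method this release relies on.
type EmbeddingProvider interface {
	GetMaxTokens() int
}

type Cache struct {
	provider  EmbeddingProvider
	maxTokens int
}

// NewCache reads the token limit straight from the provider, so
// callers no longer configure MaxTokens by hand.
func NewCache(p EmbeddingProvider) *Cache {
	return &Cache{provider: p, maxTokens: p.GetMaxTokens()}
}

type fakeProvider struct{}

func (fakeProvider) GetMaxTokens() int { return 8191 }

func main() {
	c := NewCache(fakeProvider{})
	fmt.Println("auto-configured MaxTokens:", c.maxTokens)
}
```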

πŸ”§ API Changes

  • Added a GetMaxTokens() method to the EmbeddingProvider interface (sketched below)
  • The OpenAI provider now exposes model-specific token limits (8191 tokens for supported models)
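
A hedged sketch of what the addition might look like; only GetMaxTokens() and the 8191 limit are confirmed by this release, everything else here is a stand-in:

```go
package main

import "fmt"

// Sketch of the extended interface; any other methods the real
// interface declares are omitted here.
type EmbeddingProvider interface {
	// GetMaxTokens reports the model's input-token limit.
	GetMaxTokens() int
}

// OpenAIProvider is a stand-in for the library's OpenAI provider.
type OpenAIProvider struct{}

// Supported OpenAI embedding models accept up to 8191 input tokens.
func (OpenAIProvider) GetMaxTokens() int { return 8191 }

func main() {
	var p EmbeddingProvider = OpenAIProvider{}
	fmt.Println(p.GetMaxTokens()) // prints 8191
}
```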

πŸ§ͺ Testing

  • Added comprehensive tests for chunking behavior with MaxTokens thresholds
  • Updated mock providers to implement new interface methods
  • All tests pass with 90%+ coverage

πŸ“ Changes

  • cache.Set() now counts tokens before making any chunking decision (see the sketch after this list)
  • Chunker logic now uses MaxTokens instead of ChunkSize for threshold checks
  • Enhanced error handling and validation for token-counting operations
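
A rough sketch of the new Set() decision order under stated assumptions: countTokens and chunk are hypothetical stand-ins (whitespace-split words here), not the library's real tokenizer or storage:

```go
package main

import (
	"fmt"
	"strings"
)

type Cache struct{ maxTokens int }

// countTokens stands in for the provider's tokenizer.
func (c *Cache) countTokens(text string) (int, error) {
	return len(strings.Fields(text)), nil
}

// chunk splits words into maxTokens-sized groups.
func (c *Cache) chunk(words []string) [][]string {
	var out [][]string
	for len(words) > c.maxTokens {
		out = append(out, words[:c.maxTokens])
		words = words[c.maxTokens:]
	}
	return append(out, words)
}

// Set counts tokens first; chunking runs only when the text
// exceeds the limit, so short texts skip it entirely.
func (c *Cache) Set(key, text string) error {
	n, err := c.countTokens(text)
	if err != nil {
		return fmt.Errorf("counting tokens for %q: %w", key, err)
	}
	// Threshold uses MaxTokens, not ChunkSize, as of this release.
	if n <= c.maxTokens {
		fmt.Println("stored whole, no chunking overhead")
		return nil
	}
	chunks := c.chunk(strings.Fields(text))
	fmt.Printf("stored %d chunks\n", len(chunks))
	return nil
}

func main() {
	c := &Cache{maxTokens: 4}
	_ = c.Set("a", "short text fits fine")
	_ = c.Set("b", "this longer text exceeds the tiny limit used here")
}
```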

πŸ”„ Migration Notes

This is a minor version bump with backward-compatible changes for typical usage: existing call sites keep working, and chunking is now more efficient because texts that already fit within the model's limit skip chunking entirely. If you maintain your own EmbeddingProvider implementation (including test mocks), add the new GetMaxTokens() method; a minimal example follows.
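
A minimal migration sketch; MyProvider and the 8192 limit are illustrative placeholders, not names from the library:

```go
package provider

// MyProvider is a hypothetical custom EmbeddingProvider
// implementation; the only change this release requires of
// implementers is the added GetMaxTokens method.
type MyProvider struct{}

// GetMaxTokens satisfies the extended interface. Return your
// model's real input-token limit; 8192 is a placeholder.
func (p *MyProvider) GetMaxTokens() int {
	return 8192
}
```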