πŸš€ Release v0.6.0: Provider-aware Token Limits

✨ Features

  • Provider-aware chunking: chunking now respects the embedding model's token limit
  • Auto-configuration: the cache automatically detects and sets MaxTokens from the embedding provider (see the sketch after this list)
  • Efficient processing: text is chunked only when it exceeds the model's token capacity, avoiding unnecessary overhead
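
A minimal sketch of the auto-configuration flow; `NewCache`, the `Cache` fields, and `fakeProvider` are illustrative assumptions, not the library's actual API:

```go
package main

import "fmt"

// Illustrative sketch only: the real library's constructor and
// field names may differ. The interface is abbreviated to the one
// method this release relies on.
type EmbeddingProvider interface {
	GetMaxTokens() int
}

type Cache struct {
	provider  EmbeddingProvider
	maxTokens int
}

// NewCache reads the token limit straight from the provider, so
// callers no longer configure MaxTokens by hand.
func NewCache(p EmbeddingProvider) *Cache {
	return &Cache{provider: p, maxTokens: p.GetMaxTokens()}
}

type fakeProvider struct{}

func (fakeProvider) GetMaxTokens() int { return 8191 }

func main() {
	c := NewCache(fakeProvider{})
	fmt.Println("auto-configured MaxTokens:", c.maxTokens)
}
```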

πŸ”§ API Changes

  • Added a GetMaxTokens() method to the EmbeddingProvider interface (sketched below)
  • The OpenAI provider now exposes model-specific token limits (8191 tokens for supported models)
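
A hedged sketch of what the addition might look like; only GetMaxTokens() and the 8191 limit are confirmed by this release, everything else here is a stand-in:

```go
package main

import "fmt"

// Sketch of the extended interface; any other methods the real
// interface declares are omitted here.
type EmbeddingProvider interface {
	// GetMaxTokens reports the model's input-token limit.
	GetMaxTokens() int
}

// OpenAIProvider is a stand-in for the library's OpenAI provider.
type OpenAIProvider struct{}

// Supported OpenAI embedding models accept up to 8191 input tokens.
func (OpenAIProvider) GetMaxTokens() int { return 8191 }

func main() {
	var p EmbeddingProvider = OpenAIProvider{}
	fmt.Println(p.GetMaxTokens()) // prints 8191
}
```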

πŸ§ͺ Testing

  • Added comprehensive tests for chunking behavior with MaxTokens thresholds
  • Updated mock providers to implement new interface methods
  • All tests pass with 90%+ coverage

πŸ“ Changes

  • cache.Set() now counts tokens before making any chunking decision (see the sketch after this list)
  • Chunker logic now uses MaxTokens instead of ChunkSize for threshold checks
  • Enhanced error handling and validation for token-counting operations
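
A rough sketch of the new Set() decision order under stated assumptions: countTokens and chunk are hypothetical stand-ins (whitespace-split words here), not the library's real tokenizer or storage:

```go
package main

import (
	"fmt"
	"strings"
)

type Cache struct{ maxTokens int }

// countTokens stands in for the provider's tokenizer.
func (c *Cache) countTokens(text string) (int, error) {
	return len(strings.Fields(text)), nil
}

// chunk splits words into maxTokens-sized groups.
func (c *Cache) chunk(words []string) [][]string {
	var out [][]string
	for len(words) > c.maxTokens {
		out = append(out, words[:c.maxTokens])
		words = words[c.maxTokens:]
	}
	return append(out, words)
}

// Set counts tokens first; chunking runs only when the text
// exceeds the limit, so short texts skip it entirely.
func (c *Cache) Set(key, text string) error {
	n, err := c.countTokens(text)
	if err != nil {
		return fmt.Errorf("counting tokens for %q: %w", key, err)
	}
	// Threshold uses MaxTokens, not ChunkSize, as of this release.
	if n <= c.maxTokens {
		fmt.Println("stored whole, no chunking overhead")
		return nil
	}
	chunks := c.chunk(strings.Fields(text))
	fmt.Printf("stored %d chunks\n", len(chunks))
	return nil
}

func main() {
	c := &Cache{maxTokens: 4}
	_ = c.Set("a", "short text fits fine")
	_ = c.Set("b", "this longer text exceeds the tiny limit used here")
}
```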

πŸ”„ Migration Notes

This is a minor version bump with backward-compatible changes for typical usage: existing call sites keep working, and chunking is now more efficient because texts that already fit within the model's limit skip chunking entirely. If you maintain your own EmbeddingProvider implementation (including test mocks), add the new GetMaxTokens() method; a minimal example follows.
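
A minimal migration sketch; MyProvider and the 8192 limit are illustrative placeholders, not names from the library:

```go
package provider

// MyProvider is a hypothetical custom EmbeddingProvider
// implementation; the only change this release requires of
// implementers is the added GetMaxTokens method.
type MyProvider struct{}

// GetMaxTokens satisfies the extended interface. Return your
// model's real input-token limit; 8192 is a placeholder.
func (p *MyProvider) GetMaxTokens() int {
	return 8192
}
```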