Skip to content

Conversation

@dzakwanalifi
Copy link

@dzakwanalifi dzakwanalifi commented Oct 24, 2025

Problem

PageIndex currently only supports OpenAI models, limiting user choice and potentially increasing costs.

Solution

Add unified interface supporting both OpenAI GPT-4 and Gemini 2.5 Flash models with structured output capabilities.

Key Changes

  • Add LLM provider abstraction layer in utils.py
  • Support structured output with Pydantic models for Gemini
  • Add provider configuration in config.yaml
  • Fix JSON parsing error handling in page_index.py
  • Update function signatures for better model parameter handling

Code Changes

  • utils.py: Added LLMProvider class with unified interface
  • config.yaml: Added provider configuration option
  • page_index.py: Enhanced error handling and Gemini integration

Use Case

Users can now choose between providers:

# OpenAI (default)
python run_pageindex.py --pdf_path doc.pdf

# Gemini
python run_pageindex.py --pdf_path doc.pdf --provider gemini

This provides flexibility for different cost/performance requirements while maintaining full compatibility.

Add support for both OpenAI and Gemini providers with unified interface.
Includes structured output support and improved error handling.
@dgallitelli
Copy link

Great work on this multi-provider abstraction! 🎉

This pattern would make it straightforward to add additional cloud providers. I've opened two related feature requests that could build on this PR:

Both would follow the same LLMProvider interface pattern you've established here. The main implementation considerations:

For Bedrock:

  • Uses boto3 with the Converse API
  • Stop reason mapping: end_turnstop, max_tokenslength
  • Message format: {"content": [{"text": "..."}]}

For SageMaker:

  • Uses invoke_endpoint with custom payload format (varies by model container)
  • Enables self-hosted open models (Llama, Mistral, etc.)

Would you be open to expanding this PR to include AWS providers, or would separate PRs be preferred? Happy to contribute the Bedrock/SageMaker implementations if helpful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants