SOLR-18127 Solr native text to vector model interface #4161
Open
prathyand wants to merge 3 commits into apache:main from
https://issues.apache.org/jira/browse/SOLR-18127
Description
This PR adds a Solr-native `TextToVectorModel` interface so that users can plug in their own embedding models without needing to implement LangChain4j classes. Right now, `SolrTextToVectorModel` is tied directly to LangChain4j’s `EmbeddingModel`, which makes it awkward for anyone who has a custom embedding service: they have to implement the whole LC4j API even if they don’t use it anywhere else.

The goal here is to give Solr a simple internal abstraction for “text → vector” while keeping all existing LangChain4j providers working the same as before.
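To illustrate the kind of abstraction described above, here is a minimal sketch. It is not the code from this PR: the method name `vectorise`, the `Lc4jStyleEmbeddingModel` stand-in (a simplified placeholder for LangChain4j's `EmbeddingModel`, used so the sketch compiles without the LC4j dependency), and the `LengthBasedModel` example class are all assumptions made for illustration.

```java
import java.util.Objects;

// Hypothetical sketch: names and signatures are assumptions, not the PR's actual code.

/** Minimal "text -> vector" abstraction; the method name is assumed. */
interface TextToVectorModel {
  float[] vectorise(String text);
}

/** Simplified stand-in for LangChain4j's EmbeddingModel (assumption, not the real LC4j API). */
interface Lc4jStyleEmbeddingModel {
  float[] embed(String text);
}

/** Adapter that exposes an LC4j-style model through the Solr-native interface. */
final class LangChain4jModelAdapter implements TextToVectorModel {
  private final Lc4jStyleEmbeddingModel delegate;

  LangChain4jModelAdapter(Lc4jStyleEmbeddingModel delegate) {
    this.delegate = Objects.requireNonNull(delegate, "delegate");
  }

  @Override
  public float[] vectorise(String text) {
    // Delegate to the wrapped model; callers only see TextToVectorModel.
    return delegate.embed(text);
  }
}

/** A custom model that implements only the small interface, with no LC4j types involved. */
final class LengthBasedModel implements TextToVectorModel {
  @Override
  public float[] vectorise(String text) {
    // Toy embedding: [text length, 1 if non-empty else 0].
    return new float[] {text.length(), text.isEmpty() ? 0f : 1f};
  }
}
```

With a shape like this, a custom embedding service only needs to implement the one-method interface, while an existing LC4j-backed model can be passed through the adapter and used through the same type.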
Solution
The change is broken up into a few pieces:
- New `TextToVectorModel` interface. A small interface that provides the basic methods Solr needs.
- `LangChain4jModelAdapter`. An adapter that wraps an LC4j `EmbeddingModel` and exposes it through the new interface. This keeps everything backwards-compatible, so existing configs keep working without changes.
- Updates to the model initialization workflow. `SolrTextToVectorModel`'s `getInstance()` now recognizes either:
  - a `TextToVectorModel` implementation, or

AI Assist Disclosure
The `ModelConfigUtils.convertValue` helper method was written with some assistance from the AI tool Windsurf. It’s a small utility that converts parsed JSON model config values into the types expected by constructors.

Tests
`TextToVectorUpdateProcessorTest` that loads a custom `DummyTextToVectorModel`.

Checklist
`main`.
`./gradlew check`.