
Commit 9eabf93

docs: address code review findings and add PR references
Brainstorm: add PR #166 (Agent plugin) and PR #200 (Vector Search plugin) as future extension references. Rename future enhancement section to cover both Vector Search and Lakebase pgvector options.

Plan: address findings from multi-agent code review (architecture, security, performance, spec flow, pattern recognition):
- Fix cache infrastructure: use shared CacheManager pool, not fictional maxEntries config
- Clarify error contract: programmatic API errors propagate, HTTP handlers use execute() for interceptors
- Separate _chatCollect()/_embed() from HTTP handlers
- Add SSE buffer max size (1MB) to prevent OOM
- Restrict response_format to text/json_object (no json_schema v1)
- Add runtime role validation against known set
- Add model to parameter allowlist for Foundation Model API
- Add stop parameter bounds (4 entries, 256 chars)
- Standardize connection pool at 100 (was contradictory 50/100)
- Add retry on 503 for chatCollect() (cold-start resilience)
- Specify setup() throws on missing endpoint, shutdown() cleanup
- Extract SSE parser to stream/sse-parser.ts in Phase 2
- Add per-route body-parser middleware (not global)
- Update acceptance criteria and security checklist

Signed-off-by: Pawel Kosiec <pawel.kosiec@databricks.com>
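The request-validation items in the list above (role validation, `stop` bounds, restricted `response_format`) could be sketched roughly as follows. This is an illustration only, not the plugin's code: `validateChatRequest`, the type names, and the contents of `KNOWN_ROLES` are hypothetical, and the 256-character limit is assumed to apply to each `stop` entry.

```typescript
// Hypothetical sketch; the commit message states the limits but not the code.

interface ChatMessage {
  role: string;
  content: string;
}

interface ChatRequest {
  messages: ChatMessage[];
  stop?: string[];
  response_format?: { type: string };
}

// Assumption: the "known set" of roles is not listed in the commit message.
const KNOWN_ROLES = new Set(["system", "user", "assistant", "tool"]);

// Limits taken from the commit message: 4 entries, 256 chars (assumed per entry).
const MAX_STOP_ENTRIES = 4;
const MAX_STOP_CHARS = 256;

// Only text and json_object are allowed; json_schema is excluded for v1.
const ALLOWED_RESPONSE_FORMATS = new Set(["text", "json_object"]);

// Returns a list of human-readable problems; an empty list means the request passed.
function validateChatRequest(req: ChatRequest): string[] {
  const errors: string[] = [];

  req.messages.forEach((message, i) => {
    if (!KNOWN_ROLES.has(message.role)) {
      errors.push(`messages[${i}].role "${message.role}" is not a known role`);
    }
  });

  if (req.stop !== undefined) {
    if (req.stop.length > MAX_STOP_ENTRIES) {
      errors.push(`stop has ${req.stop.length} entries (max ${MAX_STOP_ENTRIES})`);
    }
    req.stop.forEach((entry, i) => {
      if (entry.length > MAX_STOP_CHARS) {
        errors.push(`stop[${i}] exceeds ${MAX_STOP_CHARS} characters`);
      }
    });
  }

  if (
    req.response_format !== undefined &&
    !ALLOWED_RESPONSE_FORMATS.has(req.response_format.type)
  ) {
    errors.push(`response_format.type "${req.response_format.type}" is not supported`);
  }

  return errors;
}
```

Collecting all problems instead of throwing on the first one lets an HTTP handler report every issue in a single 400 response; whether the plan takes that approach is not stated in the commit.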
1 parent 5423cd8 commit 9eabf93

2 files changed: +68 -33 lines changed

docs/brainstorms/2026-03-23-model-serving-plugin-brainstorm.md

Lines changed: 7 additions & 3 deletions
@@ -18,7 +18,7 @@ A **Model Serving plugin** for AppKit that provides authenticated access to Data
 ### Out of Scope (for v1)

 - Custom ML model scoring (dataframe_split/inputs format)
-- ChatAgent / ResponsesAgent endpoint types (natural future extension — see `app-templates/e2e-chatbot-app` for reference patterns)
+- ChatAgent / ResponsesAgent endpoint types (natural future extension — see `app-templates/e2e-chatbot-app` for reference patterns and [PR #166: Agent plugin](https://github.com/databricks/appkit/pull/166))
 - Endpoint management (create/update/delete/start/stop)
 - Conversation/session management
 - Response normalization or custom abstractions
@@ -153,9 +153,11 @@ A minimal chat page with streaming responses — conditionally included when the

 **Why:** Templates should be minimal starting points. RAG can be added by referencing the dev-playground pattern.

-### Future Enhancement: Lakebase pgvector
+### Future Enhancement: Persistent Vector Storage

-The in-memory vector store could be swapped for Lakebase with pgvector extension for persistence and larger doc sets. A `VectorStore` interface abstraction would make this a drop-in upgrade. Deferred for now.
+The in-memory vector store could be upgraded to a persistent solution:
+- **Vector Search plugin** ([PR #200](https://github.com/databricks/appkit/pull/200)) — native Databricks Vector Search with REST API client, OBO auth, and React components. A natural complement for RAG use cases combining `embed()` from serving with Vector Search for retrieval.
+- **Lakebase pgvector** — Lakebase with pgvector extension for persistence and larger doc sets. A `VectorStore` interface abstraction would make either option a drop-in upgrade.

 ## Open Questions

@@ -169,3 +171,5 @@ _(None — all key decisions resolved during brainstorm)_
 - [Query chat models](https://docs.databricks.com/aws/en/machine-learning/model-serving/query-chat-models)
 - [Databricks Apps: Model Serving integration](https://docs.databricks.com/aws/en/dev-tools/databricks-apps/model-serving)
 - Existing plugin patterns: Analytics, Files, Genie, Lakebase
+- [PR #166: Agent plugin](https://github.com/databricks/appkit/pull/166) — ChatAgent/ResponsesAgent future extension
+- [PR #200: Vector Search plugin](https://github.com/databricks/appkit/pull/200) — Vector search for RAG use cases (complement to serving embeddings)
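
The brainstorm mentions a `VectorStore` interface abstraction that would make a persistent backend a drop-in upgrade, but the diff does not show its shape. A minimal sketch of what such an interface might look like, with an in-memory implementation ranked by cosine similarity, is given below. Every method, type, and class name here is an assumption for illustration and is not taken from AppKit, PR #200, or Lakebase.

```typescript
// Hypothetical sketch; only the name `VectorStore` comes from the brainstorm doc.

interface VectorRecord {
  id: string;
  embedding: number[];
  text: string;
}

interface SearchHit {
  id: string;
  text: string;
  score: number;
}

// Async methods so that a network-backed store could satisfy the same contract.
interface VectorStore {
  upsert(records: VectorRecord[]): Promise<void>;
  search(query: number[], topK: number): Promise<SearchHit[]>;
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("embedding dimension mismatch");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  // Treat a zero vector as having no similarity to anything.
  return denominator === 0 ? 0 : dot / denominator;
}

class InMemoryVectorStore implements VectorStore {
  private readonly records = new Map<string, VectorRecord>();

  async upsert(records: VectorRecord[]): Promise<void> {
    for (const record of records) {
      this.records.set(record.id, record); // same id overwrites the earlier record
    }
  }

  async search(query: number[], topK: number): Promise<SearchHit[]> {
    // Brute-force scan: score every record, sort descending, keep the top K.
    return Array.from(this.records.values())
      .map((record) => ({
        id: record.id,
        text: record.text,
        score: cosineSimilarity(query, record.embedding),
      }))
      .sort((x, y) => y.score - x.score)
      .slice(0, topK);
  }
}
```

Under this sketch, calling code would depend only on `VectorStore`, so a Vector Search or pgvector implementation could replace `InMemoryVectorStore` without changing the retrieval path.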
