Add Model Serving CRUD tools (create, update, delete endpoints) #413
Open
jralfonsog wants to merge 5 commits into databricks-solutions:main from
Co-authored-by: Isaac
Add GPU workload type, environment variables, custom concurrency, provisioned throughput, instance profiles, and custom entity names to the served entity builder. Three scaling modes are now supported as mutually exclusive options. Co-authored-by: Isaac
Co-authored-by: Isaac
…ut config Co-authored-by: Isaac
- Docstrings: opening """ on its own line
- MCP module header: add tool listing with _find_endpoint_by_name helper
- Returns sections: bullet list format for dict keys
- Manifest imports: late imports in try blocks
- Identity: create passes get_default_tags() and with_description_footer() to tag endpoints as "Built with Databricks AI Dev Kit"
- Idempotent create: rename to create_or_update_serving_endpoint
- Core create_serving_endpoint: new optional tags and description params

Co-authored-by: Isaac
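The commit notes above mention three mutually exclusive scaling modes on the served entity builder. A minimal sketch of how such a check might look, assuming the three modes correspond to `workload_size`, provisioned concurrency, and provisioned throughput (that grouping, and the helper name `pick_scaling_mode`, are assumptions for illustration and not taken from the PR's code):

```python
from typing import Any, Dict, Optional

# Hypothetical grouping of entity fields into three scaling modes.
# The PR only states that three mutually exclusive modes exist; this
# mapping is an assumption made for the sketch.
SCALING_MODES = {
    "workload_size": ("workload_size",),
    "provisioned_concurrency": (
        "min_provisioned_concurrency",
        "max_provisioned_concurrency",
    ),
    "provisioned_throughput": (
        "min_provisioned_throughput",
        "max_provisioned_throughput",
    ),
}


def pick_scaling_mode(entity: Dict[str, Any]) -> Optional[str]:
    """Return the single scaling mode used by `entity`, or None if none is set.

    Raises ValueError when fields from more than one mode are present.
    """
    # A mode counts as active if any of its fields has a non-None value.
    active = [
        mode
        for mode, fields in SCALING_MODES.items()
        if any(entity.get(field) is not None for field in fields)
    ]
    if len(active) > 1:
        raise ValueError(
            "Scaling modes are mutually exclusive, got: " + ", ".join(active)
        )
    return active[0] if active else None
```

Validating before building the SDK object keeps the error message close to the caller's input, rather than surfacing later as an API rejection.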
Summary
- create_serving_endpoint() with sync (wait=True) and async (wait=False) modes for classical ML (~2 min) vs GenAI agents (~15 min)
- update_serving_endpoint() to deploy new model versions, change workload size, or modify traffic routing for A/B testing
- delete_serving_endpoint() with idempotent not-found handling (returns status instead of raising)
- Manifest integration (track_resource on create, remove_resource on delete), plus a registered deleter for cleanup

Supported ServedEntityInput fields
- entity_name
- entity_version
- workload_size
- scale_to_zero_enabled
- workload_type
- environment_vars ({{secrets/scope/key}} refs)
- min/max_provisioned_concurrency
- min/max_provisioned_throughput
- name
- instance_profile_arn

Changes
- serving/endpoints.py: _build_served_entity_inputs with all SDK fields, _extract_endpoint_summary surfaces extended fields
- serving/__init__.py
- tools/serving.py: @mcp.tool wrappers with manifest integration, documented extended entity fields
- tests/unit/test_serving.py

Design decisions
- Not-found handling uses exception types (ResourceDoesNotExist, NotFound) instead of string matching
- _extract_endpoint_summary omits null extended fields to keep responses clean

Test plan
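One possible way to check the two design decisions, sketched without the Databricks SDK. A stand-in `NotFound` exception and a hypothetical in-memory client show a delete that reports a missing endpoint as a status instead of raising, and a small helper drops `None`-valued fields from a summary. The class names, function names, and status strings are illustrative assumptions, not the PR's code:

```python
from typing import Any, Dict, Iterable


class NotFound(Exception):
    """Stand-in for the SDK's not-found exception so the sketch runs without the SDK."""


class FakeServingEndpoints:
    """Hypothetical in-memory stand-in for a serving-endpoints client."""

    def __init__(self, names: Iterable[str]) -> None:
        self._names = set(names)

    def delete(self, name: str) -> None:
        if name not in self._names:
            raise NotFound(f"Endpoint {name} does not exist")
        self._names.remove(name)


def delete_endpoint(client: FakeServingEndpoints, name: str) -> Dict[str, Any]:
    """Delete an endpoint, reporting not-found as a status instead of raising."""
    try:
        client.delete(name)
    except NotFound:
        # Match on the exception type, not on the error message text.
        return {"name": name, "status": "not_found"}
    return {"name": name, "status": "deleted"}


def drop_null_fields(summary: Dict[str, Any]) -> Dict[str, Any]:
    """Omit keys whose value is None, keeping falsy values such as False or 0."""
    return {key: value for key, value in summary.items() if value is not None}


client = FakeServingEndpoints(["demo"])
first = delete_endpoint(client, "demo")    # endpoint exists, so it is deleted
second = delete_endpoint(client, "demo")   # already gone, reported as a status
```

Catching the exception type keeps the check stable if the error message wording changes, which is the concern behind avoiding string matching.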
This pull request was AI-assisted by Isaac