[v1.37] Tokenization tutorial + query-profile docs: Java and TypeScript snippets #411
Open
Orca Security Scan Summary
| Status | Check | Issues by priority | |
|---|---|---|---|
| | Infrastructure as Code | | View in Orca |
| | SAST | | View in Orca |
| | Secrets | | View in Orca |
| | Vulnerabilities | | View in Orca |

☢️ The following Vulnerabilities (CVEs) have been detected

| PACKAGE | FILE | CVE ID | INSTALLED VERSION | FIXED VERSION | |
|---|---|---|---|---|---|
| tar | ./yarn.lock | CVE-2026-24842 | 6.2.1 | 7.5.7 | View in code |
| tar | ./yarn.lock | CVE-2026-26960 | 6.2.1 | 7.5.8 | View in code |
Summary
Adds the Java v6 and TypeScript ports for the v1.37 tokenization tutorial (Examples 4–6) and the query-profile how-to page. Both pages were Python-only at PR #398 / #402 time because the Java and TS clients didn't yet expose the v1.37 surface; that's resolved now (Java 6.2.1-SNAPSHOT, TS weaviate/typescript-client#429). Also bumps the test compose stack to Weaviate 1.37.2, the version where the `/v1/tokenize` `stopwordPresets` request shape became `Map<string, []string>`, matching what both clients send naturally.

Tokenization tutorial (`docs/weaviate/tutorials/tokenization.md`)

- `_includes/code/tutorials/tokenization/{accent_folding,custom_stopwords,tokenize_endpoint}.ts` mirroring the Python markers
- `_includes/code/java-v6/src/test/java/TokenizationTest.java` (5 `@Test` methods covering all three examples plus the `forProperty` variant)

Query profiling (`docs/weaviate/search/query-profile.md`)

- `_includes/code/howto/search.profile.ts` and `_includes/code/java-v6/src/test/java/SearchProfileTest.java`. The latter waits 3 s after `insertMany` for `ASYNC_INDEXING=true` to build the HNSW graph; without the wait, `nearVector` returns no objects and the server skips populating `queryProfile.shards`.

Test harness

- `tests/test_typescript.py`: new `test_tokenization` and `test_search_profile` parametrize blocks
- `tests/test_java_v6.py`: `TokenizationTest` and `SearchProfileTest` added
- Weaviate version in the test compose stack: `1.37.1` → `1.37.2`

Client dependency notes

- `_includes/code/java-v6/pom.xml` pinned to `client6:6.2.1-SNAPSHOT`: needs the camelCase `TextAnalyzer` `@SerializedName` fix that's not in released 6.2.0. Will switch back to a release version once `client6:6.2.1` ships.
- `_includes/code/package.json` adds `weaviate-client` (version managed via weaviate/typescript-client#429 SNAPSHOT until released).

Test plan

- `uv run pytest -m java_v6 -k "Tokenization or SearchProfile" -v`: 2 passed (`TokenizationTest`'s 5 sub-tests + `SearchProfileTest`'s 3 sub-tests all green)
- `uv run pytest -m ts -k "tokenization or search_profile" -v`: 4 passed (`accent_folding.ts`, `custom_stopwords.ts`, `tokenize_endpoint.ts`, `search.profile.ts`)
- `npx tsx _includes/code/tutorials/tokenization/{accent_folding,custom_stopwords,tokenize_endpoint}.ts`: direct runs match the Python output verbatim
- `mvn test -Dtest=TokenizationTest` and `-Dtest=SearchProfileTest`: green
- `/weaviate/tutorials/tokenization` and `/weaviate/search/query-profile`: Python / TypeScript / Java v6 tabs render and code is highlighted correctly

Note on commits

This branch is currently based on `digital-ocean` (one commit). The `Add DigitalOcean deployment type` commit will collapse out of this PR's diff automatically once that branch merges to main; if you'd prefer, I can rebase onto main once the DigitalOcean PR lands.

🤖 Generated with Claude Code
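
A possible follow-up for the query-profile tests: the fixed 3 s wait after `insertMany` could be swapped for a bounded poll, so the test continues as soon as the index is queryable and fails loudly if it never is. This is a minimal sketch of that alternative, not what `SearchProfileTest` does today. `waitUntil`, `WaitOptions`, and `probe` are hypothetical names and are not part of any Weaviate client.

```typescript
// Hypothetical helper: not part of this PR or of any Weaviate client.
type WaitOptions = { timeoutMs: number; intervalMs: number };

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

/**
 * Calls `probe` repeatedly until it resolves to true or the timeout elapses.
 * Returns the number of probe calls made; throws if the deadline passes first.
 */
async function waitUntil(
  probe: () => Promise<boolean>,
  { timeoutMs, intervalMs }: WaitOptions,
): Promise<number> {
  const deadline = Date.now() + timeoutMs;
  let attempts = 0;
  while (true) {
    attempts += 1;
    if (await probe()) return attempts;
    // Stop if another sleep would take us past the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(
        `condition not met within ${timeoutMs} ms (${attempts} attempts)`,
      );
    }
    await sleep(intervalMs);
  }
}
```

In a test, `probe` would run the `nearVector` query and report whether any objects came back. That wiring is left out here because it depends on the client API in use.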
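
For reviewers unfamiliar with the Go-style notation in the summary, `Map<string, []string>` reads in TypeScript terms as an object with string keys whose values are arrays of strings. The sketch below only illustrates that shape with a runtime check. The type name, the guard function, and the sample values are hypothetical, and nothing here is taken from the Weaviate API reference.

```typescript
// Assumed from the PR text only: string keys mapped to arrays of strings.
type StopwordPresets = Record<string, string[]>;

// Hypothetical runtime guard for the shape described above.
function isStopwordPresets(value: unknown): value is StopwordPresets {
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return false;
  }
  return Object.values(value).every(
    (words) =>
      Array.isArray(words) && words.every((w) => typeof w === "string"),
  );
}

// Hypothetical sample values, for illustration only.
const validPresets: unknown = { custom: ["the", "a"] };
const invalidPresets: unknown = { custom: "the a" };
```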