refactor: untrack generated python proto, regenerate in build#5077
refactor: untrack generated python proto, regenerate in build#5077Ma77Ball wants to merge 14 commits into
Conversation
|
/request-review @Yicong-Huang |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #5077 +/- ##
============================================
+ Coverage 43.66% 43.83% +0.17%
Complexity 2218 2218
============================================
Files 1049 1051 +2
Lines 40580 40705 +125
Branches 4324 4324
============================================
+ Hits 17719 17843 +124
Misses 21766 21766
- Partials 1095 1096 +1
*This pull request uses carry forward flags. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Thanks, that's a good direction for change! Let inline comments. Please make pr description concise. The "how you test" section is basically saying "ci" can verify. I don't get too much new info reading it. |
…o feat/changeProtobufGenerate
|
@Yicong-Huang, I addressed the comments. Please review when available. |
Yicong-Huang
left a comment
There was a problem hiding this comment.
Thanks for the effort to embed the script. however, letting python protobuf generation depends on sbt/scala/jdk is also not ideal. Can we have a better solution so that it only depends on python and a protoc lib? Sorry if this was confusing before.
| if [ -f amber/requirements.txt ]; then uv pip install --system --index-strategy unsafe-best-match -r amber/requirements.txt; fi | ||
| if [ -f amber/operator-requirements.txt ]; then uv pip install --system --index-strategy unsafe-best-match -r amber/operator-requirements.txt; fi | ||
| if [ -f amber/dev-requirements.txt ]; then uv pip install --system --index-strategy unsafe-best-match -r amber/dev-requirements.txt; fi | ||
| - name: Install protoc 3.19.4 |
There was a problem hiding this comment.
nit: is it possible to refer to some dependency file to declare version? so that we don't need to update CI file when need to bump protoc version.
There was a problem hiding this comment.
Pinned in bin/protoc-version.txt.
| - name: Set up JDK 17 | ||
| uses: actions/setup-java@v5 | ||
| with: | ||
| distribution: temurin | ||
| java-version: 17 | ||
| - name: Setup sbt launcher | ||
| uses: sbt/setup-sbt@508b753e53cb6095967669e0911487d2b9bc9f41 # v1.1.22 | ||
| - name: Generate Python proto bindings | ||
| run: sbt "WorkflowExecutionService/genPythonProto" |
There was a problem hiding this comment.
hmm, can we not rely on sbt to generate python protobuf? it is unnecessary to install JDK just for this purpose.
There was a problem hiding this comment.
The python and python-state-materialization-mac jobs now call python3 bin/gen-python-proto.py directly
| # protoc 3.19.4 (matches PB.protocVersion in amber/build.sbt) and the | ||
| # betterproto plugin are required by the genPythonProto sbt task so the | ||
| # generated amber/src/main/python/proto/ tree is populated before the | ||
| # dist is packaged. |
There was a problem hiding this comment.
we are hard coding this 3.19.4 version in too many places. let's define it at one place and all refer to it.
There was a problem hiding this comment.
Single source of truth is bin/protoc-version.txt.
| && unzip -o /tmp/protoc.zip -d /usr/local \ | ||
| && chmod +x /usr/local/bin/protoc \ | ||
| && rm /tmp/protoc.zip \ | ||
| && pip3 install --no-cache-dir 'betterproto[compiler]==2.0.0b7' |
There was a problem hiding this comment.
betterproto[compiler]==2.0.0b7 this is also hard coded. can we refer to requirements.txt instead?
There was a problem hiding this comment.
Pin lives only in amber/dev-requirements.txt.
| `amber/src/main/python/proto/` is gitignored and regenerated by the | ||
| `genPythonProto` sbt task in [`amber/build.sbt`](amber/build.sbt), which | ||
| runs as part of `sbt amber/compile` (or `sbt amber/genPythonProto` | ||
| directly). Requires `protoc` on PATH (pin to `3.19.4` to match | ||
| `PB.protocVersion`); skipped with a warning when `protoc` is missing. |
There was a problem hiding this comment.
let's remove this from Agent.md. too many details.
What changes were proposed in this PR?
Stop version-tracking the betterproto-generated Python files under
amber/src/main/python/proto/. Add agenPythonProtosbt task that runsbin/python-proto-gen.shas a dependency ofWorkflowExecutionService/Compile/compile, with a graceful skip whenprotocis missing. Wireprotoc3.19.4 (matchingPB.protocVersion) andbetterproto[compiler]==2.0.0b7into the three image build stages (worker,master, web-app) and the three CI jobs that touch Python proto (python,python-state-materialization-mac,amber-integration). The 14th tracked file (proto/org/apache/texera/web/__init__.py) had no corresponding.protosource and is gone after regen; grep confirms no callers.Any related issues, documentation, or discussions?
Closes: #4102
How was this PR tested?
locally: sbt clean WorkflowExecutionService/compile regenerates the tree and pytest -m "not integration" passes; skip path verified with protoc absent. All three Docker images build end-to-end. CI covers the workflow YAML on
push.
Was this PR authored or co-authored using generative AI tooling?
Co-authored with Claude Opus 4.7 in compliance with ASF