refactor: upgrade HBase and replace custom hbase-shaded-endpoint #3021
vaijosh wants to merge 5 commits into
…int with official artifacts apache#3016
- Added hbase-shaded-client and hbase-endpoint dependencies in place of the custom hbase-shaded-endpoint library.
- Added Docker files and HBASE.md containing instructions for the HBase backend.
Pull request overview
This PR modernizes the HBase backend by replacing the long-pinned com.baidu.hugegraph:hbase-shaded-endpoint:2.0.6 with the official Apache hbase-endpoint + hbase-shaded-client 2.6.5 artifacts, and ships a Docker-based local HBase test environment plus an end-to-end usage guide so contributors can reproduce HBase-backend validation.
Changes:
- Upgrade HBase client to 2.6.5 (official Apache artifacts) with transitive exclusions and a dependency-allowlist update.
- Add a self-contained Docker setup (Dockerfile, entrypoint.sh, hbase-site.xml, docker-compose.hbase.yml) for a standalone HBase 2.6.5 cluster.
- Add docker/HBASE.md documenting build, run, API sanity checks, and troubleshooting.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 14 comments.
| File | Description |
|---|---|
| hugegraph-server/hugegraph-hbase/pom.xml | Switch HBase deps to official 2.6.5 with transitive exclusions and ordering comment. |
| install-dist/scripts/dependency/known-dependencies.txt | Replace old shaded-endpoint jar with new endpoint/shaded-client jars. |
| docker/hbase/Dockerfile | Build standalone HBase 2.6.5 image with SHA512 verification + mirror fallback. |
| docker/hbase/entrypoint.sh | Start ZK/master/regionserver and wait for readiness, then tail logs. |
| docker/hbase/hbase-site.xml | Standalone/pseudo-distributed HBase config tuned for HugeGraph defaults. |
| docker/hbase/docker-compose.hbase.yml | Compose service with ports, volumes, healthcheck, build args. |
| docker/HBASE.md | End-to-end Docker/HBase backend setup and troubleshooting guide. |
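The "wait for readiness" behaviour listed for docker/hbase/entrypoint.sh can be illustrated with a small retry loop. This is only a sketch and not the script from the PR: the function name wait_until, its arguments, and the probe shown in the usage comment are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not taken from the PR): retry a probe command until it
# succeeds or the attempt budget is exhausted.
#   $1 = maximum number of attempts
#   $2 = delay in seconds between attempts
#   remaining arguments = the probe command to run
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        # The probe signals readiness through its exit status.
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example usage (assumption: the probe and port are placeholders; any command
# that exits 0 once the service is up would work):
#   wait_until 60 2 nc -z localhost 16010 || { echo "HBase not ready" >&2; exit 1; }
```

Keeping the probe as an arbitrary command means the same loop can wait on a port check, a status command, or a log marker without changing the retry logic.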
apache#3021 Addressed review comments in this update:
- docker/HBASE.md
  - fixed Quick Start step title to match the actual command (image build)
  - aligned manual API examples with the default local server endpoint base (/graphs)
  - clarified idempotency wording around check_exist behavior
- docker/hbase/entrypoint.sh
  - fixed log glob pattern to match runtime-generated hbase-* log files
  - replaced invalid exec+|| fallback with explicit log-file existence handling
- docker/hbase/hbase-site.xml
  - set hbase.rootdir to explicit file:///tmp/hbase for deterministic local-FS mode
- docker/hbase/Dockerfile
  - switched to stable archive URL as primary source
  - fetch checksum from the actually downloaded source first
  - hardened checksum parsing for grouped SHA512 formats
  - removed stale cleanup path
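For readers unfamiliar with the "grouped SHA512 formats" item: published .sha512 files are not always in plain `sha512sum` layout; some wrap the digest in space-separated uppercase groups prefixed by the file name. The following is a minimal sketch of how such files could be normalized. It is not the Dockerfile logic from this PR, and the function name normalize_sha512 is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not taken from the PR): reduce the contents of a
# .sha512 file to a bare 128-character lowercase hex digest.
#   $1 = path to the .sha512 file
#   $2 = name of the tarball the checksum describes
normalize_sha512() {
    # 1. Drop the tarball name first, since it contains hex-looking characters.
    # 2. Remove whitespace, colons and the '*' binary marker.
    # 3. Lowercase the hex digits.
    # 4. Keep the first run of exactly 128 hex characters.
    sed -e "s|$2||g" "$1" \
        | tr -d ' \t\r\n:*' \
        | tr 'A-F' 'a-f' \
        | grep -Eo '[0-9a-f]{128}' \
        | head -n 1
}

# Example usage (file names are placeholders):
#   expected=$(normalize_sha512 hbase-2.6.5-bin.tar.gz.sha512 hbase-2.6.5-bin.tar.gz)
```

An empty result means no digest could be extracted, which a caller should treat as a verification failure rather than as a match.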
Replace custom hbase-shaded-endpoint with a streamlined hbase-endpoint. This reduces the runtime footprint by excluding heavyweight transitive dependencies not required by the HugeGraph HBase client.
Key exclusions and rationale:
- Server logic: hbase-server (coprocessors run on RS, not client).
- Batch/Async: hbase-mapreduce, hbase-asyncfs, and hbase-replication.
- Hadoop stack: hadoop-client/auth/common/hdfs. HugeGraph uses the ZooKeeper registry directly and avoids the YARN/MapReduce stack.
- Legacy logging: log4j 1.x, slf4j-log4j12, and redundant slf4j-api versions were purged to eliminate vulnerabilities and conflicts.
- Native/Compression: snappy-java (handled server-side).
Updated known-dependencies.txt to reflect the minimal allowlist. Improved pom.xml comments to document exclusion rationales and addressed automated review feedback regarding dependency management.
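One way to confirm that the exclusions listed above actually took effect is to scan the resolved Maven dependency tree for the excluded artifact IDs. The sketch below assumes the artifact IDs are spelled as in the commit message (with "hadoop-client/auth/common/hdfs" expanded to four IDs, which is an interpretation) and uses a hypothetical function name, check_excluded; it is not part of the PR.

```shell
#!/bin/sh
# Artifact IDs assumed from the commit message above (interpretation, not
# copied from pom.xml).
FORBIDDEN='hbase-server|hbase-mapreduce|hbase-asyncfs|hbase-replication|hadoop-client|hadoop-auth|hadoop-common|hadoop-hdfs|slf4j-log4j12|snappy-java'

# Hypothetical helper: read `mvn dependency:tree` output on stdin and print
# every line that still references an excluded artifact.
# Returns 0 when the tree is clean, 1 when an excluded artifact is present.
check_excluded() {
    # Tree entries look like "group:artifact:jar:version:scope", so the
    # artifact ID is always delimited by colons.
    if grep -E ":(${FORBIDDEN}):"; then
        return 1
    fi
    return 0
}

# Example usage, run from the repository root (assumes Maven is installed):
#   mvn -pl hugegraph-server/hugegraph-hbase dependency:tree | check_excluded
```

Matching on the colon-delimited artifact ID avoids false positives from artifacts that merely share a prefix with an excluded name.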
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## master #3021 +/- ##
=============================================
+ Coverage 35.90% 93.25% +57.35%
+ Complexity 338 65 -273
=============================================
Files 803 9 -794
Lines 68040 267 -67773
Branches 8905 22 -8883
=============================================
- Hits 24429 249 -24180
+ Misses 40991 8 -40983
+ Partials 2620 10 -2610
The HBase upgrade direction looks reasonable, but I don't think this is ready to merge until the dependency/release-materials check and HBase runtime verification are completed.
This PR replaces the old custom
For dependency changes, please also check the release-compliance materials and update them as needed, following the existing HugeGraph release-docs style:
This does not necessarily mean adding large new LICENSE / NOTICE sections. The exact update should be based on the actual newly added or changed jars and their license/notice requirements, consistent with how this repo already records third-party dependencies. But we should not merge with only the dependency allowlist updated if the corresponding release materials still describe the old HBase 2.0.6 dependency set.
Please also confirm the new HBase 2.6.5 runtime path with enough verification before merge. Since this PR adds a Dockerized HBase 2.6.5 environment, it would be helpful to include the exact commands/results used to verify that the new version works end to end, for example:
In short:
Implemented review comments:
- Updated the new dependencies in the LICENSE and NOTICE files.
- Added licenses for the new libraries under install-dist/release-docs/licenses.
- Updated install-hbase.sh and hbase-site.xml to fix the failures.
Verified locally using the CI steps.
I have updated the PR based on your feedback:
I have attached the successful local API test execution logs for reference.
Title
fixes #3016
feat(hbase): upgrade to HBase 2.6.5 and replace custom shaded endpoint with official Apache artifacts
Background
This PR modernizes HugeGraph’s HBase integration by replacing the custom hbase-shaded-endpoint dependency with official Apache HBase 2.6.5 artifacts, and adds a reproducible Docker-based local test environment for HBase backend development/verification.
What changed
1) HBase dependency upgrade (hugegraph-server/hugegraph-hbase/pom.xml)
- hbase.version property: 2.6.5
- Removed: com.baidu.hugegraph:hbase-shaded-endpoint:2.0.6
- Added: org.apache.hbase:hbase-endpoint:${hbase.version}
- Added: org.apache.hbase:hbase-shaded-client:${hbase.version}
- Exclusions on hbase-endpoint to avoid pulling heavyweight server/hadoop transitive components not needed by HugeGraph runtime.
- Dependency ordering (hbase-endpoint before hbase-shaded-client) to preserve AggregationClient/LongColumnInterpreter compatibility.
2) Dockerized HBase standalone environment (new files under docker/hbase/)
- Dockerfile to build HBase 2.6.5 image from official Apache tarballs.
- Download sources: downloads.apache.org, archive.apache.org.
- Checksum handling for .sha512 formats.
- entrypoint.sh that starts ZooKeeper + Master + RegionServer and blocks until service readiness.
- hbase-site.xml tuned for local standalone/pseudo-distributed usage and HugeGraph defaults.
- docker-compose.hbase.yml with ports, healthcheck, persistent volumes, and overridable download URLs.
3) End-to-end usage and troubleshooting docs (docker/HBASE.md)
4) Dependency allowlist update (install-dist/scripts/dependency/known-dependencies.txt)
- Removed: hbase-shaded-endpoint-2.0.6.jar
- Added: hbase-endpoint-2.6.5.jar
- Added: hbase-shaded-client-2.6.5.jar
Why
Impact
How to verify
HBase upgrade verification (HBase backend version 2.0.6 and client library version 2.6.5)
Fresh install verification (HBase backend and client version 2.6.5)
Notes
SHA512 verification remains enforced by default during Docker image build.
ALLOW_UNVERIFIED_DOWNLOAD=true is intended only for trusted/restricted test environments.
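The interaction between enforced verification and the opt-out flag might look roughly like the sketch below. It assumes that ALLOW_UNVERIFIED_DOWNLOAD only matters when no checksum could be obtained, and that a mismatching checksum always fails; the PR's Dockerfile may behave differently. The function name verify_download is hypothetical, and the sketch relies on the coreutils `sha512sum` command being available.

```shell
#!/bin/sh
# Hypothetical helper (not taken from the PR): verify a downloaded file.
#   $1 = path to the downloaded file
#   $2 = expected lowercase hex SHA512 digest (empty if none could be fetched)
verify_download() {
    file=$1
    expected=$2
    if [ -z "$expected" ]; then
        # Assumption: the opt-out applies only when no checksum is available.
        if [ "${ALLOW_UNVERIFIED_DOWNLOAD:-false}" = "true" ]; then
            echo "WARNING: skipping SHA512 verification for $file" >&2
            return 0
        fi
        echo "ERROR: no checksum available for $file" >&2
        return 1
    fi
    # sha512sum prints "<digest>  <file>"; keep only the digest.
    actual=$(sha512sum "$file" | cut -d ' ' -f 1)
    [ "$actual" = "$expected" ]
}

# Example usage (file name and variable are placeholders):
#   verify_download hbase-2.6.5-bin.tar.gz "$expected" || exit 1
```

Defaulting the flag to false keeps verification mandatory unless someone sets the variable explicitly, which matches the "enforced by default" wording above.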