HBASE-30225 Performance degradation observed on ycsb reads benchmark after HBASE-2972 by wchevreuil · Pull Request #8353 · apache/hbase

wchevreuil · 2026-06-12T20:15:39Z

No description provided.

Apache9

Mind explaining more about the changes here? I do not fully get what you are trying to fix here...

Apache9 · 2026-06-13T13:42:33Z

  public BlockCacheKey(Path hfilePath, long offset, boolean isPrimaryReplica, BlockType blockType) {
    this(hfilePath.getName(), hfilePath.getParent().getName(),
      hfilePath.getParent().getParent().getName(), offset, isPrimaryReplica, blockType,
-      HFileArchiveUtil.isHFileArchived(hfilePath));


Why here we can use false directly?

We are actually only using this constructor on tests now. Because the HFileUtils.isArchived call seems too expensive to be performed on every block cache key creation, that check should be delegated to the reader, which can do it once during it's initialisation. Writers don't even need it, since a being written file is guaranteed to be "non-archived", so writers can benefit from constructors that alwas set archived as false. Let me add javadoc to explain this.

Then why not just remoev this flag in BlockCacheKey if it does not change?

It might change when it's created by a Reader (see HFileReaderImpl#1352).

Apache9 · 2026-06-13T13:43:12Z

-    this.hfileName = HFILE_NAME_POOL.intern(hfileName);
-    this.regionName = (regionName != null) ? REGION_NAME_POOL.intern(regionName) : null;
-    this.cfName = (cfName != null) ? CF_NAME_POOL.intern(cfName) : null;
+    this.hfileName = hfileName;


If we do not do intern here, why we still put the above 3 string pools in this class?

We are doing intern from readers and writers, we need those to access a single pool per file/region/cf. Since both reader and writer classes already reference BlockCacheKey, and the string re-use is relevant in the BlockCacheKey usage, I found it makes sense to keep the pools in the BlockCacheKey.

Why not introduce a new class to do this?

wchevreuil · 2026-06-15T12:17:30Z

Mind explaining more about the changes here? I do not fully get what you are trying to fix here...

As explained in the jira description:

When executing ycsb read workloads, we observed a ~30% latency degradation (please see flamegrahs attached to the jira).

The problem was that we added logic for parsing the file Path into region name, family name, as well checks for archiving all on the BlockCacheKey constructor used by HFileReaderImpl on the beginning of each block read. As seen on the flame graphs attached covering a five minutes window on one of the RSes, around 30% of the CPU time was spent on the BlockCacheKey constructor, either calling Path.getParent() or HFileUtils.isHFileArchived().

Apache9 · 2026-06-15T14:27:43Z

Mind explaining more about the changes here? I do not fully get what you are trying to fix here...

As explained in the jira description:

When executing ycsb read workloads, we observed a ~30% latency degradation (please see flamegrahs attached to the jira).

The problem was that we added logic for parsing the file Path into region name, family name, as well checks for archiving all on the BlockCacheKey constructor used by HFileReaderImpl on the beginning of each block read. As seen on the flame graphs attached covering a five minutes window on one of the RSes, around 30% of the CPU time was spent on the BlockCacheKey constructor, either calling Path.getParent() or HFileUtils.isHFileArchived().

So the intention here, is to not always call intern when creating BlockCacheKey, and move intern to other places?

wchevreuil · 2026-06-15T16:35:03Z

Mind explaining more about the changes here? I do not fully get what you are trying to fix here...

As explained in the jira description:
When executing ycsb read workloads, we observed a ~30% latency degradation (please see flamegrahs attached to the jira).
The problem was that we added logic for parsing the file Path into region name, family name, as well checks for archiving all on the BlockCacheKey constructor used by HFileReaderImpl on the beginning of each block read. As seen on the flame graphs attached covering a five minutes window on one of the RSes, around 30% of the CPU time was spent on the BlockCacheKey constructor, either calling Path.getParent() or HFileUtils.isHFileArchived().

So the intention here, is to not always call intern when creating BlockCacheKey, and move intern to other places?

Sort of. The main problem is not the intern call itself, but the parsing of file name, region and CF name from the path. Doing it inside the BlockCacheKey constructor is too costly, we saw 10% degradation on latency and throughput of ycsb workloads. Moving this parsing to HFileWriterImpl/HFileReaderImpl initialisation means we only need to do it once, as all block cache keys on each writer/reader instance will refer to same values.

We could maybe leave the pools and intern calls private to BlockCacheKey, but we would miss the original parsed strings from getting pooled. Less efficient, but I think it's not a big deal, as we would only be doing it at the reader/writer level. Code-wise, I think it would be cleaner and less confusing. Let me do it.

…after HBASE-2972

VladRodionov

The direction makes sense to me, but I think HFileReaderImpl should also cache the HFile name once.

VladRodionov · 2026-06-16T04:01:09Z

-    BlockCacheKey cacheKey =
-      new BlockCacheKey(path, dataBlockOffset, this.isPrimaryReplicaReader(), expectedBlockType);
-
+    BlockCacheKey cacheKey = new BlockCacheKey(path.getName(), family, region, dataBlockOffset,


A bit off top, I am really surprised that that parsing qualified file name can be expensive at all, but you see that in your benchmarks. Q: why did not you cache file name in this case? path.getName()?

The culprits, as seem by the async profiler flamegraphs, are Path.getParent() and HFileArchiveUtil.isFileArchived(). I didn't think of doing the same for the file name because that didn't seem to cause problems, but yeah, could do it for the sake of code consistency.

VladRodionov · 2026-06-16T04:02:40Z

   * Construct a new BlockCacheKey using a file path. File, column family and region information
-   * will be extracted from the passed path. This should be used when inserting keys into the cache,
-   * so that region cache metrics are recorded properly.
+   * will be extracted from the passed path, so that region cache metrics are recorded properly/


nit: properly

Apache9 reviewed Jun 13, 2026

View reviewed changes

wchevreuil requested a review from Apache9 June 15, 2026 13:21

HBASE-30225 Performance degradation observed on ycsb reads benchmark …

a1f80d6

…after HBASE-2972

wchevreuil force-pushed the HBASE-30225 branch from 0d41fec to a1f80d6 Compare June 15, 2026 19:09

VladRodionov reviewed Jun 16, 2026

View reviewed changes

Conversation

wchevreuil commented Jun 12, 2026

Uh oh!

Apache9 left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

wchevreuil Jun 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

wchevreuil commented Jun 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Apache9 commented Jun 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

wchevreuil commented Jun 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

VladRodionov left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

wchevreuil Jun 15, 2026 •

edited

Loading

wchevreuil commented Jun 15, 2026 •

edited

Loading

Apache9 commented Jun 15, 2026 •

edited

Loading

wchevreuil commented Jun 15, 2026 •

edited

Loading