fix(java): fix codegen config hash when registering serializers#3122
fix(java): fix codegen config hash when registering serializers#3122sagiruthvik wants to merge 6 commits intoapache:mainfrom
Conversation
| private static final boolean isLittleEndian = ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN; | ||
| private static final byte BITMAP = isLittleEndian ? isLittleEndianFlag : 0; | ||
| private static final short MAGIC_NUMBER = 0x62D4; | ||
| private static final AtomicInteger runtimeHashCounter = new AtomicInteger(0); |
There was a problem hiding this comment.
we don't need this, we could update configHash instance field direclty when invoke registerSerializer everytime. Andu compute new hash based hash of full classname of registered Serializer.
There was a problem hiding this comment.
Hi @chaokunyang
I think the intent was to keep with having the generated class suffix be a relatively small integer.
If going for a full hash of the registered serializers, then perhaps we implement hashCode() for Fory to include Config.hashCode()?
There was a problem hiding this comment.
The suffix don't use hash, it use hash as key to store a counter for suffix.
And we should not impl hash code for Fory. Maybe we should not name this method as hash either
There was a problem hiding this comment.
Yes. That's why we were following that pattern, but moved it to Fory to cover it being not just the initial config.
Now the incrementAndGet in the ctor and when modifying means that e.g. Fory#1 will get 1, and Fory#2 will get 2, but then Fory#2 gets a serializer added so it becomes 3.
There was a problem hiding this comment.
However, does this cover pooled Fory? Does that create multiple instances, so should still use the same trick, so we should use the Map in Fory but include the serializer hashes for computeIfAbsent
There was a problem hiding this comment.
Pooled Fory will apply register callback, which will invoke registerSerializer API on every pooled Fory instance. All fory instance will compute to same hash value
There was a problem hiding this comment.
Yes. So we cannot do what has been added on line 145. This approach is wrong.
You are agreeing with me.
There was a problem hiding this comment.
we could update configHash instance field direclty when invoke registerSerializer everytime. And compute new hash based hash of full classname of registered Serializer
This will work, why don't you use this approach?
There was a problem hiding this comment.
Fory will never be invoked By multiple threads, we never need to use any concurrent data structure here. Please remove AtomicInteger here.
There was a problem hiding this comment.
I've updated the implementation to compute the hash based on registered serializers instead of using a counter. Now configHash is calculated from config.hashCode() plus the hash of all registered serializer class names. This way, pooled Fury instances with the same serializers will share the same hash, while instances with different serializers get different hashes.
f19a744 to
23d71c7
Compare
| private final IdentityMap<Object, Object> originToCopyMap; | ||
| private int classDefEndOffset; | ||
| private int configHash; | ||
| private final Set<String> registeredSerializers = new HashSet<>(); |
There was a problem hiding this comment.
Please delete this, we don't need to store registered serializer
| return configHash; | ||
| } | ||
|
|
||
| private void invalidateCodegenCache() { |
There was a problem hiding this comment.
Please also delete this method
| registeredSerializers.add(type.getName() + ":" + serializerClass.getName()); | ||
| invalidateCodegenCache(); |
There was a problem hiding this comment.
| registeredSerializers.add(type.getName() + ":" + serializerClass.getName()); | |
| invalidateCodegenCache(); | |
| this.configHash = Objects.hashCode(configHash, type, serializerClass); |
There was a problem hiding this comment.
And org.apache.fory.Fory#registerSerializer(java.lang.Class<?>, org.apache.fory.serializer.Serializer<?>) also needs updates
23d71c7 to
d5b1adc
Compare
…registration and update test
d5b1adc to
dfbbaed
Compare
|
Rebased to solve merge conflict. It should now squash nicely |
|
If you want, you could change "when register serializers" in the title to "when registering serializers" for better English in the release notes :-) |
|
@nealeu @sagiruthvik Could you fix ci failure errors? Otherwise, we can't merge this PR |
|
Sorry. Bit of a push on here. Ping us if we don't pick it up next week |
Why?
What does this PR do?
Related issues
Closes #3116
Does this PR introduce any user-facing change?
Benchmark