Fix Read1To3 on Big Endian for 2 byte inputs #1388

Merged
merged 1 commit into from
Feb 22, 2023

Conversation

miladfarca
Contributor

The current implementation does not work for 2-byte inputs on big-endian targets, because mem1 and mem2 both point to the least significant byte.

@@ -1083,7 +1083,7 @@ class ABSL_DLL MixingHashState : public HashStateBase<MixingHashState> {
     unsigned char significant0 = mem0;
 #else
     unsigned char significant2 = mem0;
-    unsigned char significant1 = mem1;
+    unsigned char significant1 = len == 2 ? mem0 : mem1;
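
To make the 2-byte failure concrete, here is a minimal sketch of the big-endian branch that runs on any host. The names Read1To3BigEndian, ExpectedBigEndian, and the apply_fix flag are hypothetical, and the mem0/mem1/mem2 loads and the final shift-and-OR expression are assumptions about the surrounding function, since the diff does not show them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical reference: the value a big-endian load of `len` bytes
// (1 to 3) would produce, with p[0] as the most significant byte.
inline uint32_t ExpectedBigEndian(const unsigned char* p, size_t len) {
  uint32_t v = 0;
  for (size_t i = 0; i < len; ++i) v = (v << 8) | p[i];
  return v;
}

// Sketch of the big-endian branch. `apply_fix` toggles the one-line change
// in this PR. The loads and the combining expression below are assumed,
// not taken from the diff.
inline uint32_t Read1To3BigEndian(const unsigned char* p, size_t len,
                                  bool apply_fix) {
  unsigned char mem0 = p[0];
  unsigned char mem1 = p[len / 2];
  unsigned char mem2 = p[len - 1];
  unsigned char significant2 = mem0;
  unsigned char significant1 = (apply_fix && len == 2) ? mem0 : mem1;
  unsigned char significant0 = mem2;
  return static_cast<uint32_t>(significant0 |
                               (significant1 << (len / 2 * 8)) |
                               (significant2 << ((len - 1) * 8)));
}
```

Under these assumptions, the bytes {0x12, 0x34} read without the fix give 0x3634, because p[1] lands in both byte positions. With the fix they give 0x1234. Inputs of length 1 and 3 are unaffected.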
Member

I actually think we don't need any of the endian complications here. We should be fine returning (mem0 << 16) | (mem1 << 8) | mem2.

This only needs to be deterministic and use all the bytes at least once. It can duplicate bytes as long as it does so consistently.

See also:

a = static_cast<uint64_t>((ptr[0] << 16) | (ptr[len >> 1] << 8) |
ptr[len - 1]);
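
As a rough illustration of the "use all the bytes at least once" property, the quoted expression can be wrapped in a standalone function. The name ReadAllBytes1To3 is hypothetical; only the expression itself comes from the snippet above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Byte-order-agnostic read of 1 to 3 bytes using the quoted expression.
// Some bytes are counted more than once, but every input byte contributes
// to the result, and the result does not depend on host endianness.
inline uint64_t ReadAllBytes1To3(const unsigned char* ptr, size_t len) {
  return static_cast<uint64_t>((ptr[0] << 16) | (ptr[len >> 1] << 8) |
                               ptr[len - 1]);
}
```

For the bytes {0x12, 0x34} this returns 0x123434. That is deterministic and covers both bytes, but it is not the 16-bit integer those bytes encode on either byte order.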

Contributor Author

@miladfarca, Feb 16, 2023

But then the returned value won't match this on BE, which will cause a number of test failures:

return static_cast<uint64_t>(m ^ (m >> (sizeof(m) * 8 / 2)));

Member

Which tests fail and why does the return value have to match? Abseil hash is randomized and has no stability guarantees at all across platforms or even across runs on the same platform.

Contributor Author

This recently added test case fails:

EXPECT_EQ(absl::HashOf(i16), absl::Hash<int16_t>{}(i16));

Member

It took me a while to realize that there is an IntegralFastPath above that this has to match, so this does have to be a precise read, it can't just use all the bytes. LowLevelHash has no such need.

I've sent your change to another colleague for review in case there is some other trick that can be used, but otherwise looks good.
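
The "precise read" requirement can be sketched as follows, assuming the integral fast path mixes the integer's unsigned value directly (an assumption here; the fast path itself is not shown in this thread). The helpers HostIsLittleEndian, ReadHostOrder, and BytePathMatchesInteger are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Runtime probe of the host byte order.
inline bool HostIsLittleEndian() {
  const uint16_t probe = 1;
  unsigned char first;
  std::memcpy(&first, &probe, 1);
  return first == 1;
}

// Precise read: rebuilds the integer that the `len` bytes (1 to 3) encode
// in host byte order, using each byte exactly once.
inline uint32_t ReadHostOrder(const unsigned char* p, size_t len) {
  const bool little = HostIsLittleEndian();
  uint32_t v = 0;
  for (size_t i = 0; i < len; ++i) {
    const size_t shift = little ? i : (len - 1 - i);
    v |= static_cast<uint32_t>(p[i]) << (8 * shift);
  }
  return v;
}

// True when the byte-wise path sees the same value that a path operating
// on the integer itself would see.
inline bool BytePathMatchesInteger(int16_t value) {
  unsigned char bytes[sizeof(value)];
  std::memcpy(bytes, &value, sizeof(value));
  return ReadHostOrder(bytes, sizeof(bytes)) == static_cast<uint16_t>(value);
}
```

If both paths start from the same value, a check such as the EXPECT_EQ quoted earlier can hold on either byte order. A read that duplicates bytes would not satisfy it for 2-byte inputs.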

Contributor Author

Not a problem, thanks for reviewing it.

@copybara-service copybara-service bot merged commit d6ea4df into abseil:master Feb 22, 2023