Collision probability of truncating vs. subsequencing a SHA-256 hash to 128 bits?

Çağlar Arlı

I am taking a SHA-256 hash output in hexadecimal (64 hex nibbles), subsequencing it by keeping every other character to get 32 hex nibbles (128 bits), and formatting the result as a UUID string.
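To make the two variants concrete, here is a minimal sketch of both approaches. The function names `subsequenced_uuid` and `truncated_uuid` are hypothetical, and the choice to keep the characters at even positions (0, 2, 4, ...) is an assumption; the post does not say which half is kept.

```python
import hashlib
import uuid


def subsequenced_uuid(data: bytes) -> uuid.UUID:
    """Keep every other hex character of the SHA-256 digest (assumed: even positions)."""
    digest_hex = hashlib.sha256(data).hexdigest()  # 64 hex characters
    every_other = digest_hex[::2]                  # 32 hex characters = 128 bits
    return uuid.UUID(hex=every_other)


def truncated_uuid(data: bytes) -> uuid.UUID:
    """Keep the first 32 hex characters of the SHA-256 digest."""
    digest_hex = hashlib.sha256(data).hexdigest()
    return uuid.UUID(hex=digest_hex[:32])


record = b"example record"
print(subsequenced_uuid(record))
print(truncated_uuid(record))
```

Note that `uuid.UUID(hex=...)` used this way stores all 128 bits as given and does not set the UUID version or variant fields. If those fields were set, 6 of the 128 bits would be fixed and only 122 bits would come from the hash.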

This isn't being used for security purposes, only as a way to create a deterministic UUID. I've read in a couple of sources that SHA-256 truncated to 128 bits is still more collision resistant than MD5.

My question is: does taking every other hex nibble, instead of keeping only the first 32 hex nibbles of the SHA-256 hash output, affect collision probability in any way?

My intuition is that it shouldn't affect collision probability at all, but every source I've read discusses only truncation to the first n characters of a SHA-256 hash and says nothing about taking every other character.

As a note, I also recently learned that UUIDv5 computes a SHA-1 hash and keeps only the first 16 bytes (with 6 of those bits overwritten by the version and variant fields) to create a reproducible UUID. So I believe the method I am using should provide similar collision safety.
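For comparison, the UUIDv5 construction can be reproduced by hand with Python's standard library. This is only a sketch: `manual_uuid5` is a hypothetical helper name, and the namespace and name used below are arbitrary example values.

```python
import hashlib
import uuid


def manual_uuid5(namespace: uuid.UUID, name: str) -> uuid.UUID:
    """Rebuild UUIDv5: SHA-1 over namespace bytes + name, keep the first 16 bytes."""
    digest = hashlib.sha1(namespace.bytes + name.encode("utf-8")).digest()  # 20 bytes
    # version=5 makes the constructor overwrite the version and variant bits,
    # so 122 of the 128 bits come from the hash.
    return uuid.UUID(bytes=digest[:16], version=5)


example = manual_uuid5(uuid.NAMESPACE_DNS, "example.com")
print(example == uuid.uuid5(uuid.NAMESPACE_DNS, "example.com"))
```

The comparison with the built-in `uuid.uuid5` shows that the standard scheme is plain truncation plus fixed version and variant bits.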

For my purposes, there would be at most 1 billion unique records.

Using the following approximate formula for accidental collision probability:

p ≈ k^2 / (2n), where:
k is the number of records (1 billion = 10^9)
n is the total number of possible hashes (2^128)

The probability of a collision comes out to about 1.47 x 10^-21. That is low enough that I feel safe assuming a collision will not occur.
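The arithmetic can be checked with a short script. This is a sketch of the birthday-bound approximation only; `collision_probability` is a hypothetical helper, and the 122-bit case is included on the assumption that the UUID version and variant bits might be set, which the post does not state.

```python
from fractions import Fraction


def collision_probability(k: int, bits: int) -> float:
    """Birthday-bound approximation k^2 / (2n) with n = 2**bits."""
    n = 2 ** bits
    # Fraction keeps the integer arithmetic exact before converting to float.
    return float(Fraction(k * k, 2 * n))


p_128 = collision_probability(10**9, 128)  # all 128 bits taken from the hash
p_122 = collision_probability(10**9, 122)  # assumed case: 6 bits fixed by UUID fields
print(f"128 bits: {p_128:.2e}")
print(f"122 bits: {p_122:.2e}")
```

Losing 6 bits multiplies the estimate by 2^6 = 64, which is still negligible for 1 billion records.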

Can anyone confirm or deny this for me?