Landon Noll looking up Fremont Peak Observatory 0.8m telescope Leonid 2001 meteor squall count at Fremont Peak

Too many demands for a single cryptographic hash function to satisfy

[chongo's home] [Astronomy] [Mathematics] [Prime Numbers] [Programming] [Technology] [contacting Landon]


Too many demands for a single cryptographic hash function to satisfy

Cryptographic hash functions are used for a very wide variety of purposes. These many purposes place a wide variety of demands on these hash functions. Below the Definitions and assumptions section you will find a partial set of requirements for Cryptographically sound hash functions. This long list of requirements reflect just some of the demands placed on cryptographic hash today.

I believe it is unreasonable to expect that a single, useful cryptographic hash function can satisfy ALL requirements we effectively place given the wide variety of demands that we have today.

My concern with the NIST Cryptographic Hash Algorithm Competition is that it seeks a single candidate algorithm. Instead of a single cryptographic hash function, we need several such functions. Each of those several hash functions could be designed to better satisfy a subset of the requirements for Cryptographically sound hash functions.

Designers of cryptographic protocols and procedures would need to determine their hash function requirements and then select the appropriate hash function that was designed to meet those hash function requirements. Some may object on the basis that this would complicate cryptographic protocols and procedures. I believe that today's simplistic "one cryptographic hash function fits all" possible requirements is both wrong and naive. Instead we need specifically designed cryptographic hash functions designed to need a select set of function requirements.

Some function requirements are difficult to achieve than others. It is likely that a cryptographic hash function designed to meet those more difficult to achieve requirements will be more complex. Moreover, that added complexity comes with a performance and implementation cost. Applications that do not require the more difficult requirements could opt to use cryptographic hash functions that are likely to be less complex and easier to implement and use.

It would be nice if there were a "one cryptographic hash function fits all" that was fast and east to implement. Unfortunately the world of cryptographic mathematic appears to not allow such naive wish. We really do need standardize on more than one cryptographic hash function. We really do need those family of cryptographic hash functions to be more specifically target to useful subset of function requirements.

Comments and corrections welcome

This page is a work in progress. Moreover, the notation on this page is prone to human (my) error. Comments, corrections and suggestions are welcome.

See How to contact Landon for my latest contact information.

Cryptographically sound hash function requirements

NOTE: Please see the Definitions and assumptions section at the bottom for details on the variables, constants and related assumption made in the requirements.
  1. Zero inverse resistant hash map:

    It is "hard to find" x

    such that:

    h(x) == 0
    BTW: This is known as resistance to the TPM hardware design flaw mega-crack.

  2. Inverse resistant hash map:

    Given x,

    where 0 < x < R
    it is hard to find y

    such that:

    h(y) == x
    BTW: This is also known as resistance to 1st pre-image attacks.

  3. Collision resistant hash map:

    It is "hard to find" x,
    AND it is "hard to find" y,

    where x != y
    such that:
    h(x) == h(y)
    BTW: This requirement resists finding hash collisions.

  4. Second pre-image collision resistant hash map:

    Given x,

    it is "hard to find" y,

    where x != y
    such that:
    h(x) == h(y)
    BTW: This resists finding a hash collision for a particular given value.

  5. Concatenation collision resistant hash map:

    Given x,
    AND given y,

    where x != y
    it is "hard to find" p,
    where 0 ≤ (p | x) < R,
    AND where 0 ≤ (p | y) < R
    such that:
    h(p | x) == h(p | y)
    OR

    it is "hard to find" q,

    where 0 ≤ (x | q) < R,
    AND where 0 ≤ (y | q) < R
    such that:
    h(x | q) == h(y | q)
    BTW: This is known as the suffix or prefix collision attack.

  6. Double concatenation collision resistant hash map:

    Given x,
    AND given y,

    where x != y,
    it is "hard to find" j,
    AND it is "hard to find" k,
    where j != k
    AND where either j or k (but not both) may be empty,
    AND where (j | x) != (k | y)
    AND where 0 ≤ (j | x) < R
    AND where 0 ≤ (k | y) < R
    such that:
    h(j | x) == h(k | y)
    OR

    it is "hard to find" p,
    AND it is "hard to find" q,

    where p != q
    AND where either p or q (but not both) may be empty,
    AND where (x | p) != (y | q)
    AND where 0 ≤ (x | p) < R
    AND where 0 ≤ (y | q) < R
    such that:
    h(x | p) == h(y | q)
    BTW: This is a more general form of the suffix or prefix collision attack that allows for different prefixes or suffixes to be used.

  7. Single bit set or clear resistant hash map:

    It is "hard to find" x

    such that:

    h(x) is a power of 2
    OR
    The 2's compliment of h(x) is a power of 2
    BTW: The first case means it is hard to find a value that, when hashed, will produce a value with only a single bit set to 1. The second case means it is hard to find a value that, when hashed, will produce a value with only a single bit set to 0.

  8. Incremental or decremental pre-image collision resistant hash map:

    Given x,

    it is "hard to find" y,

    such that:

    h(x) + 1 == h(y)
    OR

    such that:

    h(x) - 1 == h(y)
  9. Domain coverage map:

    Foreach 0 ≤ y < D,
    there exists at least one value x,

    where 0 ≤ x < R
    such that:
    h(x) == y
    BTW: This means that the hash function is able output all possible domain values.

  10. Compound zero inverse resistant hash map:

    Given x,

    it is "hard to find" N,

    where 1 < N < B
    such that:
    hN(x) == 0
  11. Compound inverse resistant hash map:

    Given x,

    it is "hard to find" y,
    AND it is "hard to find" N,

    where 1 < N < B
    such that:
    hN(y) == x
  12. Compound collision resistant hash map:

    It is "hard to find" x,
    AND it is "hard to find" y,

    where x != y
    AND it is "hard to find" N,
    where 1 < N < B
    such that:
    hN(x) == hN(y)
    BTW: This resists finding any hash collision by repeated hashing.

  13. Compound 2nd pre-image collision resistant hash map:

    Given x,

    it is "hard to find" y,

    where x != y
    AND it is "hard to find" N,
    where 1 < N < B
    such that:
    hN(x) == hN(y)
    BTW: This resists finding a hash collision by repeated hashing for a particular given value.

  14. Double compound collision resistant hash map:

    It is "hard to find" x,
    AND it is "hard to find" y,

    where x != y
    AND it is "hard to find" M,
    where 2 < M < B
    AND it is "hard to find" N,
    where 1 < N < M
    such that:
    hN(x) == hM(y)
  15. Double compound second pre-image resistant hash map:

    Given x,

    it is "hard to find" y,

    where x != y
    AND it is "hard to find" M,
    where 2 < M < B
    AND it is "hard to find" N,
    where 1 < N < M
    such that:
    hN(x) == hM(y)
  16. Function-related pre-image collision resistant hash map:

    Given x,
    AND given an O(1) non-degenerate elementary function f(y),

    where 0 ≤ f(y) < D
    for all y,
    where 0 ≤ y < R
    it is "hard to find" z

    such that:

    f(h(x)) == h(z)
    BTW: This means that one cannot find two values whose hash has an elementary relationship.

    Don't be fooled by the word elementary. Here the word elementary refers to an function that falls under the scope of elementary number theory. Elementary functions can be quite complex!

    Simple examples include:

    f(x) == 2*(x)
    f(x) == x2 mod D
    f(x) == the count of the number of 1 bits in x
    f(x) == the xth prime (where the required primes have been precomputed)
    Certain degenerate functions are excluded. For example, this function is excluded:
    f(any value) == h(z)
    Because it requires the h(z) ti be known and because it always returns that pre-determined answer regardless of input.

    Another example of a function that is excluded is a function that makes an exhaustive search of the entire hash space returning the first collision discovered after the hash space has been searched. This brute force exhaustive search function would run in constant time (because it always does a complete search of the same sized space before turning an answer). However for realistic hash space sizes, such a brute force function is impractical to run to completion.

  17. Compound function-related pre-image collision resistant hash map:

    Given x,
    AND given an O(1) non-trivial elementary function f(z),

    where 0 ≤ f(z) < D
    for all z in the half open interval [0,R)
    it is "hard to find" N,
    where 1 < N < B
    AND it is "hard to find" y

    such that:

    f(hN(x)) == h(y)
    BTW: This means that one cannot find a compound hash of a value that has an elementary relationship with the hash of another value.

    See the "Function-related pre-image collision resistant hash map" requirement for details on non-trivial elementary functions.

  18. Uniform dispersion map:

    For any set UN

    where 0 < N < B
    AND where the values UN have been "uniformly distributed" over the half open interval [0,R) by a "cryptographically sound deterministic bit generator"
    the hash of each member of UN

    appears to be "uniformly distributed" over the half open interval:

    [0,D)

    BTW: This means that the hash of random values appears to be well distributed across the hash function domain.

  19. Pseudo-random map:

    For a given N,

    where 0 < N < B,
    the concatenation of the hash of each member of UN appears to be a sequence generated by a "cryptographically sound deterministic bit generator".

    BTW: This means that the hash function acts as a cryptographically sound deterministic bit generator that was seeded by UN. That is:

    h(UN[0]) | h(UN[1]) | ... h(UN[N-1])
    passes the requirements for output of from a "cryptographically sound random bit generator".

  20. Pseudo-random bit flip dispersion map:

    For a given N,

    where 0 < N < B,
    AND for a given set UN that was generated over the half open interval [0,R) by a "cryptographically sound deterministic bit generator",
    foreach value x over the half open interval [0,N),
    AND foreach value i over the half open interval [0,W),
    the concatenation of the set of values from this expression:
    h(UN[x]) xor h(UN[x] xor 2i)
    passes the requirements for output of from a "cryptographically sound random bit generator".

    BTW: Randomly changing an input bit will appear to randomly change about half the output bits. Moreover, the output bit positions that change appear to be randomly selected.

Definitions and assumptions

  • h(x)

    This is a given cryptographic hash function.

  • All values are unsigned integers unless otherwise specified.

  • R and D

    The range (R) and domain (D) of the hash function h(x). In particular:

    For all 0 ≤ x < R: 0 ≤ h(x) < D
  • B

    This value is known as the "birthday domain limit" In particular:

    B = floor(sqrt(D))
    where floor() is the floor function and sqrt() is the square root function.

  • W

    This is the bit width of the output of the hash function. In particular:

    W is the smallest value such that 2WD
  • hN(x)

    A compound hash function. Example:

    h3(x) == h(h(h(x)))
  • L

    The lifetime of the hash function. In particular:

    L is the amount time where h(x) is expected to be used. After period T, h(x) will become deprecated for use.
    The value L is expected to be defined by either the hash function creators and/or the hash standard.

  • C

    A large collection of classical (non-quantum) computers. The value C is expected to be defined by either the hash function creators and/or the hash standard.

  • P

    A very small probability that is 0 < P < 1. The value P is expected to be defined by either the hash function creators and/or the hash standard. P is not an unsigned integer.

  • (p | x)

    The bit-wise concatenation of the values p AND x.

  • Variables j, k, p, q, x, y, z, are hash-able

    These variables are in the half open interval [0,R).

  • "hard to find"

    A solution is said to be "hard to find" if C classical (non-quantum) computers working constantly over the hash function lifetime L, has a probability of discovering a solution is ≤ P.

  • UN    {UN[0], UN[1], ... UN[N-1]}

    A set of N unique values such that:

    0 < N < D

    Foreach 0 ≤ i < N

    0 ≤ UN[i] < D

    Foreach 0 ≤ i < N-1

    Foreach i+1 ≤ j < N
    UN[i] != UN[j]
  • "True random source"

    A "true random source" is a theoretical random bit generator that is perfect in every way. The next value produced is completely uncorrelated with all previous values.

    Random bit statistical tests that run in polynomial time determine the likelihood (or lack thereof) that a given finite set of data could have been produced by a "true random source".

    For a sequence of generated bits to pass a statistical test, said sequence must pass with the same level of confidence as an equivalent length sequence from a true random source.

  • "Cryptographically strong random bit generator"

    A "cryptographically strong random bit generator" passes all statistical tests that run in polynomial time asymptotically. It will pass any statistical test for randomness that does not require an exponentially increasing to infinite amount of time to run. All such polynomial time statistical tests will be unable to distinguish the random bit generator from a "true random source".

    The values produced by the generator are random in an absolutely precise way. There is absolutely no better way to predict the sequence than by tossing a coin. Furthermore, having a large chunk of output from the random bit generator does not help in predicting past or future values.

  • "Cryptographically sound random bit generator"

    A "cryptographically sound random bit generator" is a very high quality random bit generator that is almost certainly a "cryptographically strong random bit generator". While the generator lacks a formal proof, there should exist a solid and well reasoned argument for its cryptographic strength.

    A "cryptographically sound random bit generator" will pass statistical tests with a very high level of confidence. In fact, sound generators will pass tests at the same confidence level as strong generators. In particular, any standard battery of statistical tests for randomness, when given a statistically significant amount of data, will not be able to distinguish the random bit generator from a "true random source".

  • The requirements listed are not indented to be orthogonal

    Some requirements may overlap other requirements. The list of requirements is not intended to be a minimal set of requirements.

  • The requirements listed are not complete.

    The list of requirements listed is not intended to be an exhaustive list of requirements.

TODO: Things to add someday

As I said, this is a work in progress. Someday I plan to add:
  • Define "uniformly distributed"
  • Increase the orthogonality of the requirements
  • Add more BTW text to each of the requirements
  • Add of the few missing requirements
  • Make the list of elementary function examples more cryptographically relevant


Valid HTML 4.01!

© 1994-2013 Landon Curt Noll
chongo (was here) /\oo/\
$Revision: 7.3 $ $Date: 2014/02/10 03:06:12 $