
The String hash function

In Java's String class, for example, the hash code h of a string s of length n is calculated as
\( \texttt{h} \;=\; \texttt{s[0]}*31^{n-1} + \texttt{s[1]}*31^{n-2} + \cdots + \texttt{s[n-1]} \)
or, in code,
int n = s.length();
int h = 0;
for (int i = 0; i < n; i++) {
    h = 31*h + s.charAt(i);   // multiply the running value by 31, then add the next character code
}
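The loop above can be wrapped in a small self-contained program and checked against the built-in String.hashCode(). This is only a sketch: the class name StringHashDemo and the method name hash are chosen here for illustration and do not come from the Java library.

```java
class StringHashDemo {
    // Evaluates s[0]*31^(n-1) + ... + s[n-1] one character at a time,
    // using int arithmetic (so any overflow wraps around).
    static int hash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);  // the char is promoted to int before the addition
        }
        return h;
    }

    public static void main(String[] args) {
        // Print our value next to the library's value for a few strings.
        for (String s : new String[] {"", "a", "hello"}) {
            System.out.println("\"" + s + "\": " + hash(s) + " vs " + s.hashCode());
        }
    }
}
```

For each string the two printed numbers should agree, since String.hashCode() is documented to compute the same polynomial.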
For example, the hash code of hello is computed from the Unicode values of its characters
Character:   h      e      l      l      o
Value:       $104$  $101$  $108$  $108$  $111$
to give the value
\( 99162322 \;=\; 104*31^{4} + 101*31^{3} + 108*31^{2} + 108*31 + 111 \)
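The loop never forms the powers of 31 explicitly. Each pass multiplies the running value by 31 and adds the next character, which amounts to evaluating the same polynomial in nested form:
\( 99162322 \;=\; (((104*31 + 101)*31 + 108)*31 + 108)*31 + 111 \)
so the value of h after each of the five characters is
$104$ $3325$ $103183$ $3198781$ $99162322$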
In general, the arithmetic operations in such expressions are carried out on 32-bit int values modulo $2^{32}$: overflow is not reported, and the result simply wraps around. For example
Integer.MAX_VALUE + 1 = Integer.MIN_VALUE
where
Integer.MAX_VALUE = $2147483647$
Integer.MIN_VALUE = $-2147483648$
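The wraparound can be observed directly. The following is a minimal sketch; the class OverflowDemo and its two helper methods are hypothetical names introduced for illustration. It contrasts ordinary int addition, which wraps silently, with Math.addExact, which throws an ArithmeticException on overflow.

```java
class OverflowDemo {
    // Plain int addition: a result outside the int range wraps around.
    static int wrapAdd(int a, int b) {
        return a + b;
    }

    // Returns true if a + b does not fit in an int,
    // detected by letting Math.addExact throw.
    static boolean overflows(int a, int b) {
        try {
            Math.addExact(a, b);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(wrapAdd(Integer.MAX_VALUE, 1) == Integer.MIN_VALUE);
        System.out.println(overflows(Integer.MAX_VALUE, 1));
    }
}
```

The hash loop relies on the first behaviour: it uses plain int arithmetic and never checks for overflow.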
Note that, because of the wraparound associated with modular arithmetic, a hash code can be negative, or even zero. The hash code of hello is positive only because the string is short enough that the sum never exceeds Integer.MAX_VALUE. By contrast, the hash code of helloa, which is 31 times the hash code of hello plus 97, is not $3074032079$, which lies outside the range of 32-bit signed integers, but $3074032079-2^{32} = -1220935217$.
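The helloa calculation can be reproduced by doing the last step twice, once in 64-bit long arithmetic, where the exact value fits, and once in 32-bit int arithmetic, as the hash loop does. This is a sketch only; the class HelloaDemo and the methods exactNext and wrappedNext are illustrative names, not part of any library.

```java
class HelloaDemo {
    // The value 31*h + c computed exactly, in 64-bit arithmetic.
    static long exactNext(int h, char c) {
        return 31L * h + c;
    }

    // The same step in 32-bit int arithmetic, as in the hashing loop.
    static int wrappedNext(int h, char c) {
        return 31 * h + c;
    }

    public static void main(String[] args) {
        int hello = "hello".hashCode();
        System.out.println(exactNext(hello, 'a'));    // exact value, too large for an int
        System.out.println(wrappedNext(hello, 'a'));  // the value after wraparound
        System.out.println("helloa".hashCode());      // the library's value for comparison
    }
}
```

The first line printed should be the out-of-range value and the last two should both be the wrapped, negative value.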


Peter Williams 2005-06-07