1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
\r
2 <!-- saved from url=(0044)http://www.cosy.sbg.ac.at/~ppalfrad/hashing/ -->
\r
3 <HTML><HEAD><TITLE>Hashing</TITLE>
\r
4 <META content="Weasel alias Peter Palfrader - weasel@netalive.org" name=Author>
\r
5 <META content="text/html; charset=iso-8859-1" http-equiv=Content-Type>
\r
7 content="hash, hashing, linear probing, squared probing, double hashing, pascal"
\r
10 content="A small paper that explains how hashing works with examples in Pascal."
\r
12 <META content="MSHTML 5.00.2314.1000" name=GENERATOR></HEAD>
\r
13 <BODY aLink=#ffff00 bgColor=#ffffff leftMargin=0 link=#000000 text=#000000
\r
14 topMargin=0 vLink=#000000>
\r
15 <TABLE border=0 cellPadding=4 cellSpacing=0 width="100%">
\r
18 <TD bgColor=#000000><FONT color=#ffffff face=verdana,arial,helvetica
\r
19 size=1><A href="http://www.cosy.sbg.ac.at/~ppalfrad/"><STRONG><FONT
\r
20 color=#ffffff>home</FONT></STRONG></A> | <STRONG><FONT
\r
21 color=#ffffff>Hashing</FONT></STRONG> </FONT></TD></TR>
\r
23 <TD bgColor=#777777><IMG alt="" height=1 src=""
\r
24 width=1></TD></TR></TBODY></TABLE>
\r
25 <TABLE border=0 cellPadding=4 cellSpacing=0 width="100%">
\r
29 <H1>Hashing</H1><FONT size=2>V 0.1</FONT>
\r
30 <P>Hashing is a method to store data in an array so that storing,
\r
31 searching, inserting and deleting data is fast (in theory it's O(1)). For
\r
32 this every record needs an unique key.
\r
33 <P>The basic idea is not to search for the correct position of a record
\r
34 with comparisons but to compute the position within the array. The
\r
35 function that returns the position is called the 'hash function' and the
\r
36 array is called a 'hash table'.
\r
37 <P>In our examples our key is an integer value as is the actual data.
\r
38 <P>[note that I use pascal syntax since this is easily readable by
\r
40 <P><!-- code2html add pas
46 --><!-- code2html delete start --><PRE><A name=1_line1></A>
\r
47 <A name=1_line2><STRONG>type</STRONG></A>
\r
48 <A name=1_line3> <FONT color=#993333>THash_Record</FONT> <FONT color=#4444ff>=</FONT> <STRONG>record</STRONG></A>
\r
49 <A name=1_line4> <FONT color=#993333>key</FONT> <FONT color=#4444ff>:</FONT> <FONT color=#993333>integer</FONT><FONT color=#4444ff>;</FONT></A>
\r
50 <A name=1_line5> <FONT color=#993333>data</FONT> <FONT color=#4444ff>:</FONT> <FONT color=#993333>integer</FONT><FONT color=#4444ff>;</FONT></A>
\r
51 <A name=1_line6> <STRONG>end</STRONG><FONT color=#4444ff>;</FONT></A>
\r
52 </PRE><PRE></PRE><!-- code2html delete stop -->
\r
53 <P>the hash table now looks like this:
\r
54 <P><!-- code2html add pas
56 HASH_LENGTH = 100; { the length of the hash table and so the }
57 { maximum number of records that can be stored }
60 THash_Table = array[0..HASH_LENGTH-1] of THash_Record;
61 --><!-- code2html delete start --><PRE><A name=2_line1></A>
\r
62 <A name=2_line2><STRONG>const</STRONG></A>
\r
63 <A name=2_line3> <FONT color=#993333>HASH_LENGTH</FONT> <FONT color=#4444ff>=</FONT> <FONT color=#ff0000>100</FONT><FONT color=#4444ff>;</FONT> <FONT color=#444444>{ the length of the hash table and so the }</FONT></A>
\r
64 <A name=2_line4> <FONT color=#444444>{ maximum number of records that can be stored }</FONT></A>
\r
65 <A name=2_line5></A>
\r
66 <A name=2_line6><STRONG>type</STRONG></A>
\r
67 <A name=2_line7> <FONT color=#993333>THash_Table</FONT> <FONT color=#4444ff>=</FONT> <STRONG>array</STRONG><FONT color=#4444ff>[</FONT><FONT color=#ff0000>0..</FONT><FONT color=#993333>HASH_LENGTH</FONT><FONT color=#4444ff>-</FONT><FONT color=#ff0000>1</FONT><FONT color=#4444ff>]</FONT> <STRONG>of</STRONG> <FONT color=#993333>THash_Record</FONT><FONT color=#4444ff>;</FONT></A>
\r
68 </PRE><PRE></PRE><!-- code2html delete stop -->
\r
69 <P>If we know that the key is in a small range we could use the key itself
\r
70 as an index (also called hash address) for our array. However this is very
\r
71 rarely the case so we have to find some kind of hash function. A very
\r
72 common and not so bad function is a simple MODulo function:
\r
73 <P><!-- code2html add pas
74 function Hash_Function(key : integer) : integer;
76 Hash_Function := key MOD HASH_LENGTH;
78 --><!-- code2html delete start --><PRE><A name=3_line1></A>
\r
79 <A name=3_line2><STRONG>function</STRONG> <FONT color=#993333>Hash_Function</FONT><FONT color=#4444ff>(</FONT><FONT color=#993333>key</FONT> <FONT color=#4444ff>:</FONT> <FONT color=#993333>integer</FONT><FONT color=#4444ff>)</FONT> <FONT color=#4444ff>:</FONT> <FONT color=#993333>integer</FONT><FONT color=#4444ff>;</FONT></A>
\r
80 <A name=3_line3><STRONG>begin</STRONG></A>
\r
81 <A name=3_line4> <FONT color=#993333>Hash_Function</FONT> <FONT color=#4444ff>:</FONT><FONT color=#4444ff>=</FONT> <FONT color=#993333>key</FONT> <FONT color=#993333>MOD</FONT> <FONT color=#993333>HASH_LENGTH</FONT><FONT color=#4444ff>;</FONT></A>
\r
82 <A name=3_line5><STRONG>end</STRONG><FONT color=#4444ff>;</FONT></A>
\r
83 </PRE><PRE></PRE><!-- code2html delete stop -->
\r
84 <P>If we now want to insert a record into the hash we could do it this
\r
86 <P><!-- code2html add pas
87 procedure Insert_Into_Hash( VAR hash : THash_table;
90 hash[ Hash_Function(rec.key) ] := rec;
92 --><!-- code2html delete start --><PRE><A name=4_line1></A>
\r
93 <A name=4_line2><STRONG>procedure</STRONG> <FONT color=#993333>Insert_Into_Hash</FONT><FONT color=#4444ff>(</FONT> <FONT color=#993333>VAR</FONT> <FONT color=#993333>hash</FONT> <FONT color=#4444ff>:</FONT> <FONT color=#993333>THash_table</FONT><FONT color=#4444ff>;</FONT></A>
\r
94 <A name=4_line3> <FONT color=#993333>rec</FONT> <FONT color=#4444ff>:</FONT> <FONT color=#993333>THash_record</FONT><FONT color=#4444ff>)</FONT><FONT color=#4444ff>;</FONT></A>
\r
95 <A name=4_line4><STRONG>begin</STRONG></A>
\r
96 <A name=4_line5> <FONT color=#993333>hash</FONT><FONT color=#4444ff>[</FONT> <FONT color=#993333>Hash_Function</FONT><FONT color=#4444ff>(</FONT><FONT color=#993333>rec<FONT color=#4444ff>.</FONT>key</FONT><FONT color=#4444ff>)</FONT> <FONT color=#4444ff>]</FONT> <FONT color=#4444ff>:</FONT><FONT color=#4444ff>=</FONT> <FONT color=#993333>rec</FONT><FONT color=#4444ff>;</FONT></A>
\r
97 <A name=4_line6><STRONG>end</STRONG><FONT color=#4444ff>;</FONT></A>
\r
98 </PRE><PRE></PRE><!-- code2html delete stop -->
\r
99 <P>But wait! What happens if two different keys return the same hash
\r
100 address from the hash function? Well if you have a good hash function this
\r
101 happens very rarely but it can and will happen. There are two ways to
\r
102 handle a so called 'hash collision'.
\r
105 <LI>collision handling within the hash table
\r
106 <LI>collision handling outside of the hash table </LI></UL>
\r
107 <H2>COLLISION HANDLING OUTSIDE OF THE HASH TABLE ...</H2>... means that
\r
108 our hash table no longer is an array of records but instead it is an array
\r
109 of pointers. Each pointer points to a linked list of the hash records.
\r
110 <P><!-- code2html add pas
112 PHash_Record = ^THash_Record; {a pointer type to the type THash_record }
113 THash_Record = record
116 next : PHash_Record; { the pointer to the next item in the list }
119 THash_Table = array[0..HASH_LENGTH-1] of PHash_Record;
120 --><!-- code2html delete start --><PRE><A name=5_line1></A>
\r
121 <A name=5_line2><STRONG>type</STRONG></A>
\r
122 <A name=5_line3> <FONT color=#993333>PHash_Record</FONT> <FONT color=#4444ff>=</FONT> ^<FONT color=#993333>THash_Record</FONT><FONT color=#4444ff>;</FONT> <FONT color=#444444>{a pointer type to the type THash_record }</FONT></A>
\r
123 <A name=5_line4> <FONT color=#993333>THash_Record</FONT> <FONT color=#4444ff>=</FONT> <STRONG>record</STRONG></A>
\r
124 <A name=5_line5> <FONT color=#993333>key</FONT> <FONT color=#4444ff>:</FONT> <FONT color=#993333>integer</FONT><FONT color=#4444ff>;</FONT></A>
\r
125 <A name=5_line6> <FONT color=#993333>data</FONT> <FONT color=#4444ff>:</FONT> <FONT color=#993333>integer</FONT><FONT color=#4444ff>;</FONT></A>
\r
126 <A name=5_line7> <FONT color=#993333>next</FONT> <FONT color=#4444ff>:</FONT> <FONT color=#993333>PHash_Record</FONT><FONT color=#4444ff>;</FONT> <FONT color=#444444>{ the pointer to the next item in the list }</FONT></A>
\r
127 <A name=5_line8> <STRONG>end</STRONG><FONT color=#4444ff>;</FONT></A>
\r
128 <A name=5_line9></A>
\r
129 <A name=5_line10> <FONT color=#993333>THash_Table</FONT> <FONT color=#4444ff>=</FONT> <STRONG>array</STRONG><FONT color=#4444ff>[</FONT><FONT color=#ff0000>0..</FONT><FONT color=#993333>HASH_LENGTH</FONT><FONT color=#4444ff>-</FONT><FONT color=#ff0000>1</FONT><FONT color=#4444ff>]</FONT> <STRONG>of</STRONG> <FONT color=#993333>PHash_Record</FONT><FONT color=#4444ff>;</FONT></A>
\r
130 </PRE><PRE></PRE><!-- code2html delete stop -->
\r
131 <P>If now a key return the same 'hash address' like an already stored item
\r
132 we simply insert the record into the list of this hash entry.
\r
133 <P><!-- code2html add pas
134 procedure Insert_Into_Hash( VAR hash : THash_Table;
137 hash_address : integer;
140 hash_address := Hash_Function(rec.key);
144 item^.next = hash[ hash_address ];
145 hash[ hash_address ] := item;
147 --><!-- code2html delete start --><PRE><A name=6_line1></A>
\r
148 <A name=6_line2><STRONG>procedure</STRONG> <FONT color=#993333>Insert_Into_Hash</FONT><FONT color=#4444ff>(</FONT> <FONT color=#993333>VAR</FONT> <FONT color=#993333>hash</FONT> <FONT color=#4444ff>:</FONT> <FONT color=#993333>THash_Table</FONT><FONT color=#4444ff>;</FONT></A>
\r
149 <A name=6_line3> <FONT color=#993333>rec</FONT> <FONT color=#4444ff>:</FONT> <FONT color=#993333>THash_Record</FONT><FONT color=#4444ff>)</FONT><FONT color=#4444ff>;</FONT></A>
\r
150 <A name=6_line4><FONT color=#993333>VAR</FONT></A>
\r
151 <A name=6_line5> <FONT color=#993333>hash_address</FONT> <FONT color=#4444ff>:</FONT> <FONT color=#993333>integer</FONT><FONT color=#4444ff>;</FONT></A>
\r
152 <A name=6_line6> <FONT color=#993333>item</FONT> <FONT color=#4444ff>:</FONT> <FONT color=#993333>PHash_record</FONT><FONT color=#4444ff>;</FONT></A>
\r
153 <A name=6_line7><STRONG>begin</STRONG></A>
\r
154 <A name=6_line8> <FONT color=#993333>hash_address</FONT> <FONT color=#4444ff>:</FONT><FONT color=#4444ff>=</FONT> <FONT color=#993333>Hash_Function</FONT><FONT color=#4444ff>(</FONT><FONT color=#993333>rec<FONT color=#4444ff>.</FONT>key</FONT><FONT color=#4444ff>)</FONT><FONT color=#4444ff>;</FONT></A>
\r
155 <A name=6_line9></A>
\r
156 <A name=6_line10> <FONT color=#993333>new</FONT><FONT color=#4444ff>(</FONT><FONT color=#993333>item</FONT><FONT color=#4444ff>)</FONT><FONT color=#4444ff>;</FONT></A>
\r
157 <A name=6_line11> <FONT color=#993333>item</FONT>^ <FONT color=#4444ff>:</FONT><FONT color=#4444ff>=</FONT> <FONT color=#993333>rec</FONT><FONT color=#4444ff>;</FONT></A>
\r
158 <A name=6_line12> <FONT color=#993333>item<FONT color=#4444ff>^.</FONT>next</FONT> <FONT color=#4444ff>=</FONT> <FONT color=#993333>hash</FONT><FONT color=#4444ff>[</FONT> <FONT color=#993333>hash_address</FONT> <FONT color=#4444ff>]</FONT><FONT color=#4444ff>;</FONT></A>
\r
159 <A name=6_line13> <FONT color=#993333>hash</FONT><FONT color=#4444ff>[</FONT> <FONT color=#993333>hash_address</FONT> <FONT color=#4444ff>]</FONT> <FONT color=#4444ff>:</FONT><FONT color=#4444ff>=</FONT> <FONT color=#993333>item</FONT><FONT color=#4444ff>;</FONT></A>
\r
160 <A name=6_line14><STRONG>end</STRONG><FONT color=#4444ff>;</FONT></A>
\r
161 </PRE><PRE></PRE><!-- code2html delete stop -->
\r
162 <P>To get an item out of the hash you can first determine it's hash
\r
163 address with the hash function and then use a classical linear search to
\r
164 find the item you want. If you keep the list sorted you could speed read
\r
165 access up for some cost at inserting items.
\r
166 <P><!-- code2html add pas
167 procedure Get_From_Hash( hash : THash_Table;
169 VAR rec : THash_Record);
171 hash_address : integer;
174 hash_address := Hash_Function(rec.key);
176 item := hash[ hash_address ];
177 while (item <> nil) and (item^.key <> key) do { make sure you have short }
178 item := item^.next; { curcuit bool evaluation, so }
179 { that the second comparison }
180 { is never made if item is nil}
184 { ITEM NOT IN LIST! }
186 --><!-- code2html delete start --><PRE><A name=7_line1></A>
\r
187 <A name=7_line2><STRONG>procedure</STRONG> <FONT color=#993333>Get_From_Hash</FONT><FONT color=#4444ff>(</FONT> <FONT color=#993333>hash</FONT> <FONT color=#4444ff>:</FONT> <FONT color=#993333>THash_Table</FONT><FONT color=#4444ff>;</FONT></A>
\r
188 <A name=7_line3> <FONT color=#993333>key</FONT> <FONT color=#4444ff>:</FONT> <FONT color=#993333>integer</FONT><FONT color=#4444ff>;</FONT></A>
\r
189 <A name=7_line4> <FONT color=#993333>VAR</FONT> <FONT color=#993333>rec</FONT> <FONT color=#4444ff>:</FONT> <FONT color=#993333>THash_Record</FONT><FONT color=#4444ff>)</FONT><FONT color=#4444ff>;</FONT></A>
\r
190 <A name=7_line5><FONT color=#993333>VAR</FONT></A>
\r
191 <A name=7_line6> <FONT color=#993333>hash_address</FONT> <FONT color=#4444ff>:</FONT> <FONT color=#993333>integer</FONT><FONT color=#4444ff>;</FONT></A>
\r
192 <A name=7_line7> <FONT color=#993333>item</FONT> <FONT color=#4444ff>:</FONT> <FONT color=#993333>PHash_record</FONT><FONT color=#4444ff>;</FONT></A>
\r
193 <A name=7_line8><STRONG>begin</STRONG></A>
\r
194 <A name=7_line9> <FONT color=#993333>hash_address</FONT> <FONT color=#4444ff>:</FONT><FONT color=#4444ff>=</FONT> <FONT color=#993333>Hash_Function</FONT><FONT color=#4444ff>(</FONT><FONT color=#993333>rec<FONT color=#4444ff>.</FONT>key</FONT><FONT color=#4444ff>)</FONT><FONT color=#4444ff>;</FONT></A>
\r
195 <A name=7_line10></A>
\r
196 <A name=7_line11> <FONT color=#993333>item</FONT> <FONT color=#4444ff>:</FONT><FONT color=#4444ff>=</FONT> <FONT color=#993333>hash</FONT><FONT color=#4444ff>[</FONT> <FONT color=#993333>hash_address</FONT> <FONT color=#4444ff>]</FONT><FONT color=#4444ff>;</FONT></A>
\r
197 <A name=7_line12> <STRONG>while</STRONG> <FONT color=#4444ff>(</FONT><FONT color=#993333>item</FONT> <FONT color=#4444ff><</FONT><FONT color=#4444ff>></FONT> <STRONG>nil</STRONG><FONT color=#4444ff>)</FONT> <STRONG>and</STRONG> <FONT color=#4444ff>(</FONT><FONT color=#993333>item<FONT color=#4444ff>^.</FONT>key</FONT> <FONT color=#4444ff><</FONT><FONT color=#4444ff>></FONT> <FONT color=#993333>key</FONT><FONT color=#4444ff>)</FONT> <STRONG>do</STRONG> <FONT color=#444444>{ make sure you have short }</FONT></A>
\r
198 <A name=7_line13> <FONT color=#993333>item</FONT> <FONT color=#4444ff>:</FONT><FONT color=#4444ff>=</FONT> <FONT color=#993333>item<FONT color=#4444ff>^.</FONT>next</FONT><FONT color=#4444ff>;</FONT> <FONT color=#444444>{ curcuit bool evaluation, so }</FONT></A>
\r
199 <A name=7_line14> <FONT color=#444444>{ that the second comparison }</FONT></A>
\r
200 <A name=7_line15> <FONT color=#444444>{ is never made if item is nil}</FONT></A>
\r
201 <A name=7_line16> <STRONG>if</STRONG> <FONT color=#4444ff>(</FONT><FONT color=#993333>item</FONT><FONT color=#4444ff><</FONT><FONT color=#4444ff>></FONT><STRONG>nil</STRONG><FONT color=#4444ff>)</FONT> <STRONG>then</STRONG></A>
\r
202 <A name=7_line17> <FONT color=#993333>rec</FONT> <FONT color=#4444ff>:</FONT><FONT color=#4444ff>=</FONT> <FONT color=#993333>item</FONT>^</A>
\r
203 <A name=7_line18> <STRONG>else</STRONG></A>
\r
204 <A name=7_line19> <FONT color=#444444>{ ITEM NOT IN LIST! }</FONT></A>
\r
205 <A name=7_line20><STRONG>end</STRONG><FONT color=#4444ff>;</FONT></A>
\r
206 </PRE><PRE></PRE><!-- code2html delete stop -->
\r
207 <P>If you want to delete an item, you simply remove it from the linked
\r
209 <P><!-- code2html add pas
210 procedure Remove_From_Hash( VAR hash : THash_Table;
213 hash_address : integer;
217 hash_address := Hash_Function(rec.key);
219 item := hash[ hash_address ];
220 if (item<>nil) and (item^.key = key) then { make sure you have short }
221 begin { curcuit bool evaluation, so }
222 hash[ hash_address ] := item^.next { that the second comparison }
223 dispose(item); { deallocate memory } { is never made if item is nil}
227 while (item^.next <> nil) and
228 (item^.next^.key <> key) do
231 if (item^.next <> nil) then
234 item^.next := tmp^.next;
235 dispose(tmp); { deallocate memory }
238 { ITEM NOT IN LIST! }
241 --><!-- code2html delete start --><PRE><A name=8_line1></A>
\r
242 <A name=8_line2><STRONG>procedure</STRONG> <FONT color=#993333>Remove_From_Hash</FONT><FONT color=#4444ff>(</FONT> <FONT color=#993333>VAR</FONT> <FONT color=#993333>hash</FONT> <FONT color=#4444ff>:</FONT> <FONT color=#993333>THash_Table</FONT><FONT color=#4444ff>;</FONT></A>
\r
243 <A name=8_line3> <FONT color=#993333>key</FONT> <FONT color=#4444ff>:</FONT> <FONT color=#993333>integer</FONT><FONT color=#4444ff>)</FONT><FONT color=#4444ff>;</FONT></A>
\r
244 <A name=8_line4><FONT color=#993333>VAR</FONT></A>
\r
245 <A name=8_line5> <FONT color=#993333>hash_address</FONT> <FONT color=#4444ff>:</FONT> <FONT color=#993333>integer</FONT><FONT color=#4444ff>;</FONT></A>
\r
246 <A name=8_line6> <FONT color=#993333>item</FONT> <FONT color=#4444ff>:</FONT> <FONT color=#993333>PHash_record</FONT><FONT color=#4444ff>;</FONT></A>
\r
247 <A name=8_line7> <FONT color=#993333>tmp</FONT> <FONT color=#4444ff>:</FONT> <FONT color=#993333>PHash_record</FONT><FONT color=#4444ff>;</FONT></A>
\r
248 <A name=8_line8><STRONG>begin</STRONG></A>
\r
249 <A name=8_line9> <FONT color=#993333>hash_address</FONT> <FONT color=#4444ff>:</FONT><FONT color=#4444ff>=</FONT> <FONT color=#993333>Hash_Function</FONT><FONT color=#4444ff>(</FONT><FONT color=#993333>rec<FONT color=#4444ff>.</FONT>key</FONT><FONT color=#4444ff>)</FONT><FONT color=#4444ff>;</FONT></A>
\r
250 <A name=8_line10></A>
\r
251 <A name=8_line11> <FONT color=#993333>item</FONT> <FONT color=#4444ff>:</FONT><FONT color=#4444ff>=</FONT> <FONT color=#993333>hash</FONT><FONT color=#4444ff>[</FONT> <FONT color=#993333>hash_address</FONT> <FONT color=#4444ff>]</FONT><FONT color=#4444ff>;</FONT></A>
\r
252 <A name=8_line12> <STRONG>if</STRONG> <FONT color=#4444ff>(</FONT><FONT color=#993333>item</FONT><FONT color=#4444ff><</FONT><FONT color=#4444ff>></FONT><STRONG>nil</STRONG><FONT color=#4444ff>)</FONT> <STRONG>and</STRONG> <FONT color=#4444ff>(</FONT><FONT color=#993333>item<FONT color=#4444ff>^.</FONT>key</FONT> <FONT color=#4444ff>=</FONT> <FONT color=#993333>key</FONT><FONT color=#4444ff>)</FONT> <STRONG>then</STRONG> <FONT color=#444444>{ make sure you have short }</FONT></A>
\r
253 <A name=8_line13> <STRONG>begin</STRONG> <FONT color=#444444>{ curcuit bool evaluation, so }</FONT></A>
\r
254 <A name=8_line14> <FONT color=#993333>hash</FONT><FONT color=#4444ff>[</FONT> <FONT color=#993333>hash_address</FONT> <FONT color=#4444ff>]</FONT> <FONT color=#4444ff>:</FONT><FONT color=#4444ff>=</FONT> <FONT color=#993333>item<FONT color=#4444ff>^.</FONT>next</FONT> <FONT color=#444444>{ that the second comparison }</FONT></A>
\r
255 <A name=8_line15> <FONT color=#993333>dispose</FONT><FONT color=#4444ff>(</FONT><FONT color=#993333>item</FONT><FONT color=#4444ff>)</FONT><FONT color=#4444ff>;</FONT> <FONT color=#444444>{ deallocate memory }</FONT> <FONT color=#444444>{ is never made if item is nil}</FONT></A>
\r
256 <A name=8_line16> <STRONG>end</STRONG></A>
\r
257 <A name=8_line17> <STRONG>else</STRONG></A>
\r
258 <A name=8_line18> <STRONG>begin</STRONG></A>
\r
259 <A name=8_line19> <STRONG>while</STRONG> <FONT color=#4444ff>(</FONT><FONT color=#993333>item<FONT color=#4444ff>^.</FONT>next</FONT> <FONT color=#4444ff><</FONT><FONT color=#4444ff>></FONT> <STRONG>nil</STRONG><FONT color=#4444ff>)</FONT> <STRONG>and</STRONG></A>
\r
260 <A name=8_line20> <FONT color=#4444ff>(</FONT><FONT color=#993333>item<FONT color=#4444ff>^.</FONT>next<FONT color=#4444ff>^.</FONT>key</FONT> <FONT color=#4444ff><</FONT><FONT color=#4444ff>></FONT> <FONT color=#993333>key</FONT><FONT color=#4444ff>)</FONT> <STRONG>do</STRONG></A>
\r
261 <A name=8_line21> <FONT color=#993333>item</FONT> <FONT color=#4444ff>:</FONT><FONT color=#4444ff>=</FONT> <FONT color=#993333>item<FONT color=#4444ff>^.</FONT>next</FONT><FONT color=#4444ff>;</FONT></A>
\r
262 <A name=8_line22></A>
\r
263 <A name=8_line23> <STRONG>if</STRONG> <FONT color=#4444ff>(</FONT><FONT color=#993333>item<FONT color=#4444ff>^.</FONT>next</FONT> <FONT color=#4444ff><</FONT><FONT color=#4444ff>></FONT> <STRONG>nil</STRONG><FONT color=#4444ff>)</FONT> <STRONG>then</STRONG></A>
\r
264 <A name=8_line24> <STRONG>begin</STRONG></A>
\r
265 <A name=8_line25> <FONT color=#993333>tmp</FONT> <FONT color=#4444ff>:</FONT><FONT color=#4444ff>=</FONT> <FONT color=#993333>item<FONT color=#4444ff>^.</FONT>next</FONT><FONT color=#4444ff>;</FONT></A>
\r
266 <A name=8_line26> <FONT color=#993333>item<FONT color=#4444ff>^.</FONT>next</FONT> <FONT color=#4444ff>:</FONT><FONT color=#4444ff>=</FONT> <FONT color=#993333>tmp<FONT color=#4444ff>^.</FONT>next</FONT><FONT color=#4444ff>;</FONT></A>
\r
267 <A name=8_line27> <FONT color=#993333>dispose</FONT><FONT color=#4444ff>(</FONT><FONT color=#993333>tmp</FONT><FONT color=#4444ff>)</FONT><FONT color=#4444ff>;</FONT> <FONT color=#444444>{ deallocate memory }</FONT></A>
\r
268 <A name=8_line28> <STRONG>end</STRONG></A>
\r
269 <A name=8_line29> <STRONG>else</STRONG></A>
\r
270 <A name=8_line30> <FONT color=#444444>{ ITEM NOT IN LIST! }</FONT></A>
\r
271 <A name=8_line31> <STRONG>end</STRONG><FONT color=#4444ff>;</FONT></A>
\r
272 <A name=8_line32><STRONG>end</STRONG><FONT color=#4444ff>;</FONT></A>
\r
273 </PRE><PRE></PRE><!-- code2html delete stop -->
\r
274 <H2>COLLISION HANDLING WITHIN THE HASH TABLE...</H2>... is totally
\r
275 different. Instead of having a list which can grow up to an arbitrary size
\r
276 all the data is stored in the hash table itself. This means that we can
\r
277 never under any circumstance store more records in the hash as our hash is
\r
278 long. This is obvious but I wanted to point it out.
\r
280 <LI>So what to do if the position we want to save our new record to is
\r
282 <LI>And how do we know that it is occupied anyway? </LI></UL>
\r
283 <P>Let's first solve the latter question since it is quite simply. We have
\r
284 to store wheter a certain hash element is occupied. We do this by changing
\r
285 the THash_Record record: <!-- code2html add pas
287 THash_Occupied = (ho_empty, ho_occupied);
289 THash_Record = record
290 occupied : THash_Occupied;
294 --><!-- code2html delete start --><PRE><A name=9_line1></A>
\r
295 <A name=9_line2><STRONG>type</STRONG></A>
\r
296 <A name=9_line3> <FONT color=#993333>THash_Occupied</FONT> <FONT color=#4444ff>=</FONT> <FONT color=#4444ff>(</FONT><FONT color=#993333>ho_empty</FONT>, <FONT color=#993333>ho_occupied</FONT><FONT color=#4444ff>)</FONT><FONT color=#4444ff>;</FONT></A>
\r
297 <A name=9_line4> </A>
\r
298 <A name=9_line5> <FONT color=#993333>THash_Record</FONT> <FONT color=#4444ff>=</FONT> <STRONG>record</STRONG></A>
\r
299 <A name=9_line6> <FONT color=#993333>occupied</FONT> <FONT color=#4444ff>:</FONT> <FONT color=#993333>THash_Occupied</FONT><FONT color=#4444ff>;</FONT></A>
\r
300 <A name=9_line7> <FONT color=#993333>key</FONT> <FONT color=#4444ff>:</FONT> <FONT color=#993333>integer</FONT><FONT color=#4444ff>;</FONT></A>
\r
301 <A name=9_line8> <FONT color=#993333>data</FONT> <FONT color=#4444ff>:</FONT> <FONT color=#993333>integer</FONT><FONT color=#4444ff>;</FONT></A>
\r
302 <A name=9_line9> <STRONG>end</STRONG><FONT color=#4444ff>;</FONT></A>
\r
303 </PRE><PRE></PRE><!-- code2html delete stop -->
\r
304 <P>So now to the former question: What to do if an address is already
\r
305 occupied. The answer is straightforward: The have to store it at a
\r
306 different address :) There are several methods to find a new, empty
\r
307 address. Note that this address should have something to do with the key
\r
308 since we want to be able to find it easily later.
\r
309 <H3>linear probing:</H3>We add a constant number (usually 1) to the
\r
310 position the hash_function returns until we have found an empty place: <!-- code2html add pas
311 pos0 := Hash_Function(key);
314 hash_address := (pos0 + i) MOD HASH_LENGTH;
316 until Hash[ hash_address].occupied <> ho_occupied;
317 --><!-- code2html delete start --><PRE><A name=10_line1></A>
\r
318 <A name=10_line2> <FONT color=#993333>pos0</FONT> <FONT color=#4444ff>:</FONT><FONT color=#4444ff>=</FONT> <FONT color=#993333>Hash_Function</FONT><FONT color=#4444ff>(</FONT><FONT color=#993333>key</FONT><FONT color=#4444ff>)</FONT><FONT color=#4444ff>;</FONT></A>
\r
319 <A name=10_line3> <FONT color=#993333>i</FONT> <FONT color=#4444ff>:</FONT><FONT color=#4444ff>=</FONT> <FONT color=#ff0000>0</FONT><FONT color=#4444ff>;</FONT></A>
\r
320 <A name=10_line4> <STRONG>repeat</STRONG></A>
\r
321 <A name=10_line5> <FONT color=#993333>hash_address</FONT> <FONT color=#4444ff>:</FONT><FONT color=#4444ff>=</FONT> <FONT color=#4444ff>(</FONT><FONT color=#993333>pos0</FONT> <FONT color=#4444ff>+</FONT> <FONT color=#993333>i</FONT><FONT color=#4444ff>)</FONT> <FONT color=#993333>MOD</FONT> <FONT color=#993333>HASH_LENGTH</FONT><FONT color=#4444ff>;</FONT></A>
\r
322 <A name=10_line6> <FONT color=#993333>inc</FONT><FONT color=#4444ff>(</FONT><FONT color=#993333>i</FONT><FONT color=#4444ff>)</FONT><FONT color=#4444ff>;</FONT></A>
\r
323 <A name=10_line7> <STRONG>until</STRONG> <FONT color=#993333>Hash</FONT><FONT color=#4444ff>[</FONT> <FONT color=#993333>hash_address</FONT><FONT color=#4444ff>]</FONT>.<FONT color=#993333>occupied</FONT> <FONT color=#4444ff><</FONT><FONT color=#4444ff>></FONT> <FONT color=#993333>ho_occupied</FONT><FONT color=#4444ff>;</FONT></A>
\r
324 </PRE><PRE></PRE><!-- code2html delete stop -->
\r
325 <H3>sqared probing:</H3>is very similar to linear probing but we do not
\r
326 simply add the increasing value but we add it's square. This is usually
\r
327 better than linear probing: <!-- code2html add pas
328 pos0 := Hash_Function(key);
331 hash_address := (pos0 + i*i) MOD HASH_LENGTH;
333 until Hash[ hash_address].occupied <> ho_occupied;
334 --><!-- code2html delete start --><PRE><A name=11_line1></A>
\r
335 <A name=11_line2> <FONT color=#993333>pos0</FONT> <FONT color=#4444ff>:</FONT><FONT color=#4444ff>=</FONT> <FONT color=#993333>Hash_Function</FONT><FONT color=#4444ff>(</FONT><FONT color=#993333>key</FONT><FONT color=#4444ff>)</FONT><FONT color=#4444ff>;</FONT></A>
\r
336 <A name=11_line3> <FONT color=#993333>i</FONT> <FONT color=#4444ff>:</FONT><FONT color=#4444ff>=</FONT> <FONT color=#ff0000>0</FONT><FONT color=#4444ff>;</FONT></A>
\r
337 <A name=11_line4> <STRONG>repeat</STRONG></A>
\r
338 <A name=11_line5> <FONT color=#993333>hash_address</FONT> <FONT color=#4444ff>:</FONT><FONT color=#4444ff>=</FONT> <FONT color=#4444ff>(</FONT><FONT color=#993333>pos0</FONT> <FONT color=#4444ff>+</FONT> <FONT color=#993333>i</FONT><FONT color=#4444ff>*</FONT><FONT color=#993333>i</FONT><FONT color=#4444ff>)</FONT> <FONT color=#993333>MOD</FONT> <FONT color=#993333>HASH_LENGTH</FONT><FONT color=#4444ff>;</FONT></A>
\r
339 <A name=11_line6> <FONT color=#993333>inc</FONT><FONT color=#4444ff>(</FONT><FONT color=#993333>i</FONT><FONT color=#4444ff>)</FONT><FONT color=#4444ff>;</FONT></A>
\r
340 <A name=11_line7> <STRONG>until</STRONG> <FONT color=#993333>Hash</FONT><FONT color=#4444ff>[</FONT> <FONT color=#993333>hash_address</FONT><FONT color=#4444ff>]</FONT>.<FONT color=#993333>occupied</FONT> <FONT color=#4444ff><</FONT><FONT color=#4444ff>></FONT> <FONT color=#993333>ho_occupied</FONT><FONT color=#4444ff>;</FONT></A>
\r
341 </PRE><PRE></PRE><!-- code2html delete stop -->
\r
342 <P>One disadvantage of all those probing methods is that they tend to
\r
343 build clusters aroung already occupied areas of the hash table. A soluton
\r
344 to this problem is
\r
345 <H3>double hashing:</H3>The idea is to add a value to the hash address
\r
346 that depends on the key of the item, so that the probing order is spezific
\r
347 to the key: <!-- code2html add pas
348 hash_address := Hash_Function(key);
349 while Hash[ hash_address].occupied = ho_occupied do
350 hash_address := (hash_address + Hash_Function(key)) MOD HASH_LENGTH;
352 double_hashing := hash_address;
353 --><!-- code2html delete start --><PRE><A name=12_line1></A>
\r
354 <A name=12_line2> <FONT color=#993333>hash_address</FONT> <FONT color=#4444ff>:</FONT><FONT color=#4444ff>=</FONT> <FONT color=#993333>Hash_Function</FONT><FONT color=#4444ff>(</FONT><FONT color=#993333>key</FONT><FONT color=#4444ff>)</FONT><FONT color=#4444ff>;</FONT></A>
\r
355 <A name=12_line3> <STRONG>while</STRONG> <FONT color=#993333>Hash</FONT><FONT color=#4444ff>[</FONT> <FONT color=#993333>hash_address</FONT><FONT color=#4444ff>]</FONT>.<FONT color=#993333>occupied</FONT> <FONT color=#4444ff>=</FONT> <FONT color=#993333>ho_occupied</FONT> <STRONG>do</STRONG></A>
\r
356 <A name=12_line4> <FONT color=#993333>hash_address</FONT> <FONT color=#4444ff>:</FONT><FONT color=#4444ff>=</FONT> <FONT color=#4444ff>(</FONT><FONT color=#993333>hash_address</FONT> <FONT color=#4444ff>+</FONT> <FONT color=#993333>Hash_Function</FONT><FONT color=#4444ff>(</FONT><FONT color=#993333>key</FONT><FONT color=#4444ff>)</FONT><FONT color=#4444ff>)</FONT> <FONT color=#993333>MOD</FONT> <FONT color=#993333>HASH_LENGTH</FONT><FONT color=#4444ff>;</FONT></A>
\r
357 <A name=12_line5> </A>
\r
358 <A name=12_line6> <FONT color=#993333>double_hashing</FONT> <FONT color=#4444ff>:</FONT><FONT color=#4444ff>=</FONT> <FONT color=#993333>hash_address</FONT><FONT color=#4444ff>;</FONT></A>
\r
359 </PRE><PRE></PRE><!-- code2html delete stop -->
\r
360 <P>Hash_Function(rec.key) and HASH_LENGTH should have a greatest common
\r
361 factor of 1 so that the complete hash table is probed. This is guaranteed
\r
362 if HASH_LENGTH is a prime number.
\r
364 <P>So inserting an item looks like this: <!-- code2html add pas
365 procedure Insert_Into_Hash( VAR hash : THash_Table;
371 <get the hash address, probing for an empty entry>
372 hash[ hash_address ] := rec;
373 rec.occupied := ho_occupied;
375 --><!-- code2html delete start --><PRE><A name=13_line1></A>
\r
376 <A name=13_line2><STRONG>procedure</STRONG> <FONT color=#993333>Insert_Into_Hash</FONT><FONT color=#4444ff>(</FONT> <FONT color=#993333>VAR</FONT> <FONT color=#993333>hash</FONT> <FONT color=#4444ff>:</FONT> <FONT color=#993333>THash_Table</FONT><FONT color=#4444ff>;</FONT></A>
\r
377 <A name=13_line3> <FONT color=#993333>rec</FONT> <FONT color=#4444ff>:</FONT> <FONT color=#993333>THash_Record</FONT><FONT color=#4444ff>)</FONT><FONT color=#4444ff>;</FONT></A>
\r
378 <A name=13_line4><FONT color=#993333>VAR</FONT></A>
\r
379 <A name=13_line5> <FONT color=#993333>hash_address</FONT><FONT color=#4444ff>;</FONT></A>
\r
380 <A name=13_line6><STRONG>begin</STRONG></A>
\r
381 <A name=13_line7> </A>
\r
382 <A name=13_line8> <FONT color=#4444ff><</FONT><FONT color=#993333>get</FONT> <FONT color=#993333>the</FONT> <FONT color=#993333>hash</FONT> <FONT color=#993333>address</FONT>, <FONT color=#993333>probing</FONT> <STRONG>for</STRONG> <FONT color=#993333>an</FONT> <FONT color=#993333>empty</FONT> <FONT color=#993333>entry</FONT><FONT color=#4444ff>></FONT></A>
\r
383 <A name=13_line9> <FONT color=#993333>hash</FONT><FONT color=#4444ff>[</FONT> <FONT color=#993333>hash_address</FONT> <FONT color=#4444ff>]</FONT> <FONT color=#4444ff>:</FONT><FONT color=#4444ff>=</FONT> <FONT color=#993333>rec</FONT><FONT color=#4444ff>;</FONT></A>
\r
384 <A name=13_line10> <FONT color=#993333>rec<FONT color=#4444ff>.</FONT>occupied</FONT> <FONT color=#4444ff>:</FONT><FONT color=#4444ff>=</FONT> <FONT color=#993333>ho_occupied</FONT><FONT color=#4444ff>;</FONT></A>
\r
385 <A name=13_line11><STRONG>end</STRONG><FONT color=#4444ff>;</FONT></A>
\r
386 </PRE><PRE></PRE><!-- code2html delete stop -->
\r
387 <P>To search for an item, you have to first get the hash_address and then
\r
388 probe until you either find an empty entry or the searched entry.
\r
389 <P>Deleting an item is rather simple: Search for the address like when you
\r
390 search the item and set the occupied value to ho_occupied. But wait! This
\r
391 would break our searching, since we search until we either find the object
\r
392 or an empty entry.
\r
393 <P>The solution is to modify THash_Occupied a bit so that it reads: <!-- code2html add pas
394 THash_Occupied = (ho_empty, ho_occupied, ho_deleted);
395 --><!-- code2html delete start --><PRE><A name=14_line1></A>
\r
396 <A name=14_line2> <FONT color=#993333>THash_Occupied</FONT> <FONT color=#4444ff>=</FONT> <FONT color=#4444ff>(</FONT><FONT color=#993333>ho_empty</FONT>, <FONT color=#993333>ho_occupied</FONT>, <FONT color=#993333>ho_deleted</FONT><FONT color=#4444ff>)</FONT><FONT color=#4444ff>;</FONT></A>
\r
397 </PRE><PRE></PRE><!-- code2html delete stop -->
\r
398 <P>If we delete an item we set the occupied state to ho_deleted. When
\r
399 inserting Items we treat ho_deleted like it was ho_empty and when
\r
400 searching for an item we treat as if it was ho_occupied.
\r
406 <P>If you have questions, feel free to contact me! </P></TD></TR>
\r
412 <TD><FONT face=verdana,arial,helvetica size=1><!-- hhmts start -->Last
\r
413 modified: Sun Aug 1 20:26:14 CEST 1999 <!-- hhmts end -->by <A
\r
414 href="mailto:weasel@netalive.org">Peter Palfrader</A></FONT>
\r
415 <BR></TD></TR></TBODY></TABLE></BODY></HTML>
\r