Lecture 10 & 11
HASHING
Course Supervisor: Syeda Nazia Ashraf
Data Structures & Algorithm
CSC-102
MOTIVATION
Linear Search
• Simplest Algorithm to search for a specific
target key in a data collection.
• Examines each element
• Search time grows linearly with the collection size, O(n):
searching an array of 100 elements takes about 10 times longer
than searching a 10-element array.
MOTIVATION
Binary Search
• Requires the elements to be in sorted order.
• Search time depends on the logarithm of the
collection size O(log n).
• Takes only about twice as long to search for
an element in an array of 100 elements as
compared to the 10 element array, since log 100 = 2 log 10.
MOTIVATION
Conclusion
• The time taken for a search using each of
these methods depends on the size of the
collection.
• Hash data structures – Allow the storage and
retrieval of data in an average time which
does not depend at all on the collection size.
HASHING
• Hashing is the transformation of a string
of characters into a usually shorter fixed-
length value or key that represents the
original string.
• Hashing is used to index and retrieve items in
a database because it is faster to find the item
using the shorter hashed key than to find it
using the original value.
HASHING
Hash Tables (Hash Maps)
• One of the simplest data structures for fast storage and retrieval.
• Hash Function – Basis of Hash Tables.
Hash Functions
• A hash function is any function that can be
used to map data of arbitrary size to data of
fixed size.
HASHING
Hashes
• The values returned by a hash function are
called hash values, hash codes, hash sums, or
simply hashes.
• Hash values are used to determine the
location in the table for the given element.
HASHING
IS THERE ANY PARAMETER FOR A GOOD
HASH FUNCTION?
• A good hash function is one that
distributes the keys fairly evenly across the
hash table.
POPULAR HASH FUNCTIONS
1. Division Method
• A key (given element) is mapped into one of m
slots using the function.
h(k) = k mod m
where m is the size of the table, usually
chosen to be a prime number, and k is the key.
Different types of hash functions are used for the mapping
of keys into tables.
(a) Division Method
(b) Mid-square Method
(c) Folding Method
1. Division Method
• Choose a number m larger than the number n of keys
in K
• The number m is usually chosen to be a prime number or
a number without small divisors
• The hash function H is defined as
H(k) = k (mod m) or H(k) = k (mod m) + 1
• Here k (mod m) denotes the remainder when k is divided by m
• The 2nd formula is used when the addresses range from 1 to m.
• Example:
Elements are: 3205, 7148, 2345
Table addresses: 0 – 99
m = 97 (prime number close to 99)
H(k) = k (mod m), e.g., 3205 mod 97 = 4
H(3205)= 4, H(7148)=67, H(2345)=17
• For 2nd formula add 1 into the remainders.
• H(k)=k(mod m)+1 to obtain:
• H(3205)= 4+1=5, H(7148)=67+1=68,
H(2345)=17+1=18
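DIVISION METHOD
Hash table (addresses 0 – 99) using H(k) = k (mod m):
  address 4: 3205
  address 17: 2345
  address 67: 7148
Hash table (addresses 0 – 99) using H(k) = k (mod m) + 1:
  address 5: 3205
  address 18: 2345
  address 68: 7148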
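The division method example can be checked with a short Python sketch. The function name division_hash and its one_based flag are illustrative choices, not part of the lecture; m = 97 and the three keys come from the example.

```python
def division_hash(key: int, m: int, one_based: bool = False) -> int:
    """Division method: the remainder of key divided by m.

    With one_based=True the result is shifted to the range 1..m,
    matching the second formula H(k) = k (mod m) + 1.
    """
    address = key % m
    return address + 1 if one_based else address


M = 97                      # prime chosen in the example
keys = [3205, 7148, 2345]   # keys from the example

print([division_hash(k, M) for k in keys])                  # [4, 67, 17]
print([division_hash(k, M, one_based=True) for k in keys])  # [5, 68, 18]
```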
POPULAR HASH FUNCTIONS Contd…
2. Folding Method
• The key is partitioned into a number of parts k1,
k2, k3, …, kn,
• where each part, except possibly the last,
has the same number of digits as the required
hash address.
• Then the parts are added together, ignoring the
last carry. That is,
h(k) = k1 + k2 + k3 + … + kn
• Sometimes the even-numbered parts (k2, k4, …)
are reversed before adding.
Folding Method
• Here we are dealing with a hash table with
index from 00 to 99, i.e., two-digit hash table
• So we divide the key K into parts of two digits each:
H(7148) = 71 + 48 = 119; here we eliminate the
leading carry (i.e., 1), so H(7148) = 19
Folding Method
• Sometimes, for extra "milling," the even-numbered
parts, k2, k4, . . . , are each reversed
before the addition
• H(7148) = 71 + 84 = 155; here we eliminate the
leading carry (i.e., 1), so H(7148) = 55
FOLDING METHOD
Example
• Create a hash table for the Keys 3205, 7148,
2345 by using Folding Method
Solution
• Partition K into a number of parts.
• Each part has the same number of digits as
the required address.
• Add parts together ignoring the last carry.
• h(3205) = 32 + 05 = 37
• h(7148) = 71 + 48 = 119; discard the leading digit 1, giving 19
• h(2345) = 23 + 45 = 68
FOLDING METHOD Contd…
• Alternatively, one may want to reverse the
second part before adding.
• h(3205) = 32 + 50 = 82
• h(7148) = 71 + 84 = 155 (Discard 1) = 55
• h(2345) = 23 + 54 = 77
• Creation of the hash table on board.
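A minimal Python sketch of the folding method, covering both the plain and the reversed variant. The name folding_hash and its parameters are illustrative; the sketch assumes non-negative integer keys written without leading zeros and a two-digit address, as in the example.

```python
def folding_hash(key: int, part_digits: int = 2, reverse_even: bool = False) -> int:
    """Folding method: split the key's digits into parts, add, drop the carry."""
    digits = str(key)
    # Cut the digit string into parts of part_digits digits (the last may be shorter).
    parts = [digits[i:i + part_digits] for i in range(0, len(digits), part_digits)]
    if reverse_even:
        # Reverse the even-numbered parts k2, k4, ... (list positions 1, 3, ...).
        parts = [p[::-1] if i % 2 == 1 else p for i, p in enumerate(parts)]
    total = sum(int(p) for p in parts)
    # Keeping only the last part_digits digits discards the leading carry.
    return total % (10 ** part_digits)


keys = [3205, 7148, 2345]
print([folding_hash(k) for k in keys])                     # [37, 19, 68]
print([folding_hash(k, reverse_even=True) for k in keys])  # [82, 55, 77]
```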
POPULAR HASH FUNCTIONS Contd…
3. Midsquare Method
• The key is squared. The hash function is
defined by
• h(k) = l, where l is obtained by deleting
digits from both ends of k².
Mid-Square Method
• The key is squared and the address is selected from the
middle of the squared number
• The hash function H is defined by:
h(k) = l
• where l is obtained by deleting digits from both ends of k²
• The most obvious limitation of this method is the size of
the key
• Given a key of 6 digits, the square can have up to 12 digits, which
may be beyond the maximum integer size of many
computers
• The same digit positions of k² must be used for all of the keys
Mid-Square Method - Example
• Consider the following keys and their hash indices:
[Table: Hash Table with Mid-Square Division]
MID-SQUARE METHOD
Example
• Create a hash table for the Keys 3205, 7148,
2345 by using Mid-square Method
Solution:
• Square K.
• Strip predetermined digits from front and rear,
e.g., keep the thousands and ten-thousands places.
• K:    3205      7148      2345
• k²:   10272025  51093904  5499025
• h(k): 72        93        99
• The 4th and 5th digits, counting from the right, are
chosen for the hash address.
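The mid-square example can be sketched in Python as follows. The function name mid_square_hash is illustrative; the digit positions (thousands and ten-thousands places of k²) are the ones used in the example.

```python
def mid_square_hash(key: int) -> int:
    """Mid-square method: square the key, keep the 4th and 5th digits from the right."""
    squared = key * key
    # Dropping the last three digits and then keeping two digits
    # selects the thousands and ten-thousands places.
    return (squared // 1000) % 100


for k in (3205, 7148, 2345):
    print(k, k * k, mid_square_hash(k))
# 3205 10272025 72
# 7148 51093904 93
# 2345 5499025 99
```

Python integers do not overflow, so the key-size limitation noted for this method does not show up here; in a language with fixed-width integers the square would need a wide enough type.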
Hashing a string key
• Table size [0..99]
• A..Z ---> 1, 2, ..., 26
• 0..9 ---> 27, ..., 36
• Key: CS1 ---> 3, 19, 28; concatenated = 31,928
• (31,928)² = 1,019,397,184 → 10 digits
• Extract the middle 2 digits (5th and 6th), as the table size
is 0..99.
• Get 39, so H(CS1) = 39.
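The string-key example can be reproduced with the sketch below. The names char_code and string_key_hash are illustrative. How to pick the "middle" two digits when the square has an odd number of digits is not stated in the slides, so the choice made here is an assumption.

```python
def char_code(ch: str) -> int:
    """Map A..Z to 1..26 and 0..9 to 27..36, as in the example."""
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 1
    if "0" <= ch <= "9":
        return int(ch) + 27
    raise ValueError(f"unsupported character: {ch!r}")


def string_key_hash(key: str) -> int:
    """Concatenate the character codes, square, and take the middle two digits."""
    number = int("".join(str(char_code(ch)) for ch in key))
    squared = str(number * number)
    mid = len(squared) // 2
    # For a 10-digit square this slice picks the 5th and 6th digits.
    # Assumption: the same slice is used for other lengths.
    return int(squared[mid - 1:mid + 1])


print(string_key_hash("CS1"))  # 39
```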
Hash Function Examples
Let h(k) = k % 15. Then,
   k  =  25  129  35  2501  47  36
h(k)  =  10    9   5    11   2   6
Storing the keys in the array is straightforward:
index:  0  1   2  3  4   5   6  7  8    9  10    11  12  13  14
key:    _  _  47  _  _  35  36  _  _  129  25  2501   _   _   _
Thus, delete and find can be done in O(1), and
also insert, except…
Hash Function
What happens when you try to insert k = 65?
h(65) = 65 % 15 = 5
index:  0  1   2  3  4   5   6  7  8    9  10    11  12  13  14
key:    _  _  47  _  _  35  36  _  _  129  25  2501   _   _   _
Index 5 already holds 35, so 65 has nowhere to go.
This is called a collision.
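The direct-placement table and the collision on 65 can be reproduced with a small Python sketch. The helper try_insert is a hypothetical name; it refuses to overwrite an occupied slot, so it detects collisions without resolving them.

```python
TABLE_SIZE = 15
table = [None] * TABLE_SIZE   # None marks an empty slot


def try_insert(key: int) -> bool:
    """Place key at index key % TABLE_SIZE; return False on a collision."""
    index = key % TABLE_SIZE
    if table[index] is not None:
        return False          # slot already taken: collision
    table[index] = key
    return True


for k in (25, 129, 35, 2501, 47, 36):
    try_insert(k)

collided = not try_insert(65)  # 65 % 15 == 5, and index 5 already holds 35
print(table[5], collided)      # 35 True
```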
COLLISION
• If two keys map to the same hash table
index, then we have a collision.
• As the number of elements in the table
increases, the likelihood of a collision
increases, so make the table as large as
practical
• Collisions may still happen, so we need a
collision resolution strategy
COLLISION
• When a hash function maps two different keys to the same
table address, a collision is said to occur.
• Two elements cannot be stored at the same location in the
hash table.
• Two approaches are used to resolve collisions.
• Open Hashing: collisions are resolved by
storing the colliding object in a separate area.
– Separate chaining
• Closed Hashing (Open Addressing): all
keys are stored in the hash table itself.
– Linear Probing
– Quadratic Probing
– Double Hashing
What is Probing?
If the table position given by the hashed key is already
occupied, increase the position by some amount, until an
empty position is found
CLOSED HASHING METHODS (COLLISION
RESOLUTION TECHNIQUES)
1. Linear Probing
• One of the methods for dealing with collisions.
• Here we place the elements by using the hash
function
h_i(x) = (h(x) + i) mod TableSize, for i = 0, 1, 2, …
• If a data element hashes to a location in the table
which is already occupied, the table is searched
consecutively from that location until an empty
location is found.
• The key is then stored in the empty location.
• Wrap around from the last to the first bucket array
location if necessary.
LINEAR PROBING
Exercise Question
• h(K) = K mod 7
• Insert keys: 76 93 40 47 10 55
[Figure: linear probing hash table after each insertion]
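The exercise can be worked through with a minimal linear probing sketch in Python. The function name linear_probe_insert is illustrative; the hash function h(K) = K mod 7 and the key order come from the exercise.

```python
def linear_probe_insert(table: list, key: int) -> int:
    """Insert key with linear probing; return the index where it was stored."""
    m = len(table)
    home = key % m
    for i in range(m):
        index = (home + i) % m      # h_i(x) = (h(x) + i) mod TableSize
        if table[index] is None:
            table[index] = key
            return index
    raise RuntimeError("hash table is full")


table = [None] * 7
for key in (76, 93, 40, 47, 10, 55):
    linear_probe_insert(table, key)

print(table)  # [47, 55, 93, 10, None, 40, 76]
```

Here 47 collides with 40 at index 5, finds index 6 taken by 76, and wraps around to index 0; 55 collides at index 6 and ends up at index 1.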
LINEAR PROBING Contd…
Disadvantage
• Clustering: occupied cells build up in contiguous runs, so probe
sequences get longer and search time increases.
Searching/ lookup
• To search for a given key x, the cells of T are
examined, beginning with the cell at
index h(x) (where h is the hash function) and
continuing to the adjacent cells h(x) + 1, h(x) + 2,
..., until finding either an empty cell or a cell
whose stored key is x.
LINEAR PROBING Contd…
Deletion
• It is also possible to remove a key–value pair from
the dictionary. However, it is not sufficient to do
so by simply emptying its cell. This would affect
searches for other keys that have a hash value
earlier than the emptied cell, but that are stored
in a position later than the emptied cell. The
emptied cell would cause those searches to
incorrectly report that the key is not present.
• Instead, mark the deleted cell with a tombstone (a special "deleted" marker) so that
searches continue past it.
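A minimal sketch of lookup and tombstone deletion under linear probing, assuming h(K) = K mod 7. The names EMPTY, TOMBSTONE, find_slot and delete are hypothetical, and the starting table is the one that the keys 76, 93, 40, 47, 10, 55 produce under linear probing.

```python
EMPTY = None
TOMBSTONE = object()   # marker for "a key was deleted here"


def find_slot(table: list, key: int):
    """Return the index holding key, or None if the key is absent."""
    m = len(table)
    home = key % m
    for i in range(m):
        index = (home + i) % m
        cell = table[index]
        if cell is EMPTY:
            return None        # a truly empty cell ends the search
        if cell is not TOMBSTONE and cell == key:
            return index       # tombstones are skipped, not treated as empty
    return None


def delete(table: list, key: int) -> bool:
    """Replace key with a tombstone; return False if key was not found."""
    index = find_slot(table, key)
    if index is None:
        return False
    table[index] = TOMBSTONE
    return True


table = [47, 55, 93, 10, EMPTY, 40, 76]

naive = list(table)
naive[6] = EMPTY               # wrong: simply emptying the cell of 76
print(find_slot(naive, 55))    # None (55 hashes to 6, so the search stops at once)

delete(table, 76)              # right: leave a tombstone at index 6
print(find_slot(table, 55))    # 1
```

An insert routine could reuse tombstone cells for new keys; that refinement is left out of this sketch.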
CLOSED HASHING METHODS (COLLISION
RESOLUTION TECHNIQUES)
2. Quadratic Probing
• Here we place the elements by using the
hash function
• h_i(x) = (h(x) + i²) mod TableSize.
• Faster searching compared to linear
probing.
• Suffers from secondary clustering, since keys that have
the same hash value also have the same
probe sequence
2. Quadratic Probing
• Quadratic probing is a solution to the clustering
problem
– Linear probing adds 1, 2, 3, etc. to the original
hashed key
– Quadratic probing adds 1², 2², 3², etc. to the original
hashed key
• However, whereas linear probing guarantees that all
empty positions will be examined if necessary,
quadratic probing does not
• If the table size is prime, this will try approximately
half the table slots.
• More generally, with quadratic probing, insertion may
be impossible if the table is more than half-full!
Probe sequence: h, h+1, h+4, h+9, h+16, …, h+i² (each taken mod TableSize)
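The "about half the table" claim can be checked numerically. The sketch below collects every slot that a quadratic probe sequence can reach from a given home position; the function name quadratic_probe_slots is illustrative.

```python
def quadratic_probe_slots(home: int, m: int) -> set:
    """All distinct slots visited by (home + i*i) mod m for i = 0 .. m-1."""
    return {(home + i * i) % m for i in range(m)}


print(sorted(quadratic_probe_slots(0, 7)))   # [0, 1, 2, 4]
print(len(quadratic_probe_slots(0, 11)))     # 6
```

For the prime sizes 7 and 11 the sequence reaches 4 and 6 slots respectively, that is (m + 1) / 2 of them, which is why an insertion can fail once the table is more than half full.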
Quadratic Probing
• Quadratic Probing eliminates primary clustering
problem of linear probing.
• Collision function is quadratic.
– The popular choice is f(i) = i².
• If the hash function evaluates to h and a search in
cell h is inconclusive, we try cells h + 1², h + 2², …,
h + i².
– i.e., it examines cells 1, 4, 9, and so on away from the
original probe.
• Remember that subsequent probe points are a
quadratic number of positions from the original
probe point.
QUADRATIC PROBING Cont…
Example
• h(K) = K mod 7
• Insert keys: 76 93 40 47 10 55
[Figure: a quadratic probing hash table after each insertion (note that the table size was poorly chosen because it is not a prime number).]
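The example keys can be inserted with a minimal quadratic probing sketch, using h(K) = K mod 7 as stated on the example slide. The function name quadratic_probe_insert is illustrative.

```python
def quadratic_probe_insert(table: list, key: int) -> int:
    """Insert key with quadratic probing; return the index where it was stored."""
    m = len(table)
    home = key % m
    for i in range(m):
        index = (home + i * i) % m   # h_i(x) = (h(x) + i^2) mod TableSize
        if table[index] is None:
            table[index] = key
            return index
    # Unlike linear probing, free slots may exist that the sequence never reaches.
    raise RuntimeError("no free slot found on the probe sequence")


table = [None] * 7
for key in (76, 93, 40, 47, 10, 55):
    quadratic_probe_insert(table, key)

print(table)  # [47, 55, 93, 10, None, 40, 76]
```

For these particular keys, 47 probes indices 5, 6, 2 and then 0, and 55 probes 6, 0, 3 and then 1, so the final layout happens to match the linear probing result.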
CLOSED HASHING METHODS (COLLISION
RESOLUTION TECHNIQUES)
3. Double Hashing
• Uses a secondary hash function h’(k) and
places the colliding item in the first
available cell of the probe series.
• The value calculated by the second hash
function acts as an offset.
3. Double Hashing
• A 2nd hash function H’ is used to resolve the collision.
• Suppose a record R with key k has hash address H(k) = h
and H’(k) = h’ ≠ m
• Therefore we can search the locations with addresses
h, h+h’, h+2h’, h+3h’, … (mod m)
• If m is prime, then this sequence accesses all the
locations.
DOUBLE HASHING Cont…
[Figure: A Good Double Hash Table (axis labels: index, count)]
Let the keys be 76, 93, 40, 47, 10, 55 and the table size be 7; apply the double
hashing technique for each insertion.
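One possible way to work this exercise in Python is sketched below. The slides do not give the second hash function, so h’(k) = 5 - (k mod 5) is an assumption made here for illustration (it always yields a step between 1 and 5, never 0); the primary hash k mod 7 is likewise assumed from the earlier examples. The function names are hypothetical.

```python
def second_hash(key: int) -> int:
    """Assumed secondary hash: a step size in 1..5 (not given in the slides)."""
    return 5 - (key % 5)


def double_hash_insert(table: list, key: int) -> int:
    """Insert key with double hashing; return the index where it was stored."""
    m = len(table)
    home = key % m               # assumed primary hash h(k) = k mod m
    step = second_hash(key)
    for i in range(m):
        index = (home + i * step) % m   # h, h+h', h+2h', ... (mod m)
        if table[index] is None:
            table[index] = key
            return index
    raise RuntimeError("hash table is full")


table = [None] * 7
for key in (76, 93, 40, 47, 10, 55):
    double_hash_insert(table, key)

print(table)  # [None, 47, 93, 10, 55, 40, 76]
```

With this choice, 47 collides at index 5 and moves by its own step of 3 to index 1, while 55 collides at index 6 and moves by a step of 5 to index 4, so the two colliding keys follow different probe sequences.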
Open addressing: store the key/entry in a different position.
Separate Chaining
• Chain together several keys/entries in each
position.
• Instead of storing the data item directly in the
hash table, each hash table entry contains a
reference to a data structure, e.g. a linked list.
• In the worst case, all items hash to the
same value, so they all end up in a single
data structure (linked list).
• The idea is to keep a list of all elements that hash to
the same value.
– The array elements are pointers to the first nodes of the
lists.
– A new item is inserted at the front of the list.
• Advantages:
– Better space utilization for large items.
– Simple collision handling: searching linked list.
– Overflow: we can store more items than the hash table
size.
– Deletion is quick and easy: deletion from the linked list.
Disadvantages of Separate Chaining
• Parts of the array might never be used.
• As chains get longer, search time increases
to O(n) in the worst case.
• Constructing new chain nodes is relatively
expensive.
• Is there a way to use the “unused” space
in the array instead of using chains to
make more space?
Keys: 0, 1, 4, 9, 16, 25, 36, 49, 64, 81
hash(key) = key % 10.
0: 0
1: 81 → 1
2: (empty)
3: (empty)
4: 64 → 4
5: 25
6: 36 → 16
7: (empty)
8: (empty)
9: 49 → 9
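The chained table for these keys can be built with a minimal Python sketch. The class name ChainedHashTable is illustrative, and collections.deque stands in for the linked list because it supports cheap insertion at the front.

```python
from collections import deque


class ChainedHashTable:
    """Separate chaining: each slot holds a chain of the keys that hash to it."""

    def __init__(self, size: int) -> None:
        self.buckets = [deque() for _ in range(size)]

    def _chain(self, key: int) -> deque:
        return self.buckets[key % len(self.buckets)]

    def insert(self, key: int) -> None:
        self._chain(key).appendleft(key)   # new items go to the front of the chain

    def contains(self, key: int) -> bool:
        return key in self._chain(key)     # linear scan of one chain only

    def remove(self, key: int) -> bool:
        chain = self._chain(key)
        if key in chain:
            chain.remove(key)
            return True
        return False


table = ChainedHashTable(10)
for key in (0, 1, 4, 9, 16, 25, 36, 49, 64, 81):
    table.insert(key)

print([list(chain) for chain in table.buckets])
# [[0], [81, 1], [], [], [64, 4], [25], [36, 16], [], [], [49, 9]]
```

Deletion here only touches one chain, and the table can hold more keys than it has slots, which matches the advantages listed for separate chaining.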
SEPARATE CHAINING
• In our example, we use a linked list:
• keys: 5, 17, 37, 20, 42, 3, 11
Applications of Hashing
• Compilers use hash tables to keep track of declared
variables
• A hash table can be used for on-line spelling checkers — if
misspelling detection (rather than correction) is important,
an entire dictionary can be hashed and words checked in
constant time
• Game playing programs use hash tables to store seen
positions, thereby saving computation time if the position
is encountered again
• Hash functions can be used to quickly check for inequality
— if two elements hash to different values they must be
different
