Friday, 2 October 2015

How Hashmap data structure works internally? How hashcode and equals method work in hashmap? Explain with real time example.

Hashmap data structure works internally?
How hashcode and equals method work in hashmap?


This is the famous interview question for the beginners as well as for experienced, So Let's see what it is all about.

Hashmap is very popular data structure and found useful for solving many problems due to O(1) time complexity for both get and put operation.
Before getting into Hashmap internals, Please read Hashmap basics and Hashcode.

Internal working of Get and Put operation.


Hashmap store objects in key-value pair in a table.
  1. Objects are stored by method hashmap.put(key, value) and


  2. Objects are retrieved by calling hashmap.get(key) method.
For detail explanation on hashmap get and put API, Please read this post How Hashmap put and get API works.

Put Operation


Hashmap works on principle of hashing and internally uses hashcode as a base, for storing key-value pair.
With the help of hashcode, Hashmap stores objects and retrieves it in constant time O(1).


Lets recap "Employee Letter Box" example, we saw in last post on Hashcode.


From above letter box example we can say,

With the help of hashcode(start alphabet of First name), all letters are distributed across several letter box and due to which searching for the particular Letter within Letter box became very fast.

Employee just need to look into appropriate Letter box and search letter only within that box, ignoring other boxes. This make searching for Letter very fast.
Hashcode is basically used to distribute the objects systematically, so that searching can be done faster.
Let's try to shape Letter box example in way Hashmap works.



Hashmap works very similar to above Employee Letter box example.
  1. It make use of array in place of Letter Boxes.
  2. It make use of linked list for storing Key-Value pair and each Node of linked list corresponds to mail in above example.
 Hashmap uses Array and Linked list datastructure internally for storing key-value pair.



Lets try to put below key-value pair in hashmap,
  1. hashMap.put("hello","world");
  2. hashMap.put("jayesh","patel");
  3. hashMap.put("khyati","patel");
  4. hashMap.put("khyati","pokar");
Before going into details, Let understand what is hashcode,
In Employee Letter box example, hashcode of Employee is "First character of Name".

In case of Object, hashcode is a Integer value which represents object. 
hashcode is calculated using hashcode() method and which uses object properties to compute it.

Please refer this link for detail explanation on Details on Hashcode


Note:
1. Default Hashmap size is 16. It means, default array size is 16, starting from index 0 to 15, like shown below,

2. Let's try to put key-value pair hashmap.put("hello","world"); 
Above line says, value "world" needs to be stored against key "hello".   

Step 1 : Hashmap will compute the hashcode of key "hello",
Step 2 : Say, hashcode of key "hello" computed to integer value "12345".

[In Employee Letter Box example, hashcode was "first alphabet of employee name", So employee directly jumps to corresponding Letter box as he/she was aware that Letter box with same alphabet will be present.]

In case of Hashmap,
after computing hashcode
12345, we can not jump directly to array index 12345, because array size is 16(from 0 to 15) and index 12345 is not present.
                 
So, hashcode 12345 need to be converted into number between 0 to 15 to put it in array. So here comes the use of indexFor() method.

indexFor(hash, table.length) is used to calculate exact index in table array for storing the Key-Value pair.

Step 3:  So, hashcode is further computed to get array index also known as Bucket in hashmap terminology. Once, bucket is known, Key-Value pair is placed in that bucket.


Lets say bucket evaluated from hashcode 12345 is 2, 

After placing object in hashmap, it will look like below.


3.
Let's try to put key-value pair hashmap.put("jayesh","patel");

Step 3 will be repeated, hashcode of key "jayesh" is computed and let's say hashcode evaluated is 450, and bucket or index of array evaluated is 11.
   
After placing object in hashmap, it will look like below.


Note: 
Items that have the same hashcode will fall in same bucket,
So if more than one item is falling in the same bucket then those items will be
stored in Linked list like shown below.


4. Let's try to put key-value pair hashmap.put("khyati","patel");
Step 3 will be repeated, hashcode of key "khyati" is computed and let's say hashcode evaluated is 1200, and bucket or index of array evaluated is 2.

After placing object in hashmap, it will look like below.



5. Let's try to put key-value pair hashmap.put("khyati","pokar"); 
Step 3 will be repeated, hashcode() method will be called to compute hashcode of  "khyati".
                
In Step 4, hashcode evaluated for "khyati" gave
1200, then second
time if you calculate hashcode of "khyati", it will always return same hashcode 1200. and ultimately, bucket will also be same for same key. So in our example it will be Bucket 2.

Point to note here is, hashcode() method always return same hashcode for same key, irrespective of number of times it is called.

If we compare the same with our Employee Letter box example, then hashcode of Employee "Daniel" will be evaluate to "D", irrespective of number of times it is calculated.

Bucket 2 is not empty, So for placing Key-Value pair, "khyati"-"pokar",  it first goes to Bucket 2 and from there,
   
1.
It will encounter first key-value pair having key "hello".
hashmap will compare, key "hello" with new key "khyati", and check, is it same??

(internally, equals() method does this comparison of two keys)
It is not same, so it will move on to next Key-Value pair.


   
2.
Next it will encounter
Key-Value pair having key, "khyati". hashmap will compare, key "khyati" with new key "khyati", and check, is it same??

It is same, so it will replace the value "patel" with "pokar" and return (and not move on to next Key-Value pair).           
 

After placing object in hashmap, it will look like below.


Remember, new item that need to be stored, first is compared with every item already present in link list, if the item with same key is found then instead of creating new node/key-value pair, only value for matched node will be replaced.
Note:
              If two Objects have same hashcode, it doesn't mean both are same. hashcode is used to  
              group all Objects having same hashcode in one bucket, So that searching within
              bucket can be done faster compare to all Objects dumped in one bucket.

GET Operation:


Lets try to get already stored key-value pair in hashmap using key,

1. hashMap.get("khyati");
Get operation will follow the same procedure which is used in put operation.
  
Step 1: First hashcode of key "khyati" is computed, 


Step 2: Remember, hashcode of key "khyati" was evaluated to 1200 in put operationIf we calculate hashcode of "khyati" again, then it always evaluate to 1200.
                
Step 3: Now. bucket/array index need to be calculated from hashcode. indexFor() method will be used to get exact bucket/index for the key "khyati". 
                
Hashcode for same key will always evaluate same, similarly bucket/index calculated
for same hashcode will always return same index. 

indexFor() method will always return same index 2 for hashcode 1200.

Irrespective of number of times it is called, indexFor() method will always return same index for same hashcode.
     
Step 4: Once the bucket is known, equals() method will come in picture for comparing keys, It will start looking all the items in the bucket 2 for key "khyati",
                
First key-value pair in bucket is "hello-world", equals method will compare key "hello" with search key "khyati", both are NOT SAME, It will ignore and check next key-value pair.
                
Second key-value pair is "khyati-pokar" and key "khyati" will be compared with search key "khyati", both are SAMESo it will return value "pokar", stored against key "khyati" and return. 


If no item with matching key is found in the Bucket, then it will return value as null.

hashcode and equals method


  1. We saw, two in-different object's can have same hashcode, which ultimately have same bucket.


  2. To get the object, each key-value pair in that bucket need to be to compared against the key until bject is not found.
That is where hashcode() and equals() method come into picture.
 

In our Employee Letter box example, we can say, 

  1. hashcode() methods helps in finding the correct Letter box and,
  2. equals() method helps to look for each letter within that Letter box.

So we can say, 
hashcode() method generate hashcode of a object, which helps in identifying exact bucket. 
equals() method compares each element in the bucket until matching key is not found.

In hashmap, for both put and get operation,
1. First hashcode() method will come in picture, for bucket(array index) evaluation from key.
2. Second equals() method will comes in picture, which compares each key-value pair in identified bucket.
  

In Put operation, equals() method checks "is there any matching key present",

If present, then it will replace the value at matched key. 
If not present, then new key-value pair is placed at end.

In Get operation, equals() method checks "is there any matching key present",
If present, then it will return the value stored against matched key.
If not present, then it will return null. 

Null Key Placement


We just saw, how put operation of hashmap works, say hashmap.put("hello","world");

1. First hashcode of key "hello" is evaluated, which is supplied to indexFor method to identify exact bucket.

2. What if we try to put key-value pair like, hashmap.put(null,"world"); 
Above statement is perfectly valid and key here is null.
If key is null then it will be stored at first bucket / array[0], because hashcode of null key is always 0.

Let's see internally how Put operation works for handling NULL Key.



Simplified HashMap Internal Architecture representation


You may also like to see


What is Hashmap data structure? What is the need of Hashmap? How Hashmap data structure works in Java? What is Hashcode? Can 2 objects have same hashcode?

How time complexity of Hashmap get() and put() operation is O(1)? Is it O(1) in any condition?

What is Load factor and Rehashing in Hashmap?

When get method go to infinite loop in HashMap?



Enjoy !!!! 

If you find any issue in post or face any error while implementing, Please comment.

No comments:

Post a Comment