HashSet in Java

The HashSet class implements the Set interface, which does not allow duplicate elements. HashSet is an unordered, unsorted Set implementation. It uses the hash code values of the stored objects to decide where to place them, so the performance of a HashSet depends on the hashCode() implementation of those objects. Now let us look at some sample code:

import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

public class HashSetSample {
    private Set<String> set = null;

    public HashSetSample() {
        set = new HashSet<>();
    }

    // Adds every element of the array to the set
    public void addItemsToSet() {
        String[] listItems = {"dog", "cat", "cow", "elephant", "sheep"};
        for (int i = 0; i < listItems.length; i++) {
            set.add(listItems[i]);
        }
    }

    // Walks the set with an Iterator and prints each element
    public void displaySet() {
        System.out.println("Displaying contents of set");
        Iterator<String> itr = set.iterator();
        while (itr.hasNext()) {
            System.out.println("Item = " + itr.next());
        }
    }

    // Removes the elements one by one and prints the resulting size
    public void removeItems() {
        System.out.println("Removing contents of set");
        set.remove("dog");
        set.remove("cat");
        set.remove("cow");
        set.remove("elephant");
        set.remove("sheep");
        System.out.println("Contents removed ,now size of set = " + set.size());
    }

    public static void main(String[] args) {
        HashSetSample sample = new HashSetSample();
        sample.addItemsToSet();
        sample.displaySet();
        sample.removeItems();
    }
}

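In this example, the String objects in an array are added to the HashSet. An Iterator instance is then used to iterate over the set, and finally the objects are removed one by one. Because HashSet does not guarantee any iteration order, the items may be printed in a different order on your machine. Now let us see the output.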

Output

Displaying contents of set
Item = cat
Item = elephant
Item = cow
Item = sheep
Item = dog
Removing contents of set
Contents removed ,now size of set = 0
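The sample above never tries to add the same item twice. As a minimal sketch of how a HashSet treats duplicates, the following counts how many add() calls are rejected. The class name DuplicateAddDemo and the helper countRejected are hypothetical names chosen for this illustration.

```java
import java.util.HashSet;
import java.util.Set;

public class DuplicateAddDemo {

    // Adds each name to the set and counts how many were rejected as duplicates
    static int countRejected(Set<String> target, String... names) {
        int rejected = 0;
        for (String name : names) {
            // add() returns false when an equal element is already present
            if (!target.add(name)) {
                rejected++;
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        Set<String> animals = new HashSet<>();
        int rejected = countRejected(animals, "dog", "cat", "dog", "cow", "cat");
        System.out.println("Rejected duplicates = " + rejected); // prints 2
        System.out.println("Size of set = " + animals.size());   // prints 3
    }
}
```

The return value of add() is the simplest way to find out whether an element was already in the set.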

Difference between HashSet and LinkedHashSet

The order of iteration is not guaranteed for a HashSet. A LinkedHashSet, on the other hand, preserves the order in which the items were inserted.
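A short sketch of this difference, using the same animal names as the earlier sample. The class name IterationOrderDemo and the helper iterationOrder are hypothetical. No expected output is given for the HashSet line, since its order is not guaranteed.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class IterationOrderDemo {

    // Copies the set's iteration order into a list so it can be inspected
    static List<String> iterationOrder(Set<String> set) {
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        String[] items = {"dog", "cat", "cow", "elephant", "sheep"};

        Set<String> hashSet = new HashSet<>();
        Set<String> linkedHashSet = new LinkedHashSet<>();
        for (String item : items) {
            hashSet.add(item);
            linkedHashSet.add(item);
        }

        // Order depends on the hash codes and is not guaranteed
        System.out.println("HashSet:       " + iterationOrder(hashSet));
        // Insertion order: [dog, cat, cow, elephant, sheep]
        System.out.println("LinkedHashSet: " + iterationOrder(linkedHashSet));
    }
}
```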

HashSet and hashCode()

The HashSet implementation depends on the hash code values of the objects stored in it. So for insertion, search and iteration to work effectively, the class whose objects are going to be stored in the HashSet needs to override the hashCode() and equals() methods.

Case 1) Without overriding hashCode() and equals()

In this case, we put four Employee objects in a HashSet. After adding them, we create a new object with the same attributes as an object that is already stored in the HashSet, and check whether the set contains it.

Let us see our Employee.java first:

public class Employee {
    private int id;
    private String name;

    public Employee(int empId, String name) {
        this.id = empId;
        this.name = name;
    }

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    @Override
    public String toString() {
        return "Id = " + getId() + " Name = " + getName();
    }
}

Now let us see Main.java. It adds four Employee objects (emp1, emp2, emp3 and emp4) and later checks for an object that is meaningfully equal to emp1 (the attributes are the same for both objects).

import java.util.HashSet;
import java.util.Set;

public class Main {
    private Set<Employee> set = null;

    public Main() {
        set = new HashSet<>();
    }

    public void addItems() {
        Employee emp1 = new Employee(1, "Bijoy");
        Employee emp2 = new Employee(2, "Karthik");
        Employee emp3 = new Employee(3, "JayaKrishnan");
        Employee emp4 = new Employee(4, "Dexter");
        set.add(emp1);
        set.add(emp2);
        set.add(emp3);
        set.add(emp4);
    }

    public void getItems() {
        // A new object with the same attributes as emp1
        Employee emp = new Employee(1, "Bijoy");
        System.out.println(emp + " is present = " + set.contains(emp));
    }

    public static void main(String[] args) {
        Main main = new Main();
        main.addItems();
        main.getItems();
    }
}

In the Employee.java shown above, we do not override hashCode() and equals(), so the implementations inherited from Object are used: equals() compares object references, and distinct objects typically have different hash codes. Although emp and emp1 have the same attributes, they are different objects, so the output is false. Let us verify it. (For more details about hashCode() and equals(), see my older post.)

Output

Id = 1 Name = Bijoy is present = false

Case 2) Employee.java with overridden hashCode() and equals()

Now let us override the hashCode() and equals() methods in Employee.java. The changed Employee.java becomes:

public class Employee {
    private int id;
    private String name;

    public Employee(int empId, String name) {
        this.id = empId;
        this.name = name;
    }

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    @Override
    public String toString() {
        return "Id = " + getId() + " Name = " + getName();
    }

    // Employees with the same id get the same hash code
    @Override
    public int hashCode() {
        return getId();
    }

    // Two employees are equal when their ids match
    @Override
    public boolean equals(Object object) {
        // Guard against null and unrelated types before casting
        if (!(object instanceof Employee)) {
            return false;
        }
        Employee employee = (Employee) object;
        return getId() == employee.getId();
    }
}

Use the Main.java shown in case 1 and run it. In this case emp and emp1 have the same hash code value, because the hash code depends only on the attribute 'id'. So the search looks in the same bucket where emp1 is stored, and the equals() method also returns true. So the output is 'true'.

Output

Id = 1 Name = Bijoy is present = true
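Case 2 treats two employees as equal whenever their ids match. If both id and name should take part in the comparison, one possible variant is sketched below. It uses java.util.Objects, which is not used in the original listings, and EmployeeKey and EmployeeKeyDemo are hypothetical names for this illustration only.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class EmployeeKeyDemo {

    // Hypothetical variant of Employee: id and name together decide equality
    static final class EmployeeKey {
        private final int id;
        private final String name;

        EmployeeKey(int id, String name) {
            this.id = id;
            this.name = name;
        }

        @Override
        public int hashCode() {
            // Combines the same fields that equals() compares
            return Objects.hash(id, name);
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) {
                return true;
            }
            if (!(other instanceof EmployeeKey)) {
                return false;
            }
            EmployeeKey that = (EmployeeKey) other;
            return id == that.id && Objects.equals(name, that.name);
        }
    }

    public static void main(String[] args) {
        Set<EmployeeKey> set = new HashSet<>();
        set.add(new EmployeeKey(1, "Bijoy"));
        System.out.println(set.contains(new EmployeeKey(1, "Bijoy")));   // prints true
        System.out.println(set.contains(new EmployeeKey(1, "Karthik"))); // prints false
    }
}
```

Whichever fields are chosen, hashCode() and equals() should use the same ones, so that equal objects always produce equal hash codes.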

Summary

A Set avoids duplicate elements. A HashSet relies on the hash codes of the objects stored in it, so if we need to store objects of a class in a HashSet, that class should override both the hashCode() and equals() methods. If our class does not override them, then:

1) Multiple objects with the same attribute values (meaningfully equal objects) can be inserted. So meaningfully equal objects can be present several times in a HashSet. This is duplication.

2) Searching for an object may not work as expected, because meaningfully equal objects usually have different hash code values in this case and are never considered equal. So it is not possible to find such objects (as in case 1 explained above).
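Point 1 can be sketched with two small classes, one that inherits hashCode() and equals() from Object and one that bases them on id as in case 2. PlainEmployee, KeyedEmployee and DuplicateEmployeeDemo are hypothetical names used only for this illustration.

```java
import java.util.HashSet;
import java.util.Set;

public class DuplicateEmployeeDemo {

    // Hypothetical class that inherits hashCode() and equals() from Object
    static class PlainEmployee {
        final int id;

        PlainEmployee(int id) {
            this.id = id;
        }
    }

    // Hypothetical class that bases hashCode() and equals() on id, as in case 2
    static class KeyedEmployee {
        final int id;

        KeyedEmployee(int id) {
            this.id = id;
        }

        @Override
        public int hashCode() {
            return id;
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof KeyedEmployee && ((KeyedEmployee) other).id == id;
        }
    }

    static int sizeWithoutOverrides() {
        Set<PlainEmployee> set = new HashSet<>();
        set.add(new PlainEmployee(1));
        set.add(new PlainEmployee(1)); // a distinct instance, so it is stored as well
        return set.size();
    }

    static int sizeWithOverrides() {
        Set<KeyedEmployee> set = new HashSet<>();
        set.add(new KeyedEmployee(1));
        set.add(new KeyedEmployee(1)); // equal to the first one, so it is rejected
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println("Without overrides, size = " + sizeWithoutOverrides()); // prints 2
        System.out.println("With overrides, size = " + sizeWithOverrides());       // prints 1
    }
}
```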