Set and HashSet
Set is a collection without duplicates. This lesson covers Set usage and deduplication principles.
Set Interface Characteristics
| Feature | Description |
|---|---|
| Unordered | No fixed order (HashSet) |
| No duplicates | Automatic deduplication |
| No index | Cannot access by index |
HashSet
HashSet is based on HashMap and is the most commonly used Set implementation.
Creating HashSet
JAVA
import java.util.HashSet;
import java.util.Set;
Set<String> set1 = new HashSet<>();
Set<String> set2 = new HashSet<>(100); // Specified capacity
Set<String> set3 = new HashSet<>(set1); // From another collection
Basic Operations
JAVA
import java.util.HashSet;
import java.util.Set;
public class HashSetDemo {
public static void main(String[] args) {
Set<String> set = new HashSet<>();
// Add elements
set.add("Alice");
set.add("Bob");
set.add("Charlie");
set.add("Alice"); // Duplicate, won't be added
System.out.println("Set: " + set); // [Alice, Bob, Charlie] (unordered)
// Check
System.out.println("Size: " + set.size()); // 3
System.out.println("Contains Alice: " + set.contains("Alice")); // true
System.out.println("Is empty: " + set.isEmpty()); // false
// Remove
set.remove("Bob");
System.out.println("After remove: " + set);
// Traverse
for (String name : set) {
System.out.println(name);
}
}
}
Deduplication Principle
HashSet uses hashCode and equals to determine if elements are duplicates.
Deduplication Flow
TEXT
1. Calculate element's hashCode
2. If hashCode different → not duplicate
3. If hashCode same → call equals
4. equals returns true → duplicate
5. equals returns false → not duplicate (hash collision)
Example: Custom Object Deduplication
JAVA
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;
public class Student {
private String name;
private int age;
public Student(String name, int age) {
this.name = name;
this.age = age;
}
// Must override hashCode and equals
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
Student student = (Student) o;
return age == student.age && Objects.equals(name, student.name);
}
@Override
public int hashCode() {
return Objects.hash(name, age);
}
@Override
public String toString() {
return name + "(" + age + ")";
}
public static void main(String[] args) {
Set<Student> set = new HashSet<>();
set.add(new Student("Alice", 20));
set.add(new Student("Bob", 22));
set.add(new Student("Alice", 20)); // Duplicate, won't be added
System.out.println("Size: " + set.size()); // 2
System.out.println("Set: " + set);
}
}
⚠️ Important: When storing custom objects in HashSet, you must override hashCode and equals methods, otherwise deduplication won't work correctly.
TreeSet
TreeSet is based on red-black tree and elements are automatically sorted.
Characteristics
| Feature | Description |
|---|---|
| Ordered | Elements sorted by natural or custom order |
| No duplicates | Automatic deduplication |
| No index | Cannot access by index |
Example: TreeSet
JAVA
import java.util.Set;
import java.util.TreeSet;
public class TreeSetDemo {
public static void main(String[] args) {
// Natural ordering
Set<Integer> numbers = new TreeSet<>();
numbers.add(3);
numbers.add(1);
numbers.add(4);
numbers.add(1); // Duplicate, won't be added
numbers.add(5);
System.out.println("Ordered set: " + numbers); // [1, 3, 4, 5]
// String ordering
Set<String> names = new TreeSet<>();
names.add("Charlie");
names.add("Alice");
names.add("Bob");
System.out.println("Ordered names: " + names); // [Alice, Bob, Charlie]
}
}
Custom Ordering
JAVA
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;
public class CustomSortDemo {
public static void main(String[] args) {
// Sort by length
Set<String> words = new TreeSet<>(Comparator.comparingInt(String::length));
words.add("Hello");
words.add("Hi");
words.add("Java");
words.add("Go");
System.out.println("By length: " + words); // [Go, Hi, Java, Hello]
}
}
LinkedHashSet
LinkedHashSet maintains insertion order.
Characteristics
| Feature | Description |
|---|---|
| Ordered | In insertion order |
| No duplicates | Automatic deduplication |
Example: LinkedHashSet
JAVA
import java.util.LinkedHashSet;
import java.util.Set;
public class LinkedHashSetDemo {
public static void main(String[] args) {
Set<String> set = new LinkedHashSet<>();
set.add("Charlie");
set.add("Alice");
set.add("Bob");
System.out.println("Ordered set: " + set); // [Charlie, Alice, Bob]
}
}
Set Operations
Union
JAVA
Set<Integer> set1 = new HashSet<>(Arrays.asList(1, 2, 3));
Set<Integer> set2 = new HashSet<>(Arrays.asList(3, 4, 5));
Set<Integer> union = new HashSet<>(set1);
union.addAll(set2);
System.out.println("Union: " + union); // [1, 2, 3, 4, 5]
Intersection
JAVA
Set<Integer> intersection = new HashSet<>(set1);
intersection.retainAll(set2);
System.out.println("Intersection: " + intersection); // [3]
Difference
JAVA
Set<Integer> diff = new HashSet<>(set1);
diff.removeAll(set2);
System.out.println("Difference: " + diff); // [1, 2]
Set Comparison
| Feature | HashSet | TreeSet | LinkedHashSet |
|---|---|---|---|
| Order | Unordered | Sorted | Insertion order |
| Implementation | HashMap | Red-black tree | Linked list + HashMap |
| Performance | O(1) | O(log n) | O(1) |
| null | 1 allowed | Not allowed | 1 allowed |
Selection Guide
| Need | Choice |
|---|---|
| Don't care about order | HashSet |
| Need sorting | TreeSet |
| Maintain insertion order | LinkedHashSet |
Example: Deduplication Practice
JAVA
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
public class RemoveDuplicates {
// Method 1: Set deduplication
public static <T> List<T> removeDuplicates1(List<T> list) {
return new ArrayList<>(new LinkedHashSet<>(list));
}
// Method 2: Stream deduplication
public static <T> List<T> removeDuplicates2(List<T> list) {
return list.stream().distinct().collect(Collectors.toList());
}
public static void main(String[] args) {
List<String> list = new ArrayList<>();
list.add("Alice");
list.add("Bob");
list.add("Alice");
list.add("Charlie");
list.add("Bob");
System.out.println("Original: " + list);
System.out.println("Deduplicated: " + removeDuplicates1(list));
}
}
❓ Frequently Asked Questions
Q How to choose between Set and List?
A Use Set for deduplication. Use List when you need ordered and duplicate elements.
Q What's TreeSet's sorting rule?
A Default is natural ordering (Comparable). You can customize with Comparator.
Q Why is HashSet fast?
A Based on HashMap, uses hashCode for direct lookup with O(1) time complexity.
📖 Summary
- Set has no duplicates: HashSet is unordered, TreeSet is sorted, LinkedHashSet maintains insertion order
- Deduplication relies on hashCode and equals methods
- Set operations: union (addAll), intersection (retainAll), difference (removeAll)
📝 Exercises
- Deduplication: Remove duplicates from a List while maintaining original order
- Intersection: Find the intersection of two Sets
- Sorting: Use TreeSet to sort strings by length
Next Lesson
In the next lesson, we'll learn about Map and HashMap — key-value pair collections.



