Set and HashSet

Set is a collection without duplicates. This lesson covers Set usage and deduplication principles.

Set Interface Characteristics

Feature Description
Unordered No fixed order (HashSet)
No duplicates Automatic deduplication
No index Cannot access by index

HashSet

HashSet is based on HashMap and is the most commonly used Set implementation.

Creating HashSet

JAVA
import java.util.HashSet;
import java.util.Set;

Set<String> set1 = new HashSet<>();
Set<String> set2 = new HashSet<>(100);  // Specified capacity
Set<String> set3 = new HashSet<>(set1);  // From another collection

Basic Operations

JAVA
import java.util.HashSet;
import java.util.Set;

public class HashSetDemo {
    public static void main(String[] args) {
        Set<String> set = new HashSet<>();
        
        // Add elements
        set.add("Alice");
        set.add("Bob");
        set.add("Charlie");
        set.add("Alice");  // Duplicate, won't be added
        System.out.println("Set: " + set);  // [Alice, Bob, Charlie] (unordered)
        
        // Check
        System.out.println("Size: " + set.size());  // 3
        System.out.println("Contains Alice: " + set.contains("Alice"));  // true
        System.out.println("Is empty: " + set.isEmpty());  // false
        
        // Remove
        set.remove("Bob");
        System.out.println("After remove: " + set);
        
        // Traverse
        for (String name : set) {
            System.out.println(name);
        }
    }
}

Deduplication Principle

HashSet uses hashCode and equals to determine if elements are duplicates.

Deduplication Flow

TEXT
1. Calculate element's hashCode
2. If hashCode different → not duplicate
3. If hashCode same → call equals
4. equals returns true → duplicate
5. equals returns false → not duplicate (hash collision)

Example: Custom Object Deduplication

JAVA
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class Student {
    private String name;
    private int age;
    
    public Student(String name, int age) {
        this.name = name;
        this.age = age;
    }
    
    // Must override hashCode and equals
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Student student = (Student) o;
        return age == student.age && Objects.equals(name, student.name);
    }
    
    @Override
    public int hashCode() {
        return Objects.hash(name, age);
    }
    
    @Override
    public String toString() {
        return name + "(" + age + ")";
    }
    
    public static void main(String[] args) {
        Set<Student> set = new HashSet<>();
        set.add(new Student("Alice", 20));
        set.add(new Student("Bob", 22));
        set.add(new Student("Alice", 20));  // Duplicate, won't be added
        
        System.out.println("Size: " + set.size());  // 2
        System.out.println("Set: " + set);
    }
}
▶ Try it Yourself
⚠️ Important: When storing custom objects in HashSet, you must override hashCode and equals methods, otherwise deduplication won't work correctly.

TreeSet

TreeSet is based on red-black tree and elements are automatically sorted.

Characteristics

Feature Description
Ordered Elements sorted by natural or custom order
No duplicates Automatic deduplication
No index Cannot access by index

Example: TreeSet

JAVA
import java.util.Set;
import java.util.TreeSet;

public class TreeSetDemo {
    public static void main(String[] args) {
        // Natural ordering
        Set<Integer> numbers = new TreeSet<>();
        numbers.add(3);
        numbers.add(1);
        numbers.add(4);
        numbers.add(1);  // Duplicate, won't be added
        numbers.add(5);
        System.out.println("Ordered set: " + numbers);  // [1, 3, 4, 5]
        
        // String ordering
        Set<String> names = new TreeSet<>();
        names.add("Charlie");
        names.add("Alice");
        names.add("Bob");
        System.out.println("Ordered names: " + names);  // [Alice, Bob, Charlie]
    }
}
▶ Try it Yourself

Custom Ordering

JAVA
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;

public class CustomSortDemo {
    public static void main(String[] args) {
        // Sort by length
        Set<String> words = new TreeSet<>(Comparator.comparingInt(String::length));
        words.add("Hello");
        words.add("Hi");
        words.add("Java");
        words.add("Go");
        System.out.println("By length: " + words);  // [Go, Hi, Java, Hello]
    }
}

LinkedHashSet

LinkedHashSet maintains insertion order.

Characteristics

Feature Description
Ordered In insertion order
No duplicates Automatic deduplication

Example: LinkedHashSet

JAVA
import java.util.LinkedHashSet;
import java.util.Set;

public class LinkedHashSetDemo {
    public static void main(String[] args) {
        Set<String> set = new LinkedHashSet<>();
        set.add("Charlie");
        set.add("Alice");
        set.add("Bob");
        System.out.println("Ordered set: " + set);  // [Charlie, Alice, Bob]
    }
}
▶ Try it Yourself

Set Operations

Union

JAVA
Set<Integer> set1 = new HashSet<>(Arrays.asList(1, 2, 3));
Set<Integer> set2 = new HashSet<>(Arrays.asList(3, 4, 5));

Set<Integer> union = new HashSet<>(set1);
union.addAll(set2);
System.out.println("Union: " + union);  // [1, 2, 3, 4, 5]

Intersection

JAVA
Set<Integer> intersection = new HashSet<>(set1);
intersection.retainAll(set2);
System.out.println("Intersection: " + intersection);  // [3]

Difference

JAVA
Set<Integer> diff = new HashSet<>(set1);
diff.removeAll(set2);
System.out.println("Difference: " + diff);  // [1, 2]

Set Comparison

Feature HashSet TreeSet LinkedHashSet
Order Unordered Sorted Insertion order
Implementation HashMap Red-black tree Linked list + HashMap
Performance O(1) O(log n) O(1)
null 1 allowed Not allowed 1 allowed

Selection Guide

Need Choice
Don't care about order HashSet
Need sorting TreeSet
Maintain insertion order LinkedHashSet

Example: Deduplication Practice

JAVA
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RemoveDuplicates {
    // Method 1: Set deduplication
    public static <T> List<T> removeDuplicates1(List<T> list) {
        return new ArrayList<>(new LinkedHashSet<>(list));
    }
    
    // Method 2: Stream deduplication
    public static <T> List<T> removeDuplicates2(List<T> list) {
        return list.stream().distinct().collect(Collectors.toList());
    }
    
    public static void main(String[] args) {
        List<String> list = new ArrayList<>();
        list.add("Alice");
        list.add("Bob");
        list.add("Alice");
        list.add("Charlie");
        list.add("Bob");
        
        System.out.println("Original: " + list);
        System.out.println("Deduplicated: " + removeDuplicates1(list));
    }
}

❓ Frequently Asked Questions

Q How to choose between Set and List?
A Use Set for deduplication. Use List when you need ordered and duplicate elements.
Q What's TreeSet's sorting rule?
A Default is natural ordering (Comparable). You can customize with Comparator.
Q Why is HashSet fast?
A Based on HashMap, uses hashCode for direct lookup with O(1) time complexity.

📖 Summary

📝 Exercises

  1. Deduplication: Remove duplicates from a List while maintaining original order
  2. Intersection: Find the intersection of two Sets
  3. Sorting: Use TreeSet to sort strings by length

Next Lesson

In the next lesson, we'll learn about Map and HashMap — key-value pair collections.

100%

🙏 帮我们做得更好

我们是刚上线的编程教程站,几个人的小团队,精力有限。页面虽经检查,难免还有疏漏——链接失效、排版错乱、内容有误、语言生硬……

如果您发现了,麻烦告诉我们,我们会在收到反馈后第一时间进行修复,再次感谢您的光临 🙏