Java 8 – How to find duplicates in a Stream or List ?

In this article, we will discuss different ways to find and count duplicates in a Stream or List.

Find and count duplicates in a Stream/List :

  1. Using Stream.distinct() method
  2. Using Stream.filter() and Collections.frequency() methods
  3. Using Stream.filter() and Set.add() methods
  4. Using Collectors.toMap() method with any one of the following
    • Use Math::addExact for summation of duplicates
    • Use Integer::sum for summation of duplicates
    • Use Long::sum for summation of duplicates
  5. Using Collectors.groupingBy() method with any one of the following
    • Use Collectors.counting() method
    • Use Collectors.summingInt() method
  6. Using Map object and Collection.forEach() method with any one of the following
    • Use Map.getOrDefault() method
    • Use Map.merge() method and lambda expression for summation of duplicates
    • Use Map.merge() method and Integer::sum for summation of duplicates

Let us discuss each one with an example and description.

1. Using Stream.distinct() method :

  • Stream.distinct() method eliminates duplicates from the Original List and the result is stored into a new List using collect(Collectors.toList()) method, which gives the unique list
  • For finding duplicates, iterate through the unique list and remove each of its elements from the Original List using List.remove() method, which removes only the first occurrence, so that only the duplicate elements remain in the Original List
  • Note: this approach modifies the Original List

FindDuplicatesUsingDistinctMethod.java

package net.bench.resources.java.stream;

import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// using Stream.distinct() method
public class FindDuplicatesUsingDistinctMethod {

	public static void main(String[] args) {

		// 1. list of Strings
		List<String> companies = new ArrayList<String>();


		// 1.1 add string elements to List
		companies.add("Meta");
		companies.add("Apple");
		companies.add("Amazon");
		companies.add("Netflix");
		companies.add("Meta"); // duplicate
		companies.add("Google");
		companies.add("Apple"); // duplicate


		// 1.2 print original List to console
		System.out.println("1. Original List with duplicates : \n");
		companies.forEach(System.out::println);


		// 2. get unique elements after removing duplicates
		List<String> distinctCompanies = companies
				.stream()
				.distinct()
				.collect(Collectors.toList());


		// 2.1 print unique elements
		System.out.println("\n2. Unique elements : \n");
		distinctCompanies.forEach(System.out::println);


		// 3. get duplicate elements
		for (String distinctCompany : distinctCompanies) {
			companies.remove(distinctCompany);
		}


		// 3.1 print duplicate elements
		System.out.println("\n3. Duplicate elements : \n");
		companies.forEach(System.out::println);
	}
}

Output:

1. Original List with duplicates : 

Meta
Apple
Amazon
Netflix
Meta
Google
Apple

2. Unique elements : 

Meta
Apple
Amazon
Netflix
Google

3. Duplicate elements : 

Meta
Apple
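
Since the example above removes elements from the Original List itself, the same idea can be sketched on a copy when the Original List must stay untouched. The class and method names below are hypothetical and only for illustration. Note that an element present three times would appear twice in the result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// hypothetical helper class, not part of the original example
class FindDuplicatesWithoutMutation {

	// returns the occurrences left over after removing one copy of each distinct element
	static <T> List<T> findDuplicates(List<T> source) {

		// work on a copy so that the caller's List is not modified
		List<T> remaining = new ArrayList<>(source);

		// List.remove(Object) deletes only the first matching occurrence
		source.stream()
				.distinct()
				.forEach(element -> remaining.remove(element));

		return remaining;
	}

	public static void main(String[] args) {

		List<String> companies = Arrays.asList(
				"Meta", "Apple", "Amazon", "Netflix", "Meta", "Google", "Apple");

		System.out.println(findDuplicates(companies)); // [Meta, Apple]
		System.out.println(companies.size()); // 7, Original List is unchanged
	}
}
```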

2. Using Stream.filter() and Collections.frequency() methods :

  • Convert the Original List into a Set using collect(Collectors.toSet()) method, which results in a new Set with unique elements
  • For finding duplicates, use Stream.filter() method to check whether Collections.frequency() method returns a value greater than 1 for each element
    • If it is greater than 1, then the element is present more than once in the Original List
    • Finally, store those elements into another new Set using collect(Collectors.toSet()) method

FindDuplicatesUsingFilterAndCollectionsFrequency.java

package net.bench.resources.java.stream;

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// using Stream.filter() and Collections.frequency() methods
public class FindDuplicatesUsingFilterAndCollectionsFrequency {

	public static void main(String[] args) {

		// 1. list of Strings
		List<String> companies = new ArrayList<String>();


		// 1.1 add string elements to List
		companies.add("Meta");
		companies.add("Apple");
		companies.add("Amazon");
		companies.add("Netflix");
		companies.add("Meta"); // duplicate
		companies.add("Google");
		companies.add("Apple"); // duplicate


		// 1.2 print original List to console
		System.out.println("1. Original List with duplicates : \n");
		companies.forEach(System.out::println);


		// 2. get unique elements after removing duplicates
		Set<String> distinctCompanies = companies
				.stream()
				.collect(Collectors.toSet());


		// 2.1 print unique elements
		System.out.println("\n2. Unique elements : \n");
		distinctCompanies.forEach(System.out::println);


		// 3. get duplicate elements
		Set<String> duplicateCompanies = companies
				.stream()
				.filter(company -> Collections.frequency(companies, company) > 1)
				.collect(Collectors.toSet());


		// 3.1 print duplicate elements
		System.out.println("\n3. Duplicate elements : \n");
		duplicateCompanies.forEach(System.out::println);
	}
}

Output:

1. Original List with duplicates : 

Meta
Apple
Amazon
Netflix
Meta
Google
Apple

2. Unique elements : 

Netflix
Meta
Google
Apple
Amazon

3. Duplicate elements : 

Meta
Apple
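
The filter shown above is not tied to String elements. As a minimal sketch, it can be wrapped in a generic helper (the class and method names are hypothetical). Keep in mind that Collections.frequency() scans the whole List on every call, so this approach is best suited to small Lists.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// hypothetical helper class, added for illustration only
class FrequencyDuplicateFinder {

	// returns every element that occurs more than once in the given List
	static <T> Set<T> findDuplicates(List<T> source) {
		return source
				.stream()
				.filter(element -> Collections.frequency(source, element) > 1)
				.collect(Collectors.toSet());
	}

	public static void main(String[] args) {

		// works for any element type with a proper equals() method
		List<Integer> numbers = Arrays.asList(1, 2, 2, 3, 3, 3);
		System.out.println(findDuplicates(numbers).contains(2)); // true
		System.out.println(findDuplicates(numbers).contains(1)); // false
	}
}
```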

3. Using Stream.filter() and Set.add() methods :

  • Create HashSet object to store/add unique elements
  • For finding duplicates,
    • use Stream.filter() method and try adding each element into the newly created HashSet object
    • Set.add() method returns false if the element is already present, which means that the element is a duplicate in the Original List
    • finally, store those elements into another new Set using collect(Collectors.toSet()) method
  • By doing this,
    • newly created HashSet object will contain only unique elements
    • another Set collected from the filtered stream will contain the duplicate elements

FindDuplicatesUsingFilterAndSetAddMethod.java

package net.bench.resources.java.stream;

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// using Stream.filter() and Set.add() methods
public class FindDuplicatesUsingFilterAndSetAddMethod {

	public static void main(String[] args) {

		// 1. list of Strings
		List<String> companies = new ArrayList<String>();


		// 1.1 add string elements to List
		companies.add("Meta");
		companies.add("Apple");
		companies.add("Amazon");
		companies.add("Netflix");
		companies.add("Meta"); // duplicate
		companies.add("Google");
		companies.add("Apple"); // duplicate


		// 1.2 print original List to console
		System.out.println("1. Original List with duplicates : \n");
		companies.forEach(System.out::println);


		// 2. create Set object to store unique elements
		Set<String> distinctCompanies = new HashSet<>();


		// 3. get duplicate elements
		Set<String> duplicateCompanies = companies
				.stream()
				.filter(company -> !distinctCompanies.add(company))
				.collect(Collectors.toSet());


		// 2.1 print unique elements
		System.out.println("\n2. Unique elements : \n");
		distinctCompanies.forEach(System.out::println);


		// 3.1 print duplicate elements
		System.out.println("\n3. Duplicate elements : \n");
		duplicateCompanies.forEach(System.out::println);
	}
}

Output:

1. Original List with duplicates : 

Meta
Apple
Amazon
Netflix
Meta
Google
Apple

2. Unique elements : 

Netflix
Meta
Google
Apple
Amazon

3. Duplicate elements : 

Meta
Apple
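
A Set collected with Collectors.toSet() gives no guarantee about iteration order. If the duplicates should be reported in the order in which they are first repeated, one possible variation is to collect into a LinkedHashSet using Collectors.toCollection(). The sketch below uses hypothetical class and method names. The lambda passed to filter() changes the HashSet while the stream runs, so it is meant for a sequential stream only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// hypothetical helper class, added for illustration only
class OrderedDuplicateFinder {

	// returns duplicates in the order in which they are first repeated
	static <T> Set<T> findDuplicates(List<T> source) {

		// remembers every element seen so far
		Set<T> seen = new HashSet<>();

		return source
				.stream() // sequential stream, because the filter below is stateful
				.filter(element -> !seen.add(element))
				.collect(Collectors.toCollection(LinkedHashSet::new));
	}

	public static void main(String[] args) {

		List<String> companies = Arrays.asList(
				"Meta", "Apple", "Amazon", "Netflix", "Meta", "Google", "Apple");

		System.out.println(findDuplicates(companies)); // [Meta, Apple]
	}
}
```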

4. Using Collectors.toMap() method :

  • Collectors.toMap() method can be used to convert Stream/List into Map with the actual Stream/List elements as Key and their count as Value
  • For Key,
    • we will use Function.identity() method or
    • lambda expression (element -> element)
  • For Value, every element is mapped to 1 and the counts for the same Key are added up using any one of the following ways

4.1 Use Math::addExact for counting duplicates :

  • Method reference Math::addExact can be used to add/sum duplicates in the Integer form

FindDuplicateCountUsingCollectorsToMap1.java

package net.bench.resources.java.stream;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// using Collectors.toMap() and Math::addExact
public class FindDuplicateCountUsingCollectorsToMap1 {

	public static void main(String[] args) {

		// 1. list of Strings
		List<String> companies = new ArrayList<String>();


		// 1.1 add string elements to List
		companies.add("Meta");
		companies.add("Apple");
		companies.add("Amazon");
		companies.add("Netflix");
		companies.add("Meta"); // duplicate
		companies.add("Google");
		companies.add("Apple"); // duplicate


		// 1.2 print original List to console
		System.out.println("1. Original List with duplicates : \n");
		companies.forEach(System.out::println);


		// 2. get duplicate count using Map
		Map<String, Integer> duplicateCountMap = companies
				.stream()
				.collect(
						Collectors.toMap(Function.identity(), company -> 1, Math::addExact)
						);


		// 2.1 print Map for duplicate count
		System.out.println("\n2. Map with Key and its duplicate count : \n");
		duplicateCountMap.forEach(
				(key, value) -> System.out.println("Key : " + key + "\t Count : " + value)
				);
	}
}

Output:

1. Original List with duplicates : 

Meta
Apple
Amazon
Netflix
Meta
Google
Apple

2. Map with Key and its duplicate count : 

Key : Netflix	 Count : 1
Key : Google	 Count : 1
Key : Meta	 Count : 2
Key : Apple	 Count : 2
Key : Amazon	 Count : 1

4.2 Use Integer::sum for counting duplicates :

  • Method reference Integer::sum can be used to add/sum duplicates in the Integer form

FindDuplicateCountUsingCollectorsToMap2.java

package net.bench.resources.java.stream;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

//using Collectors.toMap() and Integer::sum
public class FindDuplicateCountUsingCollectorsToMap2 {

	public static void main(String[] args) {

		// 1. list of Strings
		List<String> companies = new ArrayList<String>();


		// 1.1 add string elements to List
		companies.add("Meta");
		companies.add("Apple");
		companies.add("Amazon");
		companies.add("Netflix");
		companies.add("Meta"); // duplicate
		companies.add("Google");
		companies.add("Apple"); // duplicate


		// 1.2 print original List to console
		System.out.println("1. Original List with duplicates : \n");
		companies.forEach(System.out::println);


		// 2. get duplicate count using Map
		Map<String, Integer> duplicateCountMap = companies
				.stream()
				.collect(
						Collectors.toMap(Function.identity(), company -> 1, Integer::sum)
						);


		// 2.1 print Map for duplicate count
		System.out.println("\n2. Map with Key and its duplicate count : \n");
		duplicateCountMap.forEach(
				(key, value) -> System.out.println("Key : " + key + "\t Count : " + value)
				);
	}
}

Output:

1. Original List with duplicates : 

Meta
Apple
Amazon
Netflix
Meta
Google
Apple

2. Map with Key and its duplicate count : 

Key : Netflix	 Count : 1
Key : Google	 Count : 1
Key : Meta	 Count : 2
Key : Apple	 Count : 2
Key : Amazon	 Count : 1
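
Math::addExact and Integer::sum produce the same counts in the examples above. They differ only when the int range is exceeded: Integer.sum() wraps around silently, whereas Math.addExact() throws ArithmeticException. The small demo below, with a hypothetical class name, illustrates this difference.

```java
// hypothetical demo class, added for illustration only
class AddExactVersusIntegerSum {

	// returns true if Math.addExact rejects the addition because of int overflow
	static boolean addExactOverflows(int a, int b) {
		try {
			Math.addExact(a, b);
			return false;
		} catch (ArithmeticException ex) {
			return true;
		}
	}

	public static void main(String[] args) {

		// Integer.sum wraps around to Integer.MIN_VALUE on overflow
		System.out.println(Integer.sum(Integer.MAX_VALUE, 1) == Integer.MIN_VALUE); // true

		// Math.addExact reports the overflow as an exception
		System.out.println(addExactOverflows(Integer.MAX_VALUE, 1)); // true

		// no overflow for small counts
		System.out.println(addExactOverflows(1, 1)); // false
	}
}
```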

4.3 Use Long::sum for counting duplicates :

  • Method reference Long::sum can be used to add/sum duplicates in the Long form

FindDuplicateCountUsingCollectorsToMap3.java

package net.bench.resources.java.stream;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

//using Collectors.toMap() and Long::sum
public class FindDuplicateCountUsingCollectorsToMap3 {

	public static void main(String[] args) {

		// 1. list of Strings
		List<String> companies = new ArrayList<String>();


		// 1.1 add string elements to List
		companies.add("Meta");
		companies.add("Apple");
		companies.add("Amazon");
		companies.add("Netflix");
		companies.add("Meta"); // duplicate
		companies.add("Google");
		companies.add("Apple"); // duplicate


		// 1.2 print original List to console
		System.out.println("1. Original List with duplicates : \n");
		companies.forEach(System.out::println);


		// 2. get duplicate count using Map
		Map<String, Long> duplicateCount = companies
				.stream()
				.collect(
						Collectors.toMap(Function.identity(), company -> 1L, Long::sum)
						);


		// 2.1 print Map for duplicate count
		System.out.println("\n2. Map with Key and its duplicate count : \n");
		duplicateCount.forEach(
				(key, value) -> System.out.println("Key : " + key + "\t Count : " + value)
				);
	}
}

Output:

1. Original List with duplicates : 

Meta
Apple
Amazon
Netflix
Meta
Google
Apple

2. Map with Key and its duplicate count : 

Key : Netflix	 Count : 1
Key : Google	 Count : 1
Key : Meta	 Count : 2
Key : Apple	 Count : 2
Key : Amazon	 Count : 1
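
All three examples of this section use Function.identity() for the Key. The lambda expression (element -> element) mentioned at the start of this section can be used in the same place. A minimal sketch with hypothetical class and method names, showing that both forms build the same Map:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// hypothetical demo class, added for illustration only
class ToMapKeyMapperVariants {

	// Key supplied by Function.identity()
	static Map<String, Integer> countWithIdentity(List<String> source) {
		return source
				.stream()
				.collect(Collectors.toMap(Function.identity(), element -> 1, Integer::sum));
	}

	// Key supplied by lambda expression
	static Map<String, Integer> countWithLambda(List<String> source) {
		return source
				.stream()
				.collect(Collectors.toMap(element -> element, element -> 1, Integer::sum));
	}

	public static void main(String[] args) {

		List<String> companies = Arrays.asList(
				"Meta", "Apple", "Amazon", "Netflix", "Meta", "Google", "Apple");

		// both Maps hold the same Key and count pairs
		System.out.println(countWithIdentity(companies).equals(countWithLambda(companies))); // true
	}
}
```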

5. Using Collectors.groupingBy() method :

  • Collectors.groupingBy() method accepts 2 input-arguments,
    • 1st input-argument is the classifier function whose result is used as Key
    • 2nd input-argument is the downstream Collector whose result, the duplicate count, is stored as Value
  • So basically Collectors.groupingBy() method is used to convert Stream/List into Map according to classification/category
  • For Key,
    • we will use Function.identity() method or
    • lambda expression (element -> element)
  • For counting duplicates, we can use either of the below methods,
    1. Collectors.counting() method
    2. Collectors.summingInt() method

5.1 Use Collectors.counting() method for counting duplicates :

  • Collectors.counting() method counts duplicates

FindDuplicateCountUsingGroupingByAndCounting1.java

package net.bench.resources.java.stream;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

// using Collectors.groupingBy() and Collectors.counting()
public class FindDuplicateCountUsingGroupingByAndCounting1 {

	public static void main(String[] args) {

		// 1. list of Strings
		List<String> companies = new ArrayList<String>();


		// 1.1 add string elements to List
		companies.add("Meta");
		companies.add("Apple");
		companies.add("Amazon");
		companies.add("Netflix");
		companies.add("Meta"); // duplicate
		companies.add("Google");
		companies.add("Apple"); // duplicate


		// 1.2 print original List to console
		System.out.println("1. Original List with duplicates : \n");
		companies.forEach(System.out::println);


		// 2. get unique elements
		Set<String> distinctCompanies = companies
				.stream()
				.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
				.entrySet()
				.stream()
				.map(Map.Entry::getKey)
				.collect(Collectors.toSet());


		// 2.1 print unique elements
		System.out.println("\n2. Unique elements : \n");
		distinctCompanies.forEach(System.out::println);


		// 3. get duplicate elements
		Set<String> duplicateCompanies = companies
				.stream()
				.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
				.entrySet()
				.stream()
				.filter(company -> company.getValue() > 1)
				.map(Map.Entry::getKey)
				.collect(Collectors.toSet());


		// 3.1 print duplicate elements
		System.out.println("\n3. Duplicate elements : \n");
		duplicateCompanies.forEach(System.out::println);


		// 4. get duplicate count using Map
		Map<String, Long> duplicateCount = companies
				.stream()
				.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));


		// 4.1 print Map for duplicate count
		System.out.println("\n4. Map with Key and its duplicate count : \n");
		duplicateCount.forEach(
				(key, value) -> System.out.println("Key : " + key + "\t Count : " + value)
				);
	}
}

Output:

1. Original List with duplicates : 

Meta
Apple
Amazon
Netflix
Meta
Google
Apple

2. Unique elements : 

Netflix
Google
Meta
Apple
Amazon

3. Duplicate elements : 

Meta
Apple

4. Map with Key and its duplicate count : 

Key : Netflix	 Count : 1
Key : Google	 Count : 1
Key : Meta	 Count : 2
Key : Apple	 Count : 2
Key : Amazon	 Count : 1

5.2 Use Collectors.summingInt() method for counting duplicates :

  • Collectors.summingInt() method counts duplicates by adding/increasing value by 1 for duplicate identity/Key

FindDuplicateCountUsingGroupingByAndCounting2.java

package net.bench.resources.java.stream;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

// using Collectors.groupingBy() and Collectors.summingInt()
public class FindDuplicateCountUsingGroupingByAndCounting2 {

	public static void main(String[] args) {

		// 1. list of Strings
		List<String> companies = new ArrayList<String>();


		// 1.1 add string elements to List
		companies.add("Meta");
		companies.add("Apple");
		companies.add("Amazon");
		companies.add("Netflix");
		companies.add("Meta"); // duplicate
		companies.add("Google");
		companies.add("Apple"); // duplicate


		// 1.2 print original List to console
		System.out.println("1. Original List with duplicates : \n");
		companies.forEach(System.out::println);


		// 2. get unique elements
		Set<String> distinctCompanies = companies
				.stream()
				.collect(Collectors.groupingBy(Function.identity(), Collectors.summingInt(c -> 1)))
				.entrySet()
				.stream()
				.map(Map.Entry::getKey)
				.collect(Collectors.toSet());


		// 2.1 print unique elements
		System.out.println("\n2. Unique elements : \n");
		distinctCompanies.forEach(System.out::println);


		// 3. get duplicate elements
		Set<String> duplicateCompanies = companies
				.stream()
				.collect(Collectors.groupingBy(Function.identity(), Collectors.summingInt(c -> 1)))
				.entrySet()
				.stream()
				.filter(company -> company.getValue() > 1)
				.map(Map.Entry::getKey)
				.collect(Collectors.toSet());


		// 3.1 print duplicate elements
		System.out.println("\n3. Duplicate elements : \n");
		duplicateCompanies.forEach(System.out::println);


		// 4. get duplicate count using Map
		Map<String, Integer> duplicateCount = companies
				.stream()
				.collect(Collectors.groupingBy(Function.identity(), Collectors.summingInt(c -> 1)));


		// 4.1 print Map for duplicate count
		System.out.println("\n4. Map with Key and its duplicate count : \n");
		duplicateCount.forEach(
				(key, value) -> System.out.println("Key : " + key + "\t Count : " + value)
				);
	}
}

Output:

1. Original List with duplicates : 

Meta
Apple
Amazon
Netflix
Meta
Google
Apple

2. Unique elements : 

Netflix
Google
Meta
Apple
Amazon

3. Duplicate elements : 

Meta
Apple

4. Map with Key and its duplicate count : 

Key : Netflix	 Count : 1
Key : Google	 Count : 1
Key : Meta	 Count : 2
Key : Apple	 Count : 2
Key : Amazon	 Count : 1
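
The Maps above come back in HashMap order and also contain elements that occur only once. As one possible variation, Collectors.groupingBy() also has a 3-argument form that accepts a Map supplier, so a LinkedHashMap can keep the Keys in the order of the Original List, and entries with a count below 2 can then be dropped. The class and method names below are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// hypothetical helper class, added for illustration only
class DuplicateCountsInEncounterOrder {

	// returns only the duplicated elements with their count, in order of first occurrence
	static Map<String, Long> duplicateCounts(List<String> source) {

		// LinkedHashMap keeps the Keys in insertion order
		Map<String, Long> counts = source
				.stream()
				.collect(Collectors.groupingBy(
						Function.identity(), LinkedHashMap::new, Collectors.counting()));

		// drop every element that occurs only once
		counts.entrySet().removeIf(entry -> entry.getValue() < 2);

		return counts;
	}

	public static void main(String[] args) {

		List<String> companies = Arrays.asList(
				"Meta", "Apple", "Amazon", "Netflix", "Meta", "Google", "Apple");

		System.out.println(duplicateCounts(companies)); // {Meta=2, Apple=2}
	}
}
```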

6. Using Map object and Collection.forEach() method :

  • Create HashMap object to store String element as Key and their respective duplicate count as Value
  • Note: HashMap doesn’t allow duplicate Key

6.1 Use Map.getOrDefault() method :

  • Iterate through the original List and store/put each element into the newly created HashMap to get unique elements as Key and their respective duplicate count as Value
  • At the time of iterating original list,
    • For Key, store unique element from List
    • For Value, read the current count using Map’s getOrDefault() method with 0 as default, add 1 to it and put it back, so that the count is 1 for the first occurrence and increases by 1 for each duplicate

FindDuplicateCountUsingMapAndForEach1.java

package net.bench.resources.java.stream;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// using Map object and Collection.forEach() method
public class FindDuplicateCountUsingMapAndForEach1 {

	public static void main(String[] args) {

		// 1. list of Strings
		List<String> companies = new ArrayList<String>();


		// 1.1 add string elements to List
		companies.add("Meta");
		companies.add("Apple");
		companies.add("Amazon");
		companies.add("Netflix");
		companies.add("Meta"); // duplicate
		companies.add("Google");
		companies.add("Apple"); // duplicate


		// 1.2 print original List to console
		System.out.println("1. Original List with duplicates : \n");
		companies.forEach(System.out::println);


		// 2. create HashMap object
		Map<String, Integer> duplicateCountMap = new HashMap<>();


		// 2.1 iterate and store duplicate count into Map object
		companies.forEach(company -> duplicateCountMap.put(company,
				duplicateCountMap.getOrDefault(company, 0) + 1));


		// 2.2 print to console
		System.out.println("\n2. Map with Key and its duplicate count : \n");
		System.out.println(duplicateCountMap);
	}
}

Output:

1. Original List with duplicates : 

Meta
Apple
Amazon
Netflix
Meta
Google
Apple

2. Map with Key and its duplicate count : 

{Netflix=1, Meta=2, Google=1, Apple=2, Amazon=1}

6.2 Use Map.merge() method and lambda for counting duplicates :

  • Use Map’s merge() method to store/put into newly created HashMap to get unique elements as Key and their respective duplicate count as Value
  • At the time of iterating original list,
    • For Key, store unique element from List
    • For Value, start with 1 as count and use lambda expression (a, b) -> a + b for counting duplicates by adding/summing

FindDuplicateCountUsingMapAndForEach2.java

package net.bench.resources.java.stream;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

//using Map object and Collection.forEach() method
public class FindDuplicateCountUsingMapAndForEach2 {

	public static void main(String[] args) {

		// 1. list of Strings
		List<String> companies = new ArrayList<String>();


		// 1.1 add string elements to List
		companies.add("Meta");
		companies.add("Apple");
		companies.add("Amazon");
		companies.add("Netflix");
		companies.add("Meta"); // duplicate
		companies.add("Google");
		companies.add("Apple"); // duplicate


		// 1.2 print original List to console
		System.out.println("1. Original List with duplicates : \n");
		companies.forEach(System.out::println);


		// 2. create HashMap object
		Map<String, Integer> duplicateCountMap = new HashMap<>();


		// 2.1 iterate and store duplicate count into Map object
		companies.forEach(company -> duplicateCountMap.merge(company, 1, (a, b) -> a + b));


		// 2.2 print to console
		System.out.println("\n2. Map with Key and its duplicate count : \n");
		System.out.println(duplicateCountMap);
	}
}

Output:

1. Original List with duplicates : 

Meta
Apple
Amazon
Netflix
Meta
Google
Apple

2. Map with Key and its duplicate count : 

{Netflix=1, Google=1, Meta=2, Apple=2, Amazon=1}
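
To see what Map.merge() does on each call, the value it returns can be recorded. For an absent Key, merge() stores the given value 1 without calling the lambda expression. For a Key that is already present, the lambda expression receives the old count and the given value 1, and their sum is stored. A small trace with hypothetical class and method names:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// hypothetical demo class, added for illustration only
class MapMergeTrace {

	// returns the value that merge() reports after each element is processed
	static List<Integer> traceMerge(List<String> elements) {

		Map<String, Integer> counts = new HashMap<>();
		List<Integer> returnedValues = new ArrayList<>();

		for (String element : elements) {
			// merge() returns the new count stored for this Key
			returnedValues.add(counts.merge(element, 1, (a, b) -> a + b));
		}

		return returnedValues;
	}

	public static void main(String[] args) {

		// Meta : 1st time, Apple : 1st time, Meta : 2nd time, Meta : 3rd time
		System.out.println(traceMerge(Arrays.asList("Meta", "Apple", "Meta", "Meta"))); // [1, 1, 2, 3]
	}
}
```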

6.3 Use Map.merge() and Integer::sum for counting duplicates :

  • Use Map’s merge() method to store/put into newly created HashMap to get unique elements as Key and their respective duplicate count as Value
  • At the time of iterating original list,
    • For Key, store unique element from List
    • For Value, use method reference Integer::sum for counting duplicates

FindDuplicateCountUsingMapAndForEach3.java

package net.bench.resources.java.stream;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

//using Map object and Collection.forEach() method
public class FindDuplicateCountUsingMapAndForEach3 {

	public static void main(String[] args) {

		// 1. list of Strings
		List<String> companies = new ArrayList<String>();


		// 1.1 add string elements to List
		companies.add("Meta");
		companies.add("Apple");
		companies.add("Amazon");
		companies.add("Netflix");
		companies.add("Meta"); // duplicate
		companies.add("Google");
		companies.add("Apple"); // duplicate


		// 1.2 print original List to console
		System.out.println("1. Original List with duplicates : \n");
		companies.forEach(System.out::println);


		// 2. create HashMap object
		Map<String, Integer> duplicateCountMap = new HashMap<>();


		// 2.1 iterate and store duplicate count into Map object
		companies.forEach(company -> duplicateCountMap.merge(company, 1, Integer::sum));


		// 2.2 print to console
		System.out.println("\n2. Map with Key and its duplicate count : \n");
		System.out.println(duplicateCountMap);
	}
}

Output:

1. Original List with duplicates : 

Meta
Apple
Amazon
Netflix
Meta
Google
Apple

2. Map with Key and its duplicate count : 

{Netflix=1, Google=1, Meta=2, Apple=2, Amazon=1}

Happy Coding !!
Happy Learning !!
