In this article, we will count and print the number of repeated word occurrences in a String.
From a given String, we will count and print:
- Each repeated word
- Along with its count
Steps for counting repeated word occurrences:
- Create an empty HashMap with String keys and Integer values
- Split the String using space as the delimiter and assign the result to a String[]
- Iterate through the String[] array using a for-each loop
- Note: we convert every word to lowercase before checking, so that the comparison is case-insensitive
- Check whether a particular word is already present in the HashMap using the containsKey(k) method of the Map interface
- If it is present, increase its count value by 1 using the put(K, V) method of Map
- Otherwise, insert it using the put() method of Map with a count value of 1
- Finally, print the Map by iterating over its keySet() or entrySet() (both are methods of the Map interface; entrySet() returns Map.Entry objects)
- Code the sorting logic for printing count values in descending order using the Comparator interface
- Print again after sorting
ReadCountPrintRepeatedWordOccurencesInString.java
package in.bench.resources.count.print.occurences;

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Set;

public class ReadCountPrintRepeatedWordOccurencesInString {

    // main() method - entry point to start execution
    public static void main(String[] args) {

        // sample test string
        String testStr = "Science blank Maths blank blank"
                + " Physics blank Maths";

        // invoke to count & print for supplied string
        countAndPrintRepeatedWordOccurences(testStr);
    }

    /**
     * this method is used
     * to count number of repeated word occurrences
     * @param strContent string whose words are counted
     */
    public static void countAndPrintRepeatedWordOccurences(
            String strContent) {

        // Step 1: create Map of String-Integer
        Map<String, Integer> mapOfRepeatedWord =
                new HashMap<String, Integer>();

        // Step 2: split line using space as delimiter
        String[] words = strContent.split(" ");

        // Step 3: iterate through String[] array
        for (String word : words) {

            // Step 4: convert all String into lower case,
            // before comparison
            String tempLCword = word.toLowerCase();

            // Step 5: check whether Map contains particular word
            if (mapOfRepeatedWord.containsKey(tempLCword)) {

                // Step 6: If contains, increase count value by 1
                mapOfRepeatedWord.put(tempLCword,
                        mapOfRepeatedWord.get(tempLCword) + 1);
            } else {

                // Step 7: otherwise, make a new entry
                mapOfRepeatedWord.put(tempLCword, 1);
            }
        }

        System.out.println("Before sorting : \n");
        System.out.println("Words" + "\t\t" + "Count");
        System.out.println("======" + "\t\t" + "=====");

        // Step 8: print word along with its count
        for (Map.Entry<String, Integer> entry :
                mapOfRepeatedWord.entrySet()) {
            System.out.println(entry.getKey()
                    + "\t\t" + entry.getValue());
        }

        // Step 9: Sorting logic by invoking sortByCountValue()
        Map<String, Integer> wordLHMap = sortByCountValue(
                mapOfRepeatedWord);

        System.out.println("\n\nAfter sorting"
                + " in descending order of count : \n");
        System.out.println("Words" + "\t\t" + "Count");
        System.out.println("======" + "\t\t" + "=====");

        // Step 10: Again print after sorting
        for (Map.Entry<String, Integer> entry :
                wordLHMap.entrySet()) {
            System.out.println(entry.getKey()
                    + "\t\t" + entry.getValue());
        }
    }

    /**
     * this method sorts according to count value
     * @param mapOfRepeatedWord map of word to its count
     * @return LinkedHashMap ordered by descending count
     */
    public static Map<String, Integer> sortByCountValue(
            Map<String, Integer> mapOfRepeatedWord) {

        // get entrySet from HashMap object
        Set<Map.Entry<String, Integer>> setOfWordEntries =
                mapOfRepeatedWord.entrySet();

        // convert HashMap to List of Map entries
        List<Map.Entry<String, Integer>> listOfwordEntry =
                new ArrayList<Map.Entry<String, Integer>>(
                        setOfWordEntries);

        // sort list of entries using Collections.sort(ls, cmptr);
        Collections.sort(listOfwordEntry,
                new Comparator<Map.Entry<String, Integer>>() {

            @Override
            public int compare(Entry<String, Integer> es1,
                    Entry<String, Integer> es2) {
                // es2 compared to es1 gives descending order
                return es2.getValue().compareTo(es1.getValue());
            }
        });

        // store into LinkedHashMap for maintaining insertion order
        Map<String, Integer> wordLHMap =
                new LinkedHashMap<String, Integer>();

        // iterating list and storing in LinkedHashMap
        for (Map.Entry<String, Integer> map : listOfwordEntry) {
            wordLHMap.put(map.getKey(), map.getValue());
        }

        return wordLHMap;
    }
}
Output:
Before sorting :

Words           Count
======          =====
blank           4
maths           2
science         1
physics         1


After sorting in descending order of count :

Words           Count
======          =====
blank           4
maths           2
science         1
physics         1
Note: Stop at Step 8 if there is no business requirement to sort the result in either ascending or descending order.
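If ascending order is needed instead, only the comparison inside the Comparator has to change: compare es1 to es2 rather than es2 to es1. The following is a minimal sketch of that variant. The class name AscendingSortSketch and the method name sortByCountAscending are hypothetical and used only for illustration; they are not part of the program above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class name, used only for this illustration
class AscendingSortSketch {

    // Returns a new LinkedHashMap whose entries are ordered by ascending count
    static Map<String, Integer> sortByCountAscending(
            Map<String, Integer> counts) {

        // copy the entries into a List so that they can be sorted
        List<Map.Entry<String, Integer>> entries =
                new ArrayList<Map.Entry<String, Integer>>(counts.entrySet());

        Collections.sort(entries,
                new Comparator<Map.Entry<String, Integer>>() {

            @Override
            public int compare(Map.Entry<String, Integer> es1,
                    Map.Entry<String, Integer> es2) {
                // es1 compared to es2 gives ascending order;
                // swapping the two gives descending order
                return es1.getValue().compareTo(es2.getValue());
            }
        });

        // LinkedHashMap keeps the sorted (insertion) order
        Map<String, Integer> sorted = new LinkedHashMap<String, Integer>();
        for (Map.Entry<String, Integer> entry : entries) {
            sorted.put(entry.getKey(), entry.getValue());
        }
        return sorted;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<String, Integer>();
        counts.put("blank", 4);
        counts.put("science", 1);
        counts.put("maths", 2);

        // prints {science=1, maths=2, blank=4}
        System.out.println(sortByCountAscending(counts));
    }
}
```

Collections.sort is stable, so words with equal counts keep the relative order they had in the input map.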
Reading from file in Java 1.7 version:
- In the above example, we counted repeated words from String content
- Similarly, we can read a file from a local drive location and count the number of repeated words
- While doing so, we need to provide catch blocks for FileNotFoundException and IOException, as we are dealing with files
- We will use the try-with-resources statement introduced in Java 1.7, as it handles resource management automatically, i.e.:
- It auto-closes opened resources, so no explicit close inside a finally block (after the necessary null-safety checks) is required
- Thus, it improves readability of the code by reducing the number of lines of code
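To make the comparison with an explicit finally block concrete, here is a minimal sketch that reads lines in both styles. It assumes a StringReader in place of a FileReader so that it runs without a file on disk; the class name TryWithResourcesSketch and both method names are hypothetical and used only for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical class name, used only for this illustration
class TryWithResourcesSketch {

    // Pre-Java 1.7 style: the reader is closed explicitly in a finally block
    static int countLinesWithFinally(String text) throws IOException {
        BufferedReader reader = null;
        try {
            reader = new BufferedReader(new StringReader(text));
            int lines = 0;
            while (reader.readLine() != null) {
                lines++;
            }
            return lines;
        } finally {
            // null-safety check before the explicit close
            if (reader != null) {
                reader.close();
            }
        }
    }

    // Java 1.7 style: the reader is closed automatically
    // when the try block exits, normally or with an exception
    static int countLinesWithResources(String text) throws IOException {
        try (BufferedReader reader =
                new BufferedReader(new StringReader(text))) {
            int lines = 0;
            while (reader.readLine() != null) {
                lines++;
            }
            return lines;
        }
    }

    public static void main(String[] args) throws IOException {
        String text = "Science blank\nMaths blank";
        System.out.println(countLinesWithFinally(text));   // prints 2
        System.out.println(countLinesWithResources(text)); // prints 2
    }
}
```

Both methods return the same result; the second one simply drops the finally block and the null check.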
Sample text file:
ReadingFromFileInJava7.java
package in.bench.resources.count.print.occurences;

import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Set;

public class ReadingFromFileInJava7 {

    public static void main(String[] args) {

        // invoke to count & print for supplied file
        countAndPrintRepeatedWordOccurences("D://WORKSPACE/"
                + "TEST_WORKSPACE/Java_8_Project/BRN2.txt");
    }

    /**
     * this method is used
     * to count number of repeated word occurrences in a file
     * @param fileName path of the file to read
     */
    public static void countAndPrintRepeatedWordOccurences(
            String fileName) {

        // local variables
        String line = "";

        // Step 1: create Map of String-Integer
        Map<String, Integer> mapOfRepeatedWord =
                new HashMap<String, Integer>();

        // Step A: Read file using try-with-resources statement
        try (BufferedReader bufferedReader = new BufferedReader(
                new FileReader(fileName))) {

            // Step B: Read line from file
            while ((line = bufferedReader.readLine()) != null) {

                // Step 2: split line using space as delimiter
                String[] words = line.split(" ");

                // Step 3: iterate through String[] array
                for (String word : words) {

                    // Step 4: leave empty tokens (blank lines,
                    // repeated spaces); convert the rest into
                    // upper case before comparison
                    if (word.isEmpty()) {
                        continue;
                    }
                    String tempUCword = word.toUpperCase();

                    // Step 5: check
                    // whether Map contains same word
                    if (mapOfRepeatedWord.containsKey(tempUCword)) {

                        // Step 6: If contains,
                        // increase count value by 1
                        mapOfRepeatedWord.put(tempUCword,
                                mapOfRepeatedWord.get(tempUCword) + 1);
                    } else {

                        // Step 7: otherwise, make a new entry
                        mapOfRepeatedWord.put(tempUCword, 1);
                    }
                }
            }

            System.out.println("Before sorting : \n");
            System.out.println("Words" + "\t\t" + "Count");
            System.out.println("======" + "\t\t" + "=====");

            // Step 8: print word along with its count
            for (Map.Entry<String, Integer> entry :
                    mapOfRepeatedWord.entrySet()) {
                System.out.println(entry.getKey()
                        + "\t\t" + entry.getValue());
            }

            // Step 9: Sorting logic
            // by invoking sortByCountValue()
            Map<String, Integer> wordLHMap = sortByCountValue(
                    mapOfRepeatedWord);

            System.out.println("\n\nAfter sorting"
                    + " in descending order of count : \n");
            System.out.println("Words" + "\t\t" + "Count");
            System.out.println("======" + "\t\t" + "=====");

            // Step 10: Again print after sorting
            for (Map.Entry<String, Integer> entry :
                    wordLHMap.entrySet()) {
                System.out.println(entry.getKey()
                        + "\t\t" + entry.getValue());
            }
        } catch (FileNotFoundException fnfex) {
            fnfex.printStackTrace();
        } catch (IOException ioex) {
            ioex.printStackTrace();
        }
    }

    /**
     * this method sorts according to count value
     * @param mapOfRepeatedWord map of word to its count
     * @return LinkedHashMap ordered by descending count
     */
    public static Map<String, Integer> sortByCountValue(
            Map<String, Integer> mapOfRepeatedWord) {

        // get entrySet from HashMap object
        Set<Map.Entry<String, Integer>> setOfWordEntries =
                mapOfRepeatedWord.entrySet();

        // convert HashMap to List of Map entries
        List<Map.Entry<String, Integer>> listOfwordEntry =
                new ArrayList<Map.Entry<String, Integer>>(
                        setOfWordEntries);

        // sort list of entries using Collections.sort(ls, cmptr);
        Collections.sort(listOfwordEntry,
                new Comparator<Map.Entry<String, Integer>>() {

            @Override
            public int compare(Entry<String, Integer> es1,
                    Entry<String, Integer> es2) {
                // es2 compared to es1 gives descending order
                return es2.getValue().compareTo(es1.getValue());
            }
        });

        // store into LinkedHashMap for maintaining insertion order
        Map<String, Integer> wordLHMap =
                new LinkedHashMap<String, Integer>();

        // iterating list and storing in LinkedHashMap
        for (Map.Entry<String, Integer> map : listOfwordEntry) {
            wordLHMap.put(map.getKey(), map.getValue());
        }

        return wordLHMap;
    }
}
Output:
Before sorting :

Words           Count
======          =====
MATHS           2
BLANK           4
SCIENCE         1
PHYSICS         1


After sorting in descending order of count :

Words           Count
======          =====
BLANK           4
MATHS           2
SCIENCE         1
PHYSICS         1
Note: Stop at Step 8 if there is no business requirement to sort the result in either ascending or descending order.
We hope you found this article helpful. If you have any suggestions or want to contribute to improving this article, please share them with us. We will include that code here.
Happy Coding !!
Happy Learning !!