number_of_matches_per_file

Differences

This shows you the differences between two versions of the page.

number_of_matches_per_file [2021/10/31 17:58] – [Solution using grep and awk] admin
number_of_matches_per_file [2021/11/04 14:59] (current) – [Solution using grep and awk] admin
Line 1: Line 1:
 ===== Task ===== ===== Task =====
-Given
+Given a bunch of files in a directory, count the number of times a word occurs in each file. For example, given
 <code> <code>
 % tail -n +1 * % tail -n +1 *
Line 17: Line 18:
 </code> </code>
  
-count the number of times the word foo occurred in each file. The expected answer is
+count the number of occurrences of 'foo' in each file. The expected answer is
 <code> <code>
 junk1.txt:4 junk1.txt:4
Line 25: Line 26:
 tags | Number of matches per file tags | Number of matches per file
  
-Sample code demoes | cat with filename
+sample code demoes | cat with filename
 ===== Solution using git grep and awk ===== ===== Solution using git grep and awk =====
 If it is a git repository If it is a git repository
Line 63: Line 64:
  
 ==== References ==== ==== References ====
-  * https://stackoverflow.com/questions/39945363/frequency-count-for-file-column-in-bash - frequency counting using awk
+  * https://stackoverflow.com/questions/39945363/frequency-count-for-file-column-in-bash - count frequencies using awk
  
 ==== tags ==== ==== tags ====
Line 70: Line 71:
 ===== Solution using grep and awk ===== ===== Solution using grep and awk =====
 <code> <code>
-grep -ro "came across" * | awk -F':' '{freq[$1]++} END{for (file in freq) print file ":" freq[file]}'
+grep -ro foo * | awk -F':' '{freq[$1]++} END{for (file in freq) print file ":" freq[file]}'
 </code> </code>
  
 Useful if git is not available. Useful if git is not available.
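A minimal sketch of how the grep and awk pipeline above behaves, assuming a scratch directory with two made-up files (a.txt and b.txt are hypothetical names, not part of the original page). A trailing sort is added here only because awk does not guarantee the order of a for-in loop. Files with no match produce no line at all.

```shell
# Hypothetical sample files in a scratch directory (names invented for this sketch).
tmpdir=$(mktemp -d)
printf 'foo bar foo\nfoo\n' > "$tmpdir/a.txt"
printf 'bar\nfoo baz\n' > "$tmpdir/b.txt"

# grep -o prints one "file:match" line per occurrence, so awk only has to
# count how many lines each file name (field 1) appears on.
# sort is an addition: awk's for-in iteration order is unspecified.
result=$(cd "$tmpdir" && grep -ro foo * \
  | awk -F':' '{freq[$1]++} END{for (file in freq) print file ":" freq[file]}' \
  | sort)
echo "$result"

# Clean up the scratch directory.
rm -rf "$tmpdir"
```

The field separator ':' assumes that file names themselves contain no colon; a name with a colon would be split at the wrong place.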
-===== Solution using find and grep =====
+===== Solution using find, grep and wc =====
 <code> <code>
 find * -printf 'echo "%p:$(grep -o "foo" %p | wc -l)";' | sh find * -printf 'echo "%p:$(grep -o "foo" %p | wc -l)";' | sh
Line 103: Line 104:
 </code> </code>
  
-Note: You have to use "grep -o" instead of "grep -c". If a string occurs multiple times in a line, "grep -o" matches each of them separately. But "grep -c" counts them together. For example
+Note: You have to use "grep -o" and not "grep -c". If a string occurs multiple times in a line, "grep -o" matches each of them separately. But "grep -c" counts them together. For example
  
 <code> <code>
number_of_matches_per_file.1635703125.txt.gz · Last modified: 2021/10/31 17:58 by admin