User Tools

Site Tools


remove_lines_with_duplicate_letters

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revisionPrevious revision
Next revision
Previous revision
remove_lines_with_duplicate_letters [2023/08/11 14:40] – removed - external edit (Unknown date) 127.0.0.1remove_lines_with_duplicate_letters [2023/08/27 23:48] (current) raju
Line 1: Line 1:
 +===== Remove lines with duplicate letters =====
 +==== Task ====
 +Find five letter words that are made up of letters {h, i, k, e, r, s} where each letter appears only once.
 +==== Issue ====
 +<code>
 +grep -Ei "^[hikers]{5}$" /usr/share/dict/american-english
 +</code>
 +gives
 +<code>
 +Essie
 +Hesse
 +Irish
 +Kerri
 +Reese
 +Sheri
 +...
 +</code>
 +where some letters are duplicated in some of the words. For example, the letter 'i' appears twice in the word 'Irish'. We need to filter out all such words.
 +
 +==== Solution ====
 +Pipe the output to ''grep -viE '(.).*\1' ''. For example
 +
 +<code>
 +% grep -Ei "^[hikers]{5}$" /usr/share/dict/american-english | grep -viE "(.).*\1"
 +Sheri
 +Shrek
 +heirs
 +hiker
 +hikes
 +hires
 +sheik
 +shire
 +shirk
 +skier
 +</code>
 +
 +==== Ref ====
 +  * https://unix.stackexchange.com/questions/352927/delete-all-lines-that-contain-duplicate-letters -> comment by Stéphane Chazelas
 +    * I got the idea from here.
 +  * (Mastering Regular Expressions, Jeffrey E.F. Friedl, 2006) book -> Chapter 1 "Introduction to Regular Expressions" -> section "Parentheses and Backreferences" -> pages 20 through 22 go over this concept.
 +