Linux, by default, comes with a huge collection of handy and powerful tools that can make life a ton easier. Does your workload require lots of manipulation of text files and strings? Then you may already be familiar with the “uniq” command. It’s quite common in that type of work.
Are you new to “uniq”? Don’t worry! Let’s have a look at “uniq” and its usage.
What “uniq” is
“uniq” is one of the many command-line tools that come with Linux. Its purpose is to report or omit repeated lines in a file.
The tool filters adjacent matching lines from INPUT (a file, or standard input by default) and writes the result to OUTPUT (standard output by default). If no option is specified, each run of identical adjacent lines is collapsed into its first occurrence.
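To get a feel for the “adjacent lines” rule, here is a minimal sketch that pipes a few made-up lines into “uniq”. The sample words are placeholders, not the contents of “demo.txt”.

```shell
# Hypothetical sample input; any lines would do.
# The two adjacent "apple" lines collapse into one, but the later "apple"
# survives because "banana" sits between it and the first pair.
collapsed=$(printf 'apple\napple\nbanana\napple\n' | uniq)
printf '%s\n' "$collapsed"
# apple
# banana
# apple
```

Note that the last “apple” is still there: “uniq” never looks further back than the previous line.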
First, let’s grab a text file to work with. For demo purposes, my “demo.txt” contains a number of duplicate entries.
Omitting duplicates in a file
Note: Instead of “cat”, I’m using “bat”, a beautiful clone of “cat” with additional advanced features.
Now, run “uniq” on the file:
uniq demo.txt
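If you don’t have a suitable file at hand, the following sketch builds a small throwaway one and runs plain “uniq” on it. The file name “sample.txt” and its contents are stand-ins invented for illustration, since the actual contents of “demo.txt” are not listed here.

```shell
# Build a small throwaway file (hypothetical contents) in a temp directory.
workdir=$(mktemp -d)
printf 'Viktor\nViktor\nLinux\nLinux\nLinux\nuniq\n' > "$workdir/sample.txt"

# With no flags, each run of identical adjacent lines is printed once.
plain=$(uniq "$workdir/sample.txt")
printf '%s\n' "$plain"
# Viktor
# Linux
# uniq

# Clean up the temp directory.
rm -rf "$workdir"
```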
Displaying the number of repetitions
With the “-c” flag, “uniq” prefixes each line with the number of times it occurred in a row.
bat demo.txt
uniq -c demo.txt
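As a rough illustration of what the counts look like, here is a sketch with made-up input (the colour names are placeholders, not the demo file).

```shell
# Hypothetical input: "red" twice, "green" once, "blue" three times.
counts=$(printf 'red\nred\ngreen\nblue\nblue\nblue\n' | uniq -c)
printf '%s\n' "$counts"
# Each output line is the count followed by the line itself:
#   2 red
#   1 green
#   3 blue
# The count is left-padded with spaces; the exact width depends on
# which "uniq" implementation you have.
```

If you feed the counts to another program, stripping the leading spaces first (for example with “sed 's/^ *//'”) tends to make parsing easier.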
Printing the “duplicates” ONLY
Need to see the duplicates ONLY? Then use the “-d” flag with “uniq”. It prints one copy of each line that is repeated and leaves out everything else.
bat demo.txt
uniq -d demo.txt
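A small sketch with placeholder input shows which lines survive “-d”.

```shell
# Hypothetical input: only "red" and "blue" repeat on adjacent lines.
dups=$(printf 'red\nred\ngreen\nblue\nblue\nblue\n' | uniq -d)
printf '%s\n' "$dups"
# red
# blue
```

“green” appears only once, so it is dropped, and each repeated line is shown a single time no matter how often it repeats.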
Printing the “unique” ONLY
If you need to work with the unique lines ONLY, that is, the lines that are never repeated, then you should use the “-u” flag.
bat demo.txt
uniq -u demo.txt
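Here is the mirror image of “-d” as a short sketch, again with made-up input.

```shell
# Hypothetical input: "green" is the only line that is not repeated.
uniques=$(printf 'red\nred\ngreen\nblue\nblue\nblue\n' | uniq -u)
printf '%s\n' "$uniques"
# green
```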
Disable case sensitivity
When “uniq” compares lines, all the characters are case-sensitive, so “Viktor” and “viktor” are different entries. Need to disable the case sensitivity? Then use the “-i” flag.
bat demo.txt
uniq -i demo.txt
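The difference is easy to see side by side. This sketch assumes a “uniq” that supports “-i” (GNU and BSD versions do) and uses placeholder input.

```shell
# Hypothetical input: the same name in three different capitalisations.
sensitive=$(printf 'Viktor\nviktor\nVIKTOR\nLinux\n' | uniq)
insensitive=$(printf 'Viktor\nviktor\nVIKTOR\nLinux\n' | uniq -i)

printf '%s\n' "$sensitive"
# Viktor
# viktor
# VIKTOR
# Linux

printf '%s\n' "$insensitive"
# Viktor
# Linux
```

With “-i”, the first spelling in each group is the one that gets printed.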
Sorting and finding duplicates
In most cases, the duplicates are scattered throughout the file instead of sitting next to each other. Because “uniq” only compares adjacent lines, running it directly on such a file can be misleading and confusing. So, what to do?
Well, sort the file contents first, and then do the “uniq” thing!
bat demo.txt
sort demo.txt | uniq -c
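To illustrate why sorting matters, the sketch below runs “uniq” on the same made-up list twice, once as-is and once after “sort”. The fruit names are placeholders.

```shell
# Hypothetical input with duplicates scattered, never adjacent.
scattered='pear
apple
pear
fig
apple
pear'

# Without sorting, no two neighbouring lines match, so nothing is collapsed.
unsorted=$(printf '%s\n' "$scattered" | uniq)
printf '%s\n' "$unsorted" | wc -l
# 6 lines come out, same as went in

# After sorting, identical lines are neighbours and get counted together.
sorted_counts=$(printf '%s\n' "$scattered" | sort | uniq -c)
printf '%s\n' "$sorted_counts"
#   2 apple
#   1 fig
#   3 pear
```

If you only want the deduplicated list without counts, “sort -u” gives the same result as “sort | uniq” in a single step.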
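Skipping the first few characters
In some cases, you may need to ignore the first few characters of each line when comparing. In that case, use the “-s” flag followed by the number of characters you’d like to skip. The skipped characters are only ignored during the comparison; they still show up in the OUTPUT.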
uniq -s 3 demo.txt
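A typical case is a numbered or prefixed list. The sketch below assumes every line starts with a 3-character prefix (two digits and a space); the entries themselves are made up.

```shell
# Hypothetical input: each line starts with a 3-character prefix ("01 ", "02 ", ...).
# With -s 3 the prefix is ignored, so "01 login" and "02 login" count as equal.
skipped=$(printf '01 login\n02 login\n03 logout\n04 login\n' | uniq -s 3)
printf '%s\n' "$skipped"
# 01 login
# 03 logout
# 04 login
```

The prefixes are still printed, and “04 login” stays because it is not adjacent to the earlier “login” lines.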
Exporting the output to a file
It’s quite easy to export the output of any command into a file: just redirect it with “>”.
uniq -u demo.txt > demo-1.txt
bat demo-1.txt
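Here is a self-contained sketch of the same idea using throwaway files in a temp directory. The file names and contents are invented for illustration. It also shows that “uniq” accepts an output file as a second argument, which does the same job without “>”.

```shell
# Build a small sample file (hypothetical contents) in a temp directory.
workdir=$(mktemp -d)
printf 'red\nred\ngreen\nblue\nblue\n' > "$workdir/sample.txt"

# Option 1: shell redirection.
uniq -u "$workdir/sample.txt" > "$workdir/unique-1.txt"
saved=$(cat "$workdir/unique-1.txt")

# Option 2: pass the output file as the second argument to uniq.
uniq -u "$workdir/sample.txt" "$workdir/unique-2.txt"
saved_again=$(cat "$workdir/unique-2.txt")

printf '%s\n' "$saved" "$saved_again"
# green
# green

# Clean up the temp directory.
rm -rf "$workdir"
```

Whichever form you pick, write to a different file than the one you are reading. Redirecting onto the input file itself empties it before “uniq” gets a chance to read it.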