How to remove duplicate lines using awk?
— Kaushal Modi

If you type echo "Hi\nHow\nHi\nAre\nHi\nYou?\nAre", you will get this in your terminal:
Hi
How
Hi
Are
Hi
You?
Are
Here’s how we can remove the duplicate lines using awk:
echo "Hi\nHow\nHi\nAre\nHi\nYou?\nAre" | awk '\!x[$0]++'
The above will give this output:
Hi
How
Are
You?
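The escape char \ is required for ! in tcsh, where ! triggers history substitution even inside single quotes. In bash and other Bourne-style shells, single quotes already protect the !, so drop the backslash and write awk '!x[$0]++'; there the backslash would be passed to awk literally.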
This is how that awk snippet works:
- Initially the x array will be empty.
- When $0 is Hi, x[$0] = x[Hi] = 0. So !x[Hi] will be True and it will be printed out.
- After that the x[Hi] becomes 1 because of the ++ increment operator.
- Next time when $0 == Hi, as x[Hi] == 1, !x[Hi] will be False and so $0 won’t be printed out.
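
The steps above can also be spelled out as a longer awk program. This is only a sketch of the same logic, not a replacement for the one-liner. It assumes a Bourne-style shell (sh, bash), so no backslash is needed, and it uses printf because echo handles \n differently across shells. The names dedupe and seen are made up for illustration; seen plays the role of x.

```shell
# Expanded sketch of: awk '!x[$0]++'
# "dedupe" and "seen" are illustrative names, not part of the original one-liner.
dedupe() {
    awk '{
        # An unset array element compares equal to 0,
        # so this is true only the first time a line is seen.
        if (seen[$0] == 0) {
            print $0
        }
        # Count the occurrence so later copies are skipped.
        seen[$0]++
    }'
}

printf 'Hi\nHow\nHi\nAre\nHi\nYou?\nAre\n' | dedupe
```

As with the one-liner, the first occurrence of each line is kept and the original order is preserved, so the input does not need to be sorted first.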