How to remove duplicate lines using awk?
— Kaushal Modi

If you type echo "Hi\nHow\nHi\nAre\nHi\nYou?\nAre", you will get this in your terminal:
Hi
How
Hi
Are
Hi
You?
Are
Here’s how we can remove the duplicate lines using awk:
echo "Hi\nHow\nHi\nAre\nHi\nYou?\nAre" | awk '\!x[$0]++'
The above will give this output:
Hi
How
Are
You?
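The escape char \ is required for ! in tcsh, because tcsh performs history substitution on ! even inside single quotes. In shells such as bash, where single quotes already protect !, leave the backslash out.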
This is how that awk snippet works:
- Initially the x array will be empty.
- When $0 is Hi, x[$0] = x[Hi] = 0. So !x[Hi] will be True and it will be printed out.
- After that, x[Hi] becomes 1 because of the ++ increment operator.
- Next time, when $0 == Hi, as x[Hi] == 1, !x[Hi] will be False and so $0 won’t be printed out.
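The steps above can also be written out longhand. The following is only a sketch of the same logic, not a replacement for the one-liner: the function name dedupe_lines is made up for illustration, and it assumes a POSIX-style shell (sh or bash rather than tcsh), so no backslash is needed before any !. It uses printf instead of echo so that the \n sequences are turned into newlines regardless of the shell. In the one-liner, the expression is used as a pattern with no action, and awk prints the line whenever such a pattern is true; here that implicit print is spelled out.

```shell
# Hypothetical helper (name chosen for illustration): a longhand
# version of awk '!x[$0]++' for a POSIX-style shell.
dedupe_lines() {
  awk '{
    # An array element that was never set compares equal to 0,
    # so this test is true only the first time a line is seen.
    if (x[$0] == 0) {
      print $0
    }
    # Count the line so that later copies are skipped.
    x[$0]++
  }'
}

# Same input as in the post.
printf 'Hi\nHow\nHi\nAre\nHi\nYou?\nAre\n' | dedupe_lines
```

Like the one-liner, this keeps the first occurrence of each line and preserves the original order, at the cost of holding every distinct line in memory.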