Emacs, scripting and anything text oriented.

How to remove duplicate lines using awk?

Kaushal Modi

If you type echo "Hi\nHow\nHi\nAre\nHi\nYou?\nAre" in a shell whose echo expands \n to a newline, you will get this in your terminal:

Hi
How
Hi
Are
Hi
You?
Are

Here’s how we can remove the duplicate lines using awk:

echo "Hi\nHow\nHi\nAre\nHi\nYou?\nAre" | awk '\!x[$0]++'

The above will give this output:

Hi
How
Are
You?

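The backslash before ! is required in tcsh, where ! triggers history expansion even inside single quotes. In bash and other POSIX shells, single quotes already protect the !, so the backslash must be dropped there: awk '!x[$0]++'.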
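For readers who are not on tcsh, here is a minimal sketch of the same pipeline for a POSIX shell such as sh or bash. It assumes printf is available, and uses it instead of echo only because echo's handling of \n differs between shells; that substitution is not part of the original snippet.

```shell
# POSIX sh / bash: ! is not expanded inside single quotes, so no backslash is needed.
# printf is an assumption here; it expands \n the same way in every POSIX shell.
printf 'Hi\nHow\nHi\nAre\nHi\nYou?\nAre\n' | awk '!x[$0]++'
```

The first occurrence of each line is kept, in its original order.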

This is how that awk snippet works:

  • Initially, the x array is empty.
  • The first time $0 is Hi, x[$0], i.e. x["Hi"], is unset and evaluates to 0. So !x["Hi"] is true, and because an awk pattern with no action prints the current line by default, Hi is printed.
  • After the test, the ++ post-increment operator raises x["Hi"] to 1.
  • The next time $0 is Hi, x["Hi"] is 1 or more, so !x["Hi"] is false and $0 is not printed.
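The steps above can be spelled out as an explicit awk program. This is only a sketch of what the one-liner does: the function name dedup and the array name seen are hypothetical choices for illustration, and the function syntax assumes a POSIX shell (sh, bash) rather than tcsh.

```shell
# Hypothetical helper; assumes a POSIX shell (sh, bash), not tcsh.
dedup() {
    awk '
    {
        if (seen[$0] == 0) {   # unset entries compare equal to 0: first sighting
            print $0           # what the bare pattern does implicitly
        }
        seen[$0]++             # bump the count so later copies are skipped
    }'
}

printf 'Hi\nHow\nHi\nAre\nHi\nYou?\nAre\n' | dedup
```

The one-liner packs the test, the implicit print, and the increment into a single pattern expression; the long form trades brevity for readability.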
