
What are your UNIX pipeline commands that saved you a lot of coding/time?

István Lantos ・ Oct 27 '17 ・ 1 min read

This question came to my mind the other day when I wanted to free up some space on my HDDs. When it came to deciding what to delete (feel the pain), I wanted to know, without a lot of right-clicking, which folders were the fattest. Like a good Windows user, I installed Total Commander, because some random Google result told me to do so.

Then I realised, heck, I have an entire UNIX environment on my PC (MSYS2), so maybe there is an utterly simple one-liner to achieve this. And guess what, there is: :)

$ du -sh * | sort -h

Life hack, place this in your .bashrc file:

# Within the current directory, list all folders and files
# and print their size, then sort it from smallest to largest:
alias du='du -sh * | sort -h'
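
One caveat: since this alias shadows du itself, any argument you pass ends up appended after sort -h. A function avoids that; a minimal sketch (dusort is my own name for it):

# List everything in a directory (default: current) by size, smallest first
dusort() { du -sh "${1:-.}"/* | sort -h; }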

What are your UNIX pipeline commands, where you combine some programs' standard output into a wickedly simple time saver?


Discussion

 

The most common pipes I use are:

grep # ALL HAIL. Should know these:
grep -F # fixed/literal mode, use to search exactly what you type, no regex
grep -i # case insensitive
grep -v # invert, shows only lines that do NOT match
grep -w # matches on "words" only, grep -w "lag" won't match "flag"
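
A quick combination of those (app.log and the search strings are just placeholders):

# count lines mentioning "error" as a whole word, any case,
# excluding a known-harmless message:
grep -iw error app.log | grep -vF "harmless warning" | wc -l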

sort # usually combine with uniq

uniq # usually combined with sort

wc -l # count lines

head -n 10 # top 10! you won't believe number 7!

tail -f # tail with follow, useful for logs that are being updated, such as:

tail -f some.log | grep -i error

cut # to only get certain columns, for example:
cut -d' ' -f12-15 # split on spaces, only show columns 12 through 15
# or use gawk when dealing with buffers
gawk '{print $12,$13,$14,$15}'

Here's a real command I'm using in a demo in 5 minutes to show something that's hidden deep in our logs:

tail -f our.log | grep -w "WARNING" | grep --color=always -F "something=" | cut -d' ' -f12-15

EDIT: Unfortunately that doesn't work because of output buffering. Here's an actual working command for a log that is being updated live, but it does effectively the same thing:

tail -f our.log| grep --line-buffered -w "WARNING" | grep --color=always -F --line-buffered "something=" | gawk '{print $12,$13,$14,$15}'

Bonus: Here's a super evil one >:)

$ yes > /dev/null & # does nothing but waste CPU
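
(When you're done heating the room, kill %1 stops it, assuming it's your only background job.)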
 

sort -u to optimize sort | uniq ;)

 

Unless you need to count the unique results: blah | sort -n | uniq -c

 

Cheers for the tip about grep options. I do a lot of grepping, but I think those options will save me heaps of time in the future.

 

I think process substitution is incredibly helpful. It took me a few years to find out about it.
A super powerful command is xargs. Also make sure to have a look at the -I and -p flags, both really powerful (there's a small sketch after the comm example below).
I just used these two features as part of an article I wrote,
Automate Your Mac Setup and Keep It Up to Date
Compare the output of two commands, get the result list and pass in one line to another command:

comm -13 <(sort brew.txt) <(brew leaves | sort) | xargs brew rm
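
As for -I and -p: -I defines a placeholder token that xargs substitutes into the command, and -p asks for confirmation before each run. A small sketch with made-up file names:

# interactively move every .bak file into a junk folder
ls *.bak | xargs -I {} -p mv {} ~/old-backups/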

And some aliases I like:

# Start a webserver, also accepts a port as optional argument
alias server='python3 -m http.server'

# Copy to clipboard on Mac or Linux
alias copy="$(which pbcopy &> /dev/null && echo pbcopy || echo 'xclip -sel clip')"

# Pipe my public key to my clipboard.
alias pubkey="more ~/.ssh/id_rsa.pub | copy && echo '=> Public key copied to pasteboard.'"

If you are curious which commands you use a lot, you can find out like this:
What Are Your Most Used Shell Commands?

 

Oooh I feel like this is a weakness of mine. Looking forward to the responses

 
alias ctrlc='xclip -selection clipboard -i'
alias ctrlv='xclip -selection clipboard -o'

Examples:

# copy something without opening editors or showing it in terminal
cat ~/.ssh/mykey.pub | ctrlc

# paste clipboard somewhere
ctrlv >> ~/.ssh/known_hosts


ctrlv | tr '[:upper:]' '[:lower:]'
ctrlv | tr -d "\n"
ctrlv | base64 -d 
ctrlv | base64
ctrlv | md5sum

cat myfile | base64 | ctrlc
cat myfile | md5sum | ctrlc



ctrlv | base64 | ctrlc
ctrlv | base64 -d | ctrlc
 

The Just Commit Everything command:

alias gitasdf='git add -A && git commit -m asdf && git push'
 

Having a shortcut to add and commit is priceless. I use this function with a useful commit message and commit often.

$ lg "My useful description"
# Lazy Git
function lg() {
    git add .
    git commit -a -m "$1"
    git push
}
# End Lazy Git
 

This might be very dangerous if you don't git pull after your commit. So I suggest:

# Lazy Git
function lg() {
    git add .
    git commit -a -m "$1"
    git pull
    git push
}
# End Lazy Git

Thank you! I have updated my function.

 
  1. -a in git commit is redundant, since everything was already added one line above.
  2. This function will sooner or later commit log files, core dumps and other crap accidentally created in the working directory to the origin.
  3. If I saw this kind of function/alias in someone's configs, I would immediately fire the owner, because it basically shows they do not understand what git is for.

Hint: for this workflow one typically uses Dropbox/Google Drive.

 

I have something similar, but I'm inserting a timestamp as the commit message. Still totally useless. :)

I also replaced the word origin when creating the remote, with git remote add gitlab .... So if I'm pushing something to GitLab, I use a gitlab alias, which runs git push -u gitlab --all. I have a similar one for GitHub.

 

Why do you use git? Just use dropbox

 

xargs is amazing for just about anything that doesn't like piped results. Want to pull a bunch of Docker images in one command?

echo "ubuntu gcc alpine nginx" | xargs -n 1 docker pull
 

So if I understand correctly, this is equivalent to the following?

for img in ubuntu gcc alpine nginx; do docker pull "${img}"; done

I never thought of using xargs this way, good to know!

I guess you could parallelize the loop by adding -P 0 to the xargs invocation.
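
With GNU xargs that would be:

echo "ubuntu gcc alpine nginx" | xargs -n 1 -P 0 docker pull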

 

I didn't consider using parallelization, that's a great idea! Though, what happens with Docker if it pulls in two copies of the same image at once? If two images have the same dependency, will Docker deal with this parallel pull fine, or will it bork?

EDIT: Docker seems to handle this rather well; it just queues up the layer pulls in some arbitrary order. Still, I don't know if I entirely trust this, but I can't see a reason for this to not work (and I can see a reason that one might want this to work!).

 

This is cool! I never thought about that.

 

Some dumb and possibly dangerous variations I use:

# A blunt tool to fix changed ruby files before a commit
alias precommitfix="git status -s | awk '{print \$2}' | grep '\.rb$' | xargs bundle exec rubocop -a"

# Delete merged remote branches (remove --dry-run after verifying)
git branch -r --merged | \
grep origin | \
grep -v '>' | \
grep -v master | \
xargs -L1 | \
awk '{sub(/origin\//,"");print}'| \
xargs git push origin --delete --dry-run

 

I have used perl in pipes a lot for when you need to change something easily using a regular expression.

$ find . -name "*Relevant*.pm" | xargs perl -pi -e 's/something/somethingelse/'

The find name part can be tweaked to suit your needs and the regular expression in the Perl part also.
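
If you want a safety net, perl's -i switch accepts a backup suffix, so the originals are kept next to the edited files:

find . -name "*Relevant*.pm" | xargs perl -pi.bak -e 's/something/somethingelse/'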

 

I often use this one to get the PID of the parents of zombie processes.

ps -A -ostat,ppid | awk '/[zZ]/{print $2}'
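
If you then want to nudge those parents into reaping their children, one possible follow-up (whether it helps depends on the parent, so use with care; -r is GNU xargs and skips the kill if the list is empty):

ps -A -ostat,ppid | awk '/[zZ]/{print $2}' | xargs -r kill -s SIGCHLD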

Apart from that I often use grep, sort -u, sed.

A funny one is to pipe fortune into cowsay:

fortune -a | cowsay -f vader
 

I can provide a counter to your question. Working on an embedded system with multiple processors, we needed to generate a system binary that included microcontroller and DSP images, where the micro booted first and loaded the DSP after verifying a CRC. I built the CRC out of a pipeline featuring awk. Don't do that. It certainly didn't save me any time.

 

I know it's about pipe commands, but since I saw some other interesting elements without pipes in them, I want to share some that help me daily. Hope you don't mind.

I have multiple aliases in my .aliases, but also some in my .gitconfig that are more specific to git.
My favorite ones are:

[alias]
    # Remove branches that have already been merged with master
    # a.k.a. 'delete merged'
    dm = "!git branch --merged | grep -v '\\*' | xargs -n 1 git branch -d"
    # List contributors with number of commits
    contributors = shortlog --summary --numbered
    # Pull repository with rebase, auto-stashing before and restoring after, and prune
    up = "pull --rebase --autostash --prune"

You can find more of them in my .dotfiles repo on GitHub.

 
topcmd ()
{ 
    history | awk '{a[$4]++}END{for(i in a){print a[i] " " i}}' | sort -rn | head
}

This prints the top 10 most used commands in the history. I then created one-letter aliases for these (e.g. alias s='git status'), saving numerous keystrokes.

 

I use this pattern to find the last commit for which a certain condition was true, e.g. in which something wasn't yet broken.

In this example, test.sh just returns a non-zero code to mimic the condition you want to check for, but this could be anything that can be tested in a shell conditional.

until ./test.sh; do git checkout HEAD~1; done;
 

If you have a binary test such as test.sh in your example, use git bisect to find the faulty commit faster.
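
If the check is scriptable like that, bisect can even drive it for you (HEAD and v1.0 stand in for a known-bad and a known-good commit):

git bisect start HEAD v1.0
git bisect run ./test.sh   # a non-zero exit marks the commit as bad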

 

Top client IP in apache logs.

# cut -d" " -f1 apache.access.log | sort | uniq -c | sort -rn | head
 349963 22.208.1.241
  16434 15.99.2.62
   8685 7.8.27.98
   2047 52.14.4.76
    265 83.12.37.3
    149 3.71.24.250
     78 14.213.14.6
     13 182.37.3.88

Works also for any field: top URL, top browser, etc. Damn fast!
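
For instance, in the common log format the request path is usually field 7:

cut -d' ' -f7 apache.access.log | sort | uniq -c | sort -rn | head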

 

Not exactly pipes, but some very useful aliases that sometimes save me a lot of keystrokes:

# open the last edited file with neovim
alias lnvim='nvim -c'\'':e#<1'\'
alias pbpaste='xclip -i -selection clipboard -o'
alias pbcopy='xclip -selection clipboard'

zsh global aliases:

alias -g ND='./**/*(/om[1])' # newest directory --> ls -d ND
alias -g NF='./**/*(.om[1])' # newest file

cd ~/music/recordings && audacity NF
# using fasd cd I just type
z recor<Enter> auda<Tab> NF

Also using zsh with:

1 - fasd cd
2 - zsh-autosuggestions
3 - zsh-autopair

But also xargs combined with find, like:

find . -type f -print0 | xargs -0 cp -t ~/backup

-print0 and xargs -0 let us avoid errors coming from files with spaces in their names.

 

It is also handy:

ls -la --sort <time|size>
 

I use this one a lot:
find * -type f | xargs grep "the-string-im-looking-for"

 

Why not use grep -R to avoid find? Might also want to try git grep and/or ag (silversearcher).

 

grep -R might not be available on older systems.

 

Honestly, I didn't know about that flag.
Thanks for the tip!
:)

 

I have some personal libraries that are easily installable on other hosts, by following the README:

github.com/stroparo/ds

I also have a personal setup script which installs my selection of APT (debian/ubuntu) and Yum (fedora/redhat) packages, plus Oh-My-Zsh, plus the Daily Shells ('ds') repo linked to above. Just follow the installation instructions for setup-dev.sh at the bottom of this README:

github.com/stroparo/cmds#setup-a-l...

Best,

 

I don't have much space on my hard disk, so I usually use:

df -h | grep ^/dev/sda
 

Also df -h . does the same (if you're in a dir on that volume)

 

I have a lot of data in JSON format, and gron has been a lifesaver. Combined with the usual suspects grep, sed, cut, awk, sort, etc.
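
For anyone who hasn't tried it: gron flattens JSON into greppable assignments and can reassemble the filtered result back into JSON (api.json and the grep pattern are made up):

gron api.json | grep "commit.author" | gron --ungron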

 

Cool! This looks a lot simpler than jq (which I use regularly).

 

This is a link to my bash_profile, where I added a bunch of functions and aliases that helped me save a lot of time: github.com/robertodessi/custom_con...

 

On a Mac:

pbpaste | sort | pbcopy

To sort (or otherwise process) what's in the clipboard.

 
hgrep () {
        grep --color=auto --color -E "$1|$" "${@:2}"
}

This will just highlight the word you are grepping for, but still print out the entire file.
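
Usage, with made-up names:

hgrep ERROR app.log   # whole file printed, ERROR highlighted

If you want to page through it with the colors intact, you'd probably need --color=always in the function plus less -R.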

 

I used this exact line a couple of days ago!

 

The same idea, exploded into files + dirs:

du -ma | sort -rn | head # summary (10 items)
du -ma | sort -rn | less # scroll with the 'less' command