Bash Commands for Bioinformatics Beginners

Abdullah Al Nahid ・2 min read

Disclaimer: This article is a collection of useful bash commands that I have been using regularly. I'll keep editing this article to include more later on.

1. Count Fasta Sequences

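This command counts the sequences in a FASTA file by counting its header lines. Every FASTA record starts with a header line beginning with >, so the pattern is anchored with ^ to avoid counting any > that appears elsewhere in a line.

grep -c "^>" <fasta-file-name>

Example: grep -c "^>" sequences.fasta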
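As a quick sanity check, the sketch below builds a small throwaway FASTA file and counts its records. The file name, the sequences and the variable names are made up for illustration.

```shell
# Build a tiny example FASTA file in a temporary directory (made-up data).
tmpdir=$(mktemp -d)
printf '>seq1 sample A\nACGT\n>seq2 sample B\nGGCC\nTTAA\n>seq3\nAC\n' > "$tmpdir/demo.fasta"

# Count header lines, i.e. lines that begin with ">".
seq_count=$(grep -c '^>' "$tmpdir/demo.fasta")
echo "$seq_count"
```

If several FASTA files are passed at once, grep -c prints one count per file, prefixed with the file name.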

2. Count Empty Lines

To count the empty lines in a file, use this command. It matches only lines with no characters at all, so lines that contain spaces or tabs are not counted.

grep -c "^$" <file-name>

Example: grep -c "^$" sequences.fasta
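Lines that look blank sometimes contain spaces or tabs. The sketch below, which uses a made-up file in a temporary directory, compares the strict pattern with one that also matches whitespace-only lines. The [[:space:]] character class is an addition here, not part of the original command.

```shell
tmpdir=$(mktemp -d)
# Made-up example: line 2 is empty, line 4 contains only spaces.
printf '>seq1\n\nACGT\n   \n>seq2\nGGCC\n' > "$tmpdir/demo.fasta"

# Strictly empty lines only.
empty_count=$(grep -c '^$' "$tmpdir/demo.fasta")

# Empty lines plus lines made up only of whitespace.
blank_count=$(grep -c '^[[:space:]]*$' "$tmpdir/demo.fasta")

echo "empty: $empty_count, blank: $blank_count"
```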

3. Remove Empty Lines

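To remove empty lines from a file, use this command. The -i option edits the file in place, so make a copy first if you need the original. The form below works with GNU sed (the default on Linux); the BSD sed that ships with macOS requires a backup suffix after -i, which can be empty: sed -i '' '/^$/d' <file-name>

sed -i '/^$/d' <file-name>

Example: sed -i '/^$/d' sequences.fasta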
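If you would rather not modify the original file, one alternative is to drop -i and redirect the cleaned output to a new file. The sketch below assumes made-up file names in a temporary directory.

```shell
tmpdir=$(mktemp -d)
printf '>seq1\nACGT\n\n>seq2\n\nGGCC\n' > "$tmpdir/demo.fasta"

# Write the cleaned copy to a new file and leave the original untouched.
sed '/^$/d' "$tmpdir/demo.fasta" > "$tmpdir/demo.clean.fasta"

# grep -c exits with status 1 when nothing matches, hence the "|| true".
remaining=$(grep -c '^$' "$tmpdir/demo.clean.fasta" || true)
line_count=$(wc -l < "$tmpdir/demo.clean.fasta" | tr -d ' ')
echo "empty lines left: $remaining, total lines: $line_count"
```

This form behaves the same with GNU and BSD sed, since it does not rely on -i.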

4. Merge Multiple CSV Files

If you have multiple CSV files with the same header, you can merge them with the command below. It has two parts. The first part, head -n 1 <a-csv-file> > combined.out, writes the header row; replace <a-csv-file> with any one of the .csv files. The second part, tail -n+2 -q *.csv >> combined.out, appends every line except the header from each .csv file. The output is written to combined.out rather than combined.csv so that the *.csv pattern does not pick up the output file itself.

head -n 1 <a-csv-file> > combined.out && tail -n+2 -q *.csv >> combined.out

After running this command, rename combined.out to combined.csv.

Example: suppose you have 4 CSV files, one of which is file1.csv. You should run:
head -n 1 file1.csv > combined.out && tail -n+2 -q *.csv >> combined.out
You will then see a new combined.out file. Don't forget to rename it to combined.csv.
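The same merge can also be sketched with awk, which avoids having to name one file for the header. This is an alternative technique, not the command from this article, and the file names and contents below are made up for illustration.

```shell
tmpdir=$(mktemp -d)
printf 'id,gene\n1,BRCA1\n2,TP53\n' > "$tmpdir/file1.csv"
printf 'id,gene\n3,EGFR\n' > "$tmpdir/file2.csv"

# FNR is the line number within the current file, NR the line number across
# all files. Skip the first line of every file except the very first one.
awk 'FNR == 1 && NR != 1 { next } { print }' "$tmpdir"/file*.csv > "$tmpdir/combined.out"

# Rename only after the merge, so the output is never read as an input.
mv "$tmpdir/combined.out" "$tmpdir/combined.csv"
cat "$tmpdir/combined.csv"
```

Both approaches assume that every file ends with a newline and shares exactly the same header row.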

5. Unzip all zip files

To unzip all .zip files in a folder, use this command. The quotes matter: they stop the shell from expanding the pattern, so unzip receives the wildcard and processes each archive in turn.

unzip "*.zip"

This command does not delete the .zip files after unzipping. To delete all files of the same format (.zip, .gz, .txt), see the next section.
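When several archives contain files with the same names, extracting everything into one folder can cause overwrite prompts. A possible workaround, sketched below, is to extract each archive into its own folder. The function name unzip_each is hypothetical and not part of unzip.

```shell
# Hypothetical helper: extract every .zip in a directory (default: the current
# one) into a folder named after the archive.
unzip_each() {
    local dir=${1:-.}
    local archive
    for archive in "$dir"/*.zip; do
        # Skip the unexpanded pattern when the directory has no .zip files.
        [ -e "$archive" ] || continue
        # -q: quiet output, -d: destination directory (archive name minus .zip).
        unzip -q "$archive" -d "${archive%.zip}"
    done
}
```

Calling unzip_each with no argument works on the current folder; unzip_each followed by a path works on that folder instead.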

6. Delete all files with same file format

To delete multiple files of the same format, use the two commands below. Note that find . also searches all subdirectories of the current folder.
First, check which files you are going to remove:

find . -name "*<file-type>" -type f 

Then, run the delete command:

find . -name "*<file-type>" -type f -delete 

Example: to delete all .gz files, run:

find . -name "*.gz" -type f
find . -name "*.gz" -type f -delete

To delete all .zip files:

find . -name "*.zip" -type f
find . -name "*.zip" -type f -delete
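Because find descends into subdirectories, you may want to restrict the deletion to the current folder. The sketch below uses the -maxdepth option for that, which is available in GNU and BSD find but is not part of POSIX. It runs in a temporary directory with made-up file names.

```shell
tmpdir=$(mktemp -d)
mkdir "$tmpdir/sub"
touch "$tmpdir/a.gz" "$tmpdir/b.gz" "$tmpdir/keep.txt" "$tmpdir/sub/c.gz"

# Preview: only .gz files directly inside $tmpdir, not those in sub/.
find "$tmpdir" -maxdepth 1 -name '*.gz' -type f

# Delete the same set of files.
find "$tmpdir" -maxdepth 1 -name '*.gz' -type f -delete

# The .gz file in the subdirectory is left alone.
remaining=$(find "$tmpdir" -name '*.gz' -type f | wc -l | tr -d ' ')
echo "remaining .gz files: $remaining"
```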

That's all for now. Feel free to add more bash commands in the comments section and I will add them to the article.
