Abdullah Al Nahid

Bash Commands for Bioinformatics Beginners: Part 1

Disclaimer: This article is a collection of useful bash commands that I have been using regularly.

1. Count FASTA Sequences

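This command counts the sequences in a FASTA file. Every sequence record begins with a header line that starts with >, so counting those lines gives the number of sequences. The ^ anchor restricts the match to lines that begin with >.

grep -c "^>" <fasta-file-name>

Example: grep -c "^>" sequences.fasta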
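
As a quick sanity check, here is a minimal sketch that builds a tiny FASTA file and counts its records. The file name demo.fasta and its contents are made up for illustration; the pattern is anchored with ^ so that only lines starting with > are counted.

```shell
# Create a tiny example FASTA file (hypothetical data, for illustration only).
workdir=$(mktemp -d)
cat > "$workdir/demo.fasta" <<'EOF'
>seq1 sample A
ATGCGT
>seq2 sample B
GGCATT
>seq3
TTAGGC
EOF

# Count header lines, i.e. lines that start with ">".
seq_count=$(grep -c '^>' "$workdir/demo.fasta")
echo "$seq_count"
```

The count should match the number of header lines in the file, which is three in this made-up example.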

2. Count Empty Lines

This command counts the empty lines in a file. The pattern ^$ matches only lines that contain no characters at all, so lines made up of spaces or tabs are not counted.

grep -c "^$" <file-name>

Example: grep -c "^$" sequences.fasta
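
If your file may contain lines that look blank but hold spaces or tabs, a slightly broader pattern can help. The sketch below is one possible variant, not part of the original command; demo.txt is a made-up file used only to show the difference between the two patterns.

```shell
workdir=$(mktemp -d)
# Three visually blank lines: one truly empty, one with a space, one with a tab.
printf 'ATGC\n\n \n\t\nGGCA\n' > "$workdir/demo.txt"

# Lines with no characters at all.
empty_count=$(grep -c '^$' "$workdir/demo.txt")
# Lines that are empty or contain only whitespace.
blank_count=$(grep -c '^[[:space:]]*$' "$workdir/demo.txt")
echo "empty: $empty_count, blank: $blank_count"
```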

3. Remove Empty Lines

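This command removes empty lines from a file. The -i option edits the file in place, so keep a copy if you need the original. On macOS (BSD sed), -i requires a backup suffix argument, for example sed -i "" "/^$/d" <file-name>.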

sed -i "/^$/d" <file-name>

Example: sed -i "/^$/d" sequences.fasta
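
Because in-place editing overwrites the file, you may prefer to keep a backup. The sketch below is one way to do that, assuming a made-up demo.fasta file; attaching a suffix such as .bak directly to -i is accepted by both GNU and BSD sed.

```shell
workdir=$(mktemp -d)
# Hypothetical FASTA file with two empty lines in it.
printf '>seq1\nATGC\n\n>seq2\n\nGGCA\n' > "$workdir/demo.fasta"

# Delete empty lines in place, keeping the original as demo.fasta.bak.
sed -i.bak '/^$/d' "$workdir/demo.fasta"

# Compare line counts of the cleaned file and the untouched backup.
lines_after=$(wc -l < "$workdir/demo.fasta" | tr -d ' ')
lines_backup=$(wc -l < "$workdir/demo.fasta.bak" | tr -d ' ')
echo "after: $lines_after, backup: $lines_backup"
```

Once you have checked the result, the .bak file can be deleted.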

4. Merge Multiple CSV Files

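If you have multiple CSV files with the same header, you can use this command to merge them. The command has two parts: the first writes the header line once, and the second appends every file's contents without its header line.
In the head -n 1 <a-csv-file> > combined.out part, replace <a-csv-file> with any one of the .csv files. The output goes to combined.out rather than to a .csv file so that the *.csv pattern does not pick up the output file itself.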

head -n 1 <a-csv-file> > combined.out && tail -n+2 -q *.csv >> combined.out  

After running this command, rename the combined.out file to combined.csv.

Example:
Suppose you have four CSV files:
file1.csv
file2.csv
file3.csv
file4.csv
So you should run:
head -n 1 file1.csv > combined.out && tail -n+2 -q *.csv >> combined.out
You will then see a new combined.out file. Don't forget to rename this file to combined.csv.
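
As an alternative to the head and tail combination, the same merge can be sketched with a single awk command. This is a different technique from the one above, shown only for comparison; the file names and CSV contents are made up, and it assumes every file starts with the same header line.

```shell
workdir=$(mktemp -d)
# Two hypothetical CSV files sharing the header "id,length".
printf 'id,length\ns1,120\ns2,95\n' > "$workdir/file1.csv"
printf 'id,length\ns3,210\n' > "$workdir/file2.csv"

# FNR is the line number within the current file, NR the line number overall.
# Skip the first line of every file except the very first one, print the rest.
awk 'FNR == 1 && NR != 1 { next } { print }' "$workdir"/file*.csv > "$workdir/combined.out"

merged_lines=$(wc -l < "$workdir/combined.out" | tr -d ' ')
cat "$workdir/combined.out"
```

As before, the output is written to a .out file so that it is not matched by the .csv pattern on a later run.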

5. Unzip All Zip Files

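This command unzips all .zip files in a folder. The quotes matter: they pass the wildcard to unzip itself instead of letting the shell expand it. Without them, unzip would treat the first file as the archive and the remaining names as files to extract from it.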

unzip "*.zip"

However, this command does not delete the .zip files after unzipping. To delete all files with the same file format (.zip, .gz, .txt), see the next section.
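
If several archives contain files with the same names, extracting everything into one folder can cause overwrites. The sketch below is one possible way around that: it extracts each archive into a folder named after it. The function name extract_each_zip is hypothetical, not a standard command.

```shell
# Extract every .zip file in a directory into a folder named after the archive.
# extract_each_zip is a hypothetical helper name used for illustration.
extract_each_zip() {
  local dir="${1:-.}"
  local archive
  for archive in "$dir"/*.zip; do
    # Skip the literal pattern when no .zip files are present.
    [ -e "$archive" ] || continue
    # -q keeps the output quiet; -d sets the destination directory.
    unzip -q "$archive" -d "${archive%.zip}"
  done
}
```

Usage: run extract_each_zip with no argument to process the current folder, or pass a folder path as the first argument.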

6. Delete All Files with the Same File Format

To delete multiple files with the same file format, use the two commands below. Note that find searches the current folder and all of its subfolders, so matching files in subfolders are deleted as well.
First, check which files you are going to remove:

find . -name "*<file-type>" -type f 

Then, run the delete command:

find . -name "*<file-type>" -type f -delete 

See the example below for a concrete case.

Example:
If you want to delete all the .gz files, then the commands should look like this:

find . -name "*.gz" -type f
find . -name "*.gz" -type f -delete

To delete all .zip files:

find . -name "*.zip" -type f
find . -name "*.zip" -type f -delete
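
If you only want to delete files in the current folder and leave subfolders alone, one option is the -maxdepth option, which both GNU and BSD find support. The sketch below uses a temporary folder with made-up file names to show the preview-then-delete pattern with that restriction.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/raw"
# Hypothetical files: two .gz files at the top level, one inside a subfolder.
touch "$workdir/a.gz" "$workdir/b.gz" "$workdir/notes.txt" "$workdir/raw/keep.gz"

# Preview: list only the .gz files directly inside workdir, not in subfolders.
find "$workdir" -maxdepth 1 -name "*.gz" -type f

# Delete exactly the files that the preview listed.
find "$workdir" -maxdepth 1 -name "*.gz" -type f -delete

# Count the .gz files that are left anywhere under workdir.
remaining_gz=$(find "$workdir" -name "*.gz" -type f | wc -l | tr -d ' ')
echo "$remaining_gz"
```

Only the file inside the subfolder survives, because -maxdepth 1 stops find from descending into it.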



That's all for now. Feel free to add more bash commands in the comments section and I will add them to the article.
