Zaki Arrozi Arsyad

Posted on May 26, 2020 • Edited on Jun 21, 2020

linux : file manipulation

#linux #terminal #cli #devops

To get more out of the Linux CLI, we can combine a command directly with a file, or chain two or more commands together in one shot.

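Combining a command with a file is called redirection: the > operator sends the output of a command to a file, creating the file or overwriting it if it already exists.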

# copy the content of FILE_1.txt into FILE_2.txt
cat FILE_1.txt > FILE_2.txt

# create a new file containing the listing of the / directory
ls / > FILE.txt

Chaining commands is called a pipeline: the | operator passes the output of one command to the next command as its input.

# list the files or folders whose names match both keywords
ls /home/host/ | grep KEYWORD_1 | grep KEYWORD_2
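To see both ideas working together, here is a small sketch that runs in a throwaway directory. The file names and contents are made up for illustration, and the >> operator (append instead of overwrite) is an addition that the snippets above do not use.

```shell
# Work in a temporary directory so no real files are touched (hypothetical names).
workdir=$(mktemp -d)
printf 'alpha\nbeta\n' > "$workdir/notes.txt"

# > creates the file or overwrites it; >> appends to it.
cat "$workdir/notes.txt" > "$workdir/copy.txt"
echo 'gamma' >> "$workdir/copy.txt"

# A pipeline feeds the output of one command into the next one.
line_count=$(cat "$workdir/copy.txt" | wc -l | tr -d ' ')
matches=$(ls "$workdir" | grep copy | grep txt)

echo "$line_count"   # 3
echo "$matches"      # copy.txt
```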

Some commands that we will often use to work with files in Linux :

GREP

Grep filters the lines of a file (or the output of another command) and prints the lines that match a keyword.
Basic usage :

# display all matching lines
grep KEYWORD FILE.txt

# display all matching lines from multiple files
grep KEYWORD FILE_1.txt FILE_2.txt FILE_3.txt

# ignore keyword case
grep -i KEYWORD FILE.txt

# display only the lines that do not match
grep -v KEYWORD FILE.txt

# display with line number
grep -n KEYWORD FILE.txt

# display a count of matching lines
grep -c KEYWORD FILE.txt

# display lines that match either keyword 1 or keyword 2
grep -E "KEYWORD_1|KEYWORD_2" FILE.txt

# display each matching line and the 4 lines after it, the number can be customized
grep -A 4 KEYWORD FILE.txt

# display each matching line and the 4 lines before it, the number can be customized
grep -B 4 KEYWORD FILE.txt
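A quick way to get a feel for these options is to run them against a tiny sample file. The log content below is invented for this sketch; the options are the ones listed above.

```shell
# Hypothetical sample log, written to a temporary file.
log=$(mktemp)
printf 'INFO start\nERROR disk full\ninfo retry\nWARN slow\nERROR timeout\n' > "$log"

error_count=$(grep -c ERROR "$log")                    # number of lines containing ERROR
info_lines=$(grep -i -n info "$log")                   # ignore case, show line numbers
not_error=$(grep -v ERROR "$log" | wc -l | tr -d ' ')  # count lines without ERROR
either=$(grep -E 'WARN|timeout' "$log")                # alternation needs -E
after=$(grep -A 1 'disk full' "$log")                  # the match plus one line after it

echo "$error_count"   # 2
echo "$info_lines"
echo "$after"
```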

REGEX

We can also combine grep with regular expressions; the -E option enables extended regular expression syntax. Be careful when using regex: patterns are case sensitive unless -i is added.
Basic usage :

grep -E "REGEX_PATTERN" FILE.txt

This command displays all lines that match the regex pattern.
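As a small illustration of the case sensitivity mentioned above, the sketch below matches an anchored pattern with and without -i. The sample data and the pattern are assumptions chosen for the example.

```shell
# Hypothetical key=value data in a temporary file.
data=$(mktemp)
printf 'id=42\nname=demo\nid=7a\nID=99\n' > "$data"

# Anchored pattern: the whole line must be "id=" followed by digits only.
numeric_ids=$(grep -E '^id=[0-9]+$' "$data")

# Adding -i makes the same pattern ignore case, so ID=99 matches too.
any_case=$(grep -E -i '^id=[0-9]+$' "$data")

echo "$numeric_ids"   # id=42
echo "$any_case"
```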

SED

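Sed stands for Stream Editor. It can search a file, find and replace text, and insert or delete lines, which makes it very useful for editing a file without opening it. Sed also supports regular expression patterns. By default it writes the result to standard output and leaves the file unchanged; the -i option edits the file in place.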
Basic usage :

sed with the s command for substitution and / as the delimiter

# replace first unix word in each line with linux
sed 's/unix/linux/' FILE.txt

# replace all unix word in each line with linux
sed 's/unix/linux/g' FILE.txt

# replace second unix word in each line with linux
sed 's/unix/linux/2' FILE.txt

# replace case-insensitively (the i flag is a GNU sed extension)
sed 's/unix/linux/i' FILE.txt

# replace all unix word except in fifth line
sed '5!s/unix/linux/g' FILE.txt

# replace the second and all later occurrences of unix in each line (GNU sed)
sed 's/unix/linux/2g' FILE.txt

# replace unix word in third line
sed '3 s/unix/linux/' FILE.txt

# replace unix word in the first through third lines
sed '1,3 s/unix/linux/' FILE.txt

# replace unix word from the third line to the end of the file
sed '3,$ s/unix/linux/' FILE.txt

# display only replaced lines
sed -n 's/unix/linux/p' FILE.txt

# display replaced lines twice
sed 's/unix/linux/p' FILE.txt
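The difference between the substitution flags is easier to see on a line that contains the word several times. This is a minimal sketch with an invented sample file; the 2g form assumes GNU sed.

```shell
# Hypothetical sample: the first line contains "unix" three times.
sample=$(mktemp)
printf 'unix unix unix\nUnix tools\nplain line\n' > "$sample"

first=$(sed 's/unix/linux/' "$sample" | head -n 1)        # first occurrence only
all=$(sed 's/unix/linux/g' "$sample" | head -n 1)         # every occurrence
second_on=$(sed 's/unix/linux/2g' "$sample" | head -n 1)  # second occurrence onward (GNU sed)
changed_only=$(sed -n 's/unix/linux/p' "$sample")         # only lines where a replacement happened

echo "$first"       # linux unix unix
echo "$all"         # linux linux linux
echo "$second_on"   # unix linux linux
```

The second line ("Unix tools") is untouched in every case because the pattern is case sensitive.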

sed with d for deletion

# delete fifth line
sed '5d' FILE.txt

# delete last line
sed '$d' FILE.txt

# delete third line until sixth line
sed '3,6d' FILE.txt

# delete from third line and next
sed '3,$d' FILE.txt

# delete line with matched pattern
sed '/REGEX_PATTERN/d' FILE.txt

# delete line with matched case insensitive pattern (the I flag is a GNU sed extension)
sed '/REGEX_PATTERN/Id' FILE.txt

# delete line with matched pattern and two lines after
sed '/abc/,+2d' FILE.txt

# delete every second line, start from third line
sed '3~2d' FILE.txt

# delete blank line
sed '/^$/d' FILE.txt

# delete comment lines (starting with #) and blank lines, editing the file in place
sed -i '/^#/d;/^$/d' FILE.txt
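Because -i changes the file itself, it can be worth previewing the result first and keeping a backup. The sketch below does both on an invented config file; the -i.bak form, which saves the original next to the edited file, is an addition not used in the snippets above.

```shell
# Hypothetical config file in a temporary location.
conf=$(mktemp)
printf '# comment\nport=80\n\nhost=local\n# end\n' > "$conf"

# Preview: without -i the file is not modified, the result goes to standard output.
cleaned=$(sed '/^#/d;/^$/d' "$conf")
no_first=$(sed '1d' "$conf" | head -n 1)
no_last=$(sed '$d' "$conf" | tail -n 1)

# Edit in place, keeping the original as "$conf.bak".
sed -i.bak '/^#/d;/^$/d' "$conf"

echo "$cleaned"
cat "$conf"
```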

sed with G for file spacing

# insert one blank line after each line
sed G FILE.txt

# insert two blank lines after each line
sed 'G;G' FILE.txt

# insert a blank line above every line which matches pattern
sed '/REGEX_PATTERN/{x;p;x;}' FILE.txt

# insert a blank line below every line which matches pattern
sed '/REGEX_PATTERN/G' FILE.txt

sed with -n p for displaying a file

# display a file
sed -n p FILE.txt

# display only fourth line
sed -n '4p' FILE.txt

# display from second line to fifth line
sed -n '2,5p' FILE.txt

# display only last line
sed -n '$p' FILE.txt

# display from third line and next
sed -n '3,$p' FILE.txt

# display entire file except second line to fourth line
sed '2,4d' FILE.txt

# display only line with matched pattern
sed -n '/REGEX_PATTERN/p' FILE.txt

# display from the first line that matches the pattern through the fifth line
sed -n '/REGEX_PATTERN/,5p' FILE.txt

# display from second line to matched pattern line
sed -n '2,/REGEX_PATTERN/p' FILE.txt
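The line addresses above are easy to verify on a file where every line names its own position. This is a small sketch with invented content; the /three/,$ range is an extra combination of the address forms already shown.

```shell
# Hypothetical five-line file.
lines=$(mktemp)
printf 'one\ntwo\nthree\nfour\nfive\n' > "$lines"

fourth=$(sed -n '4p' "$lines")              # a single line
middle=$(sed -n '2,4p' "$lines")            # a range of lines
last=$(sed -n '$p' "$lines")                # the last line
from_match=$(sed -n '/three/,$p' "$lines")  # from the first match to the end
without_range=$(sed '2,4d' "$lines")        # everything except lines 2 to 4 (no -n here)

echo "$fourth"          # four
echo "$last"            # five
echo "$without_range"
```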

sed with = for numbering lines

# print the number of each line on its own line, before the line itself
sed = FILE.txt

CUT

As the name suggests, cut is used for cutting out sections from each line of a file and writing the result to standard output. We can select parts of a line with the following options :

cut with -b or --bytes=LIST Select by specifying a byte, a set of bytes, or a range of bytes.

# display first, second, and third byte of each line
cut -b '1,2,3' FILE.txt

# display first until third byte and fifth until seventh byte
cut -b '1-3,5-7' FILE.txt

# display third and next byte
cut -b '3-' FILE.txt

# display until third byte
cut -b '-3' FILE.txt

cut with -c or --characters=LIST Select by specifying a character, a set of characters, or a range of characters.

# display first, second, and third characters of each line
cut -c '1,2,3' FILE.txt

# display first until third characters and fifth until seventh characters
cut -c '1-3,5-7' FILE.txt

# display third and next characters
cut -c '3-' FILE.txt

# display until third characters
cut -c '-3' FILE.txt

cut with -f or --fields=LIST Select by specifying a field, a set of fields, or a range of fields. Fields are separated by a tab by default; use -d or --delimiter=DELIMITER to specify another delimiter.

# display the first field, using the default tab delimiter
cut -f '1' FILE.txt

# display first until third fields, with comma delimiter
cut -d\, -f '1-3' FILE.txt

# display from third fields, with comma delimiter
cut -d\, -f '3-' FILE.txt

# display until third fields, with comma delimiter
cut -d\, -f '-3' FILE.txt

cut with --complement Displays all bytes, characters, or fields except the selected.

# display all bytes except the selected
cut --complement -b '1' FILE.txt

# display all characters except the selected
cut --complement -c '1' FILE.txt

# display all fields except the selected
cut --complement -d\, -f '1' FILE.txt

cut with -s or --only-delimited Do not print lines not containing delimiters.

# display fields only from lines that contain the delimiter
cut -d\, -f '1-4' -s FILE.txt

cut with --output-delimiter=STRING Modify the output delimiter.

# display fields with # delimiter
cut -d\, -f '1-4' FILE.txt --output-delimiter="#"
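Here is a short sketch that applies several of these options to a tiny comma-separated file. The data is invented, and --complement and --output-delimiter assume GNU cut.

```shell
# Hypothetical CSV with one line that has no delimiter at all.
csv=$(mktemp)
printf 'id,name,role,city\n1,ana,dev,oslo\nno delimiter here\n' > "$csv"

first_three=$(cut -c '1-3' "$csv" | head -n 1)                          # characters 1 to 3
name_col=$(cut -d ',' -f 2 -s "$csv")                                   # field 2, skipping lines without a comma
rest=$(cut -d ',' -f 1 --complement "$csv" | head -n 1)                 # every field except the first
joined=$(cut -d ',' -f '1,4' -s --output-delimiter='#' "$csv")          # fields 1 and 4, joined with #

echo "$first_three"   # id,
echo "$rest"          # name,role,city
echo "$joined"
```

Without -s, the line "no delimiter here" would be printed unchanged, which is usually not what we want when extracting a column.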

SORT

Sort orders the lines of a file and prints the result. By default it sorts in ascending text order.

# sort the lines ascending
sort FILE.txt

# sort the lines descending
sort -r FILE.txt

# sort and create a new file
sort -o OUTPUT_FILE.txt ORIGINAL_TEXT.txt

# sort numerically in ascending order
sort -n FILE.txt

# sort numerically in descending order
sort -nr FILE.txt

# sort and remove duplicates
sort -u FILE.txt

# check whether a file is already sorted and report the first out-of-order line (if any)
sort -c FILE.txt

# sort a table by its second column (the key runs from field 2 to the end of the line)
sort -k2 FILE.txt

# sort by month for list of months
sort -M FILE.txt
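The sketch below shows why the numeric option matters when sorting by a column, using an invented score table. The -k2,2n form (key limited to field 2 and compared numerically) is a slightly stricter variant of the -k2 example above.

```shell
# Hypothetical table: name and score separated by a space.
scores=$(mktemp)
printf 'ana 100\nbob 9\ncy 10\nbob 9\n' > "$scores"

alpha=$(sort "$scores")             # text order on the whole line
by_number=$(sort -k2,2n "$scores")  # numeric order on the second column only
unique=$(sort -u "$scores")         # sorted, duplicates removed

# sort -c prints nothing on success and exits non-zero when the file is not sorted.
if sort -c "$scores" 2>/dev/null; then is_sorted=yes; else is_sorted=no; fi

echo "$by_number"
echo "$is_sorted"   # no
```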

UNIQ

We can remove repeated lines with this command. Uniq only compares adjacent lines, so the input is usually sorted first; on sorted input, the basic uniq command returns the same result as sort -u.
Basic usage :

# print the file with adjacent duplicate lines collapsed into one
uniq FILE.txt

# prefix each line with the number of times it was repeated
uniq -c FILE.txt

# print repeated lines, only display one duplicated line per group
uniq -d FILE.txt

# print repeated lines, display all duplicated lines
uniq -D FILE.txt

# print only the lines that are not repeated
uniq -u FILE.txt

# skip the first n fields when comparing lines
uniq -f 2 FILE.txt

# skip the first n characters when comparing lines
uniq -s 2 FILE.txt

# limit the comparison to the first n characters
uniq -w 2 FILE.txt

# make the comparison case-insensitive
uniq -i FILE.txt
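Since uniq only looks at neighbouring lines, it is usually paired with sort. The sketch below contrasts the two on an invented list of page visits; the final awk step only reshapes the count for display.

```shell
# Hypothetical unsorted list with a repeated value.
visits=$(mktemp)
printf 'home\nabout\nhome\nhome\ncart\n' > "$visits"

adjacent_only=$(uniq "$visits")          # only the two neighbouring "home" lines collapse
repeated=$(sort "$visits" | uniq -d)     # values that occur more than once
singles=$(sort "$visits" | uniq -u)      # values that occur exactly once

# Most frequent value: count, sort by count descending, keep the top line.
top=$(sort "$visits" | uniq -c | sort -rn | head -n 1 | awk '{print $2 ":" $1}')

echo "$adjacent_only"
echo "$top"   # home:3
```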

TR

We can use this command to translate or delete characters.
Basic usage :

# translate lowercase to uppercase using ranges
cat FILE.txt | tr "a-z" "A-Z"

# translate lowercase to uppercase using character classes
cat FILE.txt | tr "[:lower:]" "[:upper:]"

# translate every whitespace character (including newlines) to a tab
cat FILE.txt | tr "[:space:]" "\t"

# translate parentheses to curly brackets
cat FILE.txt | tr "()" "{}"

# replace each run of whitespace with a single space
cat FILE.txt | tr -s "[:space:]" " "

# delete specific A character
cat FILE.txt | tr -d "A"

# delete digit character
cat FILE.txt | tr -d "[:digit:]"

# delete all except digit
cat FILE.txt | tr -cd "[:digit:]"
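Because tr reads from standard input, it is easy to try on a short string with echo. The inputs below are made up for this sketch.

```shell
upper=$(echo 'hello world' | tr 'a-z' 'A-Z')            # lowercase to uppercase
braces=$(echo 'f(x)' | tr '()' '{}')                    # ( becomes {, ) becomes }
squeezed=$(echo 'a   b    c' | tr -s ' ')               # squeeze repeated spaces
no_digits=$(echo 'abc123' | tr -d '[:digit:]')          # delete digits
digits=$(echo 'order 66, row 7' | tr -cd '[:digit:]')   # delete everything except digits

echo "$upper"      # HELLO WORLD
echo "$braces"     # f{x}
echo "$squeezed"   # a b c
echo "$digits"     # 667
```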

AWK

Awk is used for manipulating data and generating reports.
Field identifiers and built-in names :

$0 : entire line of text
$1 : first field
$2 : second field
$6 : sixth field
$NF : the last field (NF holds the number of fields in the current line)
NR : number of the current record (line)
OFS : output field separator
BEGIN : block that runs before the first line is read
END : block that runs after the last line is read

Basic usage :



# view file
awk '{print}' FILE.txt

# view specific fields of each line
awk '{print $2,$3,$6}' FILE.txt

# view the last field of each line
awk '{print $NF}' FILE.txt

# view the line number of each line
awk '{print NR}' FILE.txt

# print specific fields with / as the separator
awk 'BEGIN {OFS="/"} {print $2,$3,$6}' FILE.txt

# print some text before the first line of output
awk 'BEGIN {print "some text"} {print $2,$3,$6}' FILE.txt

# print some text after the last line of output
awk '{print $2,$3,$6} END {print "some text"}' FILE.txt

# print the lines that match the given pattern
awk '/REGEX_PATTERN/' FILE.txt

# print specific fields of each line that matches the pattern
awk '/REGEX_PATTERN/ {print $2,$3,$6}' FILE.txt
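To show how these pieces combine into a small report, here is a sketch on an invented three-column table. The column meanings (name, team, pay) and the variable name sum are assumptions made for the example.

```shell
# Hypothetical table: name, team, pay.
table=$(mktemp)
printf 'ana dev 3000\nbob ops 2500\ncy dev 4000\n' > "$table"

names=$(awk '{print $1}' "$table")                                                   # first field of each line
numbered=$(awk '{print NR ": " $1 " has " NF " fields"}' "$table" | head -n 1)      # NR is the line number, NF the field count
dev_total=$(awk '/dev/ {sum += $3} END {print sum}' "$table")                        # add up pay on matching lines
report=$(awk 'BEGIN {OFS="/"; print "name/pay"} {print $1, $3}' "$table")           # header first, then / separated fields

echo "$numbered"    # 1: ana has 3 fields
echo "$dev_total"   # 7000
echo "$report"
```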