
Zaki Arrozi Arsyad


Linux: file manipulation

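To get more out of the Linux command line, we can connect a command directly to a file, or chain two or more commands together in a single line.

Connecting a command to a file is called redirection. The > operator sends the output of a command into a file, replacing whatever the file contained before.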

# copy the content of FILE_1.txt into FILE_2.txt
cat FILE_1.txt > FILE_2.txt

# create a new file that lists the contents of the / directory
ls / > FILE.txt
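
A minimal sketch of redirection in action, assuming a scratch directory created with mktemp (the directory and the variable names are made up for this illustration). It also shows that > replaces the content of the target file instead of appending to it.

```shell
# Work in a throwaway directory so no real files are touched
demo=$(mktemp -d)

# Write one line into a new file
echo "first" > "$demo/FILE_1.txt"

# Copy the content of FILE_1.txt into FILE_2.txt
cat "$demo/FILE_1.txt" > "$demo/FILE_2.txt"
copied=$(cat "$demo/FILE_2.txt")

# Redirecting into FILE_2.txt again replaces what was there
echo "second" > "$demo/FILE_2.txt"
overwritten=$(cat "$demo/FILE_2.txt")

echo "$copied then $overwritten"
```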

The other one is called a pipeline: the | operator passes the output of one command to the next command as its input.

# list the files or folders whose names match both keywords
ls /home/host/ | grep KEYWORD_1 | grep KEYWORD_2
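
As a small illustration of a pipeline, the sketch below feeds four made-up file names through two grep filters in a row. The names are sample data, not files that need to exist.

```shell
# Each grep only sees what the previous command printed,
# so a name must contain both "report" and "2023" to survive
matches=$(printf '%s\n' report_2023.txt report_2024.txt notes_2023.txt todo.txt \
  | grep report | grep 2023)

echo "$matches"
```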

Some commands that we will often use to work with files in Linux:

GREP

Grep filters text: it prints the lines of a file (or of its input) that match a keyword.
Basic usage:

# display all matching lines
grep KEYWORD FILE.txt

# display all matching lines from multiple files
grep KEYWORD FILE_1.txt FILE_2.txt FILE_3.txt

# ignore case when matching the keyword
grep -i KEYWORD FILE.txt

# display the lines that do not match
grep -v KEYWORD FILE.txt

# display matching lines with their line numbers
grep -n KEYWORD FILE.txt

# display a count of matching lines
grep -c KEYWORD FILE.txt

# display lines that match either keyword 1 or keyword 2 (add -w to match whole words only)
grep -E "KEYWORD_1|KEYWORD_2" FILE.txt

# display each matching line plus the 4 lines after it, the number can be customized
grep -A 4 KEYWORD FILE.txt

# display each matching line plus the 4 lines before it, the number can be customized
grep -B 4 KEYWORD FILE.txt
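
To see how these options differ, here is a short sketch that runs a few of them against a four-line sample file. The file os.txt and its content are invented for the example.

```shell
demo=$(mktemp -d)
printf '%s\n' 'Linux is free' 'unix came first' \
  'macOS is based on BSD' 'linux runs everywhere' > "$demo/os.txt"

# Case-sensitive search: only the lowercase "linux" line matches
exact=$(grep linux "$demo/os.txt")

# -i ignores case and -c counts the matching lines
count=$(grep -ic linux "$demo/os.txt")

# -n prefixes each match with its line number
numbered=$(grep -n unix "$demo/os.txt")

# -v inverts the match: lines that do NOT contain "is"
inverted=$(grep -v is "$demo/os.txt")

printf '%s\n' "$exact" "$count" "$numbered" "$inverted"
```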

REGEX

We can also combine grep with regular expressions. Be careful: regex matching is case sensitive by default, so add -i when case should be ignored.
Basic usage:

grep -E "REGEX_PATTERN" FILE.txt

This command displays every line that matches the regex pattern.
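
A minimal sketch of grep -E with a concrete pattern, using made-up log lines as input. The pattern combines an anchor (^) with an alternation (|), two common building blocks of extended regular expressions.

```shell
# ^ ties the match to the start of the line, (error|warning) accepts either word
starts=$(printf '%s\n' 'error: disk full' 'warning: low memory' \
  'info: all good' 'no error here' \
  | grep -E '^(error|warning):')

echo "$starts"
```

The last sample line contains the word "error" but does not start with it, so it is filtered out.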


SED

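Sed stands for Stream Editor. Sed can search a file, find and replace text, and insert or delete lines. It is very useful for editing a file without opening it in an editor. By default sed prints the result to standard output and leaves the file unchanged; the -i option edits the file in place. Sed also supports regular expression patterns.
Basic usage: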

  • sed with s for substitution and / as the delimiter
# replace the first unix in each line with linux
sed 's/unix/linux/' FILE.txt

# replace every unix in each line with linux
sed 's/unix/linux/g' FILE.txt

# replace the second unix in each line with linux
sed 's/unix/linux/2' FILE.txt

# replace case-insensitively (the i flag is a GNU sed extension)
sed 's/unix/linux/i' FILE.txt

# replace every unix except on the fifth line
sed '5!s/unix/linux/g' FILE.txt

# replace the second unix and every later one in each line (GNU sed)
sed 's/unix/linux/2g' FILE.txt

# replace unix on the third line only
sed '3 s/unix/linux/' FILE.txt

# replace unix on the first through third lines
sed '1,3 s/unix/linux/' FILE.txt

# replace unix from the third line to the end of the file
sed '3,$ s/unix/linux/' FILE.txt

# display only the lines where a replacement was made
sed -n 's/unix/linux/p' FILE.txt

# display every line, with replaced lines shown twice
sed 's/unix/linux/p' FILE.txt
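
The substitution flags are easiest to compare on one line that contains the word three times. This is a small sketch assuming GNU sed (the 2g form is a GNU extension); the sample text is made up.

```shell
line='unix unix unix'

# No flag: first occurrence only
first=$(echo "$line" | sed 's/unix/linux/')

# g: every occurrence
all=$(echo "$line" | sed 's/unix/linux/g')

# 2: second occurrence only
second=$(echo "$line" | sed 's/unix/linux/2')

# 2g: second occurrence and all later ones (GNU sed)
from_second=$(echo "$line" | sed 's/unix/linux/2g')

# A line address limits the command to that line: only line 2 changes
addressed=$(printf '%s\n' unix unix unix | sed '2 s/unix/linux/')

printf '%s\n' "$first" "$all" "$second" "$from_second"
```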
  • sed with d for deletion
# delete the fifth line
sed '5d' FILE.txt

# delete the last line
sed '$d' FILE.txt

# delete the third through sixth lines
sed '3,6d' FILE.txt

# delete from the third line to the end of the file
sed '3,$d' FILE.txt

# delete lines matching the pattern
sed '/REGEX_PATTERN/d' FILE.txt

# delete lines matching the pattern, ignoring case (the I modifier is a GNU sed extension)
sed '/REGEX_PATTERN/Id' FILE.txt

# delete each line matching the pattern and the two lines after it
sed '/abc/,+2d' FILE.txt

# delete every second line, starting from the third line
sed '3~2d' FILE.txt

# delete blank lines
sed '/^$/d' FILE.txt

# delete comment lines (starting with #) and blank lines, editing the file in place with -i
sed -i '/^#/d;/^$/d' FILE.txt
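
The following sketch applies two of the deletion commands to a small, invented config file (app.conf is a hypothetical name). It leaves out -i on purpose, to show that sed then only prints the result and the file itself stays as it was.

```shell
demo=$(mktemp -d)
printf '%s\n' '# settings' 'name=demo' '' 'debug=true' '# end' > "$demo/app.conf"

# Two commands separated by ; drop comment lines and blank lines
cleaned=$(sed '/^#/d;/^$/d' "$demo/app.conf")

# Line-number addresses: delete line 2 through the last line
head_only=$(sed '2,$d' "$demo/app.conf")

# The file still has all five lines because -i was not used
still_there=$(wc -l < "$demo/app.conf")

printf '%s\n' "$cleaned" "$head_only" "$still_there"
```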
  • sed with G for file spacing
# insert one blank line after each line
sed G FILE.txt

# insert two blank lines after each line
sed 'G;G' FILE.txt

# insert a blank line above every line which matches the pattern
sed '/REGEX_PATTERN/{x;p;x;}' FILE.txt

# insert a blank line below every line which matches the pattern
sed '/REGEX_PATTERN/G' FILE.txt
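
A short sketch of the spacing commands on made-up input. G appends the (empty) hold space to each line, which shows up as a blank line; the braces around x;p;x keep all three commands tied to the matching line.

```shell
# Three input lines become six: each one is followed by a blank line
line_count=$(printf '%s\n' one two three | sed G | wc -l)

# x swaps in the empty hold space, p prints it as a blank line, x swaps back
above=$(printf '%s\n' a b c | sed '/b/{x;p;x;}')

echo "$line_count"
echo "$above"
```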
  • sed with -n and p for displaying a file
# display a file
sed -n p FILE.txt

# display only the fourth line
sed -n '4p' FILE.txt

# display from the second line to the fifth line
sed -n '2,5p' FILE.txt

# display only the last line
sed -n '$p' FILE.txt

# display from the third line to the end of the file
sed -n '3,$p' FILE.txt

# display the entire file except the second through fourth lines (no -n here, otherwise nothing is printed)
sed '2,4d' FILE.txt

# display only the lines matching the pattern
sed -n '/REGEX_PATTERN/p' FILE.txt

# display from the first line matching the pattern through the fifth line
sed -n '/REGEX_PATTERN/,5p' FILE.txt

# display from the second line through the next line matching the pattern
sed -n '2,/REGEX_PATTERN/p' FILE.txt
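
To make the addresses concrete, this sketch runs a few of them over the numbers 1 to 10, so each output line is also its own line number. It assumes the seq command is available, as it is on most Linux systems.

```shell
numbers=$(seq 1 10)

# -n silences the automatic printing, p prints only the addressed lines
middle=$(echo "$numbers" | sed -n '2,5p')

# $ addresses the last line
last=$(echo "$numbers" | sed -n '$p')

# Without -n, d removes lines 2-9 and everything else is printed
edges=$(echo "$numbers" | sed '2,9d')

# A range can end at the first line that matches a pattern
until_match=$(echo "$numbers" | sed -n '2,/4/p')

printf '%s\n' "$middle" "$last" "$edges" "$until_match"
```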
  • sed with = for numbering lines
# print the line number on a separate line before each line
sed = FILE.txt
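
Because = prints each number on a line of its own, a common follow-up is a second sed that joins the number with the text. The sketch below shows both steps on two made-up lines; the joining command is an addition for illustration, not something the = command does by itself.

```shell
# Step 1: the number and the text are on separate lines
raw=$(printf '%s\n' alpha beta | sed =)

# Step 2: N pulls the next line into the pattern space,
# then s replaces the newline between them with a space
joined=$(printf '%s\n' alpha beta | sed = | sed 'N;s/\n/ /')

echo "$raw"
echo "$joined"
```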

CUT

As the name suggests, cut is used for cutting out sections from each line of a file and writing the result to standard output. We can select parts of a line with these options:

  • cut with -b or --bytes=LIST: select by specifying a byte, a set of bytes, or a range of bytes.
# display first, second, and third byte of each line
cut -b '1,2,3' FILE.txt

# display first until third byte and fifth until seventh byte
cut -b '1-3,5-7' FILE.txt

# display third and next byte
cut -b '3-' FILE.txt

# display until third byte
cut -b '-3' FILE.txt
  • cut with -c or --characters=LIST: select by specifying a character, a set of characters, or a range of characters.
# display first, second, and third characters of each line
cut -c '1,2,3' FILE.txt

# display first until third characters and fifth until seventh characters
cut -c '1-3,5-7' FILE.txt

# display third and next characters
cut -c '3-' FILE.txt

# display until third characters
cut -c '-3' FILE.txt
  • cut with -f or --fields=LIST: select by specifying a field, a set of fields, or a range of fields. Fields are separated by a tab by default; another delimiter can be set with -d or --delimiter=DELIMITER.
# display the first field, using the default tab delimiter
cut -f '1' FILE.txt

# display the first through third fields, with comma delimiter
cut -d\, -f '1-3' FILE.txt

# display from the third field onward, with comma delimiter
cut -d\, -f '3-' FILE.txt

# display up to the third field, with comma delimiter
cut -d\, -f '-3' FILE.txt
  • cut with --complement: display all bytes, characters, or fields except the selected ones.
# display all bytes except the selected
cut --complement -b '1' FILE.txt

# display all characters except the selected
cut --complement -c '1' FILE.txt

# display all fields except the selected
cut --complement -d\, -f '1' FILE.txt
  • cut with -s or --only-delimited: do not print lines that do not contain the delimiter.
# display fields only for lines that contain the delimiter
cut -s -d\, -f '1-4' FILE.txt
  • cut with --output-delimiter=STRING: change the delimiter used in the output.
# display fields with # as the output delimiter
cut -d\, -f '1-4' FILE.txt --output-delimiter="#"
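
The cut options above can be tried on a single comma-separated row. This is a sketch with invented sample data, assuming GNU cut (--complement and --output-delimiter are GNU options).

```shell
row='alice,30,jakarta,admin'

# Select fields by number with a comma delimiter
name=$(echo "$row" | cut -d, -f1)
first_three=$(echo "$row" | cut -d, -f1-3)

# Everything except the first field
rest=$(echo "$row" | cut -d, -f1 --complement)

# Fields 1 and 4, joined with # in the output
hashed=$(echo "$row" | cut -d, -f1,4 --output-delimiter='#')

# A line without the delimiter is printed whole, unless -s is given
plain=$(echo 'no delimiter here' | cut -d, -f1)
skipped=$(echo 'no delimiter here' | cut -s -d, -f1)

printf '%s\n' "$name" "$first_three" "$rest" "$hashed" "$plain"
```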

SORT

Sort orders the lines of a file and prints the result to standard output. The original file is left unchanged.

# sort the lines in ascending order
sort FILE.txt

# sort the lines in descending order
sort -r FILE.txt

# sort and write the result to an output file
sort -o OUTPUT_FILE.txt ORIGINAL_TEXT.txt

# sort numerically in ascending order
sort -n FILE.txt

# sort numerically in descending order
sort -nr FILE.txt

# sort and remove duplicates
sort -u FILE.txt

# check whether a file is already sorted and report the first out-of-order line (if any)
sort -c FILE.txt

# sort a table by the second column (the key runs from field 2 to the end of the line)
sort -k2 FILE.txt

# sort by month name for a list of months
sort -M FILE.txt
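
A common surprise is that plain sort compares text, not numbers. The sketch below contrasts the two on made-up input; LC_ALL=C is set on the text sorts as an assumption to make the ordering independent of the system locale.

```shell
# Text order compares character by character, so "10" comes before "9"
lexical=$(printf '%s\n' 10 9 100 1 | LC_ALL=C sort)

# -n compares the values as numbers, -r reverses the result
numeric=$(printf '%s\n' 10 9 100 1 | sort -n)
reversed=$(printf '%s\n' 10 9 100 1 | sort -nr)

# -u keeps one copy of each line
unique=$(printf '%s\n' b a b c a | LC_ALL=C sort -u)

# -k2 with -n sorts the rows by the number in the second column
by_column=$(printf '%s\n' 'x 3' 'y 1' 'z 2' | sort -n -k2)

printf '%s\n' "$lexical" "$numeric" "$by_column"
```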

UNIQ

Uniq removes duplicated lines that are adjacent to each other. Because it only compares neighbouring lines, plain uniq gives the same result as sort -u only when the input is already sorted.
Basic usage:

# print the file with adjacent duplicate lines collapsed into one
uniq FILE.txt

# prefix each line with the number of times it was repeated
uniq -c FILE.txt

# print repeated lines, only display one duplicated line per group
uniq -d FILE.txt

# print repeated lines, display all duplicated lines
uniq -D FILE.txt

# print only the lines that are not repeated
uniq -u FILE.txt

# skip the first 2 fields when comparing lines
uniq -f 2 FILE.txt

# skip the first 2 characters when comparing lines
uniq -s 2 FILE.txt

# limit the comparison to the first 2 characters
uniq -w 2 FILE.txt

# make the comparison case-insensitive
uniq -i FILE.txt
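
Since uniq only looks at neighbouring lines, it is usually combined with sort. The following sketch, on made-up input, shows the difference between running uniq alone and after sort.

```shell
# Unsorted input: the last two "a" lines are adjacent, the first one is not
adjacent=$(printf '%s\n' a b a a | uniq)

# Sorting first puts all duplicates next to each other
sorted=$(printf '%s\n' a b a a | sort | uniq)

# -d prints one copy of each repeated line, -u only the lines that occur once
repeated=$(printf '%s\n' a b a a | sort | uniq -d)
single=$(printf '%s\n' a b a a | sort | uniq -u)

printf '%s\n' "$adjacent" "$sorted" "$repeated" "$single"
```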

TR

We can use this command to translate or delete characters.
Basic usage :

# translate lowercase to uppercase using ranges (no brackets needed around ranges)
cat FILE.txt | tr "a-z" "A-Z"

# translate lowercase to uppercase using character classes
cat FILE.txt | tr "[:lower:]" "[:upper:]"

# translate every whitespace character (including newlines) to a tab
cat FILE.txt | tr "[:space:]" "\t"

# translate round brackets to curly brackets
cat FILE.txt | tr "()" "{}"

# replace each run of whitespace with a single space
cat FILE.txt | tr -s "[:space:]" " "

# delete every A character
cat FILE.txt | tr -d "A"

# delete digit characters
cat FILE.txt | tr -d "[:digit:]"

# delete everything except digits
cat FILE.txt | tr -cd "[:digit:]"
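
Here is a quick sketch of the tr operations on short, made-up strings. Note that tr always reads from standard input, which is why every example pipes text into it.

```shell
# Translate: each lowercase letter maps to its uppercase counterpart
upper=$(echo 'hello world' | tr '[:lower:]' '[:upper:]')

# Translate by position: ( becomes { and ) becomes }
braces=$(echo '(x)' | tr '()' '{}')

# -s squeezes each run of spaces into a single space
squeezed=$(echo 'a    b   c' | tr -s ' ')

# -c complements the set and -d deletes: everything but digits is removed
digits=$(echo 'order 66, room 101' | tr -cd '[:digit:]')

printf '%s\n' "$upper" "$braces" "$squeezed" "$digits"
```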

AWK

Awk is used for manipulating data and generating reports.
Field identifiers and built-in variables:

  • $0 : entire line of text
  • $1 : first field
  • $2 : second field
  • $6 : sixth field
  • $NF : NF stands for "number of fields", so $NF represents the last field
  • NR : number of the current record (line)
  • OFS : output field separator
  • BEGIN : block that runs before the first line is read
  • END : block that runs after the last line is read

Basic usage :



# view file
awk '{print}' FILE.txt

# view specific fields of each line
awk '{print $2,$3,$6}' FILE.txt

# view the last field of each line
awk '{print $NF}' FILE.txt

# view the line number of each line (use NF instead for the number of fields)
awk '{print NR}' FILE.txt

# print specific fields with / as the separator
awk 'BEGIN {OFS="/"} {print $2,$3,$6}' FILE.txt

# print some text before the first line of output
awk 'BEGIN {print "some text"} {print $2,$3,$6}' FILE.txt

# print some text after the last line of output
awk '{print $2,$3,$6} END {print "some text"}' FILE.txt

# print the lines that match the given pattern
awk '/REGEX_PATTERN/' FILE.txt

# print specific fields of each line that matches the pattern
awk '/REGEX_PATTERN/ {print $2,$3,$6}' FILE.txt
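
To tie the variables together, this sketch runs awk over a three-line sample table. The file people.txt and its content are invented for the example; fields are separated by spaces, which awk splits on by default.

```shell
demo=$(mktemp -d)
printf '%s\n' 'alice 30 jakarta' 'bob 25 bandung' 'carol 35 jakarta' > "$demo/people.txt"

# $NF is the last field of each line
cities=$(awk '{print $NF}' "$demo/people.txt")

# NR is the current line number, NF the number of fields on that line
counts=$(awk '{print NR, NF}' "$demo/people.txt")

# Setting OFS in BEGIN changes the separator print puts between fields
joined=$(awk 'BEGIN {OFS="/"} {print $1,$3}' "$demo/people.txt")

# A pattern selects lines; END runs once after the last line
report=$(awk '/jakarta/ {print $1} END {print NR " lines read"}' "$demo/people.txt")

printf '%s\n' "$cities" "$counts" "$joined" "$report"
```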

Top comments (2)

Mohammad Husain

Awesome!

Mahmudul Hasan

Thanks, this is helpful documentation!