
Sundeep

Posted on • Originally published at learnbyexample.github.io

Multiline fixed string search and replace with cli tools

Credit: Cover image generated using carbon

This post shows how you can use ripgrep, perl and sd to perform multiline fixed string search and replace operations from the command line. A solution with GNU sed is also discussed, along with its limitations.

Fixed string matching

The below sample input file will be used in the examples in this post.

$ cat ip.txt
This is a multiline
sample input with lots
of special characters
like . () * [] $ {}
^ + ? \ and ' and so on.
This post shows how
you can do fixed
-string multiline
search with cli tools.

ripgrep

ripgrep supports the -U option to allow multiline matching. Use the -F option to turn off regexp matching, i.e. treat the search string literally. In the bash shell (and likely most other shells), you can press the Enter key to insert a literal newline character inside a quoted value. When you do so, the next line starts with > and a space character (the continuation prompt). This prompt isn't shown in the examples below to make it easier to copy-paste the commands.

$ rg -UF 'like . () * [] $ {}
^ + ? \ and' ip.txt
4:like . () * [] $ {}
5:^ + ? \ and ' and so on.

# -l option shows only filename instead of all the matching lines
$ rg -lUF 'like . () * [] $ {}
^ + ? \ and' ip.txt
ip.txt

You'll have an issue if your search string itself contains single quote characters. Avoid using double quotes as a workaround, as they have their own set of special characters. Instead, concatenate multiple quoted strings next to each other, with escaped single quote characters in between as needed.

# -N option disables line number prefix
$ rg -NUF 'like . () * [] $ {}
^ + ? \ and '\'' and' ip.txt
like . () * [] $ {}
^ + ? \ and ' and so on.
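To check what the shell actually passes to the command after such concatenation, you can print the string first. This is only a sketch; the variable name pat is made up for illustration.

```shell
# three pieces joined together:
#   'like ... and '   single-quoted text
#   \'                an escaped single quote outside any quotes
#   ' and'            single-quoted text again
pat='like . () * [] $ {}
^ + ? \ and '\'' and'

# %s prints the value as-is, backslashes are not interpreted
printf '%s\n' "$pat"
```

The printed value should be the two-line search string with a literal single quote in it, which is what rg receives as its pattern argument.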

If your search string starts with the - character, you'll have to use -- before the search argument so that it isn't treated as an option.

$ rg -NUF -- '-string multiline
search' ip.txt
-string multiline
search with cli tools.

perl

You can use the -0777 option with perl to slurp the entire input as a single string. Another advantage of perl is that you can use files to pass the search and replace strings. Thus, you don't have to worry about any character that may clash with shell metacharacters. See my Perl one-liners cookbook if you are not familiar with using perl from the command line.

$ cat search_1.txt
like . () * [] $ {}
^ + ? \ and ' and so on.

# display filename if the given search string matches
$ perl -0777 -nE '!$#ARGV ? $s=$_ :
                  /\Q$s/ && say $ARGV' search_1.txt ip.txt
ip.txt

However, if you are providing partial lines for searching, you'll have to make sure the search file doesn't end with a newline, or take care of it within the perl script.

$ cat search_2.txt
-string multiline
search

# no output because there's a newline at the end of search_2.txt file
$ perl -0777 -nE '!$#ARGV ? $s=$_ :
                  /\Q$s/ && say $ARGV' search_2.txt ip.txt

# this will remove newline at the end of file before assigning to $s
$ perl -0777 -nE '!$#ARGV ? $s=s/\n\z//r :
                  /\Q$s/ && say $ARGV' search_2.txt ip.txt
ip.txt

By default, ripgrep prints the entire lines containing a match, whereas perl's $& has only the matched portion. To get the rest of the line with perl, you'll have to explicitly add a pattern around the search string.

# $& variable has the entire matching portion
$ perl -0777 -nE '!$#ARGV ? $s=s/\n\z//r :
                  /\Q$s/ && say $&' search_2.txt ip.txt
-string multiline
search

# use 'say $& while /.*\Q$s\E.*/g' if there are multiple matches
$ perl -0777 -nE '!$#ARGV ? $s=s/\n\z//r :
                  /.*\Q$s\E.*/ && say $&' search_2.txt ip.txt
-string multiline
search with cli tools.

Fixed string substitution

ripgrep

ripgrep also supports replacing the matched string with something else using the -r option. By default, you'll see only the matched lines in the output. Use the --passthru option to display all the input lines, even if they do not match the given search string. See my blog post for more details about the -r option and various ways you can use it for substitution requirements.

$ rg --passthru -NUF 'like . () * [] $ {}
^ + ? \ and' -r '====
----
====' ip.txt
This is a multiline
sample input with lots
of special characters
====
----
==== ' and so on.
This post shows how
you can do fixed
-string multiline
search with cli tools.

Apart from having to work around single quotes, you'll have to use $$ instead of $ in the replacement section, as $ is used for backreferences there.

$ echo 'sample input' | rg --passthru -F 'in' -r '$a'
sample put
$ echo 'sample input' | rg --passthru -F 'in' -r '$$a'
sample $aput

perl

With perl, you can use files for both the search and replace strings. You can also easily choose to replace the first or all occurrences, unlike ripgrep, which always replaces all the matches.

$ cat replace.txt
---------------------
$& = $1 + $2 / 3 \ 4
=====================

$ perl -0777 -ne '$#ARGV==1 ? $s=$_ : $#ARGV==0 ? $r=$_ :
                  print s/\Q$s/$r/gr' search_1.txt replace.txt ip.txt
This is a multiline
sample input with lots
of special characters
---------------------
$& = $1 + $2 / 3 \ 4
=====================
This post shows how
you can do fixed
-string multiline
search with cli tools.

As seen before, you'll have to remove the trailing newline from the search string for partial line matching.

# use $r=s/\n\z//r to avoid trailing newline from replace.txt
$ perl -0777 -ne '$#ARGV==1 ? $s=s/\n\z//r : $#ARGV==0 ? $r=$_ :
                  print s/\Q$s/$r/gr' search_2.txt replace.txt ip.txt
This is a multiline
sample input with lots
of special characters
like . () * [] $ {}
^ + ? \ and ' and so on.
This post shows how
you can do fixed
---------------------
$& = $1 + $2 / 3 \ 4
=====================
 with cli tools.

sd

sd supports fixed string and Rust regexp based substitution. Unlike ripgrep, the -s option for fixed strings applies to both the search and replacement sections. sd does in-place editing for file inputs by default; you can use -p to preview the results on the terminal. Multiline matching is performed by default.

$ echo 'sample input' | sd -s 'in' '$a'
sample $aput

$ sd -ps 'like . () * [] $ {}
^ + ? \ and' '====
----
====' ip.txt
This is a multiline
sample input with lots
of special characters
====
----
==== ' and so on.
This post shows how
you can do fixed
-string multiline
search with cli tools.

Saving file contents to a variable

Trailing newlines and ASCII NUL characters will be lost if you save the contents of a file to a bash variable using the var=$(< filename) command. See stackoverflow: pitfalls of reading file into shell variable for details.

$ printf '\na\0b\n123\n\n\n\n\n\n\n\n' > t1
$ a=$(< t1)

# NUL character is lost after the assignment
# all the trailing newlines are lost as well
$ printf '%b' "$a" | cat -A
$
ab$
123
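A common workaround for the lost trailing newlines is to append a sentinel character inside the command substitution and strip it afterwards. NUL characters still cannot be stored in a bash variable, so this helps only with the newlines. Here's a minimal sketch; the temporary file and the variable names plain and kept are made up for illustration.

```shell
tmp=$(mktemp)
printf 'a\nb\n\n\n' > "$tmp"

# all three trailing newlines are removed
plain=$(< "$tmp")

# the x at the end protects the trailing newlines from removal
kept=$(cat "$tmp"; printf x)
# strip the sentinel character
kept=${kept%x}

# number of characters saved in each variable
printf '%s\n' "${#plain}" "${#kept}"
rm "$tmp"
```

The first length counts only a, a newline and b, while the second one also includes the three newlines at the end.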

ripgrep

If your search string doesn't have multiple trailing newlines or ASCII NUL characters, then you can save the file contents to variables and pass them to ripgrep. A single trailing newline will not normally cause an issue for search operations, as ripgrep will append a newline while displaying the results anyway. If you want to make sure that the input file also contains the trailing newline, then you can manually concatenate a newline character to the search string.

$ s=$(< search_1.txt)
# use "$s"$'\n' if you want to match trailing newline as well
$ rg -NUF "$s" ip.txt
like . () * [] $ {}
^ + ? \ and ' and so on.

# use -- if the search string starts with - character
$ s=$(< search_2.txt)
$ rg -NUF -- "$s" ip.txt
-string multiline
search with cli tools.

For substitution operations, you'll have to preprocess the replacement file to replace $ with $$.

$ s=$(< search_1.txt)
$ r=$(sed 's/\$/$$/g' replace.txt)

# here, removal of trailing newline doesn't cause an issue,
# as it evens out between search and replace strings
$ rg --passthru -NUF "$s" -r "$r" ip.txt
This is a multiline
sample input with lots
of special characters
---------------------
$& = $1 + $2 / 3 \ 4
=====================
This post shows how
you can do fixed
-string multiline
search with cli tools.
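If the replacement text is already in a variable, the $ to $$ preprocessing can also be done with bash parameter expansion instead of sed. This is just a sketch that assumes bash; the variable names r and esc are made up for illustration.

```shell
# middle line of replace.txt, used here as a sample replacement string
r='$& = $1 + $2 / 3 \ 4'

# ${var//pattern/replacement} replaces all occurrences
# \$ is needed in both places to get literal $ characters
esc=${r//\$/\$\$}

printf '%s\n' "$esc"
```

Here, esc should have every $ doubled and the rest of the string left untouched, making it suitable as the argument for the -r option.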

Here, a partial line has to be matched, so $() assignment works well for the search string. If the trailing newline of the replacement string isn't needed, then $() assignment is again good enough. Otherwise, you can modify the replacement string as -r "$r"$'\n'.

$ s=$(< search_2.txt)
$ r=$(sed 's/\$/$$/g' replace.txt)

$ rg --passthru -NUF -r "$r" -- "$s" ip.txt
This is a multiline
sample input with lots
of special characters
like . () * [] $ {}
^ + ? \ and ' and so on.
This post shows how
you can do fixed
---------------------
$& = $1 + $2 / 3 \ 4
===================== with cli tools.

sd

As mentioned before, the -s option for sd applies to both the search and replacement sections. So, the usage is a lot simpler compared to ripgrep.

# -- is needed here because replace.txt starts with - character
$ sd -ps -- "$(< search_1.txt)" "$(< replace.txt)" ip.txt
This is a multiline
sample input with lots
of special characters
---------------------
$& = $1 + $2 / 3 \ 4
=====================
This post shows how
you can do fixed
-string multiline
search with cli tools.

GNU sed

To follow a similar approach with GNU sed, you'll have to preprocess the strings to escape metacharacters. Assuming the input doesn't have ASCII NUL characters, you can use the -z option to slurp the entire input as a single string.

Here's an example for multiline search.

# escape all BRE metacharacters
# replace literal newlines with \n
$ s=$(sed -z 's#[[^$*.\/]#\\&#g; s/\n/\\n/g' search_1.txt)

# since newlines are replaced with \n,
# trailing newlines will be preserved here
$ echo "$s"
like \. () \* \[] \$ {}\n\^ + ? \\ and ' and so on\.\n

# display filename if input matches the given multiline search string
# tr is used to change NUL character after filename to newline
$ sed -nz '/'"$s"'/F' ip.txt | tr '\0' '\n'
ip.txt

And here's an example for multiline substitution.

# last newline is removed here to allow partial line matching
$ s=$(sed -z 's#[[^$*.\/]#\\&#g; s/\n$//; s/\n/\\n/g' search_2.txt)

# escape all replacement section metacharacters
# and prefix \ character to literal newlines, except the last line
$ r=$(sed 's:[\\/&]:\\&:g; $!s/$/\\/' replace.txt)
$ echo "$r"
---------------------\
$\& = $1 + $2 \/ 3 \\ 4\
=====================

# if you need trailing newline from replace.txt,
# use sed -z 's/'"$s"'/'"$r"'\n/g'
$ sed -z 's/'"$s"'/'"$r"'/g' ip.txt
This is a multiline
sample input with lots
of special characters
like . () * [] $ {}
^ + ? \ and ' and so on.
This post shows how
you can do fixed
---------------------
$& = $1 + $2 / 3 \ 4
===================== with cli tools.
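If you need this often, the two escaping steps can be wrapped in small shell functions. Here's a sketch that assumes GNU sed; the function names escape_search and escape_replace and the sample strings are made up for illustration.

```shell
# escape BRE metacharacters and convert newlines to \n
# reads the search string from stdin
escape_search() { sed -z 's#[[^$*.\/]#\\&#g; s/\n/\\n/g'; }

# escape replacement section metacharacters and
# prefix \ to literal newlines, except for the last line
escape_replace() { sed 's:[\\/&]:\\&:g; $!s/$/\\/'; }

# printf '%s' is used so that no trailing newline is added
s=$(printf '%s' '$5.00 [a*b]' | escape_search)
r=$(printf '%s' 'a/b & c\d' | escape_replace)

out=$(printf '%s\n' 'cost: $5.00 [a*b]' 'next line' |
      sed -z 's/'"$s"'/'"$r"'/g')
printf '%s\n' "$out"
```

The same limitations apply as before: the input cannot have NUL characters, and any trailing newlines in the replacement string are lost due to the $() assignment.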
