Muhammad

Posted on Sep 20, 2023

Commandline Productivity Part 1: fzf - The Command-Line Fuzzy Finder

#linux #commandline #bash

I've been using the command line extensively at my day job. I utilize various command line tools, enhancing my workflow and boosting my productivity. Therefore, I'm launching a new biweekly series where I'll cover the tools I use, dedicating each post to a specific tool. In today's post, we'll explore fzf - The Command Line Fuzzy Finder, and discuss how it can improve daily your workflow.

What's `fzf` and why use it?

Before I dive into fzf, I'd like to take a moment to explain what fuzzy matching is and discuss the algorithms used behind the scenes in fuzzy matching.

What's Fuzzy Matching?

Fuzzy matching is a technique used in computing to find strings that are approximately equal or closely resemble each other. Unlike exact matching, where the aim is to find an exact match or replicate, fuzzy matching identifies matches that may not be perfect but are "close enough" based on a set criteria.

Fuzzy matching is frequently used in data cleaning, where it helps in identifying duplicate records in large databases, even when the entries aren't exactly the same (e.g., "McDonald's" vs. "Mc Donalds").

It's also useful in search engines and autocorrect features, where slight variations or typos in a search term can still yield the desired results.

How it works? In Simplest Terms

Fuzzy matching algorithms evaluate strings based on various metrics, such as the number of changes required to turn one string into another (edit distance) or the number of shared character sequences. The outcome is often a score that represents the similarity between the two strings, with higher scores indicating greater similarity.

Algorithms used within Fuzzy Matching

Edit Distance (Levenshtein Distance):
This algorithm measures the similarity between two strings by determining the minimum number of single-character edits (i.e., insertions, deletions, or substitutions) required to change one string into the other. For example, the Levenshtein distance between "kitten" and "sitting" is 3.
Damerau-Levenshtein Distance: This is an extension of the Levenshtein distance, taking into account transpositions (swapping of two adjacent characters). For instance, the distance between "flaw" and "lawn" considering a transposition would be 1 using Damerau-Levenshtein.
Smith-Waterman Algorithm: Originally developed for bioinformatics, this local sequence alignment algorithm can also be used for text comparisons. It's particularly effective for scoring the similarity of substrings.
Jaro and Jaro-Winkler Distance: These are measures of similarity between two strings. The Jaro-Winkler distance gives more weight to the prefix of the strings and is especially useful for short strings like person names.
n-gram Analysis: In this technique, strings are broken down into overlapping substrings of 'n' characters. These n-grams are then compared to identify similarities. For example, using 2-grams (or bigrams), the word "hello" can be broken down into ["he", "el", "ll", "lo"]. For example, using 2-grams (or bigrams), the word "hello" can be broken down into ["he", "el", "ll", "lo"].
Token-Based Matching: This approach involves breaking strings into tokens (typically words) and comparing these tokens for similarity. Techniques like cosine similarity or Jaccard similarity can then be applied on these tokens.
Tf-idf (Term Frequency-Inverse Document Frequency): While more common in information retrieval systems, it can be applied to fuzzy matching. It measures how important a word is within a document relative to a collection of documents. It can be used in conjunction with cosine similarity for document comparisons
Longest Common Subsequence (LCS): This algorithm identifies the longest sequence of characters that two strings have in common. The LCS of "ABCBDAB" and "BDCAB" is "BCAB".

Different use-cases and applications may demand different algorithms or combinations of them. The choice often depends on the specific requirements of the task , such as the need for speed versus accuracy, the nature of the data, and the context in which the fuzzy matching is being applied.

So Now what's `fzf` and why use it?

fzf is a flexible tool that allows you to search and navigate any list (files, command history, git branches, etc.) using fuzzy matching. In essence, fuzzy matching means that you don't need to type exact search terms; instead, you can make typos or give partial input, and fzf will intelligently suggest matches.

For instance, if you have files named "important_document", "imported_files", and "impromptu_notes", typing "imp doc" in fzf might highlight "important_document" as the top match even though the search isn't an exact substring.

fzf is incredibly fast, enabling swift searches through files and command history. It offers an intuitive interface that lets you search through files in real-time as you type. Additionally, fzf provides numerous integrations with other tools, including Vim, among others.

Setting up and using `fzf`.

In order to use fzf you can simply follow this link which directs you to the installation instructions for fzf tailored to your OS. However, if you're on macOS, you can install fzf with the command brew install fzf, then execute /opt/homebrew/opt/fzf/install to install the shell completions.

`fzf` usage

Here's the basic usage of fzf,

File Search

fzf

Command History Search

Press CTRL + R in your terminal to interactively search through your command history.

Preview Window

$ find dir/ | fzf --preview 'cat {}'

Using fzf with Other Commands

$ ls -l $(fzf)  # List the details of a selected file

Select and Kill Processes

$ kill -9 $(ps aux | fzf | awk '{print $2}')

Filter Git Branches

$ git branch | fzf

Searching through your browser history (FireFox)

To search through your browser history, you can also utilize fzf. The SQLite database, which stores the history, is typically found at the following path on a Mac:

~/Library/Application Support/Firefox/Profiles/*.default-release.

After navigating to that directory, execute:

sqlite3 places.sqlite "SELECT url FROM moz_places" | fzf

Here we've covered the basic usage of fzf, but the tool offers so much more to explore and utilize. I hope this provided insight into how fzf can enhance your daily workflow. If you have any tips related to fzf, please share them in the comments. I'd also love to hear how you've been using fzf.

If you loved this post, you can always support my work by buying me a coffee. Also, if you end up sharing this on Twitter, definitely tag me @muhammad_o7.

Note: If you like to be notified about the upcoming posts you can follow me here or you can leave your email here

ORIGINALLY PUBLISHED HERE

DEV Community

Commandline Productivity Part 1: fzf - The Command-Line Fuzzy Finder

What's `fzf` and why use it?

What's Fuzzy Matching?

How it works? In Simplest Terms

Algorithms used within Fuzzy Matching

So Now what's `fzf` and why use it?

Setting up and using `fzf`.

`fzf` usage

Top comments (0)

Read next

Resolving CUDA Version and GPU Architecture Issues in ContourCraft

Using Bash to upload files to AWS S3

🐧 Linux: The Open-Source Powerhouse 🚀

Mastering Linux Process Management Like a Pro 🚀

What's fzf and why use it?

What's Fuzzy Matching?

How it works? In Simplest Terms

Algorithms used within Fuzzy Matching

So Now what's fzf and why use it?

Setting up and using fzf.

fzf usage

Read next

Resolving CUDA Version and GPU Architecture Issues in ContourCraft

Using Bash to upload files to AWS S3

🐧 Linux: The Open-Source Powerhouse 🚀

Mastering Linux Process Management Like a Pro 🚀

What's `fzf` and why use it?

So Now what's `fzf` and why use it?

Setting up and using `fzf`.

`fzf` usage