DEV Community

Cover image for From Scratch: How to Develop a File Search Tool Rivaling "Everything" Using Pure C#
RoJ
RoJ

Posted on

From Scratch: How to Develop a File Search Tool Rivaling "Everything" Using Pure C#

It's well known that searching for files by reading the USN journal is far more efficient than traditional recursive folder traversal. However, achieving the extreme speed of "Everything" is not simple—even a delay of just a few dozen milliseconds can make a world of difference in user experience.

Today, "Everything"'s interface style feels somewhat classic, and some of its operational habits are difficult to highly customize. Presumably, many users desire a search tool that is both extremely efficient and perfectly tailored to their personal usage habits: such as more flexible hotkeys, richer right-click menus, and subsequent operation linkages. Precisely for this reason, I decided to redevelop such a tool and document the thought process and the "pitfalls" encountered throughout the development. The final code and implementation are completely open source.


1. Overall Approach

The entire project can be divided into two main parts: Data Processing and UI Interaction.

  • Data Processing Flow:

    • a. Read file information from the disk cache
    • b. Load data into memory
    • c. Perform string matching
    • d. Return matching results
  • UI Interaction Flow:

    • a. Respond to user input
    • b. Trigger search and obtain results
    • c. Update the interface display

2. Improving Data Processing Efficiency

a. Reading File Information from the Disk Cache

We obtain file information by reading the USN journal. Each record should contain at least: the File Reference Number (FileReferenceNumber, ulong), the Parent File Reference Number (ParentFileReferenceNumber, ulong), and the file name (including folder names). Note: the USN does not directly provide the full path; we need to piece together the real path step by step based on the parent references.

Another key aspect is incremental updates. The USN can record file state changes (like creation, deletion, renaming). By monitoring events such as USN_REASON_FILE_CREATE | USN_REASON_FILE_DELETE | USN_REASON_RENAME_NEW_NAME, efficient data updates can be achieved, far superior to traditional file monitoring methods.

b. Search Strategy in Memory

The most basic search method is to traverse all strings, but this performance is far from sufficient. I tried the following methods:

  • Building an Index Mechanism: Using a Bloom-filter-like structure with ulong type for preliminary filtering. Preliminary screening via pre-computed "OR" operations showed a measured 2–3x search speed improvement in tests.

    public static ulong TBS(string txt)
    {
        ulong indexValue = 0;
    
        for (int i = 0; i < SCREENCHARNUM; i++)
        {        
            if (txt.Contains(alphbet[i], StringComparison.OrdinalIgnoreCase))
            {
                SetBit(ref indexValue, i);
            }
            else
            {
                ClearBit(ref indexValue, i);
            }
        }
        return indexValue;
    }
    
  • Multithreaded Search: Grouping data and processing it in parallel with multiple threads. However, in practical tests (with 5M+ data points), performance gains were minimal due to the frequent need for lock synchronization of results, and it only added code complexity.

  • Pinyin Initial Support: This project includes Pinyin initial retrieval functionality. While the implementation is relatively straightforward, it introduces some matching overhead. Removing this feature could further improve search speed.

c. Method for Returning Results

Search results are stored in a fixed-length array. Each search operation only sequentially updates the references for the top matching results and the total count, incurring almost no additional overhead.


3. Optimizing UI Response and Rendering

The importance of UI response design is no less than that of data processing; they are equally crucial. This project tried both WinForms and Avalonia UI frameworks, ultimately choosing Avalonia for its AOT publishing support, which offers a more modern interface effect while maintaining similar performance.

Responding to User Input and Executing Tasks

We execute "user input" and "string matching" on different threads, using an asynchronous signaling mechanism to trigger or cancel search tasks,坚决避免 (resolutely avoiding) blocking the UI thread. The search task runs in a persistent loop to avoid the overhead of repeatedly creating threads.

Efficiently Updating the UI

Facing the real-time display of massive data, a virtualized UI is mandatory.

  • In WinForms, the virtual mode of ListView can be used to dynamically generate items.
  • In Avalonia, I used a ListBox virtualization solution combined with the ReactiveUI framework, binding the DataContext to a variable-length data model. Each time the search results update, display items are lazily generated via the IEnumerable interface, achieving smooth rendering.

    public class DataViewModel<T> : ReactiveObject
    {
        private IList<T> _allData = [];
        private IEnumerable<T> _displayedData = [];
        private int _displayCount = 100;
    
        public IEnumerable<T> DisplayedData
        {
            get => _displayedData;
            private set => this.RaiseAndSetIfChanged(ref _displayedData, value);
        }
    
        public int DisplayCount
        {
            get => _displayCount;
            set
            {
                _displayCount = value;
                UpdateDisplayedData();
            }
        }
    
        public DataViewModel()
        {
        }
    
        public void Bind(IList<T> _allData)
        {
            if (this._allData != _allData)
            {
                // Generate test data (in practice, loaded from file or DB)
                this._allData = _allData;
                UpdateDisplayedData();
            }
        }
    
        private void UpdateDisplayedData()
        {
            // Use LINQ's Take(), which is lazy-evaluated and performs well
            DisplayedData = _allData.Take(DisplayCount);
        }
    
        // Quickly switch to a different scale
        public void SetDisplayCount(int count)
        {
            if (DisplayCount != count)
            {
                if (count < 0) count = 0;
                DisplayCount = count;
            }
        }
    }
    

File paths and icons are obtained dynamically (caching can be appropriately introduced). Icons were initially loaded asynchronously, but this caused flickering. Switching to synchronous acquisition resulted in more stable performance.

public string FileName => PathHelper.getfilePath(fileName).ToString();
public string FilePath => PathHelper.GetPath(this).ToString();
Enter fullscreen mode Exit fullscreen mode

4. Additional Features

On this foundation, we also implemented features such as global hotkeys, search history, and custom right-click menus, further enhancing practicality and personalization.


5. Conclusion

If you have any suggestions or ideas about this tool, feel free to discuss! The project is completely open source on GitHub. If you find it useful, welcome to give it a Star ⭐️! https://github.com/LdotJdot/TDS

Top comments (0)