🚀 Supercharged SLIM Models: Multistep RAG Analysis That Never Leaves Your CPU 🧑‍💻

Many of us are used to models running in the cloud: we send API calls to far-away servers, and our prompts may be filed away as training data for the next wave of GPTs. And how else would this even work? Surely an individual laptop just doesn't have the power to manage and execute the workflows that a cloud-based service does.

Consider, for a moment, the mighty ant. At first glance, it may seem insignificant—a mere speck in the grand tapestry of nature. Yet, beneath its tiny exterior lies a powerhouse of strength, resilience, and ingenuity.

Enter SLIM - Structured Language Instruction Models.🏋️

These models are tiny and run comfortably on a CPU, but pack a punch when it comes to providing specialized, structured outputs. Instead of producing yet more bullet points or, god forbid, paragraphs, SLIM models output structured data such as CSV, JSON, and SQL.

The highly specialized nature of the SLIM models is precisely what makes them so powerful. Instead of applying one general model to a large problem, you string together a few SLIM models, which yields more robust performance with greater flexibility.

To show just how much these models can do, we are going to take a look at a tech tale worthy of invoking Gavin Belson: The partnership-turned-rivalry between Microsoft and IBM.

🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜🐜

0️⃣ Setup 🛠️

Make sure you have installed llmware and imported the libraries we are going to use. The code below should get you all set up.

Run this command in your terminal:

pip install llmware

Add these imports to the top of your code.

import os
import shutil

from llmware.agents import LLMfx
from llmware.library import Library
from llmware.retrieval import Query
from llmware.configs import LLMWareConfig
from llmware.setup import Setup

1️⃣ Build a Knowledge Base of Microsoft Documents 📖

First we need to create a database to query. In your case it can be anything from customer service reports to earnings calls, but for now we will use a range of Microsoft-related documents.

def multistep_analysis():

    """ In this example, our objective is to research Microsoft history and rivalry in the 1980s with IBM. """

    #   step 1 - assemble source documents and create library

    print("update: Starting example - agent-multistep-analysis")

    #   note: the program attempts to automatically pull sample document into local path
    #   depending upon permissions in your environment, you may need to set up directly
    #   if you pull down the samples files with Setup().load_sample_files(), in the Books folder,
    #   you will find the source: "Bill-Gates-Biography.pdf"
    #   if you have pulled sample documents in the past, then to update to latest: set over_write=True

    print("update: Loading sample files")

    sample_files_path = Setup().load_sample_files(over_write=False)
    bill_gates_bio = "Bill-Gates-Biography.pdf"
    path_to_bill_gates_bio = os.path.join(sample_files_path, "Books", bill_gates_bio)

    microsoft_folder = os.path.join(LLMWareConfig().get_tmp_path(), "example_microsoft")

    print("update: attempting to create source input folder at path: ", microsoft_folder)

    if not os.path.exists(microsoft_folder):
        os.mkdir(microsoft_folder)
        os.chmod(microsoft_folder, 0o777)
        shutil.copy(path_to_bill_gates_bio, os.path.join(microsoft_folder, bill_gates_bio))

    #   create library
    print("update: creating library and parsing source document")

    LLMWareConfig().set_active_db("sqlite")
    my_lib = Library().create_new_library("microsoft_history_0210_1")
    my_lib.add_files(microsoft_folder)
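
The folder setup above copies the PDF only when the folder is created for the first time. A slightly more defensive variant, sketched below with only the standard library, creates the folder if needed and copies the file whenever it is missing. The helper name `stage_file` is hypothetical and not part of llmware.

```python
import os
import shutil
import tempfile


def stage_file(source_path, target_folder):
    """Copy source_path into target_folder unless a copy is already there.

    Hypothetical helper for illustration; not part of llmware.
    """
    # exist_ok=True makes this safe to call on every run
    os.makedirs(target_folder, exist_ok=True)
    target_path = os.path.join(target_folder, os.path.basename(source_path))
    if not os.path.exists(target_path):
        shutil.copy(source_path, target_path)
    return target_path


# Demo with throwaway files so the sketch runs without llmware installed
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "sample.pdf")
    with open(src, "wb") as f:
        f.write(b"placeholder")
    staged = stage_file(src, os.path.join(tmp, "example_microsoft"))
    print("staged copy exists:", os.path.exists(staged))
```

In the tutorial's function, a call like this could stand in for the `if not os.path.exists(...)` block, with `path_to_bill_gates_bio` and `microsoft_folder` as the two arguments.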

2️⃣ Locate Mentions of IBM and Create an Agent to Process Them 🔍

In our first pass we focus on any mention of IBM, and since we have a multistep process we can analyse these instances on a more granular level.

    query = "ibm"
    search_results = Query(my_lib).text_query(query)
    print(f"update: executing query to filter to key passages - {query} - results found - {len(search_results)}")

    #   create an agent and load several tools that we will be using
    agent = LLMfx()
    agent.load_tool_list(["sentiment", "emotions", "topics", "tags", "ner", "answer"])

    #   load the search results into the agent's work queue
    agent.load_work(search_results)
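
Before handing the results to the agent, it can help to glance at what the query returned. The sketch below assumes each search result is a dictionary with `text`, `page_num`, and `file_source` keys, as in the `source` entry of the sample report at the end of this post. The `preview` helper and the sample record are made up for illustration.

```python
def preview(result, width=60):
    """Return a one-line summary of a search result dictionary.

    Assumes 'text', 'page_num' and 'file_source' keys (hypothetical helper).
    """
    # Collapse runs of whitespace left over from PDF parsing
    text = " ".join(result.get("text", "").split())
    if len(text) > width:
        text = text[:width].rstrip() + "..."
    return f"{result.get('file_source', '?')} p.{result.get('page_num', '?')}: {text}"


# Hand-written sample shaped like a result record
sample_result = {
    "text": "writers were contemptuous of IBM and   its coding culture.",
    "page_num": 35,
    "file_source": "Bill-Gates-Biography.pdf",
}

print(preview(sample_result, width=40))
```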

3️⃣ Pick out Negative Sentiment 🫳

This is where you get to decide the depth of your analysis for each item. For our scenario, we want only mentions of IBM that carry negative sentiment (evidence of the rivalry).

    while True:

        agent.sentiment()

        if not agent.increment_work_iteration():
            break

    #   analyze sections where the sentiment on ibm was negative
    follow_up_list = agent.follow_up_list(key="sentiment", value="negative")
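
The loop-then-filter pattern above can be illustrated without loading any models. The toy class below is a stand-in, not llmware's implementation. It assumes only the behaviour visible in this tutorial: `increment_work_iteration()` returns a falsy value once the queue is exhausted, and `follow_up_list(key, value)` returns the indices of work items whose recorded result matches. The keyword-based `sentiment` method is a placeholder for the real SLIM sentiment model.

```python
class ToyAgent:
    """Minimal stand-in for the agent's work queue (illustration only)."""

    def __init__(self, work_items):
        self.work_queue = list(work_items)
        self.iteration = 0
        self.report = [{} for _ in self.work_queue]

    def sentiment(self):
        # Placeholder classifier: a real agent would call a SLIM model here
        text = self.work_queue[self.iteration].lower()
        label = "negative" if "clunky" in text or "contemptuous" in text else "positive"
        self.report[self.iteration]["sentiment"] = [label]

    def increment_work_iteration(self):
        # Advance to the next item; report False when nothing is left
        if self.iteration + 1 >= len(self.work_queue):
            return False
        self.iteration += 1
        return True

    def follow_up_list(self, key, value):
        # Indices of work items whose result for `key` contains `value`
        return [i for i, entry in enumerate(self.report) if value in entry.get(key, [])]


toy_agent = ToyAgent([
    "IBM and Microsoft signed a joint development agreement.",
    "Writers were contemptuous of IBM and its coding culture.",
    "IBM wrote clunky code that was top-heavy with documentation.",
])

while True:
    toy_agent.sentiment()
    if not toy_agent.increment_work_iteration():
        break

print(toy_agent.follow_up_list(key="sentiment", value="negative"))
```

The point of the two-pass design is cost: a cheap classifier runs over every passage, and the heavier tools only touch the indices that survive the filter.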

4️⃣ Deep Dive Analysis 🤿

Now that we have picked out the instances we want to explore further, we arm our agent with tools. Each tool is a SLIM model specialized for a single task, and together they provide a comprehensive overview of the pertinent results.

    for job_index in follow_up_list:

        # follow-up 'deep dive' on selected text that references ibm negatively
        agent.set_work_iteration(job_index)
        agent.exec_multitool_function_call(["tags", "emotions", "topics", "ner"])
        agent.answer("What is a brief summary?", key="summary")

    my_report = agent.show_report(follow_up_list)

    activity_summary = agent.activity_summary()

    for entries in my_report:
        print("my report entries: ", entries)

    return my_report

Results 🎉🎉🎉

Your multistep local RAG pipeline should return a list of filled-out dictionaries, one per follow-up passage, each looking something like this:

report 1 entries:  {'sentiment': ['negative'], 'tags': '["IBM", "COBOL", "PL/1", "BAL", "OS/2", "Presentation Manager", "K.", "OS/2 1.0", "December 1987", "1.0"]', 'emotions': ['anger'], 'topics': ['ibm'], 'people': [], 'organization': ['IBM'], 'misc': ['OS/2', 'Presentation Manager'], 'summary': ['•IBM wrote "clunky" code that was top-heavy with lines of documentation to make the software "easy to service."\t\t•IBM wrote "clunky" code that was top-heavy with lines of documentation to make the software "easy to service."\t\t•IBM wrote "clunky" code that was top-heavy with lines of documentation to make the software "easy to service."\t\t•IBM wrote'], 'source': {'query': 'ibm', '_id': '174', 'text': 'writers were contemptuous of IBM and it\'s coding   culture. In the increasingly irrelevant world of IBM, the classical   languages were COBOL, PL/1, and BAL (Basic Assembly Language),   NOT C!    J.    In addition, IBM wrote "clunky" code that was top-heavy with lines of   documentation to make the software "easy to service."   K.    Finally, in December 1987 OS/2 1.0 without Presentation Manager ', 'doc_ID': 1, 'block_ID': 173, 'page_num': 35, 'content_type': 'text', 'author_or_speaker': 'IBM_User', 'special_field1': '', 'file_source': 'Bill-Gates-Biography.pdf', 'added_to_collection': 'Mon Jul  1 13:14:36 2024', 'table': '', 'coords_x': 162, 'coords_y': 414, 'coords_cx': 34, 'coords_cy': 45, 'external_files': '', 'score': -4.040003091801133, 'similarity': 0.0, 'distance': 0.0, 'matches': [[29, 'ibm'], [100, 'ibm'], [215, 'ibm']], 'account_name': 'llmware', 'library_name': 'microsoft_history_0210_1'}}
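
Two details in the sample entry are worth handling before further processing: `tags` arrives as a string containing a JSON-style list rather than a Python list, and `summary` repeats the same bullet several times, with the last copy cut off. The sketch below shows one way to tidy both, assuming the fields keep the shapes seen in the sample. `parse_tags` and `dedupe_summary` are hypothetical helper names, and the sample entry is shortened by hand.

```python
import json


def parse_tags(raw_tags):
    """Turn a string such as '["IBM", "OS/2"]' into a Python list."""
    if isinstance(raw_tags, list):
        return raw_tags
    try:
        parsed = json.loads(raw_tags)
    except (TypeError, ValueError):
        # Not valid JSON (or not a string at all): fall back to no tags
        return []
    return parsed if isinstance(parsed, list) else []


def dedupe_summary(summary_items):
    """Split bullet-prefixed summary text and keep each distinct bullet once."""
    bullets = []
    for item in summary_items:
        for part in item.split("•"):
            part = part.strip()
            # Skip empties, exact repeats, and truncated repeats of an earlier bullet
            if part and not any(kept.startswith(part) for kept in bullets):
                bullets.append(part)
    return bullets


# Shortened, hand-edited version of the entry shown above
sample_entry = {
    "tags": '["IBM", "COBOL", "PL/1"]',
    "summary": ['•IBM wrote "clunky" code.\t\t•IBM wrote "clunky" code.\t\t•IBM wrote'],
}

print(parse_tags(sample_entry["tags"]))
print(dedupe_summary(sample_entry["summary"]))
```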

The beauty of the output is its structured nature. You could easily hand the report off to another program, one that wouldn't need to waste precious time parsing natural language and could just flip to the right part of the dictionary. Besides saving time, this also improves accuracy and consistency.
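
As a rough sketch of such a hand-off, the snippet below flattens report entries into CSV rows using only the standard library. The column choice and the `report_to_csv` name are assumptions for illustration, and the sample entry is a trimmed copy of the one shown above.

```python
import csv
import io


def report_to_csv(report):
    """Flatten a list of report dictionaries into CSV text."""
    columns = ["page_num", "sentiment", "emotions", "organization"]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    for entry in report:
        source = entry.get("source", {})
        writer.writerow([
            source.get("page_num", ""),
            # List-valued fields are joined so each fits in one cell
            ";".join(entry.get("sentiment", [])),
            ";".join(entry.get("emotions", [])),
            ";".join(entry.get("organization", [])),
        ])
    return buffer.getvalue()


sample_report = [{
    "sentiment": ["negative"],
    "emotions": ["anger"],
    "organization": ["IBM"],
    "source": {"page_num": 35, "file_source": "Bill-Gates-Biography.pdf"},
}]

print(report_to_csv(sample_report))
```

No text parsing is involved: every value is read by key, which is what makes the downstream step both fast and predictable.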

If you want to learn more, below is a video walkthrough for this tutorial.