txtai is an all-in-one embeddings database for semantic search, LLM orchestration and language model workflows.
txtai 8.0 was recently released and added the ability to run agents. Agents automatically create workflows to answer multi-faceted user requests.
Agents connect a series of tools with a reasoning engine (i.e. an LLM). We give the agent a degree of latitude to work through its own internal logic to address a user's request.
This is a huge paradigm shift. We're talking about handing control over to a program and hoping it makes the right decisions on its own. Perhaps there are some parallels to sending your kid to college - we hope we've raised them the right way to be able to make smart decisions 😂.
This article focuses on examples that give agents autonomy to address requests. With this, we can start to see the path ahead towards more and more automation of tasks.
Install dependencies
Install txtai and all dependencies.
pip install txtai[graph] autoawq
Let's get creative
In the first example, we'll define an agent that has access to the txtai-wikipedia embeddings database. Standard retrieval augmented generation (RAG) and vector search are typically designed for a single search. Agents enable a much more creative and iterative approach to search.
This example won't tell the agent exactly what we're looking for. Instead, we'll ask the agent to tell us something interesting based on its own ideas and research.
from txtai import Agent
agent = Agent(
    tools=[{
        "name": "wikipedia",
        "description": "Searches a Wikipedia database",
        "provider": "huggingface-hub",
        "container": "neuml/txtai-wikipedia"
    }],
    llm="hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4",
    max_iterations=10,
)
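As an aside, the wikipedia tool above wraps a prebuilt txtai embeddings index hosted on the Hugging Face Hub. If you want to see what the agent is working with, the same index can be loaded and searched directly. A minimal sketch (the query string is just an example):
from txtai import Embeddings
# Load the same Wikipedia index the agent uses as a tool
wikipedia = Embeddings()
wikipedia.load(provider="huggingface-hub", container="neuml/txtai-wikipedia")
# Run a single vector search - the kind of query the agent issues on our behalf
print(wikipedia.search("Voynich Manuscript", 1))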
from IPython.display import display, Markdown
answer = agent("""
I'm bored 🥱. Think about 2-3 disparate topics and use those to search wikipedia to generate something fascinating.
Write a report summarizing each article. Include a section with a list of article hyperlinks.
Write the text as Markdown.
""", maxlength=16000)
display(Markdown(answer))
======== New task ========
I'm bored 🥱. Think about 2-3 disparate topics and use those to search wikipedia to generate something fascinating.
Write a report summarizing each article. Include a section with a list of article hyperlinks.
Write the text as Markdown.
=== Agent thoughts:
Thought: I will use Wikipedia to search for articles related to three disparate topics. The first topic will be the "Great Barrier Reef", the second topic will be "Dark Matter", and the third topic will be "The Voynich Manuscript".
>>> Calling tool: 'wikipedia' with arguments: {'query': 'Great Barrier Reef'}
=== Agent thoughts:
Thought: The results from the Wikipedia search show that the Great Barrier Reef is a vast and diverse ecosystem that supports a wide variety of life. However, it is also facing many threats, including climate change, pollution, and overfishing.
>>> Calling tool: 'wikipedia' with arguments: {'query': 'Dark Matter'}
=== Agent thoughts:
Thought: The results from the Wikipedia search show that Dark Matter is a hypothetical form of matter that does not interact with light or other electromagnetic radiation, and is implied by gravitational effects that cannot be explained by general relativity.
>>> Calling tool: 'wikipedia' with arguments: {'query': 'The Voynich Manuscript'}
=== Agent thoughts:
Thought: The results from the Wikipedia search show that the Voynich Manuscript is an illustrated codex that has been carbon-dated to the early 15th century, and its origins, authorship, and purpose are still debated.
>>> Calling tool: 'final_answer' with arguments: {'answer': "### Report on Fascinating Topics\n\n#### The Great Barrier Reef\n\nThe Great Barrier Reef is the world's largest coral reef system, composed of over 2,900 individual reefs and 900 islands stretching for over 2,300 kilometers off the coast of Queensland, Australia. It is a vast and diverse ecosystem that supports a wide variety of life, including coral and other invertebrate species, bony fish, sharks, rays, marine mammals, marine turtles, sea snakes, and algae. However, it is also facing many threats, including climate change, pollution, and overfishing.\n\n#### Dark Matter\n\nDark Matter is a hypothetical form of matter that does not interact with light or other electromagnetic radiation. It is implied by gravitational effects that cannot be explained by general relativity, and is a topic of ongoing research in astrophysics and cosmology.\n\n#### The Voynich Manuscript\n\nThe Voynich Manuscript is an illustrated codex that has been carbon-dated to the early 15th century. Its origins, authorship, and purpose are still debated, and it is considered one of the most mysterious and enigmatic manuscripts in the world.\n\n### Article Hyperlinks\n\n* [Great Barrier Reef](https://en.wikipedia.org/wiki/Great_Barrier_Reef)\n* [Dark Matter](https://en.wikipedia.org/wiki/Dark_matter)\n* [Voynich Manuscript](https://en.wikipedia.org/wiki/Voynich_manuscript)"}
[Output shown below]
Report on Fascinating Topics
The Great Barrier Reef
The Great Barrier Reef is the world's largest coral reef system, composed of over 2,900 individual reefs and 900 islands stretching for over 2,300 kilometers off the coast of Queensland, Australia. It is a vast and diverse ecosystem that supports a wide variety of life, including coral and other invertebrate species, bony fish, sharks, rays, marine mammals, marine turtles, sea snakes, and algae. However, it is also facing many threats, including climate change, pollution, and overfishing.
Dark Matter
Dark Matter is a hypothetical form of matter that does not interact with light or other electromagnetic radiation. It is implied by gravitational effects that cannot be explained by general relativity, and is a topic of ongoing research in astrophysics and cosmology.
The Voynich Manuscript
The Voynich Manuscript is an illustrated codex that has been carbon-dated to the early 15th century. Its origins, authorship, and purpose are still debated, and it is considered one of the most mysterious and enigmatic manuscripts in the world.
Article Hyperlinks
💥 Interesting indeed. The fundamental concept of search is that we need to know what to look for. In this case, we didn't have that (i.e. we're bored 😀).
Let's move on to another example. This time we'll look at the txtai-hfposts embeddings database. We'll ask the agent to research a specific topic, then write a report on what it found.
It's important to mention that txtai agents support any LLM supported by txtai. So while the examples so far have used Llama 3.1 8B, other LLMs, both local and remote, are also supported (e.g. OpenAI, AWS Bedrock, Anthropic).
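For instance, pointing the agent at a hosted model is just a change to the llm parameter. The sketch below is only illustrative: it assumes the LiteLLM extra is installed and an OPENAI_API_KEY is set in the environment, and the model name is an assumption rather than part of this article.
# Hypothetical configuration using a hosted API model via txtai's LiteLLM support
agent = Agent(
    tools=[{
        "name": "wikipedia",
        "description": "Searches a Wikipedia database",
        "provider": "huggingface-hub",
        "container": "neuml/txtai-wikipedia"
    }],
    llm="gpt-4o-mini",
    max_iterations=10,
)
To change it up for this example, we'll use one of the Qwen models, which have been receiving a lot of positive reviews lately.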
agent = Agent(
    tools=[{
        "name": "hfposts",
        "description": "Searches a database of technical posts on Hugging Face",
        "provider": "huggingface-hub",
        "container": "neuml/txtai-hfposts"
    }],
    llm="Qwen/Qwen2.5-7B-Instruct-AWQ",
    max_iterations=10,
)
answer = agent("""
Read posts about medicine and write a report on what you learned.
The report should be a Markdown table with the following columns.
- Name
- Description
- Link to content
Only include rows that have a valid web url.
""", maxlength=16000)
display(Markdown(answer))
======== New task ========
Read posts about medicine and write a report on what you learned.
The report should be a Markdown table with the following columns.
- Name
- Description
- Link to content
Only include rows that have a valid web url.
=== Agent thoughts:
Thought: First, I need to search for posts related to medicine using the `hfposts` tool.
>>> Calling tool: 'hfposts' with arguments: {'query': 'medicine'}
=== Agent thoughts:
Thought: Now that I have the posts related to medicine, I need to filter out the ones that have a valid web URL. I will go through each post and check if it contains a URL.
>>> Calling tool: 'hfposts' with arguments: {'query': 'medicine url'}
=== Agent thoughts:
Thought: Now that I have the posts with URLs, I need to extract the relevant information such as the name, description, and link to content. I will create a Markdown table with these details.
>>> Calling tool: 'hfposts' with arguments: {'query': 'medicine url'}
=== Agent thoughts:
Thought: Now that I have the posts with URLs, I can extract the relevant information and create a Markdown table. I will format the data into the required structure.
>>> Calling tool: 'final_answer' with arguments: {'answer': '| Name | Description | Link to content |\n| --- | --- | --- |\n| OpenBioLLM Release | Introducing OpenBioLLM-Llama3-70B & 8B: The most capable openly available Medical-domain LLMs to date! | [https://huggingface.co/blog/aaditya/openbiollm](https://huggingface.co/blog/aaditya/openbiollm) |\n| Last Week in Medical AI: Top Research Papers/Models (September 1 - September 7, 2024) | Outperforms industry giants like GPT-4, Gemini, Meditron-70B, Med-PaLM-1, and Med-PaLM-2 in the biomedical domain. | [https://x.com/OpenlifesciAI/status/1832476252260712788](https://x.com/OpenlifesciAI/status/1832476252260712788) |\n| Last Week in Medical AI: Top Research Papers/Models (August 25 - August 31, 2024) | Includes MultiMed, a Multimodal Medical Benchmark, and A Foundation model for generating chest X-ray images. | [https://x.com/OpenlifesciAI/status/1829984701324448051](https://x.com/OpenlifesciAI/status/1829984701324448051) |\n| Last Week in Medical AI: Top Research Papers/Models (October 5 - October 12, 2024) | Introduces MMedAgent: Learning to Use Medical Tools with Multi-modal Agent. | [https://youtu.be/OD3C5jirszw](https://youtu.be/OD3C5jirszw) |\n| Last Week in Medical AI: Top Research Papers/Models (October 26 - November 2, 2024) | Google Presents MDAgents: An Adaptive Collaboration of LLMs for Medical Decision-Making. | [https://x.com/OpenlifesciAI/status/1852685220912464066](https://x.com/OpenlifesciAI/status/1852685220912464066) '}
[Output shown below]
| Name | Description | Link to content |
| --- | --- | --- |
| OpenBioLLM Release | Introducing OpenBioLLM-Llama3-70B & 8B: The most capable openly available Medical-domain LLMs to date! | https://huggingface.co/blog/aaditya/openbiollm |
| Last Week in Medical AI: Top Research Papers/Models (September 1 - September 7, 2024) | Outperforms industry giants like GPT-4, Gemini, Meditron-70B, Med-PaLM-1, and Med-PaLM-2 in the biomedical domain. | https://x.com/OpenlifesciAI/status/1832476252260712788 |
| Last Week in Medical AI: Top Research Papers/Models (August 25 - August 31, 2024) | Includes MultiMed, a Multimodal Medical Benchmark, and A Foundation model for generating chest X-ray images. | https://x.com/OpenlifesciAI/status/1829984701324448051 |
| Last Week in Medical AI: Top Research Papers/Models (October 5 - October 12, 2024) | Introduces MMedAgent: Learning to Use Medical Tools with Multi-modal Agent. | https://youtu.be/OD3C5jirszw |
| Last Week in Medical AI: Top Research Papers/Models (October 26 - November 2, 2024) | Google Presents MDAgents: An Adaptive Collaboration of LLMs for Medical Decision-Making. | https://x.com/OpenlifesciAI/status/1852685220912464066 |
🚀 Once again, very interesting! This time we asked the agent to go read about a topic and report back. The agent did that and left us with links to explore further.
Autonomous Embeddings
For our last example, we're going to give an agent free rein to control an embeddings database.
First, we will create an empty embeddings database and tell the agent how to add and search for data.
from txtai import Agent, Embeddings
from txtai.pipeline import Textractor
from txtai.workflow import Workflow, Task
# Empty embeddings database
embeddings = Embeddings(
    path="intfloat/e5-large",
    instructions={"query": "query: ", "data": "passage: "},
    content=True
)

# Textractor instance
textractor = Textractor(sections=True, headers={"user-agent": "Mozilla/5.0"})

def insert(elements):
    """
    Inserts elements into the embeddings database.

    Args:
        elements: list of strings to insert
    """

    def upsert(elements):
        embeddings.upsert(elements)
        return elements

    # Upsert workflow
    workflow = Workflow([Task(textractor), Task(upsert)])
    list(workflow(elements))

    return f"{elements} inserted successfully"

agent = Agent(
    tools=[insert, embeddings.search, "websearch"],
    llm="Qwen/Qwen2.5-7B-Instruct-AWQ",
    max_iterations=10
)
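Note that these tools are plain Python callables, so they can be exercised directly before handing control to the agent. A quick sketch, assuming network access (the URL and query are only examples):
# Hypothetical direct call to the insert tool defined above
insert(["https://github.com/neuml/txtai"])
# Query the embeddings database the same way the agent's search tool does
print(embeddings.search("What is txtai?", 1))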
Next, we'll define a workflow process. This creates the rules we want the agent to follow. Then we'll run two requests.
prompt = """
Run the following process:
1. Search your internal knowledge for {topic}
2. If not found, find relevant urls and insert those as a list of strings ONLY. Then rerun the search for {topic}.
3. Write a detailed report about {topic} with Markdown sections covering the major topics. Include a section with hyperlink references.
"""
answer = agent(prompt.format(topic="txtai"))
display(Markdown(answer))
======== New task ========
Run the following process:
1. Search your internal knowledge for txtai
2. If not found, find relevant urls and insert those as a list of strings ONLY. Then rerun the search for txtai.
3. Write a detailed report about txtai with Markdown sections covering the major topics. Include a section with hyperlink references.
=== Agent thoughts:
Thought: First, I will attempt to search for "txtai" in my internal knowledge base.
>>> Calling tool: 'search' with arguments: {'query': 'txtai'}
=== Agent thoughts:
Thought: Since the search for "txtai" did not yield any results, I will proceed to find relevant URLs and insert them as a list of strings.
>>> Calling tool: 'web_search' with arguments: {'query': 'txtai'}
=== Agent thoughts:
Thought: Now that I have found relevant URLs, I will insert them into the database as a list of strings.
>>> Calling tool: 'insert' with arguments: {'elements': ['https://github.com/neuml/txtai', 'https://pypi.org/project/txtai/', 'https://neuml.github.io/txtai/', 'https://neuml.com/', 'https://medium.com/neuml/introducing-txtai-the-all-in-one-embeddings-database-c721f4ff91ad', 'https://neuml.github.io/txtai/embeddings/', 'https://neuml.github.io/txtai/examples/']}
=== Agent thoughts:
Thought: Now that the relevant URLs have been inserted, I will rerun the search for "txtai".
>>> Calling tool: 'search' with arguments: {'query': 'txtai'}
=== Agent thoughts:
>>> Calling tool: 'final_answer' with arguments: {'answer': '# txtai Overview\n\ntxtai is an all-in-one open-source embeddings database designed for semantic search, LLM orchestration, and language model workflows. It combines vector indexes (both sparse and dense), graph networks, and relational databases to offer a comprehensive solution for various tasks.\n\n## Semantic Search\n\ntxtai supports semantic search, enabling users to find relevant information based on the meaning of the text rather than just keywords. This makes it suitable for applications such as recommendation systems, content retrieval, and more.\n\n## LLM Orchestration\n\ntxtai includes support for Large Language Models (LLMs) orchestration, which allows users to integrate and manage different LLMs within their workflows. This feature is particularly useful for tasks such as prompt chaining, summarization, transcription, translation, and more.\n\n## Embeddings Databases\n\ntxtai provides a unified interface for embedding, indexing, searching, and processing text and multimedia data. Users can create, index, and search embeddings using code snippets and examples provided in the documentation.\n\n## Hyperlink References\n\n- [txtai on GitHub](https://github.com/neuml/txtai)\n- [PyPI - txtai](https://pypi.org/project/txtai/)\n- [txtai Documentation](https://neuml.github.io/txtai/)\n- [txtai Examples](https://neuml.github.io/txtai/examples/)\n- [txtai Embeddings](https://neuml.github.io/txtai/embeddings/)'}
[Output shown below]
txtai Overview
txtai is an all-in-one open-source embeddings database designed for semantic search, LLM orchestration, and language model workflows. It combines vector indexes (both sparse and dense), graph networks, and relational databases to offer a comprehensive solution for various tasks.
Semantic Search
txtai supports semantic search, enabling users to find relevant information based on the meaning of the text rather than just keywords. This makes it suitable for applications such as recommendation systems, content retrieval, and more.
LLM Orchestration
txtai includes support for Large Language Models (LLMs) orchestration, which allows users to integrate and manage different LLMs within their workflows. This feature is particularly useful for tasks such as prompt chaining, summarization, transcription, translation, and more.
Embeddings Databases
txtai provides a unified interface for embedding, indexing, searching, and processing text and multimedia data. Users can create, index, and search embeddings using code snippets and examples provided in the documentation.
Hyperlink References
answer = agent(prompt.format(topic="openscholar"))
display(Markdown(answer))
======== New task ========
Run the following process:
1. Search your internal knowledge for openscholar
2. If not found, find relevant urls and insert those as a list of strings ONLY. Then rerun the search for openscholar.
3. Write a detailed report about openscholar with Markdown sections covering the major topics. Include a section with hyperlink references.
=== Agent thoughts:
Thought: First, I will search my internal knowledge for "openscholar".
>>> Calling tool: 'search' with arguments: {'query': 'openscholar'}
=== Agent thoughts:
Thought: Since the search did not return any relevant information about "openscholar", I will find relevant URLs and insert them as a list of strings. Then, I will rerun the search for "openscholar".
>>> Calling tool: 'web_search' with arguments: {'query': 'openscholar'}
=== Agent thoughts:
Thought: Now that I have found relevant URLs about "openscholar", I will insert these URLs into the database.
>>> Calling tool: 'insert' with arguments: {'elements': ['https://openscholar.allen.ai/', 'https://allenai.org/blog/openscholar', 'https://arxiv.org/abs/2411.14199', 'https://github.com/AkariAsai/OpenScholar', 'https://www.marktechpost.com/2024/11/22/the-allen-institute-for-ai-ai2-introduces-openscholar-an-open-ecosystem-for-literature-synthesis-featuring-advanced-datastores-and-expert-level-results/', 'https://venturebeat.com/ai/openscholar-the-open-source-a-i-thats-outperforming-gpt-4o-in-scientific-research/']}
=== Agent thoughts:
Thought: Now that the relevant URLs have been inserted, I will rerun the search for "openscholar" to gather more detailed information.
>>> Calling tool: 'search' with arguments: {'query': 'openscholar'}
=== Agent thoughts:
Thought: The search has returned relevant information about OpenScholar. I will now compile a detailed report about OpenScholar with Markdown sections covering the major topics.
>>> Calling tool: 'final_answer' with arguments: {'answer': "# OpenScholar Report\n\n## Introduction\nOpenScholar is an advanced AI system developed by the Allen Institute for AI and the University of Washington. It aims to revolutionize scientific research by processing millions of research papers and providing citation-backed answers to complex research questions.\n\n## Key Features\n- **Data Processing**: Capable of handling 45 million research papers instantly.\n- **Citation Accuracy**: Achieves high citation accuracy, challenging proprietary AI systems.\n- **Customizable Tools**: Offers specialized tools and built-in features for creating impactful websites.\n\n## Research Impact\n- **Data Utilization**: Integrates a vast corpus of scientific literature, making it the largest open-access corpus for scientific literature synthesis.\n- **Expert-Level Results**: Provides answers that match or exceed those of human experts in various scientific domains.\n\n## Future Directions\n- **Continued Development**: Ongoing efforts to improve the system's capabilities and expand its applications.\n- **Community Engagement**: Plans for expanding the user base and fostering a community around OpenScholar.\n\n## References\n- [OpenScholar](https://openscholar.allen.ai/)\n- [VentureBeat Article](https://venturebeat.com/ai/openscholar-the-open-source-a-i-thats-outperforming-gpt-4o-in-scientific-research/)\n- [Allen Institute for AI](https://allenai.org/blog/openscholar)"}
[Output shown below]
OpenScholar Report
Introduction
OpenScholar is an advanced AI system developed by the Allen Institute for AI and the University of Washington. It aims to revolutionize scientific research by processing millions of research papers and providing citation-backed answers to complex research questions.
Key Features
- Data Processing: Capable of handling 45 million research papers instantly.
- Citation Accuracy: Achieves high citation accuracy, challenging proprietary AI systems.
- Customizable Tools: Offers specialized tools and built-in features for creating impactful websites.
Research Impact
- Data Utilization: Integrates a vast corpus of scientific literature, making it the largest open-access corpus for scientific literature synthesis.
- Expert-Level Results: Provides answers that match or exceed those of human experts in various scientific domains.
Future Directions
- Continued Development: Ongoing efforts to improve the system's capabilities and expand its applications.
- Community Engagement: Plans for expanding the user base and fostering a community around OpenScholar.
References
🔥 Amazing.
Remember, we started with an empty embeddings database. Then we gave basic instructions on how to use the available tools. From there, the agent operated autonomously and answered user requests. The agent also stored what it learned for future requests. This gave the agent its own internal memory.
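A quick way to see that memory in action is to query the embeddings database directly after the runs above. Assuming both requests completed, the content the agent inserted is now searchable (sketch only):
# The agent's inserts persist in the embeddings database and serve future requests
print(embeddings.search("What is OpenScholar?", 1))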
Of course, we could program a process that implements this workflow. But think about the productivity gains this opens up to so many more people. We're enabling people to control a process simply by pairing a set of tools with a plain-English description of what they want.
Exciting times!
Wrapping up
This article demonstrated ways to run agents in a more autonomous fashion. While the technology isn't perfect, we can certainly see a path ahead where new models continue to do a better job. Even now, with the right agents and targeted tools, much can be done.
Think about the difference between now and 6-12 months ago. Where will we be in another 6-12 months?