Reading 10x more research papers

#devchallenge #agentaichallenge #ai #machinelearning

This is a submission for the Agent.ai Challenge: Productivity-Pro Agent (See Details)

What I Built

My motivation

Build an agent that solves a business need I have
Build an agent that showcases a majority of the features Agent.ai has
Build an agent that showcases best practices in agent building: meta-prompting, semi-agentic behavior, chain-of-thought reasoning, prompt chaining, routing, parallelization, evaluator, and orchestrator-workers.
Build an agent whose code people can read to learn those best practices.

The business need it addresses
As a business leader who consults with customers on GenAI, I want to stay on top of the latest developments. I subscribe to a few newsletters that link to the latest research papers. Some of them seem highly relevant (e.g., a paper on Meta CoT prompting, another on Cache Augmented Generation), and I am always curious about the business implications. But I do not understand most of them, because they are technical and very long. I also have a highly technical colleague who would like to read the research but does not have time to read 100 pages. If I could read 10x more research papers, it would help me stay ahead of the GenAI tsunami.

What the agent does

It is flexible enough to tailor the research to the input a person gives (I might want a non-technical summary, someone else might want a technical breakdown, and someone else might want it explained as if to a five-year-old).
It asks clarifying questions about the input
It applies meta-prompting principles to generate the research prompts on the fly (semi-agentic, as opposed to a hardcoded workflow), since requests can vary widely.
It delegates each research prompt to its own worker, and each worker has a dynamically assigned role.
It collates and summarizes all the worker outputs
In addition to generating textual output, it converts that output into a podcast in which two people discuss the findings and plays it as audio. It also generates the transcript as a PDF and stores it on my Google Drive.
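
The plan, delegate, and collate steps above can be sketched in plain Python. This is only an illustration of the pattern, not the Agent.ai implementation: the function names (plan_prompts, run_workers, collate) and the fake_llm stand-in are assumptions made up for this sketch, and a real agent would call an actual model instead.

```python
import json
from typing import Callable, List

# Hypothetical type: any function that takes a prompt and returns model text.
LLM = Callable[[str], str]


def plan_prompts(llm: LLM, paper_title: str, audience: str) -> List[str]:
    """Meta-prompting step: the model writes the worker prompts itself."""
    meta_prompt = (
        f"Break the analysis of '{paper_title}' for a {audience} reader "
        "into independent tasks. Return a JSON array of prompts, each "
        "starting with 'You are a <role>.'"
    )
    return json.loads(llm(meta_prompt))


def run_workers(llm: LLM, prompts: List[str]) -> List[str]:
    """One worker call per generated prompt; outputs are kept in order."""
    return [llm(prompt) for prompt in prompts]


def collate(llm: LLM, outputs: List[str]) -> str:
    """Final step: merge every worker output into a single report."""
    joined = "\n\n".join(outputs)
    return llm("Summarize these findings into one report:\n\n" + joined)


def fake_llm(prompt: str) -> str:
    """Offline stand-in for a real model so the sketch runs anywhere."""
    if prompt.startswith("Break the analysis"):
        return json.dumps([
            "You are a Theme Analyst. Summarize the key themes.",
            "You are a Strategy Advisor. Recommend when to use agents.",
        ])
    if prompt.startswith("Summarize these findings"):
        count = prompt.count("WORKER OUTPUT")
        return f"REPORT covering {count} worker outputs"
    return "WORKER OUTPUT for: " + prompt


if __name__ == "__main__":
    prompts = plan_prompts(fake_llm, "Building effective agents", "non-technical")
    outputs = run_workers(fake_llm, prompts)
    print(collate(fake_llm, outputs))
```

Because the worker prompts come from the planner rather than from a fixed list, the same three functions handle a technical breakdown or a non-technical summary without any change to the workflow.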

Technically I wanted to showcase a bunch of features in Agent.ai

Using labels as variables to drive interactivity
Showcasing IF/ELSE branching (users can upload the research as PDF or HTML).
Semi-agentic behavior with meta-prompting and dynamically generating prompts.
Mixture of experts pattern - Using different AI actions as different workers and personas (Task Reframing Assistant / Strategic Task Planner / Report Organizer etc.)
Using the Strategic Task Planner to break down the Job to be done into smaller independent tasks that are executed by different roles dynamically (depends on the inputs).
Using a FOR loop to iterate through a list of tasks, each of which spawns a worker action dynamically.
Storing the outputs inside the FOR loop in a list variable.
Accessing the list variable outside the FOR loop to summarize and generate a report.
In addition to the report, convert the output into an audio podcast.
Invoke an agent that takes text (transcript of the audio) and generates Audio.
Call a Web API that takes the podcast transcript and generates a PDF (and stores it in my Google Drive, which currently only I can access). I wrote the code for this using ChatGPT.
I also wanted to build my own version of what I think OpenAI does with o1 when it does its own reasoning (a lot of that is CoT and meta-prompting, which I tried to mimic). It takes a long time to run because it is doing a lot of work, just like o1.
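
For readers who think in code, the IF/ELSE branch and the FOR loop with a list variable from the feature list above map roughly onto the Python below. This is a hedged sketch, not what Agent.ai runs internally: load_paper, run_tasks, build_report, and the pdf_reader parameter are hypothetical names, and PDF extraction is left to a caller-supplied function because the Python standard library has no PDF parser.

```python
from html.parser import HTMLParser
from typing import Callable, List, Optional


class _TextOnly(HTMLParser):
    """Collects the text between tags (a naive extractor for this sketch)."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks: List[str] = []

    def handle_data(self, data: str) -> None:
        if data.strip():
            self.chunks.append(data.strip())


def html_to_text(html: str) -> str:
    parser = _TextOnly()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def load_paper(source_type: str, payload: str,
               pdf_reader: Optional[Callable[[str], str]] = None) -> str:
    """IF/ELSE branch on the upload type the user selected."""
    if source_type == "pdf":
        if pdf_reader is None:
            raise ValueError("a PDF text extractor must be supplied")
        return pdf_reader(payload)
    elif source_type == "html":
        return html_to_text(payload)
    raise ValueError(f"unsupported source type: {source_type}")


def run_tasks(paper_text: str, tasks: List[str],
              worker: Callable[[str, str], str]) -> List[str]:
    """FOR loop: one worker call per task, results stored in a list."""
    worker_outputs: List[str] = []
    for task in tasks:
        worker_outputs.append(worker(task, paper_text))
    return worker_outputs


def build_report(worker_outputs: List[str]) -> str:
    """The list is read after the loop has finished to assemble the report."""
    return "\n".join(worker_outputs)
```

A call such as build_report(run_tasks(load_paper("html", page), tasks, worker)) then mirrors the order of the actions in the agent: branch, loop, summarize.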

Metaprompting Output (prompts generated dynamically)

[
  "You are a Theme Analyst. Your task is to identify and summarize the key themes in the research paper that are relevant to business leaders, focusing on the simplicity and composability of agentic systems. Reference the section 'Building effective agents'.",
  "You are a Strategy Advisor. Your task is to provide actionable recommendations for business leaders on when to use agentic systems versus simpler LLM setups, based on the trade-offs discussed in the section 'When (and when not) to use agents'.",
  "You are a Technology Consultant. Your task is to evaluate the benefits and drawbacks of using frameworks for implementing agentic systems, as outlined in the section 'When and how to use frameworks'.",
  "You are a Systems Architect. Your task is to outline the different workflow patterns for agentic systems and their ideal use cases, as described in the section 'Building blocks, workflows, and agents'.",
  "You are a Risk and Ethics Analyst. Your task is to assess the potential risks and ethical considerations of deploying autonomous agents in business environments, drawing insights from the section 'Agents'."
]
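
Each generated prompt above follows the shape "You are a <role>. Your task is ...", which is what lets every worker receive its role dynamically. As a small illustration, assuming the planner output is valid JSON, the role can be separated from the task as below. The split_role helper and its regular expression are hypothetical and not part of the agent.

```python
import json
import re
from typing import List, Tuple

# Matches the leading "You are a <role>." or "You are an <role>." sentence.
ROLE_PATTERN = re.compile(r"^You are an? (?P<role>[^.]+)\.")


def split_role(prompt: str) -> Tuple[str, str]:
    """Return (role, task) for one generated worker prompt."""
    match = ROLE_PATTERN.match(prompt)
    if match is None:
        # No recognizable role sentence: keep the whole prompt as the task.
        return "Unknown", prompt
    return match.group("role"), prompt[match.end():].strip()


def parse_planner_output(raw_json: str) -> List[Tuple[str, str]]:
    """Parse the planner's JSON array and split every prompt."""
    return [split_role(prompt) for prompt in json.loads(raw_json)]


if __name__ == "__main__":
    raw = json.dumps([
        "You are a Theme Analyst. Your task is to summarize the key themes.",
        "You are a Risk and Ethics Analyst. Your task is to assess risks.",
    ])
    for role, task in parse_planner_output(raw):
        print(role, "->", task)
```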

Demo

https://agent.ai/builder/agent/edit?id=zfe41l9kff9br2pv

https://youtu.be/lonJS6Xbc7A?feature=shared

Agent.ai Experience

I love that, as a business executive who is not a coder, I can build these kinds of applications to solve real business needs. I would never have thought I would enter a hackathon, or build one agent that addresses multiple hackathon entries.

The number of integrations and the flexibility of the platform allow me to build a lot of agents.

The biggest challenge is troubleshooting. My agent has over 25 actions, and there are still some bugs and nuances that take a long time to tackle. It would be great if there were a way to make certain actions inactive, the way you comment out a line of code.

DEV Community
