<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Aarushi Kansal</title>
    <description>The latest articles on DEV Community by Aarushi Kansal (@aarushikansal).</description>
    <link>https://dev.to/aarushikansal</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F212124%2F312e21cd-ca77-43da-99cc-cdd84221539e.jpg</url>
      <title>DEV Community: Aarushi Kansal</title>
      <link>https://dev.to/aarushikansal</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/aarushikansal"/>
    <language>en</language>
    <item>
      <title>Personal movie recommendation agent with GPT4 + Neo4J</title>
      <dc:creator>Aarushi Kansal</dc:creator>
      <pubDate>Tue, 20 Jun 2023 09:25:43 +0000</pubDate>
      <link>https://dev.to/aarushikansal/personal-movie-recommendation-agent-with-gpt4-neo4j-3aib</link>
      <guid>https://dev.to/aarushikansal/personal-movie-recommendation-agent-with-gpt4-neo4j-3aib</guid>
      <description>&lt;p&gt;Large Language Model (LLM) powered applications become more powerful and intriguing when you start leveraging the model's reasoning abilities, rather than pure generation abilities. &lt;/p&gt;

&lt;p&gt;And combine that reasoning with external tools like databases and APIs and you have yourself an application that can reason and take actions. &lt;/p&gt;

&lt;p&gt;Over the past few months I've been deep in the world of using LLMs for both personalized recommendations and reasoning, and combining the two concepts into more of an assistant-style format. &lt;/p&gt;

&lt;p&gt;When you think of recommendation engines, maybe you see it as a huuuge topic to tackle, teams of data scientists, ML engineers, GPUs? &lt;/p&gt;

&lt;p&gt;While that holds true, you can still get started with some basic DS algorithms, an LLM and some UI libraries. &lt;/p&gt;

&lt;p&gt;I wanted to see what I could do with a knowledge graph since they can be a good basis for recommendation engines. And luckily, LangChain already has a chain that can be used with Neo4j, so that's what we'll use. &lt;/p&gt;

&lt;h2&gt;
  
  
  Getting started
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Data
&lt;/h3&gt;

&lt;p&gt;First we need a dataset and we'll use the default movies one provided by Neo4j. You can also find others on Kaggle, or create your own (probably more for when you're ready for a production system, cause trust me, cleaning data is a tiresome task!) &lt;/p&gt;

&lt;p&gt;You can find the various existing sets + set up your sandbox database &lt;a href="https://sandbox.neo4j.com/" rel="noopener noreferrer"&gt;here&lt;/a&gt;. &lt;/p&gt;

&lt;h3&gt;
  
  
  LangChain
&lt;/h3&gt;

&lt;p&gt;If there's something you want to use with an LLM, your best bet is to first check what's going on in the LangChain world. &lt;/p&gt;

&lt;p&gt;LLMs are very good at creating Cypher queries, and I wanted to use an LLM to give users a conversational way to get their personalized recommendations. This essentially means we need something that takes a user's natural language and creates a Cypher query out of it that can be used to query the DB. &lt;/p&gt;

&lt;p&gt;And LangChain's &lt;a href="https://python.langchain.com/docs/modules/chains/additional/graph_cypher_qa" rel="noopener noreferrer"&gt;Graph DB QA chain&lt;/a&gt; does just that. &lt;/p&gt;

&lt;p&gt;Set up is fairly simple, as shown below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;graph = Neo4jGraph(
    url="bolt://18.212.1.173:7687",
    username="neo4j", 
    password="finishes-executions-arcs"
)


os.environ['OPENAI_API_KEY'] = "sk-key"

chain = GraphCypherQAChain.from_llm(
    ChatOpenAI(model_name="gpt-4", temperature=0.0), graph=graph, verbose=True,
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Main thing here is setting the temperature to 0 so the LLM doesn't try and get 'creative' with queries. We want a deterministic output here. &lt;/p&gt;

&lt;p&gt;With just this set up alone, you can start doing some basic Q+A as well as getting it to run basic queries. &lt;/p&gt;

&lt;h3&gt;
  
  
  Q&amp;amp;A + Basic queries
&lt;/h3&gt;

&lt;p&gt;Let's try out this chain, like so:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;chain.run(
    """
        Set Cynthia Freeman's rating for Inception to 4.0.
    """
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key here is that while the LLM has no awareness of your actual data or schema by default, the chain runs a few queries to fetch the entire schema and passes it in as context in the prompt. So all of a sudden, it's like your LLM has a solid understanding of your exact DB! A simple but elegant solution really. &lt;/p&gt;
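&lt;p&gt;As a toy illustration of that idea (the schema string and template below are hypothetical, not LangChain's actual internals), the fetched schema simply rides along in every prompt:&lt;/p&gt;

```python
# Toy sketch: the graph schema is fetched once and interpolated into the
# prompt, so the LLM generates Cypher against your exact labels and relations.
SCHEMA = (
    "Node labels: Movie {title, imdbRating}, User {name}, Genre {name}\n"
    "Relationships: (User)-[:RATED {rating}]-(Movie), (Movie)-[:IN_GENRE]-(Genre)"
)

PROMPT_TEMPLATE = (
    "Generate a Cypher statement to answer the question.\n"
    "Schema:\n{schema}\n"
    "Question: {question}\nCypher:"
)

def build_prompt(question: str) -> str:
    # The schema is baked into every request as context.
    return PROMPT_TEMPLATE.format(schema=SCHEMA, question=question)

prompt = build_prompt("Set Cynthia Freeman's rating for Inception to 4.0.")
```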

&lt;p&gt;And that's why it's great at figuring out queries like the above. &lt;/p&gt;

&lt;p&gt;Let's try out some other questions, we're working on recommendations right, so let's see if our LLM can find us movies in similar genres. &lt;/p&gt;

&lt;p&gt;&lt;code&gt;find me movies most similar to 'Inception'&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;At this point, most likely it'll try and base it on genre or IMDb ratings. So far, pretty good. The LLM is doing enough reasoning to understand what 'similar' could mean (i.e. genre or ratings). &lt;/p&gt;

&lt;p&gt;You can tell it to base similarity on genre, making sure it always does so. Go further and tell it actors and it'll correctly identify that too - by figuring out the movie-to-actors relationship and counting how many shared actors a movie has with Inception. &lt;/p&gt;

&lt;p&gt;Now, at this point, the aspect I'm more interested in is the different similarity functions and algorithms we can apply to start getting recommendations for movies. &lt;/p&gt;

&lt;p&gt;In particular, I want to use Neo4j's &lt;a href="https://neo4j.com/docs/graph-data-science/current/" rel="noopener noreferrer"&gt;data science library&lt;/a&gt;. You can go ahead and read about the different ways of calculating similarities, if you aren't familiar.&lt;/p&gt;

&lt;p&gt;And I want to do some collaborative filtering, based on kNN. Essentially, I want recommendations based on users with similar tastes to mine: their top rated movies that I haven't rated (no rating implies it hasn't been watched). &lt;/p&gt;

&lt;p&gt;So you can try any form of that question now.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Who are the 5 users with tastes in movies most similar to Aarushi Kansal? What movies have they rated highly that Aarushi Kansal hasn't rated yet?&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This seems pretty specific right, but unfortunately, it doesn't give me quite what I want. The query it comes out with is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;MATCH (u1:User {name: "Aarushi Kansal"})-[:RATED]-&amp;gt;(m1:Movie)&amp;lt;-[:IN_GENRE]-(g:Genre)-[:IN_GENRE]-&amp;gt;(m2:Movie)&amp;lt;-[:RATED]-(u2:User)
WHERE NOT (u1)-[:RATED]-&amp;gt;(m2)
WITH u2, count(*) AS similarity, m2.title AS recommended_movie, m2.imdbRating as rating
ORDER BY similarity DESC, rating DESC
RETURN u2.name AS user, recommended_movie
LIMIT 5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Okay so what if we go more specific?&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Who are the 5 users with tastes in movies most similar to Aarushi Kansal? What movies have they rated highly that Aarushi Kansal hasn't rated yet? Use kNN and Pearson similarity&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;At this point, it tried hard, but the query just doesn't work at all:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;MATCH (u1:User {name: "Aarushi Kansal"})-[:RATED]-&amp;gt;(m1:Movie)&amp;lt;-[:RATED]-(u2:User)
WITH u1, u2, tofloat(count(m1)) as numCommonMovies
MATCH (u1)-[r1:RATED]-&amp;gt;(m1:Movie)&amp;lt;-[r2:RATED]-(u2)
WITH u1, u2, numCommonMovies, m1,
     (r1.rating - u1.avgRating) * (r2.rating - u2.avgRating) as simNumer,
     (r1.rating - u1.avgRating) * (r1.rating - u1.avgRating) as simDenom1,
     (r2.rating - u2.avgRating) * (r2.rating - u2.avgRating) as simDenom2
WITH u1, u2, numCommonMovies, sum(simNumer) as simNumer, sum(simDenom1) as simDenom1, sum(simDenom2) as simDenom2
WITH u1, u2, simNumer, sqrt(simDenom1 * simDenom2) as simDenom
WHERE simDenom &amp;gt; 0
WITH u1, u2, numCommonMovies, simNumer / simDenom as pearson
ORDER BY pearson DESC, numCommonMovies DESC, u2.name ASC
LIMIT 10
MATCH (u2)-[r:RATED]-&amp;gt;(m:Movie)
WHERE NOT (u1)-[:RATED]-&amp;gt;(m) AND r.rating &amp;gt;= 4
RETURN u2.name as UserName, m.title as MovieTitle, r.rating as UserRating
ORDER BY r.rating DESC, m.title ASC, u2.name ASC;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At this point, the best solution is to give it an example of the query you actually want:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;MATCH (u1:User {name:"Aarushi Kansal"})-[r:RATED]-&amp;gt;(m:Movie)
WITH u1, avg(r.rating) AS u1_mean

MATCH (u1)-[r1:RATED]-&amp;gt;(m:Movie)&amp;lt;-[r2:RATED]-(u2)
WITH u1, u1_mean, u2, COLLECT({r1: r1, r2: r2}) AS ratings WHERE size(ratings) &amp;gt; 10

MATCH (u2)-[r:RATED]-&amp;gt;(m:Movie)
WITH u1, u1_mean, u2, avg(r.rating) AS u2_mean, ratings

UNWIND ratings AS r

WITH sum( (r.r1.rating-u1_mean) * (r.r2.rating-u2_mean) ) AS nom,
     sqrt( sum( (r.r1.rating - u1_mean)^2) * sum( (r.r2.rating - u2_mean) ^2)) AS denom,
     u1, u2 WHERE denom &amp;lt;&amp;gt; 0

WITH u1, u2, nom/denom AS pearson
ORDER BY pearson DESC LIMIT 10

MATCH (u2)-[r:RATED]-&amp;gt;(m:Movie) WHERE NOT EXISTS( (u1)-[:RATED]-&amp;gt;(m) )

RETURN m.title, SUM( pearson * r.rating) AS score
ORDER BY score DESC LIMIT 25
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
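&lt;p&gt;If you want to sanity-check what that query computes, here is the same Pearson-then-score logic in plain Python (toy ratings and hypothetical user names; the real data of course lives in Neo4j):&lt;/p&gt;

```python
from math import sqrt

# Toy ratings: user -> {movie: rating}. Hypothetical data for illustration.
ratings = {
    "aarushi": {"Inception": 5.0, "Interstellar": 4.5, "Titanic": 2.0},
    "sam": {"Inception": 5.0, "Interstellar": 4.0, "Titanic": 2.5, "Pulp Fiction": 5.0},
    "lee": {"Inception": 1.0, "Interstellar": 2.0, "Titanic": 5.0, "Forrest Gump": 4.0},
}

def pearson(u1: str, u2: str) -> float:
    # Pearson similarity over commonly rated movies, deviations taken
    # from each user's overall mean rating, as in the Cypher above.
    common = set(ratings[u1]).intersection(ratings[u2])
    if not common:
        return 0.0
    m1 = sum(ratings[u1].values()) / len(ratings[u1])
    m2 = sum(ratings[u2].values()) / len(ratings[u2])
    nom = sum((ratings[u1][m] - m1) * (ratings[u2][m] - m2) for m in common)
    denom = sqrt(sum((ratings[u1][m] - m1) ** 2 for m in common)
                 * sum((ratings[u2][m] - m2) ** 2 for m in common))
    return nom / denom if denom else 0.0

# Score unseen movies by similarity-weighted rating, like the final RETURN.
me = "aarushi"
scores = {}
for other in ratings:
    if other == me:
        continue
    sim = pearson(me, other)
    for movie, r in ratings[other].items():
        if movie not in ratings[me]:
            scores[movie] = scores.get(movie, 0.0) + sim * r

print(sorted(scores, key=scores.get, reverse=True))  # ['Pulp Fiction', 'Forrest Gump']
```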



&lt;p&gt;Annd, it works, I get movies like The Silence of the Lambs, Forrest Gump, Pulp Fiction etc. &lt;/p&gt;

&lt;p&gt;Up til now you can see we've gotten to a pretty good place: we can do querying on a knowledge graph, and even without too much context/prompt engineering it's able to determine what relationships to search through (e.g. movies -&amp;gt; genres). As you add more guidance it gets better and better. &lt;/p&gt;

&lt;p&gt;But having to give it an example of every similarity function or algorithm makes it a pretty poor assistant and bad user experience. Users would be better off just querying the DB or having some kind of button that runs the query etc. &lt;/p&gt;

&lt;p&gt;And that's where we now combine what we have so far, with an Agent and Chainlit for that assistant style user experience.&lt;/p&gt;

&lt;h3&gt;
  
  
  Agent
&lt;/h3&gt;

&lt;p&gt;First let's briefly summarize what we're aiming for. &lt;/p&gt;

&lt;p&gt;A human-esque experience for a user, for movie recommendations. They should be able to ask for movies based on their interests, update ratings, and the agent should have some understanding about a user. &lt;/p&gt;

&lt;p&gt;1) Chat interface&lt;br&gt;
2) Natural language understanding&lt;br&gt;
3) Access to data sources (I'm also going to give it access to the internet, because I think an agent should be able to get fun facts or summaries, or even recent news about actors or directors)&lt;br&gt;
4) Reasoning abilities to choose what tools and action to take&lt;/p&gt;
&lt;h3&gt;
  
  
  The code
&lt;/h3&gt;

&lt;p&gt;For the interface, I'm using &lt;a href="https://docs.chainlit.io" rel="noopener noreferrer"&gt;chainlit&lt;/a&gt;, a new UI library for building LLM apps, with an integration with LangChain. &lt;/p&gt;

&lt;p&gt;I'm using the out of the box chat UI.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9j2wdwitsjmvb8jyoylt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9j2wdwitsjmvb8jyoylt.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For natural language understanding I'm using GPT-4, but you can sub out your favorite LLM here.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;llm1 = OpenAI(temperature=0, streaming=True)
    # search = SerpAPIWrapper()
    memory = ConversationBufferMemory(
        memory_key="chat_history", return_messages=True)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With memory, so it can remember the context of our conversation (an agent that forgets messages feels like bad UX!)&lt;/p&gt;
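&lt;p&gt;Conceptually, a buffer memory is just the running message history replayed into each prompt. A minimal stand-in (not LangChain's actual implementation):&lt;/p&gt;

```python
class BufferMemory:
    """Minimal conversation buffer: stores messages, replays them as context."""

    def __init__(self):
        self.messages = []

    def add(self, role: str, text: str):
        self.messages.append((role, text))

    def as_context(self) -> str:
        # The whole history is prepended to each new prompt.
        return "\n".join(f"{role}: {text}" for role, text in self.messages)

memory = BufferMemory()
memory.add("human", "Suggest movies similar to Inception")
memory.add("ai", "Interstellar, The Matrix...")
# This follow-up is only resolvable because the history is in context:
memory.add("human", "Any of those with Leonardo DiCaprio?")
```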

&lt;p&gt;I'm using chat-zero-shot-react-description, which is a MRKL implementation for chat models. If you're interested in MRKL agents and an intro into tools, &lt;a href="https://dev.to/aarushikansal/giving-your-large-language-model-skills-stoicism-meets-ai-5gd1"&gt;you can check out another blog of mine&lt;/a&gt;. But in a nutshell, this is what allows the model to choose what tool (DB or Google search) to use when answering a user's requests.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    tools = [
        Tool(
            name="Cypher search",
            func=cypher_tool.run,
            description="""
            Utilize this tool to search within a movie database, 
            specifically designed to find movie recommendations for users.
            This specialized tool offers streamlined search capabilities
            to help you find the movie information you need with ease.
            """,
        ),
        Tool(
            name="Google search",
            func=search.run,
            description="""
    Utilize this tool to search the internet when you're missing information. In particular if you want recent events or news.
    """,
        )
    ]
    return initialize_agent(
        tools, llm1, agent="chat-zero-shot-react-description", verbose=True, memory=memory
    )
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
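&lt;p&gt;The real agent reasons over those tool descriptions in natural language; as a rough mental model of the routing it performs (the keyword matching here is purely for illustration, not how MRKL works internally):&lt;/p&gt;

```python
# Purely illustrative: the MRKL agent lets the LLM pick a tool by reasoning
# over tool descriptions; crude keyword rules here just mimic the outcome.
TOOLS = {
    "Cypher search": "movie database lookups and recommendations",
    "Google search": "internet search for recent events or news",
}

def pick_tool(query: str) -> str:
    q = query.lower()
    if any(word in q for word in ("news", "recent", "latest", "fun fact")):
        return "Google search"
    return "Cypher search"  # anything movie-data-shaped goes to the graph

print(pick_tool("what movie should I watch?"))        # Cypher search
print(pick_tool("any recent news about Tom Hanks?"))  # Google search
```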



&lt;p&gt;Annnd, finally the Cypher/Neo4j/knowledge graph parts! So, remember earlier on, we found that the more context you give the LLM (details on which parts of the schema to use, example queries etc), the better it performs at finding the right movies for you? But the whole point here is to give the user an easy way to get recommendations, but not have to know or understand the inner workings of our DB or even Cypher. &lt;/p&gt;

&lt;p&gt;Essentially, we want to bake the logic in the backend and the user never has to know.&lt;/p&gt;

&lt;p&gt;So, what we're going to do is determine which exact use cases we want this agent to handle (or feel like an expert in). I.e. we're going to tell it exactly how to handle certain requests. &lt;/p&gt;

&lt;p&gt;For the purpose of this post, I'm giving it the specifics of finding movie recommendations based on similar users, and movies similar to another movie based on content (i.e. genres) using the Jaccard index.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CYPHER_GENERATION_TEMPLATE = """Task:Generate Cypher statement to query a graph database. 
Instructions:
Make recommendations for a given user only. 
Update ratings for a given user only.
Schema:
{schema}
Username: 
{username}

Examples:
# When a user asks for movie recommendations: 


    # When asked for movies similar to a movie, use the weighted content algorithm, like this: 

CYPHER_GENERATION_PROMPT = PromptTemplate(
    input_variables=["schema", "question", "username"], template=CYPHER_GENERATION_TEMPLATE
)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note: this is the template built in LangChain, that I've repurposed for my needs. If you remove this piece of code, it will still work, just with the defaults we used earlier.&lt;/p&gt;
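&lt;p&gt;The Jaccard index the content-based example leans on is simple to state: the size of the intersection of two movies' genre sets divided by the size of their union. A quick sketch with hypothetical genre data:&lt;/p&gt;

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard index: size of the intersection over size of the union."""
    union = a.union(b)
    return len(a.intersection(b)) / len(union) if union else 0.0

# Hypothetical genre sets for illustration.
genres = {
    "Inception": {"Action", "Sci-Fi", "Thriller"},
    "The Matrix": {"Action", "Sci-Fi"},
    "Titanic": {"Drama", "Romance"},
}

target = genres["Inception"]
ranked = sorted((m for m in genres if m != "Inception"),
                key=lambda m: jaccard(target, genres[m]), reverse=True)
print(ranked)  # ['The Matrix', 'Titanic']
```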

&lt;p&gt;At this point you might be thinking: haven't we made this agent kind of 'dumb' by only telling it exactly what to do? Wouldn't an if statement have sufficed? Well, don't worry, giving it examples like this only expands its so-called knowledge. It's still able to do all the other things, like find movies based on actors and update your ratings. &lt;/p&gt;

&lt;p&gt;You can play around with different/more algorithms based on your needs. &lt;/p&gt;

&lt;p&gt;Let's see it in action so far by asking it "what movie should I watch?":&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F63dm9jo39elibv6spd6w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F63dm9jo39elibv6spd6w.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So you can see it starts using the query for collaborative filtering based on neighbourhood - exactly what I wanted. &lt;/p&gt;

&lt;p&gt;Let's try another one, "Suggest movies similar to Inception"&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fraysk11fsw5fv56bhanu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fraysk11fsw5fv56bhanu.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Again, now you can see it using the desired query. &lt;/p&gt;

&lt;p&gt;If you want, now you can also start asking it about actors, news about them and so on, which it'll handle via our Google search tool. &lt;/p&gt;

&lt;p&gt;Annd there you have it, an end to end personalised movie agent for you!&lt;br&gt;
&lt;a href="https://gist.github.com/aarushik93/8b979de55d913439e21ac6f3302ec55d" rel="noopener noreferrer"&gt;You can check out the full code here.&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://docs.chainlit.io/examples/mrkl" rel="noopener noreferrer"&gt;https://docs.chainlit.io/examples/mrkl&lt;/a&gt;&lt;br&gt;
&lt;a href="https://neo4j.com/docs/graph-data-science/current/" rel="noopener noreferrer"&gt;https://neo4j.com/docs/graph-data-science/current/&lt;/a&gt;&lt;br&gt;
&lt;a href="https://neo4j.com/docs/graph-data-science/current/algorithms/kmeans/" rel="noopener noreferrer"&gt;https://neo4j.com/docs/graph-data-science/current/algorithms/kmeans/&lt;/a&gt;&lt;br&gt;
&lt;a href="https://python.langchain.com/docs/modules/chains/additional/graph_cypher_qa" rel="noopener noreferrer"&gt;https://python.langchain.com/docs/modules/chains/additional/graph_cypher_qa&lt;/a&gt;&lt;br&gt;
&lt;a href="https://neo4j.com/developer-blog/exploring-practical-recommendation-systems-in-neo4j/" rel="noopener noreferrer"&gt;https://neo4j.com/developer-blog/exploring-practical-recommendation-systems-in-neo4j/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>ai</category>
      <category>llms</category>
    </item>
    <item>
      <title>AWS + Falcon Quickstart</title>
      <dc:creator>Aarushi Kansal</dc:creator>
      <pubDate>Sat, 03 Jun 2023 20:09:13 +0000</pubDate>
      <link>https://dev.to/aarushikansal/aws-falcon-quickstart-1ob</link>
      <guid>https://dev.to/aarushikansal/aws-falcon-quickstart-1ob</guid>
      <description>&lt;p&gt;Okay, so I haven't used SageMaker in a looooong time (maybe like 4 years or so ago) and I have to say, it feels like a whole new ecosystem. It's definitely a lot more intuitive, and waay more features. And now with easy access to foundational models as well as models from HuggingFace, it's starting to climb up my list of preferred ML/AI platforms. &lt;/p&gt;

&lt;p&gt;And in this post I wanna show you how to deploy and use the shiny new Falcon models. &lt;/p&gt;

&lt;h2&gt;
  
  
  What is SageMaker
&lt;/h2&gt;

&lt;p&gt;It's basically AWS' fully managed, end to end ML platform. This includes an IDE (Jupyter Notebooks), storage, foundational models and one click deployments. All of the infrastructure management is handled by AWS. &lt;/p&gt;

&lt;h2&gt;
  
  
  Roles and Permissions
&lt;/h2&gt;

&lt;p&gt;The first thing you're going to need is to create a role (or maybe you have a default role you want to use), that has the AmazonSageMakerFullAccess policy attached to it. Give it a name you'll remember so you can use it later on. &lt;/p&gt;

&lt;p&gt;In my case, I'm gonna go with:&lt;/p&gt;

&lt;p&gt;AmazonSageMakerRole-experimentation&lt;/p&gt;

&lt;h2&gt;
  
  
  Creating a Domain
&lt;/h2&gt;

&lt;p&gt;So from AWS: &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A domain includes an associated Amazon Elastic File System (EFS) volume; a list of authorized users; and a variety of security, application, policy, and Amazon Virtual Private Cloud (VPC) configurations. Each user in a domain receives a personal and private home directory within the EFS for notebooks, Git repositories, and data files.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Essentially, it's your home for everything you'll need to train, finetune, build and deploy models. Both as an individual or collaboratively in a team. &lt;/p&gt;

&lt;p&gt;Go ahead and navigate to SageMaker &amp;gt; Domains and create a domain. It'll take you to a screen like this: &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--B7GQMSoU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/5wk3o093oevyyoj8eg1w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--B7GQMSoU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/5wk3o093oevyyoj8eg1w.png" alt="Image description" width="800" height="727"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You'll go with the quick set up for the purpose of this post. &lt;/p&gt;

&lt;p&gt;Here you can use any name you want for your domain and a user you'll create with the role you previously created. &lt;/p&gt;

&lt;p&gt;Once you've chosen your names and selected your role, it'll take some time to spin up your domain. &lt;/p&gt;

&lt;h2&gt;
  
  
  Deploying a model
&lt;/h2&gt;

&lt;p&gt;Now that your domain is running, head into users and select launch. You'll have a few options; go for Studio.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--NXOBf29j--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/zq3u3aosbwa2ahehwrl3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--NXOBf29j--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/zq3u3aosbwa2ahehwrl3.png" alt="Image description" width="800" height="278"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Create a new notebook and we'll use the following code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import sagemaker
import boto3
from sagemaker.huggingface import HuggingFaceModel

try:
    role = sagemaker.get_execution_role()
except ValueError:
    iam = boto3.client('iam')
    role = iam.get_role(RoleName='sagemaker_execution_role')['Role']['Arn']
# Hub Model configuration. https://huggingface.co/models
hub = {
    'HF_MODEL_ID':'tiiuae/falcon-7b',
    'HF_TASK':'text-generation'
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    transformers_version='4.26.0',
    pytorch_version='1.13.1',
    py_version='py39',
    env=hub,
    role=role, 
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1, # number of instances
    instance_type='ml.m5.xlarge' # ec2 instance type
)

predictor.predict({
    "inputs": "Once upon a time in Narnia ",
})
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the general code you'll use for all HuggingFace models. You'll just need to change details such as which model, what AWS instance, what Python libraries and versions, etc.&lt;/p&gt;
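&lt;p&gt;For example, swapping in the instruct-tuned variant of Falcon is just a change to the hub dictionary (double-check the model ID on the Hugging Face Hub before deploying):&lt;/p&gt;

```python
# Same deployment code, different model: only the hub config changes.
# 'tiiuae/falcon-7b-instruct' is the instruct-tuned sibling of falcon-7b.
hub = {
    'HF_MODEL_ID': 'tiiuae/falcon-7b-instruct',
    'HF_TASK': 'text-generation',
}
```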

&lt;p&gt;Take note: in the predictor code, you're choosing 1 instance of type ml.m5.xlarge. You can use different options here, but depending on your account you might need to adjust your quotas for resources like an ml.m5.xlarge instance!&lt;/p&gt;

&lt;p&gt;Now you're ready to run and you should get some text generated.&lt;/p&gt;

&lt;h2&gt;
  
  
  The endpoint
&lt;/h2&gt;

&lt;p&gt;So far you've just run it via a Jupyter notebook. But don't worry, it's actually deployed an API you can use. So go to Deployments &amp;gt; Endpoints and you'll see the one you just deployed. &lt;/p&gt;

&lt;p&gt;Once you click into it you'll be able to test the inference endpoint, as well as see various bits of information about your endpoint such as traffic patterns. You'll also see the actual endpoint that you can now use within your LLM powered applications. &lt;/p&gt;

&lt;h2&gt;
  
  
  Bits and pieces
&lt;/h2&gt;

&lt;p&gt;This was a quickstart and SageMaker has a lot more to it. So I would recommend playing around with it as much as possible and deploy different models. Just remember to shut everything down when you're done!&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>ai</category>
      <category>aws</category>
    </item>
    <item>
      <title>Opensource with HuggingFace</title>
      <dc:creator>Aarushi Kansal</dc:creator>
      <pubDate>Sun, 21 May 2023 16:17:32 +0000</pubDate>
      <link>https://dev.to/aarushikansal/opensource-with-huggingface-2e7</link>
      <guid>https://dev.to/aarushikansal/opensource-with-huggingface-2e7</guid>
      <description>&lt;p&gt;"We Have No Moat, And Neither Does OpenAI"&lt;/p&gt;

&lt;p&gt;"Open-source models are faster, more customizable, more private, and pound-for-pound more capable."&lt;/p&gt;

&lt;p&gt;You've probably all heard these quotes by now, and if you haven't, check them out &lt;a href="https://www.semianalysis.com/p/google-we-have-no-moat-and-neither"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;And both Google and OpenAI should rightfully be worried. We're well and truly in the age of democratized machine learning. &lt;/p&gt;

&lt;p&gt;As big tech and the tech community quickly churn out open source models, it just makes our lives easier, to pick and choose which models serve us and in what way. &lt;/p&gt;

&lt;p&gt;Given that, just open sourcing a model isn't enough, we have to be able to deploy, run and maintain it. Thankfully, even this is becoming easier and easier. &lt;/p&gt;

&lt;p&gt;In this post I want to walk you through one of the simplest ways of deploying and (if you have money) scaling these models. &lt;/p&gt;

&lt;h2&gt;
  
  
  HuggingFace
&lt;/h2&gt;

&lt;p&gt;Deploying on HuggingFace is definitely my favourite: it's super easy to get started, and there's a burgeoning community developing in the opensource machine learning space. &lt;/p&gt;

&lt;p&gt;If you're not already familiar with HuggingFace, it's an end to end machine learning platform, with a GitHub like interface. &lt;/p&gt;

&lt;p&gt;One of my favourite features is the model hub + the ease of deploying a model. And this is what I'm going to show you now. &lt;/p&gt;

&lt;p&gt;First let's take a look at the model hub: &lt;a href="https://huggingface.co/models"&gt;https://huggingface.co/models&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here, you'll find all the open source models, from both individual people as well as companies like Google, Meta, OpenAI etc. &lt;/p&gt;

&lt;p&gt;For the purpose of this blog, let's go ahead and find StabilityAI's stable-diffusion-2. A text to image model, because playing with images is fun: &lt;a href="https://huggingface.co/stabilityai/stable-diffusion-2"&gt;https://huggingface.co/stabilityai/stable-diffusion-2&lt;/a&gt;.  &lt;/p&gt;

&lt;p&gt;Here, you can see all the information about the model: relevant research papers, licenses, model card, examples, usages, the code etc. &lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--2oMNN_9o--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/puojgju55r62rh2n4h6r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--2oMNN_9o--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/puojgju55r62rh2n4h6r.png" alt="Image description" width="800" height="401"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To the right, you'll notice a deploy button. Go ahead and click that. There's three different options: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;InferenceAPI (which allows you to use the model in API form for testing purposes: this is not for production use!)&lt;/li&gt;
&lt;li&gt;Inference Endpoints (which is what we'll use) &lt;/li&gt;
&lt;li&gt;Spaces (this allows you to deploy the model behind a GUI using Gradio, Streamlit etc. We won't cover this in this post). &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--OJKRpNCq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/etwffdrd9tcnx125ty7h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--OJKRpNCq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/etwffdrd9tcnx125ty7h.png" alt="Image description" width="800" height="675"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here you can choose a cloud provider: so far AWS has the most GPU options (and I haven't actually seen GCP usable, so I don't know if that's available yet or not!). &lt;/p&gt;

&lt;p&gt;For this model, I chose a large GPU on AWS. Feel free to try others; however, the small one ran out of memory for me!&lt;/p&gt;

&lt;p&gt;Once you've clicked deploy, maybe go get a coffee or something cause deploying will take some time. &lt;/p&gt;

&lt;p&gt;As soon as it's up and running, you can start using the built in UI for testing. More importantly, you can also use it as an API to power your applications. &lt;/p&gt;

&lt;p&gt;For example, testing: &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Uva_Uoto--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/u7j958obd1b4b2v130f3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Uva_Uoto--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/u7j958obd1b4b2v130f3.png" alt="Image description" width="800" height="553"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And the equivalent cURL:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;curl https://endpoint-name.us-east-1.aws.endpoints.huggingface.cloud \
-X POST \
-d '{"inputs":"A beautiful sunset over the horizon on a clear day in Rome"}' \
-H "Authorization: Bearer &amp;lt;hf_token&amp;gt;" \
-H "Content-Type: application/json"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, you can use this API and your token to build any apps or tools you want.&lt;/p&gt;
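&lt;p&gt;From Python, the same request is a few lines with any HTTP client. A minimal sketch (the endpoint URL and token are placeholders, and the helper name is purely illustrative):&lt;/p&gt;

```python
import json

def build_inference_request(endpoint_url, hf_token, prompt):
    # Assemble the same POST request as the cURL example above
    headers = {
        "Authorization": "Bearer " + hf_token,
        "Content-Type": "application/json",
    }
    body = json.dumps({"inputs": prompt})
    return endpoint_url, headers, body

url, headers, body = build_inference_request(
    "https://endpoint-name.us-east-1.aws.endpoints.huggingface.cloud",
    "hf_token",  # your Hugging Face token
    "A beautiful sunset over the horizon on a clear day in Rome",
)
# send it with any HTTP client, e.g. requests.post(url, headers=headers, data=body)
```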

&lt;p&gt;At this point, you're probably wondering about scalability, monitoring and observability. &lt;/p&gt;

&lt;h2&gt;
  
  
  Scalability
&lt;/h2&gt;

&lt;p&gt;If you flick through the tabs at the top, you can configure replica autoscaling to suit your needs. &lt;/p&gt;

&lt;h2&gt;
  
  
  Monitoring
&lt;/h2&gt;

&lt;p&gt;The analytics tab is a good starting point to monitor latency, requests, and utilization. &lt;/p&gt;

&lt;p&gt;I would suggest building out a more detailed monitoring dashboard once you actually build an application on top of this.&lt;/p&gt;

&lt;h2&gt;
  
  
  Observability
&lt;/h2&gt;

&lt;p&gt;The logs tab again is a good starting point to see what's going on with your model. &lt;/p&gt;

&lt;p&gt;I would suggest more verbose, detailed logging at various levels of your actual application (LB, API gateway, server etc.).&lt;/p&gt;

&lt;p&gt;Annd, there you have it, a fairly production-ready Generative AI API for you to utilize. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Always remember to delete or pause your deployed models, if you don't want a nasty surprise at the end of the month!&lt;/p&gt;

&lt;p&gt;Last thought I want to leave you with: really consider the cost of hosting a model vs using one as a service through something like OpenAI. A lot of these services are very cheap, compared to running a model yourself (on prem or cloud), so always do some maths and projections before making a choice!&lt;/p&gt;
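&lt;p&gt;As a back-of-the-envelope example (all numbers below are made up for illustration - check current pricing for both options):&lt;/p&gt;

```python
# Hypothetical always-on GPU endpoint vs a pay-per-token API
gpu_hourly = 4.50                      # $/hour for a large GPU endpoint (illustrative)
self_hosted = gpu_hourly * 24 * 30     # always-on for a month

price_per_1k_tokens = 0.002            # $/1K tokens (illustrative)
tokens_per_month = 5_000_000
api_cost = tokens_per_month / 1000 * price_per_1k_tokens

print(f"self-hosted: ${self_hosted:,.0f}/month vs API: ${api_cost:,.0f}/month")
```

With usage this low, the hosted endpoint costs orders of magnitude more; the maths flips only at very high, sustained volume.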

</description>
      <category>ai</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Superpower your push notifications with AI</title>
      <dc:creator>Aarushi Kansal</dc:creator>
      <pubDate>Mon, 15 May 2023 00:30:47 +0000</pubDate>
      <link>https://dev.to/aarushikansal/superpower-your-push-notifications-with-ai-348l</link>
      <guid>https://dev.to/aarushikansal/superpower-your-push-notifications-with-ai-348l</guid>
      <description>&lt;p&gt;Lately, I've been spending a lot of time thinking about how best to target users in a meaningful way that doesn't feel so spammy - I've come to the conclusion that AI can help me with super, super personalized, almost real time personalization. &lt;/p&gt;

&lt;p&gt;In particular, I want to focus on push notifications - in my experience, our phones are oversaturated with notifications. Looking at my screen right now, I have about 56 notifications from different apps and none of them really stand out to me.&lt;/p&gt;

&lt;p&gt;Is using AI over-engineering? Well, I would say for most companies or apps with a couple of thousand users, it's not scalable to handcraft push notifications that are personalised, timely and able to change certain aspects based on user behavior. &lt;/p&gt;

&lt;p&gt;So let's see how we can do better with AI. For the purpose of this demo, let's pretend we're working for a food delivery app - you know the kind; restaurants, delivery drivers and food orderers in one eco-system. &lt;/p&gt;

&lt;p&gt;Okay, so let's consider why and when we'd want to send push notifications. I see a few potential triggers: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Big events (sports, holidays, those international something days)&lt;/li&gt;
&lt;li&gt;New restaurants&lt;/li&gt;
&lt;li&gt;Targeting a user based on who they are (dietary requirements, are they traveling? Where do they normally order from? etc)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There's more but let's keep it to these ones for now. &lt;/p&gt;

&lt;p&gt;First up, we're going to start with generic push notifications for restaurants. &lt;/p&gt;

&lt;p&gt;For this, I'm using a subset of Zomato restaurants, pulled from &lt;a href="https://www.kaggle.com/datasets/shrutimehta/zomato-restaurants-data?select=zomato.csv"&gt;here&lt;/a&gt; and cleaned for my needs. &lt;/p&gt;

&lt;h2&gt;
  
  
  Data set up
&lt;/h2&gt;

&lt;p&gt;We're going to ingest all the restaurant data and details into a vector database - this is what's going to allow us to plug in an LLM eventually, which will create descriptions and push notifications, and eventually make decisions on changing pushes and when to send them.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://gist.github.com/aarushik93/65318d6bbb6b2642001cad77f80e59b0"&gt;You can find the full code here &lt;/a&gt;&lt;/p&gt;

&lt;p&gt;But let's focus on some specifics:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Restaurant = {
        "classes": [
            {
                "class": "Restaurant",
                "description": "A MunchMate restaurant.",
                "moduleConfig": {
                    "text2vec-openai": {
                        "skip": False,
                        "vectorizeClassName": False,
                        "vectorizePropertyName": False
                    }
                },
                "vectorIndexType": "hnsw",
                "vectorizer": "text2vec-openai",
                "properties": [
                    {
                        "name": "description",
                        "dataType": ["text"],
                        "description": "The general description written by an LLM.",
                        "moduleConfig": {
                            "text2vec-openai": {
                                "skip": False,
                                "vectorizePropertyName": False,
                                "vectorizeClassName": False
                            }
                        }
                    },
              # code omitted
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Take a look at the Restaurant class, which contains all of the details of the restaurant (or any other entity you want to focus on for your use case). In particular, notice we use the text2vec-openai module to vectorize the description - here you can use any other model and module (plenty of new things coming out). &lt;/p&gt;

&lt;p&gt;The reason I'm only vectorizing description in this example is because that's the only thing I envision doing a semantic search on later down the line. &lt;/p&gt;
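&lt;p&gt;Ingestion itself is then just mapping each cleaned CSV row onto the class properties. A rough sketch (the column names and helper below are illustrative, not from the actual gist):&lt;/p&gt;

```python
def restaurant_to_object(row):
    # Map a cleaned Zomato CSV row onto the Restaurant class properties
    return {
        "restaurantName": row["Restaurant Name"],
        "cuisines": row["Cuisines"],
        "city": row["City"],
        "isDeliveringNow": row["Is delivering now"] == "Yes",
        "hasTableBooking": row["Has Table booking"] == "Yes",
        # "description" gets written by the LLM later, then vectorized
    }

obj = restaurant_to_object({
    "Restaurant Name": "Cookie Shoppe",
    "Cuisines": "Bakery, Desserts",
    "City": "Albany",
    "Is delivering now": "Yes",
    "Has Table booking": "No",
})
# client.data_object.create(data_object=obj, class_name="Restaurant")
```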

&lt;h2&gt;
  
  
  Descriptions
&lt;/h2&gt;

&lt;p&gt;Okay, so next up, we need to write a description for the restaurant. I don't want to do it manually TBH! &lt;/p&gt;

&lt;p&gt;So, let's use our LLM to actually generate a restaurant's description. What you focus on about the restaurant depends on your use case, in particular your CRM and/or marketing strategy. &lt;/p&gt;

&lt;p&gt;My prompt is as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    Create a description, for a restaurant with the following details:
    Restaurant Name: {restaurantName}
    Cuisines: {cuisines}
    Delivering now: {isDeliveringNow}
    Table Booking Offered: {hasTableBooking}
    City: {city}

    Stick to the information provided. Do not make up any information about the restaurant in your description.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
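&lt;p&gt;Filling the template before sending it to the LLM is plain string formatting. A minimal sketch (the restaurant values are illustrative):&lt;/p&gt;

```python
DESCRIPTION_PROMPT = """Create a description, for a restaurant with the following details:
Restaurant Name: {restaurantName}
Cuisines: {cuisines}
Delivering now: {isDeliveringNow}
Table Booking Offered: {hasTableBooking}
City: {city}

Stick to the information provided. Do not make up any information about the restaurant in your description."""

prompt = DESCRIPTION_PROMPT.format(
    restaurantName="BJ's Country Buffet",
    cuisines="American, BBQ",
    isDeliveringNow=True,
    hasTableBooking=True,
    city="Albany",
)
# send `prompt` to your LLM of choice and store the completion as the description
```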



&lt;p&gt;With this, I end up with various descriptions like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;BJ's Country Buffet is an American and BBQ restaurant located in Albany, offering a range of delicious dishes. With both dine-in and delivery options, you can enjoy the restaurant's offerings from the comfort of your own home. You can also book a table in advance to make sure you have the perfect spot to enjoy their delicious cuisine.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The reason I'm setting up descriptions rather than just creating push notifications directly from the restaurants details is because I want this to become a base for all different types of marketing campaigns, CRM and anything that involves content to get users back into the app.&lt;/p&gt;

&lt;p&gt;Now on to the fun stuff.&lt;/p&gt;

&lt;h2&gt;
  
  
  Push Notifications For Events
&lt;/h2&gt;

&lt;p&gt;Like we talked about initially, one of the usecases for custom pushes could be specific events. A push crafted for each individual restaurant. &lt;/p&gt;

&lt;p&gt;Again, &lt;a href="https://gist.github.com/aarushik93/65318d6bbb6b2642001cad77f80e59b0"&gt;you can check out the full code here&lt;/a&gt; &lt;/p&gt;

&lt;p&gt;I'll walk you through some parts of it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;events = ["valentines day", "super bowl", "international women's day"]
    for e in events:
        generatePrompt = """
          Write a tempting, short push notification, with a short heading for the following new Restaurant:
          Description: {description}.
         Do not make up any information in the push notification
         Target the push towards: {event}

        Style:
        Heading:
        Content: 
        """


// code omitted

new_push_properties = {
                "content": push_content,
                "event": e
            }
            new_push_id = get_valid_uuid(uuid4())
            client.data_object.create(
                data_object=new_push_properties,
                class_name="Push",
                uuid=new_push_id
            )
            client.data_object.reference.add(
                from_uuid=restaurant["_additional"]["id"],
                from_property_name="hasPush",
                to_uuid=new_push_id
            )
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So, this time we've changed our prompt to create a push notification based on the description. This is where you can get a bit more creative: maybe you have a certain style or tone of voice that you want all pushes to follow - you can pass examples of those in here too. &lt;/p&gt;

&lt;p&gt;In this example, I end up with pushes like this: &lt;/p&gt;

&lt;p&gt;Heading: Sweet Valentine's Day!&lt;br&gt;
Content: Satisfy your sweet tooth at Cookie Shoppe!&lt;/p&gt;

&lt;p&gt;This is pretty good so far, minimal manual effort - code it once and every time there's a new event, you can have a bunch of pushes for each restaurant. Or maybe every time a new restaurant is added on a particular event day, an automatic push is generated and sent out - hopefully getting users to that restaurant. &lt;/p&gt;

&lt;p&gt;Let's take it further and personalise to each individual user. Not just by name, but by information we already have about them. &lt;/p&gt;
&lt;h2&gt;
  
  
  Personalised Push Notifications
&lt;/h2&gt;

&lt;p&gt;Okay, so for this step you're going to need information about your users to create a sort of biography about them. For simplicity's sake, I've created bios for users in this example. But in reality you can get more creative and collect information about users based on their eating history, or ask them to input preferences. For other applications, maybe you can think about social logins and collecting some information about users from there. &lt;/p&gt;

&lt;p&gt;Let me walk you through some of the specifics: &lt;/p&gt;

&lt;p&gt;First, I've gone ahead and created a User class, which just contains a name and bio (remember, in a real-life case this data structure is likely to be more complex).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    user_schema = {
        "classes": [
            {
                "class": "User",
                "description": "A user.",
                "properties": [
                    {
                        "dataType": ["text"],
                        "name": "biography",
                    },
                    {
                        "dataType": ["text"],
                        "name": "name"
                    }
                ]
            }
        ]
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, I've given two users bios.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"biography": "Alex enjoys Mexican food, hates valentines day, loves the superbowl",
        "name": "Alex"


"biography": "Alice is a vegetarian, woman, loves Valentines day and the superbowl.",
        "name": "Alice"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note here, one of the good things about LLMs is that you don't need to be super strict on the structure of the bio or what information to include - essentially, whatever information you have, pop it in and let the LLM work its creativity to produce something personalized. Some personalization is better than none, I say.&lt;/p&gt;

&lt;p&gt;The prompt I'm using:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;            generatePrompt = """
                 Write a short, targeted, clever, punchy push notification, with a short heading for the following new Restaurant:
                Description: {description}.
                The push notification should appeal to this user, based on their biography, likes and dislikes: {person}.
                Base the push notification on today's event: {event}.
                Use their name.
                Do not make up any information in the push notification. 
                Style:
                Heading:
                Content: 
            """
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Again, you can be as creative as you want with the prompt!&lt;/p&gt;
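&lt;p&gt;Wiring this up is just a loop over (user, event) pairs, formatting the prompt for each one. A sketch using the two bios above (the description and the shortened prompt are illustrative):&lt;/p&gt;

```python
users = [
    {"name": "Alex", "biography": "Alex enjoys Mexican food, hates valentines day, loves the superbowl"},
    {"name": "Alice", "biography": "Alice is a vegetarian, woman, loves Valentines day and the superbowl."},
]
events = ["valentines day", "super bowl"]

PUSH_PROMPT = (
    "Write a short, targeted, clever, punchy push notification for this restaurant: {description}. "
    "It should appeal to this user, based on their biography, likes and dislikes: {person}. "
    "Base it on today's event: {event}. Use their name."
)

# one formatted prompt per (user, event) pair
prompts = [
    PUSH_PROMPT.format(description="A BBQ and oyster bar in Albany.", person=u["biography"], event=e)
    for u in users
    for e in events
]
# one LLM call per prompt gives one personalised push per (user, event) pair
```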

&lt;p&gt;And I end up with pushes like these:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Heading: Bye Bye V-Day!
Content: Alex, treat yourself to BBQ at BJ's! Enjoy dine-in or delivery.
Heading: Valentine's Day at Austin's!
Content: Alex, forget roses, get BBQ! Order now: Austin's BBQ &amp;amp; Oyster Bar
Heading: Skip the Date Night
Content: Alex, forget V-Day with taco night at El Vaquero!
Heading: Bye Bye Valentines!
Content: Alex, end the day with something sweet at Cookie Shoppe!


Heading: Veg Out Today!
Content: Alice, make Valentine's Day extra special with a delicious feast from Austin's BBQ and Oyster Bar!
Heading: Celebrate V-Day with Alice!
Content: Enjoy Mexican-style vegetarian dishes at El Vaquero Mexican Restaurant in Albany!
Heading: Valentine's Day at Cookie Shoppe!
Content: Alice, satisfy your sweet tooth with our delicious veg options!

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What I really like about these is that they take into account differences in each person based on their bios. For example, Alex doesn't like Valentines day, but Alice does. Or the fact that Alice is vegetarian, but we don't know Alex's dietary requirements so it doesn't focus on that aspect.&lt;/p&gt;

&lt;p&gt;Annnd there you have it - a few hundred lines of code and now you can have your own fully personalised push notification builder. &lt;/p&gt;

&lt;p&gt;Take it a step further and you can build out full-blown marketing campaigns. &lt;/p&gt;

&lt;p&gt;In my personal work, I'm going to be taking it further and figuring out the tone of voice my users react best to. If it's of interest, let me know and I'll write about that too!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>gpt3</category>
      <category>programming</category>
    </item>
    <item>
      <title>Giving your Large Language Model skills: Stoicism Meets AI</title>
      <dc:creator>Aarushi Kansal</dc:creator>
      <pubDate>Fri, 28 Apr 2023 19:47:21 +0000</pubDate>
      <link>https://dev.to/aarushikansal/giving-your-large-language-model-skills-stoicism-meets-ai-5gd1</link>
      <guid>https://dev.to/aarushikansal/giving-your-large-language-model-skills-stoicism-meets-ai-5gd1</guid>
      <description>&lt;p&gt;By now, you've probably already seen the onslaught of large language models (LLMs), whether closed source (e.g Open AI), open source (e.g. LLamA, HuggingChat) or maybe you're hosting one yourself (💰💰) and the language skills they possess. &lt;/p&gt;

&lt;p&gt;In this post I want to talk about giving your LLMs "skills". When I talk about skills, I mean abilities a "human" might have. In particular, I'm going to focus on: grounding in a very specific domain, reasoning with the ability to take actions, and memory. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why?&lt;/strong&gt;&lt;br&gt;
Well, these "skills" are going to start becoming the basis of any LLM application you want to build. Purely prompting an LLM via an API or UI can only take you so far. A general LLM isn't trained or finetuned on your specific data, it doesn't have access to the most recent data/events/news and it doesn't have a memory by default. &lt;/p&gt;

&lt;p&gt;Okay so, the application I'm going to focus on is a chatbot for Stoicism for the modern age. While Stoicism is great, it's ancient and I want actionable, modern day advice BASED on Stoicism. Like so: &lt;a href="https://www.linkedin.com/feed/update/urn:li:activity:7056228587254755328/" rel="noopener noreferrer"&gt;https://www.linkedin.com/feed/update/urn:li:activity:7056228587254755328/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;While models like GPT-4 already have knowledge of Stoicism, I want my model to narrow down on particular practitioners. I don't want it basing its Stoic advice on all of the internet/freely available data. So, enter: Weaviate DB, and two of my favourite books: Meditations by Marcus Aurelius and Letters from a Stoic by Seneca. &lt;/p&gt;

&lt;p&gt;And the next thing I mentioned was "actionable, modern day"...so where would I, as a human being, get any information really? Google. And that's exactly what I'm going to give my LLM the ability to do.&lt;/p&gt;

&lt;p&gt;Now, as a chatbot, it also needs memory - I need it to be able to remember what we were talking about. There are a few ways to do that, and in this case we're going to use the simplest form: just via the prompt. &lt;/p&gt;

&lt;p&gt;And finally, and my favourite: the ability to reason. As a chatbot that gives modern day advice based on Stoicism, it needs to be able to not just answer a question but logically think about how to get to the answer. For example: the user wants Stoic advice in a modern context...so I should search the DB and then how do I make it modern? I should Google it, what do I need to Google?&lt;/p&gt;

&lt;p&gt;So, with all of that in mind, this is roughly what we're going to build: &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkr0j8t80rm5d7bxsq5cn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkr0j8t80rm5d7bxsq5cn.png" alt="diagram"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We're using two tools: Google search (via Serper API) and a Vector DB (Weaviate), which contains the two books mentioned earlier. &lt;/p&gt;

&lt;p&gt;On top of that we're using the concept of an "Agent", which is a wrapper around a model: you input the query here, and get back an action to take. &lt;/p&gt;

&lt;p&gt;The AgentExecutor, written in Python, is responsible for actually executing the action (i.e. Google search or DB search). &lt;/p&gt;

&lt;p&gt;And with this kind of set up, here's an example of its thoughts and actions: &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fskq81s7ns5tcwsxss1uj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fskq81s7ns5tcwsxss1uj.png" alt="reasoning"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here I ask it how to get motivation for the gym: first it searches for motivation via Stoicism, then understands and reasons, and then decides to Google for building habits aligning with your values.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Show me the code!!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You can find the full code here: &lt;a href="https://gist.github.com/aarushik93/2a9c9c050e78b34ff2a701bf5c6faf31" rel="noopener noreferrer"&gt;https://gist.github.com/aarushik93/2a9c9c050e78b34ff2a701bf5c6faf31&lt;/a&gt;. Below I'll walk you through the most relevant bits!&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Note: We're not going through ingesting data into a VectorDB in this post. I will show you in another post, or you can check out the weaviate documentation: &lt;a href="https://weaviate.io/developers/weaviate" rel="noopener noreferrer"&gt;https://weaviate.io/developers/weaviate&lt;/a&gt;&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;client = weaviate.Client(
    url=WEAVIATE_URL,
    additional_headers={
        'X-OpenAI-Api-Key': OPENAI_API_KEY
    }
)

vectorstore = Weaviate(client, "Paragraph", "content")
ret = vectorstore.as_retriever()

AI = OpenAI(temperature=0.2, openai_api_key=OPENAI_API_KEY)

# Set up the question-answering system
qa = RetrievalQA.from_chain_type(
    llm=AI,
    chain_type="stuff",
    retriever=ret,
)

search = GoogleSerperAPIWrapper(serper_api_key=SERPER_API_KEY)

# Set up the conversational agent
tools = [
    Tool(
        name="Stoic System",
        func=qa.run,
        description="Useful for getting information rooted in Stoicism. Ask questions based on themes, life issues and feelings ",
    ),
    Tool(
        name="Search",
        func=search.run,
        description="Useful for when you need to get current, up to date answers."
    )
]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this block we're doing a few things: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Setting up the Weaviate and SerperAPI clients&lt;/li&gt;
&lt;li&gt;Setting up the RetrievalQA chain, which is going to allow us to query the DB using the question we/the AgentExecutor provide it AND stop the LLM from hallucinating or making up answers. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The prompt template for that is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"""Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.
{context}
Question: {question}
Helpful Answer:"""
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;And finally, we're setting up the Tools for the AgentExecutor to actually use. Take note of the descriptions - these are what help the LLM determine which tool it tells the Agent it wants to use. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Okay, next up:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;prefix = """You are a Stoic giving people advice using Stoicism, based on the context and memory available.
            Your answers should be directed at the human, say "you".
            Add specific examples, relevant in 2023, to illustrate the meaning of the answer.
            You can use these two tools:"""
suffix = """Begin!"
Chat History:
{chat_history}
Latest Question: {input}
{agent_scratchpad}"""

## agent 
prompt = ZeroShotAgent.create_prompt(
    tools,
    prefix=prefix,
    suffix=suffix,
    input_variables=["input", "chat_history", "agent_scratchpad"],
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the actual prompt we're using to get the LLM to act like a Stoic. &lt;/p&gt;

&lt;p&gt;First, take note, we tell it that it has access to two tools. We didn't specify the tools yet, but that's okay: this is a template, and the Agent wrapper will actually craft the prompt, tools and all. &lt;/p&gt;
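&lt;p&gt;To make that concrete, here's a hand-rolled approximation of how the tool names and descriptions get interleaved into the final prompt (this only mimics what ZeroShotAgent.create_prompt produces; it is not its exact output):&lt;/p&gt;

```python
tools = [
    ("Stoic System", "Useful for getting information rooted in Stoicism."),
    ("Search", "Useful for when you need to get current, up to date answers."),
]

prefix = "You are a Stoic giving people advice using Stoicism. You can use these two tools:"
suffix = "Begin!\nChat History:\n{chat_history}\nLatest Question: {input}\n{agent_scratchpad}"

# the agent wrapper splices the tool list between the prefix and suffix
tool_lines = "\n".join(name + ": " + desc for name, desc in tools)
template = prefix + "\n\n" + tool_lines + "\n\n" + suffix

filled = template.format(
    chat_history="Human: hello\nAI: Greetings, seeker of wisdom.",
    input="How do I stay motivated for the gym?",
    agent_scratchpad="",
)
```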

&lt;p&gt;Next, notice we're also creating this prompt as a template over the chat history, with a latest question - that way the LLM has a so-called memory. &lt;/p&gt;

&lt;p&gt;Keep in mind with this approach the memory is limited by the number of tokens we can actually send with the context. And that depends on the model you're using. &lt;/p&gt;

&lt;p&gt;Also take note of the so-called "agent scratchpad": this is where the model can do its "thinking".&lt;/p&gt;

&lt;p&gt;The next part to take note of:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;agent_chain = AgentExecutor.from_agent_and_tools(
    agent=agent, tools=tools, verbose=True, memory=st.session_state.memory
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the actual executor, when the LLM outputs a decision, this is the thing that executes that decision. &lt;/p&gt;

&lt;p&gt;And there you have it. Your Stoic Bot.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>chatgpt</category>
      <category>programming</category>
    </item>
    <item>
      <title>Prompt Engineering</title>
      <dc:creator>Aarushi Kansal</dc:creator>
      <pubDate>Sat, 22 Apr 2023 21:42:16 +0000</pubDate>
      <link>https://dev.to/aarushikansal/prompt-engineering-2d4j</link>
      <guid>https://dev.to/aarushikansal/prompt-engineering-2d4j</guid>
      <description>&lt;p&gt;I want to start off this series, with some general theory and then we'll go into building an actual application!&lt;/p&gt;

&lt;p&gt;By now you've probably already tried some form of zero shot or few shot prompting. If you haven't, here's a quick explanation: &lt;/p&gt;

&lt;p&gt;Both are ways of getting a large language model (LLM) to perform a task that it wasn't specifically trained for, without finetuning. Some use cases could be: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sentiment analysis on text the model's never seen before&lt;/li&gt;
&lt;li&gt;Writing in your, or your company's style or tone of voice&lt;/li&gt;
&lt;li&gt;Creating product names&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The list goes on, but I think you get it. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Zero Shot:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The simplest one is to just tell or ask the model to do what you want, with no examples whatsoever. This can work quite well, given the vast training data most of the really popular LLMs are trained on.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Few Shot:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This one just means giving the model a few examples of the task or final output you want. It's a way of guiding the model to your final outcome. &lt;/p&gt;

&lt;p&gt;Both of these are great ways of getting simple, shorter tasks done. But once you have anything more complex, you need more of a "human" brain. And human brains allow us to think and reason. That's where chain of thought comes into the picture. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Chain of Thought&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://arxiv.org/abs/2201.11903"&gt;The full paper is here&lt;/a&gt; &lt;/p&gt;

&lt;p&gt;The TLDR; chain of thought is a way to prompt the LLM to reason its way through a problem by chaining rationales, one step at a time, until it reaches a solution. Again, there are two types:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Zero shot:&lt;/strong&gt; &lt;/p&gt;

&lt;p&gt;This involves getting the model to work through a problem using natural language, like "show your rationale, step by step" or "Let's break this down into smaller steps and consider each one in turn. What is the first step we need to take, and what information do we need to gather in order to take that step? Once we have that information, what is the next logical step to take? Let's continue this process until we have a clear understanding of the issue at hand and a plan for how to proceed."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Few shot:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;With few shot, you give some examples of rationale chains, so the model knows how to work through a problem - for example, by providing a few examples of how to do BODMAS before asking it to solve a maths problem.&lt;/p&gt;
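&lt;p&gt;To make the few shot version concrete, the prompt is just worked examples (question, rationale, answer) prepended to the new question. A sketch with BODMAS examples of my own (not from the paper):&lt;/p&gt;

```python
cot_examples = [
    ("What is 3 + 4 * 2?",
     "BODMAS says multiplication before addition: 4 * 2 = 8, then 3 + 8 = 11.", "11"),
    ("What is (5 - 2) * 3?",
     "Brackets first: 5 - 2 = 3, then 3 * 3 = 9.", "9"),
]

def few_shot_cot_prompt(new_question):
    # each worked example shows the reasoning chain before the answer
    parts = [
        f"Q: {q}\nA: {rationale} The answer is {answer}."
        for q, rationale, answer in cot_examples
    ]
    parts.append(f"Q: {new_question}\nA:")  # the model continues with its own chain
    return "\n\n".join(parts)

prompt = few_shot_cot_prompt("What is 10 - 2 * 3?")
```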

&lt;p&gt;So, there you have it, a very quick theory session on prompt engineering. In the next post, we'll start putting this (in particular chain of thought) into action by building a content engine. All powered by AI. &lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>gpt3</category>
    </item>
    <item>
      <title>Dwight AI</title>
      <dc:creator>Aarushi Kansal</dc:creator>
      <pubDate>Sat, 22 Apr 2023 00:22:36 +0000</pubDate>
      <link>https://dev.to/aarushikansal/dwight-ai-12kl</link>
      <guid>https://dev.to/aarushikansal/dwight-ai-12kl</guid>
      <description>&lt;p&gt;Introducing my new lil project: Dwight Schrute the Reporter...an AI driven Dwight Schrute that post about the news 🤖&lt;/p&gt;

&lt;p&gt;I've been working on lots of different AI things at the moment, and in this project I'm bringing some of the most interesting ones together, with various ML models for different tasks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hyper relevant to a user&lt;/li&gt;
&lt;li&gt;Logic and reasoning&lt;/li&gt;
&lt;li&gt;Memory&lt;/li&gt;
&lt;li&gt;NLP for technical things like working with an API&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With the aim being a full blown human like news writer. Follow along here as Dwight posts and gets smarter: &lt;a href="https://dwightdoesthenews.com/"&gt;https://dwightdoesthenews.com/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
    </item>
    <item>
      <title>Secrets With SOPS</title>
      <dc:creator>Aarushi Kansal</dc:creator>
      <pubDate>Sat, 13 Nov 2021 11:58:54 +0000</pubDate>
      <link>https://dev.to/aarushikansal/secrets-with-sops-2h5e</link>
      <guid>https://dev.to/aarushikansal/secrets-with-sops-2h5e</guid>
      <description>&lt;p&gt;SOPS (Secrets OperationS) is an open source tool from Mozilla, intended to edit, encrypt, decrypt a range of different file types, such as YAML, JSON, ENV etc. &lt;/p&gt;

&lt;p&gt;Encryption can be done in a variety of ways, using the major cloud providers' encryption tools, PGP, and even age.&lt;/p&gt;

&lt;p&gt;In this article, we'll focus on using AWS + KMS. A similar setup and workflow can be used for GCP and Azure as well. &lt;/p&gt;

&lt;h3&gt;
  
  
  Installing
&lt;/h3&gt;

&lt;p&gt;Download + install one of: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://github.com/mozilla/sops/releases"&gt;stable releases&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="//go.mozilla.org/sops/v3/cmd/sops"&gt;unstable features&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;More details can be found on the SOPS github repo.&lt;/p&gt;

&lt;h3&gt;
  
  
  Configuring
&lt;/h3&gt;

&lt;p&gt;Prerequisites for this are: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A ready to use KMS key.&lt;/li&gt;
&lt;li&gt;Correctly configured AWS credentials, for example:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[default]
aws_access_key_id = &amp;lt;access-key-id&amp;gt;
aws_secret_access_key = &amp;lt;access-key&amp;gt;

[kmsuser]
aws_access_key_id = &amp;lt;kmsuser-access-key-id&amp;gt;
aws_secret_access_key = &amp;lt;kmsuer-access-key&amp;gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A separate kmsuser is not a requirement, but SOPS supports switching profiles, which will be discussed later on. &lt;/p&gt;

&lt;p&gt;Next, you'll need to set up your SOPS configuration, which means telling SOPS which key to use and, optionally, which profile or role to assume. &lt;/p&gt;

&lt;p&gt;Set up a .sops.yaml file locally. &lt;/p&gt;

&lt;p&gt;Some configurations are as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sops:
    kms:
    - arn: arn:aws:kms:ap-southeast-2:036762315531:key/46b7ee9d-d11a-4a7e-83a5-c83fe5c93e8f
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the most basic configuration. It specifies the KMS key to use for encryption and decryption. &lt;br&gt;
There is no profile or role listed, so SOPS uses your default credentials.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sops:
    kms:
    - arn: arn:aws:kms:ap-southeast-2:036762315531:key/00aa1727-d895-4dc9-a10c-96ad40470a91
      aws_profile: kmsuser
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In some situations you'll want to use alternative credentials, so you can specify which profile from your credentials file to use.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sops:
    kms:
    -   arn: arn:aws:kms:ap-southeast-2:036762315531:key/00aa1727-d895-4dc9-a10c-96ad40470a91
        role: arn:aws:iam::913492025681:role/sopsuser
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;SOPS also lets you assume an AWS IAM role, meaning you can use KMS keys from multiple accounts. &lt;/p&gt;
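
&lt;p&gt;Note that in more recent SOPS versions, the .sops.yaml configuration file itself is typically written as a list of creation_rules (the sops: block shown above is the metadata stanza SOPS appends to encrypted files). A minimal sketch, reusing the key ARN and profile from above; the path_regex is an assumption about your file layout: &lt;/p&gt;

```yaml
# .sops.yaml sketch: apply this key + profile to any *.enc.yaml file
creation_rules:
  - path_regex: .*\.enc\.yaml$
    kms: arn:aws:kms:ap-southeast-2:036762315531:key/00aa1727-d895-4dc9-a10c-96ad40470a91
    aws_profile: kmsuser
```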

&lt;h3&gt;
  
  
  Using
&lt;/h3&gt;

&lt;p&gt;Encryption:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;sops -e secrets.yaml &amp;gt; secrets.enc.yaml&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Decryption:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;sops -d secrets.enc.yaml &amp;gt; secrets.yaml&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example:&lt;/p&gt;

&lt;p&gt;Plaintext secrets:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: v1
kind: Secret
metadata:
    name: t0p-S3cret
type: Opaque
data:
    password: 12345-password
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Encrypted secrets:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: ENC[AES256_GCM,data:690=,iv:GM5Rle5baQNBC4MBECfVEY9YZzAeywnHcrcclGnwAVw=,tag:xN311xVOyvqC+TXy16KNcQ==,type:str]
kind: ENC[AES256_GCM,data:PGiPB4h3,iv:t9kAkvT9u38dwqOtBAPXEcLGqBa07/Ggk4gEhO/SzSQ=,tag:4NN94br3Ut9EmB/zMjkWMw==,type:str]
metadata:
  name: ENC[AES256_GCM,data:AZP+jxs5kVJQyh5ZcxROzIuuZgTsEQ==,iv:wA2OVYCQ8icb10XIRxTZu+QMILUoORrIOJmh30rmX84=,tag:VGBR8shJQ8x7RQY0R5fMqQ==,type:str]
type: ENC[AES256_GCM,data:fqP1lGtK,iv:bzhdcaZ1WyJpgy4v3Q2MS0J6q3XNLRtC2qbdWHkoqtk=,tag:dGbR3gWt54lnRRIYtq7i9w==,type:str]
data:
  password: ENC[AES256_GCM,data:ihVGHIa/SqDxC64wzFRvtFcKtk3WPmpjIWUh3HxCo60=,iv:gcxL6u2JNh+T7lXb5VbfZS9aKun8ZOAK+X93uJ4Vd6M=,tag:/y5UTFa3mIiAaV6RPif9mQ==,type:str]
sops:
  kms:
    - arn: arn:aws:kms:ap-southeast-2:036762315531:key/46b7ee9d-d11a-4a7e-83a5-c83fe5c93e8f
      created_at: "2021-11-12T06:28:22Z"
      enc: AQICAHguJRDZ0cg53Sh5Mus9w8WLD236AYz81m6wFTHAa6ObgQFSNXL+AHX+kn+akWNtP7aQAAAAfjB8BgkqhkiG9w0BBwagbzBtAgEAMGgGCSqGSIb3DQEHATAeBglghkgBZQMEAS4wEQQMKaUlIgOrUMmOA/LzAgEQgDs62h0/zahsnr+4z1trkI+Euk5WkWqkQBnBh3KijqPEJJnKnPE9v41vSGJLbfeI8QOruvR6YwU2V3G7LQ==
      aws_profile: kmsuser
  gcp_kms: []
  azure_kv: []
  hc_vault: []
  age: []
  lastmodified: "2021-11-12T06:28:23Z"
  mac: ENC[AES256_GCM,data:Lwo28isqP6hA2nxjXDTnkglZjj8Ip1+W+erYlV/dq7r7YoJWAE+vFbWdiKIm4wE7bhSsoNQiIFGbQVqRx7VoGjwAE8A//0BCfrd7i5dTS5+/c0BOiLLrpNSqdTxRiNTUMGcvvWWnmkf+uBmMN/pOhyXwhdB+z9h0ST6Y3rR+zHE=,iv:l01KhN0a6BeoIIn45lbUamNKBNWX2eTMo7ToA2OsF/I=,tag:jfuPs6lS913B7Yb0jMjefA==,type:str]
  pgp: []
  unencrypted_suffix: _unencrypted
  version: 3.7.1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As you can see here, we have a regular secrets manifest, which is encrypted and can then be checked in or shared freely.&lt;/p&gt;

&lt;p&gt;SOPS encrypts all the values, not just the secrets, and records metadata such as the profile and KMS key used. &lt;/p&gt;

&lt;h3&gt;
  
  
  CI
&lt;/h3&gt;

&lt;p&gt;There are a number of ways to use sops encrypted secrets in your CI workflow. &lt;/p&gt;

&lt;p&gt;The most basic way is to install sops, decrypt and apply the decrypted file to your cluster.&lt;br&gt;
For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sops -d secrets.enc.yaml | kubectl apply -f -
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;However, it's most likely you're using some kind of manifest management tool and will want secrets to work within that ecosystem. To achieve this, there are some wrappers for SOPS: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/jkroepke/helm-secrets"&gt;helm-secrets&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/viaduct-ai/kustomize-sops"&gt;ksops&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Final thoughts
&lt;/h3&gt;

&lt;p&gt;SOPS is a great tool to get started with a GitOps style of secret management. However, there are some considerations you should take into account before committing to this solution: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Key rotation&lt;/li&gt;
&lt;li&gt;Lack of control over who can see secrets once in the cluster&lt;/li&gt;
&lt;li&gt;Scalability for large teams, or a large number of secrets &lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>kubernetes</category>
      <category>security</category>
      <category>devops</category>
    </item>
    <item>
      <title>Why is IBM cloud not widely used?</title>
      <dc:creator>Aarushi Kansal</dc:creator>
      <pubDate>Fri, 03 Jul 2020 18:55:23 +0000</pubDate>
      <link>https://dev.to/aarushikansal/why-is-ibm-cloud-not-widely-used-4c9j</link>
      <guid>https://dev.to/aarushikansal/why-is-ibm-cloud-not-widely-used-4c9j</guid>
      <description>&lt;p&gt;I started using IBM mainly because I needed more computing power for my machine learning projects. I actually found IBM less complicated, cheaper (for my projects; not sure about other use cases), and the overall user experience was much nicer than other cloud providers... and yet I don't know anyone in my network who uses IBM for their work or side projects?&lt;/p&gt;

</description>
      <category>discuss</category>
      <category>watercooler</category>
      <category>cloud</category>
      <category>ibm</category>
    </item>
    <item>
      <title>How to create your own invisibility cloak</title>
      <dc:creator>Aarushi Kansal</dc:creator>
      <pubDate>Sun, 27 Oct 2019 12:00:46 +0000</pubDate>
      <link>https://dev.to/aarushikansal/how-to-create-your-own-invisibility-cloak-4ndg</link>
      <guid>https://dev.to/aarushikansal/how-to-create-your-own-invisibility-cloak-4ndg</guid>
      <description>&lt;p&gt;I wanted to start learning about computer vision and I grew up on Harry Potter, so as my first experiment, I tried to make Harry's coveted invisibility cloak. &lt;/p&gt;

&lt;p&gt;This is an introduction into colour detection, colour spaces, and opencv. &lt;/p&gt;

&lt;p&gt;For this demo, I am using GoCV, as I also want to further experiment with Go + computer vision. &lt;/p&gt;

&lt;p&gt;For anyone not familiar, this is what an invisibility cloak is:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2F73dkn0qfhql18x53goi3.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2F73dkn0qfhql18x53goi3.jpg" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It makes the wearer invisible! &lt;/p&gt;

&lt;h1&gt;
  
  
  Theory
&lt;/h1&gt;

&lt;h3&gt;
  
  
   Color spaces
&lt;/h3&gt;

&lt;p&gt;The first aspect we need to understand is colour spaces. Most people will be familiar with the RGB colour space. If you're not &lt;a href="https://www.baslerweb.com/en/sales-support/knowledge-base/frequently-asked-questions/what-is-the-rgb-color-space/15179/" rel="noopener noreferrer"&gt;here's an article&lt;/a&gt; explaining RGB. &lt;/p&gt;

&lt;p&gt;The problem with this colour space is that it attempts to describe the levels of red, green and blue all in one value. This works for display purposes, but not for detecting colour in video.&lt;/p&gt;

&lt;p&gt;Colour in the natural world is very much dependent on lighting, and any object has a range of shades, even for the same colour. &lt;/p&gt;

&lt;p&gt;This is where the HSV colour space comes into play. HSV describes colour with one channel, hue (H), which means colour depends almost entirely on the hue, while different shades and lighting conditions are accounted for by saturation (S) and value (V). &lt;br&gt;
So in our example, a cloth might be all green, but because of the folds, the lighting in the room etc., different parts of the cloth will have slightly different colours. Using HSV, all we need to do is adjust the V to account for light changes, and our algorithm is still able to recognise the different shades of green as green.&lt;/p&gt;
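
&lt;p&gt;A quick way to see this (a minimal sketch using Python's standard-library colorsys, rather than GoCV): two shades of green share the same hue and differ only in value. &lt;/p&gt;

```python
# Two shades of green in RGB map to the same hue in HSV;
# only the value (brightness) channel differs. Channels are in [0, 1].
import colorsys

bright_green = colorsys.rgb_to_hsv(0.0, 1.0, 0.0)
dark_green = colorsys.rgb_to_hsv(0.0, 0.5, 0.0)

print(bright_green)  # hue 1/3, saturation 1.0, value 1.0
print(dark_green)    # same hue 1/3, value only 0.5
```

&lt;p&gt;So a colour-detection filter keyed on hue (with a tolerant value range) keeps matching the cloak as the lighting shifts.&lt;/p&gt;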

&lt;h3&gt;
  
  
  Masks and morphing in CV
&lt;/h3&gt;

&lt;p&gt;A mask is essentially just another image that you can apply on top of an image of the same size. &lt;/p&gt;

&lt;p&gt;Morphing is the term we use to describe applying certain transformations to an image. OpenCV offers a few different transformations, all of which are based on a 'kernel'.&lt;/p&gt;

&lt;p&gt;In the world of image processing, a kernel is a matrix, which is used to actually apply the different effects we want, such as blurring or dilation. &lt;/p&gt;

&lt;p&gt;Check out &lt;a href="https://docs.opencv.org/trunk/d9/d61/tutorial_py_morphological_ops.html" rel="noopener noreferrer"&gt;transformations here&lt;/a&gt;&lt;/p&gt;
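
&lt;p&gt;To make the kernel idea concrete, here is a minimal pure-Python sketch of dilation on a tiny binary image (OpenCV and GoCV do this on whole Mats; this is just the per-pixel logic for a 3x3 kernel of ones): &lt;/p&gt;

```python
# Morphological dilation: a pixel becomes 1 if any pixel under the
# 3x3 kernel centred on it is 1, so the white region grows.

def dilate(img):
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # scan the 3x3 neighbourhood around (y, x)
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w and img[ny][nx]:
                        out[y][x] = 1
    return out

image = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
dilated = dilate(image)
# the single white pixel grows into a 3x3 block
```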

&lt;h1&gt;
  
  
  Show me the code
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://github.com/aarushik93/invisibility-cloak" rel="noopener noreferrer"&gt;Code here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let's look more closely at the code.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fodebh5l9tlenxmuuge5v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fodebh5l9tlenxmuuge5v.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this block, we are doing all the set up, creating the video capture object (0 refers to webcam), creating the window, setting up the various mask objects we will use, and setting up the HSV values for a green cloak. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fnxfng3clzl0y24tqelqw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fnxfng3clzl0y24tqelqw.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here we grab the static background, with a sleep, which gives the webcam time to start up and video to load (if we capture the background as soon as the webcam starts, we end up with some odd lighting in the background). &lt;/p&gt;

&lt;p&gt;We then start continuously reading frames from the webcam, and doing some transformations.&lt;/p&gt;

&lt;p&gt;First, we convert our frame to the HSV colour space. &lt;br&gt;
Then we filter that HSV image for the pixels that fall within our green HSV range, giving us a mask specifically for our green cloak. &lt;/p&gt;

&lt;p&gt;Once we've identified our cloak, the magic can begin. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fsws8gqvfqp25zrhzjs6x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fsws8gqvfqp25zrhzjs6x.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here we can see we create our kernel, and then do a dilation. Up to this step we have a black dilated mask on our cloak. &lt;/p&gt;

&lt;p&gt;A dilation is a type of morphological transformation that increases the object's area. I'm using this to ensure all the coloured areas are covered by our mask, while still keeping the shape of our cloak.&lt;/p&gt;

&lt;p&gt;Since we want it to be invisible, we then invert the mask. &lt;/p&gt;

&lt;p&gt;Now, we have an 'invisible' mask, in the shape of our cloak. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fjas0bkw65ooy4ryxv8wq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fjas0bkw65ooy4ryxv8wq.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The final step is to combine the masks, with the frame, and the background and display the finished product on screen.&lt;/p&gt;
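
&lt;p&gt;The per-pixel logic of that combination step can be sketched in plain Python (GoCV typically does this with bitwise operations and an add over whole Mats): wherever the cloak mask is set, show the stored background pixel; elsewhere keep the live frame. &lt;/p&gt;

```python
# Composite the live frame with the captured background using the
# cloak mask: masked pixels are replaced by the background, making
# the cloak "invisible". Pixels are just labels here, for clarity.

def composite(frame, background, cloak_mask):
    return [bg if m else px
            for px, bg, m in zip(frame, background, cloak_mask)]

frame = ["live0", "cloak", "live2"]
background = ["bg0", "bg1", "bg2"]
cloak_mask = [0, 1, 0]

result = composite(frame, background, cloak_mask)
# the cloak pixel is swapped for the background behind it
```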

&lt;p&gt;And now you have your very own invisibility cloak!&lt;/p&gt;

&lt;h2&gt;
  
  
  Notes
&lt;/h2&gt;

&lt;p&gt;I suggest editing the HSV values to suit your needs; play with lighting and colours to get the best effect for you. In my case, the lighting in my apartment was quite nice to work with, dimmer in different areas, so I was able to play around with values and locations to come up with an example such as this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2F3spyeuhq91lzuc9mk8sd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2F3spyeuhq91lzuc9mk8sd.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://twitter.com/aarushikansal/status/1186759679482957824?s=20" rel="noopener noreferrer"&gt;Video here&lt;/a&gt;&lt;/p&gt;

</description>
      <category>go</category>
      <category>computervision</category>
      <category>imageprocessing</category>
    </item>
  </channel>
</rss>
