Ranjan Dailata

Intent Extraction with Google Gemini

Introduction

In this blog post, you will learn how to perform "intent" mining, or extraction, from a given piece of content. Intent mining is valuable for businesses: depending on the situation, the extracted intents can be used to decide what to do next.

Intent Mining

Let's first understand what "intent" mining is all about. The following definition was generated by ChatGPT.

Intent Data Mining is a process that involves extracting valuable insights and information regarding the intentions, preferences, and behaviors of individuals or entities based on their digital interactions and activities. This data mining approach is particularly relevant in the context of online behaviors, such as web searches, social media interactions, and other digital engagements.

In the business and marketing context, Intent Data Mining focuses on understanding the signals that indicate a user's interest in specific products, services, or topics. By analyzing patterns and correlations within large datasets, businesses can identify key indicators of user intent, such as search queries, content consumption, or social media engagement. This information is then leveraged to make informed decisions about marketing strategies, sales approaches, and customer engagement.

Hands-on

  1. Head over to Google Colab.
  2. Log in to Google Cloud and get your Project ID and Location.
  3. Use the code below to initialize Vertex AI.
import sys

import vertexai

# Additional authentication is required for Google Colab
if "google.colab" in sys.modules:
    # Authenticate user to Google Cloud
    from google.colab import auth

    auth.authenticate_user()

# Define project information
PROJECT_ID = "<<project_id>>"  # @param {type:"string"}
LOCATION = "<<location>>"  # @param {type:"string"}

# Initialize Vertex AI
vertexai.init(project=PROJECT_ID, location=LOCATION)
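If you would rather not hardcode the project details in the notebook, here is a minimal sketch that reads them from environment variables instead. The helper name `load_vertex_settings`, the variable names `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION`, and the `us-central1` default are assumptions made for this example; they are not part of the original walkthrough.

```python
import os


def load_vertex_settings(env=os.environ):
    """Read the project ID and location from an environment-like mapping."""
    # Assumption: these variable names and the default location are
    # illustrative choices for this sketch.
    project_id = env.get("GOOGLE_CLOUD_PROJECT", "").strip()
    location = env.get("GOOGLE_CLOUD_LOCATION", "us-central1").strip()

    # Fail early if the project ID is missing or still a <<placeholder>>
    if not project_id or project_id.startswith("<<"):
        raise ValueError("Set GOOGLE_CLOUD_PROJECT to your Google Cloud project ID")

    return project_id, location
```

The returned pair can then be passed along as `vertexai.init(project=project_id, location=location)`.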

Intent extraction depends on a carefully designed prompt. Here's the code snippet for it.

def get_intent_extraction_prompt(content):
  # JSON shape the model is asked to return: a list of intents,
  # each with an "intent" and a "statement" field
  schema = """
  {
    "intents": [
      {
        "intent": "",
        "statement": ""
      }
    ]
  }
  """
  prompt = f"""You are an expert intent detector. Your job is to detect and list all the intents within the content below. Output them in the specified JSON schema format.
    Here's the content:
    ---
    {content}
    ---
    Here's the schema:
    {schema}
    Do not respond with your own suggestions, recommendations, or feedback.
 """
  return prompt

Now let's look at generic code for executing the above intent extraction prompt using the Google Gemini Pro model. Here's the code snippet.

from vertexai.preview.generative_models import GenerativeModel

def execute_prompt(prompt, max_output_tokens=8192):
  model = GenerativeModel("gemini-pro")
  # Temperature 0 keeps the output as repeatable as possible;
  # stream=True returns the response in chunks
  responses = model.generate_content(
    prompt,
    generation_config={
        "max_output_tokens": max_output_tokens,
        "temperature": 0,
        "top_p": 1
    },
    stream=True,
  )

  # Collect the text of each streamed chunk
  final_response = []

  for response in responses:
      final_response.append(response.candidates[0].content.parts[0].text)

  # Chunks are fragments of one continuous text, so join them without a separator
  return "".join(final_response)

Next, execute the prompt and transform the response into JSON so that the intents can be extracted.

The code block below extracts the JSON from the LLM response. Please note that, at the time of writing, the publicly released Google Gemini Pro has some known issues producing clean, well-formatted structured JSON, so the response needs a little cleanup.

import re

def extract_json(input_string):
    # Extract the content of each ``` block, skipping an optional "json" language tag
    matches = re.findall(r'```(?:json)?(.*?)```', input_string, re.DOTALL | re.IGNORECASE)

    if matches:
        # Join the matches into a single string and trim surrounding whitespace
        return ''.join(matches).strip()
    else:
        print("No ``` block found.")
        return None
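The string returned by `extract_json` still has to be parsed before the individual intents can be used. Here is a minimal, self-contained sketch of that step. The helper name `parse_intents` is hypothetical, and `sample_response` is a hand-written stand-in that follows the prompt's schema, not real Gemini output.

```python
import json
import re

# Built programmatically so this example does not contain a literal fence
FENCE = "`" * 3

# Hypothetical reply, written by hand for illustration; not real model output
sample_response = (
    FENCE + "json\n"
    + '{"intents": [{"intent": "summarize", '
    + '"statement": "get me the summary for the following content"}]}\n'
    + FENCE
)


def parse_intents(response_text):
    """Return the list under "intents" from the first fenced block, or [] on failure."""
    pattern = FENCE + r"(?:json)?\s*(.*?)" + FENCE
    match = re.search(pattern, response_text, re.DOTALL)
    if match is None:
        return []
    try:
        payload = json.loads(match.group(1))
    except json.JSONDecodeError:
        # The block was not valid JSON
        return []
    if not isinstance(payload, dict):
        return []
    intents = payload.get("intents", [])
    return intents if isinstance(intents, list) else []


for item in parse_intents(sample_response):
    print(item["intent"], "->", item["statement"])
```

Returning an empty list on malformed output keeps the calling code simple. Whether to fail silently or raise instead is a design choice that depends on how the intents are consumed.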

Now it's time to execute the code and perform the intent extraction. In the example below, the user's intent is to summarize the content.

intents = []
instruct_prompt = "get me the summary for the following content"
prompt = get_intent_extraction_prompt(instruct_prompt)
response = execute_prompt(prompt)
extracted_json = extract_json(response)
if extracted_json is not None:
  intents.append(extracted_json)
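The same flow can be applied to several inputs at once. Below is a rough sketch, assuming the model call is passed in as a callable so the example runs without Vertex AI. The names `collect_intents`, `run_model`, and `fake_model` are hypothetical and exist only for this illustration.

```python
import json


def collect_intents(instructions, run_model):
    """Map each instruction to the parsed "intents" list returned for it."""
    results = {}
    for instruction in instructions:
        raw = run_model(instruction)
        try:
            results[instruction] = json.loads(raw).get("intents", [])
        except (TypeError, ValueError, AttributeError):
            # None, invalid JSON, or JSON that is not an object
            results[instruction] = []
    return results


def fake_model(instruction):
    # Stand-in for the real model call; it echoes the input in the schema's shape
    return json.dumps({"intents": [{"intent": "demo", "statement": instruction}]})


results = collect_intents(
    ["get me the summary for the following content"], fake_model
)
print(results)
```

In the notebook, `run_model` could be something like `lambda text: extract_json(execute_prompt(get_intent_extraction_prompt(text)))`, which chains the three functions defined earlier.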

[Image: IntentExtraction]
