Shola Jegede
Build a Competitive Intelligence Tool Powered by AI

Every business has competitors, and with how fast technology is moving, staying ahead of the competition isn't just important; it's essential.

To do this, many companies turn to Competitive Intelligence (CI) tools, which help track competitor activities, offerings, market shifts, and customer behavior and sentiments. When powered by AI, these tools take things a step further, analyzing data and turning it into actionable insights that can help these businesses make smarter decisions and maintain a competitive edge.

Rather than just observing trends, AI-enhanced CI tools give companies a deeper understanding of what's happening in their industry, often in real time. With this, businesses can proactively adapt to shifts, respond to competitor moves, and even identify new opportunities before they become widely known. In fact, CI isn't just for large corporations—it can level the playing field for smaller businesses as well, empowering them to make data-driven decisions that are just as informed as those made by their bigger counterparts.

Why Competitive Intelligence Matters

Competitive intelligence (CI) isn't just about keeping an eye on what your competitors are doing—it's about gaining valuable insights that guide your business decisions. Whether you’re tweaking your pricing strategy, refining messaging, optimizing your value propositions, or developing new products, CI gives you the data you need to make informed choices. But it’s not just about collecting information; it’s about using that information effectively to stay ahead.

Top 3 key problems that AI-powered CI tools can solve

Here are three key problems that AI-powered CI tools can solve:

  • Manual monitoring doesn't scale: tracking competitor websites, pricing pages, and announcements by hand is slow and error-prone, while automated collection runs continuously.
  • Information overload: raw competitor data piles up faster than teams can read it, and AI can summarize it down to the insights that actually matter.
  • Slow reaction time: insights gathered manually often arrive after the market has already moved, whereas automated analysis surfaces changes closer to real time.

With all these advantages, it is clear that integrating a Competitive Intelligence tool into your business is no longer optional; it's essential.

How to Build a Competitive Intelligence Tool Using Python

Now that we understand the value of competitive intelligence, let's dive into how you can build your own AI-powered Competitive Intelligence Tool.

We'll be using the following tools:

  • Python
  • LangChain
  • Ollama (Local LLM)
  • BrightData
  • Selenium
  • Streamlit

Step 1: Set up Python Environment

First, set up a Python environment. Then, in your project’s root folder, create a file named requirements.txt. Copy and paste the following dependencies into that file:

streamlit 
langchain 
langchain_ollama
selenium
beautifulsoup4
lxml 
html5lib
python-dotenv

Next, create a virtual environment (if you haven't already) and activate it:

python -m venv name_of_environment
./name_of_environment/Scripts/Activate

On macOS/Linux, activate it with source name_of_environment/bin/activate instead.

Then, install all the dependencies at once by running the following command:

pip install -r requirements.txt

Step 2: Streamlit UI

Create a Python file named main.py in your root folder. In this file, we'll build a simple Streamlit user interface.

Streamlit is an incredibly straightforward tool for creating Python-based web applications with minimal code. It's one of the easiest ways to interact with tools like Large Language Models (LLMs), which we'll be using in this tutorial.

Here’s the code to set up the interface:

import streamlit as st

st.title("Competitive Intelligence Tool (Demo)")
url = st.text_input("Enter Competitor's Website URL")

if st.button("Gather Insights"):
    if url:
        st.write("Analyzing the website...")

To run the Streamlit application, open your terminal, activate your virtual environment (if not already active), and type the following command, specifying the name of the Python file that contains your Streamlit app (main.py in this case):

streamlit run main.py

This will spin up a local web server and open the application in your browser.

Streamlit Web Server

Once the Streamlit UI is built, the next step is to grab the data from the website we want to scrape. To do that, we'll use a Python module called Selenium.

Selenium lets us automate a web browser: we can navigate to a web page, grab all of the content on it, apply some filtering, and then pass the result to an LLM (a hosted one like ChatGPT or Gemini, or, as in this tutorial, a local model via Ollama), which can parse the data and return a meaningful response.

Step 3: Set up Bright Data

Bright Data Dashboard

Bright Data is a web data platform that lets businesses collect and structure any public web data, and see the web accurately from any location without getting blocked or misled, thanks to a wide proxy network.

For this tutorial, you can get started with Bright Data for free.

Create an account on brightdata.com.

After that, go to your dashboard and create a new instance/zone of a tool called Scraping Browser.

Create scraping browser

The Scraping Browser includes a CAPTCHA solver and connects to a proxy network. This means it automatically gives you new IP addresses and cycles through them, simulating a real user accessing the website.

It also means that if a CAPTCHA appears, it is solved for you automatically, so you don't have to deal with being blocked by CAPTCHAs.

So, type in a zone name and create it.


Then click OK.

One big advantage of Bright Data for developers is that it works with the code you already have.

In our case, that's Selenium. So, just copy the Scraping Browser connection URL from the dashboard.


Then create a .env file in your root directory and paste the URL:

SBR_WEBDRIVER="paste_the_url_here"

Step 4: Web Scraping Component

Next, create a new file named scrape.py. This is where we will write our web scraping functionality, separating it from the main file so it's easier for us to navigate.

To start, import a few Selenium modules into your scrape.py file, then write a function that takes a website's URL, scrapes the contents of the page, cleans it, and returns the content.

from selenium.webdriver import Remote, ChromeOptions
from selenium.webdriver.chromium.remote_connection import ChromiumRemoteConnection
from bs4 import BeautifulSoup
from dotenv import load_dotenv
import os

load_dotenv()

SBR_WEBDRIVER = os.getenv("SBR_WEBDRIVER")

def scrape_website(website):
    print("Connecting to Scraping Browser...")
    sbr_connection = ChromiumRemoteConnection(SBR_WEBDRIVER, "goog", "chrome")
    with Remote(sbr_connection, options=ChromeOptions()) as driver:
        driver.get(website)
        print("Waiting for captcha to solve...")
        solve_res = driver.execute(
            "executeCdpCommand",
            {
                "cmd": "Captcha.waitForSolve",
                "params": {"detectTimeout": 10000},
            },
        )
        print("Captcha solve status:", solve_res["value"]["status"])
        print("Navigated! Scraping page content...")
        html = driver.page_source
        return html


def extract_body_content(html_content):
    soup = BeautifulSoup(html_content, "html.parser")
    body_content = soup.body
    if body_content:
        return str(body_content)
    return ""


def clean_body_content(body_content):
    soup = BeautifulSoup(body_content, "html.parser")

    for script_or_style in soup(["script", "style"]):
        script_or_style.extract()

    # Get text or further process the content
    cleaned_content = soup.get_text(separator="\n")
    cleaned_content = "\n".join(
        line.strip() for line in cleaned_content.splitlines() if line.strip()
    )

    return cleaned_content


def split_dom_content(dom_content, max_length=6000):
    return [
        dom_content[i : i + max_length] for i in range(0, len(dom_content), max_length)
    ]
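To see why split_dom_content exists: local LLMs have a limited context window, so the cleaned page text is cut into fixed-size chunks before parsing. Here is a quick standalone check of the chunking logic (same function as above, with its default chunk size of 6,000 characters):

```python
def split_dom_content(dom_content, max_length=6000):
    # identical chunking logic to scrape.py: fixed-size slices of the cleaned text
    return [
        dom_content[i : i + max_length] for i in range(0, len(dom_content), max_length)
    ]

# a 15,000-character page splits into two full chunks and one remainder
chunks = split_dom_content("x" * 15000)
print([len(c) for c in chunks])  # [6000, 6000, 3000]
```

Each chunk is later sent through the LLM separately, and the per-chunk answers are joined back together in parse.py.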

Step 5: Set up Ollama LLM

Create a new file called parse.py and copy in the code below. Then we'll set up Ollama locally, which will be used to run the LLM.

from langchain_ollama import OllamaLLM
from langchain_core.prompts import ChatPromptTemplate

template = (
    "You are tasked with extracting or generating specific information based on the following text content: {dom_content}. "
    "Please follow these instructions carefully:\n\n"
    "1. **Follow Instructions:** Perform the task as described here: {parse_description}.\n"
    "2. **Precise Output:** Provide the most concise and accurate response possible.\n"
    "3. **No Additional Text:** Do not include extra comments, explanations, or unrelated information in your response."
)

model = OllamaLLM(model="llama3.2")

def parse_with_ollama(dom_chunks, parse_description):
    """
    Handles a variety of tasks based on the provided parse_description.
    Parameters:
        - dom_chunks (list of str): Chunks of text content to process.
        - parse_description (str): Instruction for the task (e.g., "Find competitors", "Summarize product").
    Returns:
        - str: Combined result of all tasks across chunks.
    """
    prompt = ChatPromptTemplate.from_template(template)
    chain = prompt | model

    parsed_results = []
    for i, chunk in enumerate(dom_chunks, start=1):
        response = chain.invoke(
            {"dom_content": chunk, "parse_description": parse_description}
        )
        print(f"Processed batch {i}: {response}")
        parsed_results.append(response)

    return "\n".join(parsed_results)
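To make the prompt flow concrete: ChatPromptTemplate fills the two placeholders once per chunk, so every LLM call sees one chunk of page text plus your instruction. The substitution itself is just string formatting; here is a plain-Python sketch using the same template text (the sample dom_content and parse_description values are made up for illustration):

```python
template = (
    "You are tasked with extracting or generating specific information based on the following text content: {dom_content}. "
    "Please follow these instructions carefully:\n\n"
    "1. **Follow Instructions:** Perform the task as described here: {parse_description}.\n"
    "2. **Precise Output:** Provide the most concise and accurate response possible.\n"
    "3. **No Additional Text:** Do not include extra comments, explanations, or unrelated information in your response."
)

# one rendered prompt per chunk; this is roughly what the chain sends to the model
prompt = template.format(
    dom_content="Acme Pro plan: $49/month, includes API access",
    parse_description="Extract all pricing information",
)
print(prompt)
```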

Ollama allows you to run open-source LLMs locally on your computer, so you don't need to rely on API tokens, and it is completely free.

To get started with Ollama, visit this link: https://ollama.com/download

Once Ollama is downloaded and installed, open your terminal or command prompt and type the Ollama command:

ollama

You'd get something that looks like this:


Next, you need to pull an Ollama model. The model must be downloaded locally before your code can run.

To do that, visit https://github.com/ollama/ollama

Here you'll see all the different models you can use.


Pick an appropriate model based on your computer's specs. For this tutorial, we are using the Llama 3.2 model, which needs only about 3 GB of RAM.

Next, go back to your terminal or command prompt and run this command:

ollama pull llama3.2

This will then download the model for you onto your computer. Once this is complete, you can now go on to use this model in your parse.py file.
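One piece the demo main.py doesn't show yet is how the button handler connects to scrape.py and parse.py. Here is a minimal sketch of that wiring, with the stage functions passed in as arguments so the pipeline can be read (and tested) without a live browser or a running Ollama instance; the function name gather_insights is my own, not from the repo:

```python
def gather_insights(url, task, scrape, extract, clean, split, parse):
    """Run the scrape -> extract -> clean -> chunk -> LLM pipeline for one URL."""
    html = scrape(url)          # scrape_website from scrape.py
    body = extract(html)        # extract_body_content
    cleaned = clean(body)       # clean_body_content
    chunks = split(cleaned)     # split_dom_content
    return parse(chunks, task)  # parse_with_ollama from parse.py
```

In main.py, inside the if st.button("Gather Insights") block, you would call gather_insights(url, "Summarize the product offering", scrape_website, extract_body_content, clean_body_content, split_dom_content, parse_with_ollama) and display the result with st.write.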

Step 6: Test your tool

Now you can go on to run your code using this command:

streamlit run main.py

And it's all set.

You can modify the code however you want, for example enabling it to gather data from multiple URLs or multiple domains at once.

Add data visualization with pandas (pandas.pydata.org) and matplotlib (matplotlib.org) to make the insights actionable for your business.
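As a starting point for the visualization idea, here is a hedged sketch using pandas; the insights list and its fields are made-up placeholders for whatever structure you extract from the LLM output:

```python
import pandas as pd

# hypothetical structured insights extracted from parsed competitor pages
insights = [
    {"competitor": "Acme", "metric": "pricing_mentions", "count": 4},
    {"competitor": "Acme", "metric": "feature_mentions", "count": 7},
    {"competitor": "Globex", "metric": "pricing_mentions", "count": 2},
]

df = pd.DataFrame(insights)
# one row per competitor, one column per metric; missing cells become 0
summary = df.pivot_table(
    index="competitor", columns="metric", values="count", fill_value=0
)
print(summary)
# summary.plot(kind="bar") would chart this once matplotlib is installed
```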

Or automate the data collection process to track competitor updates regularly: use cron jobs or the third-party schedule package to run the scraping and analysis scripts at defined intervals.
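For the scheduling idea, the schedule package works well; here is a dependency-free sketch using the standard-library sched module instead (the track_competitors body is a placeholder for your scrape-and-parse calls):

```python
import sched
import time

def track_competitors():
    # placeholder: run the scrape + parse pipeline for each tracked URL here
    print("Collecting competitor data...")

scheduler = sched.scheduler(time.time, time.sleep)

def run_every(interval_seconds):
    track_competitors()
    # re-register so the job repeats at the given interval
    scheduler.enter(interval_seconds, 1, run_every, (interval_seconds,))

# queue the first run immediately, then repeat every 24 hours (86400 seconds)
scheduler.enter(0, 1, run_every, (86400,))
# scheduler.run() blocks, so call it as the script's last line
# (or run it in a background thread alongside the Streamlit app)
```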

To see the full code, check out the GitHub repo:

GitHub: sholajegede/CompetiAI (AI-powered Competitive Intelligence Tool)
Conclusion

There is a lot of potential in building a Competitive Intelligence Tool for your business or within your product. By combining web scraping and text analysis, you can create a tool that helps you stay ahead of the competition and make smarter decisions.

This can significantly improve your product development, marketing strategies, sales outreach, and overall market awareness.

The competitive edge these tools offer is invaluable, especially in industries where changes happen rapidly and competition is fierce. With advancements in AI and machine learning, you can expect even more sophisticated capabilities, from predictive analytics to real-time market alerts.

If you're considering building a CI tool, starting with a project like this is a fantastic way to get hands-on experience. Experiment, iterate, and enhance the tool as you identify new ways it can add value to your business operations.

Have thoughts or feedback on this tutorial? Share them in the comments below, or feel free to connect with me. I'd love to hear about how you're using competitive intelligence to transform your business!
