
Creating an AI-enabled Slackbot with AWS Bedrock Knowledge Base

One of the lowest-friction, highest-ROI applications of large language models (LLMs) so far has been the internal AI assistant. Yes, AI doesn't have to be all about customer-facing chatbots or fully autonomous agents. Just a simple interface for users to ask questions like the following can be a powerful tool:

  • "How do I deploy this service?"
  • "What's the on-call runbook for this alert?"
  • "Where is the latest diagram for the design doc?"

These questions already have answers — scattered across Confluence pages, Google Docs, GitHub READMEs, and Slack threads. The problem isn’t generation. It’s retrieval.

Out of the box, LLMs are great at reasoning and summarization, but they’re completely disconnected from your organization’s institutional knowledge. Prompt stuffing helps a bit. Fine-tuning helps in very narrow cases. But neither scales when your knowledge base changes weekly, or when correctness actually matters.

This is the void that retrieval-augmented generation (RAG) fills.

RAG bridges the gap between probabilistic language models and deterministic internal knowledge. Instead of asking an LLM to guess, you retrieve relevant documents first, then ask the model to synthesize an answer grounded in that context. The result is an assistant that feels intelligent without being reckless — and, crucially, one that stays up to date without constant retraining.

If you're already on AWS, Amazon Bedrock Knowledge Bases provides an easy way to create, deploy, and integrate a RAG pipeline into your existing infrastructure. In this post, we'll walk through how to set up an AWS Bedrock Knowledge Base and connect it to a Slackbot for a realistic internal, AI-enabled assistant use case.

Setting up AWS Bedrock Knowledge Base

From the AWS console, navigate to Amazon Bedrock. Under Build, choose Knowledge Bases. As of this writing, AWS supports indexing unstructured data by creating a custom vector store or by using the Kendra GenAI service, as well as enabling semantic search over structured data (e.g., databases, tables).

Since most internal data is likely to be unstructured (e.g., Confluence documentation, Markdown files, etc.), we'll choose the "Create knowledge base with vector store" option. As of this writing, AWS supports Confluence, Salesforce, SharePoint, and web crawlers on top of S3 (note: there is currently a limit of five data sources). For the purpose of this demo, let's choose Confluence. To connect, we'll need to store credentials in AWS Secrets Manager as described in the detailed guide.
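
If you prefer to script this step, a minimal sketch using boto3 could look like the following (the secret name and key names are illustrative assumptions; check the Confluence connector guide for the exact keys your auth type expects):

import json

import boto3

secrets = boto3.client("secretsmanager")

# Store the Confluence credentials that the Bedrock data source will reference.
# For basic auth, this is typically the account email plus an API token.
secrets.create_secret(
    Name="bedrock-kb/confluence-credentials",   # illustrative secret name
    SecretString=json.dumps({
        "username": "user@example.com",          # Confluence account email (assumed key name)
        "password": "<confluence-api-token>",    # Confluence API token (assumed key name)
    }),
)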

Next, we need to configure our data source's parsing strategy (either the AWS default parser or a foundation model like Claude acting as the parser) as well as the chunking strategy for our vector database. Based on these configurations, Bedrock will automatically chunk documents, generate embeddings, and store the vectors in the OpenSearch Serverless service. The performance of the RAG pipeline will depend on these parameters, but for a quick demo, we can start with default chunking and Amazon Titan embeddings.
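
If you end up creating the data source through the API instead of the console, the chunking choice is expressed as part of the ingestion configuration. A minimal sketch of the fixed-size variant (token sizes are illustrative, not tuned values):

# Chunking portion of a create_data_source call in the bedrock-agent API (sketch).
vector_ingestion_configuration = {
    "chunkingConfiguration": {
        "chunkingStrategy": "FIXED_SIZE",
        "fixedSizeChunkingConfiguration": {
            "maxTokens": 300,         # maximum tokens per chunk
            "overlapPercentage": 20,  # overlap between adjacent chunks
        },
    }
}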

Once the vector store is set up, we just have to manually sync the data source to ingest our content. You can imagine also adding SharePoint for internal PDFs, crawling open-source library documentation websites, or pulling in internally hosted S3 files.
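
Syncing can be kicked off from the console, but it can also be triggered programmatically, for example on a schedule. A minimal sketch, assuming you have the knowledge base and data source IDs from the previous steps:

import boto3

bedrock_agent = boto3.client("bedrock-agent")

# Start an ingestion job: re-crawls the data source, re-chunks the documents,
# and refreshes the vector index.
bedrock_agent.start_ingestion_job(
    knowledgeBaseId="<knowledge-base-id>",  # from the Bedrock console
    dataSourceId="<data-source-id>",        # the Confluence data source
    description="Manual sync from script",
)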

Setting up a Slack bot

With the "hard" part out of the way, we need to set up a Slack App via the Slack Admin Console. The key things we need are:

  1. Enabling Socket Mode
  2. Adding, at a minimum, the chat:write, app_mentions:read, and channels:history OAuth scopes
  3. Grabbing the bot token (under "OAuth & Permissions") and the app-level token for Socket Mode (under "Basic Information")

The final part is to actually code up the Slack bot. We can use the Slack Bolt SDK to quickly spin up a bot in Python. At a high level, we want the bot to do three things:

  1. Parse incoming Slack events (mentions, slash commands, etc.)
  2. Query the Knowledge Base
  3. Generate a response

A quick pseudocode sketch could look like:

import json
import os

import boto3

# Bedrock Agent Runtime handles Knowledge Base retrieval; Bedrock Runtime handles model invocation
bedrock_agent = boto3.client("bedrock-agent-runtime")
bedrock_runtime = boto3.client("bedrock-runtime")

KB_ID = os.environ["KNOWLEDGE_BASE_ID"]

def handler(event, context):
    # Pull the user's question out of the incoming Slack event
    text = extract_slack_message(event)

    # Retrieve the most relevant chunks from the Knowledge Base
    retrieval = bedrock_agent.retrieve(
        knowledgeBaseId=KB_ID,
        retrievalQuery={"text": text},
        retrievalConfiguration={"vectorSearchConfiguration": {"numberOfResults": 5}},
    )

    # Ground the prompt in the retrieved context
    prompt = build_prompt(text, retrieval["retrievalResults"])

    # Generate an answer (the body must follow the chosen model's request schema)
    response = bedrock_runtime.invoke_model(
        modelId="arn:aws:bedrock:us-east-1:...:inference-profile/us.anthropic.claude-sonnet-4-5-20250929-v1:0",
        body=json.dumps(prompt),
    )

    # Send the generated answer back to the Slack channel
    post_to_slack(response)
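
To wire that logic into Slack itself, a minimal Slack Bolt skeleton using Socket Mode could look like this (the token environment variable names and the handle_question wrapper are assumptions for illustration; handle_question would run the retrieve-and-generate flow from the pseudocode above):

import os

from slack_bolt import App
from slack_bolt.adapter.socket_mode import SocketModeHandler

# Bot token from "OAuth & Permissions"; the app-level token enables Socket Mode
app = App(token=os.environ["SLACK_BOT_TOKEN"])

@app.event("app_mention")
def on_mention(event, say):
    # Run the retrieve-then-generate flow on the mention text and reply in-channel
    answer = handle_question(event["text"])  # hypothetical wrapper around the handler above
    say(answer)

if __name__ == "__main__":
    SocketModeHandler(app, os.environ["SLACK_APP_TOKEN"]).start()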

Tuning for performance

Now for the real magic. Because LLMs are non-deterministic, we need to guide the model with some context for better performance. While RAG provides most of our "internal" knowledge, we can still use prompt engineering to steer the generation side.

You can include a prompt like:

You are an internal engineering assistant.

Answer the question using ONLY the provided context.
If the answer is not in the context, say you do not know.

<context>
{{retrieved_chunks}}
</context>

Question: {{user_question}}

and pass it along with the user's question to dictate what the LLM will do.
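
In code, this is roughly what the build_prompt helper from the earlier pseudocode could do; a minimal sketch, assuming each retrieved result exposes its text under content["text"] the way the Retrieve API returns it:

PROMPT_TEMPLATE = """You are an internal engineering assistant.

Answer the question using ONLY the provided context.
If the answer is not in the context, say you do not know.

<context>
{context}
</context>

Question: {question}"""

def build_prompt(question, retrieval_results):
    # Join the retrieved chunks into a single context block
    context = "\n\n".join(result["content"]["text"] for result in retrieval_results)
    return PROMPT_TEMPLATE.format(context=context, question=question)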

The other dial we can turn is how we embed and store our internal knowledge. AWS has a great guide on how content chunking works for knowledge bases. The key takeaway is that different chunking schemes perform better or worse depending on how the data is structured. For example, much Confluence documentation follows a natural hierarchical pattern of headings and body text, so hierarchical chunking can preserve those relationships and improve retrieval performance.
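
Compared with the fixed-size configuration sketched earlier, a hierarchical setup adds parent and child levels. Another sketch, with illustrative (not tuned) token sizes:

# Hierarchical chunking: parent chunks keep section-level context,
# child chunks are what get embedded for semantic search.
vector_ingestion_configuration = {
    "chunkingConfiguration": {
        "chunkingStrategy": "HIERARCHICAL",
        "hierarchicalChunkingConfiguration": {
            "levelConfigurations": [
                {"maxTokens": 1500},  # parent level (e.g., a heading and its body)
                {"maxTokens": 300},   # child level
            ],
            "overlapTokens": 60,
        },
    }
}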

Wrapping up

AI-enabled Slackbots are quickly becoming the front door to internal knowledge. With Amazon Bedrock Knowledge Bases, AWS has made it easy to build a RAG pipeline without, for the most part, having to operate and maintain a vector database yourself.

With powerful LLMs like ChatGPT and Claude, creating a Slack bot is easier than ever. But if you would like to compare your solution with a working model, there is a slightly outdated yet functional example from the AWS team on GitHub that you can follow.
