DEV Community

Mike Anderson
Mike Anderson

Posted on

Building a Safe Internal AI Assistant with Amazon Kendra and Amazon Bedrock

AWS Data Security

A practical, human guide for teams trying to reduce risky copy/paste into external AI tools

Let’s start with the real problem.

Most teams are not using ChatGPT, Claude, Midjourney, Canva, or other AI tools because they want to break security policy. They use them because they are busy, under pressure, and trying to get work done.

A developer needs help with an error message.

A security engineer needs the latest data-handling rule.

An HR or IT team member needs the right internal process.

A project manager needs to understand which AWS account belongs to which client.

The answer probably exists somewhere already. It may be in Confluence, Google Drive, Slack, an AWS runbook, or an old project folder.

But if finding the answer internally takes 20 minutes and an external AI tool gives a useful answer in 20 seconds, people will naturally choose speed.

That is the real security problem we are trying to solve together.

Not: “How do we stop people from using AI?”

The better question is:

How do we give employees a safe, approved, useful AI assistant that helps them work faster without leaking internal data?

That is where Amazon Kendra, Amazon Bedrock, and a properly designed Retrieval-Augmented Generation (RAG) architecture can help.


The environment we are solving for

This blog is written for an organization that looks like this:

  • Confluence stores security, IT, business, HR, policy, procedure, and development-environment design documents.
  • Google Drive is used for file sharing and cloud storage.
  • Google Workspace is the identity provider and SSO platform.
  • AWS has multiple accounts for different clients, projects, and environments.
  • Slack is used heavily for team messaging.
  • Employees use AI tools every day for coding, troubleshooting, writing, design, research, and operations.
  • External AI platforms are already in use, including ChatGPT, Claude, Gemini, Midjourney, Canva, and others.
  • There are limited guardrails today to prevent users from pasting sensitive internal data into those tools.

This is a common situation.

It does not mean the organization is careless. It usually means AI adoption has moved faster than governance, security tooling, and internal knowledge management.

So our job is to design something practical.

We need a solution that helps users, protects data, supports audit needs, and does not create so much friction that everyone works around it.


The problem in plain English

The organization has valuable knowledge, but it is scattered.

Some of it is in Confluence.

Some of it is in Google Drive.

Some of it is buried in Slack.

Some of it is tied to AWS accounts, client projects, runbooks, and architecture decisions.

When people cannot find the right answer quickly, they start doing this:

“I will just paste the policy, error message, runbook, or architecture snippet into ChatGPT and ask for help.”

That one action creates multiple risks:

  • internal policies may leave the organization;
  • client or project information may be exposed;
  • AWS architecture details may be shared with an unapproved vendor;
  • source code or secrets may be pasted by mistake;
  • HR, legal, or incident information may be disclosed;
  • the security team may have no audit trail;
  • the organization may breach contractual, regulatory, or privacy obligations.

This is why the answer cannot be only “write a policy.”

A policy helps, but people still need a better way to work.

The safer pattern is:

Give users an internal AI assistant that can answer from approved internal sources, respect permissions, use Google identity, log safely, and apply guardrails before sensitive content leaves the trusted environment.


What is RAG?

RAG stands for Retrieval-Augmented Generation.

That sounds technical, but the idea is simple.

A normal AI chatbot answers from what the model already knows or from whatever the user pastes into the chat.

A RAG assistant does something safer and more useful:

  1. The user asks a question.
  2. The system checks who the user is.
  3. The system searches approved internal sources.
  4. It retrieves only the content the user is allowed to access.
  5. It sends only the relevant excerpts to the AI model.
  6. The AI model writes an answer using that retrieved content.
  7. The answer includes sources where possible.
  8. The event is logged for security monitoring and audit.

The important point is this:

In a secure RAG design, the model is not the source of truth. Your approved internal documents are the source of truth.

That matters because we do not want the AI assistant inventing policy, guessing approvals, or exposing documents the user should not see.


What is Amazon Kendra?

Amazon Kendra is AWS’s managed enterprise search service.

For this design, think of Kendra as the search and retrieval layer.

It connects to approved repositories, indexes content, and returns the most relevant passages when a user asks a question.

In our scenario, Kendra can help search:

  • Confluence spaces;
  • Google Drive shared drives and approved folders;
  • selected Slack channels, if approved;
  • S3 buckets that contain approved AWS runbooks, architecture records, policies, or compliance documents.

Kendra is useful because it can support user-aware retrieval. In simple terms, it can help make sure users only receive search results they are allowed to see.

That is a big deal.

Without this, the AI assistant could become a very fast data leakage engine.

With this, the assistant can become a safer front door to internal knowledge.

But there is one rule we should be strict about:

Kendra must not become a dumping ground for every document in the company.

Indexing needs ownership, approval, classification, and access-control testing.


What is Amazon Bedrock?

Amazon Bedrock is AWS’s managed service for building generative AI applications with foundation models.

In this design, Bedrock is the answer-generation layer.

Kendra finds the relevant internal content. Bedrock turns that content into a readable answer.

A secure Bedrock setup should include:

  • a system prompt that tells the model to answer only from retrieved sources;
  • Bedrock Guardrails for sensitive data, prompt attacks, denied topics, and unsafe outputs;
  • refusal behavior when the answer is not available from approved content;
  • source references so users can verify the answer;
  • low-temperature settings for policy, compliance, and operational answers.

The model should not receive entire document libraries.

It should receive the smallest useful set of authorized excerpts needed to answer the user’s question.

That is how we reduce exposure while still helping the user.


What this solution can and cannot do

This part is important.

Amazon Kendra and Amazon Bedrock can help us build a safe internal AI assistant.

They can help employees stop pasting internal data into unmanaged AI tools because they now have a useful approved alternative.

But they do not automatically control what a user types into ChatGPT, Claude, Midjourney, Canva, or another external AI platform.

So the complete solution has two parts.

Part 1: Give users a safe internal AI assistant

This is the Kendra + Bedrock RAG platform.

It should be the preferred place to ask questions about internal policies, procedures, AWS runbooks, development-environment designs, and approved operational guidance.

Part 2: Control risky external AI usage

This requires security controls outside Kendra and Bedrock, such as:

  • an AI acceptable-use policy;
  • data classification;
  • CASB;
  • Secure Web Gateway;
  • DLP;
  • endpoint controls;
  • secure browser controls;
  • an approved AI vendor register;
  • legal and privacy review;
  • an exception process;
  • SIEM monitoring.

If we only build the internal assistant but do not manage external AI usage, the risk remains.

If we only block external AI but do not give users a good alternative, people will look for workarounds.

The balanced answer is to do both.


The target architecture

Here is the clean version of what we are building.

AWS Kendra architecture

Employee
  |
  | Google SSO
  v
Internal AI Portal or Slack Bot
  |
  v
API Gateway
  |
  v
RAG Backend
  |
  |-- Validate Google identity
  |-- Resolve groups from a trusted source
  |-- Check prompt for secrets or restricted content
  |-- Apply data-handling policy
  |
  v
Amazon Kendra
  |
  |-- Confluence connector
  |-- Google Drive connector
  |-- Optional Slack connector
  |-- Optional S3 approved knowledge source
  |-- ACL and user-context filtering
  |
  v
Authorized excerpts only
  |
  v
Amazon Bedrock + Bedrock Guardrails
  |
  v
Grounded answer with sources
  |
  v
User
Enter fullscreen mode Exit fullscreen mode

Security telemetry should flow to the security team:

API Gateway logs
Lambda application logs
CloudTrail
CloudWatch
Kendra admin/query events
Bedrock Guardrail events
CASB/SWG/DLP events
SIEM/SOAR
Enter fullscreen mode Exit fullscreen mode

External AI usage needs a separate control path:

User -> External AI Platform
        |
        v
CASB / SWG / DLP / Secure Browser / Endpoint Control
        |
        |-- Allow low-risk approved use
        |-- Warn the user
        |-- Block restricted data upload
        |-- Log the event
        |-- Route exception requests
Enter fullscreen mode Exit fullscreen mode

This gives us a practical model:

  • help users with internal AI;
  • reduce risky copy/paste;
  • enforce permissions;
  • monitor misuse;
  • preserve audit evidence.

Step 1: Start with the use cases, not the technology

This is where many AI projects go wrong.

They start by asking:

“Which model should we use?”

That is not the first question.

The better first question is:

“Which user problems are we solving safely?”

Good first use cases are:

  • “Where is the vendor data-sharing procedure?”
  • “What is the approved process for creating a new AWS account?”
  • “Which security standard applies to development environments?”
  • “What is the incident response process for suspected data leakage?”
  • “What is the approved way to share files with a client?”
  • “Which Confluence page explains our developer onboarding process?”
  • “Which AWS guardrails apply to client project accounts?”

These are valuable, common, and manageable.

Avoid starting with:

  • full Slack workspace search;
  • full HR file search;
  • legal folders;
  • finance exports;
  • customer data exports;
  • source-code repositories;
  • incident evidence;
  • production secrets;
  • all Google Drive content;
  • all Confluence spaces.

We are not trying to prove that the assistant can read everything.

We are proving that it can safely answer useful questions.


Step 2: Classify the data before indexing it

Before connecting Kendra to repositories, agree on a simple classification model.

Classification Example AI handling
Public Published marketing content Allowed in approved tools
Internal General internal procedures Allowed in internal RAG
Confidential Security designs, client/project documents Internal RAG only with ACL enforcement
Restricted Credentials, sensitive customer data, HR/legal/incident records Do not index unless explicitly approved

This classification does not need to be perfect on day one.

But it does need to be clear enough to stop unsafe indexing.

A good rule is:

If we would be uncomfortable seeing the content summarized in an AI answer, we should not index it until the owner, access model, and guardrails are ready.

Kendra metadata should include classification where possible.

The backend should also apply a second check before sending retrieved content to Bedrock.

That gives us defense in depth.


Step 3: Use Google identity properly

Google Workspace is already the identity provider, so we should use it.

But we need to avoid a common mistake.

A Google ID token can prove who the user is, but it may not contain all the group membership information needed for authorization.

So the RAG backend should not simply trust group names sent by the browser.

Better options are:

  1. Use an internal identity broker that validates Google SSO and issues signed application claims.
  2. Resolve group membership server-side using Google Cloud Identity or Directory APIs.
  3. Use AWS IAM Identity Center integrated with Google Workspace, if that fits your identity strategy.
  4. Maintain a controlled mapping between Google groups and Kendra filters.

The goal is simple:

The user should only retrieve documents they are already allowed to access in the source system.

If the assistant gives a user more access than Confluence or Google Drive would give them directly, the design has failed.


Step 4: Decide how to separate clients, projects, and AWS accounts

This matters a lot in multi-account AWS environments.

If your organization has separate AWS accounts for different clients or projects, your knowledge base should respect that separation.

There are three common patterns.

Option A: One central Kendra index

This is operationally simpler, but it requires mature ACLs and metadata.

Use it only when all content belongs to the same organization and cross-project leakage is not a strict contractual concern.

Option B: Separate Kendra index per client or project

This is usually better for consulting, MSP, MSSP, or project-based environments.

It reduces the risk of one client’s information appearing in another client’s answer.

Option C: Separate AWS account per client or project RAG environment

This is the strongest isolation model.

Use this when contracts, regulations, or customer commitments require strict separation.

For most organizations handling client-sensitive information, Option B or C is safer.

The operating principle is:

The RAG architecture should follow the same isolation model as the business and cloud environment.


Step 5: Connect Confluence carefully

Confluence is probably the best first source.

It usually contains policies, procedures, runbooks, architecture notes, and development-environment designs.

But do not connect all of Confluence at once.

Start like this:

  1. Pick one or two approved spaces.
  2. Assign a data owner for each space.
  3. Review permissions.
  4. Remove stale broad-access groups.
  5. Exclude test, archive, personal, and unrestricted spaces.
  6. Configure the Kendra Confluence connector.
  7. Enable ACL ingestion where supported.
  8. Sync the data source.
  9. Test access with users from different roles.
  10. Review what the assistant returns.

Use positive and negative tests.

Test Expected behavior
Security engineer asks for a security runbook they can access Answer returned with source
Developer asks for a restricted incident report No restricted source returned
HR user asks for development architecture Only authorized content returned
User asks about a policy outside approved spaces Assistant says it does not have enough approved context

Do not skip negative testing.

That is how you catch overexposure before users do.


Step 6: Connect Google Drive with extra caution

Google Drive is powerful, but permissions can be messy.

There may be shared links, inherited permissions, old project folders, personal files, externally shared files, and forgotten documents.

Start with Shared Drives, not every user’s My Drive.

Good first sources:

  • approved IT procedures;
  • approved security standards;
  • developer onboarding guides;
  • cloud architecture templates;
  • approved compliance summary documents;
  • non-sensitive AWS runbooks.

Avoid at the beginning:

  • personal My Drive content;
  • HR case folders;
  • legal folders;
  • finance exports;
  • raw customer exports;
  • incident evidence folders;
  • unreviewed client directories.

The checklist is simple:

  1. Identify the folder owner.
  2. Review external sharing.
  3. Remove broad link-based access where it is not needed.
  4. Configure the Kendra Google Drive connector.
  5. Use inclusion and exclusion rules.
  6. Validate document-level permissions.
  7. Test with users from different groups.
  8. Review logs and returned sources.

If Google Drive is not cleaned up before indexing, the assistant may expose historical permission mistakes faster than normal search ever did.

That is why we index slowly and test carefully.


Step 7: Treat Slack as a front end first, not a data source

Slack is useful, but it is risky to index.

It contains informal decisions, screenshots, troubleshooting notes, incident discussions, old opinions, pasted logs, and sometimes secrets.

So our recommended approach is:

Use Slack as a way to ask the assistant before using Slack as a source of truth.

A safer pattern looks like this:

  1. User asks the Slack bot a question.
  2. The Slack bot maps the Slack user to Google Workspace identity.
  3. The bot calls the internal RAG API.
  4. The API applies the same identity, Kendra, and Bedrock controls.
  5. The answer is returned as an ephemeral message or direct response.
  6. Sensitive answers are not posted into shared channels.

Only index Slack later, and only after legal, privacy, and data owners approve it.

If Slack indexing is approved, start with a small number of knowledge channels.

Do not index DMs by default.

Do not index all private channels by default.

Do not index incident channels without explicit approval.


Step 8: Add AWS knowledge through approved documents

The assistant does not need direct access to every AWS account.

That would create unnecessary risk.

Instead, publish approved AWS knowledge into Confluence, Google Drive, or S3.

Useful content includes:

  • AWS account inventory;
  • client/project ownership matrix;
  • landing zone standards;
  • SCP and guardrail documentation;
  • cloud deployment process;
  • incident response runbooks;
  • Security Hub, GuardDuty, Macie, and CloudTrail operating procedures;
  • WAF and CloudFront standards;
  • approved architecture decision records;
  • data classification by account or project.

This gives engineers the answers they need without giving the assistant broad live access to cloud environments.

For client or project separation, use:

  • separate indexes where needed;
  • metadata filters;
  • Google group mapping;
  • document ownership;
  • quarterly access reviews;
  • cross-project query monitoring.

Step 9: Add Bedrock Guardrails and application guardrails

Do not rely only on the model prompt.

Prompts are useful, but they are not enough for production security.

Use Bedrock Guardrails and application checks together.

Guardrails should cover:

  • prompt-injection attempts;
  • requests for secrets;
  • access keys, tokens, passwords, and private keys;
  • requests to bypass policy;
  • requests to exfiltrate data;
  • unsafe coding or operational instructions;
  • regulated personal data where blocking or masking is required;
  • unsupported answers where retrieved context is insufficient.

The application should also enforce rules such as:

Answer only from retrieved authorized sources.
If the source context is insufficient, say so.
Do not invent policy.
Do not infer approval.
Do not reveal secrets.
Do not summarize restricted content unless explicitly allowed.
Cite sources where possible.
Enter fullscreen mode Exit fullscreen mode

This protects the user too.

A good assistant should not give a confident but wrong answer.

For security, compliance, and operations, “I do not have enough approved context” is often the safest answer.


Step 10: Log safely

Security teams need visibility.

But logging everything is dangerous.

User questions may contain secrets, customer names, source code, incident details, or HR information.

Model answers may contain summarized confidential content.

Retrieved excerpts may contain restricted policy or architecture information.

So the production logging rule should be:

Log enough to investigate misuse, but not enough to create a second sensitive data repository.

Good fields to log:

  • hashed user ID;
  • timestamp;
  • request ID;
  • source application;
  • Kendra query ID;
  • number of retrieved passages;
  • classification counts;
  • guardrail decision;
  • block reason;
  • latency;
  • error code;
  • client/project metadata where safe.

Avoid logging by default:

  • raw user query;
  • full prompt;
  • retrieved excerpts;
  • model answer;
  • document body;
  • secrets or detected sensitive values.

This is one of the most important production controls.

Otherwise, the AI logging pipeline becomes its own data leakage risk.


Step 11: Control external AI platforms without making users the enemy

This is where the tone matters in real life.

Users are not the enemy.

Most risky AI behavior happens because users are trying to move fast and do the right thing with poor tools.

So the control strategy should feel fair:

  1. Give users a good internal assistant.
  2. Explain what data can and cannot go into external AI tools.
  3. Allow approved external AI tools for public or low-risk work.
  4. Block or warn when confidential or restricted data is pasted externally.
  5. Provide a quick exception process.
  6. Coach repeat offenders instead of only punishing them.
  7. Use SIEM reporting to find patterns and improve guidance.

A simple policy model:

Data type External AI Internal RAG
Public Allowed in approved tools Allowed
Internal Allowed only in approved enterprise AI tools Allowed
Confidential Not allowed in unmanaged AI tools Allowed with ACLs
Restricted Not allowed Only with explicit approval or not indexed

Technical controls may include:

  • CASB;
  • Secure Web Gateway;
  • DLP;
  • endpoint DLP;
  • secure browser;
  • browser extension control;
  • DNS/web filtering;
  • SaaS allowlist/blocklist;
  • enterprise AI vendor controls.

The goal is not to block innovation.

The goal is to make the safe path easier than the risky path.


Step 12: Monitor for misuse and control failure

The SOC should not monitor every question like a surveillance program.

But it should monitor meaningful risk signals.

Examples:

Signal Why it matters
Repeated blocked prompts User may be pasting secrets or restricted data
High query volume by one user Possible scraping or compromised account
Queries across many client names Possible reconnaissance or cross-client harvesting
Kendra ACL sync failures Could cause overexposure or missing access
New broad data source added Could expand searchable content unexpectedly
Bedrock Guardrail blocks Indicates policy or safety issues
External AI DLP blocks Indicates attempted sensitive upload
Slack bot used in sensitive channels May expose answers to the wrong audience

Example detection ideas:

IF a user submits more than 25 RAG queries in 10 minutes
AND queries reference more than 3 client or project names
THEN create a SOC alert for possible internal data harvesting.
Enter fullscreen mode Exit fullscreen mode
IF prompt DLP detects an access key, private key, password, or token
THEN block the request, show safe guidance, and create a security event.
Enter fullscreen mode Exit fullscreen mode
IF CASB/SWG blocks upload to an external AI domain
AND content classification is Confidential or Restricted
THEN create a DLP case and notify the data owner.
Enter fullscreen mode Exit fullscreen mode

The response should be proportionate.

Not every blocked prompt is malicious.

Sometimes the control worked, and the user simply needs guidance.


Step 13: Roll out in phases

A safe rollout is better than a big launch.

Use this sequence:

  1. Approve the first AI use cases.
  2. Define data classification and AI-handling rules.
  3. Create a dedicated AWS security AI account.
  4. Enable Bedrock model access.
  5. Create Bedrock Guardrails.
  6. Create the Kendra index.
  7. Connect one low-risk Confluence space.
  8. Validate ACL filtering.
  9. Deploy the RAG API.
  10. Connect an internal web portal.
  11. Add one approved Google Shared Drive.
  12. Validate Google identity and group mapping.
  13. Add Slack as a front end.
  14. Consider Slack indexing only after legal and privacy approval.
  15. Integrate logs with the SIEM.
  16. Pilot with security, IT, and engineering users.
  17. Expand by data source and use case.

At each phase, ask:

  • Did the assistant answer useful questions?
  • Did it refuse when it should?
  • Did it respect document permissions?
  • Did the logs avoid sensitive content?
  • Did users trust the answers?
  • Did the tool reduce external AI copy/paste?

That last question matters.

The goal is not just to deploy architecture.

The goal is to change behavior safely.


What a good answer looks like

Imagine a user asks:

Can I send customer data from the analytics platform to an external dashboard vendor?

A good answer would say something like:

Based on the approved data-handling and vendor-sharing procedures available to you, customer data may only be shared with an external vendor after data-owner approval, vendor risk review, an approved transfer method, and required contractual or privacy checks. I do not have an authorized source confirming that this specific dashboard vendor is approved for the analytics platform. Please submit the request through the approved vendor data-sharing workflow.

That answer helps the user.

It does not shame them.

It does not invent approval.

It points them to the right process.

An unsafe answer would be:

Yes, export the data and upload it to the vendor dashboard.

Another unsafe answer would be:

The restricted architecture diagram says the analytics platform contains these customer fields...

If the user is not authorized to see the restricted diagram, the assistant must not reveal or summarize it.


Implementation checklist

Identity

  • Google SSO is enabled.
  • MFA and conditional access are enforced where required.
  • Groups are resolved server-side.
  • Browser-supplied groups are not trusted.
  • Group-to-Kendra mapping is tested.
  • Privileged access is reviewed.

Kendra

  • The Confluence connector is scoped to approved spaces.
  • The Google Drive connector is scoped to approved Shared Drives or folders.
  • The Slack connector is optional and approved.
  • ACL ingestion is validated.
  • Public or no-ACL documents are reviewed.
  • Classification metadata is applied where practical.
  • Access tests include both allowed and denied users.

Bedrock

  • An approved model is selected.
  • A Bedrock Guardrail is configured and versioned.
  • Prompt-attack filtering is enabled.
  • Sensitive information filters are enabled.
  • Refusal behavior is tested.
  • Citations or source references are returned where possible.

AWS platform

  • A dedicated security AI AWS account is used.
  • IAM least privilege is applied.
  • KMS encryption is configured.
  • CloudTrail is enabled.
  • CloudWatch log retention is set.
  • Logs are forwarded to the SIEM.
  • API Gateway and Lambda do not log raw prompts by default.
  • WAF is used if the API is internet-exposed.

External AI governance

  • An AI acceptable-use standard is published.
  • An approved AI tools register is maintained.
  • CASB/SWG/DLP controls are enabled.
  • An exception workflow is defined.
  • User guidance includes safe and unsafe examples.
  • Violations are monitored and handled proportionately.

The honest conclusion

Yes, this design solves a real problem.

But only if we position it correctly.

Amazon Kendra and Amazon Bedrock are not magic controls that stop every external AI risk.

They are the foundation for a better internal option.

The real solution is the combination of:

  • an approved internal RAG assistant;
  • Google SSO and trusted group resolution;
  • Kendra ACL-aware retrieval;
  • Bedrock generation with guardrails;
  • safe logging;
  • data classification;
  • client/project isolation;
  • SIEM monitoring;
  • DLP/CASB/SWG controls for external AI;
  • clear policy and user education.

The human lesson is simple:

People will use the tool that helps them get work done. Security’s job is to make the safe tool useful enough that people choose it naturally.

That is how we reduce shadow AI.

That is how we protect internal knowledge.

And that is how we give employees the speed of AI without asking them to gamble with company, client, or personal data.

Top comments (0)