mgbec for AWS Community Builders

Configure it Out with AWS AgentCore and Kiro

AI security is a huge, ever-evolving topic with no simple and easy answers. Both the OWASP AI Exchange (https://owaspai.org/) and the OWASP GenAI Security Project (https://genai.owasp.org/) are incredible sources of information on all things AI security: threat intelligence, governance, MCP security, agentic security, and more. One recent release I have been looking at is the OWASP AIBOM Generator (https://genai.owasp.org/resource/owasp-aibom-generator/). As recent software supply chain attacks have shown, understanding the dependencies in our ecosystem is critical. The OWASP AIBOM Generator gives us the AI equivalent of a Software Bill of Materials: you can enter any Hugging Face model and generate an AIBOM in CycloneDX format. Available model metadata and dependencies are extracted and presented in a format that is both machine readable and human understandable. Since AIBOMs, like AI in general, are rapidly evolving, the tool also provides a "completeness score" to indicate how much data is available about the model.

To test the tool yourself, go to https://huggingface.co/spaces/GenAISecurityProject/OWASP-AIBOM-Generator and enter a model name, for example "google/functiongemma-270m-it". The tool will generate a breakdown of the model field categories and a completeness score, and you can also download the JSON data.

AIBOMs will be incredibly important as we further integrate AI into our businesses. GenAI security in general is a huge topic, and I wanted to see whether I could investigate and streamline a process around other pieces of the AI security puzzle. I've been experimenting with AWS Kiro as an IDE and AWS AgentCore as an agentic platform, and my project today is to build on the AIBOM generation and see what other types of security analysis we can automate. With the help of Kiro, this is what I came up with:

Model Security Analysis Workflow (https://github.com/mgbec/aibom-with-multiple-options)

The security analysis follows a 5-step process orchestrated by the AIBOMAgentOrchestrator:

  1. Model Information Gathering (HuggingFaceService)

    - Fetches detailed model metadata from the Hugging Face Hub
    - Collects information about files, configuration, dependencies, license, author, etc.

    This provides the foundation for the security assessment; a minimal sketch of this step follows.
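Under the hood, HuggingFaceService presumably wraps the huggingface_hub client. Here is a minimal sketch of what this step could look like; the function name and the fields in the returned dict are my own, not the repo's exact API:

# Illustrative sketch of step 1 using the huggingface_hub client.
# gather_model_info and its output fields are assumptions, not the repo's code.
from huggingface_hub import HfApi

def gather_model_info(model_name: str) -> dict:
    info = HfApi().model_info(model_name, files_metadata=True)
    return {
        "model_name": model_name,
        "author": info.author,
        "license": getattr(info.card_data, "license", None) or "unknown",
        "library": info.library_name,  # e.g. "transformers"
        "tags": info.tags,
        "files": [s.rfilename for s in (info.siblings or [])],
    }

print(gather_model_info("BAAI/bge-m3"))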

  2. AIBOM Generation (AIBOMGenerator)

    The system generates an OWASP-compliant AI Bill of Materials by:

    - Analyzing model files: Categorizes files as model weights (.bin, .safetensors), configuration (.json), or source code (.py)
    - Identifying components: Creates a component entry for each file with metadata such as supplier, version, and description
    - Detecting dependencies: Maps framework dependencies based on the model's library (transformers, pytorch, etc.)
    - Security scanning: Automatically flags potential risks such as:
      - Pickle files (high severity: can execute arbitrary code)
      - Missing or unknown licenses (medium severity)
      - Suspicious file patterns
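The categorization and flagging logic in this step might look roughly like the following sketch; the extension sets, category names, and severity labels are illustrative, not the repo's exact rules:

# Rough sketch of the AIBOMGenerator's categorization and flagging idea.
import os

WEIGHT_EXTS = {".bin", ".safetensors", ".pt", ".pth"}
CONFIG_EXTS = {".json", ".yaml", ".yml"}
CODE_EXTS = {".py"}
PICKLE_EXTS = {".pkl", ".pickle", ".pt", ".pth", ".bin"}  # pickle-based or often pickled

def categorize(filename: str) -> str:
    ext = os.path.splitext(filename)[1].lower()
    if ext in WEIGHT_EXTS:
        return "model-weights"
    if ext in CONFIG_EXTS:
        return "configuration"
    if ext in CODE_EXTS:
        return "source-code"
    return "other"

def flag_risks(files: list[str], license_id: str | None) -> list[dict]:
    findings = []
    for f in files:
        if os.path.splitext(f)[1].lower() in PICKLE_EXTS:
            findings.append({"file": f, "severity": "high",
                             "issue": "pickle-based format can execute arbitrary code on load"})
    if not license_id or license_id == "unknown":
        findings.append({"severity": "medium", "issue": "missing or unknown license"})
    return findings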

  3. AI-Powered Security Analysis (BedrockAgentService)

    AWS Bedrock provides intelligent security insights through this analysis process:

    - Creates a detailed prompt with the AIBOM data and model information
    - Uses Claude 3 Sonnet to perform deep security analysis
    - Analyzes patterns, dependencies, and potential vulnerabilities

    Security assessment categories:

    - Risk Scoring: 0–10 scale with risk levels (LOW/MEDIUM/HIGH/CRITICAL)
    - Vulnerability Detection: Known CVEs, unsafe formats, suspicious components
    - Compliance Issues: License problems, regulatory concerns
    - Recommendations: Actionable security improvements
    - File Analysis: Identifies unsafe formats and suspicious files
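The Bedrock call behind BedrockAgentService might look roughly like this sketch, using boto3's Converse API; the region, prompt framing, and function name are assumptions rather than the repo's exact code:

import json
import boto3

# Hedged sketch of the Bedrock analysis call. Claude 3 Sonnet is the model
# the article names; everything else here is an illustrative assumption.
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

def analyze_security(aibom: dict, model_info: dict) -> str:
    prompt = (
        "You are an AI supply-chain security analyst. Given this AIBOM and "
        "model metadata, return JSON with risk_score (0-10), risk_level, "
        "vulnerabilities, and recommendations.\n\n"
        f"AIBOM:\n{json.dumps(aibom, indent=2)}\n\n"
        f"Model info:\n{json.dumps(model_info, indent=2)}"
    )
    response = bedrock.converse(
        modelId="anthropic.claude-3-sonnet-20240229-v1:0",
        messages=[{"role": "user", "content": [{"text": prompt}]}],
    )
    return response["output"]["message"]["content"][0]["text"]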

  4. Risk Evaluation

    The system evaluates multiple risk vectors:

    - Technical Risks: Unsafe file formats, known vulnerabilities
    - Legal Risks: License compliance, intellectual property issues
    - Operational Risks: Model provenance, supply chain security
    - Data Risks: Training data concerns, bias detection
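One plausible way to roll these four vectors into the single 0–10 score used above is a weighted sum, as in this sketch; the weights and level thresholds are my assumptions, not values taken from the repo:

# Illustrative aggregation of the four risk vectors into one 0-10 score.
RISK_WEIGHTS = {"technical": 0.40, "legal": 0.20, "operational": 0.25, "data": 0.15}

def overall_risk(scores: dict) -> tuple[float, str]:
    # Each input score is on a 0-10 scale; the weights sum to 1.0.
    total = sum(RISK_WEIGHTS[k] * scores.get(k, 0.0) for k in RISK_WEIGHTS)
    if total < 3:
        level = "LOW"
    elif total < 6:
        level = "MEDIUM"
    elif total < 8.5:
        level = "HIGH"
    else:
        level = "CRITICAL"
    return round(total, 1), level

print(overall_risk({"technical": 8, "legal": 6, "operational": 8, "data": 4}))
# -> (7.0, 'HIGH')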

  5. Reporting

    Generates detailed HTML reports with:

    - Executive summary with risk scores
    - Detailed vulnerability breakdown
    - Compliance gap analysis
    - Actionable recommendations
    - Visual risk indicators
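As a rough illustration of this last step, rendering the analysis into HTML could look like the sketch below, which consumes the same fields as the example output shown later; render_report is a placeholder name and the repo's actual template is much richer:

import html

# Rough shape of the reporting step, not the repo's real template.
def render_report(model_name: str, analysis: dict) -> str:
    vulns = "".join(
        f"<li><b>{html.escape(v['severity'].upper())}</b>: {html.escape(v['description'])}</li>"
        for v in analysis.get("vulnerabilities", [])
    )
    recs = "".join(f"<li>{html.escape(r)}</li>" for r in analysis.get("recommendations", []))
    return (
        f"<h1>Security report: {html.escape(model_name)}</h1>"
        f"<p>Risk score: {analysis['risk_score']} ({analysis['risk_level']})</p>"
        f"<h2>Vulnerabilities</h2><ul>{vulns}</ul>"
        f"<h2>Recommendations</h2><ul>{recs}</ul>"
    )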

Key Security Features

Automated Threat Detection:
- Scans for pickle files
- Identifies unknown/missing licenses
- Flags suspicious file patterns
- Detects outdated dependencies

AI-Enhanced Analysis:
- Uses large language models for pattern recognition
- Provides context-aware security recommendations
- Generates human-readable explanations
- Adapts to new threat patterns

OWASP Compliance:
- Follows OWASP AIBOM standards
- Uses CycloneDX format for interoperability
- Provides structured vulnerability data
- Enables supply chain transparency

Example Security Analysis Output

When you run the analysis, you get structured results like:

{
  "risk_score": 7.5,
  "risk_level": "HIGH",
  "vulnerabilities": [
    {
      "type": "unsafe_format",
      "severity": "high",
      "description": "Model uses pickle format which can execute arbitrary code",
      "cve_id": "AIBOM-12345678"
    }
  ],
  "recommendations": [
    "Convert pickle files to safer formats like safetensors",
    "Verify model provenance and author reputation"
  ]
}

The integration with AWS Bedrock aims to keep the analysis current with emerging threats and security best practices.

But wait: before it sounds like I am terribly arrogant and think I have solved the AI security problem, let me be clear that this is more of a starting point. There are so many aspects of AI security that are not covered in my process; it is just square one, I fully admit.

That being said, let’s take a look at some of the ways we can evaluate models:

Analyze a model: agentcore invoke '{"action": "analyze_model", "model_name": "BAAI/bge-m3"}'

Multiple model comparison: agentcore invoke '{"action": "compare_models", "model_names": ["microsoft/DialoGPT-medium", "facebook/blenderbot-400M-distill"]}'

Or, if you want to compare quite a few at once:

agentcore invoke '{
  "action": "compare_models",
  "model_names": [
    "microsoft/DialoGPT-small",
    "microsoft/DialoGPT-medium",
    "microsoft/DialoGPT-large",
    "facebook/blenderbot-400M-distill",
    "facebook/blenderbot-1B-distill",
    "google/flan-t5-small"
  ]
}'

Reporting:

The program attempts to build on the AIBOM information using Bedrock and an AgentCore agent.

If you ask for analysis of one model, you will be given: a security analysis, recommendations, analysis methodology, risk factor analysis, security checklist, and threat modeling information.

If you compare models, you aren't given as much detail; instead, you will see common components, unique components, and a short security comparison of the models.

Reports are generated and stored locally, as well as in an S3 bucket.
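A hedged sketch of that persistence step might look like this; the bucket name and key scheme are placeholders, not the repo's configuration:

from pathlib import Path

import boto3

# Writes the HTML report locally, then uploads the same content to S3.
def store_report(model_name: str, html_doc: str, bucket: str = "my-aibom-reports") -> None:
    safe_name = model_name.replace("/", "_")
    local_path = Path("reports") / f"{safe_name}.html"
    local_path.parent.mkdir(exist_ok=True)
    local_path.write_text(html_doc, encoding="utf-8")  # local copy
    boto3.client("s3").put_object(                     # S3 copy
        Bucket=bucket,
        Key=f"reports/{safe_name}.html",
        Body=html_doc.encode("utf-8"),
        ContentType="text/html",
    )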

This analysis of the models is just a start, even if it were completely accurate. As we have all been learning, much of the security battle is in workflow design, data security, infrastructure management, observability, and more. So, I am pointing us all back to the OWASP AI security resources, as well as all the other risk management frameworks and resources being created globally. We live in interesting times!

One last note: this would have been much more difficult without the assistance of Kiro. I've been using it since last summer, and it is just getting better and better. So, thanks to my extremely patient coder and indefatigable troubleshooter, Kiro (and all the real people behind the scenes). All the work is greatly appreciated.
