How to run Cisco Foundation-Sec-8B on Colab for FREE

#ai #security #llm

Cisco Foundation-Sec-8B is an open-source LLM specialized for the security domain. Its training data includes threat intelligence reports, vulnerability databases, incident response documentation, and security standards. You can find the details on the official Hugging Face page:

https://huggingface.co/fdtn-ai/Foundation-Sec-8B

We can easily set up an environment on Google Colab to run the example code from Hugging Face.

First, go to Colab and create a new notebook:

https://colab.research.google.com/#create=true

In the first code cell, enter the following commands to set up the environment for the example code:

!pip install torch
!pip install git+https://github.com/huggingface/transformers
!pip install git+https://github.com/huggingface/accelerate
!pip install huggingface_hub

Click the Run button on the left side of the code cell and wait for the installation to finish.
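Once the installation cell has finished, an optional check like the one below can confirm which package versions ended up in the runtime. This is only a sketch using Python's standard `importlib.metadata`; the helper name `installed_versions` is made up for this post, and the package names are assumed to match the names used in the pip commands above.

```python
# Optional sanity check: report which versions of the packages got installed.
from importlib import metadata

# Names taken from the pip commands above (assumed to match the installed names).
PACKAGES = ["torch", "transformers", "accelerate", "huggingface_hub"]

def installed_versions(packages):
    """Return a dict mapping each package name to its version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

for name, version in installed_versions(PACKAGES).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

If any line prints NOT INSTALLED, re-running the setup cell is the first thing to try.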

Once the setup has finished, your environment is ready and we can run the example code. Add a code cell and enter the following code (from example part 1):

# Import the required libraries
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("fdtn-ai/Foundation-Sec-8B")
model = AutoModelForCausalLM.from_pretrained("fdtn-ai/Foundation-Sec-8B")

Click the Run button on the left side of the second code cell. Downloading the model (about 16 GB) may take more than 5 minutes.
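The 16 GB download size is roughly what back-of-the-envelope arithmetic predicts for a model of this size stored in 16-bit precision. The sketch below assumes that "8B" means about 8 billion parameters; it only counts the weights and ignores the tokenizer files, activations, and generation cache, so treat the numbers as a lower bound on memory use.

```python
# Rough size of the model weights for a given numeric precision.
# ASSUMPTION: the "8B" in the model name means roughly 8 billion parameters.
PARAMS = 8_000_000_000

BYTES_PER_PARAM = {
    "float32": 4,
    "float16/bfloat16": 2,
    "int8": 1,
}

def weights_size_gb(num_params, bytes_per_param):
    """Approximate weight size in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * bytes_per_param / 1e9

for dtype, nbytes in BYTES_PER_PARAM.items():
    print(f"{dtype:>17}: ~{weights_size_gb(PARAMS, nbytes):.0f} GB")
```

At 2 bytes per parameter this works out to about 16 GB, which lines up with the download size seen in Colab.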

Finally, add another code cell and enter the following code (from example part 2):

# Example: Matching CWE to CVE IDs
prompt="""CVE-2021-44228 is a remote code execution flaw in Apache Log4j2 via unsafe JNDI lookups (“Log4Shell”). The CWE is CWE-502.

CVE-2017-0144 is a remote code execution vulnerability in Microsoft’s SMBv1 server (“EternalBlue”) due to a buffer overflow. The CWE is CWE-119.

CVE-2014-0160 is an information-disclosure bug in OpenSSL’s heartbeat extension (“Heartbleed”) causing out-of-bounds reads. The CWE is CWE-125.

CVE-2017-5638 is a remote code execution issue in Apache Struts 2’s Jakarta Multipart parser stemming from improper input validation of the Content-Type header. The CWE is CWE-20.

CVE-2019-0708 is a remote code execution vulnerability in Microsoft’s Remote Desktop Services (“BlueKeep”) triggered by a use-after-free. The CWE is CWE-416.

CVE-2015-10011 is a vulnerability about OpenDNS OpenResolve improper log output neutralization. The CWE is"""

# Tokenize the input
inputs = tokenizer(prompt, return_tensors="pt")

# Generate the response
outputs = model.generate(
    inputs["input_ids"],
    max_new_tokens=3,
    do_sample=True,
    temperature=0.1,
    top_p=0.9,
)

# Decode and print the response
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
response = response.replace(prompt, "").strip()
print(response)

Click the Run button on the left side of the third code cell. Without a GPU, inference takes a while (7 minutes in my case), and the result matches what the example code expects: "CWE-117".
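Because the example limits generation to `max_new_tokens=3`, the printed text is just a short fragment. If you want to use the answer in later code rather than only read it, a small regex can pull the identifier out. This is a sketch, not part of the official example: `extract_cwe` is a hypothetical helper, and the sample strings stand in for the `response` variable from the cell above.

```python
import re

# Matches identifiers such as "CWE-117".
CWE_PATTERN = re.compile(r"CWE-\d+")

def extract_cwe(text):
    """Return the first CWE identifier found in text, or None if there is none."""
    match = CWE_PATTERN.search(text)
    return match.group(0) if match else None

# Sample strings standing in for the `response` variable from the cell above.
print(extract_cwe(" CWE-117"))
print(extract_cwe("CWE-117.\n\nCVE"))
print(extract_cwe("no identifier here"))
```

In the notebook you would call `extract_cwe(response)` after the generation cell has run.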

If you want to speed up inference, you can upgrade Colab and connect to a GPU-enabled machine (which requires payment); this significantly reduces inference time.
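After switching the runtime, it is worth checking that PyTorch actually sees the GPU before loading the model again. The sketch below uses `torch.cuda.is_available()`; the `pick_device` helper is a name made up for this post, and it falls back to the CPU if `torch` is missing. Whether the model fits on a given GPU depends on that GPU's memory and on the precision the weights are loaded in, so this is a starting point rather than a guaranteed recipe.

```python
def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # torch is not installed, so there is nothing to run on a GPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"

device = pick_device()
print(f"Running on: {device}")

# In the notebook, with `model` and `inputs` already defined, the model and
# the tokenized prompt would then be moved to that device before generate()
# (left commented out so this cell also runs on its own):
# model = model.to(device)
# inputs = inputs.to(device)
```

The model and its inputs have to be on the same device, which is why both lines are shown together.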

I also tried Colab with an L4 GPU: inference took only about 1 second to print the result.