
Renaldi for AWS Community Builders


Exploring Amazon Bedrock: Harnessing Mistral Large, Mistral 7B, and Mistral 8X7B

In the evolving landscape of artificial intelligence, large language models (LLMs) like OpenAI's GPT-4 have been transformative, driving significant advancements and unlocking previously unattainable capabilities.


Initially, OpenAI's GPT series set the industry standard. However, as technology advanced, numerous new models emerged, each with unique strengths and specialized applications. Among these, the models from Mistral, particularly notable for their sophisticated reasoning and multilingual capacities, have marked a substantial progression in AI capabilities.

This guide delves into the latest offering from Mistral—Mistral Large—providing insights into its functionalities, performance comparisons, and real-world applications. It’s designed for a broad audience, from seasoned data scientists to developers and AI aficionados.

What is Mistral AI?

Mistral AI, established in April 2023 by former Meta Platforms and Google DeepMind employees, aims to revolutionize the AI market by delivering robust, open-source LLMs alongside commercial AI solutions. The launch of Mistral 7B in September 2023, a model with 7.3 billion parameters, notably outperformed other leading open-source models at the time, positioning Mistral AI as a frontrunner in open-source AI solutions.

Understanding Mistral Large

Launched in February 2024, Mistral Large is Mistral's flagship text generation model, with strong reasoning and multilingual capabilities across several languages, including English, French, Spanish, German, and Italian. It excels at code generation and mathematics, performing competitively against other leading models on benchmarks such as HumanEval, MBPP, and GSM8K.

Launching Your Experience with Mistral Large on Amazon Bedrock

To use Mistral models through Amazon Bedrock, you first need to request access to them in the Amazon Bedrock console under the "Model access" section, selecting the specific Mistral AI models you wish to use. Once access is granted, you can try the models interactively using the Chat or Text playgrounds in the console.

For more programmatic interactions, such as within an application, you can use the AWS Command Line Interface (CLI) or an AWS Software Development Kit (SDK). These tools let you make API calls to Amazon Bedrock to invoke the Mistral models, specifying the prompt, the maximum number of tokens, and other settings that control the model's generation behavior.

For example, you can send JSON-formatted requests specifying the model ID (like mistral.mistral-7b-instruct-v0:2 for the Mistral 7B model), the prompt, and other generation parameters. This setup enables you to integrate sophisticated AI-driven text generation, summarization, or other language processing tasks directly into your applications, leveraging the robust, scalable infrastructure of AWS. In this example, we'll discuss how we can call it from our Python code and integrate it into an application.

You'll first need to import the relevant libraries, each of which serves a distinct purpose. The json library handles encoding and decoding JSON data, the format used for Bedrock request and response bodies. The os library provides portable access to operating system functionality such as environment variables. The sys module exposes interpreter-level variables and functions, such as command line arguments. Finally, boto3 is the AWS SDK for Python, which lets Python scripts manage AWS services and resources and interact directly with services like Amazon Bedrock.

import json
import os
import sys
import boto3

If you're already using Amazon Bedrock, the code below should be boilerplate at this point. We initialize the bedrock_runtime client with the required AWS parameters so we can make API requests to the Mistral models.

bedrock_runtime = boto3.client(
    service_name='bedrock-runtime',
    # Credentials read from the standard AWS environment variables
    aws_access_key_id=os.getenv('AWS_ACCESS_KEY_ID'),
    aws_secret_access_key=os.getenv('AWS_SECRET_ACCESS_KEY'),
    region_name='us-west-2'
)
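If your environment is already configured with AWS credentials (for example via aws configure or an IAM role), you can skip passing keys explicitly and let boto3 resolve them through its default credential chain. A minimal sketch, assuming the same region as above:

import boto3

# boto3 falls back to its default credential chain: environment
# variables, ~/.aws/credentials, IAM roles, and so on.
bedrock_runtime = boto3.client('bedrock-runtime', region_name='us-west-2')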

We then configure the request that will invoke the "mistral.mistral-7b-instruct-v0:2" model to generate a short narrative inspired by the evolution of AI. The input_prompt variable sets the thematic prompt for the model's response. The body of the request is structured as JSON and includes the prompt, a limit of 512 tokens on the response length (max_tokens), and the sampling parameters top_p and temperature, both set to 0.9 to influence the diversity and unpredictability of the generated prose. The accept and contentType fields specify that both the request and the response are JSON, the standard format for web APIs.

input_prompt = "Write a short epic inspired by the evolution of AI."

modelId = "mistral.mistral-7b-instruct-v0:2"
body = json.dumps({
    "prompt": input_prompt,
    "max_tokens": 512,
    "top_p": 0.9,
    "temperature": 0.9,
})
accept = "application/json"
contentType = "application/json"

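One note on prompt formatting: the Amazon Bedrock documentation for the Mistral instruct models wraps prompts in an instruction template, which tends to produce better-behaved completions. A small sketch of the same request body with the tagged prompt, reusing the input_prompt from above:

# Wrap the prompt in the <s>[INST] ... [/INST] template documented
# for the Mistral instruct models on Bedrock.
formatted_prompt = f"<s>[INST] {input_prompt} [/INST]"

body = json.dumps({
    "prompt": formatted_prompt,
    "max_tokens": 512,
    "top_p": 0.9,
    "temperature": 0.9,
})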

With the request assembled, we call invoke_model on the Bedrock runtime client. This sends the JSON body to the specified model and returns a response object whose body contains the generated output.

response = bedrock_runtime.invoke_model(
    body=body,
    modelId=modelId,
    accept=accept,
    contentType=contentType
)
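In application code you will likely want to guard this call, since invoke_model can fail if model access has not been granted, the request is throttled, or the body is malformed. A minimal sketch using botocore's standard exception type:

from botocore.exceptions import ClientError

try:
    response = bedrock_runtime.invoke_model(
        body=body,
        modelId=modelId,
        accept=accept,
        contentType=contentType
    )
except ClientError as err:
    # Surfaces issues such as missing model access or throttling.
    print(f"Bedrock invocation failed: {err}")
    raise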

Next, we take the body of the response object, read it, and parse the JSON-encoded string into a Python dictionary. From that dictionary, we access the first element of the outputs list (index [0]) and retrieve the value associated with the key 'text'. Finally, we print the extracted text to the console, prefixed with "Generated Epic:", to display the result to the user.

response_body = json.loads(response['body'].read())
generated_epic = response_body['outputs'][0]['text']
print("Generated Epic:\n", generated_epic)
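Putting the pieces together, you might factor the request, invocation, and parsing into a small helper. This is a sketch rather than an official pattern; the generate_text name is my own, and it assumes the bedrock_runtime client and imports from earlier:

def generate_text(prompt, model_id="mistral.mistral-7b-instruct-v0:2",
                  max_tokens=512, temperature=0.9, top_p=0.9):
    """Invoke a Mistral model on Bedrock and return the generated text."""
    body = json.dumps({
        "prompt": prompt,
        "max_tokens": max_tokens,
        "top_p": top_p,
        "temperature": temperature,
    })
    response = bedrock_runtime.invoke_model(
        body=body,
        modelId=model_id,
        accept="application/json",
        contentType="application/json",
    )
    response_body = json.loads(response['body'].read())
    return response_body['outputs'][0]['text']

print(generate_text("Write a short epic inspired by the evolution of AI."))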

And it's as easy as that! There are many more prompts you can experiment with; here are a few examples:

Analyze the sentiment of the following text: {text}

Here, we ask the model to analyze the sentiment of the supplied text and return an appropriate response.
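As a quick sketch using the generate_text helper from above, with an invented example string standing in for the {text} placeholder:

# Example text; substitute whatever you want analyzed.
review = "The new dashboard is fast, but the setup docs were confusing."
sentiment = generate_text(f"Analyze the sentiment of the following text: {review}")
print(sentiment)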

Mixtral 8X7B is adept at generating code from high-level descriptions. Here's an example prompt for generating a Python function:

Generate a Python function that takes a list of numbers and returns the sum of their squares
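To route this prompt to the 8X7B model instead of 7B, only the model ID needs to change. A sketch using the generate_text helper from earlier; mistral.mixtral-8x7b-instruct-v0:1 is the Mixtral 8X7B Instruct ID listed in the Bedrock documentation at the time of writing, so verify it in your console:

code_suggestion = generate_text(
    "Generate a Python function that takes a list of numbers "
    "and returns the sum of their squares",
    model_id="mistral.mixtral-8x7b-instruct-v0:1",  # verify in your Bedrock console
    temperature=0.2,  # lower temperature keeps generated code more deterministic
)
print(code_suggestion)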

That said, in my opinion it is still somewhat lacking at logical reasoning, so I stick to using it mainly for multilingual tasks.

Practical Applications of Mistral Large

Mistral Large's versatility shines in fields ranging from content creation and customer support to programming help and educational resources. Its ability to understand and generate text in multiple languages makes it particularly valuable for global applications; use it to complement translation-related workloads, particularly when communicating between stakeholders who work in different languages.
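As an illustration, a translation prompt routed to Mistral Large could look like the sketch below; mistral.mistral-large-2402-v1:0 is the Mistral Large ID listed in the Bedrock documentation at the time of writing, so verify it in your console:

translation = generate_text(
    "Translate the following update into French and German: "
    "Our API now supports streaming responses.",
    model_id="mistral.mistral-large-2402-v1:0",  # verify in your Bedrock console
    temperature=0.3,
)
print(translation)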

Comparative Advantage

Mistral Large introduces advanced features like a 32K-token context window for processing large texts and support for system-level moderation prompts. Note that, unlike Mistral 7B and Mixtral 8X7B, which are released under the Apache 2.0 license, Mistral Large is a commercial model accessed via API. When compared to other LLMs such as GPT-4, Claude 2, and LLaMA 2 70B, Mistral Large offers competitive performance at a more accessible price point, particularly excelling in reasoning and multilingual tasks. That being said, if raw performance is your sole focus, it would be better to use the leading foundational models mentioned above, as Mistral Large cannot yet match them.

Looking Ahead

The roadmap for Mistral Large promises continued enhancements and community-driven improvements, aiming to further refine its reasoning capabilities and broaden its language support. I would very much like to see stronger community support around it, along with more powerful foundation models that are on par with the leading ones.

Conclusion

Mistral Large represents a significant step forward in AI, combining advanced reasoning, multilingual support, and coding prowess in a cost-effective model. Its strong performance across several benchmarks makes it a compelling alternative to commercial models like GPT-4.
