<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Olumide Shittu</title>
    <description>The latest articles on DEV Community by Olumide Shittu (@olumideshittu).</description>
    <link>https://dev.to/olumideshittu</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1153220%2Fa09180d5-6eb7-488c-a9fd-a7ae34a5166d.jpg</url>
      <title>DEV Community: Olumide Shittu</title>
      <link>https://dev.to/olumideshittu</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/olumideshittu"/>
    <language>en</language>
    <item>
      <title>Orquesta raises €800,000 in pre-seed funding!</title>
      <dc:creator>Olumide Shittu</dc:creator>
      <pubDate>Mon, 02 Oct 2023 11:11:27 +0000</pubDate>
      <link>https://dev.to/orquesta/orquesta-raises-eu800000-in-pre-seed-funding-29o5</link>
      <guid>https://dev.to/orquesta/orquesta-raises-eu800000-in-pre-seed-funding-29o5</guid>
<description>&lt;p&gt;We are thrilled to announce that we have raised €800,000 in an oversubscribed pre-seed funding round to develop the world’s most innovative collaboration platform that empowers companies to seamlessly integrate and operate their products utilizing the capabilities of Large Language Models.&lt;/p&gt;

&lt;p&gt;Our funding round is led by Curiosity VC with Adriaan Mol’s investment office, Spacetime, as co-lead. In addition, Golden Egg Check Capital, and various angel investors such as Koen Köppen (Mollie), Milan Daniels, and Max Klijnstra (Otrium), as well as Arjé Cahn (Bloomreach), participated.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Wq0rw06I--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/9b5osgkujytva8cekgxy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Wq0rw06I--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/9b5osgkujytva8cekgxy.png" alt="Sohrab and Anthony" width="800" height="638"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The rapid advancement in generative artificial intelligence and the emergence of Large Language Models (LLMs) have led to explosive growth in the number of companies adopting this relatively new form of AI. The World Economic Forum even predicts that these tools, which utilize both external and internal datasets to generate new content such as texts, summaries, translations, and inputs for chatbots, will replace 44% of today’s work activities by 2027.&lt;/p&gt;

&lt;p&gt;Based on these developments and predictions, we have developed cutting-edge technology that enables our clients to seamlessly integrate and operate their products using the power of Large Language Models through a single collaboration platform. Our platform centralizes prompt management, streamlines experimentation, collects feedback, and provides real-time insights into performance and costs. It is compatible with all major Large Language Model providers, ensuring transparency and scalability in LLM Ops. Ultimately, this leads to shorter customer release cycles and reduced costs for both experiments and production environments. Orquesta is the first company to create such a platform in Europe.&lt;/p&gt;

&lt;h2&gt;
  
  
  Replacing a fragmented AI tech stack with one single gateway
&lt;/h2&gt;

&lt;p&gt;LLM models are in a constant state of evolution, and research from McKinsey shows that organizations are struggling to keep up in a secure and scalable way. Founded by &lt;a href="https://www.linkedin.com/in/sohrabhosseini/"&gt;Sohrab Hosseini&lt;/a&gt; and &lt;a href="https://www.linkedin.com/in/anthony-diaz-dev/"&gt;Anthony Diaz&lt;/a&gt;, Orquesta offers companies a single gateway through which various LLM models, including those from OpenAI, Azure, Google, Hugging Face, and Cohere, can be directly integrated into their business operations.&lt;/p&gt;

&lt;p&gt;All interactions and data are consolidated within one clear dashboard, simplifying the process of experimenting with different prompts in various models and facilitating localization and personalization. Additionally, the platform empowers clients of the technology startup to collect feedback directly from end users, domain experts, and other AI models. These features, combined with real-time insight into performance and costs, lead to optimized results at all times, reducing the release cycles of SaaS companies from weeks to minutes and minimizing the costs associated with experiments and production environments.&lt;/p&gt;

&lt;p&gt;As &lt;a href="https://orquesta.cloud/"&gt;Orquesta&lt;/a&gt; is a no-code platform, there is no programming work involved, which promotes efficiency and collaboration within software teams. Less time is required of engineers, product management maintains the needed overview, and internal domain experts can easily check that all interactions are accurate.&lt;/p&gt;

&lt;p&gt;We prioritize enterprise-grade security and privacy, providing organizations the tools to comply with EU regulations and build secure products.&lt;/p&gt;

&lt;h2&gt;
  
  
  €800,000 in pre-seed funding
&lt;/h2&gt;

&lt;p&gt;Today, we are proudly announcing the successful completion of our pre-seed funding round. A total of €800,000, double the targeted amount, has been raised to fund the development and launch of the platform and our focus on Europe. The round was led by Curiosity VC with Adriaan Mol’s investment office, Spacetime, as co-lead. In addition, Golden Egg Check Capital and various angel investors such as Koen Köppen (Mollie), Milan Daniels and Max Klijnstra (Otrium), and Arjé Cahn (Bloomreach) participated.&lt;/p&gt;

&lt;p&gt;“&lt;em&gt;The accelerated development of new AI functionalities leads to increased fragmentation, complexity and a need for speed within the technology landscape for our clients. This prevents the already scarce talent of our clients from focusing on strategic work and growth. With one LLM gateway, including all the necessary tooling, Orquesta is their partner to remain competitive.&lt;/em&gt;”, says Sohrab Hosseini, co-founder of Orquesta. “We are very excited about the focus and multidisciplinary experience of the various VCs and Angels who match and support our level of ambition.”&lt;/p&gt;

&lt;p&gt;Herman Kienhuis, managing partner of Curiosity VC: "&lt;em&gt;The availability of Large Language Models (LLMs) offers tremendous opportunities. With Orquesta, Sohrab and Anthony have developed a no-code platform that allows companies to easily configure, test, and manage complex prompts and various LLMs, leading the way in the market to assist companies in the application of these new Generative AI technologies.&lt;/em&gt;"&lt;/p&gt;

&lt;p&gt;“&lt;em&gt;Due to the acceleration of AI capabilities and adoption, we expect the landscape of LLM providers and models to grow exponentially. Orquesta has built a best-in-class no-code platform that allows engineers to focus on their proprietary product and shorten release cycles instead of managing LLM integrations, configurations, and rules”&lt;/em&gt;, says Adriaan Mol, founder of Mollie, MessageBird and Spacetime.&lt;/p&gt;

&lt;h2&gt;
  
  
  About Orquesta
&lt;/h2&gt;

&lt;p&gt;Orquesta was founded in 2022 by Sohrab Hosseini and Anthony Diaz. It empowers companies to seamlessly integrate and operate their products using the capabilities of Large Language Models through a unified collaboration platform.&lt;/p&gt;

&lt;p&gt;The platform centralizes prompt management, streamlines experimentation, collects feedback, and provides real-time insights into performance and costs. It's compatible with all major Large Language Model providers, ensuring transparency and scalability in Large Language Model Operations (LLM Ops), ultimately leading to shorter customer release cycles and reduced costs for both experiments and production environments.&lt;/p&gt;

&lt;h2&gt;
  
  
  About Curiosity
&lt;/h2&gt;

&lt;p&gt;Curiosity is a Dutch venture capital fund focused on early-stage investments in ambitious, diverse teams based in the Benelux, Nordics, and Baltics. The fund invests in AI-driven software companies that serve the world, not eat it. Curiosity is led by two experienced operator-investors, Herman Kienhuis and Maurice Beckand Verwee, supported by a community of expert advisors and portfolio company founders who are all co-owners of the fund. Curiosity has invested in, among others, Deeploy from Utrecht, Oslo-based Strise, and Dreamdata from Copenhagen.&lt;/p&gt;

&lt;p&gt;For more information, please visit &lt;a href="https://www.curiosityvc.com"&gt;https://www.curiosityvc.com&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  About Spacetime
&lt;/h2&gt;

&lt;p&gt;Spacetime is an independent investment vehicle deploying capital and hard-won experience to fellow founders. It provides flexible and long-term growth capital to technology companies in various industries. Through active involvement and through sharing its knowledge, network, and experience, gained from building companies such as Mollie and MessageBird, it supports entrepreneurs in achieving growth for their business. &lt;/p&gt;

&lt;p&gt;For more information, please visit &lt;a href="http://www.spacetime.nl"&gt;www.spacetime.nl&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>llmops</category>
      <category>promptengineering</category>
    </item>
    <item>
      <title>Integrate Orquesta with LangChain</title>
      <dc:creator>Olumide Shittu</dc:creator>
      <pubDate>Mon, 18 Sep 2023 13:10:09 +0000</pubDate>
      <link>https://dev.to/orquesta/integrate-orquesta-with-langchain-37ml</link>
      <guid>https://dev.to/orquesta/integrate-orquesta-with-langchain-37ml</guid>
<description>&lt;p&gt;Orquesta provides your product teams with no-code collaboration tooling to experiment, operate, and monitor LLMs and remote configurations within your SaaS. As an LLMOps engineer, you can use Orquesta to perform prompt engineering and prompt management, experiment in production, push new versions directly to production, and get full observability and monitoring.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://python.langchain.com/" rel="noopener noreferrer"&gt;LangChain&lt;/a&gt; is a framework for developing applications powered by large language models. It enables applications that are data-aware to connect a language model to other sources of data, and it allows a language model to interact with its environment. &lt;/p&gt;

&lt;p&gt;In this article, you will learn how to integrate Orquesta and LangChain. We will explain how to set up a prompt in Orquesta and request it from LangChain to predict an output. All of this is possible with the help of the Orquesta Python SDK and can be implemented in a few easy steps.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;For you to be able to follow along in this tutorial, you will need the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Jupyter Notebook (or any IDE of your choice).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Orquesta Python SDK.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 1 - Install SDK and create a client instance
&lt;/h3&gt;

&lt;p&gt;You can easily install the Orquesta Python SDK and LangChain via the Python package installer &lt;code&gt;pip&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;pip&lt;/span&gt; &lt;span class="n"&gt;install&lt;/span&gt; &lt;span class="n"&gt;orquesta&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;sdk&lt;/span&gt;
&lt;span class="n"&gt;pip&lt;/span&gt; &lt;span class="n"&gt;install&lt;/span&gt; &lt;span class="n"&gt;langchain&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will install the Orquesta SDK and LangChain on your local machine. Note that these commands only install the bare minimum requirements of LangChain; a lot of its value comes from integrating it with various model providers, data stores, etc.&lt;/p&gt;
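
&lt;p&gt;Since this tutorial uses LangChain's &lt;code&gt;ChatOpenAI&lt;/code&gt; model, you will also need the OpenAI Python package installed and an OpenAI API key available as an environment variable. A minimal setup sketch (the key value is a placeholder):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;pip install openai
export OPENAI_API_KEY="your-openai-api-key"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;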

&lt;p&gt;Grab your &lt;strong&gt;API Key&lt;/strong&gt; from Orquesta (&lt;code&gt;https://my.orquesta.dev/&amp;lt;workspace-name&amp;gt;/settings/developers&lt;/code&gt;), which will be used to create a client instance.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;orquesta_sdk&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OrquestaClient&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;OrquestaClientOptions&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;orquesta_sdk.prompts&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OrquestaPromptMetrics&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;OrquestaPromptMetricsEconomics&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;orquesta_sdk.helpers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;orquesta_openai_parameters_mapper&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.schema&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AIMessage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;HumanMessage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SystemMessage&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.chat_models&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatOpenAI&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.callbacks&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;get_openai_callback&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Explanation&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The &lt;code&gt;os&lt;/code&gt; and &lt;code&gt;time&lt;/code&gt; modules are imported: &lt;code&gt;time&lt;/code&gt; is used to measure how long the completion request takes, and &lt;code&gt;os&lt;/code&gt; is used to read environment variables.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The &lt;code&gt;OrquestaClient&lt;/code&gt; and the &lt;code&gt;OrquestaClientOptions&lt;/code&gt; classes which are already defined in the &lt;code&gt;orquesta_sdk&lt;/code&gt; module are imported.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;To be able to log all the interactions with the LLM, we use the &lt;code&gt;OrquestaPromptMetrics&lt;/code&gt; class.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Orquesta has helper functions that map and interface between Orquesta and specific LLM providers; for this integration, we will make use of the &lt;code&gt;orquesta_openai_parameters_mapper&lt;/code&gt; helper.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The &lt;a href="https://api.python.langchain.com/en/latest/schema/langchain.schema.messages.AIMessage.html" rel="noopener noreferrer"&gt;AIMessage&lt;/a&gt; class is a message from an AI, &lt;a href="https://api.python.langchain.com/en/latest/schema/langchain.schema.messages.HumanMessage.html" rel="noopener noreferrer"&gt;HumanMessage&lt;/a&gt; is a message from a human, and the &lt;a href="https://api.python.langchain.com/en/latest/schema/langchain.schema.messages.SystemMessage.html" rel="noopener noreferrer"&gt;SystemMessage&lt;/a&gt; is a message for priming AI behaviour, usually passed in as the first of a sequence of input messages.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The &lt;code&gt;ChatOpenAI&lt;/code&gt; class is LangChain's wrapper around the &lt;strong&gt;OpenAI&lt;/strong&gt; chat models API. To be able to use it, you should have the OpenAI Python package installed and the environment variable &lt;code&gt;OPENAI_API_KEY&lt;/code&gt; set with your API key.&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Initialize Orquesta client
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;orquesta_sdk&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OrquestaClient&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;OrquestaClientOptions&lt;/span&gt;

&lt;span class="n"&gt;api_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ORQUESTA-API-KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;options&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OrquestaClientOptions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ttl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3600&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OrquestaClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;&lt;p&gt;First, we create the instance of the &lt;code&gt;OrquestaClientOptions&lt;/code&gt; and configure it with the &lt;code&gt;api_key&lt;/code&gt; and the &lt;code&gt;ttl&lt;/code&gt; (Time to Live) in seconds for the local cache; by default, it is 3600 seconds (1 hour).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Then, an instance of the &lt;code&gt;OrquestaClient&lt;/code&gt; class is created and initialized with the previously configured options object. This client instance can now interact with the Orquesta service using the provided API key for authentication.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 2 - Set up a chat prompt
&lt;/h3&gt;

&lt;p&gt;Set up your chat prompt in the Orquesta dashboard. Make sure it is a chat prompt and not a completion prompt. Set your prompt key and domain (if you have any), and click &lt;strong&gt;Publish&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk89kcft21uk4llvn10et.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk89kcft21uk4llvn10et.png" alt="Set up a chat prompt"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once that is set up, create your first chat prompt, give it a name, and add all the necessary information. Click on &lt;strong&gt;Save&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7299dqrev0m5kzd6imrl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7299dqrev0m5kzd6imrl.png" alt="chat prompt"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As you can see from the screenshot, the prompt message is “&lt;strong&gt;What is a good name for a company that makes good beard oil&lt;/strong&gt;”, and the model is &lt;strong&gt;openai/gpt-3.5-turbo&lt;/strong&gt;. Click &lt;strong&gt;Save&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3 - Request a variant from Orquesta
&lt;/h3&gt;

&lt;p&gt;To request a specific variant of your newly created prompt, use the Code Snippet Generator: right-click on the prompt or open the Code Snippet component, and it generates the code for the prompt variant for you.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzg88h2zsmrgfpdbg8vsm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzg88h2zsmrgfpdbg8vsm.png" alt="Request a variant from Orquesta"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Copy the code snippet and paste it into your editor.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;prompts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;customer-support-chat&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;environments&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;test&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]},&lt;/span&gt;
    &lt;span class="n"&gt;variables&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;customer_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chain-id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;js2938js2ja&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 4 - Transform the message into LangChain format
&lt;/h3&gt;

&lt;p&gt;The prompt retrieved from Orquesta is transformed into the message format that LangChain expects.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Start time of the completion request
&lt;/span&gt;&lt;span class="n"&gt;start_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Start time: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;start_time&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]):&lt;/span&gt;
    &lt;span class="n"&gt;role&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;role&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;SystemMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;role&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;HumanMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;role&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;assistant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;AIMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;parameters&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;orquesta_openai_parameters_mapper&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;chat&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ChatOpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;temperature&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;openai_api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;api_key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;get_openai_callback&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;cb&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# End time of the completion request
&lt;/span&gt;    &lt;span class="n"&gt;end_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;End time: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;end_time&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Calculate the difference (latency) in milliseconds
&lt;/span&gt;    &lt;span class="n"&gt;latency&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;end_time&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start_time&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Latency is: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;latency&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;economics&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OrquestaPromptMetricsEconomics&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;total_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;cb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;total_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;completion_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;cb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completion_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;prompt_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;cb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;prompt_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Report the metrics back to Orquesta
&lt;/span&gt;    &lt;span class="n"&gt;metrics&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OrquestaPromptMetrics&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;economics&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;economics&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;llm_response&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;latency&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;latency&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_metrics&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metrics&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;metrics&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Explanation&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Initialize an empty list named &lt;code&gt;messages&lt;/code&gt;, which will store message objects.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A &lt;code&gt;for&lt;/code&gt; loop iterates through the list of messages obtained from &lt;code&gt;prompt.value&lt;/code&gt;. If no messages are found, an empty list is used as a default value.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Within the loop, the code extracts the &lt;code&gt;role&lt;/code&gt; and &lt;code&gt;content&lt;/code&gt; attributes from each message.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Depending on the &lt;code&gt;role&lt;/code&gt; of the message ("&lt;strong&gt;system&lt;/strong&gt;", "&lt;strong&gt;user&lt;/strong&gt;", or "&lt;strong&gt;assistant&lt;/strong&gt;"), a message object is created and appended to the messages list.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Pass the value of the prompt into the Orquesta OpenAI helper (&lt;code&gt;orquesta_openai_parameters_mapper&lt;/code&gt;) and store the result in the &lt;code&gt;parameters&lt;/code&gt; variable.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A &lt;code&gt;ChatOpenAI&lt;/code&gt; object is created with the mapped parameters, including the temperature and maximum tokens, which affect the behaviour of the language model. The &lt;code&gt;openai_api_key&lt;/code&gt; is provided as an argument, read here from the &lt;code&gt;OPENAI_API_KEY&lt;/code&gt; environment variable.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The &lt;code&gt;chat&lt;/code&gt; object is invoked with the &lt;code&gt;messages&lt;/code&gt; list as an argument. This processes the messages using the language model and generates a response.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Finally, the content of the response generated by the language model is printed to the console.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvja0dn8pbf9ki6j2d9nd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvja0dn8pbf9ki6j2d9nd.png" alt="Final predictions"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The response from the LLM is “&lt;strong&gt;Beard Bliss&lt;/strong&gt;”.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wrap up
&lt;/h2&gt;

&lt;p&gt;In conclusion, integrating the Orquesta SDK with LangChain combines the strengths of both platforms: you have set up a prompt in Orquesta, created a client, transformed the prompt into LangChain messages, generated a response through LangChain's OpenAI integration, and reported the metrics back to Orquesta. &lt;/p&gt;

&lt;h3&gt;
  
  
  Links
&lt;/h3&gt;

&lt;p&gt;Check out Orquesta &lt;a href="https://orquesta.cloud/developers/" rel="noopener noreferrer"&gt;documentation&lt;/a&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://orquesta.cloud/blog/llm-ops-integrate-orquesta-openai-python" rel="noopener noreferrer"&gt;Read the blog on how to Integrate Orquesta with OpenAI&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Follow us on &lt;a href="https://twitter.com/orquestacloud" rel="noopener noreferrer"&gt;Twitter&lt;/a&gt;, &lt;a href="https://www.linkedin.com/company/orquestacloud/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; and &lt;a href="https://github.com/orquestadev" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Full code
&lt;/h2&gt;

&lt;p&gt;Here is the full code for this tutorial.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;orquesta_sdk&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OrquestaClient&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;OrquestaClientOptions&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;orquesta_sdk.prompts&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OrquestaPromptMetrics&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;OrquestaPromptMetricsEconomics&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;orquesta_sdk.helpers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;orquesta_openai_parameters_mapper&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.schema&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AIMessage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;HumanMessage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SystemMessage&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.chat_models&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatOpenAI&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.callbacks&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;get_openai_callback&lt;/span&gt;

&lt;span class="c1"&gt;# Initialize Orquesta client
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;orquesta_sdk&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OrquestaClient&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;OrquestaClientOptions&lt;/span&gt;

&lt;span class="n"&gt;api_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ORQUESTA-API-KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;options&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OrquestaClientOptions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ttl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3600&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OrquestaClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;prompts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;customer-support-chat&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;environments&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;test&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]},&lt;/span&gt;
    &lt;span class="n"&gt;variables&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;customer_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chain-id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;js2938js2ja&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Start time of the completion request
&lt;/span&gt;&lt;span class="n"&gt;start_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Start time: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;start_time&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]):&lt;/span&gt;
    &lt;span class="n"&gt;role&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;role&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;SystemMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;role&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;HumanMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;role&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;assistant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;AIMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;parameters&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;orquesta_openai_parameters_mapper&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;chat&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ChatOpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;temperature&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;openai_api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;api_key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;get_openai_callback&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;cb&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# End time of the completion request
&lt;/span&gt;    &lt;span class="n"&gt;end_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;End time: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;end_time&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Calculate the difference (latency) in milliseconds
&lt;/span&gt;    &lt;span class="n"&gt;latency&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;end_time&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start_time&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Latency is: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;latency&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;economics&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OrquestaPromptMetricsEconomics&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;total_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;cb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;total_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;completion_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;cb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completion_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;prompt_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;cb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;prompt_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Report the metrics back to Orquesta
&lt;/span&gt;    &lt;span class="n"&gt;metrics&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OrquestaPromptMetrics&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;economics&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;economics&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;llm_response&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;latency&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;latency&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_metrics&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metrics&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;metrics&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>llm</category>
      <category>langchain</category>
      <category>llmops</category>
      <category>promptengineering</category>
    </item>
    <item>
      <title>Integrate Orquesta with Cohere using Python SDK</title>
      <dc:creator>Olumide Shittu</dc:creator>
      <pubDate>Mon, 18 Sep 2023 13:09:45 +0000</pubDate>
      <link>https://dev.to/orquesta/integrate-orquesta-with-cohere-using-python-sdk-13ba</link>
      <guid>https://dev.to/orquesta/integrate-orquesta-with-cohere-using-python-sdk-13ba</guid>
      <description>&lt;p&gt;&lt;a href="https://orquesta.cloud/" rel="noopener noreferrer"&gt;Orquesta&lt;/a&gt; provides your product teams with no-code collaboration tooling to experiment, operate, and monitor LLMs and remote configurations within your SaaS. Using Orquesta, you can easily perform prompt engineering, prompt management, experiment in production, push new versions directly to production, and roll back instantly. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://cohere.com/" rel="noopener noreferrer"&gt;Cohere&lt;/a&gt;, on the other hand, is an API that offers language processing to any system. It trains massive language models and puts them behind a very simple API. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foewx5fqw4sos5a99w3oq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foewx5fqw4sos5a99w3oq.png" alt="Cohere"&gt;&lt;/a&gt;&lt;br&gt;
Source: &lt;a href="https://cohere.com/" rel="noopener noreferrer"&gt;Cohere&lt;/a&gt;.&lt;/p&gt;
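
&lt;p&gt;To give a sense of how simple that API is, here is a minimal, standalone call to Cohere's generate endpoint (a sketch assuming the &lt;code&gt;cohere&lt;/code&gt; Python package; the API key, model name, and prompt text are placeholders):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import cohere

# Placeholder key; grab yours from the Cohere dashboard
co = cohere.Client("COHERE-API-KEY")

# Ask the "command" model for a short completion
response = co.generate(
    model="command",
    prompt="Write a one-line tagline for a beard oil brand",
    max_tokens=20,
)
print(response.generations[0].text)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;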

&lt;p&gt;This article guides you through integrating your SaaS with Orquesta and Cohere using our Python SDK. By the end of the article, you'll know how to set up a prompt in Orquesta, perform prompt engineering, request a prompt variant using our SDK code generator, map the Orquesta response with Cohere, send a payload to Cohere, and report the response back to Orquesta for observability and monitoring.&lt;/p&gt;
&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;For you to be able to follow along in this tutorial, you will need the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Jupyter Notebook (or any IDE of your choice).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Orquesta Python SDK.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Integration
&lt;/h2&gt;

&lt;p&gt;Follow these steps to integrate the Python SDK with Cohere.&lt;/p&gt;
&lt;h3&gt;
  
  
  Step 1 - Install SDK and create a client instance
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;pip&lt;/span&gt; &lt;span class="n"&gt;install&lt;/span&gt; &lt;span class="n"&gt;orquesta&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;sdk&lt;/span&gt;
&lt;span class="n"&gt;pip&lt;/span&gt; &lt;span class="n"&gt;install&lt;/span&gt; &lt;span class="n"&gt;cohere&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;To create a client instance, you need to have access to the Orquesta API key, which can be found in your workspace &lt;code&gt;https://my.orquesta.dev/&amp;lt;workspace-name&amp;gt;/settings/developers&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Copy it and add the following code to your notebook to initialize the Orquesta client.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;cohere&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;orquesta_sdk&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OrquestaClient&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;OrquestaClientOptions&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;orquesta_sdk.helpers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;orquesta_cohere_parameters_mapper&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;orquesta_sdk.prompts&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OrquestaPromptMetrics&lt;/span&gt;


&lt;span class="c1"&gt;# Initialize Orquesta client
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;orquesta_sdk&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OrquestaClient&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;OrquestaClientOptions&lt;/span&gt;

&lt;span class="n"&gt;api_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ORQUESTA-API-KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;options&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OrquestaClientOptions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ttl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3600&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OrquestaClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Explanation&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Import the &lt;code&gt;time&lt;/code&gt; module to measure how long the completion request takes.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;We also import &lt;code&gt;cohere&lt;/code&gt; so we can call the Cohere API.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The &lt;code&gt;OrquestaClient&lt;/code&gt; and &lt;code&gt;OrquestaClientOptions&lt;/code&gt; classes, already defined in the &lt;code&gt;orquesta_sdk&lt;/code&gt; module, are imported.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The Orquesta SDK has helper functions that map and interface between Orquesta and specific LLM providers. For this integration, we will make use of the &lt;code&gt;orquesta_cohere_parameters_mapper&lt;/code&gt; helper.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;To log all the interactions with Cohere, we use the &lt;code&gt;OrquestaPromptMetrics&lt;/code&gt; class.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;We create an instance of &lt;code&gt;OrquestaClientOptions&lt;/code&gt; and configure it with the &lt;code&gt;api_key&lt;/code&gt; and the &lt;code&gt;ttl&lt;/code&gt; (Time to Live) in seconds for the local cache; by default, it is &lt;strong&gt;3600&lt;/strong&gt; seconds (1 hour).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Finally, an instance of the &lt;code&gt;OrquestaClient&lt;/code&gt; class is created and initialized with the previously configured options object. This &lt;code&gt;client&lt;/code&gt; instance can now interact with the Orquesta service using the provided API key for authentication.&lt;/p&gt;
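
&lt;p&gt;If you prefer not to hard-code the API key, a minimal sketch of the same setup that reads the key from an environment variable could look like this (the variable name &lt;code&gt;ORQUESTA_API_KEY&lt;/code&gt; is only a convention; export whatever name you use):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import os

from orquesta_sdk import OrquestaClient, OrquestaClientOptions

# Read the key from the environment instead of hard-coding it.
# ORQUESTA_API_KEY is an assumed variable name; use whatever you exported.
api_key = os.environ.get("ORQUESTA_API_KEY", "__API_KEY__")

options = OrquestaClientOptions(api_key=api_key, ttl=3600)
client = OrquestaClient(options)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;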

&lt;h3&gt;
  
  
  Step 2 - Enable Cohere models in Model Garden
&lt;/h3&gt;

&lt;p&gt;Head over to Orquesta's Model Garden and enable the Cohere models you want to use.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fao09luw45pztprddbk3z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fao09luw45pztprddbk3z.png" alt="Enable Cohere models in Model Garden"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3 - Set up a completion prompt and variants
&lt;/h3&gt;

&lt;p&gt;The next step is to set up your completion prompt; make sure it is a completion prompt and not a chat prompt, since this integration uses Cohere's &lt;code&gt;generate()&lt;/code&gt; endpoint.&lt;/p&gt;

&lt;p&gt;To create a prompt, click on &lt;strong&gt;Add Prompt&lt;/strong&gt;, provide a &lt;strong&gt;prompt key&lt;/strong&gt;, a &lt;strong&gt;Domain&lt;/strong&gt; (optional) and select &lt;strong&gt;Completion&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5o32cqzqsw808jqxei69.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5o32cqzqsw808jqxei69.png" alt="Set up a completion prompt and variants"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once that is set up, create your first completion, give the prompt a name, add all the necessary information, and click &lt;strong&gt;Save&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe9f4nvtdanjz45a23xb4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe9f4nvtdanjz45a23xb4.png" alt="Set up a completion prompt and variants"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4 - Request a variant from Orquesta using the SDK
&lt;/h3&gt;

&lt;p&gt;Our flexible configuration matrix allows you to define multiple prompt variants based on custom context. This allows you to work with different prompts and hyperparameters based on, for example, environment, country, locale or user segment. The &lt;strong&gt;Code Snippet Generator&lt;/strong&gt; makes it easy to request a prompt variant.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffj2uko80pmfscbx4qvmm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffj2uko80pmfscbx4qvmm.png" alt="Code Snippet Generator"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once you open the Code Snippet Generator, copy the code snippet and paste it into your editor.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqdu1ext4ionzjs5i6hun.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqdu1ext4ionzjs5i6hun.png" alt="Code Snippet Generator"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Query the prompt from Orquesta
&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;prompts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data_completion&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;environments&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;test&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="n"&gt;variables&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
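
&lt;p&gt;The context keys you pass depend entirely on how your configuration matrix is set up. As a rough sketch, a workspace that also routes variants by country and user segment might query the prompt as below; the extra keys are illustrative and must match fields you have defined in Orquesta, and the &lt;code&gt;has_error&lt;/code&gt; check is a simple guard against a failed fetch.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Illustrative only: "country" and "user-segment" are hypothetical context
# fields that must exist in your own configuration matrix.
prompt = client.prompts.query(
  key="data_completion",
  context={
    "environments": ["test"],
    "country": ["NLD"],
    "user-segment": ["b2c"]
  },
  variables={}
)

if prompt.has_error:
    print("There was an error while fetching the prompt")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;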



&lt;h3&gt;
  
  
  Step 5 - Map the Orquesta response to Cohere using a Helper
&lt;/h3&gt;

&lt;p&gt;As we established at the beginning of this tutorial, to integrate these two technologies we make use of a Helper provided by Orquesta: &lt;code&gt;orquesta_cohere_parameters_mapper&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Start time of the completion request
&lt;/span&gt;&lt;span class="n"&gt;start_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Start time: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;start_time&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 

&lt;span class="n"&gt;co&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cohere&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;COHERE-API-KEY&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# Insert your Cohere API key
&lt;/span&gt;&lt;span class="n"&gt;completion&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;co&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="nf"&gt;orquesta_cohere_parameters_mapper&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;prompt&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# End time of the completion request
&lt;/span&gt;&lt;span class="n"&gt;end_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;End time: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;end_time&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Calculate the difference (latency) in milliseconds
&lt;/span&gt;&lt;span class="n"&gt;latency&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;end_time&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start_time&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Latency is: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;latency&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvnsgxygo9k7wjf8cadtn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvnsgxygo9k7wjf8cadtn.png" alt="Latency"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Explanation&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;We record the start time using the &lt;code&gt;time&lt;/code&gt; module.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;An instance of the Cohere &lt;code&gt;client&lt;/code&gt; is created.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Using the &lt;code&gt;generate()&lt;/code&gt; endpoint, we can generate realistic text conditioned on a given input.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The &lt;code&gt;generate()&lt;/code&gt; endpoint also accepts other body parameters, such as the &lt;strong&gt;prompt&lt;/strong&gt; as a required string, the &lt;strong&gt;model&lt;/strong&gt;, the &lt;strong&gt;num_generations&lt;/strong&gt;, &lt;strong&gt;max_tokens&lt;/strong&gt;, &lt;strong&gt;temperature&lt;/strong&gt;, etc. For simplicity, we are only working with the model and the prompt; a sketch with additional parameters follows this list.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;We record the end time and calculate the &lt;code&gt;latency&lt;/code&gt; in milliseconds.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
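
&lt;p&gt;As a minimal sketch of that last point about parameters, you could merge extra generation settings into the mapped values before calling &lt;code&gt;generate()&lt;/code&gt;. This assumes the helper returns a plain dictionary of parameters (which the &lt;code&gt;**&lt;/code&gt; usage above suggests), and the values shown are arbitrary examples, not recommendations.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Sketch: add extra generation parameters on top of the mapped prompt values.
# The numbers below are arbitrary examples, not recommendations.
params = orquesta_cohere_parameters_mapper(prompt.value)
params.update({
    "num_generations": 1,
    "max_tokens": 200,
    "temperature": 0.7,
})

completion = co.generate(
    model=prompt.value.get("model"),
    prompt=prompt.value.get("prompt"),
    **params,
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;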

&lt;h3&gt;
  
  
  Step 6 - Report analytics back to Orquesta
&lt;/h3&gt;

&lt;p&gt;After each query, Orquesta generates a log with a Trace ID. Using the &lt;code&gt;add_metrics()&lt;/code&gt; method, you can add additional information, such as the &lt;strong&gt;llm_response&lt;/strong&gt;, &lt;strong&gt;metadata&lt;/strong&gt;, &lt;strong&gt;latency&lt;/strong&gt;, and &lt;strong&gt;economics&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Tokenize responses
&lt;/span&gt;&lt;span class="n"&gt;prompt_tokenization&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;co&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;prompt&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;completion_tokenization&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;co&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;completion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;generations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;prompt_tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt_tokenization&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;completion_tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;completion_tokenization&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;total_tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;prompt_tokens&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;completion_tokens&lt;/span&gt;

&lt;span class="c1"&gt;# Report the metrics back to Orquesta
&lt;/span&gt;&lt;span class="n"&gt;metrics&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OrquestaPromptMetrics&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;economics&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;total_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;total_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;completion_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;completion_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;prompt_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;prompt_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;llm_response&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;completion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;generations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;latency&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;latency&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;finish_reason&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;completion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;generations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;finish_reason&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_metrics&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metrics&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;metrics&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;With these easy steps, you have successfully integrated Orquesta with Cohere. This is just the tip of the iceberg: at the time of writing this article, Orquesta only supports the &lt;code&gt;generate()&lt;/code&gt; endpoint, but in the future you will be able to use the other endpoints, such as &lt;code&gt;embed&lt;/code&gt;, &lt;code&gt;classify&lt;/code&gt;, &lt;code&gt;summarize&lt;/code&gt;, &lt;code&gt;detect-language&lt;/code&gt;, etc.&lt;/p&gt;
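
&lt;p&gt;For reference, those other endpoints live on the same Cohere client you already created; a rough, hedged sketch of calling the embed endpoint directly (outside of the Orquesta prompt flow) could look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Rough sketch: calling another Cohere endpoint on the same client.
# Not yet wired into Orquesta; check Cohere's docs for current parameters.
embed_response = co.embed(texts=["Orquesta centralizes prompt management."])
print(len(embed_response.embeddings[0]))  # dimensionality of the first embedding
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;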

&lt;p&gt;Orquesta supports other &lt;a href="https://orquesta.cloud/developers/sdk-overview" rel="noopener noreferrer"&gt;SDKs&lt;/a&gt; such as Angular, Node.js, React, and TypeScript. Refer to our &lt;a href="https://orquesta.cloud/developers/api-introduction" rel="noopener noreferrer"&gt;documentation&lt;/a&gt; for more information.&lt;/p&gt;

&lt;h2&gt;
  
  
  Full Code Example
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;cohere&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;orquesta_sdk&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OrquestaClient&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;OrquestaClientOptions&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;orquesta_sdk.helpers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;orquesta_cohere_parameters_mapper&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;orquesta_sdk.prompts&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OrquestaPromptMetrics&lt;/span&gt;

&lt;span class="c1"&gt;# Initialize Orquesta client
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;orquesta_sdk&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OrquestaClient&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;OrquestaClientOptions&lt;/span&gt;

&lt;span class="n"&gt;api_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ORQUESTA-API-KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;options&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OrquestaClientOptions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ttl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3600&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OrquestaClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;co&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cohere&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;COEHERE-API-KEY&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;prompts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data_completion&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;environments&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;test&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="n"&gt;variables&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;45515&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Start time of the completion request
&lt;/span&gt;&lt;span class="n"&gt;start_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Start time: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;start_time&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 

&lt;span class="n"&gt;completion&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;co&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="nf"&gt;orquesta_cohere_parameters_mapper&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;prompt&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# End time of the completion request
&lt;/span&gt;&lt;span class="n"&gt;end_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;End time: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;end_time&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Calculate the difference (latency) in milliseconds
&lt;/span&gt;&lt;span class="n"&gt;latency&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;end_time&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start_time&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Latency is: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;latency&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Tokenize responses
&lt;/span&gt;&lt;span class="n"&gt;prompt_tokenization&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;co&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;prompt&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;completion_tokenization&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;co&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;completion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;generations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;prompt_tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt_tokenization&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;completion_tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;completion_tokenization&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;total_tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;prompt_tokens&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;completion_tokens&lt;/span&gt;

&lt;span class="c1"&gt;# Tokenize responses
&lt;/span&gt;&lt;span class="n"&gt;prompt_tokenization&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;co&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;prompt&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;completion_tokenization&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;co&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;completion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;generations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;prompt_tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt_tokenization&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;completion_tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;completion_tokenization&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;total_tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;prompt_tokens&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;completion_tokens&lt;/span&gt;

&lt;span class="c1"&gt;# Report the metrics back to Orquesta
&lt;/span&gt;&lt;span class="n"&gt;metrics&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OrquestaPromptMetrics&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;economics&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;total_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;total_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;completion_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;completion_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;prompt_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;prompt_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;llm_response&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;completion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;generations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;latency&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;latency&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;finish_reason&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;completion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;generations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;finish_reason&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_metrics&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metrics&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;metrics&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>llm</category>
      <category>cohere</category>
      <category>promptengineering</category>
      <category>ai</category>
    </item>
    <item>
      <title>Integrate Orquesta with OpenAI using Python SDK</title>
      <dc:creator>Olumide Shittu</dc:creator>
      <pubDate>Mon, 18 Sep 2023 13:09:25 +0000</pubDate>
      <link>https://dev.to/orquesta/integrate-orquesta-with-openai-using-python-sdk-2mlk</link>
      <guid>https://dev.to/orquesta/integrate-orquesta-with-openai-using-python-sdk-2mlk</guid>
      <description>&lt;p&gt;&lt;a href="https://orquesta.cloud/" rel="noopener noreferrer"&gt;Orquesta&lt;/a&gt; is a powerful LLM Ops suite designed to manage both public and private LLMs (Large Language Models) from a single source. It offers full transparency on performance and costs while reducing your release cycles from weeks to minutes. Integrating Orquesta into your current setup or a new workflow requires a couple of lines of code, ensuring seamless collaboration and transparency in prompt engineering and prompt management for your team.&lt;/p&gt;

&lt;p&gt;With Orquesta, you gain access to several LLM Ops features, enabling your team to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Collaborate directly across product, engineering, and domain expert teams.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Manage prompts for both public and private LLM models.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Customize and localize prompt variants based on your data model.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Push new versions directly to production and roll back instantly.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Obtain model-specific token and cost estimates.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Gain insights into model-specific costs, performance, and latency.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Gather both quantitative and qualitative end-user feedback.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Experiment in production and gather real-world feedback.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Make decisions grounded in real-world information.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This article guides you through integrating your SaaS with Orquesta and OpenAI using our Python SDK. By the end of the article, you'll know how to set up a prompt in Orquesta, perform prompt engineering, request a prompt variant using the SDK code generator, map the Orquesta response with OpenAI, send a payload to OpenAI, and report the response back to Orquesta for observability and monitoring.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;To follow along with this tutorial, you will need the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Jupyter Notebook (or any IDE of your choice).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;An OpenAI account; you can sign up &lt;a href="https://platform.openai.com/signup?launch" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Orquesta Python SDK.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Integration
&lt;/h2&gt;

&lt;p&gt;Follow these steps to integrate the Python SDK with OpenAI.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1 - Install SDK and create a client instance
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;pip&lt;/span&gt; &lt;span class="n"&gt;install&lt;/span&gt; &lt;span class="n"&gt;orquesta&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;sdk&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To create a client instance, you need to have access to the Orquesta API key, which can be found in your workspace &lt;code&gt;https://my.orquesta.dev/&amp;lt;workspace-name&amp;gt;/settings/developers&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzhedhfcadcsi88ow5jd9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzhedhfcadcsi88ow5jd9.png" alt="api key"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Copy it and add the following code to your notebook to initialize the Orquesta client.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;orquesta_sdk&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OrquestaClient&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;OrquestaClientOptions&lt;/span&gt;

&lt;span class="n"&gt;api_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;ORQUESTA_API_KEY&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;options&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OrquestaClientOptions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ttl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3600&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OrquestaClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;OrquestaClient&lt;/code&gt; and &lt;code&gt;OrquestaClientOptions&lt;/code&gt; classes, which are already defined in the &lt;code&gt;orquesta_sdk&lt;/code&gt; module, are imported. The &lt;strong&gt;API key&lt;/strong&gt;, which is used for authentication, is assigned to the variable &lt;code&gt;api_key&lt;/code&gt;. You can either add the API key this way, or you can read it from an environment variable: &lt;code&gt;api_key = os.environ.get("ORQUESTA_API_KEY", "__API_KEY__")&lt;/code&gt;. An instance of the &lt;code&gt;OrquestaClientOptions&lt;/code&gt; class is created and configured with the &lt;code&gt;api_key&lt;/code&gt; and the &lt;code&gt;ttl&lt;/code&gt; (Time to Live) in seconds for the local cache; by default, it is &lt;strong&gt;3600&lt;/strong&gt; seconds (1 hour).&lt;/p&gt;

&lt;p&gt;Finally, an instance of the &lt;code&gt;OrquestaClient&lt;/code&gt; class is created and initialized with the previously configured options object. This &lt;code&gt;client&lt;/code&gt; instance can now interact with the Orquesta service using the provided API key for authentication.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2 - Set up a prompt and its variants
&lt;/h3&gt;

&lt;p&gt;After successfully connecting to Orquesta, you continue within the Orquesta Admin Panel to set up your prompt and variants. A prompt is the specific task you provide to the LLM, and you'll get a response that is the output of the language model accomplishing the task. To create a prompt, click on &lt;strong&gt;Add Prompt&lt;/strong&gt; and provide the &lt;strong&gt;prompt key&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxmbgwo2k2bq1zgeapnow.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxmbgwo2k2bq1zgeapnow.png" alt="Set up a prompt"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6qzve428enp5ewwgr9yy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6qzve428enp5ewwgr9yy.png" alt="Set up its variants"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The image above represents the Prompt Studio in Orquesta, where:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;The name of the prompt variant. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Notes: this is where you drop notes for other collaborators.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Since we are working on a chat prompt, this is where you manage the System-User-Assistant messages.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Prompt variables provide flexibility in your prompts.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Prompt tokens and cost are estimated based on the model selected.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Model Selector.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Click &lt;strong&gt;Save&lt;/strong&gt; once you are done.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Step 3 - Request a variant from Orquesta using the SDK
&lt;/h3&gt;

&lt;p&gt;Our flexible configuration matrix allows you to define multiple prompt variants based on custom context. This allows you to work with different prompts and hyperparameters based on, for example, environment, country, locale or user segment. The Code Snippet Generator makes it easy to request a prompt variant.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fonn10mco2ol1jxjmwzvr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fonn10mco2ol1jxjmwzvr.png" alt="Request a variant from Orquesta"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once you open the Code Snippet Generator, you can use the generated snippet to consume your first prompt from your application.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faz649ndelilop6h4dyjm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faz649ndelilop6h4dyjm.png" alt="Request a variant from Orquesta"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4 - Map the Orquesta response to OpenAI using a Helper
&lt;/h3&gt;

&lt;p&gt;Map the Orquesta response to OpenAI's API using the Helper functions. Each LLM provider has its own Helper function in Orquesta. &lt;/p&gt;

&lt;p&gt;For OpenAI, use the Helper &lt;code&gt;orquesta_openai_parameters_mapper&lt;/code&gt; or the Class &lt;code&gt;OrquestaOpenAIPromptParameters&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;orquesta_sdk&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OrquestaClient&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;OrquestaClientOptions&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;orquesta_sdk.helpers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;orquesta_openai_parameters_mapper&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;orquesta_sdk.prompts&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OrquestaPromptMetrics&lt;/span&gt;

&lt;span class="n"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;OPENAI_API_KEY&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Paste the code copied from the Code Snippet Generator here.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Query the prompt from Orquesta
&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;prompts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;customer-support-chat&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;environments&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;test&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;country&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;BEL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;NLD&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;locale&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;en&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user-segment&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;b2c&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="n"&gt;variables&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;customer_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;John&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;45515&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;has_error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;There was an error while fetching the prompt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can now send the payload to OpenAI and receive the response.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Start time of the completion request
&lt;/span&gt;&lt;span class="n"&gt;start_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Start time: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;start_time&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;completion&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ChatCompletion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="nf"&gt;orquesta_openai_parameters_mapper&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# End time of the completion request
&lt;/span&gt;&lt;span class="n"&gt;end_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;End time: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;end_time&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Calculate the difference (latency) in milliseconds
&lt;/span&gt;&lt;span class="n"&gt;latency&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;end_time&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start_time&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Latency is: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;latency&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Freflmtihnt8hnaf426ys.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Freflmtihnt8hnaf426ys.png" alt="Latency"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 5 - Report analytics back to Orquesta
&lt;/h3&gt;

&lt;p&gt;After each query, Orquesta generates a log with a Trace ID. Using the &lt;code&gt;add_metrics()&lt;/code&gt; method, you can attach additional information to that log, such as the &lt;strong&gt;llm_response&lt;/strong&gt;, &lt;strong&gt;metadata&lt;/strong&gt;, &lt;strong&gt;latency&lt;/strong&gt;, and &lt;strong&gt;economics&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Report the metrics back to Orquesta
&lt;/span&gt;&lt;span class="n"&gt;metrics&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OrquestaPromptMetrics&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;economics&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;total_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;completion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;total_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;completion_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;completion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;completion_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;prompt_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;completion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;prompt_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;llm_response&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;completion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;latency&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;latency&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;finish_reason&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;completion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;finish_reason&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_metrics&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metrics&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;metrics&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;And that is it: you have integrated Orquesta with OpenAI using the Python SDK! With Orquesta you can easily design, test, and manage prompts for all your LLM providers, leveraging its powerful tooling: real-time logs, versioning, code snippets, and a playground for your prompts.&lt;/p&gt;

&lt;p&gt;Orquesta supports other &lt;a href="https://orquesta.cloud/developers/sdk-overview" rel="noopener noreferrer"&gt;SDKs&lt;/a&gt; such as Angular, Node.js, React, and TypeScript. Refer to our &lt;a href="https://orquesta.cloud/developers/api-introduction" rel="noopener noreferrer"&gt;documentation&lt;/a&gt; for more information.&lt;/p&gt;

&lt;h2&gt;
  
  
  Full Code Example
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;orquesta_sdk&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OrquestaClient&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;OrquestaClientOptions&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;orquesta_sdk.helpers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;orquesta_openai_parameters_mapper&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;orquesta_sdk.prompts&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OrquestaPromptMetrics&lt;/span&gt;

&lt;span class="n"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;OPENAI_API_KEY&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="c1"&gt;# Initialize Orquesta client
&lt;/span&gt;
&lt;span class="n"&gt;api_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;ORQUESTA_API_KEY&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;options&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OrquestaClientOptions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ttl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3600&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OrquestaClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Query the prompt from Orquesta
&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;prompts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;customer-support-chat&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;environments&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;test&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;country&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;BEL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;NLD&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;locale&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;en&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user-segment&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;b2c&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="n"&gt;variables&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;customer_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;John&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;45515&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;has_error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;There was an error while fetching the prompt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Start time of the completion request
&lt;/span&gt;&lt;span class="n"&gt;start_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Start time: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;start_time&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;completion&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ChatCompletion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="nf"&gt;orquesta_openai_parameters_mapper&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# End time of the completion request
&lt;/span&gt;&lt;span class="n"&gt;end_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;End time: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;end_time&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Calculate the difference (latency) in milliseconds
&lt;/span&gt;&lt;span class="n"&gt;latency&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;end_time&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start_time&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Latency is: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;latency&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Report the metrics back to Orquesta
&lt;/span&gt;&lt;span class="n"&gt;metrics&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OrquestaPromptMetrics&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;economics&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;total_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;completion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;total_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;completion_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;completion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;completion_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;prompt_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;completion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;prompt_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;llm_response&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;completion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;latency&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;latency&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;finish_reason&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;completion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;finish_reason&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_metrics&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metrics&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;metrics&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>openai</category>
      <category>llm</category>
      <category>promptengineering</category>
      <category>ai</category>
    </item>
    <item>
      <title>Lifecycle of a Prompt: A Guide to Effective Prompts</title>
      <dc:creator>Olumide Shittu</dc:creator>
      <pubDate>Mon, 18 Sep 2023 12:54:06 +0000</pubDate>
      <link>https://dev.to/orquesta/lifecycle-of-a-prompt-a-guide-to-effective-prompts-5dfk</link>
      <guid>https://dev.to/orquesta/lifecycle-of-a-prompt-a-guide-to-effective-prompts-5dfk</guid>
      <description>&lt;h2&gt;
  
  
  1. Design &amp;amp; Experiment
&lt;/h2&gt;

&lt;p&gt;The design phase is the first stage of the prompt lifecycle and involves developing the initial prompt or an idea for improving an existing one. Designing a new prompt is more than just the text you come up with: it's essential to create a prompt that is effective and has good prompt economics. A complete prompt is a combination of the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Prompt text.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;LLM provider and specific model.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Configuration of hyperparameters.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Any combination of these should be considered a unique prompt and handled throughout the prompt lifecycle. During this phase, one of the pain points is keeping track of all the prompt variations you are considering or developing.&lt;/p&gt;
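
&lt;p&gt;As a minimal illustration (not Orquesta's data model, just a sketch), such a unique prompt variant could be represented in Python as a small structure that bundles these three elements:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from dataclasses import dataclass, field

# Illustrative sketch: one unique combination of text, model, and hyperparameters
@dataclass
class PromptVariant:
    prompt_text: str
    provider: str                    # e.g. "openai"
    model: str                       # e.g. "gpt-3.5-turbo"
    hyperparameters: dict = field(default_factory=dict)

# Two variants that differ only in temperature are still two distinct prompts
v1 = PromptVariant("You are a helpful support agent.", "openai", "gpt-3.5-turbo", {"temperature": 0.2})
v2 = PromptVariant("You are a helpful support agent.", "openai", "gpt-3.5-turbo", {"temperature": 0.9})
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;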

&lt;p&gt;A new prompt or new variation of an existing one can have different goals:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Adding new features and use cases.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Adding new LLMs and models for existing use cases.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Continuous improvement efforts identified in stage 5 of the prompt lifecycle.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Prompt optimization to reduce costs while maintaining effectiveness.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Improving user satisfaction, based on collected feedback.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Specific tooling for this stage consists of the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Selecting the right LLM provider and specific model.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The configuration of hyperparameters that results in high-quality responses.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Estimation of prompt economics.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A playground to preview potential responses and test variations.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  2. Differentiate &amp;amp; Personalize
&lt;/h2&gt;

&lt;p&gt;After designing a prompt, the second stage involves differentiating and personalizing it and its hyperparameters. This stage requires specific tooling for managing variations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The granular roll-out of new experiments and versions across different environments.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Defining rules and conditions for serving specific variants.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Versioning of prompts and configurations.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Intuitive variation testing.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
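
&lt;p&gt;As a rough sketch of what such rules and conditions can look like (illustrative only; in Orquesta these rules are configured in the platform rather than hand-rolled in code), a variant is served when its conditions are satisfied by the request context:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Illustrative sketch of rule-based variant selection (not the Orquesta SDK)
variants = [
    {"rule": {"environments": "production", "locale": "en"}, "version": "v3"},
    {"rule": {"environments": "test"}, "version": "v4-experiment"},
    {"rule": {}, "version": "v1-default"},  # empty rule acts as the fallback
]

def select_variant(context, variants):
    """Return the first variant whose rule is fully satisfied by the context."""
    for variant in variants:
        rule = variant["rule"]
        if all(context.get(key) == value for key, value in rule.items()):
            return variant
    return None

print(select_variant({"environments": "test", "locale": "nl"}, variants))
# {'rule': {'environments': 'test'}, 'version': 'v4-experiment'}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;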

&lt;h2&gt;
  
  
  3. Serve &amp;amp; Operate
&lt;/h2&gt;

&lt;p&gt;Once a new prompt is considered ready for application use, the changes must propagate through the DevOps cycle across the different environments (development, test, acceptance, production). But even in production, applications may still need customization of prompts. These customizations can be driven by the following (illustrated in the sketch after this list):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Roles.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Personalization.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Localization.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Product tiers and subscriptions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Canary releases and A/B testing.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
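
&lt;p&gt;With the Orquesta Python SDK, these customizations translate into the context you pass when querying a prompt. The sketch below assumes a client initialized with your own API key; the specific context keys are illustrative and depend on the rules you configure for the prompt:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from orquesta_sdk import OrquestaClient, OrquestaClientOptions

# Assumes your own Orquesta API key
options = OrquestaClientOptions(api_key="&amp;lt;ORQUESTA_API_KEY&amp;gt;", ttl=3600)
client = OrquestaClient(options)

# Illustrative context: environment, locale, and user segment drive which
# prompt variant is served; the keys must match the rules you configured.
prompt = client.prompts.query(
    key="customer-support-chat",
    context={
        "environments": ["production"],
        "locale": ["nl"],
        "user-segment": ["b2c"],
    },
    variables={"customer_name": "John"},
    metadata={"user_id": 45515},
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;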

&lt;p&gt;Specific tooling required in this phase:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;MLOps tools.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Low-latency and secure serving of prompts.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Real-time logging and insights.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Versioning and rollbacks.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Customization of the prompts and configurations being served, based on the request context.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Tracing and auditing of prompts for compliance.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  4. Analyze Feedback &amp;amp; Adapt
&lt;/h2&gt;

&lt;p&gt;The fourth stage of the prompt lifecycle involves monitoring and collecting feedback. This stage requires specific tooling such as logs, analytics, and feedback software. Pain points associated with this stage include the need for transparency about which prompt is served in which context, and about the qualitative and quantitative feedback on its effectiveness. Here, qualitative feedback means user satisfaction, and quantitative feedback means performance and prompt economics. These data points, collected asynchronously, must be tied back to the specific prompt that was served and evaluated. Additionally, since you're running in production, DevOps tooling is needed to keep control of your tech stack.&lt;/p&gt;

&lt;p&gt;Specific tooling required in this phase:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Quality and satisfaction feedback collection.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Prompt economics and performance metrics across LLMs and models.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Kill switches and feature flags.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
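
&lt;p&gt;As a sketch of the data points involved (illustrative only, not the Orquesta API), both kinds of feedback need to be tied back to the trace of the prompt that was served:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Illustrative feedback record, keyed by the trace ID of the served prompt.
# Quantitative feedback: latency and token usage (prompt economics).
# Qualitative feedback: a user satisfaction rating collected asynchronously.
feedback_record = {
    "trace_id": "c1a2b3d4",            # hypothetical trace ID from the prompt log
    "prompt_key": "customer-support-chat",
    "variant_version": "v3",
    "quantitative": {
        "latency_ms": 842.7,
        "total_tokens": 312,
        "finish_reason": "stop",
    },
    "qualitative": {
        "user_rating": 4,              # e.g. a 1-5 satisfaction score
        "thumbs_up": True,
    },
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;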

&lt;p&gt;Finally, the last stage of the lifecycle closes the continuous improvement loop by analyzing and improving the prompt. Based on the qualitative and quantitative feedback collected, improvements can be hypothesized and (re)designed in stage 1.&lt;/p&gt;

&lt;p&gt;Tooling required in this phase:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Data analysis of served prompt variations and performance over time.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Dashboards.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Recommendation agents.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By understanding the prompt lifecycle and the pain points associated with each stage, product teams can anticipate these issues and choose the right tools for remediation.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Current State of Tooling Solutions for Prompt Lifecycle Management
&lt;/h2&gt;

&lt;p&gt;Many prompt engineering practitioners are still in the honeymoon period of excitement, building shallow gadgets. There are not yet established best practices or tooling for conducting DevOps for your prompt-infused LLM applications. This is quite an underserved niche.&lt;/p&gt;

&lt;p&gt;A lot of the existing tooling focuses on teams that do more fundamental work, such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Building their own models.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Training and fine-tuning with custom or proprietary data.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Storage and compute for LLM models.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Addressing the Limitations of Current Tooling Solutions
&lt;/h2&gt;

&lt;p&gt;To summarize, the current gaps we see in tooling for prompt engineering are:&lt;/p&gt;

&lt;h3&gt;
  
  
  Prompt lifecycle management
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;No-code collaboration.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Multi-LLM and multi-modal configurations.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Versioning.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Integrated multi-LLM playgrounds.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Single source of truth preventing fragmentation.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Business Rules and Remote Configurations
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Context-aware serving of prompt variations.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Personalization.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Localization.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  DevOps capabilities
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Kill-switches and feature flags.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Staged roll-outs and canary releases.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Enterprise-grade security.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Traceability and auditing of served prompts.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Orquesta AI Prompts
&lt;/h2&gt;

&lt;p&gt;We empower product teams to do prompt engineering with the Orquesta Cloud suite of building blocks.&lt;/p&gt;

&lt;p&gt;Orquesta AI Prompts adds prompt engineering capabilities to your existing tech stack with just a couple of lines of code and enables you to conduct LLM Ops. Get a grip on your prompt lifecycle management with enterprise-grade tooling.&lt;/p&gt;

&lt;p&gt;Read more about &lt;a href="https://orquesta.cloud/platform/llm-ops-prompt-engineering"&gt;Orquesta AI Prompts&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>python</category>
      <category>llm</category>
      <category>promptengineering</category>
      <category>llmops</category>
    </item>
  </channel>
</rss>
