drorIvry

consisTent: A "Unit Testing" Framework for Prompts


Testing applications built on large language models (LLMs) can be challenging: the same prompt can produce a different output on every call. consisTent, a new testing framework, aims to make this task more straightforward and reproducible.

The goal of consisTent is to create reproducible tests for LLM-based applications regardless of the fine-tuning method used. consisTent ships two kinds of validators: Syntactic and Semantic. Syntactic Validators make static assertions about the LLM output, such as validating that the output is valid JSON, asserting its schema, or checking that the output is a valid piece of code. Semantic Validators, on the other hand, assert the quality of the response against "softer" criteria, such as whether a statement is factually correct, whether the model hallucinated, or whether the output matches labels like "funny" or "interesting." A third option, the Semantic Consistency Validator, lets developers provide a seed of validated outputs and a threshold, then asserts that a new output falls within that semantic distance of the seed cluster.

Syntactic Validators are used to assert the FORM of the response, whereas Semantic Validators are used to assert the CONTENT of the response.
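To make the FORM-vs-CONTENT distinction concrete, here is a minimal sketch of what a syntactic JSON check boils down to. The `validate_json` helper below is hypothetical and uses only the standard library; it is not consisTent's actual implementation or API:

```python
import json

def validate_json(model_output: str) -> bool:
    """Hypothetical helper: return True if the output parses as valid JSON.

    This asserts only the FORM of the output (is it well-formed JSON?),
    not its CONTENT -- that is the job of semantic checks.
    """
    try:
        json.loads(model_output)
        return True
    except json.JSONDecodeError:
        return False

# A well-formed JSON object passes; free-form text does not.
assert validate_json('{"answer": 42}')
assert not validate_json("the answer is 42")
```

A semantic check, by contrast, would have to judge whether `"answer": 42` is the *right* answer, which is why those validators lean on embeddings or an LLM judge.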

Examples

Let's take a look at some examples of how consisTent can be used to validate LLM output.

Syntactic Validator Examples:

import consisTent

consisTent.JsValidator().validate('console.log("I\'m a JS program!")')
consisTent.PyValidator().validate('print("I\'m a Python program!")')
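The general technique behind a code-validity check like `PyValidator` can be approximated with the standard library's `ast` module. This is a sketch of the idea, not consisTent's actual implementation:

```python
import ast

def is_valid_python(source: str) -> bool:
    """Hypothetical check: does the string parse as syntactically valid Python?"""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False

# Well-formed code parses; a truncated string literal does not.
assert is_valid_python('print("I\'m a Python program!")')
assert not is_valid_python('print("unterminated')
```

Note that this only proves the code *parses*; it says nothing about whether it runs or does the right thing.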

Semantic Validator Example:

import consisTent

seed = [
    "the cat sat on the mat",
    "the feline lay on the carpet",
]

consisTent.ConsistencyValidator(
    seed_size=2,
    consistency_threshold=0.5,
).validate(
    seed=seed,
    model_output="the dog sat on the mat",
)

Label Test Example:

OPENAI_KEY = "XXXXXXXXXXXXXXX"

consisTent.LabelsValidator(openai_key=OPENAI_KEY).validate(
    labels=[        
        "funny",        
        "short",        
        "about rabbits",
    ],
    model_output="What do you call a rabbit that tells jokes? A funny bunny!",
)

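`LabelsValidator` takes an OpenAI key because it delegates the "soft" judgment to an LLM. The exact prompt it sends is internal to the library, so the `build_label_prompt` helper below is purely illustrative of the pattern:

```python
def build_label_prompt(labels: list[str], model_output: str) -> str:
    """Hypothetical prompt an LLM judge could answer to assert soft labels.

    consisTent's actual prompt wording is internal; this only shows the
    general shape of an LLM-as-judge label check.
    """
    label_list = "\n".join(f"- {label}" for label in labels)
    return (
        "Answer YES or NO: does the following text match ALL of these labels?\n"
        f"{label_list}\n\n"
        f"Text: {model_output}"
    )

prompt = build_label_prompt(
    ["funny", "short", "about rabbits"],
    "What do you call a rabbit that tells jokes? A funny bunny!",
)
assert "funny" in prompt and "about rabbits" in prompt
```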

Facts Validation Example:

OPENAI_KEY = "XXXXXXXXXXXXXXX"

consisTent.FactsValidator(openai_key=OPENAI_KEY).validate(
    facts=["this car weighs 1000KG"],
    model_output="The average person can lift this car",
)
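Fact validation follows the same LLM-as-judge pattern: ask whether the output contradicts any of the provided facts. The `facts_validator` sketch below is an assumption about how such a check could work, not the library's implementation; taking the judge as a callable lets us exercise the plumbing with a stub and no API key:

```python
from typing import Callable

def facts_validator(facts: list[str], model_output: str,
                    ask_llm: Callable[[str], str]) -> bool:
    """Hypothetical fact check: pass unless the judge finds a contradiction."""
    prompt = (
        "Given these facts:\n"
        + "\n".join(f"- {f}" for f in facts)
        + "\n\nDoes this statement contradict any of them? Answer YES or NO.\n"
        + f"Statement: {model_output}"
    )
    # A "NO" verdict means no contradiction was found, so the check passes.
    return ask_llm(prompt).strip().upper().startswith("NO")

# Stubbed judge: pretend the LLM found no contradiction.
assert facts_validator(["this car weighs 1000KG"], "the car is heavy",
                       ask_llm=lambda p: "NO")
```

In the real validator the failing case above ("the average person can lift a 1000KG car") is exactly what the LLM judge is there to catch.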

Installation of the consisTent package is easy via pip: pip install consistent

Overall, consisTent is an excellent framework for testing LLM-based applications. Its ability to handle both syntactic and semantic validations in a reproducible manner makes it a valuable tool for anyone working with large language models.

Psst, please star us on GitHub
