Testing applications that use large language models (LLMs) for natural language processing can be challenging. consisTent, a new testing framework, aims to make this task more straightforward and reproducible.
The goal of consisTent is to enable reproducible tests for LLM-based applications, regardless of the fine-tuning method used. consisTent ships with two types of validators: Syntactic and Semantic. Syntactic Validators perform static assertions on the LLM output, such as validating that the output is well-formed JSON, asserting its schema, or checking that it is a valid piece of code. Semantic Validators, on the other hand, assert the quality of the response against "softer" criteria, such as checking whether something is factually correct, checking for hallucinations, or checking for labels like "funny" or "interesting." One of these, the Semantic Consistency Validator, lets developers provide a seed of validated input and a threshold, then asserts that the semantic distance between a new output and the seed cluster stays within that threshold.
Syntactic Validators are used to assert the FORM of the response, whereas Semantic Validators are used to assert the CONTENT of the response.
Examples
Let's take a look at some examples of how consisTent can be used to validate LLM output.
Syntactic Validator Examples:
consisTent.JsValidator().validate('console.log("I\'m a JS program!")')
consisTent.PyValidator().validate('print("I\'m a Python program!")')
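To make the idea behind syntactic validation concrete, here is a minimal sketch of what a JSON-format check could look like, built only on Python's standard library. This is an illustration of the concept, not consisTent's actual implementation; the function name `validate_json` and the `required_keys` parameter are hypothetical.

```python
import json

def validate_json(output: str, required_keys: set[str]) -> bool:
    """Hypothetical sketch of a syntactic check: is the LLM output
    valid JSON, and does it contain the expected top-level keys?"""
    try:
        parsed = json.loads(output)
    except json.JSONDecodeError:
        return False
    # A crude schema assertion: the result must be an object
    # containing every required key.
    return isinstance(parsed, dict) and required_keys <= parsed.keys()

# A well-formed response passes; a truncated one fails.
print(validate_json('{"answer": "42", "source": "model"}', {"answer"}))  # True
print(validate_json('{"answer": "42"', {"answer"}))                      # False
```

Note that this kind of check is fully deterministic, which is what makes syntactic tests easy to run in CI.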
Semantic Validator Example:
import consisTent

seed = [
    "the cat sat on the mat",
    "the feline lay on the carpet",
]

consisTent.ConsistencyValidator(
    seed_size=2,
    consistency_threshold=0.5,
).validate(
    seed=seed,
    model_output="the dog sat on the mat",
)
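To build intuition for what a semantic-consistency check does, here is a toy sketch using bag-of-words vectors and cosine similarity. consisTent's real validator presumably uses proper embeddings; the functions `bow`, `cosine`, and `is_consistent` below are hypothetical names invented for this illustration.

```python
from collections import Counter
from math import sqrt

def bow(text: str) -> Counter:
    """Toy bag-of-words vector; a real implementation would use embeddings."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def is_consistent(seed: list[str], output: str, threshold: float) -> bool:
    """Pass if the output is at least `threshold`-similar to every seed sentence."""
    out_vec = bow(output)
    return min(cosine(bow(s), out_vec) for s in seed) >= threshold

seed = ["the cat sat on the mat", "the feline lay on the carpet"]
print(is_consistent(seed, "the dog sat on the mat", 0.5))  # True
```

The key design point is the same as in consisTent's validator: instead of asserting an exact string, the test asserts that a new output stays close enough to a cluster of known-good outputs.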
Label Test Example:
OPENAI_KEY = "XXXXXXXXXXXXXXX"
consisTent.LabelsValidator(openai_key=OPENAI_KEY).validate(
    labels=[
        "funny",
        "short",
        "about rabbits",
    ],
    model_output="What do you call a rabbit that tells jokes? A funny bunny!",
)
Facts Validation Example:
OPENAI_KEY = "XXXXXXXXXXXXXXX"
consisTent.FactsValidator(openai_key=OPENAI_KEY).validate(
    facts=["this car weighs 1000KG"],
    model_output="The average person can lift this car",
)
Installation of the consisTent package is easy via pip: pip install consistent
Overall, consisTent is an excellent framework for testing LLM-based applications. Its ability to handle both syntactic and semantic validation in a reproducible manner makes it a valuable tool for anyone working with large language models.
Psst, please star us on GitHub!