String Distance
One of the simplest ways to compare the string output of an LLM or chain against a reference label is to use a string distance measurement such as Levenshtein or postfix distance. This can be combined with approximate (fuzzy) matching criteria for very basic unit testing.
This can be accessed using the string_distance evaluator, which uses distance metrics from the rapidfuzz library.
Note: The returned scores are distances, meaning lower is typically "better".
For more information, check out the reference docs for the StringDistanceEvalChain.
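To make the idea of a "distance" concrete, here is a minimal from-scratch sketch of Levenshtein distance in plain Python. It is not the rapidfuzz implementation that the evaluator uses; the function names are hypothetical, and dividing by the longer string's length is just one common normalization convention, assumed here for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def normalized_levenshtein(a: str, b: str) -> float:
    """Scale the edit count into [0, 1] by the longer string's length (assumed convention)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


print(levenshtein("kitten", "sitting"))            # 3
print(normalized_levenshtein("kitten", "kitten"))  # 0.0
```

Identical strings score 0, and the score grows as more edits are needed, which is why lower is read as "better".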
# %pip install rapidfuzz
from langchain.evaluation import load_evaluator
evaluator = load_evaluator("string_distance")
evaluator.evaluate_strings(
    prediction="The job is completely done.",
    reference="The job is done",
)
{'score': 0.11555555555555552}
# The results are purely character-based, so they are less useful when negation is involved
evaluator.evaluate_strings(
    prediction="The job is done.",
    reference="The job isn't done",
)
{'score': 0.0724999999999999}
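The negation caveat matters most when the score feeds a pass/fail check. The sketch below applies a simple threshold to the two scores shown above; the helper name passes_fuzzy_match and the 0.1 tolerance are hypothetical choices for illustration, not part of the library.

```python
def passes_fuzzy_match(result: dict, max_distance: float = 0.1) -> bool:
    """Return True when the evaluator's distance score is within the tolerance."""
    return result["score"] <= max_distance


# Scores copied from the two examples above.
harmless_paraphrase = {"score": 0.11555555555555552}  # "completely done." vs "done"
negated_meaning = {"score": 0.0724999999999999}       # "is done." vs "isn't done"

print(passes_fuzzy_match(harmless_paraphrase))  # False
print(passes_fuzzy_match(negated_meaning))      # True
```

With this tolerance the sentence with the opposite meaning passes while the harmless paraphrase fails, so a threshold on string distance is best kept for cases where the wording itself, rather than the meaning, is what is being tested.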
Configure the String Distance Metric
By default, the StringDistanceEvalChain uses Jaro-Winkler distance (the scores above are Jaro-Winkler values, not Levenshtein ones), but it also supports other string distance algorithms. Configure this using the distance argument.
from langchain.evaluation import StringDistance

list(StringDistance)
[<StringDistance.DAMERAU_LEVENSHTEIN: 'damerau_levenshtein'>,
<StringDistance.LEVENSHTEIN: 'levenshtein'>,
<StringDistance.JARO: 'jaro'>,
<StringDistance.JARO_WINKLER: 'jaro_winkler'>]
jaro_evaluator = load_evaluator(
    "string_distance", distance=StringDistance.JARO
)
jaro_evaluator.evaluate_strings(
    prediction="The job is completely done.",
    reference="The job is done",
)
{'score': 0.19259259259259254}
jaro_evaluator.evaluate_strings(
    prediction="The job is done.",
    reference="The job isn't done",
)
{'score': 0.12083333333333324}
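To see where these numbers come from, here is a plain-Python sketch of the textbook Jaro and Jaro-Winkler definitions. It is an illustration, not rapidfuzz's code: the function names are hypothetical, and the prefix weight of 0.1, the four-character prefix cap, and the 0.7 boost threshold are the conventional values, assumed here.

```python
def jaro_similarity(s1: str, s2: str) -> float:
    """Textbook Jaro similarity in [0, 1]; 1.0 means identical."""
    if s1 == s2:
        return 1.0
    if not s1 or not s2:
        return 0.0
    window = max(max(len(s1), len(s2)) // 2 - 1, 0)  # how far apart matches may sit
    matched1 = [False] * len(s1)
    matched2 = [False] * len(s2)
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len(s2), i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                break
    matches = sum(matched1)
    if matches == 0:
        return 0.0
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2
    return (matches / len(s1) + matches / len(s2)
            + (matches - transpositions) / matches) / 3


def jaro_winkler_similarity(s1: str, s2: str, prefix_weight: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to four characters."""
    sim = jaro_similarity(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    if sim > 0.7:  # assumed conventional threshold for applying the boost
        sim += prefix * prefix_weight * (1.0 - sim)
    return sim


prediction, reference = "The job is completely done.", "The job is done"
print(round(1 - jaro_similarity(prediction, reference), 4))          # 0.1926
print(round(1 - jaro_winkler_similarity(prediction, reference), 4))  # 0.1156
```

Both sentence pairs share the four-character prefix "The ", so the Winkler adjustment multiplies each Jaro distance by 0.6, which reproduces the first pair of scores on this page from the Jaro scores above.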