KeyError: 'correctness' when running evaluate in giskard.rag #2111

kunjanshah0811 · 2025-02-17T00:09:24Z

When running evaluate from giskard.rag with the metrics ragas_context_recall and ragas_context_precision, the following error occurs:

from giskard.rag import evaluate, RAGReport, AgentAnswer
from giskard.rag.metrics.ragas_metrics import ragas_context_recall, ragas_context_precision

def answer_fn(question, history=None):
    # chat_engine is defined elsewhere (not shown in this report)
    answer = chat_engine.chat(question, chat_history=[])

    return AgentAnswer(
        message=answer.response,
        documents=[source.content for source in answer.sources],
    )

# testset and knowledge_base are created earlier (not shown in this report)
report = evaluate(
    answer_fn,
    testset=testset,
    knowledge_base=knowledge_base,
    metrics=[ragas_context_recall, ragas_context_precision],
)
report.save("test_report")

ERROR

KeyError                                  Traceback (most recent call last)
File c:\Users\KUNJAN SHAH\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\indexes\base.py:3805, in Index.get_loc(self, key)
   3804 try:
-> 3805     return self._engine.get_loc(casted_key)
   3806 except KeyError as err:

File index.pyx:167, in pandas._libs.index.IndexEngine.get_loc()

File index.pyx:196, in pandas._libs.index.IndexEngine.get_loc()

File pandas\\_libs\\hashtable_class_helper.pxi:7081, in pandas._libs.hashtable.PyObjectHashTable.get_item()

File pandas\\_libs\\hashtable_class_helper.pxi:7089, in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'correctness'

alexcombessie · 2025-03-12T13:18:01Z

Hey @henchaves - is this a known issue?

henchaves · 2025-03-18T09:56:39Z

Hello @kunjanshah0811, thanks for reporting this issue.

Could you confirm that you only get this error when calling the report.save method? I've just tried calling it and it works fine here, so I'll need more information about your setup to reproduce your environment. Could you share your OS, Python version, and pip list output?
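If it helps, here is a minimal sketch for collecting that information in one go. It uses only the standard library; the `collect_env_info` helper and the package list are illustrative assumptions (the list just names libraries mentioned in this thread), not part of giskard.

```python
import platform
import sys
from importlib import metadata

# Packages mentioned in this thread; extend the list as needed (assumption).
PACKAGES = ["giskard", "ragas", "langchain", "pandas"]


def collect_env_info(packages=PACKAGES):
    """Return the OS, Python version, and installed versions of `packages`."""
    info = {
        "os": platform.platform(),
        "python": sys.version.split()[0],
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report missing packages instead of failing.
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    for key, value in collect_env_info().items():
        print(f"{key}: {value}")
```

The full `pip list` output is still useful on top of this, since the sketch only checks the packages it is told about.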

Just so you know, we constrained the versions of some external libs (such as ragas and langchain) that were causing incompatibilities with the generate_testset and evaluate methods. Until that is released, you can try the giskard version from #2122.

GTimothee · 2025-04-06T14:37:24Z

Something that could be happening:

We always compute correctness, even when only ragas_context_recall and ragas_context_precision are passed, as they are here. That explains why correctness shows up in the error.
We use an LLM to generate JSON, and the error here is KeyError: 'correctness', which may mean that the LLM simply failed to generate the correctness key.

So this may be a combination of a bug in the evaluate function and an LLM problem.
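To illustrate the second point, here is a rough sketch of how a missing key in an LLM's JSON reply turns into this kind of KeyError, and how a defensive parser could avoid it. The JSON shapes and the `parse_correctness` helper are hypothetical and are not giskard's actual implementation.

```python
import json
from typing import Optional


def parse_correctness(raw_output: str) -> Optional[bool]:
    """Return the 'correctness' flag from an LLM's JSON reply, or None if absent."""
    try:
        payload = json.loads(raw_output)
    except json.JSONDecodeError:
        return None  # the model did not return valid JSON at all
    if not isinstance(payload, dict):
        return None  # valid JSON, but not an object
    value = payload.get("correctness")
    # Only accept a real boolean; anything else counts as missing.
    return value if isinstance(value, bool) else None


# Hypothetical replies: one well-formed, one where the model dropped the key.
good_reply = '{"correctness": true, "reason": ""}'
bad_reply = '{"reason": "The answer matches the reference."}'

print(parse_correctness(good_reply))  # True
print(parse_correctness(bad_reply))   # None

# Naive access on the incomplete reply raises the same exception type
# as the one in the report.
try:
    json.loads(bad_reply)["correctness"]
except KeyError as err:
    print(f"KeyError: {err}")  # KeyError: 'correctness'
```

In the report the KeyError comes from a pandas index lookup rather than a plain dict, but the idea is the same: the lookup assumes a "correctness" entry that was never produced.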

henchaves self-assigned this Mar 18, 2025

GTimothee mentioned this issue Apr 7, 2025

Improve RAGET implementation #2132

Open
