Iris: Add FAQ consistency check #61

Open
wants to merge 60 commits into base: main

Conversation

cremertim
Contributor

@cremertim cremertim commented Mar 14, 2025

Motivation

To ensure the quality and consistency of FAQ entries, a dedicated consistency-checking mechanism has been introduced. This allows automated detection of structural or semantic inconsistencies in FAQ data, improving content maintainability and supporting editors during the revision process.

Description

  • Extended the rewriting pipeline to check FAQs for consistency
  • Returns the detected inconsistencies along with an improved suggestion

Summary by CodeRabbit

  • New Features
    • Introduced automated FAQ consistency validation for rewritten content, providing users with feedback on inconsistencies, suggestions for improvement, and an improved version when available.
    • Added support for retrieving FAQs by course and performing semantic searches within FAQs.
  • Enhancements
    • Status updates now display detected inconsistencies and suggested improvements when inconsistencies are found.
  • Bug Fixes
    • Minor improvements to the formatting and handling of status updates.

@cremertim cremertim requested a review from a team as a code owner March 14, 2025 07:54
@github-actions github-actions bot added the iris label Mar 14, 2025
@cremertim cremertim changed the title Add FAQ consistency tab IRIS: Add FAQ consistency tab Mar 14, 2025
@cremertim cremertim changed the title IRIS: Add FAQ consistency tab Iris: Add FAQ consistency tab Mar 14, 2025
@cremertim cremertim requested review from a team as code owners March 14, 2025 08:00
@cremertim cremertim changed the title Iris: Add FAQ consistency tab Iris: Add FAQ consistency check Mar 14, 2025
There hasn't been any activity on this pull request recently. Therefore, this pull request has been automatically marked as stale and will be closed if no further activity occurs within seven days. Thank you for your contributions.

@cremertim
Contributor Author

Reopening, since I am back to working on it.

@cremertim cremertim reopened this May 9, 2025
bassner
bassner previously approved these changes Jun 6, 2025
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (3)
iris/src/iris/pipeline/rewriting_pipeline.py (3)

126-170: Add error handling for JSON parsing and fix logic order.

The method has several issues that need to be addressed:

  1. Logic order: Empty FAQ check should happen before extracting properties
  2. Missing error handling: the LLM call and the JSON parsing lack try/except blocks
     def check_faq_consistency(
         self, faqs: List[dict], final_result: str
     ) -> Dict[str, str]:
         """
         Checks the consistency of the given FAQs with the provided final_result.
         Returns "consistent" if there are no inconsistencies, otherwise returns "inconsistent".

         :param faqs: List of retrieved FAQs.
         :param final_result: The result to compare the FAQs against.

         """
-        properties_list = [entry["properties"] for entry in faqs]
-
         if not faqs:
             return {"type": "consistent", "message": "No FAQs to check"}

+        properties_list = [entry["properties"] for entry in faqs]
+
         consistency_prompt = faq_consistency_prompt.format(
             faqs=properties_list, final_result=final_result
         )

         prompt = PyrisMessage(
             sender=IrisMessageRole.SYSTEM,
             contents=[TextMessageContentDTO(text_content=consistency_prompt)],
         )

-        response = self.request_handler.chat(
-            [prompt], CompletionArguments(temperature=0.0), tools=None
-        )
+        try:
+            response = self.request_handler.chat(
+                [prompt], CompletionArguments(temperature=0.0), tools=None
+            )
+        except Exception as e:
+            logging.error(f"Error in FAQ consistency check: {e}")
+            return {"type": "error", "message": f"Failed to check consistency: {str(e)}"}

         self._append_tokens(response.token_usage, PipelineEnum.IRIS_REWRITING_PIPELINE)
         result = response.contents[0].text_content

         if result.startswith("```json"):
             result = result.removeprefix("```json").removesuffix("```").strip()
         elif result.startswith("```"):
             result = result.removeprefix("```").removesuffix("```").strip()

-        data = json.loads(result)
+        try:
+            data = json.loads(result)
+        except json.JSONDecodeError as e:
+            logging.error(f"Failed to parse JSON response: {e}")
+            return {"type": "error", "message": f"Invalid JSON response: {str(e)}"}

         result_dict = {}
         keys_to_check = ["type", "message", "faqs", "suggestion", "improved version"]
         for key in keys_to_check:
             if key in data:
                 result_dict[key] = data[key]
         return result_dict
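
The parsing part of the suggestion above can be exercised in isolation. The following is a minimal sketch, not the PR's code: `strip_code_fence` and `parse_consistency_response` are hypothetical helper names, and the extra check that the parsed value is a JSON object is an assumption added for illustration. The key list is copied from `keys_to_check`.

```python
import json
from typing import Any, Dict

# Keys copied from the keys_to_check list in the suggestion above.
EXPECTED_KEYS = ("type", "message", "faqs", "suggestion", "improved version")

# Built programmatically so this example contains no literal code fence.
FENCE = "`" * 3


def strip_code_fence(text: str) -> str:
    """Remove a surrounding Markdown code fence, if the model added one."""
    text = text.strip()
    if text.startswith(FENCE + "json"):
        text = text.removeprefix(FENCE + "json")
    elif text.startswith(FENCE):
        text = text.removeprefix(FENCE)
    return text.removesuffix(FENCE).strip()


def parse_consistency_response(raw: str) -> Dict[str, Any]:
    """Hypothetical helper: parse the LLM reply and keep only the expected keys."""
    try:
        data = json.loads(strip_code_fence(raw))
    except json.JSONDecodeError as e:
        # Report the failure to the caller instead of raising.
        return {"type": "error", "message": f"Invalid JSON response: {e}"}
    if not isinstance(data, dict):
        # Assumption: anything other than a JSON object is treated as an error.
        return {"type": "error", "message": "Expected a JSON object"}
    return {key: data[key] for key in EXPECTED_KEYS if key in data}
```

Returning an error dictionary rather than raising keeps the return type uniform, which matches the shape used in the suggested diff; whether callers should distinguish `"error"` from `"inconsistent"` is left to the PR author.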

172-194: Fix docstring inconsistency and unused parameter.

The get_variants method has the same issues identified in previous reviews that remain unaddressed.

     @classmethod
     def get_variants(cls, available_llms: List[LanguageModel]) -> List[FeatureDTO]:
         """
-        Returns available variants for the FaqIngestionPipeline based on available LLMs.
+        Returns available variants for the RewritingPipeline.

         Args:
-            available_llms: List of available language models
+            available_llms: List of available language models (currently unused)

         Returns:
             List of FeatureDTO objects representing available variants
         """

197-202: Fix missing decorator and syntax error.

The function has critical issues that prevent compilation:

  1. Missing @staticmethod decorator
  2. Syntax error: Double quotes reused inside a double-quoted f-string, which fails to parse before Python 3.12
+    @staticmethod
     def parse_faq_inconsistencies(inconsistencies: List[Dict[str, str]]) -> List[str]:
         parsed_inconsistencies = [
-            f"FAQ ID: {entry["faq_id"]}, Title: {entry["faq_question_title"]}, Answer: {entry["faq_question_answer"]}"
+            f"FAQ ID: {entry['faq_id']}, Title: {entry['faq_question_title']}, Answer: {entry['faq_question_answer']}"
             for entry in inconsistencies
         ]
         return parsed_inconsistencies
🧰 Tools
🪛 Pylint (3.3.7)

[error] 199-199: Parsing failed: 'f-string: unmatched '[' (iris.pipeline.rewriting_pipeline, line 199)'

(E0001)
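
For reference, a self-contained sketch of the corrected method. `FaqFormatter` is a hypothetical holder class used only to make the example runnable on its own; in the PR the method presumably lives on the pipeline class in rewriting_pipeline.py.

```python
from typing import Dict, List


class FaqFormatter:
    """Hypothetical holder class, used here only for illustration."""

    @staticmethod
    def parse_faq_inconsistencies(inconsistencies: List[Dict[str, str]]) -> List[str]:
        # Single quotes inside the double-quoted f-strings parse on every
        # Python version that supports f-strings (3.6+), not only 3.12+.
        return [
            f"FAQ ID: {entry['faq_id']}, "
            f"Title: {entry['faq_question_title']}, "
            f"Answer: {entry['faq_question_answer']}"
            for entry in inconsistencies
        ]
```

With the `@staticmethod` decorator in place, the method can be called on the class or on an instance without an implicit `self` argument being passed.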

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 778e89f and 5871c47.

📒 Files selected for processing (2)
  • iris/src/iris/domain/rewriting_pipeline_execution_dto.py (1 hunks)
  • iris/src/iris/pipeline/rewriting_pipeline.py (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • iris/src/iris/domain/rewriting_pipeline_execution_dto.py
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Build and Push to GitHub Container Registry / Build linux/amd64 Docker Image for ghcr.io/ls1intum/edutelligence/iris
  • GitHub Check: Build and Push to GitHub Container Registry / Build linux/arm64 Docker Image for ghcr.io/ls1intum/edutelligence/iris
  • GitHub Check: Lint
🔇 Additional comments (3)
iris/src/iris/pipeline/rewriting_pipeline.py (3)

1-3: LGTM: Import additions support new FAQ consistency functionality.

The added imports for json, Dict type, and other dependencies are appropriate for the new FAQ consistency checking feature.


27-34: LGTM: New imports properly organized.

The imports for LanguageModel, FeatureDTO, faq_consistency_prompt, VectorDatabase, and FaqRetrieval are correctly placed and necessary for the new functionality.


58-62: LGTM: Database and retrieval initialization is appropriate.

The initialization of VectorDatabase and FaqRetrieval in the constructor properly sets up the dependencies needed for FAQ consistency checking.

@ls1intum ls1intum deleted a comment from github-actions bot Jun 23, 2025

2 participants