Fix major bugs in serialization #266

Amnah199 · 2025-04-03T11:49:55Z

Chat messages are not serialized properly.

Amnah199 · 2025-04-03T11:52:35Z

haystack_experimental/core/pipeline/pipeline.py

            # NOTE: if the list has only one item, return just that item unless we are dealing with an AnswerBuilder
            if len(transformed) == 1:
                # NOTE: This is a workaround only for the AnswerBuilder component


This part is whats causing the issue with deserializing ChatMessages. The _content is supposed to be a list but this step removes the list around it. I am checking how to fix this without causing other components to crash.

ah, good catch! I think it's enough:

if isinstance(component, AnswerBuilder) or isinstance(component, ChatMessage):

I guess more instances will keep coming up where removing a list will raise an error. This hack will also keep getting longer. WDYT?

Ideally the original type of value should be preserved, whether a str, int or list.

haystack_experimental/core/pipeline/pipeline.py

davidsbatista · 2025-04-03T13:26:59Z

haystack_experimental/core/pipeline/pipeline.py

-                # we need to wrap the value in a list
-                if isinstance(component, AnswerBuilder):
-                    return transformed
+            if len(data) == 1 and isinstance(data[0], dict) and "value" in data[0] and "sender" in data[0]:


I think this line + the return can run before the line above

Amnah199 · 2025-04-03T15:20:42Z

haystack_experimental/core/pipeline/pipeline.py

+        if isinstance(value, GeneratedAnswer):  # noqa: SIM102
+            if value.meta and "usage" in value.meta:  # noqa: SIM102
+                value.meta.pop("usage", None)


As we are not calling serilization on component inputs, this needed to be changed for AnswerJoiner tests to pass.

* initial import from work done on haystack branch * adding missing Exception and code examples * adding missing Exception * David/pipeline testing (#254) * adding ByteStream example * serialising posixpath objects * fixing tests * fixing namespace import * fixing namespace import * adding examples to ruff check excludes * typo * test cases with loops * Add the last component's input to debug state * Revert "Add the last component's input to debug state" This reverts commit fb98b3d. * Add the last component's input to debug state (#256) * fix: fixing deserialisation issues - working version with Hybrid Pipeline tests (#257) * adding ByteStream example * serialising posixpath objects * fixing tests * fixing namespace import * fixing namespace import * adding examples to ruff check excludes * typo * adding verbosity to examples * fixing deserialisation issues * linting * Fix deserilzation errors in resuming (#258) * Fix deserilzation erros * Adding tests * adding tests * nit cleaning * adding debug_path to pipeline and tests for breakpoint/resume * working for all components in a Hybrid RAG * linting * linting * fixing tests * adding extra dependencies * fixing pyproject.toml * replacing tests model by a smaller one --------- Co-authored-by: David S. Batista <[email protected]> * fix: starting from clean state (#260) * wip: debugging more pipelines * wip: debugging more pipelines * adding more tests * cleaning * cleaning * cleaning * cleaning * cleaning * cleaning * typing * fix: removing all sender keys before serialisation (#261) * removing all sender keys * cleaning * cleaning * removing unused function * fixing header file issue * fix: hybrid pipeline working with stop/resume at each breakpoing (#262) * wip: debugging hybrids * wip: debugging hybrids * cleaning * ruffing * removing unused imports * Add a test file for loops * Fix major bugs in serialization (#266) * Fix issues with chat messages * Update the transform structure method to preserve data type of value * Fixed the issue with GeneratedAnswer * fix: `Adaptor` will need `unsafe=True` for the break/resume to work properly (#267) * adaptor needs unsafe=True + another extra list in a Joiner component * adding missing dependencies * triggering end2end tests for answers joiner pipeline * adding pipeline with list joiner tests * test: adding one more test with another Joiner (#268) * cleaning * marking tests as integration tests * nit documentation * fixing pipeline.py * PR comments * Remove passing component param to resume functions * Include load state in the resume process * Improve based on comments * Update haystack_experimental/core/pipeline/pipeline.py Co-authored-by: Sebastian Husch Lee <[email protected]> * only breaking at the exact visit number * moving imports to the top * Improvements * Improve deserialization * Fix linting * Add warning for unreached breakpoints * Fix typing issues * updating errors.py * Fix integration tests * skipping some tests under certain conditions --------- Co-authored-by: Amna Mubashar <[email protected]> Co-authored-by: Sebastian Husch Lee <[email protected]>

* initial import * adding base for easier review * Adding dependency files * feat: adding `Pipeline` checkpoints (#269) * initial import from work done on haystack branch * adding missing Exception and code examples * adding missing Exception * David/pipeline testing (#254) * adding ByteStream example * serialising posixpath objects * fixing tests * fixing namespace import * fixing namespace import * adding examples to ruff check excludes * typo * test cases with loops * Add the last component's input to debug state * Revert "Add the last component's input to debug state" This reverts commit fb98b3d. * Add the last component's input to debug state (#256) * fix: fixing deserialisation issues - working version with Hybrid Pipeline tests (#257) * adding ByteStream example * serialising posixpath objects * fixing tests * fixing namespace import * fixing namespace import * adding examples to ruff check excludes * typo * adding verbosity to examples * fixing deserialisation issues * linting * Fix deserilzation errors in resuming (#258) * Fix deserilzation erros * Adding tests * adding tests * nit cleaning * adding debug_path to pipeline and tests for breakpoint/resume * working for all components in a Hybrid RAG * linting * linting * fixing tests * adding extra dependencies * fixing pyproject.toml * replacing tests model by a smaller one --------- Co-authored-by: David S. Batista <[email protected]> * fix: starting from clean state (#260) * wip: debugging more pipelines * wip: debugging more pipelines * adding more tests * cleaning * cleaning * cleaning * cleaning * cleaning * cleaning * typing * fix: removing all sender keys before serialisation (#261) * removing all sender keys * cleaning * cleaning * removing unused function * fixing header file issue * fix: hybrid pipeline working with stop/resume at each breakpoing (#262) * wip: debugging hybrids * wip: debugging hybrids * cleaning * ruffing * removing unused imports * Add a test file for loops * Fix major bugs in serialization (#266) * Fix issues with chat messages * Update the transform structure method to preserve data type of value * Fixed the issue with GeneratedAnswer * fix: `Adaptor` will need `unsafe=True` for the break/resume to work properly (#267) * adaptor needs unsafe=True + another extra list in a Joiner component * adding missing dependencies * triggering end2end tests for answers joiner pipeline * adding pipeline with list joiner tests * test: adding one more test with another Joiner (#268) * cleaning * marking tests as integration tests * nit documentation * fixing pipeline.py * PR comments * Remove passing component param to resume functions * Include load state in the resume process * Improve based on comments * Update haystack_experimental/core/pipeline/pipeline.py Co-authored-by: Sebastian Husch Lee <[email protected]> * only breaking at the exact visit number * moving imports to the top * Improvements * Improve deserialization * Fix linting * Add warning for unreached breakpoints * Fix typing issues * updating errors.py * Fix integration tests * skipping some tests under certain conditions --------- Co-authored-by: Amna Mubashar <[email protected]> Co-authored-by: Sebastian Husch Lee <[email protected]> * adding tests * fixing tests * bumping up test coverage * simplifying deserliaise function * activating skip for macOS * fix: fixing serialisation bug and missing tests due to filename renaming (#274) * ading missing code to deserialise and safeguard for resume state in RAG hybrid * tests passing for test_pipeline_breakpoints_answer_joiner * tests passing for test_pipeline_breakpoints_branch_joiner * tests passing for test_pipeline_breakpoints_list_joiner * tests passing for test_pipeline_breakpoints_loop * tests passing for test_pipeline_string_joiner * cleaning * skipping macOS tests for RAG hybrid * adding missed import * fix: windows tests fails due to not being able to read resume state files (#275) * initial import * fixing tests * temporary removing delay and rerun from integration tests * temporary removing delay and rerun from integration tests * fixing windows tests * cleaning * reverting delay and rerun from integration tests * fixing tests * fix: revert the passing of the resume state (#276) * cleaning printf leftovers * updating README.MD * fix: fixing some `pipeline.py` and `errors.py` inconsistencies and respective tests (#277) * initial import * cleaning init * fixing errors.py duplication * fixing linting * fixing pipeline run time error * fixing pipeline run time error * fixing pipeline run time tests * fix: adding mocks to tests (#278) * OpeanAI calls mocked * cleaning imports * adding a mocked TransformersSimilarityRanker * adding a mocked SentenceDocumentEmbedder * mocking rag hybrid pipeline components that make external calls * reverting skip on macOS due to pytorch/sentence-transformers lib * reverting skip on macOS due to pytorch/sentence-transformers lib * mocking all OpenAIChatGenerator calls in the pipeline breakpoint tests * reverting skip on macOS due to pytorch/sentence-transformers lib * relying only on Pathlib for f_name * fixing tests * fixing all mocks for Hybrid pipeline * cleaning imports * cleaning imports * reverting skip on macOS due to pytorch/sentence-transformers lib * linting * pinning transformers to same versions as in haystack-ai * pinning transformers to same versions as in haystack-ai * removing skipping macOS tests for RAG hybrid since it's all mocked now --------- Co-authored-by: Amna Mubashar <[email protected]> Co-authored-by: Sebastian Husch Lee <[email protected]>

Fix issues with chat messages

fc0147d

Amnah199 requested a review from a team as a code owner April 3, 2025 11:49

Amnah199 requested review from anakin87 and davidsbatista and removed request for a team and anakin87 April 3, 2025 11:49

Amnah199 commented Apr 3, 2025

View reviewed changes

haystack_experimental/core/pipeline/pipeline.py Outdated Show resolved Hide resolved

Update the transform structure method to preserve data type of value

0606807

davidsbatista reviewed Apr 3, 2025

View reviewed changes

haystack_experimental/core/pipeline/pipeline.py Outdated Show resolved Hide resolved

davidsbatista reviewed Apr 3, 2025

View reviewed changes

Fixed the issue with GeneratedAnswer

9322935

Amnah199 commented Apr 3, 2025

View reviewed changes

Amnah199 changed the title ~~Fix issues with chat messages~~ Fix major bugs in serialization Apr 3, 2025

davidsbatista approved these changes Apr 3, 2025

View reviewed changes

Amnah199 merged commit 1433afa into feat/pipeline-checkpoints Apr 3, 2025
4 of 8 checks passed

Amnah199 deleted the fix-failing-tests branch April 3, 2025 16:14

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix major bugs in serialization #266

Fix major bugs in serialization #266

Amnah199 commented Apr 3, 2025

Amnah199 Apr 3, 2025

davidsbatista Apr 3, 2025

Amnah199 Apr 3, 2025

Amnah199 Apr 3, 2025

davidsbatista Apr 3, 2025

Amnah199 Apr 3, 2025 •

edited

Loading

Fix major bugs in serialization #266

Fix major bugs in serialization #266

Conversation

Amnah199 commented Apr 3, 2025

Amnah199 Apr 3, 2025

Choose a reason for hiding this comment

davidsbatista Apr 3, 2025

Choose a reason for hiding this comment

Amnah199 Apr 3, 2025

Choose a reason for hiding this comment

Amnah199 Apr 3, 2025

Choose a reason for hiding this comment

davidsbatista Apr 3, 2025

Choose a reason for hiding this comment

Amnah199 Apr 3, 2025 • edited Loading

Choose a reason for hiding this comment

Amnah199 Apr 3, 2025 •

edited

Loading