PipelineConnectError caused by optional None input mis‑inference #9545

Open
@Pitrified


Describe the bug

When a component's run method declares an optional input typed list[str] | None, Haystack incorrectly infers the input type as list[list[str]] when the pipeline is connected. An otherwise identical component that requires a mandatory list[str] connects without error.

Error message

PipelineConnectError: Cannot connect 'producer.words' with 'consumer_opt.words': 
their declared input and output types do not match.
'producer':
  - words: list[str]
'consumer_opt':
  - words: list[list[str]] (available)

Expected behavior

Since StringConsumerOptional explicitly declares words: list[str] | None, Haystack should accept a connection from a list[str] output, just as it does for the mandatory version, without inferring a nested list.

Minimal Reproduction Code

from typing import Any
from haystack import Pipeline, component

# Producer: outputs a plain list of strings
@component
class StringProducer:
    @component.output_types(words=list[str])
    def run(self) -> dict[str, Any]:
        return {"words": ["apple", "banana", "cherry"]}

# Consumer (optional): accepts list[str] | None
@component
class StringConsumerOptional:
    @component.output_types(count=int)
    def run(self, words: list[str] | None = None) -> dict[str, Any]:
        return {"count": len(words or [])}

pipeline = Pipeline()
pipeline.add_component("producer", StringProducer())
pipeline.add_component("consumer_opt", StringConsumerOptional())

# fails:
pipeline.connect("producer.words", "consumer_opt.words")

Additional context

Connection validation can be disabled with Pipeline(..., connection_type_validation=False), which may bypass this error. However, that only masks the problem; it does not restore accurate type inference for optional inputs.

Please note that the issues linked below concern Optional[str] -> str, which differs from this report in two ways:

  1. It is a different case: here, the problem appears to be that list[str] | None is misinterpreted as list[list[str]].
  2. It seems more dangerous, since it can lead to runtime errors if the receiver does not handle a None that it never declared. In this report, the receiver is declared to accept None and handles it gracefully.

Please note that although this is a minimal example, there are valid scenarios where a component needs to accept an optional input of type list[str] | None.
We need a component in which some of the inputs are optional. The component is used several times in the pipeline, and the optional input is connected only in some of those uses.
In the actual code, at least one input is always connected, so the component can still run successfully.

To Reproduce

Run the snippet under "Minimal Reproduction Code" above.

System:

  • OS: Ubuntu 22.04
  • GPU/CPU:
  • Haystack version: 2.11.2
  • Python: 3.13.1
  • DocumentStore: (not applicable)
  • Reader: (not applicable)
  • Retriever: (not applicable)

Related Links

https://haystack.deepset.ai/release-notes/2.11.0
https://docs.haystack.deepset.ai/reference/pipeline-api#pipeline__init__
#8862
#8866
#8874

Labels: P2 (Medium priority, add to the next sprint if no P1 available)
