Releases: deepset-ai/haystack-experimental

v0.12.0

14 Jul 14:19
1c7bd91

🧪 New Experiments

🧠 Agent Breakpoints

We've introduced Agent Breakpoints, a feature that lets you pause the Agent component's execution at specific stages and inspect its state.

You can use this feature to:

  • Place breakpoints directly on the chat_generator to debug interactions.
  • Add breakpoints to the tools used by the agent to inspect tool behavior during execution.

🔧 Example Usage for Agent within Pipeline

from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.tools.tool import Tool
from haystack_experimental.components.agents.agent import Agent
from typing import List
from haystack_experimental.dataclasses.breakpoints import AgentBreakpoint, Breakpoint, ToolBreakpoint


# Tool Function
def calculate(expression: str) -> dict:
    try:
        result = eval(expression, {"__builtins__": {}})
        return {"result": result}
    except Exception as e:
        return {"error": str(e)}

# Tool Definition
calculator_tool = Tool(
    name="calculator",
    description="Evaluate basic math expressions.",
    parameters={
        "type": "object",
        "properties": {
            "expression": {"type": "string", "description": "Math expression to evaluate"}
        },
        "required": ["expression"]
    },
    function=calculate,
    outputs_to_state={"calc_result": {"source": "result"}}
)

# Agent Setup
agent = Agent(
    chat_generator=OpenAIChatGenerator(),
    tools=[calculator_tool],
    exit_conditions=["calculator"],
    state_schema={
        "calc_result": {"type": int},
    }
)
debug_path = "Path to save the state"  # placeholder: replace with a real directory path

# Breakpoint on the chat_generator of the Agent
chat_generator_breakpoint = Breakpoint("chat_generator", visit_count=0)
agent_breakpoint = AgentBreakpoint(break_point=chat_generator_breakpoint, agent_name='database_agent')

# Run the Agent
agent.warm_up()
response = agent.run(messages=[ChatMessage.from_user("What is 7 * (4 + 2)?")], break_point=agent_breakpoint, debug_path=debug_path)

# Breakpoint on the tools of the Agent
tool_breakpoint = ToolBreakpoint(component_name="tool_invoker", visit_count=0, tool_name="calculator")
agent_breakpoint = AgentBreakpoint(break_point=tool_breakpoint, agent_name='database_agent')

# Run the Agent
agent.warm_up()
response = agent.run(messages=[ChatMessage.from_user("What is 7 * (4 + 2)?")], break_point=agent_breakpoint, debug_path=debug_path)

📦 Breakpoints Dataclass

We’ve added a dedicated Breakpoint dataclass interface to standardize the way breakpoints are declared and managed.

  • Use Breakpoint to target generic components.
  • Use AgentBreakpoint for setting breakpoints on the agent.
  • Use ToolBreakpoint to set breakpoints on specific tools used by the agent.

Related PRs

  • feat: adding agents back to the experimental repo (#326)

Other Updates

  • test: update Bedrock tests with ComponentInfo (#343)
  • docs: improve some multimodal docstrings (#342)

v0.11.0

02 Jul 10:36
8f13872

🧪 New Experiments

Query Expander component

We are introducing a component that generates a list of semantically similar queries to improve retrieval recall in RAG systems.

from haystack.components.generators.chat.openai import OpenAIChatGenerator
from haystack_experimental.components.query import QueryExpander

expander = QueryExpander(
    chat_generator=OpenAIChatGenerator(model="gpt-4.1-mini"),
    n_expansions=3
)

result = expander.run(query="green energy sources")
print(result["queries"])
# Output: ['alternative query 1', 'alternative query 2', 'alternative query 3', 'green energy sources']
# Note: Up to 3 additional queries + 1 original query (if include_original_query=True)

# To control total number of queries:
expander = QueryExpander(n_expansions=2, include_original_query=True)  # Up to 3 total
# or
expander = QueryExpander(n_expansions=3, include_original_query=False)  # Exactly 3 total

🔀 New Document Routers

We're introducing two new Routers: DocumentTypeRouter and DocumentLengthRouter.
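
The release note does not describe how the routers behave, so the following is only a library-free sketch of the idea the name DocumentLengthRouter suggests: splitting documents into groups by content length. The `Doc` class, the `route_by_length` function, the `threshold` parameter, and the output keys are all hypothetical names chosen for illustration; check the component docstrings for the actual interface.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Doc:
    """Hypothetical minimal document: text content, or None for e.g. image-only documents."""
    content: Optional[str]


def route_by_length(documents: List[Doc], threshold: int = 10) -> Dict[str, List[Doc]]:
    """Split documents into short and long groups by content length (assumed behavior)."""
    routed: Dict[str, List[Doc]] = {"short_documents": [], "long_documents": []}
    for doc in documents:
        # Treat a missing content field as length 0 (an assumption of this sketch).
        length = len(doc.content) if doc.content is not None else 0
        key = "short_documents" if length <= threshold else "long_documents"
        routed[key].append(doc)
    return routed


docs = [Doc("short"), Doc("a much longer piece of text"), Doc(None)]
routed = route_by_length(docs, threshold=10)
print(len(routed["short_documents"]), len(routed["long_documents"]))
```

A split like this can be useful when short or empty documents need a different processing path (for example OCR or image captioning) than documents that already carry enough text.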

🖼️ New Multimodal Features

We introduced several new multimodal features, mostly focused on indexing and retrieval.
A notebook will be published soon to show practical usage examples.

Related PRs
  • refactor: adopt pypdfium2 for PDF to image conversion by @anakin87 in #308
  • feat: multimodal support in AmazonBedrockChatGenerator by @anakin87 in #307
  • test: Fix mypy typing by @sjrl in #309
  • feat: Add DocumentToImageContent component to help enable RAG with image Documents by @sjrl in #311
  • chore: fix format for DocumentToImageContent by @anakin87 in #318
  • chore: ignore type errors in Bedrock monkey patches by @anakin87 in #322
  • feat: add SentenceTransformersDocumentImageEmbedder by @anakin87 in #319
  • feat: Add DocumentTypeRouter by @sjrl in #321
  • refactor: refactor multimodal components and utility functions by @anakin87 in #324
  • fix: Fix storage of file path in ImageContent by @sjrl in #325
  • refactor: Refactor converters to follow embedders directory structure by @sjrl in #333
  • feat: Add normalize_embeddings to SentenceTransformersDocumentImageEmbedder to match signature of other embedders by @sjrl in #335
  • feat: add DocumentLengthRouter component by @anakin87 in #334
  • feat: Add ImageFileToDocument converter by @sjrl in #336
  • feat: Add LLMDocumentContentExtractor to enable Vision-based LLMs to describe/convert an image into text by @sjrl in #338
  • docs: add usage examples to docstrings of multimodal components by @anakin87 in #340

Other Updates

  • refactor: synchronising/merging all pipeline related code with haystack main repository by @davidsbatista in #312
  • chore: align Haystack experimental Hatch scripts by @anakin87 in #315
  • chore: align experimental type checking with Haystack by @anakin87 in #320
  • refactor: Refactor experimental Pipeline to use inheritance by @sjrl in #323
  • fix: refactor code and update init_params in debug_state by @Amnah199 in #317
  • chore: fix ruff linting error by @Amnah199 in #329
  • fix: Fix logger message for pipeline breakpoints by @sjrl in #327
  • fix: Fix validate_input becoming public method by @sjrl in #337
  • Refactor serialization of breakpoints by @Amnah199 in #332

Full Changelog: v0.10.0...v0.11

v0.10.0

19 May 10:27
106aa00

🧪 New Experiments

🖼️ Multimodal Text Generation

We are adding support for passing images in user messages and other multimodal features.

from haystack_experimental.dataclasses import ImageContent, ChatMessage
from haystack_experimental.components.generators.chat import OpenAIChatGenerator

image_url = "https://cdn.britannica.com/79/191679-050-C7114D2B/Adult-capybara.jpg"
image_content = ImageContent.from_url(image_url)

message = ChatMessage.from_user(
    content_parts=["Describe the image in short.", image_content]
)

llm = OpenAIChatGenerator(model="gpt-4o-mini")

print(llm.run([message])["replies"][0].text)

For the list of implemented features, see #302.

For more usage examples, check out the example: 📓 Introduction to Multimodal Text Generation.

Related PRs
  • feat: ImageContent dataclass by @anakin87 in #286
  • feat: Add ImageFileToImageContent and PDFToImageContent converters by @sjrl in #290
  • feat: multimodal support in OpenAIChatGenerator by @anakin87 in #292
  • chore: improve Image Converters pydoc config by @anakin87 in #295
  • feat: add convenience class methods to ImageContent by @anakin87 in #294
  • chore: move ImageContent to a separate module by @anakin87 in #296
  • feat: add Jinja2 ChatMessage extension by @anakin87 in #297
  • feat: ImageContent visualization by @anakin87 in #300
  • feat: extend ChatPromptBuilder to support string templates by @anakin87 in #299
  • chore: update README with multimodal experiment by @anakin87 in #303
  • fix: move IPython import by @anakin87 in #304
  • feat: ImageContent validation by @anakin87 in #305

🐛 Bug Fixes

  • fix: Update __init__.py to use double underscore by @sjrl in #288
  • fix: preserve initialization parameters in debug state when run params are not supplied by @Amnah199 in #293

✅ Adopted Experiments

  • chore: update/clean up experimental by @anakin87 in #285
  • chore: Remove SuperComponent and pre-made super components. Update Readme by @sjrl in #287
  • chore: remove dependencies needed for MultiFileConverter by @anakin87 in #298

Full Changelog: v0.9.0...v0.10.0

v0.9.0

16 Apr 11:48
a9da65d

🔧 Updates to Experiments

Adding breakpoints to components in a Pipeline

You can now set a breakpoint on any component in any pipeline. Execution stops before that component runs, and a JSON file is written with the complete state of the pipeline at that point.

Usage Examples

# Setting breakpoints
pipeline.run(
    data={"input": "value"},
    breakpoints={("component_name", 0)},  # Break at the first visit
    debug_path="debug_states/"
)

This generates a JSON file with the complete pipeline state before the next component runs, i.e., the component that receives the output of the component named in the breakpoint.

# Resuming from a saved state
state = Pipeline.load_state("debug_states/component_state.json")
pipeline.run(
    data={"input": "value"},
    resume_state=state
)
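
To make the stop, save, and resume cycle concrete, here is a minimal library-free sketch. It is not the Pipeline implementation: the `STEPS` list, the `run` and `resume` functions, and the layout of the saved JSON are all hypothetical stand-ins, assuming only what the text above states (execution halts before the named component and the state is written to a JSON file).

```python
import json
import tempfile
from pathlib import Path

# Hypothetical stand-in for a pipeline: an ordered list of (name, function) steps.
STEPS = [
    ("clean", lambda text: text.strip()),
    ("upper", lambda text: text.upper()),
    ("exclaim", lambda text: text + "!"),
]


def run(value, break_before=None, debug_dir=None, start_at=0):
    """Run the steps in order; optionally stop before `break_before` and save the state."""
    for index in range(start_at, len(STEPS)):
        name, step = STEPS[index]
        if name == break_before:
            # Persist everything needed to continue later, then stop.
            state = {"next_step": index, "value": value}
            path = Path(debug_dir) / f"{name}_state.json"
            path.write_text(json.dumps(state))
            return {"paused_at": name, "state_file": str(path)}
        value = step(value)
    return {"result": value}


def resume(state_file):
    """Reload a saved state and continue from the step that was about to run."""
    state = json.loads(Path(state_file).read_text())
    return run(state["value"], start_at=state["next_step"])


debug_dir = tempfile.mkdtemp()
paused = run("  hello ", break_before="upper", debug_dir=debug_dir)
final = resume(paused["state_file"])
print(paused["paused_at"], final["result"])
```

Because the saved file is plain JSON, it can be inspected or edited by hand before resuming, which is the main appeal of this style of debugging.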

🧑‍🍳 See an example notebook here
💬 Share your feedback in this discussion

Other Updates

  • Proposal for changing internal working of Agent (#245) @sjrl
  • refactor: Streamline super components input and output mapping logic (#243) @sjrl
  • refactor: Small updates to Agent. Make pipeline internal, add check for warm_up (#244) @sjrl
  • feat: Updates to insertion of values into State (#239) @sjrl
  • feat: Add unclassified to output of MultiFileConverter (#240) @julian-risch
  • feat: Enhance tool error logs and some refactoring (#235) @sjrl

Full Changelog: v0.8.0...v0.9.0

v0.8.0

11 Mar 14:32
06b9833

🔧 Updates to Experiments

Stream ChatGenerator responses with Agent

The Agent component now allows setting a streaming callback at init and run time. This way, an Agent's response can be streamed in chunks, enabling faster feedback for developers and end users. #233

agent = Agent(chat_generator=chat_generator, tools=[weather_tool])
response = agent.run([ChatMessage.from_user("Hello")], streaming_callback=streaming_callback)
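
The snippet above leaves `streaming_callback` undefined. As a rough illustration of what such a callback does, the sketch below uses hypothetical stand-ins (`Chunk` and `fake_generate` are not Haystack classes or functions) to show a callback being invoked once per chunk before the full reply is available.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Chunk:
    """Hypothetical stand-in for a streamed chunk of model output."""
    content: str


def fake_generate(reply: str, streaming_callback: Callable[[Chunk], None]) -> str:
    """Hypothetical generator: emits the reply word by word, then returns it whole."""
    for word in reply.split(" "):
        streaming_callback(Chunk(content=word + " "))
    return reply


received: List[str] = []


def streaming_callback(chunk: Chunk) -> None:
    # Called once per chunk, before the full reply is available.
    received.append(chunk.content)
    print(chunk.content, end="", flush=True)


full_reply = fake_generate("Hello from the agent", streaming_callback)
```

In a real application the callback would typically forward each chunk to a UI or a websocket instead of printing it.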

🐛 Bug Fixes

  • We fixed a bug that prevented ComponentTool from working with Jinja2-based components (PromptBuilder, ChatPromptBuilder, ConditionalRouter, OutputAdapter). #234
  • The Agent component now deserializes Tools with the right class and uses deserialize_tools_inplace. #213 #222

✅ Adopted Experiments

  • chore: remove LLMMetadataExtractor by @davidsbatista in #227
  • chore: Remove some missed utility functions from previous experiments by @sjrl in #232
  • chore: removing async version of InMemoryDocumentStore, DocumentWriter, OpenAIChatGenerator, InMemory Retrievers by @davidsbatista in #220
  • chore: remove pipeline experiments by @mathislucka in #214

Full Changelog: v0.7.0...v0.8.0

v0.7.0

27 Feb 11:02
a045396

🧪 New Experiments

New Agent component

The Agent component enables tool calling with provider-agnostic chat model support. It can be used as a standalone component or within a pipeline.
👉 See the Agent in action: 🧑‍🍳 Build a GitHub Issue Resolver Agent

from haystack.dataclasses import ChatMessage
from haystack.components.websearch import SerperDevWebSearch
from haystack_experimental.tools.component_tool import ComponentTool
from haystack_experimental.components.agents import Agent
from haystack.components.generators.chat import OpenAIChatGenerator

web_tool = ComponentTool(
   component=SerperDevWebSearch(),
)

agent = Agent(
   chat_generator=OpenAIChatGenerator(model="gpt-4o-mini"),
   tools=[web_tool],
   exit_condition="text",
)

result = agent.run(
   messages=[ChatMessage.from_user("Find information about Haystack")]
)

Improved ComponentTool and @tool Decorator

The ComponentTool and @tool decorator have been extended for better integration with the new Agent component.

New Ready-Made SuperComponents

Introducing new SuperComponents that bundle commonly used components and logic for indexing pipelines: MultiFileConverter, SentenceTransformersDocumentIndexer, DocumentPreprocessor

from haystack_experimental.super_components.converters import MultiFileConverter

# process all common file types (.csv, .docx, .html, .json, .md, .txt, .pdf, .pptx, .xlsx) with one component
converter = MultiFileConverter()
converter.run(sources=["test.txt", "test.pdf"], meta={}) 

Full Changelog: v0.6.0...v0.7.0

v0.6.0

10 Feb 17:09
c689c05

New Experiments

  • New SuperComponent abstraction that lets you wrap any pipeline in a friendly component interface and create your own super components
from haystack_experimental import SuperComponent

# rag_pipeline = basic RAG pipeline with retriever, prompt builder, generator and answer builder components

input_mapping = {
    "search_query": ["retriever.query", "prompt_builder.query", "answer_builder.query"]
}
output_mapping = {
    "answer_builder.answers": "final_answers"
}

wrapper = SuperComponent(
    pipeline=rag_pipeline,
    input_mapping=input_mapping,
    output_mapping=output_mapping
)

result = wrapper.run(search_query="What is the capital of France?")
print(result["final_answers"][0])
  • New AsyncPipeline that can schedule components to run concurrently
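
As a rough picture of what running components concurrently means, the sketch below uses plain `asyncio` and does not use the AsyncPipeline API. The `retriever` coroutine and its names are hypothetical; the point is only that two steps with no dependency on each other can wait on I/O at the same time.

```python
import asyncio
import time


async def retriever(name: str, delay: float) -> str:
    """Hypothetical component that waits on I/O, e.g. a remote search call."""
    await asyncio.sleep(delay)
    return f"{name} done"


async def main() -> list:
    # The two retrievers do not depend on each other, so they can run together.
    return await asyncio.gather(
        retriever("bm25_retriever", 0.2),
        retriever("embedding_retriever", 0.2),
    )


start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start
print(results, f"{elapsed:.2f}s")
```

Run sequentially, the two calls would take roughly the sum of their delays; run together, the total is close to the longer of the two.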

Other Updates:

  • Added a debug/tracing script to compare two pipeline runs with the old and new pipeline run logic
  • Changed LLMMetadataExtractor to use ChatGenerator instead of Generator

Full Changelog: v0.5.0...v0.6.0

v0.5.0

27 Jan 13:57
cbbf088

New Experiments

Full Changelog: v0.4.0...v0.5.0


🧬 New Pipeline Logic

This release introduces a reimplementation of the pipeline-run logic to resolve multiple issues, improving reliability and performance. These changes will also be included in Haystack 2.10.

Fixed Issues:

  1. Exceptions in pipelines with two cycles

    • Pipelines with two cycles sharing an optional edge (as in PromptBuilder) or a greedy variadic edge (e.g., in BranchJoiner) might raise exceptions. Details here.
  2. Incorrect execution in cycles with multiple optional or variadic edges

    • Entry points for cycles were non-deterministic, causing components to run with unexpected inputs or multiple times. This impacted execution time and final outputs.
  3. Missing intermediate outputs in cycles

    • Outputs produced within a cycle were overwritten, preventing downstream components from receiving them.
  4. Premature execution of lazy variadic components

    • Components like DocumentJoiner sometimes executed before receiving all inputs, leading to repeated partial executions that affected downstream results.
  5. Order-sensitive behavior in add_component and connect

    • Some bugs above occurred due to specific orderings of add_component and connect in pipeline creation, causing non-deterministic behavior in cyclic pipelines.

Am I Affected by this Change?

  • Non-cyclic pipelines without lazy variadic components:
    No impact—your pipelines should function as before.

  • Non-cyclic pipelines with lazy variadic components:
    Check inputs and outputs of components like DocumentJoiner for fixed issues 4 and 5 above. Use LoggingTracer with content tracing to validate behavior. Component execution order now uses lexicographical sorting; rename upstream components if necessary.

  • Pipelines with cycles:
    Review your pipeline outputs as well as the component inputs and outputs to ensure expected behavior, as you may encounter any of the above issues.
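
The remark about lexicographical sorting implies that component names can influence execution order. The sketch below uses Python's built-in string sorting as an assumed stand-in for that tie-break; the component names are hypothetical and the pipeline's exact comparison rules are not stated here.

```python
# Hypothetical component names that become ready to run at the same time.
ready = ["retriever_embedding", "joiner", "retriever_bm25"]

# Plain lexicographical ordering of the names (an assumed stand-in for the tie-break).
order = sorted(ready)
print(order)

# Renaming a component moves it within that ordering.
renamed = ["a_retriever_embedding" if name == "retriever_embedding" else name for name in ready]
new_order = sorted(renamed)
print(new_order)
```

If a downstream result depends on which upstream component runs first, a prefix such as `a_` or `01_` is one way to make the intended order explicit in the names.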

Share your comments in discussion #177

v0.4.0

11 Dec 11:06
7ade6a2

Full Changelog: v0.3.0...v0.4.0

v0.3.0

31 Oct 09:47
4e1b37a

Full Changelog: v0.2.0...v0.3.0