Releases: deepset-ai/haystack-experimental
v0.12.0
🧪 New Experiments
🧠 Agent Breakpoints
We’ve introduced Agent Breakpoints, a feature that allows you to pause and inspect specific stages within the Agent component's execution.
You can use this feature to:
- Place breakpoints directly on the chat_generator to debug interactions.
- Add breakpoints to the tools used by the agent to inspect tool behavior during execution.
🔧 Example Usage for Agent within Pipeline
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.tools.tool import Tool
from haystack_experimental.components.agents.agent import Agent
from typing import List
from haystack_experimental.dataclasses.breakpoints import AgentBreakpoint, Breakpoint, ToolBreakpoint
# Tool Function
def calculate(expression: str) -> dict:
    try:
        result = eval(expression, {"__builtins__": {}})
        return {"result": result}
    except Exception as e:
        return {"error": str(e)}
# Tool Definition
calculator_tool = Tool(
    name="calculator",
    description="Evaluate basic math expressions.",
    parameters={
        "type": "object",
        "properties": {
            "expression": {"type": "string", "description": "Math expression to evaluate"}
        },
        "required": ["expression"]
    },
    function=calculate,
    outputs_to_state={"calc_result": {"source": "result"}}
)
# Agent Setup
agent = Agent(
    chat_generator=OpenAIChatGenerator(),
    tools=[calculator_tool],
    exit_conditions=["calculator"],
    state_schema={
        "calc_result": {"type": int},
    }
)
debug_path = "Path to save the state"
# Breakpoint on the chat_generator of the Agent
chat_generator_breakpoint = Breakpoint("chat_generator", visit_count=0)
agent_breakpoint = AgentBreakpoint(break_point=chat_generator_breakpoint, agent_name='database_agent')
# Run the Agent
agent.warm_up()
response = agent.run(messages=[ChatMessage.from_user("What is 7 * (4 + 2)?")], break_point=agent_breakpoint, debug_path=debug_path)
# Breakpoint on the tools of the Agent
tool_breakpoint = ToolBreakpoint(component_name="tool_invoker", visit_count=0, tool_name="calculator")
agent_breakpoint = AgentBreakpoint(break_point=tool_breakpoint, agent_name='database_agent')
# Run the Agent
agent.warm_up()
response = agent.run(messages=[ChatMessage.from_user("What is 7 * (4 + 2)?")], break_point=agent_breakpoint, debug_path=debug_path)
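The `calculate` tool above passes model-generated text to `eval`, and emptying `__builtins__` is not a real sandbox. As a hedged alternative, here is a minimal sketch that evaluates only numeric literals and basic arithmetic through the standard-library `ast` module. `safe_calculate` and its helpers are hypothetical names, not part of Haystack.

```python
import ast
import operator

# Hypothetical helper, not part of Haystack: only these operators are allowed.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _evaluate(node):
    # Numeric literals (booleans are rejected on purpose)
    if (
        isinstance(node, ast.Constant)
        and isinstance(node.value, (int, float))
        and not isinstance(node.value, bool)
    ):
        return node.value
    # Binary arithmetic such as 4 + 2 or 7 * 6
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    # Unary plus and minus
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_evaluate(node.operand))
    # Names, calls, attribute access and everything else are refused
    raise ValueError("unsupported expression")


def safe_calculate(expression: str) -> dict:
    try:
        tree = ast.parse(expression, mode="eval")
        return {"result": _evaluate(tree.body)}
    except Exception as e:
        return {"error": str(e)}


print(safe_calculate("7 * (4 + 2)"))
```

It returns the same `{"result": ...}` or `{"error": ...}` shape as `calculate`, so it could be passed as `function=` in the tool definition without other changes.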
📦 Breakpoints Dataclass
We’ve added a dedicated Breakpoint dataclass interface to standardize the way breakpoints are declared and managed.
- Use Breakpoint to target generic components.
- Use AgentBreakpoint for setting breakpoints on the agent.
- Use ToolBreakpoint to set breakpoints on specific tools used by the agent.
Related PRs
- feat: adding agents back to the experimental repo (#326)
Other Updates
v0.11.0
🧪 New Experiments
Query Expander component
We are introducing a component that generates a list of semantically similar queries to improve retrieval recall in RAG systems.
from haystack.components.generators.chat.openai import OpenAIChatGenerator
from haystack_experimental.components.query import QueryExpander
expander = QueryExpander(
    chat_generator=OpenAIChatGenerator(model="gpt-4.1-mini"),
    n_expansions=3
)
result = expander.run(query="green energy sources")
print(result["queries"])
# Output: ['alternative query 1', 'alternative query 2', 'alternative query 3', 'green energy sources']
# Note: Up to 3 additional queries + 1 original query (if include_original_query=True)
# To control total number of queries:
expander = QueryExpander(n_expansions=2, include_original_query=True) # Up to 3 total
# or
expander = QueryExpander(n_expansions=3, include_original_query=False) # Exactly 3 total
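To illustrate why expanded queries can improve recall, here is a minimal, library-free sketch that runs every query through a retriever and merges the results without duplicates. The toy corpus, `toy_retrieve`, and `retrieve_with_expansions` are hypothetical stand-ins; in practice the query list would come from `result["queries"]` and the retriever would be a real Haystack retriever.

```python
from typing import Callable, Dict, List

# Toy corpus standing in for a document store (hypothetical data)
CORPUS = [
    {"id": "1", "content": "Solar panels convert sunlight into electricity."},
    {"id": "2", "content": "Wind turbines generate renewable power."},
    {"id": "3", "content": "Coal plants emit carbon dioxide."},
]


def toy_retrieve(query: str) -> List[Dict[str, str]]:
    """Return documents that share at least one word with the query."""
    terms = set(query.lower().split())
    return [
        doc for doc in CORPUS
        if terms & set(doc["content"].lower().rstrip(".").split())
    ]


def retrieve_with_expansions(
    queries: List[str],
    retrieve: Callable[[str], List[Dict[str, str]]],
    top_k: int = 3,
) -> List[Dict[str, str]]:
    """Run every query and merge the results, keeping each document once."""
    seen = set()
    merged = []
    for query in queries:
        for doc in retrieve(query)[:top_k]:
            if doc["id"] not in seen:
                seen.add(doc["id"])
                merged.append(doc)
    return merged


# The original query alone matches nothing in this toy corpus;
# the (made-up) expansions recover two relevant documents.
queries = ["green energy sources", "solar electricity", "renewable wind power"]
docs = retrieve_with_expansions(queries, toy_retrieve)
print([doc["id"] for doc in docs])
```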
- feat: add QueryExpander component by @mpangrazzi in #331
🔀 New Document Routers
We're introducing two new Routers: DocumentTypeRouter and DocumentLengthRouter.
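The release notes do not include a usage example for the routers, so the following is only a conceptual sketch of the two routing ideas in plain Python. It does not use the Haystack API: `Doc`, `route_by_length`, `route_by_type`, and the output key names are assumptions made for illustration.

```python
import mimetypes
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Doc:
    """Minimal stand-in for a document (hypothetical, not haystack.Document)."""
    content: Optional[str] = None
    meta: dict = field(default_factory=dict)


def route_by_length(documents: List[Doc], threshold: int) -> Dict[str, List[Doc]]:
    """Split documents by content length; empty content counts as length 0."""
    routed: Dict[str, List[Doc]] = {"short_documents": [], "long_documents": []}
    for doc in documents:
        length = len(doc.content) if doc.content else 0
        key = "short_documents" if length <= threshold else "long_documents"
        routed[key].append(doc)
    return routed


def route_by_type(documents: List[Doc], mime_types: List[str]) -> Dict[str, List[Doc]]:
    """Group documents by the MIME type guessed from meta['file_path']."""
    routed: Dict[str, List[Doc]] = defaultdict(list)
    for doc in documents:
        guessed, _ = mimetypes.guess_type(doc.meta.get("file_path", ""))
        routed[guessed if guessed in mime_types else "unclassified"].append(doc)
    return dict(routed)


docs = [
    Doc("short", {"file_path": "note.txt"}),
    Doc("x" * 50, {"file_path": "report.pdf"}),
    Doc(None, {"file_path": "photo.jpg"}),
]
by_length = route_by_length(docs, threshold=10)
by_type = route_by_type(docs, ["application/pdf", "image/jpeg"])
print({key: len(value) for key, value in by_length.items()})
print({key: len(value) for key, value in by_type.items()})
```

Routing like this lets an indexing pipeline send, for example, image-based documents to an image embedder and text documents to a text embedder.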
🖼️ New Multimodal Features
We introduced several new multimodal features, mostly focused on indexing and retrieval.
A notebook will be published soon to show practical usage examples.
- multimodal support in AmazonBedrockChatGenerator
- new image Converters
- SentenceTransformersDocumentImageEmbedder: a component to compute embeddings for image-based documents
- LLMDocumentContentExtractor: a component to extract textual content from image-based documents using a vision-enabled LLM
Related PRs
- refactor: adopt pypdfium2 for PDF to image conversion by @anakin87 in #308
- feat: multimodal support in AmazonBedrockChatGenerator by @anakin87 in #307
- test: Fix mypy typing by @sjrl in #309
- feat: Add DocumentToImageContent component to help enable RAG with image Documents by @sjrl in #311
- chore: fix format for DocumentToImageContent by @anakin87 in #318
- chore: ignore type errors in Bedrock monkey patches by @anakin87 in #322
- feat: add SentenceTransformersDocumentImageEmbedder by @anakin87 in #319
- feat: Add DocumentTypeRouter by @sjrl in #321
- refactor: refactor multimodal components and utility functions by @anakin87 in #324
- fix: Fix storage of file path in ImageContent by @sjrl in #325
- refactor: Refactor converters to follow embedders directory structure by @sjrl in #333
- feat: Add normalize_embeddings to SentenceTransformersDocumentImageEmbedder to match signature of other embedders by @sjrl in #335
- feat: add DocumentLengthRouter component by @anakin87 in #334
- feat: Add ImageFileToDocument converter by @sjrl in #336
- feat: Add LLMDocumentContentExtractor to enable Vision-based LLMs to describe/convert an image into text by @sjrl in #338
- docs: add usage examples to docstrings of multimodal components by @anakin87 in #340
Other Updates
- refactor: synchronising/merging all pipeline related code with haystack main repository by @davidsbatista in #312
- chore: align Haystack experimental Hatch scripts by @anakin87 in #315
- chore: align experimental type checking with Haystack by @anakin87 in #320
- refactor: Refactor experimental Pipeline to use inheritance by @sjrl in #323
- fix: refactor code and update init_params in debug_state by @Amnah199 in #317
- chore: fix ruff linting error by @Amnah199 in #329
- fix: Fix logger message for pipeline breakpoints by @sjrl in #327
- fix: Fix validate_input becoming public method by @sjrl in #337
- Refactor serialization of breakpoints by @Amnah199 in #332
New Contributors
- @mpangrazzi made their first contribution in #331
Full Changelog: v0.10.0...v0.11
v0.10.0
🧪 New Experiments
🖼️ Multimodal Text Generation
We are adding support for passing images in user messages and other multimodal features.
from haystack_experimental.dataclasses import ImageContent, ChatMessage
from haystack_experimental.components.generators.chat import OpenAIChatGenerator
image_url = "https://cdn.britannica.com/79/191679-050-C7114D2B/Adult-capybara.jpg"
image_content = ImageContent.from_url(image_url)
message = ChatMessage.from_user(
    content_parts=["Describe the image in short.", image_content]
)
llm = OpenAIChatGenerator(model="gpt-4o-mini")
print(llm.run([message])["replies"][0].text)
For the list of implemented features, see #302.
For more usage examples, check out the example: 📓 Introduction to Multimodal Text Generation.
Related PRs
- feat: ImageContent dataclass by @anakin87 in #286
- feat: Add ImageFileToImageContent and PDFToImageContent converters by @sjrl in #290
- feat: multimodal support in OpenAIChatGenerator by @anakin87 in #292
- chore: improve Image Converters pydoc config by @anakin87 in #295
- feat: add convenience class methods to ImageContent by @anakin87 in #294
- chore: move ImageContent to a separate module by @anakin87 in #296
- feat: add Jinja2 ChatMessage extension by @anakin87 in #297
- feat: ImageContent visualization by @anakin87 in #300
- feat: extend ChatPromptBuilder to support string templates by @anakin87 in #299
- chore: update README with multimodal experiment by @anakin87 in #303
- fix: move IPython import by @anakin87 in #304
- feat: ImageContent validation by @anakin87 in #305
🐛 Bug Fixes
- fix: Update __init__.py to use double underscore by @sjrl in #288
- fix: preserve initialization parameters in debug state when run params are not supplied by @Amnah199 in #293
✅ Adopted Experiments
- chore: update/clean up experimental by @anakin87 in #285
- chore: Remove SuperComponent and pre-made super components. Update Readme by @sjrl in #287
- chore: remove dependencies needed for MultiFileConverter by @anakin87 in #298
Other Updates
- Update issue template for adding new experiments by @bilgeyucel in #283
- docs: add missing pydocs by @dfokina in #291
Full Changelog: v0.9.0...v0.10.0
v0.9.0
🔧 Updates to Experiments
Adding breakpoints to components in a Pipeline
It's now possible to set breakpoints at any component in any pipeline, forcing the pipeline execution to stop before that component runs and generating a JSON file with the complete state of the pipeline before the breakpoint component was run.
Usage Examples
# Setting breakpoints
pipeline.run(
    data={"input": "value"},
    breakpoints={("component_name", 0)},  # Break at the first visit
    debug_path="debug_states/"
)
This will generate a JSON file with the complete pipeline state before the next component is run, i.e., the one receiving the output of the component set in the breakpoint.
# Resuming from a saved state
state = Pipeline.load_state("debug_states/component_state.json")
pipeline.run(
    data={"input": "value"},
    resume_state=state
)
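The pause, save, and resume cycle described above can be pictured with a toy example. This is a minimal sketch of the idea only, not the experimental Pipeline implementation: the `STEPS` list, `run`, `resume`, and the JSON layout are all hypothetical.

```python
import json
import tempfile
from pathlib import Path

# A toy "pipeline": named steps applied in order (hypothetical, not Haystack)
STEPS = [
    ("cleaner", lambda text: text.strip()),
    ("upper", lambda text: text.upper()),
    ("exclaim", lambda text: text + "!"),
]


def run(value, break_before=None, debug_path=None, start_at=0):
    """Run the steps; stop and save the state right before `break_before`."""
    for index in range(start_at, len(STEPS)):
        name, step = STEPS[index]
        if name == break_before:
            state_file = Path(debug_path) / f"{name}_state.json"
            state_file.write_text(json.dumps({"next_index": index, "value": value}))
            return {"paused_at": name, "state_file": str(state_file)}
        value = step(value)
    return {"output": value}


def resume(state_file):
    """Load a saved state and continue from the step that was about to run."""
    state = json.loads(Path(state_file).read_text())
    return run(state["value"], start_at=state["next_index"])


with tempfile.TemporaryDirectory() as debug_dir:
    paused = run("  hello ", break_before="upper", debug_path=debug_dir)
    resumed = resume(paused["state_file"])

print(paused["paused_at"], resumed)
```

Because the saved file holds everything needed to continue, the state can be inspected or edited between the two runs, which is the main debugging benefit of a breakpoint.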
🧑🍳 See an example notebook here
💬 Share your feedback in this discussion
✅ Adopted Experiments
- chore: Remove Agent after Haystack 2.12 release (#263) @julian-risch
- chore: Remove AutoMergingRetriever after Haystack 2.12 release (#265) @davidsbatista
Other Updates
- Proposal for changing internal working of Agent (#245) @sjrl
- refactor: Streamline super components input and output mapping logic (#243) @sjrl
- refactor: Small updates to Agent. Make pipeline internal, add check for warm_up (#244) @sjrl
- feat: Updates to insertion of values into State (#239) @sjrl
- feat: Add unclassified to output of MultiFileConverter (#240) @julian-risch
- feat: Enhance tool error logs and some refactoring (#235) @sjrl
Full Changelog: v0.8.0...v0.9.0
v0.8.0
🔧 Updates to Experiments
Stream ChatGenerator responses with Agent
The Agent component now allows setting a streaming callback at init and run time. This way, an Agent's response can be streamed in chunks, enabling faster feedback for developers and end users. #233
agent = Agent(chat_generator=chat_generator, tools=[weather_tool])
response = agent.run([ChatMessage.from_user("Hello")], streaming_callback=streaming_callback)
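The snippet above leaves `streaming_callback` undefined. Here is a minimal sketch of what such a callback might look like, assuming each streamed chunk exposes its text through a `content` attribute, as Haystack's `StreamingChunk` does. `FakeChunk` is a hypothetical stand-in so the example runs without an LLM.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FakeChunk:
    """Hypothetical stand-in for a streamed chunk; only `content` is assumed."""
    content: str


collected: List[str] = []


def streaming_callback(chunk) -> None:
    # Print each piece as soon as it arrives and keep it for later use
    collected.append(chunk.content)
    print(chunk.content, end="", flush=True)


# Simulate a generator emitting a reply in three chunks
for piece in ["Hel", "lo", "!"]:
    streaming_callback(FakeChunk(content=piece))
print()
```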
🐛 Bug Fixes
- We fixed a bug that prevented ComponentTool from working with Jinja2-based components (PromptBuilder, ChatPromptBuilder, ConditionalRouter, OutputAdapter). #234
- The Agent component now deserializes Tools with the right class and uses deserialize_tools_inplace. #213 #222
✅ Adopted Experiments
- chore: remove LLMMetadataExtractor by @davidsbatista in #227
- chore: Remove some missed utility functions from previous experiments by @sjrl in #232
- chore: removing async version of InMemoryDocumentStore, DocumentWriter, OpenAIChatGenerator, InMemory Retrievers by @davidsbatista in #220
- chore: remove pipeline experiments by @mathislucka in #214
🛑 Discontinued Experiments
- chore: remove evaluation harness experiment by @julian-risch in #231
Full Changelog: v0.7.0...v0.8.0
v0.7.0
🧪 New Experiments
New Agent component
The Agent component enables tool-calling functionality with provider-agnostic chat model support and can be used as a standalone component or within a pipeline.
👉 See the Agent in action: 🧑‍🍳 Build a GitHub Issue Resolver Agent
from haystack.dataclasses import ChatMessage
from haystack.components.websearch import SerperDevWebSearch
from haystack_experimental.tools.component_tool import ComponentTool
from haystack_experimental.components.agents import Agent
from haystack.components.generators.chat import OpenAIChatGenerator
web_tool = ComponentTool(
    component=SerperDevWebSearch(),
)
agent = Agent(
    chat_generator=OpenAIChatGenerator(model="gpt-4o-mini"),
    tools=[web_tool],
    exit_condition="text",
)
result = agent.run(
    messages=[ChatMessage.from_user("Find information about Haystack")]
)
Improved ComponentTool and @tool Decorator
The ComponentTool and @tool decorator are extended for better integration with the new Agent component.
New Ready-Made SuperComponents
Introducing new SuperComponents that bundle commonly used components and logic for indexing pipelines: MultiFileConverter, SentenceTransformersDocumentIndexer, DocumentPreprocessor
from haystack_experimental.super_components.converters import MultiFileConverter
# process all common file types (.csv, .docx, .html, .json, .md, .txt, .pdf, .pptx, .xlsx) with one component
converter = MultiFileConverter()
converter.run(sources=["test.txt", "test.pdf"], meta={})
What's Changed
- docs: add Supercomponent pydoc, delete outdated by @dfokina in #193
- docs: updating trace comparison tool README.md by @davidsbatista in #195
- chore: Create issue templates for adding, removing, moving an experiment by @julian-risch in #192
- chore: remove OpenSearch from experimental by @anakin87 in #200
- fix: fixing auto-merging tests, removing hard-coded doc ids by @davidsbatista in #202
- chore: add tool related code to prepare Agent PR by @mathislucka in #203
- feat: add file and indexing related super components by @mathislucka in #184
- docs: Add SuperComponent to catalog by @julian-risch in #190
- feat: Introduce Agent by @mathislucka in #175
- docs: add pydoc config for Agent component by @julian-risch in #208
- docs: Notebook for Agent component by @mathislucka in #204
- docs: add MultiFileConverter, SentenceTransformersDocumentIndexer, and DocumentPreprocessor to docs by @dfokina in #210
Full Changelog: v0.6.0...v0.7.0
v0.6.0
New Experiments
- New SuperComponent abstraction that lets you wrap any pipeline in a friendly component interface and create your own super components
from haystack_experimental import SuperComponent
# rag_pipeline = basic RAG pipeline with retriever, prompt builder, generator and answer builder components
input_mapping = {
    "search_query": ["retriever.query", "prompt_builder.query", "answer_builder.query"]
}
output_mapping = {
    "answer_builder.answers": "final_answers"
}
wrapper = SuperComponent(
    pipeline=rag_pipeline,
    input_mapping=input_mapping,
    output_mapping=output_mapping
)
result = wrapper.run(search_query="What is the capital of France?")
print(result["final_answers"][0])
- New AsyncPipeline that can schedule components to run concurrently
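The `input_mapping` in the SuperComponent example above sends one wrapper input to several component inputs. Here is a minimal sketch of that fan-out, assuming each target has the form `component.input_name`. `resolve_inputs` is a hypothetical helper written for illustration, not the SuperComponent implementation.

```python
from typing import Any, Dict, List


def resolve_inputs(input_mapping: Dict[str, List[str]], **kwargs: Any) -> Dict[str, Dict[str, Any]]:
    """Expand wrapper-level inputs into per-component pipeline inputs."""
    pipeline_inputs: Dict[str, Dict[str, Any]] = {}
    for name, value in kwargs.items():
        for target in input_mapping[name]:
            # "retriever.query" -> component "retriever", input "query"
            component, input_name = target.split(".", 1)
            pipeline_inputs.setdefault(component, {})[input_name] = value
    return pipeline_inputs


input_mapping = {
    "search_query": ["retriever.query", "prompt_builder.query", "answer_builder.query"]
}
data = resolve_inputs(input_mapping, search_query="What is the capital of France?")
print(data)
```

The resulting nested dictionary has the same shape as the `data` argument normally passed to a pipeline's `run` method, which is why a single `search_query` argument is enough for the wrapper.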
Other Updates:
- Added a debug/tracing script to compare two pipeline runs with the old and new pipeline run logic
- Changed LLMMetadataExtractor to use ChatGenerator instead of Generator
Full Changelog: v0.5.0...v0.6.0
v0.5.0
New Experiments
- New Pipeline class with new pipeline run logic - Pipeline example
Full Changelog: v0.4.0...v0.5.0
🧬 New Pipeline Logic
This release introduces a reimplementation of the pipeline-run logic to resolve multiple issues, improving reliability and performance. These changes will also be included in Haystack 2.10.
Fixed Issues:
1. Exceptions in pipelines with two cycles
   - Pipelines with two cycles sharing an optional (like in PromptBuilder) or a greedy variadic edge (e.g., in BranchJoiner) might raise exceptions. Details here.
2. Incorrect execution in cycles with multiple optional or variadic edges
   - Entry points for cycles were non-deterministic, causing components to run with unexpected inputs or multiple times. This impacted execution time and final outputs.
3. Missing intermediate outputs in cycles
   - Outputs produced within a cycle were overwritten, preventing downstream components from receiving them.
4. Premature execution of lazy variadic components
   - Components like DocumentJoiner sometimes executed before receiving all inputs, leading to repeated partial executions that affected downstream results.
5. Order-sensitive behavior in add_component and connect
   - Some bugs above occurred due to specific orderings of add_component and connect in pipeline creation, causing non-deterministic behavior in cyclic pipelines.
Am I Affected by this Change?
- Non-cyclic pipelines without lazy variadic components: No impact; your pipelines should function as before.
- Non-cyclic pipelines with lazy variadic components: Check inputs and outputs of components like DocumentJoiner for issues #4 and #5. Use LoggingTracer with content tracing to validate behavior. Component execution order now uses lexicographical sorting; rename upstream components if necessary.
- Pipelines with cycles: Review your pipeline outputs as well as the component inputs and outputs to ensure expected behavior, as you may encounter any of the above issues.
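The note about lexicographical sorting matters mostly for component names that contain numbers. Here is a minimal sketch, assuming the order among components that are ready to run follows plain string comparison of their names; the names are hypothetical.

```python
# Hypothetical component names; plain string sorting compares character by character
component_names = ["retriever_2", "retriever_10", "embedder"]
execution_order = sorted(component_names)
print(execution_order)  # ['embedder', 'retriever_10', 'retriever_2']
```

Here `retriever_10` sorts before `retriever_2` because `"1"` compares lower than `"2"`, so zero-padded names such as `retriever_02` are one way to get the intended order.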
Share your comments in discussion #177
v0.4.0
New Experiments
- AsyncPipeline and async-enabled components - AsyncPipeline example
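The v0.6.0 notes describe AsyncPipeline as able to schedule components to run concurrently. The general idea can be pictured with the standard-library `asyncio` module. This is a minimal sketch and not the AsyncPipeline API: `fake_component` and its names are hypothetical stand-ins for async-enabled components.

```python
import asyncio


async def fake_component(name: str, delay: float) -> str:
    """Hypothetical async component: waits like an I/O-bound call, then returns."""
    await asyncio.sleep(delay)
    return name


async def main() -> list:
    # Both coroutines wait at the same time instead of one after the other;
    # gather returns the results in the order the awaitables were passed in.
    return await asyncio.gather(
        fake_component("retriever_a", 0.1),
        fake_component("retriever_b", 0.1),
    )


results = asyncio.run(main())
print(results)
```

Independent components, such as two retrievers that do not feed each other, are the typical case where this kind of scheduling saves wall-clock time.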
Full Changelog: v0.3.0...v0.4.0
v0.3.0
New Experiments
- Metadata extraction with LLM - LLMMetadataExtractor
- Support for tools in OllamaChatGenerator, HuggingFaceAPIChatGenerator, AnthropicChatGenerator
Full Changelog: v0.2.0...v0.3.0