Add observation.metadata.choice_nodes #4422

Merged
merged 16 commits into from
Jun 8, 2025
Changes from 4 commits
3 changes: 3 additions & 0 deletions hypothesis-python/RELEASE.rst
@@ -0,0 +1,3 @@
RELEASE_TYPE: minor

This release adds |OBSERVABILITY_CHOICE_NODES|, a new option for :ref:`observability <observability>`. When set, test case observations include the choice sequence in ``metadata.choice_nodes``.
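
As a rough sketch of how the option is toggled: the diff defines ``OBSERVABILITY_CHOICE_NODES`` as a presence check on the ``HYPOTHESIS_EXPERIMENTAL_OBSERVABILITY_CHOICE_NODES`` environment variable, so the variable's value is never inspected. The helper name ``choice_nodes_enabled`` below is hypothetical and only mirrors that check; it is not part of Hypothesis.

```python
import os

ENV_VAR = "HYPOTHESIS_EXPERIMENTAL_OBSERVABILITY_CHOICE_NODES"


def choice_nodes_enabled(environ=None):
    """Hypothetical helper mirroring the `ENV_VAR in os.environ` check."""
    environ = os.environ if environ is None else environ
    # Presence is all that matters; the value is not parsed.
    return ENV_VAR in environ


print(choice_nodes_enabled({}))              # False
print(choice_nodes_enabled({ENV_VAR: "1"}))  # True
print(choice_nodes_enabled({ENV_VAR: "0"}))  # True: "0" still counts as set
```

Under this reading, the variable has to be unset (not set to ``0``) to disable the option, and it has to be set before the module is imported.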
4 changes: 4 additions & 0 deletions hypothesis-python/docs/prolog.rst
@@ -116,17 +116,21 @@
.. |PrimitiveProvider.draw_string| replace:: :func:`~hypothesis.internal.conjecture.providers.PrimitiveProvider.draw_string`
.. |PrimitiveProvider.draw_bytes| replace:: :func:`~hypothesis.internal.conjecture.providers.PrimitiveProvider.draw_bytes`
.. |PrimitiveProvider.on_observation| replace:: :func:`~hypothesis.internal.conjecture.providers.PrimitiveProvider.on_observation`
.. |PrimitiveProvider.observe_test_case| replace:: :func:`~hypothesis.internal.conjecture.providers.PrimitiveProvider.observe_test_case`
.. |PrimitiveProvider.observe_information_messages| replace:: :func:`~hypothesis.internal.conjecture.providers.PrimitiveProvider.observe_information_messages`
.. |PrimitiveProvider.per_test_case_context_manager| replace:: :func:`~hypothesis.internal.conjecture.providers.PrimitiveProvider.per_test_case_context_manager`
.. |PrimitiveProvider.add_observability_callback| replace:: :data:`~hypothesis.internal.conjecture.providers.PrimitiveProvider.add_observability_callback`

.. |AVAILABLE_PROVIDERS| replace:: :data:`~hypothesis.internal.conjecture.providers.AVAILABLE_PROVIDERS`
.. |TESTCASE_CALLBACKS| replace:: :data:`~hypothesis.internal.observability.TESTCASE_CALLBACKS`
.. |OBSERVABILITY_CHOICE_NODES| replace:: :data:`~hypothesis.internal.observability.OBSERVABILITY_CHOICE_NODES`
.. |BUFFER_SIZE| replace:: :data:`~hypothesis.internal.conjecture.engine.BUFFER_SIZE`
.. |MAX_SHRINKS| replace:: :data:`~hypothesis.internal.conjecture.engine.MAX_SHRINKS`
.. |MAX_SHRINKING_SECONDS| replace:: :data:`~hypothesis.internal.conjecture.engine.MAX_SHRINKING_SECONDS`
.. |BackendCannotProceed| replace:: :exc:`~hypothesis.errors.BackendCannotProceed`

.. |@rule| replace:: :func:`@rule <hypothesis.stateful.rule>`
.. |@precondition| replace:: :func:`@precondition <hypothesis.stateful.precondition>`
.. |RuleBasedStateMachine| replace:: :class:`~hypothesis.stateful.RuleBasedStateMachine`
.. |run_state_machine_as_test| replace:: :func:`~hypothesis.stateful.run_state_machine_as_test`

9 changes: 9 additions & 0 deletions hypothesis-python/docs/reference/integrations.rst
@@ -162,6 +162,15 @@ including Gson in Java, ``JSON.parse()`` in Ruby, and of course in Python.
:hide_key: /additionalProperties, /type


Hypothesis Metadata
^^^^^^^^^^^^^^^^^^^
Member

Let's

  • move the information-message schema first
  • make this a subheading of the test-case schema
  • consider pulling out the choice_nodes and choice_spans to a second subheading and collapsing <details> by default (it's a lot of very-niche-interest text)

Member Author

I'm not sure sphinx has collapsible elements unless we install some third party extension 🤔

Member

It's a cute trick, the built-in only directive allows for arbitrary html:

.. only:: html

   .. raw:: html

      <details>
      <summary>Click to expand</summary>

Put whatever you want here, and in the HTML build only it'll be inside a details tag!

.. only:: html

   .. raw:: html

      </details>

Member Author (@tybug, Jun 6, 2025)

happy to collapse for now, but TBH I think observability might deserve its own page at this point. (I think ghostwriter does as well). IMO a selling point of big api reference pages is that you don't have to worry about things like niche content or collapsing text

~~(which also makes cmd+f harder)~~ TIL cmd+f autoexpands <details>

but still, collapsing makes it easy to miss, and this is a pretty important section for the right subset of people

Member

no, good argument, leave it uncollapsed and we'll move it later.


While the observability format is agnostic to the property-based testing library which generated it, Hypothesis includes specific values in the ``metadata`` key for test cases. You may rely on these being present if and only if the observation was generated by Hypothesis.

.. jsonschema:: ./schema_metadata.json
   :hide_key: /additionalProperties, /type
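
A minimal sketch of how a consumer might read these values from one line of an observations file, assuming one JSON object per line as written by ``_deliver_to_file``. The sample observation and the ``choice_values`` helper are hypothetical; only the ``type``, ``metadata``, and ``choice_nodes`` keys come from the schema in this PR.

```python
import json

# Hypothetical sample line; real observations carry many more keys.
sample_line = json.dumps(
    {
        "type": "test_case",
        "metadata": {
            "choice_nodes": [
                {
                    "type": "boolean",
                    "value": True,
                    "constraints": {"p": 0.5},
                    "was_forced": False,
                },
                {
                    "type": "integer",
                    "value": 3,
                    "constraints": {
                        "min_value": 0,
                        "max_value": 10,
                        "weights": None,
                        "shrink_towards": 0,
                    },
                    "was_forced": False,
                },
            ]
        },
    }
)


def choice_values(line):
    """Return the raw choice values from one observation line, if any."""
    obs = json.loads(line)
    if obs.get("type") != "test_case":
        return []
    # choice_nodes is null unless the option was enabled.
    nodes = obs.get("metadata", {}).get("choice_nodes")
    return [] if nodes is None else [node["value"] for node in nodes]


print(choice_values(sample_line))  # [True, 3]
```

The helper treats a ``null`` ``choice_nodes`` the same as an absent one, since the schema allows ``null`` when the option is not enabled.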


.. _pytest-plugin:

The Hypothesis pytest plugin
2 changes: 1 addition & 1 deletion hypothesis-python/docs/reference/internals.rst
@@ -32,7 +32,7 @@ Observability

.. autodata:: hypothesis.internal.observability.TESTCASE_CALLBACKS
.. autodata:: hypothesis.internal.observability.OBSERVABILITY_COLLECT_COVERAGE

.. autodata:: hypothesis.internal.observability.OBSERVABILITY_CHOICE_NODES

Engine constants
----------------
91 changes: 91 additions & 0 deletions hypothesis-python/docs/reference/schema_metadata.json
@@ -0,0 +1,91 @@
{
"title": "Hypothesis Metadata",
"description": "Hypothesis-specific values included in the ``metadata`` key of observations for test cases.",
"type": "object",
"properties": {
"traceback": {
"type": ["string", "null"],
"description": "The traceback for failing tests, if and only if ``status == \"failed\"``."
},
"reproduction_decorator": {
"type": ["string", "null"],
"description": "The ``@reproduce_failure`` decorator string for failing tests, if and only if ``status == \"failed\"``."
},
"predicates": {
"type": "object",
"description": "The number of times each |assume| and |@precondition| predicate was satisfied (``True``) and not satisfied (``False``).",
"additionalProperties": {
"type": "object",
"properties": {
"satisfied": {
"type": "integer",
"minimum": 0,
"description": "The number of times this predicate was satisfied (``True``)."
},
"unsatisfied": {
"type": "integer",
"minimum": 0,
"description": "The number of times this predicate was not satisfied (``False``)."
}
},
"required": ["satisfied", "unsatisfied"],
"additionalProperties": false
}
},
"backend": {
"type": "object",
"description": "Backend-specific observations from |PrimitiveProvider.observe_test_case| and |PrimitiveProvider.observe_information_messages|."
},
"sys.argv": {
"type": "array",
"items": {"type": "string"},
"description": "The result of ``sys.argv``."
},
"os.getpid()": {
"type": "integer",
"description": "The result of ``os.getpid()``."
},
"imported_at": {
"type": "number",
"description": "The unix timestamp when Hypothesis was imported."
},
"data_status": {
"type": "number",
"enum": [0, 1, 2, 3],
"description": "The internal status of the ConjectureData for this test case. The values are as follows: ``Status.OVERRUN = 0``, ``Status.INVALID = 1``, ``Status.VALID = 2``, and ``Status.INTERESTING = 3``."
},
"interesting_origin": {
"type": ["string", "null"],
"description": "The internal InterestingOrigin object for failing tests, if and only if ``status == \"failed\"``. The ``traceback`` string value is derived from this object."
},
Member

IIRC we already store this in the top-level status_reason key.

[goes and checks] aaaaaannnd, we mishandle non-failing cases, oops.

Member Author

yeah, INVALID and OVERRUN get collapsed together (which is probably-good as a library agnostic format)

Member Author (@tybug, Jun 4, 2025)

sorry, I thought this was status, not status_reason. The actual reason I split this out is because hypofuzz wants the InterestingOrigin object itself, not the str(origin) version in status_reason.

[e: and this means the type here is wrong, fixing..]

[e2: nope, it is a string because we're documenting the file format here, not the in-memory format. Hmm, unfortunate that the two are similar but different...]

"choice_nodes": {
"type": ["array", "null"],
"description": "The sequence of choices made during this test case. This includes the choice value, as well as its constraints and whether it was forced or not. The choice sequence is a relatively low-level implementation detail of Hypothesis, and is exposed here for users building tools or research on top of Hypothesis. See |PrimitiveProvider| for more details about the choice sequence.\n\nOnly present if |OBSERVABILITY_CHOICE_NODES| is set.",
"items": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": ["integer", "float", "string", "bytes", "boolean"],
"description": "The type of choice made. Corresponds to a call to |PrimitiveProvider.draw_integer|, |PrimitiveProvider.draw_float|, |PrimitiveProvider.draw_string|, |PrimitiveProvider.draw_bytes|, or |PrimitiveProvider.draw_boolean|."
},
"value": {
"description": "The value of the choice. Corresponds to the value returned by a ``PrimitiveProvider.draw_*`` method.\n\n``NaN`` float values are returned as ``[\"float\", <float64_int_value>]``, to distinguish ``NaN`` floats with nonstandard bit patterns. Integers with ``abs(value) >= 2**63`` are returned as ``[\"integer\", str(value)]``, for compatibility with tools with integer size limitations. Bytes are returned as ``[\"bytes\", base64.b64encode(value)]``."
},
"constraints": {
"type": "object",
"description": "The constraints for this choice. Corresponds to the constraints passed to a ``PrimitiveProvider.draw_*`` method. ``NaN`` float values, integers with ``abs(value) >= 2**63``, and byte values for constraints are transformed as for the ``value`` attribute."
},
"was_forced": {
"type": "boolean",
"description": "Whether this choice was forced. As an implementation detail, Hypothesis occasionally requires that some choices take on a specific value, for instance to end generation of collection elements early for performance. These values are \"forced\", and have ``was_forced = True``."
}
},
"required": ["type", "value", "constraints", "was_forced"],
"additionalProperties": false
}
}
},
"required": ["traceback", "reproduction_decorator", "predicates", "backend", "sys.argv", "os.getpid()", "imported_at", "data_status", "interesting_origin", "choice_nodes"],
"additionalProperties": false
}
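
To make the ``value`` encoding above concrete, here is a minimal decoding sketch for consumers of the file format. It assumes that any JSON list in a ``value`` position is one of the three tagged forms described in the schema, and that the float payload is the 64-bit integer reinterpretation of the double. ``decode_choice`` is a hypothetical name, not a Hypothesis API.

```python
import base64
import math
import struct


def decode_choice(value):
    """Invert the tagged encodings described for ``value`` (hypothetical helper)."""
    if isinstance(value, list):
        tag, payload = value
        if tag == "integer":
            # Large integers are stored as decimal strings.
            return int(payload)
        if tag == "bytes":
            # Bytes are stored as base64 text.
            return base64.b64decode(payload)
        if tag == "float":
            # NaNs are stored as the integer form of their 64 bits.
            return struct.unpack("!d", struct.pack("!Q", payload))[0]
        raise ValueError(f"unknown tag {tag!r}")
    # Everything else is stored as a plain JSON value.
    return value


print(decode_choice(["integer", "9223372036854775808"]))       # 9223372036854775808
print(decode_choice(["bytes", "AAE="]))                        # b'\x00\x01'
print(math.isnan(decode_choice(["float", 0x7FF8000000000001])))  # True
print(decode_choice("plain"))                                  # plain
```

The same function could be applied to the entries of ``constraints``, since the schema says those are transformed the same way.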
152 changes: 144 additions & 8 deletions hypothesis-python/src/hypothesis/internal/observability.py
@@ -10,7 +10,10 @@

"""Observability tools to spit out analysis-ready tables, one row per test case."""

import base64
import dataclasses
import json
import math
import os
import sys
import time
@@ -20,10 +23,24 @@
from dataclasses import dataclass
from datetime import date, timedelta
from functools import lru_cache
-from typing import TYPE_CHECKING, Any, Callable, Literal, Optional, Union
+from typing import TYPE_CHECKING, Any, Callable, Literal, Optional, Union, cast

from hypothesis.configuration import storage_directory
from hypothesis.errors import HypothesisWarning
from hypothesis.internal.conjecture.choice import (
    BooleanConstraints,
    BytesConstraints,
    ChoiceConstraintsT,
    ChoiceNode,
    ChoiceT,
    ChoiceTypeT,
    FloatConstraints,
    IntegerConstraints,
    StringConstraints,
)
from hypothesis.internal.escalation import InterestingOrigin
from hypothesis.internal.floats import float_to_int
from hypothesis.internal.intervalsets import IntervalSet

if TYPE_CHECKING:
    from typing import TypeAlias
@@ -43,6 +60,93 @@ def update_count(self, *, condition: bool) -> None:
self.unsatisfied += 1


def _choice_to_json(choice: Union[ChoiceT, None]) -> Any:
    if choice is None:
        return None
    # see the note on the same check in to_jsonable for why large integers
    # need special handling. Here we serialize them as strings instead of
    # casting to floats.
    if (
        isinstance(choice, int)
        and not isinstance(choice, bool)
        and abs(choice) >= 2**63
    ):
        return ["integer", str(choice)]
    elif isinstance(choice, bytes):
        return ["bytes", base64.b64encode(choice).decode()]
    elif isinstance(choice, float) and math.isnan(choice):
        # handle nonstandard nan bit patterns. We don't need to do this for -0.0
        # vs 0.0 since json doesn't normalize -0.0 to 0.0.
        return ["float", float_to_int(choice)]
    return choice


def choices_to_json(choices: tuple[ChoiceT, ...]) -> list[Any]:
    return [_choice_to_json(choice) for choice in choices]
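
A short illustration of how the string encoding used by ``_choice_to_json`` differs from a float cast for large integers. That exactness is the motivation is an assumption on my part; the diff does not state it.

```python
big = 2**63 + 1

# A float has a 53-bit significand, so the cast rounds to a nearby value.
via_float = int(float(big))
# A decimal string round-trips exactly.
via_string = int(str(big))

print(via_float == big)   # False
print(via_string == big)  # True
```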


def _constraints_to_json(
    choice_type: ChoiceTypeT, constraints: ChoiceConstraintsT
) -> dict[str, Any]:
    constraints = constraints.copy()
    if choice_type == "integer":
        constraints = cast(IntegerConstraints, constraints)
        return {
            "min_value": _choice_to_json(constraints["min_value"]),
            "max_value": _choice_to_json(constraints["max_value"]),
            "weights": (
                None
                if constraints["weights"] is None
                # wrap up in a list, instead of a dict, because json dicts
                # require string keys
                else [
                    (_choice_to_json(k), v) for k, v in constraints["weights"].items()
                ]
            ),
            "shrink_towards": _choice_to_json(constraints["shrink_towards"]),
        }
    elif choice_type == "float":
        constraints = cast(FloatConstraints, constraints)
        return {
            "min_value": _choice_to_json(constraints["min_value"]),
            "max_value": _choice_to_json(constraints["max_value"]),
            "allow_nan": constraints["allow_nan"],
            "smallest_nonzero_magnitude": constraints["smallest_nonzero_magnitude"],
        }
    elif choice_type == "string":
        constraints = cast(StringConstraints, constraints)
        assert isinstance(constraints["intervals"], IntervalSet)
        return {
            "intervals": constraints["intervals"].intervals,
            "min_size": _choice_to_json(constraints["min_size"]),
            "max_size": _choice_to_json(constraints["max_size"]),
        }
    elif choice_type == "bytes":
        constraints = cast(BytesConstraints, constraints)
        return {
            "min_size": _choice_to_json(constraints["min_size"]),
            "max_size": _choice_to_json(constraints["max_size"]),
        }
    elif choice_type == "boolean":
        constraints = cast(BooleanConstraints, constraints)
        return {
            "p": constraints["p"],
        }
    else:
        raise NotImplementedError(f"unknown choice type {choice_type}")
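
The comment about wrapping ``weights`` in a list can be seen with the standard ``json`` module alone. The ``weights`` dict below is a made-up example, not taken from Hypothesis.

```python
import json

# Hypothetical integer weights: value -> probability.
weights = {1: 0.25, 2: 0.75}

# As a dict, the integer keys come back as strings after a round trip.
as_dict = json.loads(json.dumps(weights))
# As a list of pairs, the keys keep their JSON number type.
as_pairs = json.loads(json.dumps(list(weights.items())))

print(as_dict)   # {'1': 0.25, '2': 0.75}
print(as_pairs)  # [[1, 0.25], [2, 0.75]]
```

A consumer can rebuild the mapping with ``{k: v for k, v in as_pairs}``, after decoding any tagged keys.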


def nodes_to_json(nodes: tuple[ChoiceNode, ...]) -> list[dict[str, Any]]:
    return [
        {
            "type": node.type,
            "value": _choice_to_json(node.value),
            "constraints": _constraints_to_json(node.type, node.constraints),
            "was_forced": node.was_forced,
        }
        for node in nodes
    ]


@dataclass
class ObservationMetadata:
    traceback: Optional[str]
@@ -52,6 +156,28 @@ class ObservationMetadata:
    sys_argv: list[str]
    os_getpid: int
    imported_at: float
    data_status: "Status"
    interesting_origin: Optional[InterestingOrigin]
    choice_nodes: Optional[tuple[ChoiceNode, ...]]

    def to_json(self) -> dict[str, Any]:
        data = {
            "traceback": self.traceback,
            "reproduction_decorator": self.reproduction_decorator,
            "predicates": self.predicates,
            "backend": self.backend,
            "sys.argv": self.sys_argv,
            "os.getpid()": self.os_getpid,
            "imported_at": self.imported_at,
            "data_status": self.data_status,
            "interesting_origin": self.interesting_origin,
            "choice_nodes": (
                None if self.choice_nodes is None else nodes_to_json(self.choice_nodes)
            ),
        }
        # check that we didn't forget one
        assert len(data) == len(dataclasses.fields(self))
        return data


@dataclass
@@ -183,6 +309,9 @@ def make_testcase(
),
"predicates": dict(data._observability_predicates),
"backend": backend_metadata or {},
"data_status": data.status,
"interesting_origin": data.interesting_origin,
"choice_nodes": data.nodes if OBSERVABILITY_CHOICE_NODES else None,
**_system_metadata(),
# unpack last so it takes precedence for duplicate keys
**(metadata or {}),
@@ -204,11 +333,7 @@ def _deliver_to_file(observation: Observation) -> None: # pragma: no cover
     fname.parent.mkdir(exist_ok=True, parents=True)
     _WROTE_TO.add(fname)
     with fname.open(mode="a") as f:
-        obs_json: dict[str, Any] = to_jsonable(observation, avoid_realization=False) # type: ignore
-        if obs_json["type"] == "test_case":
-            obs_json["metadata"]["sys.argv"] = obs_json["metadata"].pop("sys_argv")
-            obs_json["metadata"]["os.getpid()"] = obs_json["metadata"].pop("os_getpid")
-        f.write(json.dumps(obs_json) + "\n")
+        f.write(json.dumps(to_jsonable(observation, avoid_realization=False)) + "\n")


_imported_at = time.time()
@@ -231,6 +356,15 @@ def _system_metadata() -> dict[str, Any]:
OBSERVABILITY_COLLECT_COVERAGE = (
"HYPOTHESIS_EXPERIMENTAL_OBSERVABILITY_NOCOVER" not in os.environ
)
#: If ``True``, include the ``metadata.choice_nodes`` key in test case
#: observations.
#:
#: ``False`` by default. ``metadata.choice_nodes`` can be a substantial amount
#: of data, and so must be opted in to, even when observability is enabled.
OBSERVABILITY_CHOICE_NODES = (
    "HYPOTHESIS_EXPERIMENTAL_OBSERVABILITY_CHOICE_NODES" in os.environ
)

if OBSERVABILITY_COLLECT_COVERAGE is False and (
sys.version_info[:2] >= (3, 12)
): # pragma: no cover
@@ -240,8 +374,10 @@ def _system_metadata() -> dict[str, Any]:
HypothesisWarning,
stacklevel=2,
)
-if "HYPOTHESIS_EXPERIMENTAL_OBSERVABILITY" in os.environ or (
-    OBSERVABILITY_COLLECT_COVERAGE is False
+
+if (
+    "HYPOTHESIS_EXPERIMENTAL_OBSERVABILITY" in os.environ
+    or OBSERVABILITY_COLLECT_COVERAGE is False
 ): # pragma: no cover
     TESTCASE_CALLBACKS.append(_deliver_to_file)

@@ -166,6 +166,10 @@ def to_jsonable(obj: object, *, avoid_realization: bool) -> object:
known types.
"""
if isinstance(obj, (str, int, float, bool, type(None))):
# We convert integers with abs(value) >= 2**63 to floats, to avoid crashing
# external utilities with a 64 bit integer cap (notably, sqlite). See
# https://github.com/HypothesisWorks/hypothesis/pull/3797#discussion_r1413425110
# and https://github.com/simonw/sqlite-utils/issues/605.
if isinstance(obj, int) and not isinstance(obj, bool) and abs(obj) >= 2**63:
# Silently clamp very large ints to max_float, to avoid OverflowError when
# casting to float. (but avoid adding more constraints to symbolic values)
6 changes: 6 additions & 0 deletions hypothesis-python/tests/conjecture/common.py
@@ -159,6 +159,12 @@ def draw_value(choice_type, constraints):
return getattr(data, f"draw_{choice_type}")(**constraints)


@st.composite
def choices(draw):
    (choice_type, constraints) = draw(choice_types_constraints())
    return draw_value(choice_type, constraints)


@st.composite
def nodes(draw, *, was_forced=None, choice_types=None):
if choice_types is None: