Skip to content

feat: add TimeBinConfig for span and trace count time series #8332

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Jul 14, 2025

Conversation

RogerHYang
Copy link
Contributor

@RogerHYang RogerHYang commented Jun 30, 2025

resolves #8079

Changes

  • Configurable time binning: Support for minute, hour, day, week, month, year granularity
  • Timezone offset support: UTC offset configuration for local time binning
  • Cross-database compatibility: New date_trunc() helper works with PostgreSQL and SQLite

@RogerHYang RogerHYang requested review from a team as code owners June 30, 2025 16:19
@github-project-automation github-project-automation bot moved this to 📘 Todo in phoenix Jun 30, 2025
@dosubot dosubot bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label Jun 30, 2025
Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @RogerHYang - I've reviewed your changes and they look great!

Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments

### Comment 1
<location> `src/phoenix/server/api/types/Project.py:727` </location>
<code_context>
+            interval_seconds=interval_seconds,
+            dialect=info.context.db.dialect,
+        )
         stmt = (
-            select(hour, func.count())
+            select(bucket_stop_time, func.count(models.Span.id))
</code_context>

<issue_to_address>
The use of .filter_by(project_rowid=self.project_rowid) in trace_count_time_series may be inconsistent with .where(models.Trace.project_rowid == self.project_rowid) in span_count_time_series.

Consider standardizing the filtering approach in both methods to improve readability and avoid potential differences in query behavior.
</issue_to_address>

### Comment 2
<location> `src/phoenix/server/api/input_types/Granularity.py:43` </location>
<code_context>
         ),
     )

+    def __post_init__(self) -> None:
+        if isinstance(self.sampling_interval_minutes, int) and self.sampling_interval_minutes <= 0:
+            raise BadRequest(
</code_context>

<issue_to_address>
The error message for evaluation_window_minutes does not include the actual value.

Include the actual value in the evaluation_window_minutes error message for consistency and easier debugging.
</issue_to_address>

<suggested_fix>
<<<<<<< SEARCH
        if isinstance(self.evaluation_window_minutes, int) and self.evaluation_window_minutes <= 0:
            raise BadRequest(
                "evaluation_window_minutes must be greater than 0, got 0.",
            )
=======
        if isinstance(self.evaluation_window_minutes, int) and self.evaluation_window_minutes <= 0:
            raise BadRequest(
                f"evaluation_window_minutes must be greater than 0, got {self.evaluation_window_minutes}.",
            )
>>>>>>> REPLACE

</suggested_fix>

### Comment 3
<location> `src/phoenix/server/api/utils.py:87` </location>
<code_context>
+        >>> print(f"Stop: {stop_time}, Interval: {interval}s")
+        Stop: 2023-01-01 13:00:00+00:00, Interval: 1800s
+    """
+    if not granularity:
+        interval_seconds = 3600  # Default to hourly granularity
+    else:
</code_context>

<issue_to_address>
The function get_parameters_for_simple_time_series does not validate that time_range.start is before time_range.end.

Add a check to ensure time_range.start is not after time_range.end to prevent unexpected or empty results.
</issue_to_address>

<suggested_fix>
<<<<<<< SEARCH
    if not granularity:
        interval_seconds = 3600  # Default to hourly granularity
    else:
=======
    if time_range.start > time_range.end:
        raise BadRequest("time_range.start must not be after time_range.end")
    if not granularity:
        interval_seconds = 3600  # Default to hourly granularity
    else:
>>>>>>> REPLACE

</suggested_fix>

### Comment 4
<location> `tests/unit/server/api/types/test_Project.py:1688` </location>
<code_context>
+        """
+        project = _time_series_data.projects[0]
+
+        for time_range, granularity, test_desc in [
+            (
+                TimeRange(
</code_context>

<issue_to_address>
Consider adding a test case for a single-point interval (e.g., 1-minute granularity).

Adding a test for the smallest granularity will help verify correct handling of high-resolution intervals and catch potential performance or rounding issues.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

RogerHYang and others added 3 commits June 30, 2025 09:46
Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com>
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
@RogerHYang RogerHYang marked this pull request as draft June 30, 2025 18:25
@RogerHYang RogerHYang changed the title feat: add granularity for span and trace count time series feat: add TimeBinConfig for span and trace count time series Jul 14, 2025
@RogerHYang RogerHYang marked this pull request as ready for review July 14, 2025 19:14
cursor[bot]

This comment was marked as outdated.

@github-project-automation github-project-automation bot moved this from 📘 Todo to 👍 Approved in phoenix Jul 14, 2025
cursor[bot]

This comment was marked as outdated.

cursor[bot]

This comment was marked as outdated.

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: PostgreSQL `date_trunc` Timezone Format Error

The date_trunc function for PostgreSQL generates incorrect timezone strings. The sign logic is inverted and the hour component is not zero-padded, leading to formats like '-1:00' for UTC+1 instead of the expected '+01:00'. This results in incorrect time truncation.

src/phoenix/db/helpers.py#L237-L240

# (1 row)
sign = "-" if utc_offset_minutes >= 0 else "+"
timezone = f"{sign}{abs(utc_offset_minutes) // 60}:{abs(utc_offset_minutes) % 60:02d}"
return sa.func.date_trunc(field, source, timezone)

Fix in CursorFix in Web


Was this report helpful? Give feedback by reacting with 👍 or 👎

@RogerHYang RogerHYang merged commit f0943d8 into main Jul 14, 2025
51 of 53 checks passed
@RogerHYang RogerHYang deleted the trace-count-time-series branch July 14, 2025 22:33
@github-project-automation github-project-automation bot moved this from 👍 Approved to ✅ Done in phoenix Jul 14, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size:XXL This PR changes 1000+ lines, ignoring generated files.
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

[const][gql] traces over time resolver
2 participants