Fix circular mean by always storing and using the weighted one #142208

Merged
frenck merged 4 commits into dev from edenhaus-always-weighted-circular-mean
Apr 4, 2025

edenhaus
Member

@edenhaus edenhaus commented Apr 3, 2025

Breaking change

Proposed change

  • Store the mean_weight for the circular mean stats in the short term table
  • Always use the mean_weight to calculate the new circular mean
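
As a rough illustration of why the stored weight matters when short-term rows are combined into a longer period, here is a minimal sketch of a weighted circular mean. The function name `weighted_circular_mean_sketch` and the choice to return the length of the summed vector as the combined weight are assumptions made for this example; it is not the recorder's actual implementation.

```python
import math
from collections.abc import Iterable


def weighted_circular_mean_sketch(
    values: Iterable[tuple[float, float]],
) -> tuple[float, float]:
    """Combine (angle in degrees, weight) pairs into one (mean, weight) pair.

    Hypothetical helper for illustration only.
    """
    sin_sum = 0.0
    cos_sum = 0.0
    for angle_deg, weight in values:
        rad = math.radians(angle_deg)
        # Each angle contributes a vector scaled by its weight.
        sin_sum += math.sin(rad) * weight
        cos_sum += math.cos(rad) * weight
    mean_deg = math.degrees(math.atan2(sin_sum, cos_sum)) % 360
    # Assumption: the combined weight is the length of the summed vector.
    combined_weight = math.hypot(sin_sum, cos_sum)
    return mean_deg, combined_weight


# Two short-term rows pointing in opposite directions. Without weights they
# would cancel out; with weights the heavier row dominates.
mean, weight = weighted_circular_mean_sketch([(90.0, 3.0), (270.0, 1.0)])
print(mean, weight)  # approximately 90.0 and 2.0
```

If every row were treated as having the same weight, a row that summarises a nearly constant direction and a row that summarises a widely scattered one would count equally, which is the kind of distortion the two bullets above aim to avoid.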

Type of change

  • Dependency upgrade
  • Bugfix (non-breaking change which fixes an issue)
  • New integration (thank you!)
  • New feature (which adds functionality to an existing integration)
  • Deprecation (breaking change to happen in the future)
  • Breaking change (fix/feature causing existing functionality to break)
  • Code quality improvements to existing code or addition of tests

Additional information

  • This PR fixes or closes issue: fixes #
  • This PR is related to issue:
  • Link to documentation pull request:
  • Link to developer documentation pull request:
  • Link to frontend pull request:

Checklist

  • The code change is tested and works locally.
  • Local tests pass. Your PR cannot be merged unless tests pass
  • There is no commented out code in this PR.
  • I have followed the development checklist
  • I have followed the perfect PR recommendations
  • The code has been formatted using Ruff (ruff format homeassistant tests)
  • Tests have been added to verify that the new code works.

If user exposed functionality or configuration variables are added/changed:

If the code communicates with devices, web services, or third-party tools:

  • The manifest file has all fields filled out correctly.
    Updated and included derived files by running: python3 -m script.hassfest.
  • New or updated dependencies have been added to requirements_all.txt.
    Updated by running python3 -m script.gen_requirements_all.
  • For the updated dependencies - a link to the changelog, or at minimum a diff between library versions is added to the PR description.

To help with the load of incoming pull requests:

@home-assistant

home-assistant bot commented Apr 3, 2025

Hey there @home-assistant/core, mind taking a look at this pull request as it has been labeled with an integration (recorder) you are listed as a code owner for? Thanks!

Code owner commands

Code owners of recorder can trigger bot actions by commenting:

  • @home-assistant close Closes the pull request.
  • @home-assistant rename Awesome new title Renames the pull request.
  • @home-assistant reopen Reopen the pull request.
  • @home-assistant unassign recorder Removes the current integration label and assignees on the pull request, add the integration domain after the command.
  • @home-assistant add-label needs-more-information Add a label (needs-more-information, problem in dependency, problem in custom component) to the pull request.
  • @home-assistant remove-label needs-more-information Remove a label (needs-more-information, problem in dependency, problem in custom component) on the pull request.

@home-assistant

home-assistant bot commented Apr 3, 2025

Hey there @home-assistant/core, mind taking a look at this pull request as it has been labeled with an integration (sensor) you are listed as a code owner for? Thanks!

Code owner commands

Code owners of sensor can trigger bot actions by commenting:

  • @home-assistant close Closes the pull request.
  • @home-assistant rename Awesome new title Renames the pull request.
  • @home-assistant reopen Reopen the pull request.
  • @home-assistant unassign sensor Removes the current integration label and assignees on the pull request, add the integration domain after the command.
  • @home-assistant add-label needs-more-information Add a label (needs-more-information, problem in dependency, problem in custom component) to the pull request.
  • @home-assistant remove-label needs-more-information Remove a label (needs-more-information, problem in dependency, problem in custom component) on the pull request.

@edenhaus edenhaus requested a review from emontnemery April 3, 2025 17:12
Contributor

@emontnemery emontnemery left a comment

The code change looks good, OK to merge unless the statistic.get("mean_weight") or 0.0 is there to handle old rows where the weight is NULL.

If that's the case, I think it's better to bump the schema and remove all rows for circular mean so we don't need to handle the NULL.

Also, I think we should improve the tests:

  • Instead of calculating the circular averages in the same way in tests as we do in the implementation, the tests should calculate the circular averages only based on the states
  • Update tests to check that what we insert in the weight column is as expected
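
One way a test could derive the expected value from the states alone is sketched below. The helper name `expected_circular_mean`, the `(start_time, angle)` input shape, and the use of each state's duration as its weight are all assumptions for illustration; the real tests and fixtures may look quite different.

```python
import math


def expected_circular_mean(
    states: list[tuple[float, float]], period_end: float
) -> float:
    """Return the duration-weighted circular mean in degrees.

    `states` holds (start_time_seconds, angle_degrees) pairs sorted by time.
    Hypothetical test helper, independent of the code under test.
    """
    sin_sum = 0.0
    cos_sum = 0.0
    for index, (start, angle) in enumerate(states):
        # A state lasts until the next state starts, or until the period ends.
        end = states[index + 1][0] if index + 1 < len(states) else period_end
        duration = end - start
        sin_sum += math.sin(math.radians(angle)) * duration
        cos_sum += math.cos(math.radians(angle)) * duration
    return math.degrees(math.atan2(sin_sum, cos_sum)) % 360


# 240 s at 0 degrees followed by 60 s at 90 degrees within a 300 s period.
print(expected_circular_mean([(0.0, 0.0), (240.0, 90.0)], 300.0))  # about 14.04
```

Because the helper never touches intermediate short-term rows, an error in how the implementation stores or combines weights would show up as a mismatch instead of being reproduced on both sides of the assertion.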

@@ -1063,7 +1068,8 @@ def _reduce_statistics(
                max_values.append(_max)
            if _want_mean:
                if (_mean := statistic.get("mean")) is not None:
                    mean_values.append(_mean)
                    _mean_weight = statistic.get("mean_weight") or 0.0
Contributor

Is the get here needed to handle rows created before this patch was merged? If yes, I think it's better to just wipe the rows so we don't need to deal with this forever.

Member Author

As the column mean_weight is defined as float | None, we also need to handle the case where it's None, which occurs when the mean_type is arithmetic.

An alternative could be:

diff --git a/homeassistant/components/recorder/statistics.py b/homeassistant/components/recorder/statistics.py
index 80c0028ef7a..e168cbd92d6 100644
--- a/homeassistant/components/recorder/statistics.py
+++ b/homeassistant/components/recorder/statistics.py
@@ -1025,7 +1025,8 @@ def _reduce_statistics(
     _want_sum = "sum" in types
     for statistic_id, stat_list in stats.items():
         max_values: list[float] = []
-        mean_values: list[tuple[float, float]] = []
+        arithmetic_mean_values: list[float] = []
+        circular_mean_values: list[tuple[float, float]] = []
         min_values: list[float] = []
         prev_stat: StatisticsRow = stat_list[0]
         fake_entry: StatisticsRow = {"start": stat_list[-1]["start"] + period_seconds}
@@ -1042,15 +1043,14 @@ def _reduce_statistics(
                 if _want_mean:
                     row["mean"] = None
                     row["mean_weight"] = None
-                    if mean_values:
-                        match metadata[statistic_id][1]["mean_type"]:
-                            case StatisticMeanType.ARITHMETIC:
-                                row["mean"] = mean([x[0] for x in mean_values])
-                            case StatisticMeanType.CIRCULAR:
-                                row["mean"], row["mean_weight"] = (
-                                    weighted_circular_mean(mean_values)
-                                )
-                    mean_values.clear()
+                    if arithmetic_mean_values:
+                        row["mean"] = mean(arithmetic_mean_values)
+                    if circular_mean_values:
+                        row["mean"], row["mean_weight"] = weighted_circular_mean(
+                            circular_mean_values
+                        )
+                    arithmetic_mean_values.clear()
+                    circular_mean_values.clear()
                 if _want_min:
                     row["min"] = min(min_values) if min_values else None
                     min_values.clear()
@@ -1068,8 +1068,14 @@ def _reduce_statistics(
                 max_values.append(_max)
             if _want_mean:
                 if (_mean := statistic.get("mean")) is not None:
-                    _mean_weight = statistic.get("mean_weight") or 0.0
-                    mean_values.append((_mean, _mean_weight))
+                    match metadata[statistic_id][1]["mean_type"]:
+                        case StatisticMeanType.ARITHMETIC:
+                            arithmetic_mean_values.append(_mean)
+                        case StatisticMeanType.CIRCULAR:
+                            if (
+                                _mean_weight := statistic.get("mean_weight")
+                            ) is not None:
+                                circular_mean_values.append((_mean, _mean_weight))
             if _want_min and (_min := statistic.get("min")) is not None:
                 min_values.append(_min)
             prev_stat = statistic

Contributor

As the column mean_weight is defined as float | None

But we ourselves insert the rows, and for circular mean we know we always insert the mean and the weight.
But I guess it makes sense to not blow up if there's invalid data in the database?

I like the proposed new version better, since then we don't convert an invalid None to a valid 0.0.
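
The difference between the two approaches discussed in this thread can be sketched with plain dictionaries. The row shape and the helper names `collect_with_or` and `collect_skipping_none` are hypothetical and only mirror the `mean` and `mean_weight` keys from the diff above.

```python
rows = [
    {"mean": 10.0, "mean_weight": 2.0},
    {"mean": 200.0, "mean_weight": None},  # invalid for a circular mean
    {"mean": 20.0, "mean_weight": 0.0},  # a genuine zero weight
]


def collect_with_or(rows: list[dict]) -> list[tuple[float, float]]:
    """Mimic `statistic.get("mean_weight") or 0.0`: None silently becomes 0.0."""
    return [
        (row["mean"], row.get("mean_weight") or 0.0)
        for row in rows
        if row.get("mean") is not None
    ]


def collect_skipping_none(rows: list[dict]) -> list[tuple[float, float]]:
    """Mimic the alternative: rows without a weight are left out entirely."""
    collected = []
    for row in rows:
        if (mean := row.get("mean")) is None:
            continue
        if (weight := row.get("mean_weight")) is None:
            continue
        collected.append((mean, weight))
    return collected


print(collect_with_or(rows))  # [(10.0, 2.0), (200.0, 0.0), (20.0, 0.0)]
print(collect_skipping_none(rows))  # [(10.0, 2.0), (20.0, 0.0)]
```

With the `or 0.0` form, the invalid row becomes indistinguishable from a row that legitimately carries zero weight; the explicit `is not None` check keeps the two cases apart.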

@emontnemery
Contributor

The test tests/components/tibber/test_statistics.py::test_async_setup_entry is failing

@edenhaus
Member Author

edenhaus commented Apr 4, 2025

Update tests to check that what we insert in the weight column is as expected

This is done implicitly: the expected output is calculated as a weighted mean, so if we did not store the weight in the db, these rows would be skipped and the expected output would no longer match.

@emontnemery
Contributor

Update tests to check that what we insert in the weight column is as expected

This is done implicitly: the expected output is calculated as a weighted mean, so if we did not store the weight in the db, these rows would be skipped and the expected output would no longer match.

That kind of implicit testing is not good enough IMO. But OK to merge this PR without it.

@frenck
Member

frenck commented Apr 4, 2025

I'm going to cut the release, so I'm getting this in; tests can be improved later on. The CI failure isn't related.

@frenck frenck merged commit 64e1735 into dev Apr 4, 2025
43 of 44 checks passed
@frenck frenck deleted the edenhaus-always-weighted-circular-mean branch April 4, 2025 19:19
frenck pushed a commit that referenced this pull request Apr 4, 2025
* Fix circular mean by always storing and using the weighted one

* fix

* Fix test
@frenck frenck mentioned this pull request Apr 4, 2025
frenck added a commit that referenced this pull request Apr 4, 2025
* Fix blocking event loop - daikin (#141442)

* fix blocking event loop

* create ssl_context directly

* update manifest

* update manifest.json

* Made Google Search enable dependent on Assist availability (#141712)

* Made Google Search enable dependent on Assist availability

* Show error instead of rendering again

* Cleanup test code

* Fix humidifier platform for Comelit (#141854)

* Fix humidifier platform for Comelit

* apply review comment

* Bump evohome-async to 1.0.5 (#141871)

bump client to 1.0.5

* Replace "to log into" with "to log in to" in `incomfort` (#142060)

* Replace "to log into" with "to log in to" in `incomfort`

Also fix one missing sentence-casing of "gateway".

* Replace duplicate "data_description" strings with references

* Avoid unnecessary reload in apple_tv reauth flow (#142079)

* Add translation for hassio update entity name (#142090)

* Bump pyenphase to 1.25.5 (#142107)

* Hide broken ZBT-1 config entries on the hardware page (#142110)

* Hide bad ZBT-1 config entries on the hardware page

* Set up the bad config entry in the unit test

* Roll into a list comprehension

* Remove constant changes

* Fix condition in unit test

* Bump pysmhi to 1.0.1 (#142111)

* Avoid logging a warning when replacing an ignored config entry (#142114)

Replacing an ignored config entry with one from the user
flow should not generate a warning. We should only warn
if we are replacing a usable config entry.

Followup to adjust the warning added in #130567
cc @epenet

* Slow down polling in Tesla Fleet (#142130)

* Slow down polling

* Fix tests

* Bump tesla-fleet-api to v1.0.17 (#142131)

bump

* Tado bump to 0.18.11 (#142175)

* Bump to version 0.18.11

* Adding hassfest files

* Add preset mode to SmartThings climate (#142180)

* Add preset mode to SmartThings climate

* Add preset mode to SmartThings climate

* Do not create a HA mediaplayer for the builtin Music Assistant player (#142192)

Do not create a HA mediaplayer for the builtin Music player

* Do not fetch disconnected Home Connect appliances (#142200)

* Do not fetch disconnected Home Connect appliances

* Apply suggestions

Co-authored-by: Martin Hjelmare

* Update docstring

---------

Co-authored-by: Martin Hjelmare

* Fix fibaro setup (#142201)

* Fix circular mean by always storing and using the weighted one (#142208)

* Fix circular mean by always storing and using the weighted one

* fix

* Fix test

* Bump pySmartThings to 3.0.2 (#142257)

Co-authored-by: Robert Resch

* Update frontend to 20250404.0 (#142274)

* Bump forecast-solar lib to v4.1.0 (#142280)

Co-authored-by: Jan-Philipp Benecke

* Bump version to 2025.4.1

* Fix skyconnect tests (#142262)

fix tests

* Fix empty actions (#142292)

* Apply fix

* Add tests for alarm button cover lock

* update light

* add number tests

* test select

* add switch tests

* test vacuum

* update lock test

---------

Co-authored-by: Fredrik Erlandsson
Co-authored-by: Ivan Lopez Hernandez
Co-authored-by: Simone Chemelli
Co-authored-by: David Bonnes
Co-authored-by: Norbert Rittel
Co-authored-by: Erik Montnemery
Co-authored-by: Paul Bottein
Co-authored-by: Arie Catsman
Co-authored-by: puddly
Co-authored-by: G Johansson
Co-authored-by: J. Nick Koston
Co-authored-by: Brett Adams
Co-authored-by: Erwin Douna
Co-authored-by: Joost Lekkerkerker
Co-authored-by: Marcel van der Veldt
Co-authored-by: J. Diego Rodríguez Royo
Co-authored-by: Martin Hjelmare
Co-authored-by: rappenze
Co-authored-by: Robert Resch
Co-authored-by: Bram Kragten
Co-authored-by: Klaas Schoute
Co-authored-by: Jan-Philipp Benecke
Co-authored-by: Josef Zweck
Co-authored-by: Petro31
@github-actions github-actions bot locked and limited conversation to collaborators Apr 5, 2025