Metrics fixes and cleanup #2325

Merged
merged 2 commits into from
Apr 22, 2023

Conversation

JonathanWenger
Collaborator

In a Nutshell

This PR fixes the implementation of the mean standardized log loss to match its definition and adds some documentation.

Detailed

The mean standardized log loss, as defined by Rasmussen and Williams (2006), compares the average log loss to that of a trivial model built from the summary statistics of the training data. This standardization was missing from the implementation.
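To make the standardization concrete, here is a minimal NumPy sketch of the definition, not the code changed in this PR. It assumes Gaussian predictive marginals given as per-point means and variances, and it takes the trivial model to be a single Gaussian with the mean and variance of the training targets. The names `gaussian_nll` and `msll_sketch` are hypothetical.

```python
import numpy as np


def gaussian_nll(y, mean, var):
    """Pointwise negative log density of y under a Gaussian N(mean, var)."""
    return 0.5 * np.log(2.0 * np.pi * var) + (y - mean) ** 2 / (2.0 * var)


def msll_sketch(pred_mean, pred_var, test_y, train_y):
    """Mean standardized log loss (illustrative, assumes Gaussian predictions).

    The log loss of the model is standardized by subtracting the log loss
    of a trivial Gaussian fitted to the training targets.
    """
    model_loss = gaussian_nll(test_y, pred_mean, pred_var)
    # Trivial baseline: one Gaussian with the training mean and variance.
    trivial_loss = gaussian_nll(test_y, train_y.mean(), train_y.var())
    return float(np.mean(model_loss - trivial_loss))
```

Under these assumptions, a model that only reproduces the training mean and variance scores approximately zero, and better models score negative values.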

[Image: Screenshot from 2023-04-21 19-31-05]

Further, I added the standardized mean squared error to allow for better comparison across experiments.
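As a rough illustration of why this metric helps comparison across experiments, the sketch below divides the mean squared error by the variance of the test targets, which removes the scale of the targets. This is an assumption about the intended definition (following the common convention), written in plain NumPy, and `smse_sketch` is a hypothetical name rather than the function added here.

```python
import numpy as np


def smse_sketch(pred_mean, test_y):
    """Standardized mean squared error (illustrative).

    Assumes the MSE is normalized by the (population) variance of the
    test targets, so the result does not depend on the target scale.
    """
    mse = np.mean((test_y - pred_mean) ** 2)
    return float(mse / np.var(test_y))
```

With this normalization, a constant prediction at the mean of the targets scores about 1, and a perfect prediction scores 0.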

Comment on lines +130 to +131
def test_standardized_mean_squared_error(self):
    self._test_metric(standardized_mean_squared_error)
Collaborator

Hmm, this is kind of an interesting test in that it only executes the metric code without really checking what the computation does. Not really in scope for this PR, but this should probably do something more reasonable...

Collaborator Author

Agreed. Treating this as out of scope for this PR for now, but I created an issue for it: #2326.
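For reference, one possible shape for a test that checks values rather than only executing the metric is sketched below. It is not the test from this PR or from #2326. `smse_reference` is a hypothetical NumPy stand-in for the metric under test, and the expected values follow from that stand-in's definition (MSE divided by the variance of the test targets).

```python
import unittest

import numpy as np


def smse_reference(pred_mean, test_y):
    """Hypothetical stand-in for the metric under test."""
    return float(np.mean((test_y - pred_mean) ** 2) / np.var(test_y))


class TestStandardizedMetricValues(unittest.TestCase):
    def test_perfect_prediction_scores_zero(self):
        test_y = np.array([1.0, 2.0, 3.0, 4.0])
        self.assertAlmostEqual(smse_reference(test_y, test_y), 0.0)

    def test_constant_mean_prediction_scores_one(self):
        test_y = np.array([1.0, 2.0, 3.0, 4.0])
        # Predicting the target mean everywhere gives MSE == Var(test_y).
        pred = np.full_like(test_y, test_y.mean())
        self.assertAlmostEqual(smse_reference(pred, test_y), 1.0)
```

Checking a couple of known anchor points like these would catch a missing normalization, which a smoke test that only runs the function cannot.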

@JonathanWenger JonathanWenger merged commit ee35601 into master Apr 22, 2023
@JonathanWenger JonathanWenger deleted the metrics branch April 22, 2023 11:55