Negate error metrics to ensure that higher is ALWAYS better #268

Closed
Innixma opened this issue Mar 15, 2021 · 2 comments
Assignees
Labels
enhancement New feature or request

Comments

@Innixma
Collaborator

Innixma commented Mar 15, 2021

[Screenshot: example of the issue]

Currently, depending on the metric, a result value of 0.6 compared to 0.4 can be either better (accuracy, auc, r2, etc.) or worse (rmse, log_loss, mae, etc.). When generating aggregated analysis from the results, the user currently has to hardcode this knowledge, which is error-prone.

I'd like to propose adding a new column, higher_is_better. If a higher value of the result metric means a better score, then higher_is_better should be True or 1. If not, it should be False or 0.
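A minimal sketch of what the proposed column could look like. The row layout, the `HIGHER_IS_BETTER` table, and the `add_higher_is_better` helper are hypothetical names used for illustration only; they are not part of the benchmark's code. The metric names are the ones listed in the examples above.

```python
# Hypothetical lookup table: which metrics improve as the value grows.
HIGHER_IS_BETTER = {
    "accuracy": True,
    "auc": True,
    "r2": True,
    "rmse": False,
    "log_loss": False,
    "mae": False,
}


def add_higher_is_better(rows):
    """Return copies of the result rows with a higher_is_better flag added."""
    return [
        {**row, "higher_is_better": HIGHER_IS_BETTER[row["metric"]]}
        for row in rows
    ]


# Assumed row layout; the real results file may use different column names.
results = [
    {"framework": "A", "metric": "auc", "result": 0.6},
    {"framework": "B", "metric": "log_loss", "result": 0.6},
]
annotated = add_higher_is_better(results)
```

With this approach every consumer of the results still has to read the flag before comparing two values.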

Another alternative is to report metrics in higher_is_better form always, and flip the sign to align them. This would cause a log_loss of 0.5 to be reported as -0.5. This is the strategy used by several AutoML systems, such as AutoGluon and MLJAR. It can be confusing when interpreted by humans, but it is great for computers.
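The sign-flip alternative can be sketched as follows. `ERROR_METRICS` and `to_score` are hypothetical names chosen for this example, and the set of error metrics is limited to the ones mentioned above.

```python
# Hypothetical set of metrics where a lower raw value is better.
ERROR_METRICS = {"rmse", "log_loss", "mae"}


def to_score(metric, value):
    """Negate error metrics so that a higher score is always better."""
    return -value if metric in ERROR_METRICS else value


flipped = to_score("log_loss", 0.5)   # error metric: sign is flipped
unchanged = to_score("auc", 0.6)      # already higher-is-better: left alone
```

After this conversion, two scores for the same metric can be compared with a plain `>` without knowing which metric produced them.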

@PGijsbers
Collaborator

Another alternative is to report metrics in higher_is_better form always, and flip the sign to align them.

I think I would prefer this. As in #262 (which the screenshot is from), too many columns make the table hard to read, especially if each row of the table ends up wrapped onto two (or more) lines in the terminal. Perhaps we can just prefix the names of the negated columns with - (e.g. acc, auc, -logloss).
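One way the column naming suggested here might look, as a sketch. `NEGATED_METRICS` and `display_name` are hypothetical names, and the short metric names are the ones used in the example above.

```python
# Hypothetical set of metrics whose values are reported with a flipped sign.
NEGATED_METRICS = {"logloss", "rmse", "mae"}


def display_name(metric):
    """Prefix negated metrics with '-' so the header shows the sign flip."""
    return f"-{metric}" if metric in NEGATED_METRICS else metric


headers = [display_name(m) for m in ["acc", "auc", "logloss"]]
```

The prefix keeps the table at one column per metric while still signalling to a human reader that the value has been negated.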

@Innixma
Collaborator Author

Innixma commented Mar 16, 2021

If it is acceptable for your system, I would also prefer the signs to be flipped. It adds a great deal of consistency to the code and makes sorting much easier, since the user doesn't have to both understand and use the higher_is_better column. As for -logloss, this is an interesting idea that I hadn't thought of before, and I don't have a strong opinion on it.
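To illustrate the sorting point, here is a small sketch assuming results have already been sign-flipped. The `rank` helper and the row layout are hypothetical. The same descending sort ranks frameworks correctly for both an accuracy-style metric and a negated error metric.

```python
def rank(rows):
    """Order frameworks from best to worst; higher score is always better."""
    ordered = sorted(rows, key=lambda row: row["score"], reverse=True)
    return [row["framework"] for row in ordered]


# Accuracy-style metric: reported as-is.
acc_rows = [
    {"framework": "A", "score": 0.6},
    {"framework": "B", "score": 0.4},
]
# Error metric (e.g. log_loss): reported with the sign flipped.
logloss_rows = [
    {"framework": "A", "score": -0.6},
    {"framework": "B", "score": -0.4},
]

acc_ranking = rank(acc_rows)
logloss_ranking = rank(logloss_rows)
```

With a higher_is_better column instead, `rank` would need an extra argument or lookup to choose the sort direction per metric.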

@sebhrusen sebhrusen self-assigned this Mar 30, 2021
@sebhrusen sebhrusen changed the title from "Add new results column: higher_is_better" to "Negate loss metrics to ensure that higher is ALWAYS better" Mar 30, 2021
@sebhrusen sebhrusen added the enhancement New feature or request label Mar 30, 2021
@sebhrusen sebhrusen changed the title from "Negate loss metrics to ensure that higher is ALWAYS better" to "Negate error metrics to ensure that higher is ALWAYS better" Apr 1, 2021

3 participants