core(scoring): apply v6 score weightings #9949


Merged Mar 11, 2020 · 10 commits

Conversation

@paulirish (Member) commented Nov 9, 2019

watch out, scores.

also #10446

@paulirish (Member, Author)

Updated. go/lh-perf-v6 is updated as well.

Once this lands I will ship a 6.0 beta.

@patrickhulce (Collaborator) left a comment

it seems to match what was decided in the doc 👍

```js
{id: 'largest-contentful-paint', weight: 25, group: 'metrics'},
{id: 'interactive', weight: 15, group: 'metrics'},
{id: 'total-blocking-time', weight: 25, group: 'metrics'},
{id: 'cumulative-layout-shift', weight: 5, group: 'metrics'},
```
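For context: Lighthouse derives the performance category score as a weight-normalized average of the individual metric scores, so these auditRef weights directly set each metric's share of the category score. A minimal sketch of that aggregation (the `score` values below are hypothetical, and this is not the actual scoring module):

```js
// Sketch of weighted category scoring. The `score` values are hypothetical;
// real scores come from each audit's log-normal score curve.
const auditRefs = [
  {id: 'largest-contentful-paint', weight: 25, score: 0.8},
  {id: 'interactive', weight: 15, score: 0.9},
  {id: 'total-blocking-time', weight: 25, score: 0.7},
  {id: 'cumulative-layout-shift', weight: 5, score: 1.0},
];

function weightedMean(refs) {
  const totalWeight = refs.reduce((sum, ref) => sum + ref.weight, 0);
  if (totalWeight === 0) return null;
  const weightedSum = refs.reduce((sum, ref) => sum + ref.score * ref.weight, 0);
  return weightedSum / totalWeight;
}

console.log(weightedMean(auditRefs)); // 0.8 for these hypothetical scores
```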
@patrickhulce (Collaborator)

Are we generally still going to stick to the pattern of shipping a metric in a release without weight before shipping it with weight in a breaking change, e.g. do we still plan to do that here by cherry-picking? Or is this our new policy moving forward?

@paulirish (Member, Author)

Generally, the ideal flow for introducing a new metric is:

  1. add the metric (invisible or visible)
  2. cut a minor release
  3. do these ~simultaneously:
    • gather HA/LR data to inform calibration
    • if not already visible: flip the metric to visible and cut a minor release, then...
      • ship the minor release to increase user familiarity with the metric
  4. if necessary, recalibrate the score curve
  5. add weighting
  6. optionally, ship a beta
  7. ship a major release

While we didn't get to follow this pattern here, I'd like to use that plan going forward. How's that sound?
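For illustration, here is the auditRef progression that flow implies, using a hypothetical `some-new-metric` id. The pattern matches this PR's own config: omitting `group: 'metrics'` keeps an audit out of the displayed metrics section, and `weight: 0` keeps it out of the category score. The weight of 10 in the last step is made up.

```js
// Hypothetical auditRef states for one metric moving through that plan.
const rolloutStates = [
  // Step 1 (invisible): present in the JSON, neither displayed nor weighted.
  {id: 'some-new-metric', weight: 0},
  // Step 3 (visible, unweighted): shown in the metrics group, still weight 0.
  {id: 'some-new-metric', weight: 0, group: 'metrics'},
  // Steps 5-7 (weighted): counts toward the category score in the next major.
  {id: 'some-new-metric', weight: 10, group: 'metrics'},
];
```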

@patrickhulce (Collaborator)

Great! Yeah, agreed.

I still think we can follow part of that pattern here if we want to cherry-pick those metrics into a 5.7.0 or something, so folks can play with them without jumping to a new major. I guess that was the root of my question, which was just leading in an unintended direction :)

```js
{id: 'cumulative-layout-shift', weight: 5, group: 'metrics'},
// intentionally left out of metrics group so they won't be displayed
{id: 'first-cpu-idle', weight: 0},
{id: 'max-potential-fid', weight: 0},
```
@patrickhulce (Collaborator)

So this is being killed after all? I thought we brought up in a recent LH meeting that this was going to stay even if it wasn't weighted, but the discussion in the doc suggests this is gone gone.

@paulirish (Member, Author)

It's still in the JSON, but not visible anymore.

We've long wanted to add Top Long Tasks as a diagnostic and I think that will end up being more helpful than keeping the mpFid metric.
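For reference, a sketch of what that looks like in the report JSON; the shape is abbreviated and the values are hypothetical. The audit result stays under `audits`, while its auditRef carries weight 0 and no group, so the report renderer doesn't surface it:

```js
const lhr = {
  audits: {
    'max-potential-fid': {
      id: 'max-potential-fid',
      score: 0.85,       // hypothetical
      numericValue: 183, // hypothetical, in ms
    },
  },
  categories: {
    performance: {
      auditRefs: [
        {id: 'max-potential-fid', weight: 0}, // no `group`, so not displayed
      ],
    },
  },
};
```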

@patrickhulce (Collaborator)

> It's still in the JSON, but not visible anymore.

Right, but with the intention of removing it from the JSON in the future?

I know the support for killing it was overwhelming in the doc discussion, but I still think it's going to be really unfortunate to have no FID-like equivalent in the lab when it's like half our performance story in the field.

@connorjclark (Collaborator) commented Mar 10, 2020

Aside: I wouldn't want to remove this from the JSON. But further, I'd prefer an average FID metric (instead of just a max one - actually this is just EIL).

@patrickhulce (Collaborator)

There's a test failure for the config number-of-audits check though :)

@brendankenny (Contributor) left a comment

LGTM2

hold on to your butts

```diff
@@ -44,8 +44,9 @@ class CumulativeLayoutShift extends Audit {
   */
  static get defaultOptions() {
    return {
-      // TODO(paulirish): Calibrate these
      scorePODR: 0.05,
+      // Calibrated to assure 0.1 gets a score of 0.9.
```
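For background on where curves like this come from: Lighthouse maps a raw metric value to a 0-1 score with the complementary CDF of a log-normal distribution fit to two control points. At this time the options were `scorePODR` and `scoreMedian`; the sketch below instead uses the calibration target quoted in the comment (value 0.1 scores 0.9) plus a hypothetical median of 0.25 as its control points, and it is not the actual lib/statistics.js implementation:

```js
// Abramowitz & Stegun 7.1.26 approximation of the error function.
function erf(x) {
  const sign = x < 0 ? -1 : 1;
  x = Math.abs(x);
  const t = 1 / (1 + 0.3275911 * x);
  const y = 1 - ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t -
      0.284496736) * t + 0.254829592) * t * Math.exp(-x * x);
  return sign * y;
}

// Log-normal scoring sketch: `median` scores 0.5 and `p10` scores 0.9.
function logNormalScore({median, p10}, value) {
  if (value <= 0) return 1;
  // score(x) = 0.5 * erfc((ln x - ln median) / (sigma * sqrt(2)))
  // score(median) = 0.5 by construction; pick sigma so score(p10) = 0.9,
  // i.e. erfc(z) = 1.8, which gives z = -erfinv(0.8) ≈ -0.9062.
  const Z_P10 = -0.9061938024368232;
  const sigma = (Math.log(p10) - Math.log(median)) / (Math.SQRT2 * Z_P10);
  const z = (Math.log(value) - Math.log(median)) / (sigma * Math.SQRT2);
  const score = 0.5 * (1 - erf(z)); // 0.5 * erfc(z)
  return Math.min(1, Math.max(0, score));
}

// Hypothetical CLS-style control points, per the calibration comment above.
console.log(logNormalScore({median: 0.25, p10: 0.1}, 0.1));  // ~0.9
console.log(logNormalScore({median: 0.25, p10: 0.1}, 0.25)); // 0.5
```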
@brendankenny (Contributor)

Maybe a note in here, or in the PR description, or in the scoring doc on where these came from? As we've discovered, it can be very difficult to figure out our motivation for some of these curves later on :)
