Skip to content

fix: fix regression caused by join on trials view #9091

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Apr 5, 2024

Conversation

salonig23
Copy link
Contributor

@salonig23 salonig23 commented Apr 3, 2024

Description

Join on trials view was causing regression, and slowing down GetExperiments. Fixed it by removing join and adding column to runs table.

Test Plan

Make sure all tests pass

For CLI -
Experiment:

  1. Create an experiment.
  2. Wait for the experiment to complete.
  3. Run det e set log-retention-days <exp-id> --days 50
  4. Check the runs table in the database to see if the associated trials for the experiment have updated the log retention days to 50.
  5. Run det e set log-retention-days <exp-id> --days none --forever
  6. Check the tasks table in the database to see if the associated trials for the experiment have updated the log retention days to -1.

Trial:

  1. Create an experiment.
  2. Wait for the experiment to complete.
  3. Run det t set log-retention-days <trial-id> --days 50 for any trial in the experiment.
  4. Check the runs table in the database to see if the trial has updated the log retention days to 50.
  5. Run det t set log-retention-days <trial-id> --days none --forever
  6. Check the tasks table in the database to see if the trial has updated the log retention days to -1.

For API -

  1. Create an experiment.
  2. Wait for the experiment to complete.
  3. Use the following API endpoint:

curl -X 'PUT'
'http://localhost:8080/api/v1/experiments/1/retain_logs'
-H 'accept: application/json'
-H 'Content-Type: application/json'
-d '{
"experimentId": 1,
"numDays": 40
}'

Check the tasks table in the database to see if the associated trials for the experiment have updated the log retention days to 40.

curl -X 'PUT'
'http://localhost:8080/api/v1/trials/1/retain_logs'
-H 'accept: application/json'
-H 'Content-Type: application/json'
-d '{
"trialId": 1,
"numDays": -1
}'

Check the tasks table in the database to see if the trial has updated the log retention days to -1.

For WebUI -

  1. Create a few experiments.
  2. Go to the Experiments list in the Web UI and select a few of them and then click on the actions button.
    EC0E20C0-1338-4317-BDCC-F7F637E8A2D7
  3. Select the Retain Logs option from the drop down.
  4. Type any number in the Number of days field when prompted.
  5. Wait for a success message.
  6. Check the runs table in the database to see if the associated trials for the experiment have updated the log retention days to the number you put in.
  7. Select a few more experiments and follow the above steps, but this time click on the 'Forever' checkbox instead.
  8. Wait for a success message.
  9. Check the runs table in the database to see if the associated trials for the experiment have updated the log retention days to -1.
  10. Also check the Trial List Table to see if the Log Retention Days Column shows up and has the correct days.

Commentary (optional)

Checklist

  • Changes have been manually QA'd
  • User-facing API changes need the "User-facing API Change" label.
  • Release notes should be added as a separate file under docs/release-notes/.
    See Release Note for details.
  • Licenses should be included for new code which was copied and/or modified from any external code.

Ticket

@cla-bot cla-bot bot added the cla-signed label Apr 3, 2024
Copy link

netlify bot commented Apr 3, 2024

Deploy Preview for determined-ui canceled.

Name Link
🔨 Latest commit 15fa3eb
🔍 Latest deploy log https://app.netlify.com/sites/determined-ui/deploys/66104964817d680008645eb1

Copy link

codecov bot commented Apr 3, 2024

Codecov Report

Attention: Patch coverage is 66.66667% with 13 lines in your changes are missing coverage. Please review.

Project coverage is 46.67%. Comparing base (f4b0471) to head (15fa3eb).
Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #9091      +/-   ##
==========================================
+ Coverage   46.65%   46.67%   +0.01%     
==========================================
  Files        1172     1172              
  Lines      143621   143617       -4     
  Branches     2409     2410       +1     
==========================================
+ Hits        67013    67028      +15     
+ Misses      76403    76384      -19     
  Partials      205      205              
Flag Coverage Δ
backend 43.48% <66.66%> (+0.03%) ⬆️
harness 63.99% <ø> (ø)
web 37.77% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
master/internal/checkpoint_gc.go 65.93% <100.00%> (-0.38%) ⬇️
master/internal/command/command.go 67.82% <100.00%> (-0.14%) ⬇️
master/internal/db/postgres_tasks.go 69.11% <100.00%> (-0.12%) ⬇️
master/internal/logretention/logretention.go 84.90% <100.00%> (+0.59%) ⬆️
master/internal/trial.go 41.58% <100.00%> (-0.12%) ⬇️
master/pkg/model/experiment.go 14.28% <100.00%> (+0.35%) ⬆️
master/pkg/model/task.go 15.26% <ø> (ø)
master/internal/api_trials.go 54.69% <80.00%> (+1.32%) ⬆️
master/internal/experiment/bulk_action.go 0.00% <0.00%> (ø)
master/internal/populate_metrics.go 0.00% <0.00%> (ø)
... and 1 more

... and 4 files with indirect coverage changes

@salonig23 salonig23 marked this pull request as ready for review April 3, 2024 16:23
@salonig23 salonig23 requested a review from a team as a code owner April 3, 2024 16:23
@salonig23 salonig23 force-pushed the fix-trials-view-perf branch from 5ebd618 to 7cb2665 Compare April 3, 2024 16:28
Copy link
Contributor

@NicholasBlaskey NicholasBlaskey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me

I would check the performance test run first and get someone who has more context on this project to make sure it makes sense to have it on both tasks and runs

@salonig23 salonig23 requested a review from erikwilson April 3, 2024 16:48
@salonig23 salonig23 requested a review from a team as a code owner April 5, 2024 04:52
@salonig23 salonig23 force-pushed the fix-trials-view-perf branch 4 times, most recently from 69a0297 to d37e55b Compare April 5, 2024 06:43
Copy link
Contributor

@NicholasBlaskey NicholasBlaskey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but I would get @erikwilson to make sure the design look right since I'm not familiar with this feature

@@ -109,4 +109,6 @@ message FlatRun {
bool parent_archived = 18;
// Data related the the experiment associated with this run.
optional FlatRunExperiment experiment = 19;
// Number of days to retain logs for.
optional int32 log_retention_days = 20;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this set or returned anywhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, I don't think so. i'll remove it

@salonig23 salonig23 force-pushed the fix-trials-view-perf branch from 5bd9f2f to d859408 Compare April 5, 2024 18:07
@salonig23 salonig23 force-pushed the fix-trials-view-perf branch from d859408 to 15fa3eb Compare April 5, 2024 18:56
@salonig23 salonig23 merged commit cf2f2be into main Apr 5, 2024
100 of 103 checks passed
@salonig23 salonig23 deleted the fix-trials-view-perf branch April 5, 2024 21:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants