Split tutorials to 3 groups #4220

pbchekin · 2025-05-15T20:06:01Z

The run time is reduced from 35m to 23m. Now "minicore" is on the critical path.

.github/workflows/build-test-reusable.yml

whitneywhtsang · 2025-05-16T14:40:27Z

.github/workflows/build-test-reusable.yml

+          04-low-memory-dropout
+          05-layer-norm
+          07-extern-functions
+          09-persistent-matmul


Looking at the CI time, do we want to move 09-persistent-matmul to mxfp? The rest group is still the bottleneck.

Looking at the new CI result, it is hard to balance. It looks like 09-persistent-matmul takes a long time; maybe move 06-fused-attention to rest?

Yes, 09 is the slowest. I will try to re-balance.

571.62 09-persistent-matmul
425.86 06-fused-attention
188.46 08-grouped-gemm
143.12 10-experimental-block-pointer
80.68 10i-experimental-block-pointer
76.78 03-matrix-multiplication
62.03 03i-matrix-multiplication
47.10 05-layer-norm
33.16 02-fused-softmax
11.78 04-low-memory-dropout
8.64 01-vector-add
7.09 07-extern-functions
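For anyone who wants to check the balance by hand, here is a minimal sketch of one common heuristic (greedy "longest processing time first"), applied to the timings listed above. It is not the method used in this PR. It assumes the timings are in seconds, that all twelve tutorials are spread over exactly three groups, and that groups run fully in parallel. The names `balance` and `lower_bound` are hypothetical helpers.

```python
import heapq

# Timings copied from the comment above; the unit is assumed to be seconds.
TIMINGS = {
    "09-persistent-matmul": 571.62,
    "06-fused-attention": 425.86,
    "08-grouped-gemm": 188.46,
    "10-experimental-block-pointer": 143.12,
    "10i-experimental-block-pointer": 80.68,
    "03-matrix-multiplication": 76.78,
    "03i-matrix-multiplication": 62.03,
    "05-layer-norm": 47.10,
    "02-fused-softmax": 33.16,
    "04-low-memory-dropout": 11.78,
    "01-vector-add": 8.64,
    "07-extern-functions": 7.09,
}


def balance(timings, num_groups):
    """Greedy LPT: place each item, longest first, into the lightest group."""
    # Min-heap of (current load, group index); ties go to the lower index.
    heap = [(0.0, i) for i in range(num_groups)]
    heapq.heapify(heap)
    groups = [[] for _ in range(num_groups)]
    by_duration = sorted(timings.items(), key=lambda kv: kv[1], reverse=True)
    for name, duration in by_duration:
        load, idx = heapq.heappop(heap)
        groups[idx].append(name)
        heapq.heappush(heap, (load + duration, idx))
    loads = [0.0] * num_groups
    for load, idx in heap:
        loads[idx] = load
    return groups, loads


def lower_bound(timings, num_groups):
    """No split can beat the longest single item or the average group load."""
    return max(max(timings.values()), sum(timings.values()) / num_groups)


if __name__ == "__main__":
    groups, loads = balance(TIMINGS, 3)
    for names, load in zip(groups, loads):
        print(f"{load:8.2f}  {', '.join(names)}")
    print(f"lower bound: {lower_bound(TIMINGS, 3):.2f}")
```

Under these assumptions the longest single tutorial (09-persistent-matmul, 571.62) exceeds the average load per group (about 552), so no assignment of whole tutorials to three groups can bring the slowest group below 571.62. The sketch only covers the tutorial groups and says nothing about the other jobs in the workflow.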

The last run is under 25 minutes; ~~minicore (lts and rolling)~~ is the slowest part now. We can optimize the run time further by adding a new parallel job, splitting minicore, and balancing the workload among the jobs (not in this PR).

UPD: minicore (lts) and scaled_dot (rolling) are both ~17m and are on the critical path now. I could not make it faster by rebalancing tutorials, so the conclusion is the same.

Signed-off-by: Pavel Chekin <[email protected]>

pbchekin force-pushed the split-tutorials branch from 8ced6ff to dea0b57 Compare May 15, 2025 20:32

pbchekin mentioned this pull request May 15, 2025

[CI] Ideas to reduce PR build and test time #3820

Closed

pbchekin requested review from kwasd, gshimansky, anmyachev and whitneywhtsang May 15, 2025 20:57

whitneywhtsang reviewed May 15, 2025

pbchekin force-pushed the split-tutorials branch from 1e7f09e to a40394c Compare May 16, 2025 02:36

kwasd approved these changes May 16, 2025

whitneywhtsang reviewed May 16, 2025

pbchekin force-pushed the split-tutorials branch from a40394c to bb42c66 Compare May 16, 2025 15:04

pbchekin added 4 commits May 16, 2025 09:54

Split tutorials to 3 groups

c2757e6

Signed-off-by: Pavel Chekin <[email protected]>

Add 09-persistent-matmul, sort

6e3c085

Signed-off-by: Pavel Chekin <[email protected]>

Move tutorial 09 to mxfp

bf37f9e

Signed-off-by: Pavel Chekin <[email protected]>

Rebalance tutorials

1dd4fe4

Signed-off-by: Pavel Chekin <[email protected]>

pbchekin force-pushed the split-tutorials branch from bb42c66 to 1dd4fe4 Compare May 16, 2025 16:57

pbchekin merged commit 986459a into main May 16, 2025
15 checks passed

pbchekin deleted the split-tutorials branch May 16, 2025 19:37
