Updates pipeline enrollment metrics queries to improve performance #226
This commit should dramatically improve the query performance for the
enrollment metrics pipeline
What was wrong?
Queries were very slow because of a 'LIMIT 1' issue with MySQL. For a starting point, see here
In Django, we were doing a filter query that returns a single record or
None. Examples: query functions such as latest(), first(), last(), and so
on add a LIMIT 1 to the underlying SQL query, which apparently degrades
performance in the MySQL query analyzer.
To address this, we do two things
Also, LearnerCourseGradeMetrics queries are slow because the model needs
indexes on fields including site, course, and learner. We address this twofold
This is so we're not indexing records we are just going to delete anyway
course
This commit performs #2 above, then filters this queryset to find
LearnerCourseGradeMetrics records for the specified learner in the
course
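
The difference between the two query shapes can be sketched outside Django with the standard library's sqlite3 module. This is only an illustration under assumptions, not the code in this commit: the lcgm table, its columns, and both helper functions are hypothetical stand-ins for the LearnerCourseGradeMetrics model. The first helper issues the per-learner ORDER BY ... LIMIT 1 query that latest() and first() generate; the second fetches the course's records once, without LIMIT, and filters for each learner in Python.

```python
import sqlite3

# Hypothetical stand-in for the LearnerCourseGradeMetrics table.
# Table and column names are assumptions made for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lcgm (id INTEGER PRIMARY KEY, site_id INTEGER, "
    "course_id TEXT, user_id INTEGER, date_for TEXT, points_earned REAL)"
)
conn.executemany(
    "INSERT INTO lcgm (site_id, course_id, user_id, date_for, points_earned) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        (1, "course-a", 10, "2020-01-01", 1.0),
        (1, "course-a", 10, "2020-01-02", 2.0),
        (1, "course-a", 11, "2020-01-01", 3.0),
        (1, "course-b", 10, "2020-01-03", 4.0),
    ],
)


def latest_with_limit(conn, site_id, course_id, user_id):
    """One query per learner, in the ORDER BY ... LIMIT 1 shape."""
    return conn.execute(
        "SELECT user_id, date_for, points_earned FROM lcgm "
        "WHERE site_id = ? AND course_id = ? AND user_id = ? "
        "ORDER BY date_for DESC LIMIT 1",
        (site_id, course_id, user_id),
    ).fetchone()  # None when the learner has no record


def latest_per_learner(conn, site_id, course_id):
    """One query per course, no LIMIT; keep each learner's latest row."""
    rows = conn.execute(
        "SELECT user_id, date_for, points_earned FROM lcgm "
        "WHERE site_id = ? AND course_id = ?",
        (site_id, course_id),
    )
    latest = {}
    for user_id, date_for, points in rows:
        current = latest.get(user_id)
        # ISO-8601 date strings compare correctly as plain strings.
        if current is None or date_for > current[1]:
            latest[user_id] = (user_id, date_for, points)
    return latest
```

Both helpers return the same record for a given learner; the second trades extra rows fetched for far fewer round trips when a course has many learners.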
Enrollment Metrics tests have been updated to reflect changes in the
production code