You describe the exclude_all() function as "...takes multiple criteria and applies them in a step-wise manner, summarising at each step." The exclusion count shows that the exclusion criteria are each separately run on the full pre-exclusions set, then all excluded observations are removed (and some may be excluded by more than 1 criterion). This is consistent with the behaviour of filter(), but not behaviour I'd describe as "step-wise".
The distinction can be really important if you’re excluding based on summary statistics of the existing data. It might be useful to include a concrete example, like the difference between these two:
```r
# simultaneous evaluation of criteria
data.frame(a = 1:10) |>
  track() |>
  exclude_all(
    a > 9 ~ "{.excluded} value > 9",
    a == max(a) ~ "{.excluded} max value"
  ) |>
  status() |>
  flowchart()
```
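For contrast, the same distinction can be sketched with plain `dplyr::filter()` rather than dtrackr. This is only a minimal illustration under the assumption that the data is `1:10` as above; the names `df`, `simultaneous`, and `stepwise` are made up for the example.

```r
library(dplyr)

df <- data.frame(a = 1:10)

# simultaneous: both criteria are evaluated on the full data set,
# so max(a) is 10 and only the row a == 10 is removed
simultaneous <- df |>
  filter(!(a > 9), !(a == max(a)))

# step-wise: the second criterion is evaluated after the first exclusion,
# so max(a) is now 9 and the row a == 9 is removed as well
stepwise <- df |>
  filter(!(a > 9)) |>
  filter(!(a == max(a)))

nrow(simultaneous)  # 9
nrow(stepwise)      # 8
```

The two results differ by one row, which is exactly the kind of discrepancy that matters when an exclusion criterion depends on a summary statistic of the data.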
You are of course entirely correct, and it was not the best piece of documentation I've written. I've fully rewritten the exclude_all() documentation, both in the function reference and in the supporting vignette, to make this clearer, including an example as you suggested.
There was a similar problem in the include_any() documentation, which I have also rectified, and I have included a similar example there.
In both cases I have drawn out the parallels between these functions and the vanilla dplyr::filter() function.
Review issue: openjournals/joss-reviews#4707
Branch reviewed: main