different R versions of posixct %in% #86

@tdhock

hormutz screed [email protected] writes on R-devel:

I recently became aware that using %in% for the Date class is about
100x slower from R 4.3 onward than in older versions. I did not
include the results from R prior to 4.3 but the first and second
methods below yield equal and very fast results for older R versions.

I have suggested a fix that treats the date class in an identical
manner to POSIXct and POSIXlt via the mtfrm generic which is
ultimately called by %in%. I only found one reference to this issue
(see https://stackoverflow.com/questions/77909868/why-is-match-slower-on-dates-datetimes-in-r-version-4-3-2-than-version-4-2-2).

I apologize if this should have been sent to [email protected] or
if this issue has already been addressed. Thanks.
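For readers following along, the dispatch described above can be inspected directly. This is a minimal sketch, not part of the original message: it assumes `mtfrm()` is exported from base (as it is from R 4.3 onward) and simply prints what the generic returns for a `Date` and a `POSIXct` value. No output is shown because the result depends on which methods your R version registers.

```r
# Illustrative check, not from the original post: look at what mtfrm()
# hands to match()/%in% for a Date versus a POSIXct value.
d <- as.Date("2024-05-01")
p <- as.POSIXct("2024-05-01", tz = "UTC")

# mtfrm() does not exist before R 4.3, so guard the call.
has_mtfrm <- exists("mtfrm", envir = baseenv(), inherits = FALSE)

if (has_mtfrm) {
  str(mtfrm(d))  # storage type depends on the method dispatched for Date
  str(mtfrm(p))  # storage type depends on the method dispatched for POSIXct
} else {
  message("mtfrm() is not available in this R version")
}
```

If the two calls return different storage types (for example character versus numeric), that difference is a plausible explanation for the timing gap reported below.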


RStudio session below; note that R --vanilla gives the same results.

> sessionInfo()$R.version$version.string    #
[1] "R version 4.5.1 (2025-06-13)"
>
> date_seq <- seq(as.Date("1705-01-01"), as.Date("2024-12-31"), by="days")
> dt1 <- as.Date("2024-05-01")
>
> # %in%
> tictoc::tic()
> tmp <- dt1 %in% date_seq
> tictoc::toc()
0.125 sec elapsed
>
> # cast to integer then %in% (gives fast results similar to old R without casting to int)
> tictoc::tic()
> tmp <- as.integer(dt1) %in% as.integer(date_seq)
> tictoc::toc()
0.001 sec elapsed
>
> # Create an mtfrm method for Date class that is identical to POSIXct and POSIXlt methods
> # This results in the expected dramatic speedup
> temp_fun <- function(x)
+   as.vector(x, "any")
>
> .S3method("mtfrm", "Date", temp_fun)
>
> # %in% with mtfrm method for Date
> tictoc::tic()
> tmp <- dt1 %in% date_seq
> tictoc::toc()
0.002 sec elapsed
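
The session above can be reproduced as a plain script without tictoc. The sketch below is an assumption-laden rewrite, not the poster's code: it uses base `system.time()` for timing, and `date_in()` is a hypothetical helper that compares the underlying numeric values via `unclass()` instead of registering a session-wide `mtfrm` method. Timings are printed rather than asserted, since they vary by machine and R version.

```r
# Same data as in the session above.
date_seq <- seq(as.Date("1705-01-01"), as.Date("2024-12-31"), by = "days")
dt1 <- as.Date("2024-05-01")

# Time the direct Date %in% Date lookup.
t_direct <- system.time(res_direct <- dt1 %in% date_seq)[["elapsed"]]

# Time the integer-cast lookup from the original post.
t_cast <- system.time(
  res_cast <- as.integer(dt1) %in% as.integer(date_seq)
)[["elapsed"]]

# Hypothetical helper: match on the numeric values behind the Date class.
# Unlike as.integer(), unclass() does not truncate fractional days.
date_in <- function(x, table) {
  stopifnot(inherits(x, "Date"), inherits(table, "Date"))
  unclass(x) %in% unclass(table)
}
t_unclass <- system.time(res_unclass <- date_in(dt1, date_seq))[["elapsed"]]

# All three approaches should agree on the answer.
stopifnot(identical(res_direct, res_cast), identical(res_direct, res_unclass))

cat(sprintf("direct: %.3fs  as.integer: %.3fs  unclass: %.3fs\n",
            t_direct, t_cast, t_unclass))
```

Using a local helper keeps the workaround scoped to the calling code, whereas `.S3method("mtfrm", "Date", ...)` changes `%in%` and `match()` for every `Date` in the session.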
