String search kernel optimisations #6107

Open
@samuelcolvin

The main context for this is well described by BurntSushi/memchr#156.

I think (in rough order of impact) we should:

  • switch from str.contains to memchr
  • switch from str.starts_with to, hopefully, memchr, otherwise quick_strings::starts_with - there's no "what if the haystack is very long" concern here since we only look at the start of the string, so the difference between memchr and quick_strings won't be as big, and might even be negative
  • switch from using starts_with_ignore_ascii_case to quick_strings::istarts_with
  • same for *ends_with
  • switch from Regex to quick_strings::icontains (copying the code) for ILIKE - we may need to check that it's actually faster for large haystacks - this might have the biggest impact in some scenarios, but we should be careful
  • to benefit from these improvements, replace the remaining direct uses of str.contains etc. in like.rs with Predicate
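
To illustrate the last point, here is a rough sketch of how a LIKE pattern could be classified once so that every call site goes through the same fast paths. Everything here is an assumption made for illustration: this `Predicate` enum, its variants, `is_plain` and `like` are hypothetical and are not taken from like.rs, and the std string methods stand in for whatever memchr-based or copied routines end up being used.

```rust
/// Hypothetical predicate type for illustration; the real `Predicate` may differ.
#[derive(Debug, PartialEq)]
enum Predicate<'a> {
    Eq(&'a str),
    StartsWith(&'a str),
    EndsWith(&'a str),
    Contains(&'a str),
    /// Patterns with `_`, escapes or inner `%` would need a regex fallback.
    Complex(&'a str),
}

/// True if `s` contains no LIKE wildcard or escape character.
fn is_plain(s: &str) -> bool {
    !s.chars().any(|c| matches!(c, '%' | '_' | '\\'))
}

impl<'a> Predicate<'a> {
    /// Classify a LIKE pattern once, so the per-row work is a plain string op.
    fn like(pattern: &'a str) -> Self {
        if is_plain(pattern) {
            return Predicate::Eq(pattern);
        }
        if let Some(rest) = pattern.strip_suffix('%') {
            if is_plain(rest) {
                return Predicate::StartsWith(rest);
            }
            if let Some(mid) = rest.strip_prefix('%') {
                if is_plain(mid) {
                    return Predicate::Contains(mid);
                }
            }
        }
        if let Some(rest) = pattern.strip_prefix('%') {
            if is_plain(rest) {
                return Predicate::EndsWith(rest);
            }
        }
        Predicate::Complex(pattern)
    }

    /// Returns `None` for `Complex`, meaning "use the regex path instead".
    fn evaluate(&self, haystack: &str) -> Option<bool> {
        match self {
            Predicate::Eq(v) => Some(haystack == *v),
            // Stand-ins: these are where the faster routines would plug in.
            Predicate::StartsWith(v) => Some(haystack.starts_with(*v)),
            Predicate::EndsWith(v) => Some(haystack.ends_with(*v)),
            Predicate::Contains(v) => Some(haystack.contains(*v)),
            Predicate::Complex(_) => None,
        }
    }
}
```

With this shape, swapping the implementation behind `Contains` or `StartsWith` changes one match arm rather than every place that calls str.contains directly.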

(I'm not suggesting that we make quick_strings a dependency; it was just a scratch experiment. If we use any of that code, we should copy it.)
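
For reference, a minimal std-only sketch of the kind of ASCII case-insensitive helpers discussed above. These are not the quick_strings implementations: the function names are hypothetical, and the substring search is deliberately naive, so it only shows the semantics and is not a performance candidate.

```rust
/// ASCII case-insensitive prefix check (hypothetical helper name).
fn istarts_with_ascii(haystack: &str, needle: &str) -> bool {
    let (h, n) = (haystack.as_bytes(), needle.as_bytes());
    // Work on bytes so a length that splits a multi-byte char cannot panic.
    h.len() >= n.len() && h[..n.len()].eq_ignore_ascii_case(n)
}

/// ASCII case-insensitive suffix check (hypothetical helper name).
fn iends_with_ascii(haystack: &str, needle: &str) -> bool {
    let (h, n) = (haystack.as_bytes(), needle.as_bytes());
    h.len() >= n.len() && h[h.len() - n.len()..].eq_ignore_ascii_case(n)
}

/// Naive ASCII case-insensitive substring check, O(haystack * needle)
/// in the worst case. A real kernel would want a smarter search.
fn icontains_ascii(haystack: &str, needle: &str) -> bool {
    let (h, n) = (haystack.as_bytes(), needle.as_bytes());
    if n.is_empty() {
        // `windows(0)` panics, and an empty needle matches everything anyway.
        return true;
    }
    h.windows(n.len()).any(|w| w.eq_ignore_ascii_case(n))
}
```

These fold only ASCII letters, so an ILIKE path that has to handle non-ASCII case folding would still need a fallback for haystacks or needles containing non-ASCII characters.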
