-
-
Notifications
You must be signed in to change notification settings - Fork 2.4k
Add SpanStringTokenizer and avoid many string allocations in FontCollectionBase #17745
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add SpanStringTokenizer and avoid many string allocations in FontCollectionBase #17745
Conversation
Tests seem only to be failing because of the removal of |
495fcf5
to
b1f773d
Compare
You can test this PR using the following package version. |
I reverted |
…tions in FontCollectionBase
b1f773d
to
4ee848e
Compare
You can test this PR using the following package version. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A remark aside, LGTM, thank you!
It's always good to have fewer allocations :)
I'm sorry to hijack this PR for a moment. @Alexizx00789999 could you please stop approving all pending pull requests? It won't help them get team approval any faster. This just adds unnecessary noise. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you!
You can test this PR using the following package version. |
You can test this PR using the following package version. |
What does the pull request do?
This PR creates a
ref struct
variant ofStringTokenizer
calledSpanStringTokenizer
, which holds aReadOnlySpan<char>
instead of aSystem.String
. Additionally, this also addsEnumHelpers::TryParse
shims accepting spans as input parameter. This drastically reduces the number of allocations of unnecessarySystem.String
objects inFontCollectionBase
and friends.What is the current behavior?
When creating a
TextRunProperties
(e.g., aGenericTextRunProperties
), somewhere down the line Avalonia.Skia accessesTypeface.GlyphTypeface
. This in turn eventually bubbles down all the way toFontCollectionBase::GetImplicitTypeface
, which tries to normalize the typeface by carving out the font style, weight and stretch out of the full family name. It does so with the help ofStringTokenizer
.For example, below is the logic for carving out the font weight out of the full family name:
Avalonia/src/Avalonia.Base/Media/Fonts/FontCollectionBase.cs
Lines 294 to 320 in 44a784a
StringTokenizer
takes in aSystem.String
object, meaning that all parsing needs to have a fully concretized string. This requires every single token returned by the tokenizer to be fully concretized to aSystem.String
, or else we cannot determine whether it was an integer, nor can we pass it intoEnum.TryParse
to match it with font modifiers. Additionally, even more new string allocations are made by concatenating strings and callingReplace
andTrimEnd
. Finally, since there are three types of modifiers, there are three very similar parsing functions that all do extensive string object allocations. This all to simply parse a font family name.While the glyph typeface is cached in the internal
CachedGlyphTypeface
property, only slightly adjusting the visual appearance of text requires a newTextRunProperties
, and thus results in the creation of a new cached glyph typeface. This happens in projects such as AvaloniaEdit and AvaloniaHex, which need to create manyTextRunProperties
instances to implement syntax highlighting. Additionally, every TextBlock control creates a new instance as well. As far as I know, this property is not exposed nor can aTextRunProperties
be cloned efficiently with this cached glyph typeface either. Even though, out of scope for this PR, this potentially could also be an interesting avenue for further optimizations.What is the updated/expected behavior with this PR?
Drastically reduced observed
System.String
allocations when creating manyTextRunProperties
:How was the solution implemented (if it's not obvious)?
internal ref struct SpanStringTokenizer
which is an exact copy ofStringTokenizer
but stores aReadOnlySpan<char>
as opposed to astring
.StringTokenizer
were updated toSpanStringTokenizer
throughout the entire Avalonia solution.StringTokenizer
is now marked obsolete.EnumHelpers
now includeTryParse
shims.StringBuilder
.Checklist
Added a copy of
StringTokenizerTests
but specifically forSpanStringTokenizer
.Breaking changes
The only technically publicly visible breaking change is turning
GridLength::ParseLenghts
into eagerly materializing aList<GridLength>
, asref struct
s are not allowed in conjunction withyield return
. However I don't think this should be much of an issue:Issues