Optimize Formatter for 90% less memory usage and 15-30% faster execution #104

ThePaperPlate · 2025-04-08T01:31:20Z

I've optimized the Formatter class to significantly improve its memory efficiency and performance.

Main Improvements

- Added Object Pooling: Implemented a custom `Poolable<T>` system that recycles `Element`, `Token`, `Lexer`, `Entry`, and `ElementTreeBuilder` objects
- Replaced the delegate-based `Lexer` state machine with a more efficient enum-based implementation
- Optimized String Handling: Implemented `StringBuilder` pooling within `ToTreeFormat()`
- Cached Common Objects: Created static cached instances for frequently used objects like translation dictionaries and common tags
- Maintained API Compatibility: Preserved the original public APIs so third parties don't need to change their code
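To make the pooling idea concrete, here is a minimal sketch of how a `Poolable<T>` base class could recycle instances. It is an illustration only, not the code in this PR: the `Rent`/`Reset` names, the `Dispose`-returns-to-pool convention, and the fields on this `Token` are assumptions, and the sketch is not thread-safe.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of an object pool base class (not the PR's actual Poolable<T>).
public abstract class Poolable<T> : IDisposable where T : Poolable<T>, new()
{
    // One free list per concrete type T. Not thread-safe in this sketch.
    private static readonly Stack<T> Free = new Stack<T>();
    private bool _inPool;

    // Reuse a pooled instance if one is available, otherwise allocate.
    public static T Rent()
    {
        if (Free.Count > 0)
        {
            T item = Free.Pop();
            Poolable<T> asBase = item;
            asBase._inPool = false;
            return item;
        }
        return new T();
    }

    // Derived types clear their state here so a reused object starts clean.
    protected abstract void Reset();

    // Returning to the pool is modelled as Dispose; double returns are ignored.
    public void Dispose()
    {
        if (_inPool) return;
        Reset();
        _inPool = true;
        Free.Push((T)this);
    }
}

// Illustrative pooled type; the fields are assumptions, not the real Token.
public sealed class Token : Poolable<Token>
{
    public string Text = string.Empty;
    public int Position;

    protected override void Reset()
    {
        Text = string.Empty;
        Position = 0;
    }
}
```

With this shape, a hot loop that calls `Token.Rent()` and disposes each token when done allocates only as many tokens as are alive at once, which is the kind of saving the memory figures below reflect.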

Benchmark Results

Using this crude benchmark script on both the original and optimized versions:

Before vs After (100 iterations on 2053 test inputs):

[Chart: Execution Time (lower is better)]

[Chart: Memory Usage (lower is better)]

| Formatter | Original Time | Optimized Time | Time Improvement | Original Memory | Optimized Memory | Memory Reduction |
| --- | --- | --- | --- | --- | --- | --- |
| ToPlaintext | 3.41 ms | 2.91 ms | 14.7% faster | 1.59 GB | 158.23 MB | 90.3% less |
| ToUnity | 2.34 ms | 1.61 ms | 31.2% faster | 1.76 GB | 185.40 MB | 89.7% less |
| ToRustLegacy | 1.85 ms | 1.51 ms | 18.4% faster | 1.66 GB | 178.11 MB | 89.5% less |
| ToRoKAnd7DTD | 1.85 ms | 1.51 ms | 18.4% faster | 1.65 GB | 173.36 MB | 89.7% less |
| ToTerraria | 1.84 ms | 1.51 ms | 17.9% faster | 1.64 GB | 173.28 MB | 89.7% less |
| Parse | 1.46 ms | 1.19 ms | 18.5% faster | 1.36 GB | 254.18 MB | 81.7% less |
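For readers checking the percentage columns, they appear to follow the usual relative-change formulas. Worked through for the ToUnity row, and assuming 1 GB = 1024 MB (an assumption, but the one that reproduces the listed figure):

$$
\text{time improvement} = \frac{t_{\text{orig}} - t_{\text{opt}}}{t_{\text{orig}}} = \frac{2.34 - 1.61}{2.34} \approx 0.312 = 31.2\%
$$

$$
\text{memory reduction} = 1 - \frac{m_{\text{opt}}}{m_{\text{orig}}} = 1 - \frac{185.40\ \text{MB}}{1.76 \times 1024\ \text{MB}} \approx 1 - 0.103 = 89.7\%
$$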

- Add `Poolable<T>` base class
- Refactor `Element`, `Token`, `Lexer`, and `Entry` classes to use object pooling
- Add `ElementTreeBuilder` for more efficient element processing
- Refactor parsing workflow to use pooled objects
- Maintain backward compatibility with existing public API

- Replace delegate-based state machine with enum-based approach
- Reduce memory allocations and improve performance
- Simplify state transitions with direct state assignment
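As an illustration of the enum-based approach, the sketch below lexes bracketed tags with a `switch` over an enum, where a transition is a plain assignment to `state` rather than a new delegate. The `TinyLexer` and `LexState` names, the `[tag]` syntax, and the string token format are assumptions made for the example; the real `Lexer` is more involved.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical states for a toy lexer; not the PR's actual state set.
public enum LexState { Text, Tag }

public static class TinyLexer
{
    // Splits input such as "a[b]c[/b]" into "TEXT:..." and "TAG:..." tokens.
    public static List<string> Lex(string input)
    {
        var tokens = new List<string>();
        var buffer = new StringBuilder();
        LexState state = LexState.Text;

        foreach (char c in input)
        {
            switch (state)
            {
                case LexState.Text:
                    if (c == '[')
                    {
                        Flush(tokens, buffer, "TEXT");
                        state = LexState.Tag;   // direct state assignment
                    }
                    else
                    {
                        buffer.Append(c);
                    }
                    break;

                case LexState.Tag:
                    if (c == ']')
                    {
                        Flush(tokens, buffer, "TAG");
                        state = LexState.Text;
                    }
                    else
                    {
                        buffer.Append(c);
                    }
                    break;
            }
        }

        // An unterminated tag is emitted as literal text, bracket included.
        if (state == LexState.Tag) buffer.Insert(0, '[');
        Flush(tokens, buffer, "TEXT");
        return tokens;
    }

    // Emits the buffered characters as one token (if any) and clears the buffer.
    private static void Flush(List<string> tokens, StringBuilder buffer, string kind)
    {
        if (buffer.Length == 0) return;
        tokens.Add(kind + ":" + buffer);
        buffer.Clear();
    }
}
```

Because the state is a value type and the dispatch is a `switch`, stepping the machine allocates nothing per character, whereas a delegate-per-state design can allocate whenever a state function is captured or returned.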

ThePaperPlate added 5 commits April 6, 2025 14:18

Removed LINQ

4fd6122

Cache translation dictionaries and related Tags

ad9c916

Convert Lexer to use enum-based state machine

47d387b

- Replace delegate-based state machine with enum-based approach
- Reduce memory allocations and improve performance
- Simplify state transitions with direct state assignment

Optimize ToTreeFormat with StringBuilder pooling

29d7e3e
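The `StringBuilder` pooling mentioned for `ToTreeFormat` can be sketched roughly as follows. This is one common pattern (a single cached builder per thread with a capacity cap), offered as an assumption about the general technique rather than a copy of the commit; `StringBuilderPool`, `Rent`, `ToStringAndReturn`, and the 1024-character cap are hypothetical.

```csharp
using System;
using System.Text;

// Hypothetical single-slot, per-thread StringBuilder cache (not the PR's code).
public static class StringBuilderPool
{
    // Each thread keeps at most one cached builder, so no locking is needed.
    [ThreadStatic] private static StringBuilder _cached;

    // Builders that grew beyond this are dropped instead of being retained.
    private const int MaxRetainedCapacity = 1024;

    // Hands out the cached builder (cleared) or a new one if none is cached.
    public static StringBuilder Rent()
    {
        StringBuilder sb = _cached;
        if (sb == null) return new StringBuilder();
        _cached = null;
        sb.Clear();
        return sb;
    }

    // Produces the final string and gives the builder back to the cache.
    public static string ToStringAndReturn(StringBuilder sb)
    {
        string result = sb.ToString();
        if (sb.Capacity <= MaxRetainedCapacity) _cached = sb;
        return result;
    }
}
```

A caller would write `var sb = StringBuilderPool.Rent();`, append as usual, and finish with `return StringBuilderPool.ToStringAndReturn(sb);`, so repeated formatting calls reuse one buffer instead of allocating a new builder each time.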
