Skip to content

Scrap dlight lighthead link lists #3039

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 5 commits into from

Conversation

TwelveEyes
Copy link
Contributor

Replaces the lighthead double linked lists with unordered_maps. Vastly improves performance of large moving dlights over complex map geometry.

Test recordings on Bastion of Chaos (a very performance demanding map for GZDoom).

Renderer Before After
SW Pal before_pal.webm after_pal.webm
SW RGBA before_rgba.webm after_rgba.webm
HW before_hw.webm after_hw.webm

Before:
One large moving light utterly annihilates performance.

After:
Many large moving lights with acceptable performance.

Copy link
Member

@coelckers coelckers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more thing:

Please avoid using iterators in 'for' loops like this:

for (auto iter = touchlists->wall_tlist.begin(); iter != touchlists->wall_tlist.end(); iter++)

Use range-based 'for' loops instead.

@coelckers
Copy link
Member

As a general comment, have you tried using TMap instead of unordered_map? While this is faster than the old linked list, at least in this particular case, I know the heap allocation behavior of these standard containers is utterly atrocious and this is low level rendering code and extremely dynamic data after all.

Also: Have you made performance tests with different compilers and standard libraries? One very unpleasant thing about these is that what works well on one compiler might not on the next due to different implementation of the container classes.
There's a reason, after all, why we never migrated off our custom container classes. Unlike the STL variants, these have fixed properties we control ourselves.

@TwelveEyes
Copy link
Contributor Author

As a general comment, have you tried using TMap instead of unordered_map?

I have not tried TMap, no. I'll do some testing with TMap vs unordered_map.

Also: Have you made performance tests with different compilers and standard libraries? One very unpleasant thing about these is that what works well on one compiler might not on the next due to different implementation of the container classes. There's a reason, after all, why we never migrated off our custom container classes. Unlike the STL variants, these have fixed properties we control ourselves.

I've tested with clang 17 and VS2022 with comparable performance improvement. Don't know how GCC or earlier clang/VS compilations fare.

@Boondorl
Copy link
Contributor

Boondorl commented Apr 7, 2025

At a glance it looks like this removes the old light node allocator. Any particular reason? From what I could tell that was perfectly fine and simply prevented allocating/deallocating nodes constantly.

@TwelveEyes
Copy link
Contributor Author

At a glance it looks like this removes the old light node allocator. Any particular reason? From what I could tell that was perfectly fine and simply prevented allocating/deallocating nodes constantly.

What part are you referring to, exactly?

@TwelveEyes
Copy link
Contributor Author

Testing with TMap on this branch. The performance increase with moving dlights seems comparable to unordered_map. Though just looking directly down without moving from spawn in Bastion of Chaos, viewing stat fps with unordered_map I get ~31ms, while with TMap I get ~38ms.

@coelckers
Copy link
Member

At a glance it looks like this removes the old light node allocator. Any particular reason? From what I could tell that was perfectly fine and simply prevented allocating/deallocating nodes constantly.

You cannot use that if you let the map allocate the nodes.

@coelckers
Copy link
Member

I tried to understand where the speedup here comes from but failed so far.

@TwelveEyes
Can you please explain why using a map here makes such a difference? It cannot really be cache locality because with an unordered_map that isn't perfect either.

@MrRaveYard
Copy link
Contributor

MrRaveYard commented Apr 8, 2025

I tried to understand where the speedup here comes from but failed so far.

If we are talking old code vs this PR:

In order to avoid duplicate entries in the light lists, the list is linearly searched before anything is inserted. That's what VS profiler revealed (two years ago). Thus the old code is painfully slow in complex scenes if the light has to change position.

@TwelveEyes
Copy link
Contributor Author

@TwelveEyes Can you please explain why using a map here makes such a difference? It cannot really be cache locality because with an unordered_map that isn't perfect either.

As @MrRaveYard said, the biggest drag on performance comes from traversing the lighthead link lists for every sidedef and section the light touches to check if the lists already contain a lightnode.

https://github.com/ZDoom/gzdoom/blob/master/src/playsim/a_dynlight.cpp#L435-L443

@madame-rachelle
Copy link
Collaborator

I've gone with the branch merge instead of this PR, because ultimately while performance is indeed worse with TMap it is still an improvement over the previous status quo.

Also, it lights a goalpost for improving TMap at some point, hopefully. :)

@Talon1024
Copy link
Contributor

Talon1024 commented Apr 29, 2025

Does this change not apply to spot lights? When I was playing Dragon Sector last week, I noticed some severe FPS drops in the outdoor areas when I turned on the flashlight. And as far as I know, the flashlight in Dragon Sector/Project Brutality is a spot light.

@RicardoLuis0
Copy link
Collaborator

RicardoLuis0 commented Apr 29, 2025

Does this change not apply to spot lights? When I was playing Dragon Sector last week, I noticed some severe FPS drops in the outdoor areas when I turned on the flashlight. And as far as I know, the flashlight in Dragon Sector/Project Brutality is a spot light.

it will still have fps drops with extreme light ranges and sector heavy mods, since it still fundamentally iterates over sectors, this just improves the performance of the iteration, if you can on the flashlight mod, simply reduce the light range/radius

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants