Skip to content

fix: invalid Android invalidation of hybrid data #7197

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Mar 10, 2025

Conversation

tjzel
Copy link
Collaborator

@tjzel tjzel commented Mar 7, 2025

Summary

When React Native is reloaded when WorkletsModule.installTurboModule() wasn't called yet we tried to invoke invalidateCpp() which resulted in a nullptr exception, because hybrid data wasn't yet initialized.

I applied similar safety mechanism to all relevant code, turns out it narrowed down only to NativeProxy.java.

Test plan

See that App no longer crashes on invalidateCpp() on a double reload on Android.

@tjzel tjzel requested a review from tomekzaw March 7, 2025 09:50
@tjzel tjzel changed the title fix(worklets): invalid Android invalidation fix: invalid Android invalidation of hybrid data Mar 7, 2025
@tjzel tjzel mentioned this pull request Mar 7, 2025
@tjzel tjzel requested a review from piaskowyk March 7, 2025 15:36
@tjzel tjzel mentioned this pull request Mar 7, 2025
13 tasks
@tjzel tjzel added this pull request to the merge queue Mar 10, 2025
Merged via the queue into main with commit 9a57bcf Mar 10, 2025
14 checks passed
@tjzel tjzel deleted the @tjzel/worklets/fix-android-invalidation branch March 10, 2025 18:59
@tomekzaw
Copy link
Member

tomekzaw commented Mar 10, 2025

Wouldn't it be easier if invalidate method would be just marked synchronized?

github-merge-queue bot pushed a commit that referenced this pull request Mar 12, 2025
## Summary

Fixes #7182
Requires #7197

Currently on iOS React Native can go into a non-fatal race condition on
a double reload. Double reload can happen during an OTA update, when an
app is reloaded immediately after evaluating the bundle.

The flow goes more or less like this. I haven't explored it to the
deepest details.
1. On the first reload a new JavaScript runtime is created.
2. The new runtime evaluates the bundle, starts to populate
`RCTModuleRegistry`.
3. On the second reload a new JavaScript runtime is created. The old
runtime remains and keeps executing. Both runtimes execute on the same
thread, on the same queue.
4. `WorkletsModule.installTurboModule()` is called from the first
runtime. `WorkletsModule` is fetched from the `RCTModuleRegistry`.
5. The second runtime overwrites (clears) the `RCTModuleRegistry` and
starts to populate it. I don't know why both runtimes target the same
instance of `RCTModuleRegistry`.
6. `WorkletsModule.installTurboModule()` finishes successfully on the
first runtime.
7. `ReanimatedModule.installTurboModule()` is invoked in quick
succession on the first runtime.
8. `ReanimatedModule` tries to fetch `WorkletsModule` from
`RCTModuleRegistry`. However, `RCTModuleRegistry` was wiped and it
doesn't have the fully installed instance of `WorkletsModule` anymore.
Therefore a new instance of `WorkletsModule` is created and inserted
into the `RCTModuleRegistry`.
9. In JavaScript we guarantee that
`ReanimatedModule.installTurboModule()` is invoked only after
`WorkletsModule.installTurboModule()`. We follow that heuristic in ObjC
and therefore try operating on a non-installed instance of
`WorkletsModule` resulting in a `NullPtrException`.

We can effectively detect that flow and ignore it, since that race
condition is fatal only for the first runtime which is torn down soon
after anyway.

See the provided screenshots which illustrate the problems.

| <img width="1018" alt="Screenshot 2025-03-07 at 10 34 00"
src="https://github.com/user-attachments/assets/263b9f0d-9573-48a8-99e8-82099221c3e3"
/> | <img width="1014" alt="Screenshot 2025-03-07 at 12 25 23"
src="https://github.com/user-attachments/assets/7dde0eda-526b-43fa-8181-1bec4ad7c38e"
/> |
|:---|:---| 
| Setting `moduleRegistry` for `WorkletsModule` during a healthy
execution of `WorkletsModule.installTurboModule()` | Setting
`moduleRegistry` for `WorkletsModule` during race-condition execution.
Note that the list of the modules is much shorter and that
`WorkletsModule` isn't there. |

## Test plan

To reproduce this issue, reload twice on iOS in `fabric-example`. For
best effect try to reload after ~0.5s after the first time, but it's
easily reproducible if the App isn't slowed heavily by debugger
instructions.

With these changes Reanimated doesn't crash, but the race condition
continues to manifest.
tjzel added a commit that referenced this pull request Mar 12, 2025
When React Native is reloaded when `WorkletsModule.installTurboModule()`
wasn't called yet we tried to invoke `invalidateCpp()` which resulted in
a nullptr exception, because hybrid data wasn't yet initialized.

I applied similar safety mechanism to all relevant code, turns out it
narrowed down only to `NativeProxy.java`.

See that App no longer crashes on `invalidateCpp()` on a double reload
on Android.
tjzel added a commit that referenced this pull request Mar 12, 2025
Fixes #7182
Requires #7197

Currently on iOS React Native can go into a non-fatal race condition on
a double reload. Double reload can happen during an OTA update, when an
app is reloaded immediately after evaluating the bundle.

The flow goes more or less like this. I haven't explored it to the
deepest details.
1. On the first reload a new JavaScript runtime is created.
2. The new runtime evaluates the bundle, starts to populate
`RCTModuleRegistry`.
3. On the second reload a new JavaScript runtime is created. The old
runtime remains and keeps executing. Both runtimes execute on the same
thread, on the same queue.
4. `WorkletsModule.installTurboModule()` is called from the first
runtime. `WorkletsModule` is fetched from the `RCTModuleRegistry`.
5. The second runtime overwrites (clears) the `RCTModuleRegistry` and
starts to populate it. I don't know why both runtimes target the same
instance of `RCTModuleRegistry`.
6. `WorkletsModule.installTurboModule()` finishes successfully on the
first runtime.
7. `ReanimatedModule.installTurboModule()` is invoked in quick
succession on the first runtime.
8. `ReanimatedModule` tries to fetch `WorkletsModule` from
`RCTModuleRegistry`. However, `RCTModuleRegistry` was wiped and it
doesn't have the fully installed instance of `WorkletsModule` anymore.
Therefore a new instance of `WorkletsModule` is created and inserted
into the `RCTModuleRegistry`.
9. In JavaScript we guarantee that
`ReanimatedModule.installTurboModule()` is invoked only after
`WorkletsModule.installTurboModule()`. We follow that heuristic in ObjC
and therefore try operating on a non-installed instance of
`WorkletsModule` resulting in a `NullPtrException`.

We can effectively detect that flow and ignore it, since that race
condition is fatal only for the first runtime which is torn down soon
after anyway.

See the provided screenshots which illustrate the problems.

| <img width="1018" alt="Screenshot 2025-03-07 at 10 34 00"
src="https://github.com/user-attachments/assets/263b9f0d-9573-48a8-99e8-82099221c3e3"
/> | <img width="1014" alt="Screenshot 2025-03-07 at 12 25 23"
src="https://github.com/user-attachments/assets/7dde0eda-526b-43fa-8181-1bec4ad7c38e"
/> |
|:---|:---|
| Setting `moduleRegistry` for `WorkletsModule` during a healthy
execution of `WorkletsModule.installTurboModule()` | Setting
`moduleRegistry` for `WorkletsModule` during race-condition execution.
Note that the list of the modules is much shorter and that
`WorkletsModule` isn't there. |

To reproduce this issue, reload twice on iOS in `fabric-example`. For
best effect try to reload after ~0.5s after the first time, but it's
easily reproducible if the App isn't slowed heavily by debugger
instructions.

With these changes Reanimated doesn't crash, but the race condition
continues to manifest.
tjzel added a commit that referenced this pull request Mar 16, 2025
When React Native is reloaded when `WorkletsModule.installTurboModule()`
wasn't called yet we tried to invoke `invalidateCpp()` which resulted in
a nullptr exception, because hybrid data wasn't yet initialized.

I applied similar safety mechanism to all relevant code, turns out it
narrowed down only to `NativeProxy.java`.

See that App no longer crashes on `invalidateCpp()` on a double reload
on Android.
tjzel added a commit that referenced this pull request Mar 16, 2025
Fixes #7182
Requires #7197

Currently on iOS React Native can go into a non-fatal race condition on
a double reload. Double reload can happen during an OTA update, when an
app is reloaded immediately after evaluating the bundle.

The flow goes more or less like this. I haven't explored it to the
deepest details.
1. On the first reload a new JavaScript runtime is created.
2. The new runtime evaluates the bundle, starts to populate
`RCTModuleRegistry`.
3. On the second reload a new JavaScript runtime is created. The old
runtime remains and keeps executing. Both runtimes execute on the same
thread, on the same queue.
4. `WorkletsModule.installTurboModule()` is called from the first
runtime. `WorkletsModule` is fetched from the `RCTModuleRegistry`.
5. The second runtime overwrites (clears) the `RCTModuleRegistry` and
starts to populate it. I don't know why both runtimes target the same
instance of `RCTModuleRegistry`.
6. `WorkletsModule.installTurboModule()` finishes successfully on the
first runtime.
7. `ReanimatedModule.installTurboModule()` is invoked in quick
succession on the first runtime.
8. `ReanimatedModule` tries to fetch `WorkletsModule` from
`RCTModuleRegistry`. However, `RCTModuleRegistry` was wiped and it
doesn't have the fully installed instance of `WorkletsModule` anymore.
Therefore a new instance of `WorkletsModule` is created and inserted
into the `RCTModuleRegistry`.
9. In JavaScript we guarantee that
`ReanimatedModule.installTurboModule()` is invoked only after
`WorkletsModule.installTurboModule()`. We follow that heuristic in ObjC
and therefore try operating on a non-installed instance of
`WorkletsModule` resulting in a `NullPtrException`.

We can effectively detect that flow and ignore it, since that race
condition is fatal only for the first runtime which is torn down soon
after anyway.

See the provided screenshots which illustrate the problems.

| <img width="1018" alt="Screenshot 2025-03-07 at 10 34 00"
src="https://github.com/user-attachments/assets/263b9f0d-9573-48a8-99e8-82099221c3e3"
/> | <img width="1014" alt="Screenshot 2025-03-07 at 12 25 23"
src="https://github.com/user-attachments/assets/7dde0eda-526b-43fa-8181-1bec4ad7c38e"
/> |
|:---|:---|
| Setting `moduleRegistry` for `WorkletsModule` during a healthy
execution of `WorkletsModule.installTurboModule()` | Setting
`moduleRegistry` for `WorkletsModule` during race-condition execution.
Note that the list of the modules is much shorter and that
`WorkletsModule` isn't there. |

To reproduce this issue, reload twice on iOS in `fabric-example`. For
best effect try to reload after ~0.5s after the first time, but it's
easily reproducible if the App isn't slowed heavily by debugger
instructions.

With these changes Reanimated doesn't crash, but the race condition
continues to manifest.
tjzel added a commit that referenced this pull request Mar 16, 2025
When React Native is reloaded when `WorkletsModule.installTurboModule()`
wasn't called yet we tried to invoke `invalidateCpp()` which resulted in
a nullptr exception, because hybrid data wasn't yet initialized.

I applied similar safety mechanism to all relevant code, turns out it
narrowed down only to `NativeProxy.java`.

See that App no longer crashes on `invalidateCpp()` on a double reload
on Android.
tjzel added a commit that referenced this pull request Mar 17, 2025
When React Native is reloaded when `WorkletsModule.installTurboModule()`
wasn't called yet we tried to invoke `invalidateCpp()` which resulted in
a nullptr exception, because hybrid data wasn't yet initialized.

I applied similar safety mechanism to all relevant code, turns out it
narrowed down only to `NativeProxy.java`.

See that App no longer crashes on `invalidateCpp()` on a double reload
on Android.
tjzel added a commit that referenced this pull request Mar 17, 2025
When React Native is reloaded when `WorkletsModule.installTurboModule()`
wasn't called yet we tried to invoke `invalidateCpp()` which resulted in
a nullptr exception, because hybrid data wasn't yet initialized.

I applied similar safety mechanism to all relevant code, turns out it
narrowed down only to `NativeProxy.java`.

See that App no longer crashes on `invalidateCpp()` on a double reload
on Android.
tjzel added a commit that referenced this pull request Mar 17, 2025
Fixes #7182
Requires #7197

Currently on iOS React Native can go into a non-fatal race condition on
a double reload. Double reload can happen during an OTA update, when an
app is reloaded immediately after evaluating the bundle.

The flow goes more or less like this. I haven't explored it to the
deepest details.
1. On the first reload a new JavaScript runtime is created.
2. The new runtime evaluates the bundle, starts to populate
`RCTModuleRegistry`.
3. On the second reload a new JavaScript runtime is created. The old
runtime remains and keeps executing. Both runtimes execute on the same
thread, on the same queue.
4. `WorkletsModule.installTurboModule()` is called from the first
runtime. `WorkletsModule` is fetched from the `RCTModuleRegistry`.
5. The second runtime overwrites (clears) the `RCTModuleRegistry` and
starts to populate it. I don't know why both runtimes target the same
instance of `RCTModuleRegistry`.
6. `WorkletsModule.installTurboModule()` finishes successfully on the
first runtime.
7. `ReanimatedModule.installTurboModule()` is invoked in quick
succession on the first runtime.
8. `ReanimatedModule` tries to fetch `WorkletsModule` from
`RCTModuleRegistry`. However, `RCTModuleRegistry` was wiped and it
doesn't have the fully installed instance of `WorkletsModule` anymore.
Therefore a new instance of `WorkletsModule` is created and inserted
into the `RCTModuleRegistry`.
9. In JavaScript we guarantee that
`ReanimatedModule.installTurboModule()` is invoked only after
`WorkletsModule.installTurboModule()`. We follow that heuristic in ObjC
and therefore try operating on a non-installed instance of
`WorkletsModule` resulting in a `NullPtrException`.

We can effectively detect that flow and ignore it, since that race
condition is fatal only for the first runtime which is torn down soon
after anyway.

See the provided screenshots which illustrate the problems.

| <img width="1018" alt="Screenshot 2025-03-07 at 10 34 00"
src="https://github.com/user-attachments/assets/263b9f0d-9573-48a8-99e8-82099221c3e3"
/> | <img width="1014" alt="Screenshot 2025-03-07 at 12 25 23"
src="https://github.com/user-attachments/assets/7dde0eda-526b-43fa-8181-1bec4ad7c38e"
/> |
|:---|:---|
| Setting `moduleRegistry` for `WorkletsModule` during a healthy
execution of `WorkletsModule.installTurboModule()` | Setting
`moduleRegistry` for `WorkletsModule` during race-condition execution.
Note that the list of the modules is much shorter and that
`WorkletsModule` isn't there. |

To reproduce this issue, reload twice on iOS in `fabric-example`. For
best effect try to reload after ~0.5s after the first time, but it's
easily reproducible if the App isn't slowed heavily by debugger
instructions.

With these changes Reanimated doesn't crash, but the race condition
continues to manifest.
tjzel added a commit that referenced this pull request Mar 18, 2025
Fixes #7182
Requires #7197

Currently on iOS React Native can go into a non-fatal race condition on
a double reload. Double reload can happen during an OTA update, when an
app is reloaded immediately after evaluating the bundle.

The flow goes more or less like this. I haven't explored it to the
deepest details.
1. On the first reload a new JavaScript runtime is created.
2. The new runtime evaluates the bundle, starts to populate
`RCTModuleRegistry`.
3. On the second reload a new JavaScript runtime is created. The old
runtime remains and keeps executing. Both runtimes execute on the same
thread, on the same queue.
4. `WorkletsModule.installTurboModule()` is called from the first
runtime. `WorkletsModule` is fetched from the `RCTModuleRegistry`.
5. The second runtime overwrites (clears) the `RCTModuleRegistry` and
starts to populate it. I don't know why both runtimes target the same
instance of `RCTModuleRegistry`.
6. `WorkletsModule.installTurboModule()` finishes successfully on the
first runtime.
7. `ReanimatedModule.installTurboModule()` is invoked in quick
succession on the first runtime.
8. `ReanimatedModule` tries to fetch `WorkletsModule` from
`RCTModuleRegistry`. However, `RCTModuleRegistry` was wiped and it
doesn't have the fully installed instance of `WorkletsModule` anymore.
Therefore a new instance of `WorkletsModule` is created and inserted
into the `RCTModuleRegistry`.
9. In JavaScript we guarantee that
`ReanimatedModule.installTurboModule()` is invoked only after
`WorkletsModule.installTurboModule()`. We follow that heuristic in ObjC
and therefore try operating on a non-installed instance of
`WorkletsModule` resulting in a `NullPtrException`.

We can effectively detect that flow and ignore it, since that race
condition is fatal only for the first runtime which is torn down soon
after anyway.

See the provided screenshots which illustrate the problems.

| <img width="1018" alt="Screenshot 2025-03-07 at 10 34 00"
src="https://github.com/user-attachments/assets/263b9f0d-9573-48a8-99e8-82099221c3e3"
/> | <img width="1014" alt="Screenshot 2025-03-07 at 12 25 23"
src="https://github.com/user-attachments/assets/7dde0eda-526b-43fa-8181-1bec4ad7c38e"
/> |
|:---|:---|
| Setting `moduleRegistry` for `WorkletsModule` during a healthy
execution of `WorkletsModule.installTurboModule()` | Setting
`moduleRegistry` for `WorkletsModule` during race-condition execution.
Note that the list of the modules is much shorter and that
`WorkletsModule` isn't there. |

To reproduce this issue, reload twice on iOS in `fabric-example`. For
best effect try to reload after ~0.5s after the first time, but it's
easily reproducible if the App isn't slowed heavily by debugger
instructions.

With these changes Reanimated doesn't crash, but the race condition
continues to manifest.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants