Description
I'm seeing an intermittent panic in my app. It happens rarely, and I haven't been able to reproduce it.
The panic is:
[2024-10-29 12:35:08.448][39][critical][wasm] [source/extensions/common/wasm/context.cc:1181] wasm log envoy.vm: panicked at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/proxy-wasm-0.2.2/src/dispatcher.rs:375:13:
invalid context_id
[2024-10-29 12:35:08.448][39][error][wasm] [source/extensions/common/wasm/wasm_vm.cc:38] Function: proxy_on_response_headers failed: Uncaught RuntimeError: unreachable
Proxy-Wasm plugin in-VM backtrace:
0: 0x37cec - __rust_start_panic
1: 0x37cba - rust_panic
2: 0x37cac - _ZN3std9panicking20rust_panic_with_hook17h844c7fdc9e749b51E
3: 0x336a5 - _ZN3std9panicking11begin_panic28_$u7b$$u7b$closure$u7d$$u7d$17h2d5a84dc577a6a1fE
4: 0x33668 - _ZN3std10sys_common9backtrace26__rust_end_short_backtrace17h1cd545de85e32f17E
5: 0x33a93 - _ZN3std9panicking11begin_panic17hef1cc66353531458E
6: 0x29caf - _ZN10proxy_wasm10dispatcher10Dispatcher24on_http_response_headers17ha0ec21a408e1e8a6E
7: 0x2f992 - proxy_on_response_headers
libc++abi: Pure virtual function called!
A few details about this specific Envoy setup: I'm running an external_processor filter followed by a wasm filter. Envoy's clusters and routes are configured dynamically from an xds-server.
One interesting detail is that all my replicas seem to crash at the same time. This made me suspect the shared xds service, but I can't rule out a lifecycle race condition that occurs after failures from my external_processor grpc service.
I tried to reproduce it with a similar setup (ext_proc + wasm), making the ext_proc time out for half of the requests while also sending cds updates every 10ms, but everything works as expected.
Does the team have any ideas about what could cause this or how to investigate further? It looks as if the response_headers handler is called after the context has been removed (on_done).
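To make the suspected sequence concrete, here is a minimal sketch of what I think is happening. It is a simplified model, not the actual proxy-wasm source: `ToyDispatcher`, `CallbackError`, `create_context`, and `delete_context` are hypothetical names, and I'm assuming the SDK keeps a map of live contexts keyed by context_id. The real dispatcher panics on a missing id (per the log above); the sketch returns an error instead so the condition is easy to observe.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a dispatcher that tracks live HTTP contexts by id.
/// This is an illustrative model only, not the proxy-wasm implementation.
#[derive(Default)]
struct ToyDispatcher {
    http_streams: HashMap<u32, String>,
}

/// Error returned where the real SDK appears to panic with "invalid context_id".
#[derive(Debug, PartialEq)]
enum CallbackError {
    InvalidContextId(u32),
}

impl ToyDispatcher {
    /// Host creates a context for a new HTTP stream.
    fn create_context(&mut self, context_id: u32) {
        self.http_streams
            .insert(context_id, format!("stream-{context_id}"));
    }

    /// Host tears the context down (roughly the on_done / delete step).
    fn delete_context(&mut self, context_id: u32) {
        self.http_streams.remove(&context_id);
    }

    /// Host delivers response headers; fails if the context is already gone.
    fn on_http_response_headers(&self, context_id: u32) -> Result<&str, CallbackError> {
        self.http_streams
            .get(&context_id)
            .map(String::as_str)
            .ok_or(CallbackError::InvalidContextId(context_id))
    }
}
```

In this model, calling `create_context(2)`, then `delete_context(2)`, then `on_http_response_headers(2)` yields `Err(InvalidContextId(2))`, which would correspond to the panic above if the host really does invoke the callback after the context was deleted.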