Skip to content

🐛 Bug: Fix for OCPP Callback Queue Panic – Suggestion for Library Improvement #363

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
2 of 5 tasks
qosmotec opened this issue May 18, 2025 · 3 comments
Open
2 of 5 tasks

Comments

@qosmotec
Copy link

📜 Description

Hi Lorenzo,

We've encountered a recurring panic in the OCPP library related to interface conversion errors when handling asynchronous responses. Here's a sample of the error:

panic: interface conversion: ocpp.Response is *remotetrigger.TriggerMessageConfirmation, not *smartcharging.SetChargingProfileConfirmation

panic: interface conversion: ocpp.Response is *core.HeartbeatConfirmation, not *core.MeterValuesConfirmation

This issue occurred on at least two charging stations—one of them crashed up to 20 times per day due to this.

Investigation

We reviewed previous issues on the GitHub repo, including:

We traced the root cause to how the callback queue works in callbackqueue.go.

Problem

The current implementation queues callbacks per ClientID, but dequeues them in FIFO order. This assumes that responses always arrive in sequence, which doesn't hold if a station delays or times out.

In timeout scenarios:

  1. The callback is dequeued due to timeout.
  2. A delayed response later arrives.
  3. Its callback no longer exists, so the wrong handler is invoked.
  4. This results in an interface conversion panic.

Our Solution

We modified the CallbackQueue to maintain callbacks by both ClientID and request type. That way, callbacks are queued and dequeued by type, avoiding mismatched handlers.

type CallbackQueue struct {
    callbacksMutex sync.RWMutex
    // Old:
    // callbacks map[string][]func(confirmation ocpp.Response, err error)
    
    // New:
    callbacks map[string]map[RequestType][]func(confirmation ocpp.Response, err error)
}

This change has completely eliminated the panic in our environment.

Alternative Idea

A different approach could be queuing by message ID instead, and validating callback presence before handling.

Next Steps

Would you be open to reviewing this solution for inclusion in the library? I’m happy to submit a pull request if you think it’s a good direction.

Thanks again for your great work on this project!

Best regards,

👟 Reproduction steps

  1. The callback is dequeued due to timeout.
  2. A delayed response later arrives.
  3. Its callback no longer exists, so the wrong handler is invoked.
  4. This results in an interface conversion panic.

👍 Expected behavior

Callback queue dequeue should find answer by type to avoid panic, not only dequeuing the last message. A different approach could be queuing by message ID instead, and validating callback presence before handling.

👎 Actual Behavior

Callback queue dequeue uses last Message only, this leads to panic if that one doesn't match.

What OCPP version are you using?

  • OCPP 1.6
  • OCPP 2.0.1

Are you using any OCPP extensions?

  • OCPP 1.6 Security
  • OCPP 1.6 Plug and Charge

release version

No response

📃 Provide any additional context for the Bug.

No response

👀 Have you spent some time to check if this bug has been found before?

  • I checked and didn't find a similar issue
@qosmotec
Copy link
Author

Image

@xBlaz3kx
Copy link
Contributor

xBlaz3kx commented May 19, 2025

Hey,

great catch! However, I think the alternate approach might be the correct one, according to OCPP spec:

Image

Im not quite a 100% familiar with internals, but in this case a callback queue wouldnt even be needed, as we wouldnt actually be enforcing the message order/sequence.

@lorenzodonini Could you asses this approach?

@AndrewYEEE
Copy link

📜 描述

你好,Lorenzo,

在處理非同步回應時,我們在 OCPP 庫中遇到了與介面轉換錯誤相關的反覆出現的恐慌。以下是錯誤的範例:

panic: interface conversion: ocpp.Response is *remotetrigger.TriggerMessageConfirmation, not *smartcharging.SetChargingProfileConfirmation

panic: interface conversion: ocpp.Response is *core.HeartbeatConfirmation, not *core.MeterValuesConfirmation

至少有兩個充電站出現此問題,其中一個充電站因此每天崩潰多達 20 次。

調查

我們回顧了 GitHub 倉庫中先前的問題,包括:

我們追溯了根本原因,發現回呼隊列的工作方式callbackqueue.go

問題

目前實現按 排隊回調ClientID,但按 FIFO 順序將其出隊。這假設回應總是按順序到達,但如果站點延遲或逾時,則不成立。

在超時場景中:

  1. 由於超時,回調已出隊。
  2. 稍後會收到延遲的回覆。
  3. 它的回調不再存在,因此呼叫了錯誤的處理程序。
  4. 這會導致介面轉換恐慌。

我們的解決方案

我們修改了以透過和請求類型CallbackQueue來維護回調。ClientID這樣,回調就會根據類型排隊和出隊,避免處理程序不符。

type CallbackQueue struct {
callbacksMutex sync.RWMutex
// Old:
// callbacks map[string][]func(confirmation ocpp.Response, err error)

// New:
callbacks map[string]map[RequestType][]func(confirmation ocpp.Response, err error)

}
這項改變徹底消除了我們環境中的恐慌。

替代方案

另一種方法可能是透過訊息 ID 進行排隊,並在處理之前驗證回呼的存在。

後續步驟

您是否願意審查該解決方案並將其納入庫中?如果您認為這是一個好的方向,我很樂意提交拉取請求。

再次感謝您在這個專案上所做的出色工作!

此致,

👟 復現步驟

  1. 由於超時,回調已出隊。
  2. 稍後會收到延遲的回覆。
  3. 它的回調不再存在,因此呼叫了錯誤的處理程序。
  4. 這會導致介面轉換恐慌。

👍 預期行為

回調隊列出隊應該按類型找到答案以避免恐慌,而不僅僅是出隊最後一條訊息。另一種方法可能是透過訊息 ID 進行排隊,並在處理之前驗證回呼的存在。

👎 實際行為

回調隊列出隊僅使用最後一條訊息,如果該訊息不匹配,則會導致恐慌。

您使用的 OCPP 版本是什麼?

  • OCPP 1.6[ ] OCPP 2.0.1

您是否使用任何 OCPP 擴充功能?

  • OCPP 1.6 安全性[ ] OCPP 1.6 即插即充

發布版本

沒有回應

📃 為 Bug 提供任何額外的上下文。

沒有回應

👀 您是否花了一些時間檢查過此錯誤是否之前被發現過?

  • 我檢查了一下,沒有發現類似的問題

Can you share the modified code? Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants