Add HLD for Orchagent error handling improvements #1698

Open
wants to merge 7 commits into
base: master

Conversation

prabhataravind

This HLD change attempts to address the following:

  • Handle all ASIC/SAI programming errors gracefully without causing orchagent to crash or restart
  • Detect missed notifications from APP_DB to orchagent in SONiC systems that use redis-based communication channels
  • Detect out-of-sync entries between APP_DB and ASIC_DB

@prabhataravind prabhataravind marked this pull request as ready for review June 24, 2024 00:37
@zhangyanzhao
Collaborator

Please leave comments if you want to be a reviewer of this HLD. Thanks.

@zhangyanzhao
Collaborator

@prabhataravind can you please add the code PRs by referring to #806? Thanks.

@zhangyanzhao
Collaborator

HLD PR is not merged, no code PR. Move to backlog

@mssonicbld
Collaborator

/azp run

No pipelines are associated with this pull request.

Signed-off-by: Prabhat Aravind <[email protected]>
@mssonicbld
Collaborator

/azp run

No pipelines are associated with this pull request.

![sai status handling](images/sai_status_handling.png)

Note that some combinations in the table above are not valid scenarios, for example SAI_STATUS_INSUFFICIENT_RESOURCES when removing an object or SAI_STATUS_ITEM_NOT_FOUND when creating an object. They are listed for completeness.
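As an illustration only, the idea of a table keyed on operation and SAI status, with some combinations flagged as not expected to occur, could be sketched as follows. The status names are the ones quoted above plus SAI_STATUS_SUCCESS; the `Op` enum, the `classify` helper, and the result strings `ok`, `unexpected`, and `handle` are hypothetical and do not reproduce the actual table in the image or any orchagent code.

```python
from enum import Enum


class Op(Enum):
    """Object operations considered by the sketch (hypothetical names)."""
    CREATE = "create"
    SET = "set"
    REMOVE = "remove"


# Combinations the HLD text calls out as not valid scenarios.
# Only the two examples named in the text are listed here.
INVALID_COMBINATIONS = {
    (Op.REMOVE, "SAI_STATUS_INSUFFICIENT_RESOURCES"),
    (Op.CREATE, "SAI_STATUS_ITEM_NOT_FOUND"),
}


def classify(op: Op, status: str) -> str:
    """Return a coarse, hypothetical classification for an (op, status) pair."""
    if status == "SAI_STATUS_SUCCESS":
        return "ok"
    if (op, status) in INVALID_COMBINATIONS:
        # Listed for completeness in the table, but should not happen.
        return "unexpected"
    # Any other failure would go through the per-status handling
    # described by the table (not reproduced in this sketch).
    return "handle"


if __name__ == "__main__":
    print(classify(Op.REMOVE, "SAI_STATUS_INSUFFICIENT_RESOURCES"))
```

Keeping the invalid pairs in a separate set means the handling path can still log them and continue instead of treating them as a reason to abort, which fits the stated goal of not crashing on SAI errors.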

Contributor

Please add a section for Bulk API failure handling.

Contributor

Please check for bulk stats API failures

@anilpannala anilpannala moved this from 📋 In Plan Features to MovedToBacklog in SONiC 202505 Release May 30, 2025
Projects
Status: MovedToBacklog
5 participants