Description
Description
Xcvrd CMIS Manager state machine is getting reset to CMIS_STATE_INSERTED from within on_port_update_event() method unnecessarily/improperly.
PR sonic-platform-daemons/#285 should be applied before attempting to reproduce and fix this issue.
Steps to reproduce the issue:
- Using a minimally CMIS 4.0 compliant transceiver module and associated port that is admin enabled, connect to an admin enabled remote peer port so that link (eventually) comes up.
- Save the configuration.
- Reboot local switch (line card), while leaving remote peer switch up and remote peer switch port admin enabled, and then monitor syslog PMON output during bringup.
Describe the results you received:
CMIS Manager state machine begins operating on the protagonist port transceiver and when it reaches CMIS_STATE_DP_INIT the module has its datapath initialized/configured. When this happens, if the remote peer interface is admin up then the link will come up. This will cause syncd originated Redis table update of port to oper 'up' state. The associated port update message that gets promulgated to Xcvrd will then cause Xcvrd to improperly reset the CMIS Manager state machine to CMIS_STATE_INSERTED.
Also of note here is a secondary unexpected behavior where, somehow, a subsequent port update message is promulgated to Xcvrd that shows the port as admin 'down' even though the actual admin state is 'up' and should never have changed. The genesis of this admin 'down' port update message is not currently understood and should be traced backward to its source for reconciliation.
Describe the results you expected:
CMIS Manager should ignore the oper state attribute/field in any arriving port update message, and should only reset the associated module to CMIS_STATE_INSERTED when admin state has toggled either from up-to-down or from down-to-up. Consecutive up-to-up and down-to-down admin status indications should be ignored/debounced.
Note that the above comment is made in a vacuum as relates to only oper_status and admin_status attributes/fields, and does not consider the situation where other germane transceiver related configuration attributes/fields (for example, 'laser_freq') are changed and thus may distinctly require reset of the state machine to CMIS_STATE_INSERTED (irrespective of admin status state).