Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Add new platform management API. Currently with support for chassis, PSUs, fans and watchdog #13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add new platform management API. Currently with support for chassis, PSUs, fans and watchdog #13
Changes from 13 commits
b039969
aa55f96
bfce7ad
d169853
bd6b6ed
8dfa89f
e026fea
d281571
7e1f1ec
7ab83cf
8e9aaaf
d605896
5c91399
5fbf599
147e830
f7c3810
9bc056f
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reboot cause can be persistent on some product and have to be explicitly cleared.
Otherwise reading this information after a reboot will show all the reload causes that happened previous times.
Some platform might also have limitation on the number of reboot causes that can be stored.
I would suggest either a default parameter
clear=False
to this method or to add a new one calledclear_reboot_cause
.Also I think this function should only be called once with
clear=True
and the result cached somewhere in a tmpfs. This means that future asks for reload cause information would go through the cache.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the need to clear the reboot cause after reading is platform-specific functionality, I believe this logic should be left to the platform-specific implementation of the function. If the platform requires clearing the reboot cause, it should be done in the implementation and a file saved to tmpfs. The function can check for the presence of the file first before attempting to read/clear the register. I prefer this to adding a
clear=True
parameter to the base method, and I think adding aclear_reboot_cause()
method is unnecessary, because there should be no need to explicitly clear the reboot cause.If you're concerned that not all platforms will support all reboot causes, that is expected. The base class will define all possible reboot causes that SONiC understands; the platform-specific implementation will simply return the subset which is applicable to that platform. If this doesn't address your concern, please elaborate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would assume all the platforms to behave the same way with regard to the reboot causes. However this information would have to be crowd-sourced from other vendors as well for more relevant data.
I'm fine with letting the responsibility of clearing and caching the reboot cause to the plugin implementation.
Is there a place in SONiC where such cache files should go? Currently almost everything is backed by the flash including
/tmp
. The only place that is atmpfs
and would feel "suitable" seems to be/run
. Is there some guidelines somewhere regarding this? I would be interested to know the reason why/tmp
is not atmpfs
.My concern was a bit different. Your previous answer ended up answered my question but I'll elaborate a bit more.
Basically the concept of a reboot cause is to be stored on an external device until being read.
These devices usually have space restriction and can only save a given number of reboot cause after which they won't store anything until being cleared. For example if your device can store 2 reboot cause and you trigger 2 watchdogs and then 1 powerloss, you might not see the powerloss if you haven't cleared the reboot cause after each reboot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wanted to get confirmation that if a platform takes a Baseboard Management Controller (BMC) based approach (to handle thermals, events etc), these (and the other such APIs) can remain unimplemented..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The intention here is for daemons to receive notifications of changes in platform hardware state in order to update SONiC's view of device state. For instance, if a tranceiver, PSU or fan tray is added or removed. Can you provide an example of an event handled by the BMC which SONiC would not need to be made aware of?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that the events referred to is specifically to track the presence/absence i..e OIR of the modules (essentially this API seems to be a generalization of the get_transceiver_change_event()).
If the BMC implements the thermals, an addition/removal of FAN tray would be completely handled by the BMC itself - but atleast for generating a syslog this API needs to be implemented for SONiC..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm assuming this is only for the fan daemon logic usage?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is the initial intention. We may find other uses in the future, however. Is there a concern?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No concerns. I'm was just curious if the fand was in the roadmap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see everywhere that the fan speed is an int from 0 to 100.
Some device dealing with fans can have finer grained control (like 0 to 255) which means loosing a bit of precision.
Was this taken into consideration and integer deemed better than floats?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you prefer a 0.0 ... 1.0 float scale? This should allow for finer-grained control if necessary, while maintaining a more universal scale.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't mind either way, integer percentages 0 to 100 sounds good to me.
I just wanted to bring this topic up as the sysfs interface defines a pwm value to be ranging from 0 to 255.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If all platforms used sysfs, the 0-255 scale would be fine. However, we'd like a universal scale that will work with other interfaces (IPMI, etc.).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have any strong preference. I just wanted to bring that topic to make sure it wasn't overlooked.
A percentage sounds perfect to me.