-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Ensure redis client socket is closed and prevent race condition #11001
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
metadata = config_db.get_table('DEVICE_METADATA') | ||
if metadata['localhost']['sub_role'] == FRONTEND_ASIC_SUB_ROLE: | ||
front_ns.append(namespace) | ||
elif metadata['localhost']['sub_role'] == BACKEND_ASIC_SUB_ROLE: | ||
back_ns.append(namespace) | ||
if config_db: | ||
# Ensure the client socket is closed | ||
config_db.close('CONFIG_DB') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see build failure at this close call
File "/usr/local/lib/python3.9/dist-packages/sonic_py_common/multi_asic.py", line 218, in get_all_namespaces
config_db.close('CONFIG_DB')
File "/usr/local/lib/python3.9/dist-packages/swsssdk/dbconnector.py", line 271, in close
self.dbintf.close(db_name)
File "/usr/local/lib/python3.9/dist-packages/swsssdk/interface.py", line 216, in close
self.redis_clients[db_name].connection_pool.disconnect()
AttributeError: 'SwssSyncClient' object has no attribute 'connection_pool'
if is_multi_asic(): | ||
for asic in range(num_asics): | ||
namespace = "{}{}".format(ASIC_NAME_PREFIX, asic) | ||
command = 'sonic-db-cli -n {} CONFIG_DB HGET "DEVICE_METADATA|localhost" "sub_role"'.format(namespace) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with this change, database container is failing to boot with the error at
"sonic-db-cli -n {} CONFIG_DB HGET "DEVICE_METADATA|localhost" "sub_role""
/usr/local/lib/python3.9/dist-packages/supervisor/options.py:473: UserWarning: Supervisord is running as root and it is searching for its configuration file in default locations (inc.
self.warnings.warn(
Traceback (most recent call last):
File "/usr/local/bin/sonic-cfggen", line 32, in
from minigraph import minigraph_encoder, parse_xml, parse_device_desc_xml, parse_asic_sub_role, parse_asic_switch_type
File "/usr/local/lib/python3.9/dist-packages/minigraph.py", line 14, in
from portconfig import get_port_config
File "/usr/local/lib/python3.9/dist-packages/portconfig.py", line 10, in
from sonic_py_common.multi_asic import get_asic_id_from_name
File "/usr/local/lib/python3.9/dist-packages/sonic_py_common/multi_asic.py", line 447, in
raise RuntimeError(
RuntimeError: Command sonic-db-cli -n asic0 CONFIG_DB HGET "DEVICE_METADATA|localhost" "sub_role" failed with stderr None
/usr/local/lib/python3.9/dist-packages/supervisor/options.py:473: UserWarning: Supervisord is running as root and it is searching for its configuration file in default locations (inc.
self.warnings.warn(
Unlinking stale socket /var/run/supervisor.sock
@prgeor is there fd leak here or it is just too many fd open that is causing the issue ? |
@anamehra do we still have this issue ? |
Why I did it
How I did it
How to verify it
Ran multiple times
1 show interface counters
2. portstat -a
Ensure the redis client socket connection gets closed after the get_all_namespace() returns
Ensure no fd leak
Ensure no race-condition as observed in Cisco chassis when multiple Xcvrd threads calls get_all_namespace()
Which release branch to backport (provide reason below if selected)
Description for the changelog
Link to config_db schema for YANG module changes
A picture of a cute animal (not mandatory but encouraged)