Commit 87576c0

stephenxsl authored and guohan committed
[Mellanox] Add the following driver patches and update patch/series accordingly. (#115)
patch/0048-mlxsw-core-Skip-thermal-zone-operations-initializati.patch
patch/0049-thermal-Fix-use-after-free-when-unregistering-therma.patch
patch/0050-mlxsw-core-Drop-creation-of-thermal-to-hwmon-sysfs-i.patch
patch/0051-mlxsw-core-Skip-thermal-zones-threshold-setting-duri.patch
patch/0052-platform-x86-mlx-platform-Add-more-detention-for-sys.patch
1 parent 0609d7d commit 87576c0

6 files changed, 386 additions, 0 deletions
patch/0048-mlxsw-core-Skip-thermal-zone-operations-initializati.patch (new file, 101 lines)

From 2a2cd8876916ffaf530053a5e1f05940258eab86 Mon Sep 17 00:00:00 2001
From: Vadim Pasternak <[email protected]>
Date: Tue, 5 Nov 2019 15:25:22 +0200
Subject: [PATCH v1] mlxsw: core: Skip thermal zone operations initialization

Skip thermal zones setting for modules to reduce probing time.
It is to be read anyway during thermal zone operations.
Skip thermal zone setting and reading during initialization.

Decrease i2c controller polling time from 2000usec to 200usec
for the performance improvement.

Signed-off-by: Vadim Pasternak <[email protected]>
---
drivers/i2c/busses/i2c-mlxcpld.c | 2 +-
drivers/net/ethernet/mellanox/mlxsw/core_thermal.c | 21 +++++++++++++++++++++
2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/drivers/i2c/busses/i2c-mlxcpld.c b/drivers/i2c/busses/i2c-mlxcpld.c
index 2fd717d8dd30..41b57027e348 100644
--- a/drivers/i2c/busses/i2c-mlxcpld.c
+++ b/drivers/i2c/busses/i2c-mlxcpld.c
@@ -51,7 +51,7 @@
#define MLXCPLD_I2C_MAX_ADDR_LEN 4
#define MLXCPLD_I2C_RETR_NUM 2
#define MLXCPLD_I2C_XFER_TO 500000 /* usec */
-#define MLXCPLD_I2C_POLL_TIME 2000 /* usec */
+#define MLXCPLD_I2C_POLL_TIME 200 /* usec */

/* LPC I2C registers */
#define MLXCPLD_LPCI2C_CPBLTY_REG 0x0
diff --git a/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c b/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c
index dfaad30ae960..08458e30e171 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c
@@ -116,6 +116,7 @@ struct mlxsw_thermal {
u8 tz_gearbox_num;
unsigned int tz_highest_score;
struct thermal_zone_device *tz_highest_dev;
+ bool initializing; /* Driver is in initialization stage */
};

static inline u8 mlxsw_state_to_duty(int state)
@@ -287,6 +288,12 @@ static int mlxsw_thermal_get_temp(struct thermal_zone_device *tzdev,
int temp;
int err;

+ /* Do not read temperature in initialization stage. */
+ if (thermal->initializing) {
+ *p_temp = 0;
+ return 0;
+ }
+
mlxsw_reg_mtmp_pack(mtmp_pl, 0, false, false);

err = mlxsw_reg_query(thermal->core, MLXSW_REG(mtmp), mtmp_pl);
@@ -458,6 +465,12 @@ static int mlxsw_thermal_module_temp_get(struct thermal_zone_device *tzdev,
int temp;
int err;

+ /* Do not read temperature in initialization stage. */
+ if (thermal->initializing) {
+ *p_temp = 0;
+ return 0;
+ }
+
/* Read module temperature. */
mlxsw_reg_mtmp_pack(mtmp_pl, MLXSW_REG_MTMP_MODULE_INDEX_MIN +
tz->module, false, false);
@@ -565,6 +578,12 @@ static int mlxsw_thermal_gearbox_temp_get(struct thermal_zone_device *tzdev,
int temp;
int err;

+ /* Do not read temperature in initialization stage. */
+ if (thermal->initializing) {
+ *p_temp = 0;
+ return 0;
+ }
+
index = MLXSW_REG_MTMP_GBOX_INDEX_MIN + tz->module;
mlxsw_reg_mtmp_pack(mtmp_pl, index, false, false);

@@ -919,6 +938,7 @@ int mlxsw_thermal_init(struct mlxsw_core *core,
thermal->core = core;
thermal->bus_info = bus_info;
memcpy(thermal->trips, default_thermal_trips, sizeof(thermal->trips));
+ thermal->initializing = true;

err = mlxsw_reg_query(thermal->core, MLXSW_REG(mfcr), mfcr_pl);
if (err) {
@@ -994,6 +1014,7 @@ int mlxsw_thermal_init(struct mlxsw_core *core,
goto err_unreg_modules_tzdev;

thermal->mode = THERMAL_DEVICE_DISABLED;
+ thermal->initializing = false;
*p_thermal = thermal;
return 0;

--
2.11.0

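The guard that patch 0048 adds to the three temperature getters can be modeled in user space. The sketch below is a simplified stand-in and not the driver code: `struct fake_thermal`, `fake_read_sensor()` and `fake_get_temp()` are hypothetical names chosen for illustration. It shows that while `initializing` is set, the getter reports 0 without issuing a register query, so a value read during probe is a placeholder and not a measurement.

```c
#include <stdbool.h>

/* Hypothetical user-space model of the driver state; not the real struct. */
struct fake_thermal {
	bool initializing;	/* mirrors the flag added by the patch */
	int hw_reads;		/* counts simulated register queries */
	int hw_temp_mc;		/* simulated sensor value, millidegrees C */
};

/* Stand-in for the slow register query done by the real getters. */
static int fake_read_sensor(struct fake_thermal *t)
{
	t->hw_reads++;
	return t->hw_temp_mc;
}

/* Same shape as the patched getters: skip hardware access during init. */
static int fake_get_temp(struct fake_thermal *t, int *p_temp)
{
	if (t->initializing) {
		*p_temp = 0;	/* placeholder, not a measurement */
		return 0;
	}
	*p_temp = fake_read_sensor(t);
	return 0;
}
```

In this model the caller cannot tell a placeholder 0 from a real 0 reading, which is acceptable only because the flag is cleared before initialization returns.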
patch/0049-thermal-Fix-use-after-free-when-unregistering-therma.patch (new file, 138 lines)

From 09246e73df900b96b534d6b7ccd26a681facb4d5 Mon Sep 17 00:00:00 2001
From: Vadim Pasternak <[email protected]>
Date: Tue, 26 Nov 2019 09:09:45 +0200
Subject: [PATCH backport bugfix from v5.3] thermal: Fix use-after-free when
unregistering thermal zone device

Upstream commit 1851799e1d2978f68eea5d9dff322e121dcf59c1
Author: Ido Schimmel <[email protected]>
Date: Wed Jul 10 13:14:52 2019 +0300

thermal: Fix use-after-free when unregistering thermal zone device

thermal_zone_device_unregister() cancels the delayed work that polls the
thermal zone, but it does not wait for it to finish. This is racy with
respect to the freeing of the thermal zone device, which can result in a
use-after-free [1].

Fix this by waiting for the delayed work to finish before freeing the
thermal zone device. Note that thermal_zone_device_set_polling() is
never invoked from an atomic context, so it is safe to call
cancel_delayed_work_sync() that can block.

[1]
[ +0.002221] ==================================================================
[ +0.000064] BUG: KASAN: use-after-free in __mutex_lock+0x1076/0x11c0
[ +0.000016] Read of size 8 at addr ffff8881e48e0450 by task kworker/1:0/17

[ +0.000023] CPU: 1 PID: 17 Comm: kworker/1:0 Not tainted 5.2.0-rc6-custom-02495-g8e73ca3be4af #1701
[ +0.000010] Hardware name: Mellanox Technologies Ltd. MSN2100-CB2FO/SA001017, BIOS 5.6.5 06/07/2016
[ +0.000016] Workqueue: events_freezable_power_ thermal_zone_device_check
[ +0.000012] Call Trace:
[ +0.000021] dump_stack+0xa9/0x10e
[ +0.000020] print_address_description.cold.2+0x9/0x25e
[ +0.000018] __kasan_report.cold.3+0x78/0x9d
[ +0.000016] kasan_report+0xe/0x20
[ +0.000016] __mutex_lock+0x1076/0x11c0
[ +0.000014] step_wise_throttle+0x72/0x150
[ +0.000018] handle_thermal_trip+0x167/0x760
[ +0.000019] thermal_zone_device_update+0x19e/0x5f0
[ +0.000019] process_one_work+0x969/0x16f0
[ +0.000017] worker_thread+0x91/0xc40
[ +0.000014] kthread+0x33d/0x400
[ +0.000015] ret_from_fork+0x3a/0x50

[ +0.000020] Allocated by task 1:
[ +0.000015] save_stack+0x19/0x80
[ +0.000015] __kasan_kmalloc.constprop.4+0xc1/0xd0
[ +0.000014] kmem_cache_alloc_trace+0x152/0x320
[ +0.000015] thermal_zone_device_register+0x1b4/0x13a0
[ +0.000015] mlxsw_thermal_init+0xc92/0x23d0
[ +0.000014] __mlxsw_core_bus_device_register+0x659/0x11b0
[ +0.000013] mlxsw_core_bus_device_register+0x3d/0x90
[ +0.000013] mlxsw_pci_probe+0x355/0x4b0
[ +0.000014] local_pci_probe+0xc3/0x150
[ +0.000013] pci_device_probe+0x280/0x410
[ +0.000013] really_probe+0x26a/0xbb0
[ +0.000013] driver_probe_device+0x208/0x2e0
[ +0.000013] device_driver_attach+0xfe/0x140
[ +0.000013] __driver_attach+0x110/0x310
[ +0.000013] bus_for_each_dev+0x14b/0x1d0
[ +0.000013] driver_register+0x1c0/0x400
[ +0.000015] mlxsw_sp_module_init+0x5d/0xd3
[ +0.000014] do_one_initcall+0x239/0x4dd
[ +0.000013] kernel_init_freeable+0x42b/0x4e8
[ +0.000012] kernel_init+0x11/0x18b
[ +0.000013] ret_from_fork+0x3a/0x50

[ +0.000015] Freed by task 581:
[ +0.000013] save_stack+0x19/0x80
[ +0.000014] __kasan_slab_free+0x125/0x170
[ +0.000013] kfree+0xf3/0x310
[ +0.000013] thermal_release+0xc7/0xf0
[ +0.000014] device_release+0x77/0x200
[ +0.000014] kobject_put+0x1a8/0x4c0
[ +0.000014] device_unregister+0x38/0xc0
[ +0.000014] thermal_zone_device_unregister+0x54e/0x6a0
[ +0.000014] mlxsw_thermal_fini+0x184/0x35a
[ +0.000014] mlxsw_core_bus_device_unregister+0x10a/0x640
[ +0.000013] mlxsw_devlink_core_bus_device_reload+0x92/0x210
[ +0.000015] devlink_nl_cmd_reload+0x113/0x1f0
[ +0.000014] genl_family_rcv_msg+0x700/0xee0
[ +0.000013] genl_rcv_msg+0xca/0x170
[ +0.000013] netlink_rcv_skb+0x137/0x3a0
[ +0.000012] genl_rcv+0x29/0x40
[ +0.000013] netlink_unicast+0x49b/0x660
[ +0.000013] netlink_sendmsg+0x755/0xc90
[ +0.000013] __sys_sendto+0x3de/0x430
[ +0.000013] __x64_sys_sendto+0xe2/0x1b0
[ +0.000013] do_syscall_64+0xa4/0x4d0
[ +0.000013] entry_SYSCALL_64_after_hwframe+0x49/0xbe

[ +0.000017] The buggy address belongs to the object at ffff8881e48e0008
which belongs to the cache kmalloc-2k of size 2048
[ +0.000012] The buggy address is located 1096 bytes inside of
2048-byte region [ffff8881e48e0008, ffff8881e48e0808)
[ +0.000007] The buggy address belongs to the page:
[ +0.000012] page:ffffea0007923800 refcount:1 mapcount:0 mapping:ffff88823680d0c0 index:0x0 compound_mapcount: 0
[ +0.000020] flags: 0x200000000010200(slab|head)
[ +0.000019] raw: 0200000000010200 ffffea0007682008 ffffea00076ab808 ffff88823680d0c0
[ +0.000016] raw: 0000000000000000 00000000000d000d 00000001ffffffff 0000000000000000
[ +0.000007] page dumped because: kasan: bad access detected

[ +0.000012] Memory state around the buggy address:
[ +0.000012] ffff8881e48e0300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ +0.000012] ffff8881e48e0380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ +0.000012] >ffff8881e48e0400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ +0.000008] ^
[ +0.000012] ffff8881e48e0480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ +0.000012] ffff8881e48e0500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ +0.000007] ==================================================================

Fixes: b1569e99c795 ("ACPI: move thermal trip handling to generic thermal layer")
Reported-by: Jiri Pirko <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Acked-by: Jiri Pirko <[email protected]>
Signed-off-by: Zhang Rui <[email protected]>

Signed-off-by: Vadim Pasternak <[email protected]>
---
drivers/thermal/thermal_core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 226b0b4aced6..7be7017f0d9e 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -402,7 +402,7 @@ static void thermal_zone_device_set_polling(struct thermal_zone_device *tz,
mod_delayed_work(system_freezable_wq, &tz->poll_queue,
msecs_to_jiffies(delay));
else
- cancel_delayed_work(&tz->poll_queue);
+ cancel_delayed_work_sync(&tz->poll_queue);
}

static void monitor_thermal_zone(struct thermal_zone_device *tz)
--
2.11.0

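The race fixed by patch 0049 comes down to whether the canceller waits for a handler that is already running. The sketch below is a toy, single-threaded model with hypothetical names (`toy_zone`, `toy_cancel()`, `toy_cancel_sync()`); it does not use the kernel workqueue API and only imitates the ordering described in the commit message. A `freed` flag stands in for the real `kfree()`, so the "use after free" is recorded as a boolean and nothing is actually released.

```c
#include <stdbool.h>

/* Toy model of a delayed work item; all names here are hypothetical. */
enum work_state { WORK_IDLE, WORK_QUEUED, WORK_RUNNING };

struct toy_zone {
	enum work_state poll_state;
	bool freed;			/* set when the owner releases the zone */
	bool touched_after_free;	/* models the KASAN report */
};

/* Models cancel_delayed_work(): dequeues pending work, never waits. */
static bool toy_cancel(struct toy_zone *z)
{
	if (z->poll_state == WORK_QUEUED) {
		z->poll_state = WORK_IDLE;
		return true;
	}
	return false;	/* a handler that is already running keeps running */
}

/* Lets a running handler complete; notes if the zone was already freed. */
static void toy_handler_finish(struct toy_zone *z)
{
	if (z->poll_state != WORK_RUNNING)
		return;
	if (z->freed)
		z->touched_after_free = true;
	z->poll_state = WORK_IDLE;
}

/* Models cancel_delayed_work_sync(): also waits for a running handler. */
static bool toy_cancel_sync(struct toy_zone *z)
{
	bool was_pending = toy_cancel(z);

	toy_handler_finish(z);	/* "waiting" = handler completes before return */
	return was_pending;
}

/* Stand-in for freeing the zone after unregistering it. */
static void toy_free(struct toy_zone *z)
{
	z->freed = true;
}
```

Calling `toy_cancel()`, then `toy_free()`, then `toy_handler_finish()` on a zone whose work is running sets `touched_after_free`; replacing the first call with `toy_cancel_sync()` does not, because the handler has finished before the free.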
patch/0050-mlxsw-core-Drop-creation-of-thermal-to-hwmon-sysfs-i.patch (new file, 73 lines)

From 539b36ac336510610b22b134284af5ecb2dc5009 Mon Sep 17 00:00:00 2001
From: Vadim Pasternak <[email protected]>
Date: Thu, 28 Nov 2019 08:34:22 +0200
Subject: [PATCH mlxsw: thermal] mlxsw: core: Drop creation of thermal to hwmon
sysfs interface

Drop creation of "hwmon" interfaces from "thermal". These interfaces
are redundant, since they are created by "core_hwmon" component.
Creation of those interface from "thermal" just causes each temperature
input entry to by created twice in "hwmon"
Add thermal zone platform parameters definition with the field
"no_hwmon" set to true. Use it in thermal_zone_device_register().
It will indicate that the "thermal" to "hwmon" sysfs interface is not
required.

Signed-off-by: Vadim Pasternak <[email protected]>
---
drivers/net/ethernet/mellanox/mlxsw/core_thermal.c | 18 +++++++++++-------
1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c b/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c
index c4a426d01c5e..f234416305fd 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c
@@ -411,6 +411,10 @@ static int mlxsw_thermal_trend_get(struct thermal_zone_device *tzdev,
return 0;
}

+struct thermal_zone_params mlxsw_thermal_params = {
+ .no_hwmon = true,
+};
+
static struct thermal_zone_device_ops mlxsw_thermal_ops = {
.bind = mlxsw_thermal_bind,
.unbind = mlxsw_thermal_unbind,
@@ -774,11 +778,11 @@ mlxsw_thermal_module_tz_init(struct mlxsw_thermal_module *module_tz)
snprintf(tz_name, sizeof(tz_name), "mlxsw-module%d",
module_tz->module + 1);
module_tz->tzdev = thermal_zone_device_register(tz_name,
- MLXSW_THERMAL_NUM_TRIPS,
- MLXSW_THERMAL_TRIP_MASK,
- module_tz,
- &mlxsw_thermal_module_ops,
- NULL, 0, 0);
+ MLXSW_THERMAL_NUM_TRIPS,
+ MLXSW_THERMAL_TRIP_MASK,
+ module_tz,
+ &mlxsw_thermal_module_ops,
+ &mlxsw_thermal_params, 0, 0);
if (IS_ERR(module_tz->tzdev)) {
err = PTR_ERR(module_tz->tzdev);
return err;
@@ -898,7 +902,7 @@ mlxsw_thermal_gearbox_tz_init(struct mlxsw_thermal_module *gearbox_tz)
MLXSW_THERMAL_TRIP_MASK,
gearbox_tz,
&mlxsw_thermal_gearbox_ops,
- NULL, 0, 0);
+ &mlxsw_thermal_params, 0, 0);
if (IS_ERR(gearbox_tz->tzdev))
return PTR_ERR(gearbox_tz->tzdev);

@@ -1052,7 +1056,7 @@ int mlxsw_thermal_init(struct mlxsw_core *core,
MLXSW_THERMAL_TRIP_MASK,
thermal,
&mlxsw_thermal_ops,
- NULL, 0,
+ &mlxsw_thermal_params, 0,
thermal->polling_delay);
if (IS_ERR(thermal->tzdev)) {
err = PTR_ERR(thermal->tzdev);
--
2.11.0

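Patch 0050 swaps the `NULL` params argument for a structure with `no_hwmon = true`. The sketch below models the effect in user space, assuming (as the commit message describes) that the thermal core creates the thermal-to-hwmon entry unless the zone's params ask it not to. `toy_zone_params`, `toy_zone_register()` and `toy_hwmon_entries` are hypothetical names and not kernel symbols.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for struct thermal_zone_params. */
struct toy_zone_params {
	bool no_hwmon;
};

/* Counts how many hwmon entries the toy "core" would create. */
static int toy_hwmon_entries;

/*
 * Models the registration decision: a zone gets a thermal-to-hwmon entry
 * unless its params ask to skip it. Passing NULL keeps the entry, which
 * corresponds to the behavior before the patch.
 */
static void toy_zone_register(const struct toy_zone_params *tzp)
{
	if (!tzp || !tzp->no_hwmon)
		toy_hwmon_entries++;
}
```

Registering with `NULL` bumps the counter, while registering with a params structure whose `no_hwmon` is true leaves it unchanged, so the temperature inputs exposed by "core_hwmon" are no longer duplicated.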
patch/0051-mlxsw-core-Skip-thermal-zones-threshold-setting-duri.patch (new file, 34 lines)

From 031b57148672ff229d42fe4401e552c05e14f87a Mon Sep 17 00:00:00 2001
From: Vadim Pasternak <[email protected]>
Date: Sun, 1 Dec 2019 17:29:02 +0200
Subject: [PATCH mlxsw: thermal] mlxsw: core: Skip thermal zones threshold
setting during initialization

Skip modules' thermal zones threshold setting during initialization in
order to reduce driver's probing time.
This setting will be performed at the first operation with the thermal
zones.

Signed-off-by: Vadim Pasternak <[email protected]>
---
drivers/net/ethernet/mellanox/mlxsw/core_thermal.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c b/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c
index 7fbb7a24eb63..f234416305fd 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c
@@ -813,8 +813,8 @@ mlxsw_thermal_module_init(struct device *dev, struct mlxsw_core *core,
sizeof(thermal->trips));
/* Initialize all trip point. */
mlxsw_thermal_module_trips_reset(module_tz);
- /* Update trip point according to the module data. */
- return mlxsw_thermal_module_trips_update(dev, core, module_tz);
+
+ return 0;
}

static void mlxsw_thermal_module_fini(struct mlxsw_thermal_module *module_tz)
--
2.20.1

patch/0052-platform-x86-mlx-platform-Add-more-detention-for-sys.patch (new file, 35 lines)

From 5df12806208285231c63a8757208648483aaf395 Mon Sep 17 00:00:00 2001
From: Vadim Pasternak <[email protected]>
Date: Tue, 3 Dec 2019 16:02:08 +0200
Subject: [PATCH platform] platform/x86: mlx-platform: Add more detention for
system attributes

Add new attribute for "next-generation" type systems:
"reset_sw_pwr_off" for indication of reset caused by
software power off command.

Signed-off-by: Vadim Pasternak <[email protected]>
---
drivers/platform/x86/mlx-platform.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/drivers/platform/x86/mlx-platform.c b/drivers/platform/x86/mlx-platform.c
index c3e75b26fe0b..765baf99de60 100644
--- a/drivers/platform/x86/mlx-platform.c
+++ b/drivers/platform/x86/mlx-platform.c
@@ -1335,6 +1335,12 @@ static struct mlxreg_core_data mlxplat_mlxcpld_default_ng_regs_io_data[] = {
.mask = GENMASK(7, 0) & ~BIT(1),
.mode = 0444,
},
+ {
+ .label = "reset_sw_pwr_off",
+ .reg = MLXPLAT_CPLD_LPC_REG_RST_CAUSE2_OFFSET,
+ .mask = GENMASK(7, 0) & ~BIT(2),
+ .mode = 0444,
+ },
{
.label = "reset_comex_thermal",
.reg = MLXPLAT_CPLD_LPC_REG_RST_CAUSE2_OFFSET,
--
2.20.1

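The mask expression in the new `reset_sw_pwr_off` entry is easy to misread, so here is a small user-space check of what it evaluates to. `BIT` and `GENMASK` are redefined locally for a 32-bit `unsigned int` and are not the kernel headers; `toy_attr_value()` is a hypothetical helper that encodes an assumption not stated in the patch, namely that the attribute reports the single register bit the mask leaves clear.

```c
/* User-space copies of the kernel helpers, enough for an 8-bit register. */
#define BIT(n)		(1u << (n))
#define GENMASK(h, l)	(((~0u) << (l)) & (~0u >> (31 - (h))))

/* Mask from the new "reset_sw_pwr_off" entry: every low bit except bit 2. */
#define RESET_SW_PWR_OFF_MASK	(GENMASK(7, 0) & ~BIT(2))

/*
 * Assumed convention (not stated in the patch): the attribute value is
 * the one register bit that the mask leaves clear, reported as 0 or 1.
 */
static unsigned int toy_attr_value(unsigned int regval, unsigned int mask)
{
	return !!(regval & ~mask & 0xffu);
}
```

Under that assumption the mask is 0xfb, so a register value of 0x04 reads back as 1 and any value with bit 2 clear reads back as 0. The neighboring entry with `~BIT(1)` follows the same pattern for bit 1.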

patch/series (+5 lines)

@@ -83,6 +83,11 @@ linux-4.13-thermal-intel_pch_thermal-Fix-enable-check-on.patch
 0045-mlxsw-minimal-Add-validation-for-FW-version.patch
 0046-mlxsw-core-Extend-QSFP-EEPROM-supported-size-for-eth.patch
 0047-mfd-lpc-ich-extend-with-additional-chipsets-support.patch
+0048-mlxsw-core-Skip-thermal-zone-operations-initializati.patch
+0049-thermal-Fix-use-after-free-when-unregistering-therma.patch
+0050-mlxsw-core-Drop-creation-of-thermal-to-hwmon-sysfs-i.patch
+0051-mlxsw-core-Skip-thermal-zones-threshold-setting-duri.patch
+0052-platform-x86-mlx-platform-Add-more-detention-for-sys.patch
 linux-4.16-firmware-dmi-handle-missing-DMI-data-gracefully.patch
 mellanox-backport-introduce-psample-a-new-genetlink-channel.patch
 mellanox-backport-introduce-tc-sample-action.patch
