Discussion:
[PATCH] b43legacy: Fix a sleep-in-atomic bug in b43legacy_op_bss_info_changed
Jia-Ju Bai
2017-05-31 10:09:13 UTC
The driver may sleep under a spin lock, and the function call path is:
b43legacy_op_bss_info_changed (acquire the lock by spin_lock_irqsave)
  b43legacy_synchronize_irq
    synchronize_irq --> may sleep

To fix it, the lock is released before b43legacy_synchronize_irq, and the
lock is acquired again after this function.

Signed-off-by: Jia-Ju Bai <***@163.com>
---
drivers/net/wireless/broadcom/b43legacy/main.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/net/wireless/broadcom/b43legacy/main.c b/drivers/net/wireless/broadcom/b43legacy/main.c
index f1e3dad..31ead21 100644
--- a/drivers/net/wireless/broadcom/b43legacy/main.c
+++ b/drivers/net/wireless/broadcom/b43legacy/main.c
@@ -2859,7 +2859,9 @@ static void b43legacy_op_bss_info_changed(struct ieee80211_hw *hw,
 	b43legacy_write32(dev, B43legacy_MMIO_GEN_IRQ_MASK, 0);

 	if (changed & BSS_CHANGED_BSSID) {
+		spin_unlock_irqrestore(&wl->irq_lock, flags);
 		b43legacy_synchronize_irq(dev);
+		spin_lock_irqsave(&wl->irq_lock, flags);

 		if (conf->bssid)
 			memcpy(wl->bssid, conf->bssid, ETH_ALEN);
--
1.7.9.5
Kalle Valo
2017-05-31 10:26:43 UTC
Post by Jia-Ju Bai
b43legacy_op_bss_info_changed (acquire the lock by spin_lock_irqsave)
b43legacy_synchronize_irq
synchronize_irq --> may sleep
To fix it, the lock is released before b43legacy_synchronize_irq, and the
lock is acquired again after this function.
---
drivers/net/wireless/broadcom/b43legacy/main.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/wireless/broadcom/b43legacy/main.c b/drivers/net/wireless/broadcom/b43legacy/main.c
index f1e3dad..31ead21 100644
--- a/drivers/net/wireless/broadcom/b43legacy/main.c
+++ b/drivers/net/wireless/broadcom/b43legacy/main.c
@@ -2859,7 +2859,9 @@ static void b43legacy_op_bss_info_changed(struct ieee80211_hw *hw,
b43legacy_write32(dev, B43legacy_MMIO_GEN_IRQ_MASK, 0);
if (changed & BSS_CHANGED_BSSID) {
+ spin_unlock_irqrestore(&wl->irq_lock, flags);
b43legacy_synchronize_irq(dev);
+ spin_lock_irqsave(&wl->irq_lock, flags);
To me this looks like a fragile workaround and not a real fix. You can
easily add new race conditions with releasing the lock like this.
--
Kalle Valo
Arend van Spriel
2017-05-31 12:15:47 UTC
Post by Kalle Valo
Post by Jia-Ju Bai
b43legacy_op_bss_info_changed (acquire the lock by spin_lock_irqsave)
b43legacy_synchronize_irq
synchronize_irq --> may sleep
To fix it, the lock is released before b43legacy_synchronize_irq, and the
lock is acquired again after this function.
---
drivers/net/wireless/broadcom/b43legacy/main.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/wireless/broadcom/b43legacy/main.c b/drivers/net/wireless/broadcom/b43legacy/main.c
index f1e3dad..31ead21 100644
--- a/drivers/net/wireless/broadcom/b43legacy/main.c
+++ b/drivers/net/wireless/broadcom/b43legacy/main.c
@@ -2859,7 +2859,9 @@ static void b43legacy_op_bss_info_changed(struct ieee80211_hw *hw,
b43legacy_write32(dev, B43legacy_MMIO_GEN_IRQ_MASK, 0);
if (changed & BSS_CHANGED_BSSID) {
+ spin_unlock_irqrestore(&wl->irq_lock, flags);
b43legacy_synchronize_irq(dev);
+ spin_lock_irqsave(&wl->irq_lock, flags);
To me this looks like a fragile workaround and not a real fix. You can
easily add new race conditions with releasing the lock like this.
Hi Jia-Ju,

Agree with Kalle as I was about to say the same thing. You really need
to determine what is protected by the irq_lock. Here you are using the
lock because you are about to change wl->bssid a bit further down. I did
not check the entire function, but it seems the lock perimeter is too wide.

Regards,
Arend
Michael Büsch
2017-05-31 15:32:15 UTC
On Wed, 31 May 2017 13:26:43 +0300
Post by Kalle Valo
Post by Jia-Ju Bai
b43legacy_op_bss_info_changed (acquire the lock by spin_lock_irqsave)
b43legacy_synchronize_irq
synchronize_irq --> may sleep
To fix it, the lock is released before b43legacy_synchronize_irq, and the
lock is acquired again after this function.
---
drivers/net/wireless/broadcom/b43legacy/main.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/wireless/broadcom/b43legacy/main.c b/drivers/net/wireless/broadcom/b43legacy/main.c
index f1e3dad..31ead21 100644
--- a/drivers/net/wireless/broadcom/b43legacy/main.c
+++ b/drivers/net/wireless/broadcom/b43legacy/main.c
@@ -2859,7 +2859,9 @@ static void b43legacy_op_bss_info_changed(struct ieee80211_hw *hw,
b43legacy_write32(dev, B43legacy_MMIO_GEN_IRQ_MASK, 0);
if (changed & BSS_CHANGED_BSSID) {
+ spin_unlock_irqrestore(&wl->irq_lock, flags);
b43legacy_synchronize_irq(dev);
+ spin_lock_irqsave(&wl->irq_lock, flags);
To me this looks like a fragile workaround and not a real fix. You can
easily add new race conditions with releasing the lock like this.
I think releasing the lock possibly is fine. It certainly is better than
sleeping with a lock held.
We disabled the device interrupts just before this line.

However I think the synchronize_irq should be outside of the
conditional right after the write to B43legacy_MMIO_GEN_IRQ_MASK. (So
two lines above)
I don't think it makes sense to only synchronize if BSS_CHANGED_BSSID
is set.


On the other hand b43 does not have this irq-disabling foobar anymore.
So somebody must have removed it. Maybe you can find the commit that
removed this stuff from b43 and port it to b43legacy?


So I would vote for moving the synchronize_irq up outside of the
conditional and put the unlock/lock sequence around it.
And as a second patch on top of that try to remove this stuff
altogether like b43 did.
--
Michael
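The ordering suggested above can be sketched as a small userspace model. This is only an illustration, not the driver code: the `model_*` names, the `struct model_wl` fields and the value of `BSS_CHANGED_BSSID` are assumptions made up for the sketch, and the real spinlock, MMIO write and `synchronize_irq()` are replaced by stubs that only record state. The stub for the synchronize step asserts the two conditions discussed in the thread: the lock is not held, and the device interrupts are already masked.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define ETH_ALEN 6
#define BSS_CHANGED_BSSID 0x1u	/* hypothetical value, for the model only */

/* Hypothetical stand-in for the driver state touched in this path. */
struct model_wl {
	bool irq_lock_held;	/* models wl->irq_lock */
	bool irqs_masked;	/* models the write of 0 to the IRQ mask */
	int sync_calls;		/* how often the synchronize step ran */
	unsigned char bssid[ETH_ALEN];
};

static void model_lock(struct model_wl *wl)
{
	assert(!wl->irq_lock_held);
	wl->irq_lock_held = true;
}

static void model_unlock(struct model_wl *wl)
{
	assert(wl->irq_lock_held);
	wl->irq_lock_held = false;
}

/* synchronize_irq() may sleep, so it must not run under a spinlock. */
static void model_synchronize_irq(struct model_wl *wl)
{
	assert(!wl->irq_lock_held);
	assert(wl->irqs_masked);
	wl->sync_calls++;
}

static void model_bss_info_changed(struct model_wl *wl, unsigned int changed,
				   const unsigned char *bssid)
{
	model_lock(wl);
	wl->irqs_masked = true;		/* models the GEN_IRQ_MASK write */

	/* Unconditional, directly after masking, with the lock dropped. */
	model_unlock(wl);
	model_synchronize_irq(wl);
	model_lock(wl);

	if (changed & BSS_CHANGED_BSSID) {
		if (bssid)
			memcpy(wl->bssid, bssid, ETH_ALEN);
	}

	model_unlock(wl);
}
```

Calling `model_bss_info_changed()` with `changed == 0` still runs the synchronize step once, which is the behavioural difference from the posted patch. The model leaves out everything else the real function does after this point.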
Larry Finger
2017-06-01 00:07:15 UTC
Post by Michael Büsch
On Wed, 31 May 2017 13:26:43 +0300
Post by Kalle Valo
Post by Jia-Ju Bai
b43legacy_op_bss_info_changed (acquire the lock by spin_lock_irqsave)
b43legacy_synchronize_irq
synchronize_irq --> may sleep
To fix it, the lock is released before b43legacy_synchronize_irq, and the
lock is acquired again after this function.
---
drivers/net/wireless/broadcom/b43legacy/main.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/wireless/broadcom/b43legacy/main.c b/drivers/net/wireless/broadcom/b43legacy/main.c
index f1e3dad..31ead21 100644
--- a/drivers/net/wireless/broadcom/b43legacy/main.c
+++ b/drivers/net/wireless/broadcom/b43legacy/main.c
@@ -2859,7 +2859,9 @@ static void b43legacy_op_bss_info_changed(struct ieee80211_hw *hw,
b43legacy_write32(dev, B43legacy_MMIO_GEN_IRQ_MASK, 0);
if (changed & BSS_CHANGED_BSSID) {
+ spin_unlock_irqrestore(&wl->irq_lock, flags);
b43legacy_synchronize_irq(dev);
+ spin_lock_irqsave(&wl->irq_lock, flags);
To me this looks like a fragile workaround and not a real fix. You can
easily add new race conditions with releasing the lock like this.
I think releasing the lock possibly is fine. It certainly is better than
sleeping with a lock held.
We disabled the device interrupts just before this line.
However I think the synchronize_irq should be outside of the
conditional right after the write to B43legacy_MMIO_GEN_IRQ_MASK. (So
two lines above)
I don't think it makes sense to only synchronize if BSS_CHANGED_BSSID
is set.
On the other hand b43 does not have this irq-disabling foobar anymore.
So somebody must have removed it. Maybe you can find the commit that
removed this stuff from b43 and port it to b43legacy?
So I would vote for moving the synchronize_irq up outside of the
conditional and put the unlock/lock sequence around it.
And as a second patch on top of that try to remove this stuff
altogether like b43 did.
The patch that removed it in b43 is

commit 36dbd9548e92268127b0c31b0e121e63e9207108
Author: Michael Buesch <***@bu3sch.de>
Date: Fri Sep 4 22:51:29 2009 +0200

b43: Use a threaded IRQ handler

Use a threaded IRQ handler to allow locking the mutex and
sleeping while executing an interrupt.
This removes usage of the irq_lock spinlock, but introduces
a new hardirq_lock, which is _only_ used for the PCI/SSB lowlevel
hard-irq handler. Sleeping busses (SDIO) will use mutex instead.

Signed-off-by: Michael Buesch <***@bu3sch.de>
Tested-by: Larry Finger <***@lwfinger.net>
Signed-off-by: John W. Linville <***@tuxdriver.com>

I vaguely remember this patch. Although it is roughly a 1000-line fix, I will
try to port it to b43legacy. I still have an old BCM4306 PCMCIA card that I can
test in a PowerBook G4.

I agree with Michael that this is the way to go. Both of Jia-Ju's patches should
be rejected.

Larry
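For readers unfamiliar with the split that the commit message describes, here is a rough userspace model of a hard-IRQ half plus a threaded half. In the kernel the two halves are registered together with `request_threaded_irq()`, and the hard handler returns `IRQ_WAKE_THREAD` to hand work to the thread. Everything below is a hypothetical sketch: the `model_*` names, the struct fields and the exact division of work are assumptions for illustration and are not taken from the b43 commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of the kernel's irqreturn_t values; names are hypothetical. */
enum model_irqreturn {
	MODEL_IRQ_NONE,
	MODEL_IRQ_HANDLED,
	MODEL_IRQ_WAKE_THREAD
};

struct model_dev {
	uint32_t hw_irq_reason;	/* pending causes in the "hardware" */
	uint32_t hw_irq_mask;	/* causes allowed to raise an interrupt */
	uint32_t saved_reason;	/* handed from the hard half to the thread */
	bool hardirq_lock_held;	/* models the small hard-IRQ-only spinlock */
	bool mutex_held;	/* models the sleeping lock used by the thread */
	int handled;
};

/* Hard-IRQ half: atomic context, short, never sleeps. */
static enum model_irqreturn model_hard_handler(struct model_dev *d)
{
	enum model_irqreturn ret = MODEL_IRQ_NONE;
	uint32_t reason;

	d->hardirq_lock_held = true;
	reason = d->hw_irq_reason & d->hw_irq_mask;
	if (reason) {
		d->saved_reason = reason;
		d->hw_irq_reason &= ~reason;	/* acknowledge the causes */
		d->hw_irq_mask = 0;		/* quiet until the thread ran */
		ret = MODEL_IRQ_WAKE_THREAD;
	}
	d->hardirq_lock_held = false;
	return ret;
}

/* Threaded half: process context, so taking a mutex and sleeping is fine. */
static enum model_irqreturn model_thread_handler(struct model_dev *d,
						 uint32_t restore_mask)
{
	assert(!d->hardirq_lock_held);
	d->mutex_held = true;
	if (d->saved_reason) {
		d->handled++;
		d->saved_reason = 0;
	}
	d->hw_irq_mask = restore_mask;	/* re-enable device interrupts */
	d->mutex_held = false;
	return MODEL_IRQ_HANDLED;
}
```

The point of the split is that only the first function runs in atomic context, so the bulk of the driver no longer needs a spinlock that forbids sleeping.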
Jia-Ju Bai
2017-06-01 01:07:32 UTC
Post by Larry Finger
Post by Michael Büsch
On Wed, 31 May 2017 13:26:43 +0300
Post by Kalle Valo
Post by Jia-Ju Bai
b43legacy_op_bss_info_changed (acquire the lock by spin_lock_irqsave)
b43legacy_synchronize_irq
synchronize_irq --> may sleep
To fix it, the lock is released before b43legacy_synchronize_irq, and the
lock is acquired again after this function.
---
drivers/net/wireless/broadcom/b43legacy/main.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/wireless/broadcom/b43legacy/main.c
b/drivers/net/wireless/broadcom/b43legacy/main.c
index f1e3dad..31ead21 100644
--- a/drivers/net/wireless/broadcom/b43legacy/main.c
+++ b/drivers/net/wireless/broadcom/b43legacy/main.c
@@ -2859,7 +2859,9 @@ static void
b43legacy_op_bss_info_changed(struct ieee80211_hw *hw,
b43legacy_write32(dev, B43legacy_MMIO_GEN_IRQ_MASK, 0);
if (changed & BSS_CHANGED_BSSID) {
+ spin_unlock_irqrestore(&wl->irq_lock, flags);
b43legacy_synchronize_irq(dev);
+ spin_lock_irqsave(&wl->irq_lock, flags);
To me this looks like a fragile workaround and not a real fix. You can
easily add new race conditions with releasing the lock like this.
I think releasing the lock possibly is fine. It certainly is better than
sleeping with a lock held.
We disabled the device interrupts just before this line.
However I think the synchronize_irq should be outside of the
conditional right after the write to B43legacy_MMIO_GEN_IRQ_MASK. (So
two lines above)
I don't think it makes sense to only synchronize if BSS_CHANGED_BSSID
is set.
On the other hand b43 does not have this irq-disabling foobar anymore.
So somebody must have removed it. Maybe you can find the commit that
removed this stuff from b43 and port it to b43legacy?
So I would vote for moving the synchronize_irq up outside of the
conditional and put the unlock/lock sequence around it.
And as a second patch on top of that try to remove this stuff
altogether like b43 did.
The patch that removed it in b43 is
commit 36dbd9548e92268127b0c31b0e121e63e9207108
Date: Fri Sep 4 22:51:29 2009 +0200
b43: Use a threaded IRQ handler
Use a threaded IRQ handler to allow locking the mutex and
sleeping while executing an interrupt.
This removes usage of the irq_lock spinlock, but introduces
a new hardirq_lock, which is _only_ used for the PCI/SSB lowlevel
hard-irq handler. Sleeping busses (SDIO) will use mutex instead.
I vaguely remember this patch. Although it is roughly a 1000-line fix,
I will try to port it to b43legacy. I still have an old BCM4306 PCMCIA
card that I can test in a PowerBook G4.
I agree with Michael that this is the way to go. Both of Jia-Ju's
patches should be rejected.
Larry
It is fine with me to fix the bug by porting this earlier patch.

Thanks,
Jia-Ju Bai
Michael Büsch
2017-06-01 05:31:36 UTC
On Wed, 31 May 2017 19:07:15 -0500
Post by Larry Finger
Post by Michael Büsch
On Wed, 31 May 2017 13:26:43 +0300
Post by Kalle Valo
Post by Jia-Ju Bai
b43legacy_op_bss_info_changed (acquire the lock by spin_lock_irqsave)
b43legacy_synchronize_irq
synchronize_irq --> may sleep
To fix it, the lock is released before b43legacy_synchronize_irq, and the
lock is acquired again after this function.
---
drivers/net/wireless/broadcom/b43legacy/main.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/wireless/broadcom/b43legacy/main.c b/drivers/net/wireless/broadcom/b43legacy/main.c
index f1e3dad..31ead21 100644
--- a/drivers/net/wireless/broadcom/b43legacy/main.c
+++ b/drivers/net/wireless/broadcom/b43legacy/main.c
@@ -2859,7 +2859,9 @@ static void b43legacy_op_bss_info_changed(struct ieee80211_hw *hw,
b43legacy_write32(dev, B43legacy_MMIO_GEN_IRQ_MASK, 0);
if (changed & BSS_CHANGED_BSSID) {
+ spin_unlock_irqrestore(&wl->irq_lock, flags);
b43legacy_synchronize_irq(dev);
+ spin_lock_irqsave(&wl->irq_lock, flags);
To me this looks like a fragile workaround and not a real fix. You can
easily add new race conditions with releasing the lock like this.
I think releasing the lock possibly is fine. It certainly is better than
sleeping with a lock held.
We disabled the device interrupts just before this line.
However I think the synchronize_irq should be outside of the
conditional right after the write to B43legacy_MMIO_GEN_IRQ_MASK. (So
two lines above)
I don't think it makes sense to only synchronize if BSS_CHANGED_BSSID
is set.
On the other hand b43 does not have this irq-disabling foobar anymore.
So somebody must have removed it. Maybe you can find the commit that
removed this stuff from b43 and port it to b43legacy?
So I would vote for moving the synchronize_irq up outside of the
conditional and put the unlock/lock sequence around it.
And as a second patch on top of that try to remove this stuff
altogether like b43 did.
The patch that removed it in b43 is
commit 36dbd9548e92268127b0c31b0e121e63e9207108
Date: Fri Sep 4 22:51:29 2009 +0200
Damn it :D
Post by Larry Finger
b43: Use a threaded IRQ handler
Use a threaded IRQ handler to allow locking the mutex and
sleeping while executing an interrupt.
This removes usage of the irq_lock spinlock, but introduces
a new hardirq_lock, which is _only_ used for the PCI/SSB lowlevel
hard-irq handler. Sleeping busses (SDIO) will use mutex instead.
I vaguely remember this patch. Although it is roughly a 1000-line fix, I will
try to port it to b43legacy. I still have an old BCM4306 PCMCIA card that I can
test in a PowerBook G4.
I agree with Michael that this is the way to go. Both of Jia-Ju's patches should
be rejected.
I'm not sure if it's worth it. There is a risk that this would
introduce new bugs.
But sure, please feel free to try it. This way we can find out how big
this change becomes.
--
Michael
Kalle Valo
2017-06-01 04:27:20 UTC
Post by Michael Büsch
Post by Kalle Valo
Post by Jia-Ju Bai
--- a/drivers/net/wireless/broadcom/b43legacy/main.c
+++ b/drivers/net/wireless/broadcom/b43legacy/main.c
@@ -2859,7 +2859,9 @@ static void b43legacy_op_bss_info_changed(struct ieee80211_hw *hw,
b43legacy_write32(dev, B43legacy_MMIO_GEN_IRQ_MASK, 0);
if (changed & BSS_CHANGED_BSSID) {
+ spin_unlock_irqrestore(&wl->irq_lock, flags);
b43legacy_synchronize_irq(dev);
+ spin_lock_irqsave(&wl->irq_lock, flags);
To me this looks like a fragile workaround and not a real fix. You can
easily add new race conditions with releasing the lock like this.
I think releasing the lock possibly is fine. It certainly is better than
sleeping with a lock held.
Sure, but IMHO in general I think the practice of releasing the lock
like this in the middle of a function is dangerous as one can easily miss
that upper and lower halves of the function are not actually atomic
anymore. And in this case that it's under a conditional makes it even
worse.
--
Kalle Valo
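The general hazard described above can be shown with a small deterministic model. It is a sketch under stated assumptions, not b43legacy code: `struct model_dev`, the `started` flag and the `in_window` hook are hypothetical, and the hook simply stands in for whatever another context might do while the lock is dropped. The second half of the function acts on a check made in the first half unless it re-validates after re-acquiring the lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device state; every field is protected by the modelled lock. */
struct model_dev {
	bool lock_held;
	bool started;
	int writes;	/* counts "hardware" writes made by the second half */
};

/* Hypothetical hook: what another context does while the lock is dropped. */
typedef void (*window_fn)(struct model_dev *d);

/* Another context stopping the device, taking the lock itself. */
static void model_stop(struct model_dev *d)
{
	assert(!d->lock_held);
	d->lock_held = true;
	d->started = false;
	d->lock_held = false;
}

static void model_update(struct model_dev *d, window_fn in_window, bool recheck)
{
	d->lock_held = true;
	if (!d->started)		/* first half: check under the lock */
		goto out;

	d->lock_held = false;		/* lock dropped mid-function */
	if (in_window)
		in_window(d);		/* other context may run here */
	d->lock_held = true;		/* lock re-acquired */

	if (recheck && !d->started)	/* re-validate the earlier check */
		goto out;

	d->writes++;			/* second half: act on the check */
out:
	d->lock_held = false;
}
```

With `recheck` set to false and `model_stop` passed as the hook, the write still happens although the device was stopped in the window. With `recheck` set to true it does not. Whether such a window matters in the BSSID path is exactly what the thread is debating.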
Michael Büsch
2017-06-01 05:29:15 UTC
On Thu, 01 Jun 2017 07:27:20 +0300
Post by Kalle Valo
Post by Michael Büsch
Post by Kalle Valo
Post by Jia-Ju Bai
--- a/drivers/net/wireless/broadcom/b43legacy/main.c
+++ b/drivers/net/wireless/broadcom/b43legacy/main.c
@@ -2859,7 +2859,9 @@ static void b43legacy_op_bss_info_changed(struct ieee80211_hw *hw,
b43legacy_write32(dev, B43legacy_MMIO_GEN_IRQ_MASK, 0);
if (changed & BSS_CHANGED_BSSID) {
+ spin_unlock_irqrestore(&wl->irq_lock, flags);
b43legacy_synchronize_irq(dev);
+ spin_lock_irqsave(&wl->irq_lock, flags);
To me this looks like a fragile workaround and not a real fix. You can
easily add new race conditions with releasing the lock like this.
I think releasing the lock possibly is fine. It certainly is better than
sleeping with a lock held.
Sure, but IMHO in general I think the practice of releasing the lock
like this in the middle of a function is dangerous as one can easily miss
that upper and lower halves of the function are not actually atomic
anymore. And in this case that it's under a conditional makes it even
worse.
Yes in general I agree. Releasing and re-acquiring a lock is dangerous.
But I think in this special case here it might be harmless.
The irq_lock is used mostly (if not exclusively; I don't fully
remember) to protect against the IRQ top and bottom half.
But we disabled the device IRQs a line above and the purpose of this
synchronize is to make sure the handler will finish and thus make
dropping the lock safe.
Of course it does not make sense to do this with the lock held :)
--
Michael