fix(dot/peerset): fix sending on closed channel race condition when dropping peer #2573

Merged — 5 commits, Jun 10, 2022

Changes from 2 commits

41 changes: 15 additions & 26 deletions dot/peerset/peerset.go
@@ -712,15 +712,27 @@ func (ps *PeerSet) start(ctx context.Context, actionQueue chan action) {
ps.actionQueue = actionQueue
ps.resultMsgCh = make(chan Message, msgChanSize)

go ps.listenAction(ctx)
go ps.periodicallyAllocateSlots(ctx)
go ps.listenActionAllocSlots(ctx)
}

func (ps *PeerSet) listenAction(ctx context.Context) {
func (ps *PeerSet) listenActionAllocSlots(ctx context.Context) {
ticker := time.NewTicker(ps.nextPeriodicAllocSlots)

defer func() {
ticker.Stop()
close(ps.resultMsgCh)

Contributor:
Should we close ps.actionQueue as well here? It does not look like that channel is getting closed anywhere!

Contributor:
Note we don't need to close channels, the GC will take care of it. It's usually used to signal to multiple goroutines (a bit like contexts I guess).

On the other hand, I don't mind closing it, to make sure some other goroutine doesn't use it after, since our dot/network and dot/peerset code is a bit async-messy. Better to have a panic than goroutines accumulating and not doing anything (blocked since there is no consumer anymore).
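
A minimal, self-contained Go sketch of the trade-off described above (illustrative only, not part of this PR): a send on an open channel with no consumer blocks its goroutine forever and leaks silently, while a send on a closed channel panics and surfaces the misuse immediately.

```go
package main

import "fmt"

func main() {
	// A producer blocked on a channel nobody reads from leaks silently:
	// nothing ever surfaces the problem.
	leaked := make(chan int)
	go func() {
		leaked <- 1 // blocks forever: no consumer, the goroutine just accumulates
	}()

	// Sending on a closed channel fails loudly with a panic instead,
	// making the misuse visible right away.
	closedCh := make(chan int)
	close(closedCh)

	defer func() {
		if r := recover(); r != nil {
			fmt.Println("recovered:", r) // prints "send on closed channel"
		}
	}()
	closedCh <- 1 // panics: send on closed channel
}
```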

Contributor (author):
actionQueue is closed in Handler.Stop(). Seems like it's passed into PeerSet.start. It's kind of an odd design to be honest, but I'd rather not refactor the Handler to PeerSet relationship.
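
A simplified sketch of the ownership described in this comment (types and fields are reduced to a minimum and are not the actual gossamer definitions): the Handler creates and closes actionQueue, while PeerSet.start only consumes it and returns once the channel is closed.

```go
package peerset

import "context"

// action stands in for the real dot/peerset action type.
type action struct{}

// PeerSet is reduced to the behaviour relevant to this comment.
type PeerSet struct{}

// start consumes actionQueue but does not own it: the loop simply returns
// once the channel is closed (or the context is cancelled).
func (ps *PeerSet) start(ctx context.Context, actionQueue chan action) {
	go func() {
		for {
			select {
			case <-ctx.Done():
				return
			case _, ok := <-actionQueue:
				if !ok {
					return // Handler.Stop closed the queue
				}
				// process the action here
			}
		}
	}()
}

// Handler owns the queue: it creates it, hands it to the PeerSet,
// and is the only place that closes it.
type Handler struct {
	peerSet     *PeerSet
	actionQueue chan action
}

func (h *Handler) Start(ctx context.Context) {
	h.actionQueue = make(chan action)
	h.peerSet.start(ctx, h.actionQueue)
}

func (h *Handler) Stop() {
	close(h.actionQueue)
}
```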

}()

for {
select {
case <-ctx.Done():
return

Contributor:
We could put this log back: logger.Debugf("peerset slot allocation exiting: %s", ctx.Err())

case <-ticker.C:
for setID := 0; setID < ps.peerState.getSetLength(); setID++ {
if err := ps.allocSlots(setID); err != nil {
logger.Warnf("failed to allocate slots: %s", err)
}
}
case act, ok := <-ps.actionQueue:
if !ok {
return
@@ -758,26 +770,3 @@ func (ps *PeerSet) listenAction(ctx context.Context) {
}
}
}

func (ps *PeerSet) periodicallyAllocateSlots(ctx context.Context) {
ticker := time.NewTicker(ps.nextPeriodicAllocSlots)

defer func() {
ticker.Stop()
close(ps.resultMsgCh)
}()

for {
select {
case <-ctx.Done():
logger.Debugf("peerset slot allocation exiting: %s", ctx.Err())
return
case <-ticker.C:
for setID := 0; setID < ps.peerState.getSetLength(); setID++ {
if err := ps.allocSlots(setID); err != nil {
logger.Warnf("failed to allocate slots: %s", err)
}
}
}
}
}
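
For readers following the diff, here is a minimal sketch of the race this PR removes (channel and message types are simplified and the helper names are illustrative, not the gossamer code). Before the change, the periodic-allocation goroutine closed ps.resultMsgCh when the context was cancelled while the action-listening goroutine could still be about to send on it, so a peer-drop message could land on an already-closed channel. Merging both loops into listenActionAllocSlots makes a single goroutine the only sender and the only closer, so the deferred close can only run after the last send.

```go
package peerset

import (
	"context"
	"time"
)

// Racy shape (roughly the pre-fix layout): one goroutine closes the results
// channel on ctx.Done while a second goroutine may still be about to send.
func racy(ctx context.Context, results chan string) {
	go func() { // periodic-allocation-style goroutine
		<-ctx.Done()
		close(results) // races with the send below
	}()
	go func() { // action-listening-style goroutine
		for {
			select {
			case <-ctx.Done():
				return
			case <-time.After(time.Millisecond):
				results <- "drop peer" // may panic: send on closed channel
			}
		}
	}()
}

// Fixed shape (roughly the post-fix layout): a single goroutine owns both
// sending and closing, so the deferred close runs only after the last send.
func fixed(ctx context.Context, results chan string) {
	go func() {
		defer close(results)
		ticker := time.NewTicker(time.Millisecond)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			case <-ticker.C:
				results <- "drop peer" // safe: nothing else closes results
			}
		}
	}()
}
```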