Description
Hi,
I am using the driver with Go 1.10. I first noticed that the number of goroutines waiting in the socket readLoop keeps increasing.
goroutine profile: total 3300
3131 @ 0x42ee6a 0x42a18a 0x429807 0x49150b 0x49158d 0x4923ed 0x5f089f 0x60195a 0x6c3193 0x6c3852 0x45c761
# 0x429806 internal/poll.runtime_pollWait+0x56 /usr/local/go110/src/runtime/netpoll.go:173
# 0x49150a internal/poll.(*pollDesc).wait+0x9a /usr/local/go110/src/internal/poll/fd_poll_runtime.go:85
# 0x49158c internal/poll.(*pollDesc).waitRead+0x3c /usr/local/go110/src/internal/poll/fd_poll_runtime.go:90
# 0x4923ec internal/poll.(*FD).Read+0x17c /usr/local/go110/src/internal/poll/fd_unix.go:157
# 0x5f089e net.(*netFD).Read+0x4e /usr/local/go110/src/net/fd_unix.go:202
# 0x601959 net.(*conn).Read+0x69 /usr/local/go110/src/net/net.go:176
# 0x6c3192 github.com/globalsign/mgo.fill+0x52 /home/omerkirk/Projects/go/src/github.com/globalsign/mgo/socket.go:567
# 0x6c3851 github.com/globalsign/mgo.(*mongoSocket).readLoop+0x601 /home/omerkirk/Projects/go/src/github.com/globalsign/mgo/socket.go:583
To confirm that this is not caused by high load, I checked the mongo stats daily and found that the number of goroutines equals the SocketsAlive stat. The service that uses the driver handles between 400 and 4000 rps depending on the time of day. For the past 2 weeks the number of alive sockets has never decreased, even when traffic is very low and the number of sockets in use fluctuates between 1 and 2.
For example, a snapshot of the mongo stats is below. The number of sockets needed to handle the traffic is very low, yet the number of alive sockets is still very high.
"Clusters": 2,
"MasterConns": 32967,
"SlaveConns": 2845,
"SentOps": 66839799,
"ReceivedOps": 66801986,
"ReceivedDocs": 66801986,
"SocketsAlive": 3131,
"SocketsInUse": 0,
"SocketRefs": 0
I've checked my code: I dial a session once at startup, then for each mongo request I copy the session and defer session.Close() in the function that uses the copy. All sessions have a 1 minute timeout. Please let me know if you need any more information on the issue.