Skip to content

OSError: [Errno 9] Bad file descriptor with uvloop #10506

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
1 task done
tkukushkin opened this issue Mar 2, 2025 · 34 comments · Fixed by #10551
Open
1 task done

OSError: [Errno 9] Bad file descriptor with uvloop #10506

tkukushkin opened this issue Mar 2, 2025 · 34 comments · Fixed by #10551
Labels
bug waiting-for-upstream We are waiting for an upstream change

Comments

@tkukushkin
Copy link

tkukushkin commented Mar 2, 2025

Note: we are waiting for MagicStack/uvloop#646 to get fixed upstream

Describe the bug

Hello! After updating aiohttp to 3.11.13 we see new errors.

To Reproduce

I don't know how to reproduce these errors, but it seems they are somehow connected with cancellation.

Expected behavior

No errors

Logs/tracebacks

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/aiohttp/connector.py", line 1123, in _wrap_create_connection
    connection = await self._loop.create_connection(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "uvloop/loop.pyx", line 2076, in create_connection
  File "uvloop/loop.pyx", line 2066, in uvloop.loop.Loop.create_connection
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/cian_http/client.py", line 159, in request
    async with self._get_session().request(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/aiohttp/client.py", line 1425, in __aenter__
    self._resp: _RetType = await self._coro
                           ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/aiohttp/client.py", line 703, in _request
    conn = await self._connector.connect(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/aiohttp/connector.py", line 548, in connect
    proto = await self._create_connection(req, traces, timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/aiohttp/connector.py", line 1056, in _create_connection
    _, proto = await self._create_direct_connection(req, traces, timeout)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/aiohttp/connector.py", line 1380, in _create_direct_connection
    transp, proto = await self._wrap_create_connection(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/aiohttp/connector.py", line 1141, in _wrap_create_connection
    sock.close()
  File "/usr/local/lib/python3.12/socket.py", line 505, in close
    self._real_close()
  File "/usr/local/lib/python3.12/socket.py", line 499, in _real_close
    _ss.close(self)
OSError: [Errno 9] Bad file descriptor

Python Version

3.13.1

aiohttp Version

3.11.13

multidict Version

6.1.0

propcache Version

0.3.0

yarl Version

1.18.3

OS

Debian 12.9

Related component

Client

Additional context

No response

Code of Conduct

  • I agree to follow the aio-libs Code of Conduct
@tkukushkin tkukushkin added the bug label Mar 2, 2025
@mklokocka
Copy link

Also having trouble with file descriptors on the latest versions of aiohttp and aiohappyeyeballs. Not using uvloop.

BlockingIOError: [Errno 115] Operation now in progress
  File "asyncio/selector_events.py", line 509, in _sock_connect
    sock.connect(address)

RuntimeError: File descriptor 265 is used by transport <_SelectorSocketTransport fd=265 read=polling write=<idle, bufsize=0>>
  File "usr/local/lib/python3.10/site-packages/<internal library>.py", line 53, in mpub_asyncio
    async with self.session_asyncio.post(f"{self.url}/mpub", timeout=timeout, **kwargs) as r:
  File "aiohttp/client.py", line 1425, in __aenter__
    self._resp: _RetType = await self._coro
  File "ddtrace/contrib/trace_utils_async.py", line 35, in wrapper
    return await func(mod, pin, wrapped, instance, args, kwargs)
  File "ddtrace/contrib/aiohttp/patch.py", line 108, in _traced_clientsession_request
    resp = await func(*args, **kwargs)  # type: aiohttp.ClientResponse
  File "aiohttp/client.py", line 703, in _request
    conn = await self._connector.connect(
  File "ddtrace/contrib/aiohttp/patch.py", line 62, in connect
    result = await self.__wrapped__.connect(req, *args, **kwargs)
  File "aiohttp/connector.py", line 548, in connect
    proto = await self._create_connection(req, traces, timeout)
  File "aiohttp/connector.py", line 1056, in _create_connection
    _, proto = await self._create_direct_connection(req, traces, timeout)
  File "aiohttp/connector.py", line 1380, in _create_direct_connection
    transp, proto = await self._wrap_create_connection(
  File "aiohttp/connector.py", line 1116, in _wrap_create_connection
    sock = await aiohappyeyeballs.start_connection(
  File "aiohappyeyeballs/impl.py", line 93, in start_connection
    raise first_exception
  File "aiohappyeyeballs/impl.py", line 71, in start_connection
    sock = await _connect_sock(
  File "aiohappyeyeballs/impl.py", line 163, in _connect_sock
    await loop.sock_connect(sock, address)
  File "asyncio/selector_events.py", line 499, in sock_connect
    self._sock_connect(fut, sock, address)
  File "asyncio/selector_events.py", line 515, in _sock_connect
    self._ensure_fd_no_transport(fd)
  File "asyncio/selector_events.py", line 248, in _ensure_fd_no_transport
    raise RuntimeError(

@bdraco
Copy link
Member

bdraco commented Mar 3, 2025

@mklokocka Are you using uvloop as well?

@tkukushkin
Copy link
Author

Not using uvloop.

@bdraco
Copy link
Member

bdraco commented Mar 3, 2025

Not using uvloop.

The trace back you posted has uvloop in it.

@tkukushkin
Copy link
Author

I'm talking about @mklokocka comment

@bdraco
Copy link
Member

bdraco commented Mar 3, 2025

@mklokocka Does the problem still happen if you disable the patching of the internals in https://github.com/DataDog/dd-trace-py/blob/main/ddtrace/contrib/internal/aiohttp/patch.py ?

The previous issues thatw ere reported focused on the aiohappyeyeballs staggered race, however the trace you have posted looks like its using the non-staggered race path https://github.com/aio-libs/aiohappyeyeballs/blob/035d976dee1f5e731852649f3fffd4e1aca21825/src/aiohappyeyeballs/impl.py#L68

Which version of aiohappyeyeballs do you have installed?

bdraco added a commit to aio-libs/aiohappyeyeballs that referenced this issue Mar 3, 2025
related aiohttp issue aio-libs/aiohttp#10506

In #101 we replaced the staggered race implementation since the cpython version had races
that were not fixed at the time. cpython has since updated the implementation to fix
additional races. Our current implementation still has problems with cancellation
and cpython has fixed that in python/cpython#128475
and python/cpython#124847

This PR ports the latest cpython implementation
@bdraco
Copy link
Member

bdraco commented Mar 4, 2025

Please update to aiohappyeyeballs 2.4.8 and check if the problem goes away

@mklokocka
Copy link

@bdraco Thanks, the project ran for a few hours with the latest aiohappyeyeballs without any issues.

@bdraco bdraco closed this as completed Mar 4, 2025
@bdraco
Copy link
Member

bdraco commented Mar 4, 2025

Thanks. If the problem reoccurs, let me know and we can reopen

@tkukushkin
Copy link
Author

@bdraco Hello! We've updated aiohappyeyeballs to 2.6.1 and it haven't helped with errors from my original message.

@bdraco bdraco reopened this Mar 12, 2025
@bdraco
Copy link
Member

bdraco commented Mar 12, 2025

I've dug though the aiohappyeyeballs code again, and can find no place where we can work around this. so for uvloop, we are waiting for MagicStack/uvloop#646 to get fixed upstream

In the mean time you might be able to get away with disabling happyeyeballs on the connector if you don't have any use cases where you have non working IPv6

https://docs.aiohttp.org/en/stable/client_reference.html#aiohttp.TCPConnector

The amount of time in seconds to wait for a connection attempt to complete, before starting the next attempt in parallel. This is the “Connection Attempt Delay” as defined in RFC 8305. To disable Happy Eyeballs, set this to None. The default value recommended by the RFC is 0.25 (250 milliseconds).

@bdraco bdraco changed the title OSError: [Errno 9] Bad file descriptor OSError: [Errno 9] Bad file descriptor with uvloop Mar 12, 2025
@bdraco bdraco added the waiting-for-upstream We are waiting for an upstream change label Mar 12, 2025
@tkukushkin
Copy link
Author

Hi @bdraco! We've tried setting happy_eyeballs_delay=None, but we're still encountering the same errors.

@bdraco
Copy link
Member

bdraco commented Mar 14, 2025

Disabling happy eyeballs will only reduce the chance the problem can happen with uvloop. There isn't anything we can do further but wait for uvloop to fix the issue.

@tkukushkin
Copy link
Author

In our case, all the domains we make requests to with aiohttp resolve to a single IPv4. Perhaps disabling Happy Eyeballs couldn't help us because Happy Eyeballs isn't used in the first place?

One thought keeps really bothering me – why don't we see any similar errors on aiohttp 3.9.5, but on 3.10-3.11.13, with Happy Eyeballs disabled, we do? As if something else has changed.

@bdraco
Copy link
Member

bdraco commented Mar 14, 2025

#10464 was probably the most relevant change in 3.11.13

@bdraco
Copy link
Member

bdraco commented Mar 14, 2025

One difference between how aiohappyeyeballs and that pr handles the close is that aiohappyeyeballs expects the socket.close() to be able to raise and traps it in https://github.com/aio-libs/aiohappyeyeballs/blob/e3bd5bdf44f5d187802de6dcb08d27e1ca6da048/src/aiohappyeyeballs/impl.py#L227

It probably makes sense to re-raise the OSError from the socket.close() as client_error(req.connection_key, exc) from exc

bdraco added a commit that referenced this issue Mar 14, 2025
…ose connector socket

This is a followup to #10464 to handle the case where
socket.close() can also raise. This matches the logic
we have in aiohappyeyeballs:

https://github.com/aio-libs/aiohappyeyeballs/blob/e3bd5bdf44f5d187802de6dcb08d27e1ca6da048/src/aiohappyeyeballs/impl.py#L227

fixes #10506
@bdraco
Copy link
Member

bdraco commented Mar 14, 2025

@tkukushkin Can you give #10551 a try?

@bdraco bdraco closed this as completed in d067260 Mar 16, 2025
patchback bot pushed a commit that referenced this issue Mar 16, 2025
…close connector socket (#10551)

<!-- Thank you for your contribution! -->

## What do these changes do?

This is a followup to #10464 to handle the case where `socket.close()`
can also raise. This matches the logic we have in aiohappyeyeballs:

https://github.com/aio-libs/aiohappyeyeballs/blob/e3bd5bdf44f5d187802de6dcb08d27e1ca6da048/src/aiohappyeyeballs/impl.py#L227

We shouldn't raising `OSError` externally from this method as callers
expect a `ClientError`

## Are there changes in behavior for the user?

bugfix

## Is it a substantial burden for the maintainers to support this?

no

## Related issue number

fixes #10506

## Checklist

- [x] I think the code is well written
- [x] Unit tests for the changes exist
- [x] Documentation reflects the changes
- [x] If you provide code modification, please add yourself to
`CONTRIBUTORS.txt`
  * The format is &lt;Name&gt; &lt;Surname&gt;.
  * Please keep alphabetical order, the file is sorted by names.
- [x] Add a new news fragment into the `CHANGES/` folder
  * name it `<issue_or_pr_num>.<type>.rst` (e.g. `588.bugfix.rst`)
  * if you don't have an issue number, change it to the pull request
    number after creating the PR
    * `.bugfix`: A bug fix for something the maintainers deemed an
      improper undesired behavior that got corrected to match
      pre-agreed expectations.
    * `.feature`: A new behavior, public APIs. That sort of stuff.
    * `.deprecation`: A declaration of future API removals and breaking
      changes in behavior.
    * `.breaking`: When something public is removed in a breaking way.
      Could be deprecated in an earlier release.
    * `.doc`: Notable updates to the documentation structure or build
      process.
    * `.packaging`: Notes for downstreams about unobvious side effects
      and tooling. Changes in the test invocation considerations and
      runtime assumptions.
    * `.contrib`: Stuff that affects the contributor experience. e.g.
      Running tests, building the docs, setting up the development
      environment.
    * `.misc`: Changes that are hard to assign to any of the above
      categories.
  * Make sure to use full sentences with correct case and punctuation,
    for example:
    ```rst
    Fixed issue with non-ascii contents in doctest text files
    -- by :user:`contributor-gh-handle`.
    ```

    Use the past tense or the present tense a non-imperative mood,
    referring to what's changed compared to the last released version
    of this project.

(cherry picked from commit d067260)
patchback bot pushed a commit that referenced this issue Mar 16, 2025
…close connector socket (#10551)

<!-- Thank you for your contribution! -->

## What do these changes do?

This is a followup to #10464 to handle the case where `socket.close()`
can also raise. This matches the logic we have in aiohappyeyeballs:

https://github.com/aio-libs/aiohappyeyeballs/blob/e3bd5bdf44f5d187802de6dcb08d27e1ca6da048/src/aiohappyeyeballs/impl.py#L227

We shouldn't raising `OSError` externally from this method as callers
expect a `ClientError`

## Are there changes in behavior for the user?

bugfix

## Is it a substantial burden for the maintainers to support this?

no

## Related issue number

fixes #10506

## Checklist

- [x] I think the code is well written
- [x] Unit tests for the changes exist
- [x] Documentation reflects the changes
- [x] If you provide code modification, please add yourself to
`CONTRIBUTORS.txt`
  * The format is &lt;Name&gt; &lt;Surname&gt;.
  * Please keep alphabetical order, the file is sorted by names.
- [x] Add a new news fragment into the `CHANGES/` folder
  * name it `<issue_or_pr_num>.<type>.rst` (e.g. `588.bugfix.rst`)
  * if you don't have an issue number, change it to the pull request
    number after creating the PR
    * `.bugfix`: A bug fix for something the maintainers deemed an
      improper undesired behavior that got corrected to match
      pre-agreed expectations.
    * `.feature`: A new behavior, public APIs. That sort of stuff.
    * `.deprecation`: A declaration of future API removals and breaking
      changes in behavior.
    * `.breaking`: When something public is removed in a breaking way.
      Could be deprecated in an earlier release.
    * `.doc`: Notable updates to the documentation structure or build
      process.
    * `.packaging`: Notes for downstreams about unobvious side effects
      and tooling. Changes in the test invocation considerations and
      runtime assumptions.
    * `.contrib`: Stuff that affects the contributor experience. e.g.
      Running tests, building the docs, setting up the development
      environment.
    * `.misc`: Changes that are hard to assign to any of the above
      categories.
  * Make sure to use full sentences with correct case and punctuation,
    for example:
    ```rst
    Fixed issue with non-ascii contents in doctest text files
    -- by :user:`contributor-gh-handle`.
    ```

    Use the past tense or the present tense a non-imperative mood,
    referring to what's changed compared to the last released version
    of this project.

(cherry picked from commit d067260)
bdraco added a commit that referenced this issue Mar 16, 2025
…ionError when failing to explicitly close connector socket (#10561)

**This is a backport of PR #10551 as merged into master
(d067260).**


<!-- Thank you for your contribution! -->

## What do these changes do?

This is a followup to #10464 to handle the case where `socket.close()`
can also raise. This matches the logic we have in aiohappyeyeballs:


https://github.com/aio-libs/aiohappyeyeballs/blob/e3bd5bdf44f5d187802de6dcb08d27e1ca6da048/src/aiohappyeyeballs/impl.py#L227

We shouldn't raising `OSError` externally from this method as callers
expect a `ClientError`


## Are there changes in behavior for the user?

bugfix

## Is it a substantial burden for the maintainers to support this?

no

## Related issue number

fixes #10506

## Checklist

- [x] I think the code is well written
- [x] Unit tests for the changes exist
- [x] Documentation reflects the changes
- [x] If you provide code modification, please add yourself to
`CONTRIBUTORS.txt`
  * The format is &lt;Name&gt; &lt;Surname&gt;.
  * Please keep alphabetical order, the file is sorted by names.
- [x] Add a new news fragment into the `CHANGES/` folder
  * name it `<issue_or_pr_num>.<type>.rst` (e.g. `588.bugfix.rst`)
  * if you don't have an issue number, change it to the pull request
    number after creating the PR
    * `.bugfix`: A bug fix for something the maintainers deemed an
      improper undesired behavior that got corrected to match
      pre-agreed expectations.
    * `.feature`: A new behavior, public APIs. That sort of stuff.
    * `.deprecation`: A declaration of future API removals and breaking
      changes in behavior.
    * `.breaking`: When something public is removed in a breaking way.
      Could be deprecated in an earlier release.
    * `.doc`: Notable updates to the documentation structure or build
      process.
    * `.packaging`: Notes for downstreams about unobvious side effects
      and tooling. Changes in the test invocation considerations and
      runtime assumptions.
    * `.contrib`: Stuff that affects the contributor experience. e.g.
      Running tests, building the docs, setting up the development
      environment.
    * `.misc`: Changes that are hard to assign to any of the above
      categories.
  * Make sure to use full sentences with correct case and punctuation,
    for example:
    ```rst
    Fixed issue with non-ascii contents in doctest text files
    -- by :user:`contributor-gh-handle`.
    ```

    Use the past tense or the present tense a non-imperative mood,
    referring to what's changed compared to the last released version
    of this project.

Co-authored-by: J. Nick Koston <[email protected]>
bdraco added a commit that referenced this issue Mar 16, 2025
…ionError when failing to explicitly close connector socket (#10562)

**This is a backport of PR #10551 as merged into master
(d067260).**


<!-- Thank you for your contribution! -->

## What do these changes do?

This is a followup to #10464 to handle the case where `socket.close()`
can also raise. This matches the logic we have in aiohappyeyeballs:


https://github.com/aio-libs/aiohappyeyeballs/blob/e3bd5bdf44f5d187802de6dcb08d27e1ca6da048/src/aiohappyeyeballs/impl.py#L227

We shouldn't raising `OSError` externally from this method as callers
expect a `ClientError`


## Are there changes in behavior for the user?

bugfix

## Is it a substantial burden for the maintainers to support this?

no

## Related issue number

fixes #10506

## Checklist

- [x] I think the code is well written
- [x] Unit tests for the changes exist
- [x] Documentation reflects the changes
- [x] If you provide code modification, please add yourself to
`CONTRIBUTORS.txt`
  * The format is &lt;Name&gt; &lt;Surname&gt;.
  * Please keep alphabetical order, the file is sorted by names.
- [x] Add a new news fragment into the `CHANGES/` folder
  * name it `<issue_or_pr_num>.<type>.rst` (e.g. `588.bugfix.rst`)
  * if you don't have an issue number, change it to the pull request
    number after creating the PR
    * `.bugfix`: A bug fix for something the maintainers deemed an
      improper undesired behavior that got corrected to match
      pre-agreed expectations.
    * `.feature`: A new behavior, public APIs. That sort of stuff.
    * `.deprecation`: A declaration of future API removals and breaking
      changes in behavior.
    * `.breaking`: When something public is removed in a breaking way.
      Could be deprecated in an earlier release.
    * `.doc`: Notable updates to the documentation structure or build
      process.
    * `.packaging`: Notes for downstreams about unobvious side effects
      and tooling. Changes in the test invocation considerations and
      runtime assumptions.
    * `.contrib`: Stuff that affects the contributor experience. e.g.
      Running tests, building the docs, setting up the development
      environment.
    * `.misc`: Changes that are hard to assign to any of the above
      categories.
  * Make sure to use full sentences with correct case and punctuation,
    for example:
    ```rst
    Fixed issue with non-ascii contents in doctest text files
    -- by :user:`contributor-gh-handle`.
    ```

    Use the past tense or the present tense a non-imperative mood,
    referring to what's changed compared to the last released version
    of this project.

Co-authored-by: J. Nick Koston <[email protected]>
@tkukushkin
Copy link
Author

@bdraco Hello!

Sorry for not answering you, you made the release faster than I was able to check it.

We still see these errors with aiohttp 3.11.16 and aiohappyeyeballs 2.6.1.

One thought keeps really bothering me – why don't we see any similar errors on aiohttp 3.9.5, but on 3.10-3.11.13, with Happy Eyeballs disabled, we do? As if something else has changed.

And this is still relevant, everything works perfectly with 3.9.5 and we have a lot of errors with any version after. Once we had 700 such errors in 1 minute from one container.

I understand that root cause of the problem is inside uvloop, but I still don't understand why aiohttp 3.9.5 works fine.

@bdraco
Copy link
Member

bdraco commented Apr 15, 2025

@tkukushkin The problem only happens with uvloop if we pass a socket object in. In 3.9 we called loop.create_connection directly which doesn't (or is less likely to) trigger the bug in uvloop.

3.11

return await self._loop.create_connection(*args, **kwargs, sock=sock)

3.9

return await self._loop.create_connection(*args, **kwargs)

@tkukushkin
Copy link
Author

Got it, thanks a lot! Is it possible to work around this problem somehow? Because 700 errors in one minute is too much.

@bdraco
Copy link
Member

bdraco commented Apr 15, 2025

I haven’t tested this, but something along these lines might help work around the issue. That said, I’m hesitant to take another stab at shipping a uvloop workaround—we’ve been burned a few too many times trying to patch it ourselves. At this point, I’d rather hold off and wait for this PR to get merged upstream.

diff --git a/aiohttp/connector.py b/aiohttp/connector.py
index 08e6ae275..46386a987 100644
--- a/aiohttp/connector.py
+++ b/aiohttp/connector.py
@@ -1123,6 +1123,15 @@ class TCPConnector(BaseConnector):
             async with ceil_timeout(
                 timeout.sock_connect, ceil_threshold=timeout.ceil_threshold
             ):
+                if self._happy_eyeballs_delay is None:
+                    first_addr_infos = addr_infos[0]
+                    address_tuple = first_addr_infos[4]
+                    host: str = address_tuple[0]
+                    port: int = address_tuple[1]
+                    return await self._loop.create_connection(
+                        host, port, *args, **kwargs
+                    )
+                else:
                     sock = await aiohappyeyeballs.start_connection(
                         addr_infos=addr_infos,
                         local_addr_infos=self._local_addr_infos,
@@ -1131,7 +1140,9 @@ class TCPConnector(BaseConnector):
                         loop=self._loop,
                         socket_factory=self._socket_factory,
                     )
-                return await self._loop.create_connection(*args, **kwargs, sock=sock)
+                    return await self._loop.create_connection(
+                        *args, **kwargs, sock=sock
+                    )
         except cert_errors as exc:
             raise ClientConnectorCertificateError(req.connection_key, exc) from exc
         except ssl_errors as exc:
@@ -1324,7 +1335,10 @@ class TCPConnector(BaseConnector):
                 )
             except (ClientConnectorError, asyncio.TimeoutError) as exc:
                 last_exc = exc
-                aiohappyeyeballs.pop_addr_infos_interleave(addr_infos, self._interleave)
+                aiohappyeyeballs.pop_addr_infos_interleave(
+                    addr_infos,
+                    None if self._happy_eyeballs_delay is None else self._interleave,
+                )
                 continue
 
             if req.is_ssl() and fingerprint:

@tkukushkin
Copy link
Author

I would really like to test this workaround, but I need wheels. Maybe it's possible to make a test release (like a release candidate), so it will not be delivered to people who will not request this specific test version? IMHO it could be a good approach to test other fixes in future as well.

@bdraco
Copy link
Member

bdraco commented Apr 15, 2025

It would take me a couple of hours to put something like that together, since I’d need to adjust the existing tests. I’ve already spent a few weeks trying to work around this issue in uvloop, so it’s a bit discouraging to keep pushing on it—especially when the proposed fix hasn’t gotten any attention upstream.

@tkukushkin
Copy link
Author

Thank you so much for your efforts trying to work around this issue on the aiohttp side.
but unfortunately, the fix in uvloop has not been moving anywhere for six months, so I believe some work around is still needed.

@bdraco bdraco reopened this Apr 15, 2025
@bdraco
Copy link
Member

bdraco commented Apr 15, 2025

I'll leave this open, and when I have some cycles, I'll take a look at it again.

@bdraco
Copy link
Member

bdraco commented Apr 15, 2025

Also, in the meantime, I added a note to the top of this issue and pinned it, in the hopes that someone has a relationship with the uvloop maintainers, who will see this issue, and might be able to help get the fix moving upstream.

@bdraco
Copy link
Member

bdraco commented Apr 15, 2025

Also, since you're using Python 3.13, have you considered disabling uvloop?

MagicStack/uvloop#566 (comment)

@tkukushkin
Copy link
Author

tkukushkin commented Apr 15, 2025

Yes, I've tried to disable uvloop, but the performance impact was significant enough in our case.

I have three ideas at the moment:

  1. build uvloop wheels with fix and publish it to our local pypi
  2. try to monkeypatch aiohttp with your new workaround.
  3. pin aiohttp to 3.9.5

@bdraco
Copy link
Member

bdraco commented Apr 15, 2025

One and two are probably decent strategies, I wouldn't do three because you're gonna end up in a panic the next time there's some type of vulnerability that has to be fixed.

@tkukushkin
Copy link
Author

Yes, I agree, and we already need some fixes from newer releases in one of our services.

@tkukushkin
Copy link
Author

Hi! Today we've tried to monkeypatch aiohttp like that:

import asyncio
from typing import Any

import aiohappyeyeballs
from aiohttp import ClientConnectorError, ClientRequest, ClientTimeout, TCPConnector
from aiohttp.client_exceptions import ClientConnectorCertificateError, ClientConnectorSSLError, cert_errors, ssl_errors
from aiohttp.client_proto import ResponseHandler
from aiohttp.helpers import ceil_timeout


async def _wrap_create_connection(
    self: TCPConnector,
    *args: Any,
    addr_infos: list[aiohappyeyeballs.AddrInfoType],
    req: ClientRequest,
    timeout: ClientTimeout,
    client_error: type[Exception] = ClientConnectorError,
    **kwargs: Any,
) -> tuple[asyncio.Transport, ResponseHandler]:
    try:
        async with ceil_timeout(timeout.sock_connect, ceil_threshold=timeout.ceil_threshold):
            first_addr_infos = addr_infos[0]
            address_tuple = first_addr_infos[4]
            host: str = address_tuple[0]
            port: int = address_tuple[1]
            return await self._loop.create_connection(*args, host=host, port=port, **kwargs)  # type: ignore[call-overload]
    except cert_errors as exc:
        raise ClientConnectorCertificateError(req.connection_key, exc) from exc
    except ssl_errors as exc:
        raise ClientConnectorSSLError(req.connection_key, exc) from exc
    except OSError as exc:
        if exc.errno is None and isinstance(exc, asyncio.TimeoutError):
            raise
        raise client_error(req.connection_key, exc) from exc

TCPConnector._wrap_create_connection = _wrap_create_connection  # type: ignore[method-assign]

And it perfectly works. Haven't seen any errors about bad file descriptors for 3 hours. Before that, we saw at least several errors every minute.

@bdraco
Copy link
Member

bdraco commented Apr 16, 2025

That's great, let me know if it's still working well at the end of the week. The connection order probably isn't ideal from the address list, but it may already be sorted. More testing would be required to make sure we could ship a change like that.

@bdraco
Copy link
Member

bdraco commented Apr 19, 2025

Looks like the uvloop PR might be moving along. 🤞

@tkukushkin
Copy link
Author

I hope so.

By the way, our monkey patched version of aiohttp still works great in production.
The only problem that I met, there is something wrong with IPv6 addresses.

We don't have IPv6 in production, but we use aiohttp for tests too, and localhost resolves to IPv6 and IPv4 addresses. IPv6 address leads to an error:

aiohttp.client_exceptions.ClientConnectorError: Cannot connect to host localhost:8000 ssl:default [Connect call failed ('::1', 8000, 0, 0)]

jinyingli-saturn added a commit to Saturn-Technologies/async-pynamodb that referenced this issue Apr 21, 2025
Pin aiohttp to 3.9.5 (downgrade). This is a workaround to
aio-libs/aiohttp#10506, which causes bad
sockets (especially under load).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug waiting-for-upstream We are waiting for an upstream change
Projects
None yet
3 participants