Description
I'm posting this mostly to confirm that I'm out of options, other than migrating to Linux (or using WSL2).
Windows 10, Python 3.8.5, Scrapy 2.4.1, playwright-1.9.2, scrapy-playwright 0.0.3
TL;DR: AsyncioSelectorReactor is built on top of SelectorEventLoop and by design needs add_reader from it (or maybe something else as well), so it won't work with ProactorEventLoop. But subprocesses on Windows are supported only by ProactorEventLoop and are not implemented in SelectorEventLoop.
The reasons are mostly described here: https://docs.python.org/3/library/asyncio-platforms.html#asyncio-windows-subprocess
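To make the conflict easier to reproduce in isolation, here is a minimal sketch that probes an event loop class for the two capabilities involved, using only the standard library. The helper names (supports_add_reader, supports_subprocess, probe) are made up for illustration and are not part of Scrapy, Twisted, or Playwright.

```python
import asyncio
import socket
import sys


def supports_add_reader(loop):
    """Return True if loop.add_reader() is implemented (Twisted's reactor needs it)."""
    left, right = socket.socketpair()
    try:
        loop.add_reader(left.fileno(), lambda: None)
        loop.remove_reader(left.fileno())
        return True
    except NotImplementedError:
        return False
    finally:
        left.close()
        right.close()


def supports_subprocess(loop):
    """Return True if the loop can spawn a child process (Playwright needs it)."""
    async def spawn():
        # Start a trivial child process and wait for it to exit.
        proc = await asyncio.create_subprocess_exec(sys.executable, "-c", "pass")
        await proc.wait()

    try:
        loop.run_until_complete(spawn())
        return True
    except NotImplementedError:
        return False


def probe(loop_factory):
    """Create a loop with loop_factory, probe both capabilities, then close it."""
    loop = loop_factory()
    try:
        return {
            "add_reader": supports_add_reader(loop),
            "subprocess": supports_subprocess(loop),
        }
    finally:
        loop.close()


if __name__ == "__main__":
    # ProactorEventLoop only exists on Windows, so look it up defensively.
    candidates = [asyncio.SelectorEventLoop, getattr(asyncio, "ProactorEventLoop", None)]
    for factory in candidates:
        if factory is not None:
            print(factory.__name__, probe(factory))
```

Going by the tracebacks in this report, on Windows I would expect ProactorEventLoop to fail the add_reader probe and SelectorEventLoop to fail the subprocess probe, so neither loop satisfies both requirements at once.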
With process = CrawlerProcess(get_project_settings()) in starter.py:
from scrapy.utils.reactor import install_reactor
install_reactor('twisted.internet.asyncioreactor.AsyncioSelectorReactor')
In settings.py:
DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"
For Twisted == 20.3.0:
starter.py", line 8, in <module>
install_reactor('twisted.internet.asyncioreactor.AsyncioSelectorReactor')
File "C:\Users\i\miniconda3\envs\yu\lib\site-packages\scrapy\utils\reactor.py", line 66, in install_reactor
asyncioreactor.install(eventloop=event_loop)
File "C:\Users\i\miniconda3\envs\yu\lib\site-packages\twisted\internet\asyncioreactor.py", line 320, in install
reactor = AsyncioSelectorReactor(eventloop)
File "C:\Users\i\miniconda3\envs\yu\lib\site-packages\twisted\internet\asyncioreactor.py", line 69, in __init__
super().__init__()
File "C:\Users\i\miniconda3\envs\yu\lib\site-packages\twisted\internet\base.py", line 571, in __init__
self.installWaker()
File "C:\Users\i\miniconda3\envs\yu\lib\site-packages\twisted\internet\posixbase.py", line 286, in installWaker
self.addReader(self.waker)
File "C:\Users\i\miniconda3\envs\yu\lib\site-packages\twisted\internet\asyncioreactor.py", line 151, in addReader
self._asyncioEventloop.add_reader(fd, callWithLogger, reader,
File "C:\Users\i\miniconda3\envs\yu\lib\asyncio\events.py", line 501, in add_reader
raise NotImplementedError
NotImplementedError
For Twisted == 21.2.0:
starter.py", line 8, in <module>
install_reactor('twisted.internet.asyncioreactor.AsyncioSelectorReactor')
File "C:\Users\i\miniconda3\envs\yu\lib\site-packages\scrapy\utils\reactor.py", line 66, in install_reactor
asyncioreactor.install(eventloop=event_loop)
File "C:\Users\i\miniconda3\envs\yu\lib\site-packages\twisted\internet\asyncioreactor.py", line 307, in install
reactor = AsyncioSelectorReactor(eventloop)
File "C:\Users\i\miniconda3\envs\yu\lib\site-packages\twisted\internet\asyncioreactor.py", line 60, in __init__
raise TypeError(
TypeError: SelectorEventLoop required, instead got: <ProactorEventLoop running=False closed=False debug=False>
(I'm writing down the attempts below only to make these errors easier to find via search, because of course none of them help):
Also, if we try to set the event loop policy for CrawlerProcess in starter.py:
asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())
before installing the reactor, or just pass SelectorEventLoop to install_reactor directly:
install_reactor('twisted.internet.asyncioreactor.AsyncioSelectorReactor', event_loop_path='asyncio.SelectorEventLoop')
we still get NotImplementedError.
The same happens even without the starter script, when the spider is started from the terminal with scrapy crawl spider_name and
ASYNCIO_EVENT_LOOP = "asyncio.SelectorEventLoop"
is set in settings.py:
future: <Task finished name='Task-4' coro=<Connection.run() done, defined at c:\users\i\miniconda3\envs\yu\lib\site-packages\playwright\_impl\_connection.py:163> exception=NotImplementedError()>
Traceback (most recent call last):
File "c:\users\i\miniconda3\envs\yu\lib\site-packages\playwright\_impl\_connection.py", line 166, in run
await self._transport.run()
File "c:\users\i\miniconda3\envs\yu\lib\site-packages\playwright\_impl\_transport.py", line 60, in run
proc = await asyncio.create_subprocess_exec(
File "c:\users\i\miniconda3\envs\yu\lib\asyncio\subprocess.py", line 236, in create_subprocess_exec
transport, protocol = await loop.subprocess_exec(
File "c:\users\i\miniconda3\envs\yu\lib\asyncio\base_events.py", line 1630, in subprocess_exec
transport = await self._make_subprocess_transport(
File "c:\users\i\miniconda3\envs\yu\lib\asyncio\base_events.py", line 491, in _make_subprocess_transport
raise NotImplementedError
NotImplementedError