I'm using scrapy and playwright to scrape booking. To do this I need to click a button and capture the AJAX response, but when I run my code it raises this error:

    TypeError: Page.locator() missing 1 required positional argument: 'selector'

This is my code:

    import scrapy
    from scrapy.crawler import CrawlerProcess
    from scrapy.selector import Selector
    from scrapy.http import HtmlResponse
    from scrapy_playwright.page import PageMethod
    from playwright.async_api import Page, expect

    class BookingSpider(scrapy.Spider):
        name = 'booking'
        start_urls = ["https://www.booking/hotel/it/hotelnordroma.en-gb.html?aid=304142&checkin=2025-04-15&checkout=2025-04-16#map_opened-map_trigger_header_pin"]

        def start_requests(self):
            yield scrapy.Request(self.start_urls[0], meta={
                "playwright": True,
                "playwright_include_page": True,
                "playwright_page_methods": [
                    PageMethod("wait_for_selector", ".e1793b8db2")
                ]
            })

        def parse(self, response):
            Page.locator("xpath=//*[@class='e1793b8db2'][1]").click()
            with open("copy.txt", "w", encoding="utf-8") as file:
                file.write(response.text)

    process = CrawlerProcess()
    process.crawl(BookingSpider)
    process.start()

Error message:

      File "c:\Users\mojsa\AppData\Local\Programs\Python\Python313\Lib\site-packages\twisted\internet\defer.py", line 1088, in _runCallbacks
        current.result = callback( # type: ignore[misc]
                         ~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
            current.result, *args, **kwargs
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        )
        ^
      File "c:\Users\mojsa\AppData\Local\Programs\Python\Python313\Lib\site-packages\scrapy\spiders\__init__.py", line 86, in _parse
        return self.parse(response, **kwargs)
               ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^
      File "J:\SeSa\booking\booking\spiders\booking.py", line 27, in parse
        Page.locator("xpath=//*[@class='e1793b8db2'][1]").click()
        ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    TypeError: Page.locator() missing 1 required positional argument: 'selector'

asked Feb 21 at 17:46 by Mojsa, edited Feb 25 at 12:17

1 Answer

Score: -1

Issues:

1. Incorrect start_urls usage in start_requests. start_urls is a class attribute, so inside start_requests it must be referenced as self.start_urls.
2. Incorrect use of Page.locator. In parse, Page is the imported class, not a live page object, so Page.locator(...) is called without an instance and the selector string is consumed as self. Take the page from the response's meta field (response.meta["playwright_page"]) and call locator on that object.
3. Incorrect indentation for CrawlerProcess. process = CrawlerProcess() and the lines that follow it belong at module level, not inside the class.
4. Missing imports. The script needs scrapy, CrawlerProcess (from scrapy.crawler), and PageMethod (from scrapy_playwright.page, not from playwright).

3 Comments

Page.locator() needs two arguments but you use only one. Maybe it needs response as second (or first) argument. Or maybe you should use response.locator() instead of Page.locator()? – furas, Commented Feb 21 at 19:53

page = response.meta["playwright_page"] like in question "python - Scrapy and Scrapy-playwright scrape first comment of every page instead of every comment for every page - Stack Overflow". And maybe later use this instance page instead of class name Page. – furas, Commented Feb 21 at 19:59

[...] Page comes from. – lmtaq, Commented Feb 21 at 22:36
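
The difference between the class name Page and an instance can be reproduced without Playwright at all. The sketch below uses a hypothetical stand-in class with the same method signature; it is only meant to show why Python reports a missing 'selector' argument when the method is called on the class.

```python
class Page:
    """Hypothetical stand-in for playwright's Page; only the signature matters."""

    def locator(self, selector):
        return ("locator", selector)


error_message = None

# Called on the class: the string is bound to `self`, so `selector` is missing.
try:
    Page.locator("xpath=//*[@class='e1793b8db2'][1]")
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# Called on an instance: `self` is supplied automatically.
page = Page()
result = page.locator("xpath=//*[@class='e1793b8db2'][1]")
print(result)
```

On recent Python versions the printed message should match the last line of the traceback in the question, which is why obtaining a real page instance is the fix rather than passing an extra argument.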