I'm learning to use Scrapy with Splash. As an exercise, I'm trying to visit https://www.ubereats.com/stores/, click the address text box, enter a location, and then press Enter to move to the next page, which lists the restaurants available for that location. I have the following Lua code:
function main(splash)
  local url = splash.args.url
  assert(splash:go(url))
  assert(splash:wait(5))

  local element = splash:select('.base_29SQWm')
  local bounds = element:bounds()
  assert(element:mouseclick{x = bounds.width/2, y = bounds.height/2})
  assert(element:send_text("Wall Street"))
  assert(splash:send_keys("<Return>"))
  assert(splash:wait(5))
  return {
    html = splash:html(),
  }
end
When I click on "Render!" in the splash API, I get the following error message:
{
  "info": {
    "message": "Lua error: [string \"function main(splash)\r...\"]:7: attempt to index local 'element' (a nil value)",
    "type": "LUA_ERROR",
    "error": "attempt to index local 'element' (a nil value)",
    "source": "[string \"function main(splash)\r...\"]",
    "line_number": 7
  },
  "error": 400,
  "type": "ScriptError",
  "description": "Error happened while executing Lua script"
}
Somehow my CSS selector matches nothing, so splash:select returns nil and the script fails when it tries to index that nil value. I've tried other selectors, but I can't figure it out.
Q: Does anyone know how to solve this problem?
EDIT: Even though I still would like to know how to actually click on the element, I figured out how to get the same result by just using keys:
function main(splash)
  local url = splash.args.url
  assert(splash:go(url))
  assert(splash:wait(5))
  splash:send_keys("<Tab>")
  splash:send_keys("<Tab>")
  splash:send_text("Wall Street, New York")
  splash:send_keys("<Return>")
  assert(splash:wait(10))
  return {
    html = splash:html(),
    png = splash:png(),
  }
end
However, the HTML and screenshot returned by Splash are from the page where you enter the address, not the page you see after entering the address and pressing Enter.
Q2: How do I successfully load the second page?
Asked Jan 13, 2017 at 10:46 by titusAdam; edited Feb 14, 2022 at 18:24 by Egor Skriptunoff.

1 Answer
Not a complete solution, but here is what I have so far:
import json
import re

import scrapy
from scrapy_splash import SplashRequest


class UberEatsSpider(scrapy.Spider):
    name = "ubereatspider"
    allowed_domains = ["ubereats.com"]

    def start_requests(self):
        # Lua script run by Splash: type the address, then click the submit button.
        script = """
        function main(splash)
            local url = splash.args.url
            assert(splash:go(url))
            assert(splash:wait(10))

            splash:set_viewport_full()

            local search_input = splash:select('#address-selection-input')
            search_input:send_text("Wall Street, New York")
            assert(splash:wait(5))

            local submit_button = splash:select('button[class^=submitButton_]')
            submit_button:click()

            assert(splash:wait(10))

            return {
                html = splash:html(),
                png = splash:png(),
            }
        end
        """
        headers = {
            'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.95 Safari/537.36'
        }
        yield SplashRequest('https://www.ubereats.com/new_york/', self.parse, endpoint='execute', args={
            'lua_source': script,
            'wait': 5
        }, splash_headers=headers, headers=headers)

    def parse(self, response):
        # The store data is embedded as JSON in an inline script element.
        script = response.xpath("//script[contains(., 'cityName')]/text()").extract_first()
        if script is None:
            return  # no matching script element, nothing to parse

        pattern = re.compile(r"window\.INITIAL_STATE = (\{.*?\});", re.MULTILINE | re.DOTALL)
        match = pattern.search(script)
        if match:
            data = match.group(1)
            data = json.loads(data)
            for place in data["marketplace"]["marketplaceStores"]["data"]["entity"]:
                print(place["title"])
Note the changes in the Lua script: I located the search input, sent the search text to it, then located the "Find" button and clicked it. In the screenshot, I did not see the search results load no matter what delay I set, but I managed to get the restaurant names from the script contents. The place objects contain all the information needed to filter the desired restaurants.
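As a rough sketch of that extraction step outside the spider, the JSON can be pulled out with the standard library alone. The function names and the sample string below are made up for illustration, and the sketch assumes the script assigns a plain JSON object to window.INITIAL_STATE, as the spider above does. It uses json.JSONDecoder.raw_decode instead of a non-greedy regex, so the match does not depend on where the first "});" happens to appear:

```python
import json
import re

# Marker taken from the spider above; everything else here is hypothetical.
STATE_MARKER = re.compile(r"window\.INITIAL_STATE\s*=\s*")


def extract_initial_state(script_text):
    """Return the object assigned to window.INITIAL_STATE, or None if absent."""
    match = STATE_MARKER.search(script_text)
    if match is None:
        return None
    # raw_decode parses one JSON value starting at the given index and ignores
    # whatever follows it, so the trailing ";" and later code do no harm.
    state, _end = json.JSONDecoder().raw_decode(script_text, match.end())
    return state


def store_titles(state, keyword=""):
    """List store titles that contain keyword (case-insensitive)."""
    stores = state["marketplace"]["marketplaceStores"]["data"]["entity"]
    return [s["title"] for s in stores if keyword.lower() in s["title"].lower()]


# Hypothetical sample that mimics the shape the spider reads.
sample = (
    'window.INITIAL_STATE = {"marketplace": {"marketplaceStores": {"data": '
    '{"entity": [{"title": "Pizza Place"}, {"title": "Sushi Bar"}]}}}};\n'
    "window.OTHER = {};"
)
state = extract_initial_state(sample)
print(store_titles(state, "sushi"))
```

Inside parse, the same helper could be called on the extracted script text, with the keyword filter standing in for whatever restaurant criteria you need.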
Also note that the URL I'm navigating to is the "New York" one (not the general "stores").
I'm not completely sure why the search result page is not being loaded, but I hope this is a good start and that you can improve on it.
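One thing that might be worth trying, both for the nil element in the question and for the fixed delays used here, is to poll for the element instead of sleeping for a fixed time. The sketch below is an untested assumption rather than a confirmed fix: the argument names selector and max_wait are invented for this example and are read from the args passed to the script, and only splash:go, splash:select, splash:wait and splash:html are used on the Lua side.

```python
# Lua source for Splash's "execute" endpoint. The argument names `selector`
# and `max_wait` are made up for this sketch; Splash passes them through args.
POLL_SCRIPT = """
function main(splash, args)
  assert(splash:go(args.url))
  local waited = 0
  local element = splash:select(args.selector)
  -- Retry every half second until the element exists or time runs out.
  while not element and waited < args.max_wait do
    assert(splash:wait(0.5))
    waited = waited + 0.5
    element = splash:select(args.selector)
  end
  if not element then
    error("no element matched " .. args.selector)
  end
  return {html = splash:html()}
end
"""


def poll_args(selector, max_wait=15):
    """Build the args dict for SplashRequest(..., endpoint='execute', args=...)."""
    return {"lua_source": POLL_SCRIPT, "selector": selector, "max_wait": max_wait}


args = poll_args("#address-selection-input")
print(sorted(args))
```

The resulting dict would go where the spider above passes args to SplashRequest. If the script still raises after max_wait seconds, the selector itself is probably wrong for the rendered page, which seems possible for generated class names such as .base_29SQWm.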