admin管理员组文章数量:1296485
I'm writing a webscraper/automation tool. This tool needs to use POST requests to submit form data. The final action uses this link:
<a id="linkSaveDestination" href='javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions("linkSaveDestination", "", true, "", "", false, true))'>Save URL on All Search Engines</a>
to submit data from this form:
<input name="sem_ad_group__destination_url" type="text" maxlength="1024" id="sem_ad_group__destination_url" class="TextValueStyle" style="width:800px;">
I've been using requests and BeautifulSoup. I understand that these libraries can't interact with Javascript, and people remend Selenium. But as I understand it Selenium can't do POSTs. How can I handle this? Is it possible to do without opening an actual browser like Selenium does?
I'm writing a webscraper/automation tool. This tool needs to use POST requests to submit form data. The final action uses this link:
<a id="linkSaveDestination" href='javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions("linkSaveDestination", "", true, "", "", false, true))'>Save URL on All Search Engines</a>
to submit data from this form:
<input name="sem_ad_group__destination_url" type="text" maxlength="1024" id="sem_ad_group__destination_url" class="TextValueStyle" style="width:800px;">
I've been using requests and BeautifulSoup. I understand that these libraries can't interact with Javascript, and people remend Selenium. But as I understand it Selenium can't do POSTs. How can I handle this? Is it possible to do without opening an actual browser like Selenium does?
Share asked Jan 15, 2015 at 0:02 Steven WernerSteven Werner 1671 gold badge3 silver badges12 bronze badges 4- Selenium can post just fine. What's important is that you're executing in a runtime that can execute the javascript on the link by clicking it, like that of a webdriver provided by selenium – RutledgePaulV Commented Jan 15, 2015 at 0:05
- You could, of course, just pull the name of the input field(s) for the form and insert whatever values you want and send the post yourself using requests. This would probably be the better option if this is only a small piece of your application that you're not planning on having to do elsewhere. – RutledgePaulV Commented Jan 15, 2015 at 0:06
- So, when I click the link above, it posts what was in the form and refreshes the page. Can I simply make a post request to the same URL instead of using the link? Also, the form has a value attribute when there is data saved for that field, but the input element above shows it without any data entered, so there is no value attribute. Can I just add 'value: "blah"' to my post payload using requests? – Steven Werner Commented Jan 15, 2015 at 2:06
- Yes. I've added an answer with some details and a link to documentation. – RutledgePaulV Commented Jan 15, 2015 at 2:35
2 Answers
Reset to default 7Yes. You can absolutely duplicate what the link is doing by just submitting a POST to the proper url (this is, in reality, eventually going to be the same thing that the javascript that fires when the link is clicked does).
You'll find the relevant section in the requests docs here: http://docs.python-requests/en/latest/user/quickstart/#more-plicated-post-requests
So, that'll look something like this for your particular case:
payload = {'sem_ad_group__destination_url': 'yourTextValueHere'}
r = requests.post("theActionUrlForTheFormHere", data=payload)
If you're having trouble figuring out what url it is actually be posted to, just monitor the network tab (in chrome dev tools) while you manually click the link yourself, you should be able to find the right request and pull any information off of that.
Good Luck!
With selenium
you mimic the real-user interactions in a real browser - tell it to locate an input, write a text inside, click a button etc - high-level approach - you don't even need to know what is there under-the-hood, you see what a real user sees. The downside here is that there is a real browser involved which, at least, slows things down. You can though, automate a a headless browser (PhantomJS
), or use a Xvfb
virtual framebuffer if you don't have conditions to open up a browser with a UI. Example:
from selenium import webdriver
driver = webdriver.PhantomJS()
driver.get('url here')
button = driver.find_element_by_id('linkSaveDestination')
button.click()
With requests+BeautifulSoup
, you are going down to the bare metal - using browser developer tools you research/analyze what requests are made to a server and mimic them in your code. Sometimes the way a page is constructed and requests made are too plicated to automate, or there are anti-web-scraping technique used.
There are pros & cons about both approaches - which option to choose depends on many things.
本文标签: Clicking a Javascript link to make a post request in PythonStack Overflow
版权声明:本文标题:Clicking a Javascript link to make a post request in Python - Stack Overflow 内容由网友自发贡献,该文观点仅代表作者本人, 转载请联系作者并注明出处:http://www.betaflare.com/web/1741622512a2388892.html, 本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌抄袭侵权/违法违规的内容,一经查实,本站将立刻删除。
发表评论