Is it possible to have mechanize follow an anchor link whose href is a javascript: URL?
I am trying to log in to a website in Python using mechanize and BeautifulSoup.
This is the anchor link:
<a id="StaticModuleID15_ctl00_SkinLogin1_Login1_Login1_LoginButton" href="javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions("StaticModuleID15$ctl00$SkinLogin1$Login1$Login1$LoginButton", "", true, "Login1", "", false, true))"><img id="StaticModuleID15_ctl00_SkinLogin1_Login1_Login1_Image2" border="0" src="../../App_Themes/default/images/Member/btn_loginenter.gif" align="absmiddle" style="border-width:0px;" /></a>
And here is what I have tried:
links = SoupStrainer('a', id="StaticModuleID15_ctl00_SkinLogin1_Login1_Login1_LoginButton")
[anchor for anchor in BeautifulSoup(data, parseOnlyThese=links)]
link = mechanize.Link(base_url = self.url,
                      url = str(anchor['href']),
                      text = str(anchor.string),
                      tag = str(anchor.name),
                      attrs = [(str(name), str(value))
                               for name, value in anchor.attrs])
response2 = br.follow_link(link)
Right now I am getting this error message:
urllib2.URLError:
Any help or suggestions are appreciated.
Edit
After the comments from helpers, I looked at the code of the ASP page a bit.
I found some useful scripts, but I am unsure what I have to do to emulate the JS code in Python. Nowhere did I see any cookies being set; am I looking in the wrong places?
<form name="form1" method="post" action="BrowseSchedule.aspx?ItemId=75" onsubmit="javascript:return WebForm_OnSubmit();" id="form1">
//<![CDATA[
function WebForm_OnSubmit() {
if (typeof(ValidatorOnSubmit) == "function" && ValidatorOnSubmit() == false) return false;
return true;
}
//]]>
<script type="text/javascript">
//<![CDATA[
var theForm = document.forms['form1'];
if (!theForm) {
theForm = document.form1;
}
function __doPostBack(eventTarget, eventArgument) {
if (!theForm.onsubmit || (theForm.onsubmit() != false)) {
theForm.__EVENTTARGET.value = eventTarget;
theForm.__EVENTARGUMENT.value = eventArgument;
theForm.submit();
}
}
//]]>
</script>
function WebForm_DoPostBackWithOptions(options) {
var validationResult = true;
if (options.validation) {
if (typeof(Page_ClientValidate) == 'function') {
validationResult = Page_ClientValidate(options.validationGroup);
}
}
if (validationResult) {
if ((typeof(options.actionUrl) != "undefined") && (options.actionUrl != null) && (options.actionUrl.length > 0)) {
theForm.action = options.actionUrl;
}
if (options.trackFocus) {
var lastFocus = theForm.elements["__LASTFOCUS"];
if ((typeof(lastFocus) != "undefined") && (lastFocus != null)) {
if (typeof(document.activeElement) == "undefined") {
lastFocus.value = options.eventTarget;
}
else {
var active = document.activeElement;
if ((typeof(active) != "undefined") && (active != null)) {
if ((typeof(active.id) != "undefined") && (active.id != null) && (active.id.length > 0)) {
lastFocus.value = active.id;
}
else if (typeof(active.name) != "undefined") {
lastFocus.value = active.name;
}
}
}
}
}
}
if (options.clientSubmit) {
__doPostBack(options.eventTarget, options.eventArgument);
}
}
asked Aug 13, 2009 at 6:01 by freshWoWer, edited Feb 15, 2015 at 6:08 by Sheena
- You should read this: wwwsearch.sourceforge.net/bits/GeneralFAQ.html, in particular the section "Embedded script is messing up my web-scraping. What do I do?" What you choose will largely depend on your needs and on how complicated the login is. It might turn out that the easiest way is just to emulate the JS code with Python. – user120242 Commented Aug 13, 2009 at 6:31
- That was the page I was looking for: I knew it was on the site somewhere. – jkp Commented Aug 13, 2009 at 6:49
- That page is now at wwwsearch.sourceforge.net/old/bits/GeneralFAQ.html – utapyngo Commented Dec 12, 2011 at 14:12
2 Answers
I don't think this is possible with the mechanize module. It doesn't have the ability to interact with JavaScript: it's purely Python and HTTP based.
That said, you may be interested in the python-spidermonkey module, which seems to be aimed at letting you do just this kind of thing. According to its website, its aim is to let you
"Execute arbitrary JavaScript code from Python. Allows you to reference arbitrary Python objects and functions in the JavaScript VM"
I've not used it yet but it certainly looks like it would do what you are looking for, although it is still in alpha.
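For this particular link, running JavaScript may not be needed at all. The __doPostBack function quoted in the question only writes two hidden fields, __EVENTTARGET and __EVENTARGUMENT, and then submits form1, so the same POST can be assembled by hand. The following is a minimal sketch of that idea, assuming the validators referenced by WebForm_OnSubmit can be skipped on the client side; the helper names parse_postback_href and postback_fields are made up for illustration and are not part of mechanize.

```python
import re

# One opening or closing quote: straight quotes or an HTML &quot; entity.
_QUOTE = r"""(?:"|'|&quot;)"""

# Captures the first two string arguments of WebForm_PostBackOptions(...),
# which the page's script passes on as eventTarget and eventArgument.
_POSTBACK_RE = re.compile(
    r"WebForm_PostBackOptions\(\s*"
    + _QUOTE + r"(?P<target>.*?)" + _QUOTE + r"\s*,\s*"
    + _QUOTE + r"(?P<argument>.*?)" + _QUOTE
)


def parse_postback_href(href):
    """Return (eventTarget, eventArgument) from a javascript: postback href."""
    match = _POSTBACK_RE.search(href)
    if match is None:
        raise ValueError("not a WebForm_PostBackOptions link: %r" % (href,))
    return match.group("target"), match.group("argument")


def postback_fields(form_fields, event_target, event_argument=""):
    """Copy the form's fields and set the two fields __doPostBack would set."""
    fields = dict(form_fields)
    fields["__EVENTTARGET"] = event_target
    fields["__EVENTARGUMENT"] = event_argument
    return fields
```

With mechanize itself, the equivalent would presumably be br.select_form(name="form1"), then br.form.set_all_readonly(False) (hidden controls are read-only by default), then assigning br["__EVENTTARGET"] and br["__EVENTARGUMENT"] along with the login fields, and finally br.submit(). Whether the server accepts the result depends on the other hidden fields (such as the view state) being sent back unchanged, which submitting the parsed form does for you.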
You can set cookies using cookielib:
import mechanize
import cookielib
# add headers to your browser also
browser = mechanize.Browser()
browser.addheaders = [('User-agent', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1')]
cj = cookielib.LWPCookieJar()
browser.set_cookiejar(cj)
I doubt this is even relevant now, but oh well :)
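On the question of where the cookies are set: a login cookie normally arrives in a Set-Cookie header on the server's response to the POST, not in the page's JavaScript, which would explain why none showed up in the scripts. A cookie jar attached to the client picks it up automatically. Below is a small standard-library sketch of the same setup, written for Python 3, where cookielib is named http.cookiejar; the function names are hypothetical, and the request is only built, not sent.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def build_opener_with_cookies():
    """Return (opener, jar); the jar stores cookies from Set-Cookie headers."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_postback_request(url, fields):
    """Build, without sending, a form-encoded POST like theForm.submit() makes."""
    body = urllib.parse.urlencode(fields).encode("utf-8")
    # Supplying data makes urllib issue a POST instead of a GET.
    return urllib.request.Request(url, data=body)
```

Calling opener.open(request) would send the POST, and any cookies the server returns would then be in the jar and be sent back on later requests made through the same opener.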