I am trying to learn PhantomJS. I would appreciate help understanding why the code below gives the error shown below, and how to fix it. I am trying to execute some JavaScript on a page using PhantomJS. The lines inside the evaluate function work when I enter them in the Chrome console, i.e., they give the expected result (document.title).
Thank you.
PhantomJS Code
    var page = require('webpage').create();
    var url = 'http://www.google.';
    page.open(url, function(status) {
        var title = page.evaluate(function(query) {
            document.querySelector('input[name=q]').setAttribute('value', query);
            document.querySelector('input[name="btnK"]').click();
            return document.title;
        }, 'phantomJS');
        console.log(title);
        phantom.exit();
    });
Error
TypeError: 'null' is not an object (evaluating 'document.querySelector('input[name="btnK"]').click')
phantomjs://webpage.evaluate():4
phantomjs://webpage.evaluate():7
phantomjs://webpage.evaluate():7
null
Edit 1: In response to Andrew's answer
Andrew, it is strange, but on my computer the button is an input element. The following screenshot shows the result on my computer.
Edit 2: click event unreliable
Sometimes, the following click event works, sometimes it does not.
document.querySelector('input[name="btnK"]')
It is not clear to me what is happening.
About the answer
For future readers: in addition to the answer, the gist by Artjom B. is helpful for understanding what is happening. For a more robust solution, however, I think something like the waitfor.js example will have to be used (as suggested in the answer). I hope it is okay to copy Artjom B.'s gist here. While the gist below works (using a form submit), it is still not clear to me why it does not work when I try to simulate a click on the input button. If anyone can clarify that, it would be great.
    // Gist by Artjom B.
    var page = require('webpage').create();
    var url = 'http://www.google.';
    page.open(url, function(status) {
        var query = 'phantomJS';
        page.evaluate(function(query) {
            document.querySelector('input[name=q]').value = query;
            document.querySelector('form[action="/search"]').submit();
        }, query);
        setTimeout(function(){
            var title = page.evaluate(function() {
                return document.title;
            });
            console.log(title);
            phantom.exit();
        }, 2000);
    });
asked Jul 1, 2014 at 1:43, edited Jul 1, 2014 at 14:03 by Curious2learn
- Google might deliver different pages to different clients, so it might be an "input" for your browser but a "button" for PhantomJS. Try logging document.body and check what is really there. – Andrew, Jul 1, 2014 at 2:28
- Thanks, Andrew. It is an input, but with name="btnG". This time it did not give an error, so that part worked. However, it did not click the button either: the page title it returned was "Google", the same as before the search term was submitted. – Curious2learn, Jul 1, 2014 at 2:41
- That makes sense. The click causes a navigation, and that takes time. To get the new title, wait some time (e.g. 3 s), or listen for the page load event like this: page.onLoadFinished = function(){ ..evaluate.. }. Also check out CasperJS; it is built on PhantomJS but easier to use. – Andrew, Jul 1, 2014 at 2:51
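The load-event idea from the last comment could be sketched roughly as follows. This is only a minimal sketch, not a tested solution: searchThenReadTitle is a hypothetical helper name, it assumes the page has an input[name=q] inside form[action="/search"] (as in the gist in the question), and it assumes onLoadFinished fires once per top-level navigation, which may not hold for every page.

```javascript
// Hypothetical helper: submit the query after the first load,
// then read the title after the second load (the results page).
// `page` is expected to be a PhantomJS WebPage object.
function searchThenReadTitle(page, query, done) {
    var loads = 0;
    page.onLoadFinished = function (status) {
        loads += 1;
        if (status !== 'success') {
            done(new Error('Page load failed'), null);
            return;
        }
        if (loads === 1) {
            // First load: fill in and submit the form inside the page context.
            page.evaluate(function (q) {
                document.querySelector('input[name=q]').value = q;
                document.querySelector('form[action="/search"]').submit();
            }, query);
        } else if (loads === 2) {
            // Second load: the navigation has finished, so the title is current.
            done(null, page.evaluate(function () {
                return document.title;
            }));
        }
    };
}

// Usage in a PhantomJS script (skipped when run outside PhantomJS).
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    searchThenReadTitle(page, 'phantomJS', function (err, title) {
        console.log(err ? err.message : title);
        phantom.exit();
    });
    page.open('http://www.google.'); // URL exactly as given in the question
}
```

The handler is attached before page.open is called so that the first load is not missed; any loads after the second are simply ignored.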
2 Answers
Google uses a form for submitting its queries. It is also quite likely that Google has changed the prototype methods for its search buttons, so it is not really the best site for testing web scraping.
The easiest way to do this is to actually submit the form, which needs only a slight tweak to your example.
    var page = require('webpage').create();
    var url = 'http://www.google.';
    page.open(url, function(status) {
        var query = 'phantomJS';
        var title = page.evaluate(function(query) {
            document.querySelector('input[name=q]').value = query;
            document.querySelector('form[action="/search"]').submit();
            return document.title;
        }, query);
        console.log(title);
        phantom.exit();
    });
Note that the form submission triggers a navigation that completes asynchronously, so reading the title immediately after this call will likely give you a stale or undefined value. You need to account for the time it takes the new page to load before looking up data; PhantomJS's waitfor.js example shows one way to do this.
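A polling helper in the spirit of the waitfor.js example might look like the sketch below. It is not a copy of that example: the waitFor signature here, the 5000 ms timeout, and the 250 ms interval are assumptions chosen for illustration, and treating a changed, non-empty title as "the results page has loaded" is only a rough proxy.

```javascript
// Sketch of a polling helper: call `test` every `intervalMs` milliseconds
// until it returns a truthy value (then call onReady) or until
// `timeoutMs` milliseconds have elapsed (then call onTimeout).
function waitFor(test, onReady, onTimeout, timeoutMs, intervalMs) {
    var start = Date.now();
    var timer = setInterval(function () {
        if (test()) {
            clearInterval(timer);
            onReady();
        } else if (Date.now() - start >= timeoutMs) {
            clearInterval(timer);
            onTimeout();
        }
    }, intervalMs);
}

// Usage in a PhantomJS script (skipped when run outside PhantomJS).
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    page.open('http://www.google.', function (status) { // URL as in the question
        var oldTitle = page.evaluate(function () {
            return document.title;
        });
        page.evaluate(function (q) {
            document.querySelector('input[name=q]').value = q;
            document.querySelector('form[action="/search"]').submit();
        }, 'phantomJS');
        waitFor(function () {
            // Assumption: a new, non-empty title means the results page is there.
            return page.evaluate(function (old) {
                return document.title !== '' && document.title !== old;
            }, oldTitle);
        }, function () {
            console.log(page.evaluate(function () {
                return document.title;
            }));
            phantom.exit();
        }, function () {
            console.log('Timed out waiting for the title to change');
            phantom.exit(1);
        }, 5000, 250);
    });
}
```

Compared with a fixed setTimeout, polling returns as soon as the condition holds and fails explicitly when it never does.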
If you open google. and try document.querySelector('input[name="btnK"]') in the console, the result is null.
Actually, try replacing input with button:
document.querySelector('button[name="btnK"]')
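Because querySelector returns null when nothing matches, calling .click() on the result directly is what produces the TypeError in the question. A guard along these lines avoids that and reports which selector matched. This is a sketch only: clickFirstMatch is a hypothetical helper, and the candidate selectors are just the names mentioned in this thread.

```javascript
// Hypothetical helper: click the first element that matches any selector.
// Returns the selector that matched, or null if none of them did.
function clickFirstMatch(doc, selectors) {
    for (var i = 0; i < selectors.length; i += 1) {
        var el = doc.querySelector(selectors[i]);
        if (el !== null) {
            el.click();
            return selectors[i];
        }
    }
    return null;
}

// Candidate selectors taken from the question, the comments, and this answer.
var candidates = [
    'input[name="btnK"]',
    'button[name="btnK"]',
    'input[name="btnG"]'
];
```

In a PhantomJS script, the function passed to page.evaluate runs inside the page and cannot see variables from the outer script, so the helper would have to be defined inside the evaluated function, which would then call clickFirstMatch(document, ...) with the selectors passed in as an argument.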