I'm using cURL to access a number of different pages. I want an elegant way of checking whether a page has a JavaScript redirect. I could check for the presence of window.location
in the body, but because the redirect may be inside a .js file or issued through a library like jQuery, it seems like no solution would be perfect. Does anyone have any ideas?
- Not (easily) possible with simple curl requests since curl doesn't support javascript. – PeeHaa Commented Nov 26, 2012 at 19:49
- Yes, I was thinking more of running the markup through a parser, rather than executing it. – madphp Commented Nov 26, 2012 at 19:50
- If you are using a parser (or writing one), you can compile a list of the .js files referenced in the content of the requested file. With that list, you can download those files and parse them for the presence of a redirect as well. Since you have access to the source when downloading the file in your parser, you can prepend the base URL (derived from the URL you originally requested) to relative links in the document in order to download them. – renab Commented Nov 26, 2012 at 19:52
- @popnoodles cURL won't fire the JavaScript redirect, so there will be no URL to resolve. – renab Commented Nov 26, 2012 at 19:57
- Maybe you could use something like capybara/selenium: christopherbloom./2012/03/12/… – sroes Commented Nov 26, 2012 at 20:01
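The parser idea from the comments above could be sketched roughly as follows. This is a heuristic illustration for Node.js, not a reliable detector: the names REDIRECT_PATTERNS, hasRedirectHint and collectScriptUrls are hypothetical, the pattern list is an assumption and far from exhaustive, and regular expressions are only a rough stand-in for a real HTML parser. Relative script paths are resolved against the originally requested URL using the standard URL class.

```javascript
// Heuristic patterns that often indicate a client-side redirect.
// Assumption for illustration only; real pages can redirect in many other ways.
const REDIRECT_PATTERNS = [
  /\blocation(?:\.href)?\s*=(?!=)/,            // location = ... / location.href = ...
  /\blocation\.(?:replace|assign)\s*\(/,       // location.replace(...) / location.assign(...)
  /<meta[^>]+http-equiv\s*=\s*["']?refresh/i   // <meta http-equiv="refresh" ...>
];

// Returns true if any of the patterns appears in the given HTML or JS source.
function hasRedirectHint(source) {
  return REDIRECT_PATTERNS.some(function (pattern) {
    return pattern.test(source);
  });
}

// Collects the src of every <script src="..."> tag and resolves it
// against the URL the page was requested from.
function collectScriptUrls(html, baseUrl) {
  const urls = [];
  const scriptSrc = /<script\b[^>]*\bsrc\s*=\s*["']([^"']+)["']/gi;
  let match;
  while ((match = scriptSrc.exec(html)) !== null) {
    urls.push(new URL(match[1], baseUrl).href);
  }
  return urls;
}
```

Each URL returned by collectScriptUrls would then be downloaded (with cURL, for example) and passed through hasRedirectHint as well. A hit only means a redirect statement is present somewhere in the source, not that it will actually run.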
4 Answers
Thanks to Ikstar for pointing out PhantomJS, I worked out the following example:
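test.js

var page = require('webpage').create();

// URLs to check, processed one at a time.
var testUrls = [
    "http://www.google.nl",
    "http://www.example.com"
];

function testNextUrl() {
    var testUrl = testUrls.shift();
    page.open(testUrl, function () {
        // After loading, page.url holds the address the browser ended up on.
        // If it no longer starts with the requested URL, some redirect took
        // place. Note that this check does not distinguish HTTP redirects
        // from JavaScript ones.
        var hasRedirect = page.url.indexOf(testUrl) !== 0;
        console.log(testUrl + ": " + hasRedirect.toString());
        if (testUrls.length) {
            testNextUrl();
        } else {
            phantom.exit();
        }
    });
}

testNextUrl();

Result:

D:\Tools\phantomjs-1.7.0-windows>phantomjs test.js
http://www.google.nl: false
http://www.example.com: true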
You cannot do it by only parsing the script. Only executing it will show you the true flow of the page's JS.
One way to approximate execution is to classify each redirect by where it appears in the code. A redirect at the top level of a <script>
tag runs as soon as the script loads, so it can be treated as a definite redirect. If a redirect is found inside a function, it only fires when that function is called, so you have to trace the structure of the program and make a guess.
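A rough sketch of that classification, for Node.js, might look like the following. It is an approximation under stated assumptions: classifyRedirects, braceDepthAt and REDIRECT_RE are hypothetical names, brace depth is used as a crude stand-in for "inside a function" (a top-level if block also counts as nested), and comments or regex literals containing braces are not handled.

```javascript
// Matches an assignment to location / location.href, or a call to
// location.replace() / location.assign(). Heuristic, not exhaustive.
const REDIRECT_RE = /\blocation(?:\.href)?\s*=(?!=)|\blocation\.(?:replace|assign)\s*\(/g;

// Counts how many unclosed "{" precede the given position,
// skipping over the contents of string literals.
function braceDepthAt(script, position) {
  let depth = 0;
  let quote = null; // current string delimiter while inside a string literal
  for (let i = 0; i < position; i++) {
    const ch = script[i];
    if (quote) {
      if (ch === "\\") {
        i++; // skip the escaped character
      } else if (ch === quote) {
        quote = null;
      }
    } else if (ch === '"' || ch === "'" || ch === "`") {
      quote = ch;
    } else if (ch === "{") {
      depth++;
    } else if (ch === "}") {
      depth = Math.max(0, depth - 1);
    }
  }
  return depth;
}

// Labels every redirect statement as "top-level" (runs when the script
// loads) or "nested" (runs only if the enclosing block is reached).
function classifyRedirects(script) {
  const results = [];
  let match;
  REDIRECT_RE.lastIndex = 0;
  while ((match = REDIRECT_RE.exec(script)) !== null) {
    const depth = braceDepthAt(script, match.index);
    results.push({
      index: match.index,
      kind: depth === 0 ? "top-level" : "nested"
    });
  }
  return results;
}
```

A "nested" result is exactly the case where a guess is needed: the scanner cannot tell whether the enclosing function is ever called.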
Depending on why you are using cURL and whether you actually need to follow the redirect, you could incorporate a headless browser such as PhantomJS (http://phantomjs.org/) to do the necessary browsing. You would then be able to see whether a redirect actually happens, as well as track any other JavaScript executing on the page.
In the general case, it is impossible to detect whether a redirect will occur just by analyzing the web page's source code.
The undecidable Halting problem can be encoded in JavaScript. An algorithm may halt, after which the page issues a redirect, or it may run forever. Since we cannot know in general whether the code will halt, it is also impossible to decide whether the redirect will be executed.
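As an illustration (not a proof), consider a redirect that is guarded by a loop whose termination is not known in general. The sketch below uses the Collatz iteration, for which termination on every positive integer is an open question. The names collatzReachesOne and maybeRedirect are hypothetical, and the navigate callback stands in for an assignment to window.location so that the snippet can run outside a browser.

```javascript
// Returns true once the Collatz sequence starting at n reaches 1.
// Whether this loop terminates for every positive integer is an open problem.
function collatzReachesOne(n) {
  while (n !== 1) {
    n = (n % 2 === 0) ? n / 2 : 3 * n + 1;
  }
  return true;
}

// The redirect happens only if the loop above halts.
// In a real page, navigate(url) would be: window.location = url;
function maybeRedirect(n, navigate) {
  if (collatzReachesOne(n)) {
    navigate("http://www.example.com/");
  }
}

// Example: record the redirect target instead of navigating.
var visited = [];
maybeRedirect(27, function (url) { visited.push(url); });
```

A static scanner sees the redirect statement in the source, but deciding whether it is ever reached would require knowing that the loop terminates for the given input.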