I want to retrieve the text within a webpage as a string. Is this possible? I am new to JavaScript.
For example:
var url = "http://en.wikipedia.org/wiki/Programming";
var result = url.getText(); // <---- stores text as a string
document.write(result);
How do I write the getText method? It could return either the entire HTML source code (which I can use to get the text) or just the text. I would like to do this from within a web browser.
I tried this and I am able to get an index number:
var url = "http://www.youtube.com/results?search_query=cat&page=2";
var result;
function go(){
result = url.search(/cat/i);
document.write(result);
}
This gives me an index of 44. That means that reading a page is possible. Can I do the opposite and enter the index to retrieve the text?
asked Nov 3, 2012 at 2:03 by Qwertyfshag (edited Nov 3, 2012 at 2:40)
- You mean the entire HTML source? – user1534664 Commented Nov 3, 2012 at 2:04
- Are you looking to do this inside a web browser or from a server-side JS engine like Node.js or Rhino? – psema4 Commented Nov 3, 2012 at 2:07
- In order to get around the cross-domain issue, is running a proxy service a possibility? – psema4 Commented Nov 3, 2012 at 2:31
3 Answers
If the Ajax/cross-domain situation is not an issue for you, you can extract the text of a web page with:
var el = document.body; // or some other element reference
var text = el.innerText || el.textContent;
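To tie this back to the question's index example, here is a minimal sketch. The helper names `getText` and `textAt` are made up for illustration. Note that in the question's snippet, `search` ran on the URL string itself (it found the `cat` in `search_query=cat`), so nothing was actually downloaded. Once you do have the page text as a string, `search` gives you an index and `slice` goes the other way:

```javascript
// Hypothetical helper: return the text of a DOM element.
// innerText covers older Internet Explorer; textContent covers other browsers.
function getText(el) {
  return el.innerText || el.textContent || "";
}

// Hypothetical helper: go from an index back to the text at that position.
function textAt(text, index, length) {
  return text.slice(index, index + length);
}

// In a browser you might use them like this:
//   var text = getText(document.body);
//   var index = text.search(/cat/i);  // -1 if "cat" is not found
//   var word = textAt(text, index, 3);
```

Both helpers work on any string, so the same `slice` call also recovers `cat` from the URL string in the question.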
If you need to read text from pages in the same domain as your application, you can use Ajax directly.
If you need to read text from pages outside of your domain, you'll have to jump through a few extra hoops, like setting up a proxy server or dealing with CORS: http://en.wikipedia.org/wiki/Cross-origin_resource_sharing
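As a rough sketch of the same-domain case, the snippet below uses `fetch`, the modern replacement for a hand-written Ajax (`XMLHttpRequest`) call, which is a substitution on my part rather than what the answer had in mind in 2012. The function names and the `fetchImpl` parameter are hypothetical. A cross-domain URL will still fail here unless the other server sends CORS headers or you route the request through your own proxy.

```javascript
// Download a page and return its HTML source as a string.
// fetchImpl is a hypothetical parameter so the function can be tried
// without a network; it defaults to the global fetch.
async function getPageSource(url, fetchImpl = fetch) {
  const response = await fetchImpl(url);
  if (!response.ok) {
    throw new Error("Request failed with status " + response.status);
  }
  return response.text();
}

// Browser only: parse the HTML source and return just its text.
function htmlToText(html) {
  const doc = new DOMParser().parseFromString(html, "text/html");
  return doc.body.textContent || "";
}

// In a browser, for a page on your own domain (the path is made up):
//   getPageSource("/some/page.html").then(function (html) {
//     console.log(htmlToText(html));
//   });
```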
Ajax does not support cross-domain requests. You need a server-side language.
You would be better off using a more powerful server-side language to do that, not JavaScript. Python or PHP would be decent choices.
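For what it's worth, one of the comments mentions Node.js, so the server-side route can also stay in JavaScript. The sketch below assumes Node.js 18 or later (where `fetch` is global); `stripTags` and `getTextFromUrl` are hypothetical names. The regex-based tag stripping is deliberately crude: it does not decode entities such as `&amp;` and can be fooled by unusual markup, so a real HTML parser is the safer choice for anything serious.

```javascript
// Crude tag stripper for environments without a DOM (for example Node.js).
function stripTags(html) {
  return html
    // Drop <script> and <style> blocks together with their contents.
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, " ")
    // Replace every remaining tag with a space.
    .replace(/<[^>]+>/g, " ")
    // Collapse runs of whitespace into single spaces.
    .replace(/\s+/g, " ")
    .trim();
}

// Hypothetical usage on the server, where the browser's same-origin
// restriction does not apply.
async function getTextFromUrl(url) {
  const response = await fetch(url);
  return stripTags(await response.text());
}
```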