
I am looking to get the contents of a text file hosted on my website using Python. The server requires JavaScript to be enabled on your browser. Therefore when I run:

    import urllib2  
    target_url = "http://09hannd.me/ai/request.txt"
    data = urllib2.urlopen(target_url) 

I receive a html page saying to enable JavaScript. I was wondering if there was a way of faking having JS enabled or something.

Thanks
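(Side note: `urllib2` exists only on Python 2; it was merged into `urllib.request` in Python 3. A minimal Python 3 equivalent of the snippet above would be:)

```python
# Python 3 equivalent of the urllib2 snippet (urllib2 became urllib.request)
from urllib.request import Request, urlopen

target_url = "http://09hannd.me/ai/request.txt"
req = Request(target_url, headers={"User-Agent": "Mozilla/5.0"})
# data = urlopen(req).read()  # network call; still returns the "enable JavaScript" page
```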

asked Dec 22, 2015 at 13:51 by user2216919

2 Answers

Selenium is the way to go here, but there is another "hacky" option.

Based on this answer: https://stackoverflow.com/a/26393257/2517622

    import requests

    url = 'http://09hannd.me/ai/request.txt'
    response = requests.get(url, cookies={'__test': '2501c0bc9fd535a3dc831e57dc8b1eb0'})
    print(response.content)  # Output: find me a cafe nearby
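The same trick works without the `requests` dependency: the cookie can be attached by hand as a `Cookie` header using the stdlib `urllib.request`. A sketch (the token value is copied from the snippet above; a fresh one would normally have to be harvested from the site's JavaScript challenge first):

```python
# Attach the anti-bot cookie manually via a Cookie header (stdlib only, Python 3)
from urllib.request import Request, urlopen

req = Request(
    "http://09hannd.me/ai/request.txt",
    headers={"Cookie": "__test=2501c0bc9fd535a3dc831e57dc8b1eb0"},
)
# data = urlopen(req).read()  # network call, skipped here
```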

There is also an alternative to urllib/requests that supports JavaScript: the requests-html library provides an HTMLSession whose render() method executes the page's JavaScript before you scrape it.

    from requests_html import HTMLSession

    session = HTMLSession()
    r = session.get('https://python-requests.org/')

    r.html.render()  # run the page's JavaScript

    print(r.html.search('Python 2 will retire in only {months} months!')['months'])
    # Output: '<time>25</time>'
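The `search()` call fills the `{months}` placeholder against the rendered HTML, much like a regex capture group. An offline sketch of the same extraction using the stdlib `re` module (the HTML string here is made up to mirror the example's output):

```python
import re

# Made-up fragment mirroring what the rendered page would contain
html = "<p>Python 2 will retire in only <time>25</time> months!</p>"
# Non-greedy capture plays the role of the {months} placeholder
match = re.search(r"retire in only (.+?) months!", html)
print(match.group(1))  # Output: <time>25</time>
```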

Tags: Python get URL contents when page requires JavaScript enabled, Stack Overflow