Goal

To log in to this website (https://www.reliant.) using Python requests etc. (I know this could be done with Selenium or PhantomJS or something, but I would prefer not to.)

Problem

During the login process there are a couple of redirects where "session ID" type params are passed. Most of these I can get, but there's one called dtPC that appears to come from a cookie that you get when first visiting the page. As far as I can tell, the cookie originates from this JS file (https://www.reliant./ruxitagentjs_ICA2QSVfhjqrux_10175190917092722.js). This URL is the next GET request the browser performs after the initial GET of the main URL. All the methods I've tried so far have failed to get me that cookie.

Code thus far

from requests_html import HTMLSession

url=r'https://www.reliant.'
url2=r'https://www.reliant./ruxitagentjs_ICA2QSVfhjqrux_10175190917092722.js'
headers={
'Accept':'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3',
 'Accept-Encoding': 'gzip, deflate, br',
 'Accept-Language': 'en-US,en;q=0.9',
 'Cache-Control': 'max-age=0',
 'Connection': 'keep-alive',
 'Host': 'www.reliant.',
 'Sec-Fetch-Mode': 'navigate',
 'Sec-Fetch-Site': 'none',
 'Sec-Fetch-User': '?1',
 'Upgrade-Insecure-Requests': '1',
 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.3'
}

headers2={
'Referer': 'https://www.reliant.',
 'Sec-Fetch-Mode': 'no-cors',
 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36'
}

s=HTMLSession()
r=s.get(url,headers=headers)
js=s.get(url2,headers=headers2).text

r.html.render() #works but doesn't get the cookie
r.html.render(script=js) #fails on Network error

asked Sep 26, 2019 at 16:31 by SuperStew

1 Answer

Alright, I figured this one out, despite it fighting me the whole way. I don't know why dtPC wasn't showing up in s.cookies like it should, but I wasn't using the script keyword quite right. Apparently, whatever JS you pass it is executed after everything else has rendered, as if you had opened the console in your browser and pasted it in. When I actually tried that in Chrome, I got some errors. Eventually I realized I could just run a simple JS script to return the cookies generated by the other JS.

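import urllib.parse
from requests_html import HTMLSession

s=HTMLSession()
r=s.get(url,headers=headers) # url and headers as defined in the question
print(r.status_code)

# render() runs the page's own JS first, then evaluates the script and returns its result
c=r.html.render(script='document.cookie')

# document.cookie is one 'name=value; name2=value2' string
# split on the first '=' only, so values that contain '=' stay intact
c=[x.strip().split('=',1) for x in c.split(';') if '=' in x]
c={k:urllib.parse.unquote(v) for k,v in c}
print(c)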

At this point, c will be a dict with 'dtPC' as a key and its corresponding value.
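
The parsing step can be tried on its own, without rendering the page. Below is a minimal sketch using only the standard library; the helper name parse_cookie_string and the sample cookie string are made up for illustration and are not real output from the site.

```python
from urllib.parse import unquote


def parse_cookie_string(cookie_string):
    """Turn a document.cookie-style string into a {name: value} dict."""
    cookies = {}
    for pair in cookie_string.split(';'):
        # Split on the first '=' only, so values containing '=' stay intact.
        name, sep, value = pair.strip().partition('=')
        if not sep or not name:
            continue  # skip empty fragments and entries without '='
        cookies[name] = unquote(value)
    return cookies


# Made-up sample string (hypothetical values, not taken from the site).
sample = 'dtPC=2$123_456h1vABC; token=a%3Db==; theme=dark'
print(parse_cookie_string(sample))
```

Decoding each value after splitting, instead of unquoting the whole string first, should keep an encoded ';' or '=' inside a value from being mistaken for a separator.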

Tags: javascript, Python Requests run JS file from GET, Stack Overflow