I need to call a web page that has javascript. At the bottom of the page I have the following:

  <noscript>
    <p>Javascript is not supported or enabled.</p>
  </noscript>

When I make my HttpWebRequest request like so, it is clear that the javascript on the page did not execute.

Dim req As System.Net.HttpWebRequest = DirectCast(System.Net.WebRequest.Create(New Uri(url)), System.Net.HttpWebRequest)
' Add the current authentication cookie to the request 
Dim cookie As HttpCookie = HttpContext.Current.Request.Cookies(FormsAuthentication.FormsCookieName)
Dim authenticationCookie As New System.Net.Cookie(FormsAuthentication.FormsCookieName, cookie.Value, cookie.Path, HttpContext.Current.Request.Url.Authority)

req.CookieContainer = New System.Net.CookieContainer()
req.CookieContainer.Add(authenticationCookie)
req.MediaType = "PRINT"
req.Method = "GET"
req.UserAgent = "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1) ; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30; .NET CLR 3.0.04506.648; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)"

Dim res As System.Net.WebResponse = req.GetResponse()

What can I do? The response is not useful to me if the JavaScript did not run. I want to convert the output into a PDF. I guess I need a way to execute the JavaScript that is included in the response, but do so outside of the browser.

Thanks.

Asked Dec 30, 2009 at 20:38 by Bobby Ortiz (edited Dec 30, 2009 at 20:44)

5 Answers

What output do you want to convert? You can only scrape the static HTML, not the JavaScript-modified DOM.

Remember that HttpWebRequest does not interpret JavaScript.

  1. Use HttpWebRequest as you already did.
  2. After GetResponse and GetResponseStream, save the stream content to a temporary file (e.g. using a file name from the Path.GetTempFileName() method).
  3. Load it in the WebBrowser class.
  4. Let the page execute for a while.
  5. Walk the WebBrowser instance's representation of the DOM to get what you want.
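
The steps above can be sketched roughly as follows. This is only a minimal sketch, not a tested implementation: the module and function names are made up for illustration, and it assumes the HTML has already been read from the response stream into a string. It writes the file with an .html extension (an assumption, rather than the .tmp name that Path.GetTempFileName() returns) and runs the Windows Forms WebBrowser control on its own STA thread, which the control requires.

```vbnet
Imports System
Imports System.IO
Imports System.Threading
Imports System.Windows.Forms

Module WebBrowserSketch

    ' Hypothetical helper: saves the HTML to a temporary file, loads it in a
    ' WebBrowser control, and returns the DOM as HTML after the page has loaded.
    Function RenderSavedPage(ByVal html As String) As String
        Dim rendered As String = Nothing
        Dim tempPath As String = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N") & ".html")
        File.WriteAllText(tempPath, html)

        Dim worker As New Thread(New ThreadStart(
            Sub()
                Using browser As New WebBrowser()
                    browser.ScriptErrorsSuppressed = True
                    AddHandler browser.DocumentCompleted,
                        Sub(sender As Object, e As WebBrowserDocumentCompletedEventArgs)
                            ' Scripts that keep running after load (timers, AJAX)
                            ' may need an extra delay before reading the DOM.
                            rendered = browser.Document.GetElementsByTagName("html")(0).OuterHtml
                            Application.ExitThread()
                        End Sub
                    browser.Navigate(tempPath)
                    ' Pump messages until the handler calls ExitThread.
                    Application.Run()
                End Using
            End Sub))

        ' The WebBrowser control must live on a single-threaded apartment thread.
        worker.SetApartmentState(ApartmentState.STA)
        worker.Start()
        worker.Join()

        File.Delete(tempPath)
        Return rendered
    End Function

End Module
```

One caveat with this approach: a page loaded from a local file no longer has its original base URL or cookies, so relative script and stylesheet references may fail to load unless they are rewritten to absolute URLs first.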

Hope this helps.

Javascript executes on the user-agent (client-side). You are providing a false user-agent string for the request. The user-agent you are "pretending" to be has a Javascript implementation. HttpWebRequest, of course, does not.

I guess I need a way to execute the JavaScript that is included in the response, but do so outside of the browser.

You'll need to write your own JavaScript interpreter then.

The only alternatives I can think of are using a web browser engine such as WebKit or Gecko to render the page for you on the server side, or finding an online service like Browsershots that will render the page for you.
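
As one possible illustration of the browser-engine route, the sketch below shells out to wkhtmltopdf, a WebKit-based command-line tool that renders a URL (including its JavaScript) to PDF. The tool choice is an assumption and is not named in the answer above; the module name, function name, and the two-second script delay are also made up for illustration, and the executable is assumed to be installed and on the PATH.

```vbnet
Imports System
Imports System.Diagnostics

Module PdfSketch

    ' Hypothetical helper: asks wkhtmltopdf to render the URL to a PDF file,
    ' passing the forms-authentication cookie along. Returns the exit code.
    Function RenderUrlToPdf(ByVal url As String, ByVal cookieName As String,
                            ByVal cookieValue As String, ByVal pdfPath As String) As Integer
        Dim startInfo As New ProcessStartInfo()
        startInfo.FileName = "wkhtmltopdf" ' assumed to be on the PATH
        ' --cookie sends the cookie with the request; --javascript-delay waits
        ' (in milliseconds) for scripts to finish before the page is printed.
        startInfo.Arguments = String.Format(
            "--cookie ""{0}"" ""{1}"" --javascript-delay 2000 ""{2}"" ""{3}""",
            cookieName, cookieValue, url, pdfPath)
        startInfo.UseShellExecute = False
        startInfo.CreateNoWindow = True

        Using proc As Process = Process.Start(startInfo)
            proc.WaitForExit()
            Return proc.ExitCode
        End Using
    End Function

End Module
```

Because the tool fetches the page itself, relative URLs resolve normally; the trade-off is an external dependency on the server.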

Fix the page so it doesn't depend on JavaScript. Build on things that work.
