In an HTML page, I want to read the value of a JavaScript variable.
Below is a snippet of the page:

<input id="hidval" value="" type="hidden"> 
<form method="post" style="padding: 0px;margin: 0px;" name="profile" autocomplete="off">
<input name="pqRjnA" id="pqRjnA" value="" type="hidden">
<script type="text/javascript">
    key="pqRjnA";
</script>

My aim is to read the value of the variable key from this page using jsoup.
Is this possible with jsoup? If so, how?

asked Feb 15, 2013 at 22:58 by ravi; edited Nov 14, 2021 at 15:23 by Mahozad
  • You'd have to get the script content and then either parse it manually or see if you could use Rhino to get context out of an executed JS fragment. – Dave Newton, Feb 15, 2013 at 23:09
  • @Reimeus: No. The variable may be initialized somewhere else; here a value is only being assigned to the variable key. – ravi, Feb 15, 2013 at 23:19
  • Added the kotlin tag because a similar Kotlin question is marked as a duplicate and links to this question. – Mahozad, Nov 14, 2021 at 15:24

2 Answers

jsoup parses HTML but does not parse or execute JavaScript, so you have two ways to solve this:

A. Use a JavaScript library

  • Pro:

    • Full JavaScript support
  • Con:

    • Additional library / dependencies

B. Use Jsoup + manual parsing

  • Pro:

    • No extra libraries required
    • Enough for simple tasks
  • Con:

    • Not as flexible as a JavaScript library

Here's an example of how to get the key with jsoup and some "manual" code:

import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

Document doc = ...
Element script = doc.select("script").first(); // Get the first <script> element

// Regex for the value assigned to key; \s* tolerates whitespace around '='
Pattern p = Pattern.compile("(?is)key\\s*=\\s*\"(.+?)\"");
// Use html() (or data()) here, NOT text(): text() returns nothing for script content
Matcher m = p.matcher(script.html());

while (m.find())
{
    System.out.println(m.group());  // the whole match (key="value")
    System.out.println(m.group(1)); // value only
}

Output (using your HTML snippet):

key="pqRjnA"
pqRjnA
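
If the page may use single quotes or spaces around the equals sign, the regex approach above can be wrapped in a small helper. The following is only a minimal sketch: ScriptVarExtractor and extract are hypothetical names (they are not part of jsoup), and the helper takes the raw script text as a String, which with jsoup you would typically obtain from script.data(). It assumes a plain quoted string literal and does not handle escaped quotes inside the value.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of jsoup): finds a quoted string assigned to a variable in raw script text. */
final class ScriptVarExtractor {

    /**
     * Returns the string literal assigned to varName, or an empty Optional if no such assignment exists.
     * Tolerates whitespace around '=' and accepts single or double quotes.
     */
    static Optional<String> extract(String scriptSource, String varName) {
        // \b stops "key" from matching inside a longer name such as "monkey".
        // Group 1 is the opening quote, group 2 the value; \1 requires the same quote to close it.
        Pattern p = Pattern.compile("\\b" + Pattern.quote(varName) + "\\s*=\\s*([\"'])(.*?)\\1");
        Matcher m = p.matcher(scriptSource);
        return m.find() ? Optional.of(m.group(2)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Stand-in for the text jsoup would return for the <script> element in the question.
        String scriptSource = "\n    key=\"pqRjnA\";\n";
        System.out.println(extract(scriptSource, "key").orElse("<not found>")); // prints pqRjnA
    }
}
```

Returning an Optional instead of printing inside a loop makes the "variable not found" case explicit for the caller.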

A similar Kotlin question was marked as a duplicate and redirected here,
so here is how I did it in Kotlin:

val (key, value) = document
    .select("script")
    .map(Element::data)
    .first { "key" in it } // OR single { "key" in it }
    .split("=")
    .map(String::trim)
val pureValue = value.replace(Regex("""["';]"""), "")
println("$key::$pureValue") // key::pqRjnA

Another version:

val (key, value) = document
    .select("script")
    .first { Regex("""key\s*=\s*["'].*["'];""") in it.data() }
    .data()
    .split("=")
    .map { it.replace(Regex("""[\s"';]"""), "") }
println("$key::$value") // key::pqRjnA
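
One caveat with splitting on every "=": if the value itself contains "=" (a Base64 token, for example), the destructuring above keeps only the part before the second "=". A hedged Java sketch of the same idea that splits at the first "=" only is shown below; AssignmentSplitter and splitAssignment are hypothetical names, and the input is assumed to be a single assignment statement rather than a whole script.

```java
/** Hypothetical helper: splits one "name = 'value';" statement into its name and its unquoted value. */
final class AssignmentSplitter {

    static String[] splitAssignment(String statement) {
        // Limit 2 splits at the first '=' only, so any later '=' stays inside the value.
        String[] parts = statement.split("=", 2);
        if (parts.length < 2) {
            throw new IllegalArgumentException("No '=' found in: " + statement);
        }
        String name = parts[0].trim();
        String value = parts[1].trim()
                .replaceAll(";$", "")              // drop a trailing semicolon
                .replaceAll("^[\"']|[\"']$", "");  // drop one surrounding quote on each side
        return new String[] { name, value };
    }

    public static void main(String[] args) {
        String[] result = splitAssignment("key=\"pqRjnA\";");
        System.out.println(result[0] + "::" + result[1]); // prints key::pqRjnA
    }
}
```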

Footnote

To get the document you can do this:

  • From a file:
    val input = File("my-document.html")
    val document = Jsoup.parse(input, "UTF-8")
    
  • From a server:
    val document = Jsoup.connect("the/target/url")
        .userAgent("Mozilla")
        .get()
    

Source: Stack Overflow, "Parse JavaScript with jsoup" (tag: java)