I am attempting to create a Chrome extension that finds "Sponsored" posts on Facebook and removes them.

While doing this, I noticed some rather bizarre behavior of Google Chrome on Facebook, where certain queries for existing elements (in my case document.querySelector('a[href*="/ads/about"]');) would return null. But if you "inspect"-click them (using the Inspect Tool or CTRL+SHIFT+C), they show up in DevTools, and running the query again in the console then returns the element. This happens without any scrolling, moving, resizing, or doing anything else to the page.

This can easily be replicated using the instructions above, but for the sake of clarity, I made the following video, which shows exactly the weird behavior:

https://streamable.com/mxsf86

Is this some sort of DOM-querying caching issue? Have you ever encountered anything similar? Thanks.

EDIT: the issue has now been reduced to the query returning null up until the element is hovered, and it's not a DevTools-related issue anymore.

Asked Aug 27, 2020 at 17:44 by iuliu. Edited Aug 27, 2020 at 18:14 by Lajos Arpad.
  • "and then running the query again in the console will show the element." sounds like the first time you run the code the element is simply not there and you need to wait for it to be added to the DOM. – VLAZ, Aug 27, 2020 at 17:46
  • After seeing the video: are you sure that link doesn't show up on click or longer mouseover or whatever? – VLAZ, Aug 27, 2020 at 17:48
  • Perhaps the div you're trying to get is built using React Portals and placed in another DOM tree. – Pablo Darde, Aug 27, 2020 at 17:48
  • @VLAZ I am running the query some 3-4 seconds after I actually see it on the page... so how could it "not be added to the DOM" yet? – iuliu, Aug 27, 2020 at 17:50
  • The "Sponsored" label is a role="button" element with tabindex 0, which reloads content on click and hover. You can see it in the network tab as well. The <a> element is simply not there before that. It does not matter whether you hover it with DevTools open or not. – Lain, Aug 27, 2020 at 18:07

3 Answers

As already noticed, the sponsored links are simply not in the DOM until some mouse event occurs. Once that event occurs, the elements are added to the DOM. Supposedly, this is how Facebook avoids being crawled too easily.

So, if your goal is to find the sponsored links, you will need to do the following:

  • find out exactly which event causes the links to be added
  • experiment until you find out how to generate that event programmatically
  • implement a crawling algorithm that scrolls the wall for a long while and then triggers that event. At that point you might get many sponsored links
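
The steps above could be sketched roughly as follows. This is only an illustration under assumptions, not a tested recipe: the helper name revealAndFind and its makeEvent parameter are made up for this example, "mouseover" is a guess at the triggering event, the [role="button"] candidate selector is taken from the comments under the question, and the link selector is the one from the question. The page may well ignore synthetic events or insert the links asynchronously, in which case the second query would still come back empty.

```javascript
// Sketch only: the triggering event and the candidate selector are
// assumptions that have to be confirmed by experiment.
function revealAndFind(root, candidateSelector, linkSelector, makeEvent) {
  // Dispatch the (assumed) triggering event on every candidate element.
  root.querySelectorAll(candidateSelector).forEach((el) => {
    el.dispatchEvent(makeEvent());
  });
  // Query again; only links that have already been inserted are returned.
  return Array.from(root.querySelectorAll(linkSelector));
}

// Browser-only usage; skipped where no DOM is available.
if (typeof document !== 'undefined') {
  const found = revealAndFind(
    document,
    '[role="button"]',
    'a[href*="/ads/about"]',
    () => new MouseEvent('mouseover', { bubbles: true })
  );
  console.log(found.length + ' sponsored link(s) found so far');
}
```

Passing the event factory in as a parameter makes it easy to try other event types (for example "mouseenter" or "focus") while experimenting.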

Note: sponsored links are paid for by companies, and they would not be very happy if their ad slots were used up by uninterested bots.

The approach I took to solve this issue is as follows:

// using an IIFE ("Immediately-Invoked Function Expression"):
(function() {
    'use strict';

// using Arrow function syntax to define the callback function
// supplied to the (later-created) mutation observer, with
// two arguments (supplied automatically by that mutation
// observer), the first 'mutationList' is an Array of
// MutationRecord Objects that list the changes that were
// observed, and the second is the observer that observed
// the change:
const nodeRemoval = (mutationList, observer) => {

  // here we use Array.prototype.forEach() to iterate over the
  // Array of MutationRecord Objects, using an Arrow function
  // in which we refer to the current MutationRecord of the
  // Array over which we're iterating as 'mutation':
  mutationList.forEach( (mutation) => {

    // if the mutation.addedNodes property exists and
    // also has a non-falsy length (zero is falsey, numbers
    // above zero are truthy and negative numbers - while truthy -
    // seem invalid in the length property):
    if (mutation.addedNodes && mutation.addedNodes.length) {

        // here we retrieve a list of nodes that have the
        // "aria-label" attribute-value equal to 'Advertiser link':
        mutation.target.querySelectorAll('[aria-label="Advertiser link"]')
          // we use NodeList.prototype.forEach() to iterate over
          // the returned list of nodes (if any) and use (another)
          // Arrow function:
          .forEach(
            // here we pass a reference to the current Node of the
            // NodeList we're iterating over, and use
            // ChildNode.remove() to remove each of the nodes:
            (adLink) => adLink.remove() );
    }
  });
},
      // here we retrieve the <body> element (since I can't find
      // any element with a predictable class or ID that will
      // consistently exist as an ancestor of the ad links):
      targetNode = document.querySelector('body'),

      // we define the types of changes we're looking for:
      options = {
          // we're looking for changes amongst the
          // element's descendants:
          childList: true,
          // we're not looking for attribute-changes:
          attributes: false,
          // we also observe all descendants of the target
          // (if this is false, or absent, we look only to
          // changes/mutations on the target element itself):
          subtree: true
},
      // here we create a new MutationObserver, and supply
      // the name of the callback function:
      observer = new MutationObserver(nodeRemoval);

    // here we specify what the created MutationObserver
    // should observe, supplying the targetNode (<body>)
    // and the defined options:
    observer.observe(targetNode, options);

})();

I realise that in your question you're looking for elements that match a different attribute and attribute-value (document.querySelector('a[href*="/ads/about"]')). That attribute-value didn't match my own situation, so I couldn't use it in my code, but adapting it should be as simple as replacing:

mutation.target.querySelectorAll('[aria-label="Advertiser link"]')

With:

mutation.target.querySelector('a[href*="/ads/about"]')

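Note, though, that querySelector() returns only the first node that matches the selector, or null if nothing matches. Unlike the NodeList returned by querySelectorAll(), that result has no forEach() method, so the chained .forEach() call would throw; either check the result for null before calling remove() on it, or keep querySelectorAll() and change only the selector.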
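
As a minimal sketch of those two options (the helper names removeFirstMatch and removeAllMatches are hypothetical and not part of the code above):

```javascript
// Remove the first match, if any. querySelector() returns a single
// element or null, so the result is checked before it is used.
function removeFirstMatch(target, selector) {
  const link = target.querySelector(selector);
  if (link !== null) {
    link.remove();
    return 1;
  }
  return 0;
}

// Remove every match. querySelectorAll() returns a (possibly empty)
// NodeList, so no null check is needed.
function removeAllMatches(target, selector) {
  const links = target.querySelectorAll(selector);
  links.forEach((link) => link.remove());
  return links.length;
}
```

Inside the observer callback, either helper could be called with mutation.target and 'a[href*="/ads/about"]'; both return the number of nodes removed.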

While there may look to be quite a bit of code above, uncommented this becomes merely:

(function() {
    'use strict';

const nodeRemoval = (mutationList, observer) => {
  mutationList.forEach( (mutation) => {
    if (mutation.addedNodes && mutation.addedNodes.length) {
        mutation.target.querySelectorAll('[aria-label="Advertiser link"]').forEach( (adLink) => adLink.remove() );
    }
  });
},
      targetNode = document.querySelector('body'),
      options = {
          childList: true,
          attributes: false,
          subtree: true
},
      observer = new MutationObserver(nodeRemoval);

    observer.observe(targetNode, options);

})();

References:

  • Array.prototype.forEach().
  • Arrow Functions.
  • ChildNode.remove().
  • MutationObserver() Interface.
  • NodeList.prototype.forEach().

I ran into the same issue in Chrome. If it helps anyone, I solved it by accessing the element through its frame:

window.frames["myframeID"].document.getElementById("myElementID")
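
If it isn't known in advance which frame holds the element, something like the following sketch could be used. The helper name findInFrames is made up for this example, "myElementID" is the same placeholder as above, and only the top-level document and its direct child frames are searched.

```javascript
// Look for an element by ID in the top-level document first, then in each
// direct child frame. Accessing .document on a cross-origin frame throws,
// so such frames are skipped.
function findInFrames(win, elementId) {
  const direct = win.document.getElementById(elementId);
  if (direct) return direct;
  for (let i = 0; i < win.frames.length; i++) {
    try {
      const hit = win.frames[i].document.getElementById(elementId);
      if (hit) return hit;
    } catch (err) {
      // Cross-origin frame: its document is not accessible from here.
    }
  }
  return null;
}

// Browser-only usage; skipped where no DOM is available.
if (typeof window !== 'undefined') {
  console.log(findInFrames(window, 'myElementID'));
}
```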

Tags: javascript - Document.querySelector returns null until element is inspected using DevTools - Stack Overflow