
Extract certain data from a webpage as it loads

§
Posted: 2017-09-28
Edited: 2017-09-28

Hi,

I am writing a script to extract all ASINs from a page like the one below:

https://www.amazon.com/s/ref=srstprice-desc-rank?fst=as%3Aoff&rh=n%3A3760911%2Cp85%3A2470955011%2Cp6%3AATVPDKIKX0DER%2Cn%3A!11055981%2Cn%3A3777891&qid=1506623832&bbn=11055981&sort=price-desc-rank

I have tried

// ==UserScript==
// @name         v1script
// @namespace   v1script
// @version      0.1
// @description  asinscriptv1
// @author       me
// @include     *//*amazon.*/*
// @require      https://ajax.googleapis.com/ajax/libs/jquery/2.1.0/jquery.min.js
// ==/UserScript==

(function() {
 'use strict';

 function clog(x){console.log(x);}

 var ids = [];
 $(".Container").each(function(){
  var $me = $(this),
      $btn = $me.find("[data-result-rank=0]").first(),
      $asin = $btn.data("asin")  ;
     ids.push(asin.value());
 clog(response);

 "ids="+ids.join(","),
 clog(ids);

 });


})();

But I am not able to get the value of each "data-asin" on the page.

wOxxOm (Mod)
§
Posted: 2017-09-28
Edited: 2017-09-28
  • $(".Container") returns no elements for me.
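  • [data-result-rank=0] is not a valid selector: an unquoted attribute value must be a CSS identifier, which cannot start with a digit, so the value should be quoted, as in [data-result-rank="0"].
  • There's only one element with data-result-rank=0, so you don't need the =0 part if you want all ASINs.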
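
To illustrate the quoting point above, here is a minimal sketch. The helper name `attrSelector` is hypothetical (it is not part of jQuery or any library); it simply builds an attribute selector whose value is always quoted, so that numeric values such as 0 yield a valid selector.

```javascript
// Hypothetical helper (an assumption for illustration, not a library API):
// builds an attribute selector and always quotes the value.
function attrSelector(name, value) {
  // No value given: match any element that has the attribute at all.
  if (value === undefined) return `[${name}]`;
  // Escape backslashes and double quotes so the quoted string stays valid.
  const escaped = String(value).replace(/\\/g, '\\\\').replace(/"/g, '\\"');
  return `[${name}="${escaped}"]`;
}

console.log(attrSelector('data-result-rank', 0)); // [data-result-rank="0"]
console.log(attrSelector('data-asin'));           // [data-asin]
```

The result can be passed to document.querySelectorAll or to jQuery's $() in the usual way.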

The entire code:

// ==UserScript==
// @name    ASIN scraper
// @match   https://www.amazon.com/*
// ==/UserScript==

const asins = [...document.querySelectorAll('[data-asin]')].map(el => el.dataset.asin);
console.log(asins.join(','));

To handle the page-number navigation at the bottom of the Amazon page, use a MutationObserver, because the page is updated in place without being reloaded:

// ==UserScript==
// @name    ASIN scraper
// @match   https://www.amazon.com/*
// @require https://greasyfork.org/scripts/12228/code/setMutationHandler.js
// ==/UserScript==

setMutationHandler({
  selector: '[data-asin]',
  processExisting: true,
  handler: elements => {
    const asins = elements.map(el => el.dataset.asin);
    console.log(asins.join(','));
  },
});
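
For comparison, the same idea can be sketched with the browser's built-in MutationObserver and no @require. This is only a sketch under assumptions: the function names `asinsFrom` and `report` are made up for illustration, and it assumes the new results are inserted as added nodes somewhere under the document root. It is not the internals of setMutationHandler.

```javascript
// Collect non-empty data-asin values from a list of nodes and their subtrees.
function asinsFrom(nodes) {
  const out = [];
  for (const node of nodes) {
    // Only element nodes (nodeType 1) can carry data-* attributes.
    if (node.nodeType !== 1) continue;
    if (node.dataset && node.dataset.asin) out.push(node.dataset.asin);
    // An added subtree may contain further matching elements.
    for (const el of node.querySelectorAll('[data-asin]')) {
      if (el.dataset.asin) out.push(el.dataset.asin);
    }
  }
  return out;
}

// ASINs already reported, so each one is logged only once.
const seen = new Set();

function report(asins) {
  const fresh = [];
  for (const asin of asins) {
    if (!seen.has(asin)) {
      seen.add(asin);
      fresh.push(asin);
    }
  }
  if (fresh.length) console.log(fresh.join(','));
  return fresh;
}

// Browser-only part: process what is already there, then watch for additions.
if (typeof document !== 'undefined') {
  report(asinsFrom([document.documentElement]));
  new MutationObserver(mutations => {
    for (const m of mutations) report(asinsFrom(m.addedNodes));
  }).observe(document.documentElement, { childList: true, subtree: true });
}
```

Observing the whole document with subtree: true is the simplest choice but also the noisiest; if the results container is known, observing only that element would be cheaper.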
§
Posted: 2017-09-29

The first code is working perfectly.

But when I navigate between page numbers it doesn't capture the ASINs. I tried using a MutationObserver, but it still can't get the value of each "data-asin".

Also, I forgot to mention in my question that with the current code I have to click page 1, page 2, page 3, etc. manually to get the ASINs. I need the script itself to do this: click Next Page (selector "pagnNextString"), get all ASINs (copy them to the clipboard or store them somewhere), then click Next Page again, get the ASINs, and so on.
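
As a rough sketch of the "store somewhere" part only, the ASINs from each page could be merged into one comma-separated string in localStorage, which survives both in-place page changes and reloads on the same origin. The function name `appendAsins` and the storage key 'asins' are assumptions made for illustration; clicking Next Page is not covered here.

```javascript
// Hypothetical helper: merge new ASINs into a comma-separated list kept in a
// Storage-like object (anything with getItem/setItem, e.g. localStorage).
function appendAsins(storage, key, asins) {
  // Read what is stored so far; an absent key is treated as an empty list.
  const known = new Set((storage.getItem(key) || '').split(',').filter(Boolean));
  for (const asin of asins) known.add(asin);
  // A Set keeps insertion order, so earlier pages stay first.
  const joined = [...known].join(',');
  storage.setItem(key, joined);
  return joined;
}

// Browser-only usage: store the ASINs found on the current page.
if (typeof document !== 'undefined' && typeof localStorage !== 'undefined') {
  const pageAsins = [...document.querySelectorAll('[data-asin]')]
    .map(el => el.dataset.asin)
    .filter(Boolean);
  console.log(appendAsins(localStorage, 'asins', pageAsins));
}
```

The accumulated list can then be read back at any time with localStorage.getItem('asins').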

wOxxOm (Mod)
§
Posted: 2017-09-29

In that case you'd be better off using Amazon's search API.

§
Posted: 2017-09-29

I am looking to do this via Greasemonkey. With the current code I can get all ASINs as required; I just need to click "Next Page" from code and extract the ASINs for the other pages. However, the MutationObserver isn't working, so I'm stuck.

§
Posted: 2017-09-30

@wOxxOm ?

wOxxOm (Mod)
§
Posted: 2017-09-30

What? I'm not interested.

§
Posted: 2017-10-01
Edited: 2017-10-01

That's fair, @wOxxOm, I cannot force you :smile: I am trying hard to get my code working correctly. Thanks.

§
Posted: 2017-10-01
Edited: 2018-01-02

(Speechless)
