Extract certain data from a webpage as it loads

§
Posted: 28.09.2017
Edited: 28.09.2017

Hi,

I am writing a script to extract all ASINs from a page like the one below:

https://www.amazon.com/s/ref=srstprice-desc-rank?fst=as%3Aoff&rh=n%3A3760911%2Cp85%3A2470955011%2Cp6%3AATVPDKIKX0DER%2Cn%3A!11055981%2Cn%3A3777891&qid=1506623832&bbn=11055981&sort=price-desc-rank

I have tried

// ==UserScript==
// @name         v1script
// @namespace   v1script
// @version      0.1
// @description  asinscriptv1
// @author       me
// @include     *//*amazon.*/*
// @require      https://ajax.googleapis.com/ajax/libs/jquery/2.1.0/jquery.min.js
// ==/UserScript==

(function() {
 'use strict';

 function clog(x){console.log(x);}

 var ids = [];
 $(".Container").each(function(){
  var $me = $(this),
      $btn = $me.find("[data-result-rank=0]").first(),
      $asin = $btn.data("asin")  ;
     ids.push(asin.value());
 clog(response);

 "ids="+ids.join(","),
 clog(ids);

 });


})();

But I am not able to get the value of each "data-asin" on the page.

wOxxOm (Mod)
§
Posted: 28.09.2017
Edited: 28.09.2017
  • $(".Container") returns no elements for me.
  • [data-result-rank=0] is not a valid selector; the value should be quoted: [data-result-rank="0"].
  • There's only one element with data-result-rank=0, so you don't need the =0 part if you want all ASINs.

The entire code:

// ==UserScript==
// @name    ASIN scraper
// @match   https://www.amazon.com/*
// ==/UserScript==

const asins = [...document.querySelectorAll('[data-asin]')].map(el => el.dataset.asin);
console.log(asins.join(','));

To handle the page-number navigation at the bottom of the Amazon page, use a MutationObserver, because the page is changed in place without being reloaded:

// ==UserScript==
// @name    ASIN scraper
// @match   https://www.amazon.com/*
// @require https://greasyfork.org/scripts/12228/code/setMutationHandler.js
// ==/UserScript==

setMutationHandler({
  selector: '[data-asin]',
  processExisting: true,
  handler: elements => {
    const asins = elements.map(el => el.dataset.asin);
    console.log(asins.join(','));
  },
});
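
For reference, roughly the same idea can be sketched with a plain MutationObserver instead of the setMutationHandler wrapper. This is only a minimal sketch, not the wrapper's actual implementation: the helper name collectNewAsins is made up for illustration, and rescanning the whole document on every mutation is an assumption chosen for simplicity, not for speed.

```javascript
// ==UserScript==
// @name    ASIN scraper (native observer sketch)
// @match   https://www.amazon.com/*
// ==/UserScript==

// Hypothetical helper: returns the ASINs from `elements` that are not yet in
// `seen`, and records them in `seen`. Empty or missing values are skipped.
function collectNewAsins(elements, seen) {
  const fresh = [];
  for (const el of elements) {
    const asin = el.dataset && el.dataset.asin;
    if (asin && !seen.has(asin)) {
      seen.add(asin);
      fresh.push(asin);
    }
  }
  return fresh;
}

// Only wire up the observer when a DOM is available (i.e. in the browser).
if (typeof document !== 'undefined' && typeof MutationObserver !== 'undefined') {
  const seen = new Set();
  const scan = () => {
    const fresh = collectNewAsins(document.querySelectorAll('[data-asin]'), seen);
    if (fresh.length) console.log(fresh.join(','));
  };
  scan(); // handle elements that are already on the page
  // Rescan whenever nodes are added or removed anywhere in the document.
  new MutationObserver(scan).observe(document.documentElement, {
    childList: true,
    subtree: true,
  });
}
```

Because already-seen ASINs are remembered in a Set, each one should be logged only once even though the whole page is rescanned after every change.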
§
Posted: 29.09.2017

The first code is working perfectly.

But when I navigate between page numbers, it doesn't capture the ASINs. I tried using a MutationObserver, but it still can't get the value of each "data-asin".

I also forgot to mention in my question: with the current code, I have to click page 1, page 2, page 3, etc. manually to get the ASINs. I need the script to do this itself: click Next Page (selector "pagnNextString"), get all ASINs (copy them to the clipboard or store them somewhere), click Next Page again, get the ASINs, and so on.
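
The loop described above could be sketched roughly as follows. Several things here are assumptions rather than anything confirmed in this thread: that pagnNextString is an element id, that clicking it updates the results in place, that a fixed 3-second wait is enough for the new results to appear, and that the script manager provides GM_setClipboard. The names addAsins, scrapeAllPages, MAX_PAGES, and WAIT_MS are hypothetical.

```javascript
// ==UserScript==
// @name    ASIN scraper (pagination sketch)
// @match   https://www.amazon.com/*
// @grant   GM_setClipboard
// ==/UserScript==

const MAX_PAGES = 5;   // hypothetical safety cap on how many pages to visit
const WAIT_MS = 3000;  // assumed time for the next page of results to appear

// Adds every non-empty ASIN to `store` and returns how many were new.
function addAsins(store, asins) {
  let added = 0;
  for (const asin of asins) {
    if (asin && !store.has(asin)) {
      store.add(asin);
      added++;
    }
  }
  return added;
}

const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

// Collects ASINs from the current page, clicks "Next Page", waits, and repeats.
async function scrapeAllPages() {
  const store = new Set();
  for (let page = 0; page < MAX_PAGES; page++) {
    const onPage = [...document.querySelectorAll('[data-asin]')]
      .map(el => el.dataset.asin);
    addAsins(store, onPage);
    // Assumption: "pagnNextString" is the id of the Next Page element.
    const next = document.querySelector('#pagnNextString');
    if (!next) break; // no Next Page element found, so stop
    next.click();
    await sleep(WAIT_MS); // crude fixed wait instead of detecting the update
  }
  return [...store];
}

// Only run in the browser, where a DOM is available.
if (typeof document !== 'undefined') {
  scrapeAllPages().then(asins => {
    const text = asins.join(',');
    console.log(text);
    // Copy to the clipboard only if the script manager exposes this function.
    if (typeof GM_setClipboard === 'function') GM_setClipboard(text);
  });
}
```

If the click causes a full page reload instead of an in-place update, the script restarts and the collected set is lost, so this sketch would not work in that case without persisting the set somewhere.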

wOxxOm (Mod)
§
Posted: 29.09.2017

In that case you'd be better off using the Amazon search API.

§
Posted: 29.09.2017

I am looking to do this via Greasemonkey. With the current code I can get all ASINs as required; I just need to click "Next Page" from code and extract the ASINs for the other pages. However, the MutationObserver isn't working, so I'm stuck.

§
Posted: 30.09.2017

@wOxxOm ?

wOxxOm (Mod)
§
Posted: 30.09.2017

What? I'm not interested.

§
Posted: 01.10.2017
Edited: 01.10.2017

That's fair, @wOxxOm, I cannot force you :smile: I am trying hard to get my code working correctly. Thanks.

§
Posted: 01.10.2017
Edited: 02.01.2018

(silence)
