Feature request: Extract links as the user scrolls down (general-purpose, any site)
I understand what you mean, but that would take a good amount of time to code, plus I think there's a better way.
1 make a button that runs document.querySelectorAll("a[href*='http']") and reads .href from each matched link (the list itself has no .href)
2 push each href into an array, or loop over the matches with a for, then display all the URLs from that page in the browser console or somewhere else on the page.
3 Even if the page auto-loads more content, just click the button again to pick up the newly loaded links
like this
var links = document.querySelectorAll("a[href*='http']"); //query once instead of on every pass
for (var i = links.length; i--;)
{
console.log(links[i].href);
}
or add the links to an array
var animeidResult = []; //Creates a new blank array
var animeid = document.querySelectorAll("a[href*='http']");
for (var i = 0; i < animeid.length; i++) { //Starts the for condition
animeidResult.push(animeid[i].href); //Add The animeid To The Array
}
console.log(animeidResult.join('\n')); //display the links
I'd prefer a semi-automated method that doesn't need a button press: as I surf, it auto-extracts links.
Then put the code I posted inside a function and call that function as you scroll the page
window.addEventListener('scroll',FunctionName);
I chose the first one, this wasn't so bad:
window.addEventListener('scroll',ExtractLinks); function ExtractLinks() { for (var i =document.querySelectorAll("a[href*='http']").length; i--;) { console.log(document.querySelectorAll("a[href*='http']")[i].href); } }
Wait a minute, it is only getting some of the links. It does not work with relative paths (a href="../"), let me edit that...
Ah, there we go:
window.addEventListener('scroll',ExtractLinks); function ExtractLinks() { for (var i =document.querySelectorAll("a[href]").length; i--;) { console.log(document.querySelectorAll("a[href]")[i].href); } }
Ack, that is spitting out duplicates, but not all of the links.
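For the duplicates part, one option is to remember which URLs were already logged and skip them on later scroll events. This is only a sketch: seenLinks and collectNewLinks are made-up names, not something from the posts above, and it does not address links that are missing for other reasons.

```javascript
// Hypothetical names: seenLinks and collectNewLinks are not from the thread.
var seenLinks = new Set(); // every URL that has already been logged

// Takes any array-like of elements with an .href and returns only new URLs.
function collectNewLinks(anchors) {
  var fresh = [];
  for (var i = 0; i < anchors.length; i++) {
    var url = anchors[i].href; // in the browser, .href is the resolved absolute URL
    if (url && !seenLinks.has(url)) {
      seenLinks.add(url);
      fresh.push(url);
    }
  }
  return fresh;
}

function ExtractLinks() {
  collectNewLinks(document.querySelectorAll("a[href]")).forEach(function (url) {
    console.log(url);
  });
}

// Only attach the listener when running inside a browser page.
if (typeof window !== "undefined") {
  window.addEventListener("scroll", ExtractLinks);
}
```

The scroll event fires many times per second, so without the Set every link would be printed again on each event.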
@hacker09 IDK how selectors work. I'm testing this on github's search page, and the issue is that it extracts some links but ignores others:
window.addEventListener('scroll',ExtractLinks); function ExtractLinks() { for (var i =document.querySelectorAll("a[href*='http']").length; i--;) { console.log(document.querySelectorAll("a[href*='http']")[i].href); } for (var i =document.querySelectorAll("a[href*='/']").length; i--;) { console.log(document.querySelectorAll("a[href*='/']")[i].href); } }
I suck at programming these types of things
Ok, I think this works:
window.addEventListener('scroll',ExtractLinks); function ExtractLinks() { for (var i =document.querySelectorAll("a").length; i--;) { console.log(document.querySelectorAll("a")[i].href); } for (var i =document.querySelectorAll("img").length; i--;) { console.log(document.querySelectorAll("img")[i].src); } }
If you want this script to work only on github, you can add .match to
console.log(document.querySelectorAll("a")[i].href);
console.log(document.querySelectorAll("img")[i].src);
But you need to write a regex for links like
https://github.com/Username/RepositoryName
https://github.com/Username1/RepositoryName2
https://github.com/hacker09/Scripts123
If you don't know regex, ask here https://webchat.freenode.net/#regex
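As a starting point, a regex for the repository links listed above could look like the sketch below. The pattern and the name isRepoUrl are assumptions for illustration: it accepts any URL with exactly two path segments after github.com, so it will also accept two-segment pages that are not repositories.

```javascript
// Assumed pattern: https://github.com/<user>/<repo> with an optional trailing slash.
var repoPattern = /^https:\/\/github\.com\/([^\/?#]+)\/([^\/?#]+)\/?$/;

// Hypothetical helper: true when the URL has the repository shape.
function isRepoUrl(url) {
  return repoPattern.test(url);
}

// Usage inside the existing loop, using .match as suggested above:
// if (links[i].href.match(repoPattern)) { console.log(links[i].href); }
```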
Then it's pretty easy to turn links in the Github repository URL format:
https://github.com/Username/RepositoryName
into links that download the files: https://github.com/Username/RepositoryName/archive/master.zip
^That redirects to links like: https://codeload.github.com/Username/RepositoryName/zip/master.zip
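The conversion described above can be sketched as a small function. toArchiveUrl is a made-up name, and the code assumes the branch is called master, as in the example URLs; repositories with a different default branch would need that part changed.

```javascript
// Hypothetical helper: turns a repository URL into its archive download URL.
// Assumes the branch is named "master", as in the example above.
function toArchiveUrl(repoUrl) {
  var m = repoUrl.match(/^https:\/\/github\.com\/([^\/?#]+)\/([^\/?#]+)\/?$/);
  if (!m) {
    return null; // not in the Username/RepositoryName format
  }
  return "https://github.com/" + m[1] + "/" + m[2] + "/archive/master.zip";
}

console.log(toArchiveUrl("https://github.com/Username/RepositoryName"));
```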
Crud, when tested on tumblr, the images aren't logged. Test: https://kujoushino.tumblr.com/post/619891748515266560
deleted
I said tumblr, not twitter.
deleted
Add this to your function, but this is returning the images in the reverse order, I'm not sure why
for (var i =document.querySelector("div[class*='photoset'] > iframe").contentDocument.querySelectorAll("img").length; i--;)
{
console.log(document.querySelector("div[class*='photoset'] > iframe").contentDocument.querySelectorAll("img")[i].src);
}
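The reverse order comes from the loop itself: it starts at the list's length and counts down with i--, so the last image is logged first. A sketch that counts up instead is below; logSrcsInOrder is a made-up name, and the selector is the same one used above.

```javascript
// Hypothetical helper: logs .src of each element from first to last.
function logSrcsInOrder(images, log) {
  for (var i = 0; i < images.length; i++) {
    log(images[i].src);
  }
}

// Only touch the page when running inside a browser.
if (typeof document !== "undefined") {
  var frame = document.querySelector("div[class*='photoset'] > iframe");
  // contentDocument is null for cross-origin iframes, so check before using it.
  if (frame && frame.contentDocument) {
    logSrcsInOrder(frame.contentDocument.querySelectorAll("img"), console.log);
  }
}
```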
Uhh, just today, it is no longer logging in the console log.
EDIT: Nevermind, was using the mobile twitter URL.
I'm not good with advanced programming (only the basics of JS), but anyway, can someone make a general-purpose version of this: https://greasyfork.org/en/discussions/requests/57590-extract-links-to-tweet-and-media-on-twitter-as-you-scroll-down
It should extract not only links to external sites, but also links to images and any external files.
This is also very useful for infinite-scrolling AND pagination, because console logs can persist as you navigate to another page. This would certainly help with saving pages to the WBM. I really hate having to extract links one page at a time, and would like it to auto-extract as I go from page to page. It should execute on page load and when scrolling, in case of JS-loaded content (otherwise it only grabs links that are in the initial HTML, not ones generated by code)
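One way to cover both page load and JS-loaded content is to run the extraction once and then again whenever nodes are added to the page, using a MutationObserver instead of relying only on scroll. This is a sketch under assumptions: runOnLoadAndChanges is a made-up name, and the observer constructor is passed in as a parameter only so the function can be tried outside a browser.

```javascript
// Hypothetical helper: calls extract() immediately, then again each time
// new nodes are added anywhere under root.
function runOnLoadAndChanges(root, extract, ObserverCtor) {
  extract(); // links present in the initial HTML
  var observer = new ObserverCtor(function (mutations) {
    var added = mutations.some(function (m) {
      return m.addedNodes.length > 0;
    });
    if (added) {
      extract(); // links generated later by code
    }
  });
  observer.observe(root, { childList: true, subtree: true });
  return observer;
}

// Logs every link and image URL currently on the page.
function ExtractLinks() {
  document.querySelectorAll("a[href], img[src]").forEach(function (el) {
    console.log(el.href || el.src);
  });
}

// Only wire things up when running inside a browser page.
if (typeof document !== "undefined") {
  runOnLoadAndChanges(document.body, ExtractLinks, MutationObserver);
}
```

Writing to the console does not change the page, so the observer does not trigger itself.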
I wanted to automate saving github pages. Thanks to the URL format, I don't have to go to each repository page just to get the download link: