Extract links on page

§
Posted: 2014-05-27

Can someone make a script that extracts all links (or ones that follow a filter) from a page?

Something like

https://addons.mozilla.org/en-us/firefox/addon/link-gopher/

Now you might say, "Why not just use that?" Well, because it doesn't always work.

§
Posted: 2014-05-27

What would you want the output to look like?

The shortest path to a collection of links is: https://developer.mozilla.org/en-US/docs/Web/API/document.links

There may also be URLs launched using scripts associated with various elements in the page. Those are difficult to extract.
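For the document.links route, a minimal sketch might look like the following. The function name collectLinks and its optional substring filter are made up for illustration, and it only sees links that exist as real anchor or area elements, not the script-launched URLs mentioned above.

```javascript
// Sketch only: collectLinks and its "filter" argument are illustrative names,
// not part of any existing extension or script.
function collectLinks(doc, filter) {
  // doc.links holds the <a> and <area> elements that have an href attribute.
  const seen = new Set();
  const result = [];
  for (const el of Array.from(doc.links)) {
    const href = el.href;
    if (!href || seen.has(href)) continue;           // skip empties and duplicates
    if (filter && !href.includes(filter)) continue;  // optional substring filter
    seen.add(href);
    result.push(href);
  }
  return result;
}
```

In a browser console, something like collectLinks(document).join("\n") should print one URL per line, and collectLinks(document, ".zip") would keep only URLs containing ".zip".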

§
Posted: 2014-05-28
Edited: 2014-05-28

Well, just a blank page with the links in plain text would be fine with me (like the extension itself produces).

Basically, if I can see the links on the page, I want to be able to extract them.
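To make the "blank page with plain-text links" idea concrete, here is a rough sketch. The helper names (escapeHtml, renderPlainTextPage, showLinksInNewWindow) are invented for this example, and it assumes the page's links are ordinary anchors reachable through document.links.

```javascript
// Escape characters that would otherwise be interpreted as HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build a bare page that shows one URL per line inside a <pre> block.
function renderPlainTextPage(hrefs) {
  return "<html><head><title>Links</title></head><body><pre>" +
    hrefs.map(escapeHtml).join("\n") +
    "</pre></body></html>";
}

// Browser-only: open a new window and write the list into it.
// Popup blockers may return null from window.open(), hence the check.
function showLinksInNewWindow() {
  const hrefs = Array.from(document.links, function (a) { return a.href; });
  const w = window.open();
  if (w) {
    w.document.write(renderPlainTextPage(hrefs));
    w.document.close();
  }
}
```

Calling showLinksInNewWindow() from a bookmarklet or the console would then open the list. The escaping step matters because URLs often contain "&".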

§
Posted: 2014-05-28

Hmm, I found an old bookmarklet I created 8 years ago. Maybe this will tide you over?

To install a bookmarklet, copy the code, then right-click your Bookmarks Toolbar and choose New Bookmark, paste the code in Location, and give it a name like ListLinks.

javascript:function%20fwbr(str){return%20str.replace(/([&\?\*=])/g,"$1<wbr>");};loc=fwbr(location.href);x=document.getElementsByTagName("A");w=document.getElementsByTagName("AREA");y=window.open();y.document.write("<html><head><title>Links!</title><style>p{line-height:1.2em}%20a{text-decoration:none;border-bottom:1px%20dotted%20blue}%20img{border:none}</style></head>\n<body><h3>Anchor%20(&lt;A&gt;)%20Links%20from<br>"+loc+"</h3>\n");for(n=0;n<x.length;n++){if(x[n].href!=""){y.document.write("<p><a%20href=\""+x[n].href+"\">");if(x[n].textContent.replace(/\s+/,"").length<1){if(x[n].childNodes.length>0){for(j=0;j<x[n].childNodes.length;j++){if(x[n].childNodes[j].nodeName=="IMG"){y.document.write("<img%20src=\""+x[n].childNodes[j].src+"\"%20alt=\""+x[n].childNodes[j].alt+"\">");if(x[n].childNodes[j].alt!="")y.document.write("<br><em>Alt%20Text:</em>%20"+x[n].childNodes[j].alt);if(x[n].childNodes[j].title!="")y.document.write("<br><em>Title%20Tip:</em>%20"+x[n].childNodes[j].title);y.document.write("</a>");%20break;}else%20if(j+1==x[n].childNodes.length)%20y.document.write("[Background%20Image%20or%20Color%20Only]</a>");}}else%20y.document.write("[Background%20Image%20or%20Color%20Only]</a>");}else%20y.document.write(x[n].textContent.replace(/^\s+/,"")+"</a>");y.document.write("<br>\n"+fwbr(x[n].href));%20if(x[n].onclick){y.document.write("<br>\n(onclick%20event%20handler%20not%20shown)");};y.document.write("</p>\n");}}if(w.length>0){y.document.write("<h3>Image%20Map%20(&lt;AREA&gt;)%20Links%20from<br>"+loc+"</h3>\n");for(n=0;n<w.length;n++){if(w[n].href!="")%20y.document.write("<p><a%20href=\""+w[n].href+"\">"+fwbr(w[n].href)+"</a></p>");}}%20y.document.write("</body></html>");y.document.close();void%200;

As for doing a userscript, what kind of user interface are you looking for? It's not so convenient to use the "monkey menu" and I am not aware of a way to integrate with the right-click context menu. That leaves the option of adding a button into the page or defining a keyboard shortcut (with the potential for a conflict if it's not unique).
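The two options mentioned here (a button added into the page, or a keyboard shortcut) could be sketched roughly as below. The Ctrl+Alt+L combination, the function names, and the button styling are arbitrary choices for illustration. The shortcut in particular may collide with something the browser, the page, or the keyboard layout already uses.

```javascript
// Hypothetical shortcut: Ctrl+Alt+L (without Shift).
function isListLinksShortcut(event) {
  return Boolean(
    event.ctrlKey && event.altKey && !event.shiftKey &&
    String(event.key).toLowerCase() === "l"
  );
}

// Wire both triggers to the same callback.
function installTriggers(doc, onTrigger) {
  // Option 1: a small fixed-position button injected into the page.
  const button = doc.createElement("button");
  button.textContent = "List links";
  button.style.cssText = "position:fixed;bottom:8px;right:8px;z-index:99999";
  button.addEventListener("click", onTrigger);
  doc.body.appendChild(button);

  // Option 2: a keyboard shortcut handled at the document level.
  doc.addEventListener("keydown", function (event) {
    if (isListLinksShortcut(event)) onTrigger();
  });
}
```

In a userscript this might be called as installTriggers(document, someFunctionThatListsLinks), where the callback is whatever routine produces the list.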

§
Posted: 2014-05-29

Well, in the extension there's a button on the add-on bar; right-clicking it gives you the option to extract all links or to extract them by a filter.

Is it possible to do that with a Greasemonkey script?

§
Posted: 2014-05-29
> Is it possible to do that with a greasemonkey script?

It is possible, but not necessary (IMO ;)

§
Posted: 2014-05-29

I do not think user scripts can create their own toolbar buttons. On the Greasemonkey button drop-down, you will find "User Script Commands" if a script is enabled for the current page. So it's buried a bit. Still, if you do not need it very often, that might be convenient enough.
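Entries under "User Script Commands" are registered with GM_registerMenuCommand. A rough sketch with an "all links" command and a "by filter" command follows. The script name, namespace, command captions, and helper functions are invented for this example, and alert() is only a stand-in for a nicer output page.

```javascript
// ==UserScript==
// @name        List links (sketch)
// @namespace   example-list-links
// @include     *
// @grant       GM_registerMenuCommand
// ==/UserScript==

// Keep only the URLs that contain the pattern (case-insensitive).
// An empty pattern keeps everything.
function filterLinks(hrefs, pattern) {
  if (!pattern) return hrefs.slice();
  const needle = pattern.toLowerCase();
  return hrefs.filter(function (href) {
    return href.toLowerCase().includes(needle);
  });
}

// Browser-only: gather the page's links and show them.
function showLinks(pattern) {
  const hrefs = Array.from(document.links, function (a) { return a.href; });
  alert(filterLinks(hrefs, pattern).join("\n") || "No matching links");
}

// Register the commands only where the Greasemonkey API is available.
if (typeof GM_registerMenuCommand === "function") {
  GM_registerMenuCommand("Extract all links", function () { showLinks(""); });
  GM_registerMenuCommand("Extract links by filter", function () {
    const pattern = prompt("Show only links containing:");
    if (pattern !== null) showLinks(pattern);
  });
}
```

The filter here is a plain substring match. A regular expression would be more flexible, but it also makes it easier to mistype a pattern.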

§
Posted: 2014-05-29

By the way, this extension was updated on May 19th to version 1.3.3, so if you have an older version, you might want to test the current one.

§
Posted: 2014-06-01
Edited: 2014-06-01

I tested the new version; same problems.

I wish it worked properly, because it has exactly the functionality I want.

§
Posted: 2014-06-01
> i tested the new version, same problems
>
> i wish it worked properly because it has the exact functionality i want.

You did not say what problems you are having. You might be better off contacting the add-on's author and opening an issue about your problems.

§
Posted: 2014-06-06

The problem is that it doesn't always extract the links.

One instance is when the links are inside a [code] tag on a vBulletin forum: it just ignores them completely.
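One possible explanation, and it is only a guess, is that URLs inside a [code] block are rendered as plain text rather than as real anchor elements, so anything that relies on anchors never sees them. Under that assumption, a sketch that scans the visible text for http/https URLs could look like this. The function name and the regular expression are illustrative and deliberately simple.

```javascript
// Pull http/https URLs out of plain text, dropping duplicates.
function extractUrlsFromText(text) {
  // Match up to the next whitespace, quote, or angle bracket.
  const matches = text.match(/https?:\/\/[^\s<>"']+/g) || [];
  const seen = new Set();
  const urls = [];
  for (const raw of matches) {
    // Drop trailing punctuation that usually belongs to the sentence, not the URL.
    const url = raw.replace(/[.,;:!?)\]]+$/, "");
    if (url && !seen.has(url)) {
      seen.add(url);
      urls.push(url);
    }
  }
  return urls;
}
```

In a browser, extractUrlsFromText(document.body.textContent) would scan the whole page text. Stripping a trailing ")" can truncate URLs that legitimately end in a parenthesis, so treat the result as a starting point rather than an exact list.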
