Hit_scraper with hit export script added CUZ IT'S MORE CONVENIENT! Here are a few guides, one on mturkforum.com and one on mturkgrind.com. There is also a screencast video located here where I ramble on about hit scraper for a time, giving a good overview of all the functionality it offers.
Additionally, there is a script by clickhappier located here that uses scraper's blocklist to block hits on the regular mturk search results interface as well.
v2.0 Major Update
Please see below for updates past v2.0, including graphical changes and other features.
I've been doing a lot of work (with copious help from clickhappier and others), and I figure enough's been done to go for a major version release. You'll find the changelog down below, but a major rundown of all features follows.
What is Hit Scraper WITH EXPORT and why should I download it?
Hit Scraper WITH EXPORT (hereafter referred to as HS) at its core is really just a different way of looking at mturk pages. Its purpose was to take the place of several other scripts people were using every day, and to make a unified, easy-to-understand interface that everyone can use with minimal training. That being said, HS still has a ton of features to enhance your turking and make your life a lot easier.
How do I use HS?
To use HS, you need to visit This URL. Bookmark it so you don't forget. If HS doesn't load right away, try refreshing a few times. If it still doesn't load, there might be an issue and I'll try to see if I can figure it out.
When you get to that page, you'll see the main interface. (This photo is actually very old. I will be redoing it at some point.) The interface should be pre-populated with some default data. You can start going right away by clicking "Start", or you can customize it as shown below.
|Setting||Default / Type||Description|
|Auto-refresh delay||0||How many seconds will elapse before the page starts scraping again. 0 is manual scrape only. EG 10 = scrape 10 seconds after the last scrape finished|
|Pages to scrape||3||How many pages you want HS to look at. Default is 3 pages|
|Correct for skips||No||If you have a lot of entries on your blocklist, many results may get blocked, leaving you with few hits to look at. "Correct for skips" will search additional pages to "fill up" your results. If correct for skips is off, it will ONLY search the number of pages you select in "Pages to scrape"|
|Minimum batch size||100 (not specified)||For searching for batches. This does not matter unless you sort by most available.|
|Minimum reward||None||Minimum dollar reward you want HS to show. EG 1 = don't show hits under $1; .2 = don't show hits under $0.20|
|Qualified||Yes if logged in, No if logged out||If yes, only show hits you're qualified for. If no, show all hits regardless of whether you qualify|
|Masters Require||No||If yes, only show masters hits. If no, show all hits|
|Masters Show||Show||If set to "Show", it will show both masters and non-masters hits (not applicable if you don't have masters and have "qualified" checked). If set to "hide", it will remove masters hits from the results|
|Sort types||Latest||Latest sorts by time created, newest first. Most available sorts by number of hits available, most first. Reward sorts by monetary reward, highest first. Title sorts alphabetically by title, A first|
|Invert||No||Reverses the order of the sort type. Latest = oldest hits first; Most available = fewest hits available first; Reward = lowest reward first; Title = Z first|
|New HIT Highlighting||300||Hits that are new to the scrape show up in bold. This number determines how long they will remain that way, in seconds.|
|Sound on new hit||No||Play a sound when a new hit is discovered. The sound is only played once for each "screen" of new hits. For example, if two new hits are found in one scrape, the sound will play once. If one of the hits goes away, but the other remains, and it's still new based on the New HIT Highlighting number, the sound will not play because it already has.|
|Ding||Ding||Which sound you want to hear: the old-style "Ding", or the new-style "Squee" (best pony approved)|
|Sort by TO pay||No||Sorts hits by TO pay with lowest numbers on top, highest numbers on the bottom, and "No TO" requesters at the very bottom. When selected, you get the option to change the sort between ascending and descending.|
|Sort by TO overall||No||Sorts hits based on Feihtality's TO calculations. They're wizardry; I'm not even really sure how they work. I think they take all the TO categories, as well as the number of reviews and weightings, into account.|
|Min pay TO||None||Allows you to set a minimum "Pay" TO threshold (0-5). Any hits with a "Pay" TO below that threshold will be hidden. You can click on the "Show hits below TO threshold" button to see them. This button only appears if you're using this option. See important note below.|
|Hide no TO||No||Hides requesters who do not have a TO (not recommended). See important note below.|
|Disable TO||No||Turns off TO checking altogether; the TO Pay column will report "TO Disabled". Use this when TO is blocked. It should speed things up a bit by not querying the TO server. Any other TO settings have no effect while this is on.|
|Display export buttons||No||Shows/hides different export buttons. If, for example, you only wish to export to IRC, you can select only that button, and it will hide the rest. Note that, if logged out, VB is disabled regardless of whether it is selected or not. See below for an explanation.|
|Search Terms||None||Allows you to search mturk for given terms. This is the same as searching the mturk interface. All results will contain one or more of your terms.|
|Restrict to includelist||No||Allows you to only show requesters on your "include list". You must have an include list set before using this option or you will get no results. It will do normal searches, but any requester NOT on your include list will be ignored.|
|Hide blocklisted||Yes||Enables/disables the blocklist. If you are not using the blocklist, any hits that WOULD have been blocked are outlined in red.|
|Highlight Includelisted||No||Adds a highlight to any requester on your include list, even when "use includelist" is not checked.|
|Hide Panel||Button||Hides everything above the buttons to give you more room. It's a toggle, so clicking it once will hide, once will show.|
|Edit Blocklist||Button||Opens the blocklist for manual editing if you need to remove a name or something. Blocklist and include list items are delimited by the ^ symbol.|
|Edit Includelist||Button||Opens the include list for manual editing to add or remove requesters. Blocklist and include list are delimited by the ^ symbol|
|Edit Current Theme||Button||Opens the theme editor on the right. Mouseover each box to see what it refers to, click to get a color selector. Also, click "revert to default" to go back to default settings (a rescrape may be required)|
|Settings||Button||Change the way the TO calculations work, enable/disable the blocklist wildcard functionality.|
|Show TO-hidden hits||Hidden Button||See Min pay TO|
|Stopped||Status message||Shows you the status of hit scraper, if it's stopped, scraping, running, waiting, etc|
|Status messages||Status message||Very "dumb" status message indicator attempting to shed some light on why some things work and others don't, and also why hit scraper's doing something it "shouldn't be".|
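To make the sort and minimum reward settings above a bit more concrete, here is a small Python sketch of the ordering rules. It is only an illustration, not HS's actual code: the `Hit` fields, the `created` timestamp, and the `filter_and_sort` function are made-up names, and it assumes "Latest" means newest first, which is what the Invert row implies.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    title: str
    reward: float    # dollars
    available: int   # number of hits available
    created: int     # hypothetical timestamp, larger = newer


# Each sort type maps to (key function, reverse flag) for its default order.
SORT_TYPES = {
    "Latest": (lambda h: h.created, True),            # newest first
    "Most available": (lambda h: h.available, True),  # most first
    "Reward": (lambda h: h.reward, True),             # highest first
    "Title": (lambda h: h.title.lower(), False),      # A first
}


def filter_and_sort(hits, sort_type="Latest", invert=False, min_reward=None):
    """Drop hits under min_reward, then sort; invert flips the default order."""
    if min_reward is not None:
        hits = [h for h in hits if h.reward >= min_reward]
    key, reverse = SORT_TYPES[sort_type]
    return sorted(hits, key=key, reverse=(reverse != invert))
```

For example, `filter_and_sort(hits, "Reward", min_reward=0.2)` would drop anything under $0.20 and put the highest-paying hit first, while adding `invert=True` would put the lowest-paying remaining hit first.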
Some of the elements in the settings list have informative mouseover text as well.
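The "New HIT Highlighting" and "Sound on new hit" behavior from the settings list can be sketched roughly like this. Again, this is a guess at the logic written in Python, not the script's real code: `NewHitTracker`, `update`, and the idea of identifying hits by an id string are all assumptions, and the sketch assumes a hit that disappears and later comes back is not treated as new again.

```python
import time


class NewHitTracker:
    def __init__(self, highlight_seconds=300):
        self.highlight_seconds = highlight_seconds
        self.first_seen = {}  # hit id -> time the hit was first scraped

    def update(self, hit_ids, now=None):
        """Record one scrape. Returns (ids to show in bold, play_sound)."""
        now = time.time() if now is None else now
        play_sound = False
        for hit_id in hit_ids:
            if hit_id not in self.first_seen:
                self.first_seen[hit_id] = now
                # One sound per scrape, no matter how many new hits there are.
                play_sound = True
        # A hit stays bold until highlight_seconds have passed since first seen.
        bold = {h for h in hit_ids
                if now - self.first_seen[h] < self.highlight_seconds}
        return bold, play_sound
```

With this logic, two new hits in one scrape trigger a single sound, and a later scrape where one of them is still around (and still bold) stays quiet because nothing in it is unseen.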
The hit table comes under the status information. It's laid out like so:
|Column||Links to||What it shows||Mouseover text|
|Requester||Requester Page||Shows the requester name and links to their page. R and T buttons allow for blocking Requester and Title respectively||None|
|Title||Hit preview page OR requester page||Shows the hit preview page if one can be created/viewed, OR the requester page if one cannot. Will note if the requester link is substituted. VB and IRC buttons open the hit exporter for forums and IRC respectively||Description of hit|
|Reward||None||Shows how much the hit pays||None|
|HITs Available||None||Shows how many hits are available at the time the page was scraped||None|
|TO pay||Requester TO page||Shows the TO value for "pay" for that requester||Shows all TO ratings, number of reviews, and number of TOS flags for that requester|
|Accept HIT||Requester Preview and Accept (PANDA) page OR requester page||Similarly to the "title", it shows the panda link OR the requester page. See "title" to know if the requester link is substituted||None|
|M?||None||N means a non-masters hit, Y means a masters hit||Shows all qualifications for the hit|
|R||HitDB search for requester OR nothing||If green, you've done a hit that matches that requester name; click it to view. If red, you haven't, and clicking does nothing||None|
|T||HitDB search for title OR nothing||If green, you've done a hit that matches that title; click it to view. If red, you haven't, and clicking does nothing||None|
|Not Qualified||None||Shows hits you are not qualified for. Only shows up if there are non-qual'd hits||None|
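Since the R and T buttons add requesters and titles to the blocklist, and the blocklist is stored as one string delimited by the ^ symbol, here is a rough Python sketch of how such a list could be read and applied. The function names, the dict keys, and the case-insensitive exact matching are assumptions made for illustration; the real script may match differently (for example, with the wildcard option in Settings).

```python
def parse_list(raw):
    """Split a '^'-delimited list into trimmed, lowercase entries."""
    return {item.strip().lower() for item in raw.split("^") if item.strip()}


def classify(hit, blocklist, hide_blocklisted=True):
    """Return 'show', 'hide', or 'outline' for one hit.

    `hit` is a dict with hypothetical 'requester' and 'title' keys.
    """
    blocked = (hit["requester"].lower() in blocklist
               or hit["title"].lower() in blocklist)
    if not blocked:
        return "show"
    # With "Hide blocklisted" off, blocked hits stay visible, outlined in red.
    return "hide" if hide_blocklisted else "outline"
```

So a blocklist of `Bad Requester^Spam survey` would hide any hit from "Bad Requester" or titled "Spam survey", or just outline it in red if "Hide blocklisted" is turned off.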