
DownloadAllContent

Fetches and downloads the main content of the current page, with special support for Chinese novels.

Version dated 29/12/2021. See the latest version.

Author
hoothin
Rating
0 0 0
Version
2.6.1
Created on
23/11/2016
Updated on
29/12/2021
Compatibility
Compatible with Firefox, Chrome, Opera, and Safari
License
MIT
Applies to
All sites

A lightweight crawling script for downloading the main content of a web page. In theory it works on any non-Ajax novel website, forum, etc., without writing any site-specific rules.

The script will automatically retrieve the main content on the page and download it.

If you are on a novel catalog page, it will traverse all the chapters, sort them, and save them as a single TXT file.
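The sorting step can be illustrated with a small sketch. This is not the script's actual code; the function name and data shape are assumptions, but it shows how chapter links collected from a catalog page can be ordered by title with numeric-aware comparison, matching the "reorder by title name" option described below:

```javascript
// Hedged sketch: order chapter entries by title using natural (numeric-aware)
// comparison, so that "Chapter 2" sorts before "Chapter 10".
// sortChaptersByTitle and the {title, url} shape are illustrative assumptions,
// not the script's internals.
function sortChaptersByTitle(chapters) {
  const collator = new Intl.Collator(undefined, { numeric: true });
  // Copy before sorting so the original page order is preserved.
  return [...chapters].sort((a, b) => collator.compare(a.title, b.title));
}

const chapters = [
  { title: "Chapter 10", url: "/c10" },
  { title: "Chapter 2",  url: "/c2"  },
  { title: "Chapter 1",  url: "/c1"  },
];
console.log(sortChaptersByTitle(chapters).map(c => c.title));
// → [ 'Chapter 1', 'Chapter 2', 'Chapter 10' ]
```

When the option is off, the entries would simply be kept in the order they appear on the page.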

If it helps you, buy me a coffee via PayPal.Me.

Script on GitHub

Stream links from cloud storage


Operation Instructions

  • Open a novel catalog page, forum content page, or any other page (such as the current page).
  • Press CTRL+F9, or click the entry in the command menu.
  • Press SHIFT+CTRL+F9 to download only the current single page (the catalog will not be fetched).
  • About the configuration items
    • The following functions are accessed through the Greasemonkey command menu.
    • Custom download with directory range: for example, https://xxx.xxx/book-[20-99].html,https://xxx.xxx/book-[01-10].html means download book-20.html through book-99.html, and book-01.html through book-10.html. Writing [1-10] (without the leading zero) means the numbers are not zero-padded.
    • Custom download via chapter selector: enter the CSS selector of the chapter links to download; it can be followed by URL replacement code and JS code.
    • Interference code: fill in the CSS selector of the interfering elements, e.g. .mask,.ksam deletes every element whose class is mask or ksam.
    • Reorder by title name: if true, all links on the catalog page are sorted by title name before being saved to the TXT file; otherwise they are kept in page order.
  • Custom examples
    1. po18: the chapter selector is .l_chaptname>a. After entering it and downloading, you will find that the body content cannot be fetched through the chapter URL; the body is actually served from articlescontent. Append @@articles@@articlescontent (fields separated by @@) to replace articles in each chapter URL with articlescontent. In summary, .l_chaptname>a@@articles@@articlescontent fits this site. The first field ('articles') can be a regular expression; for example, @@articles\d+@@$1content means to replace "articles1", "articles2", etc. in the link with "1content", "2content".
    2. pixiv: the chapter selector is main>section ul>li>div>a, and the link does not need replacing, so the next two fields (link match and replacement) are left blank, giving six @ in a row. The content lives in a meta tag, so custom code is needed to extract the content item from the meta-preload data. In summary: main>section ul>li>div>a@@@@@@var noval=JSON.parse(data.querySelector("#meta-preload-data").content).novel;noval[Object.keys(noval)[0]].content; With this custom code you can download pixiv novels. "data" is the document of the fetched page; use data.body.innerText to get the text if the API returns plain text only.
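The directory-range syntax described above can be sketched as a small expansion function. This is a hedged illustration, not the script's own code: the function name and the exact padding rule (a leading zero in the start bound enables zero-padding) are assumptions based on the description.

```javascript
// Hedged sketch of expanding a "[20-99]"-style directory range into URLs.
// expandRangePattern is an illustrative name, not part of DownloadAllContent.
function expandRangePattern(pattern) {
  const match = pattern.match(/\[(\d+)-(\d+)\]/);
  if (!match) return [pattern]; // no range token: return the URL as-is
  const [token, startStr, endStr] = match;
  const start = parseInt(startStr, 10);
  const end = parseInt(endStr, 10);
  // Assumption: a leading zero in the start bound (e.g. [01-10])
  // switches on zero padding to that width; [1-10] does not pad.
  const pad = startStr.startsWith("0") ? startStr.length : 0;
  const urls = [];
  for (let i = start; i <= end; i++) {
    const num = pad ? String(i).padStart(pad, "0") : String(i);
    urls.push(pattern.replace(token, num)); // replaces the first occurrence
  }
  return urls;
}

console.log(expandRangePattern("https://xxx.xxx/book-[20-22].html"));
// → [ 'https://xxx.xxx/book-20.html', 'https://xxx.xxx/book-21.html', 'https://xxx.xxx/book-22.html' ]
```

Each expanded URL would then be fetched and its main content appended to the output, in range order.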

Test case