
DownloadAllContent

Fetch and download the main content of the current page, with special support for Chinese novels

Version from 22/12/2021. Check out the most recent version.

Author
hoothin
Ratings
0 0 0
Version
2.5.9
Created
23/11/2016
Updated
22/12/2021
Compatibility
Compatible with Firefox Compatible with Chrome Compatible with Opera Compatible with Safari
License
MIT
Works on
All sites

A lightweight crawling script for downloading the main content of a web page. In theory it works on any non-Ajax novel site, forum, etc., without writing any site-specific rules.

The script will automatically retrieve the main content on the page and download it.

If you are on a novel directory page, it will traverse all the chapters, sort them, and save them as a TXT file.

Script Github

If it helps you, buy me a coffee via PayPal.Me


Operation Instructions / Usage

  • Open the novel catalog page or forum content page
  • Press CTRL+F9 or click the command menu
  • Press SHIFT+CTRL+F9 to download current single page only
  • About configuration items
    • The following options are entered through the Tampermonkey command menu
    • Custom download with directory range: e.g. https://xxx.xxx/book-[20-99].html,https://xxx.xxx/book-[01-10].html, which means download book-20.html through book-99.html and book-01.html through book-10.html; writing [1-10] produces numbers without the leading 0
    • Custom download via chapter selector: enter the CSS selector of the chapter links to download; it may be followed by a URL-replacement rule and JS code
    • Interference code: fill in the CSS selector of the interfering elements, such as .mask,.ksam, which deletes every element whose class is mask or ksam
    • Reorder by title name: if true, all links on the catalog page are sorted by title name before being saved to the TXT file; otherwise they keep their order of appearance on the page
  • Custom example
    1. po18: the chapter selector is .l_chaptname>a. After entering it and downloading, you will find that the body content cannot be fetched through the chapter URL; the body is served under articlescontent instead of articles. Append @@articles@@articlescontent (separated by @@) to replace articles in the chapter URL with articlescontent. In summary, .l_chaptname>a@@articles@@articlescontent fits this site. The find part ('articles') may be a regular expression; for example, @@articles(\d+)@@$1content replaces "articles1", "articles2", etc. in the link with "1content", "2content"
    2. pixiv: the chapter selector is main>section ul>li>div>a and the link needs no replacement, so the next two items (link find & replace) are left blank, giving six consecutive @ signs. The content lives in a meta tag, so custom code is needed to extract the content field of the meta-preload data. In summary: main>section ul>li>div>a@@@@@@var noval=JSON.parse(data.querySelector("#meta-preload-data").content).novel;noval[Object.keys(noval)[0]].content; With this custom code you can download pixiv novels. Here "data" is the document of the fetched page; if the API returns plain text, use data.body.innerText to get it.
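The [a-b] directory-range syntax described above can be sketched as follows. Note `expandRange` is a hypothetical helper written for illustration, not a function exposed by the script; it assumes a leading zero in the lower bound (e.g. [01-10]) signals zero padding, and [1-10] does not.

```javascript
// Hypothetical sketch of the [a-b] range expansion described above.
// Given "https://xxx.xxx/book-[01-03].html", produce book-01.html ... book-03.html,
// preserving zero padding only when the lower bound is written with a leading 0.
function expandRange(template) {
  const m = template.match(/\[(\d+)-(\d+)\]/);
  if (!m) return [template];          // no range marker: return the URL as-is
  const [, lo, hi] = m;
  const pad = lo.length;              // "01" -> pad to 2 digits; "1" -> no padding
  const urls = [];
  for (let i = parseInt(lo, 10); i <= parseInt(hi, 10); i++) {
    const n = lo.startsWith('0') ? String(i).padStart(pad, '0') : String(i);
    urls.push(template.replace(m[0], n)); // m[0] is the literal "[lo-hi]" text
  }
  return urls;
}
```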
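The @@find@@replace rule from the po18 example can be sketched like this. `rewriteChapterUrl` is a hypothetical illustration of the behavior described above (the find part treated as a regular expression, so capture groups such as (\d+) can be referenced as $1), not the script's actual implementation.

```javascript
// Hypothetical sketch of the "selector@@find@@replace" rule described above.
// The find part is treated as a regular expression, so capture groups
// such as (\d+) can be referenced as $1 in the replacement.
function rewriteChapterUrl(url, rule) {
  const [, find, replace] = rule.split('@@'); // drop the selector, keep find/replace
  if (!find) return url;                      // no replacement rule given
  return url.replace(new RegExp(find), replace);
}
```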

Test webpage / Test case