Paper Clip (Save as HTML, Markdown and Text)

Edit and save a selection as a clean HTML, Markdown or Text file optimized for printing. Hotkey: Command + Shift + S.

Author: schimon
Daily installs: 0
Total installs: 113
Ratings: 1 0 0
Version: 24.01.29
Created: 2023-05-10
Updated: 2024-01-29
License: MIT
Applies to: All sites

📎 Paper Clip

Save selected content to a clean HTML, Markdown or Plain Text file.

This userscript saves the common (root) HTML element of a given selection, together with all of its child elements, into a valid, print-optimized (X)HTML, Markdown or Text file. It is useful when you want a compact, quick reference without extra resources and media.
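As a rough illustration of what "common (root) element of a selection" means, the sketch below finds the deepest node that contains two given nodes. It is not taken from the script's source: `ancestorsOf` and `commonRoot` are hypothetical names, and plain objects with a `parentNode` property stand in for DOM nodes so the example runs outside a browser. In a browser, the standard `Range.commonAncestorContainer` property gives the same answer for a selection range.

```javascript
// Collect a node and all of its ancestors, nearest first.
function ancestorsOf(node) {
  const chain = [];
  for (let n = node; n; n = n.parentNode) chain.push(n);
  return chain;
}

// Return the deepest node containing both `a` and `b`,
// or null if they do not share a tree.
function commonRoot(a, b) {
  const seen = new Set(ancestorsOf(a));
  for (let n = b; n; n = n.parentNode) {
    if (seen.has(n)) return n;
  }
  return null;
}

// Plain objects stand in for DOM nodes; only `parentNode` is used.
const body = { name: "body", parentNode: null };
const article = { name: "article", parentNode: body };
const heading = { name: "h1", parentNode: article };
const paragraph = { name: "p", parentNode: article };
const footer = { name: "footer", parentNode: body };

console.log(commonRoot(heading, paragraph).name); // article
console.log(commonRoot(heading, footer).name);    // body
```

A selection that starts in the heading and ends in the paragraph would therefore be saved as the whole article element, including everything inside it.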


Features

  • Cut text;
  • Edit text as you type;
  • Instant save (no waiting time);
  • Remove all stylesheets and potential distractions;
  • Optimized for annotation, notes and printing;
  • Easy to manipulate XHTML, Markdown or Text;
  • Strip attributes of tags;
  • Omit privacy-compromising content (frames, media, scripts etc.);
  • Media URLs are kept as references in Site Information (see example below);
  • The resulting file size is as small as possible.
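The cleaning steps in the list above (strip attributes, omit frames, media and scripts, keep media URLs as references) can be sketched on a simplified tree model. Everything here is an assumption made for illustration: the node shape `{ tag, attrs, children }`, the `OMITTED_TAGS` list, the choice to keep `href` so hyperlinks survive, and the function name `cleanTree`. The actual script works on live DOM elements and may differ in all of these details.

```javascript
// Tags dropped in this sketch (an assumed list, not the script's own).
const OMITTED_TAGS = new Set([
  "script", "style", "iframe", "frame", "audio", "video", "img", "embed", "object",
]);

// Attributes kept in this sketch; keeping `href` is an assumption.
const KEPT_ATTRIBUTES = new Set(["href"]);

// Hypothetical node model: text is a string, an element is
// { tag, attrs, children }. Returns the cleaned node (or null if the
// node was omitted) plus the list of media URLs found along the way.
function cleanTree(node, media = []) {
  if (typeof node === "string") return { node, media };

  if (OMITTED_TAGS.has(node.tag)) {
    // Keep the URL as a reference instead of the element itself.
    const src = node.attrs && node.attrs.src;
    if (src) media.push({ tag: node.tag, url: src });
    return { node: null, media };
  }

  const attrs = {};
  for (const [name, value] of Object.entries(node.attrs || {})) {
    if (KEPT_ATTRIBUTES.has(name)) attrs[name] = value;
  }

  const children = [];
  for (const child of node.children || []) {
    const cleanedChild = cleanTree(child, media).node;
    if (cleanedChild !== null) children.push(cleanedChild);
  }
  return { node: { tag: node.tag, attrs, children }, media };
}

const page = {
  tag: "div",
  attrs: { class: "post", style: "color:red" },
  children: [
    {
      tag: "p",
      attrs: { id: "intro" },
      children: [
        "Hello ",
        { tag: "a", attrs: { href: "https://example.org/", onclick: "track()" }, children: ["link"] },
      ],
    },
    { tag: "audio", attrs: { src: "https://example.org/a.mp3" }, children: [] },
    { tag: "script", attrs: {}, children: ["alert(1)"] },
  ],
};

const cleaned = cleanTree(page);
console.log(JSON.stringify(cleaned.node));
console.log(cleaned.media); // one entry, for the audio URL
```

Collecting the media URLs while the tree is walked, rather than in a second pass, means the references are available for the Site Information block as soon as cleaning finishes.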

Work in Progress

  • Send text annotations via email or Jabber/XMPP.
  • IRC and Matrix buttons will work in the next update.

Motivation

This script was written for the following reasons:

  • Some save-page extensions are not available for the Falkon web browser.
  • The maintained save-page extensions are designed to save large portions of a webpage in order to make an authentic copy, so unnecessary data (e.g. CSS, fonts, images etc.) is pulled in, and the resulting file can range from 500 KB to 5 MB and above.
  • The FocusWriter word processor ignores hyperlinks, so copying and pasting has to be done meticulously, which is time-consuming and not always accurate.
  • LibreOffice takes time to load, so a copy-and-paste task can take 30 to 60 seconds.
  • ODT files are often larger than an average HTML file of the same content.

Comparison

Tested on this page. This is a comparison of commonly available methods, each with its default behaviour:

Software        Complete    Page      Selection
Save HTML       -           14 KiB    4 KiB
Wget            -           24 KiB    -
Falkon          350 KiB     24 KiB    -
Writer (ODT)    -           33 KiB    32 KiB
SingleFile      600 KiB     -         580 KiB
Save Page WE    1.3 MiB     -         -

Example Site Information

Tag                       Value
url                       https://www.corbettreport.com/5thgen/
date                      Tue May 09 2023 15:29:39 GMT+0200
creator                   i2p.schimon.paperclip
user-agent                Mozilla/5.0 (Wayland; Linux postmarketOS) Falkon/23.04.3
content-type-sourced      text/html
charset-sourced           UTF-8
viewport-imported         width=device-width,initial-scale=1
description-imported      We are in the middle of a world-changing war. . . .
generator-imported        Publii Open-Source CMS for Static Site
extracted-media-audio     https://www.corbettreport.com/mp3/episode441_5th_gen.mp3?_=1
extracted-media-iframe    https://odysee.com/$/embed/@corbettreport:0/ep441-5thgen:0
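One plausible way to carry tag/value pairs like these in a saved (X)HTML file is as `<meta>` elements in the document head. The source does not say how the script stores them, so the sketch below is only an assumption; `escapeAttribute` and `siteInfoToMeta` are hypothetical helper names. Pairs are passed as an array rather than an object so that a tag such as `extracted-media-audio` can appear more than once.

```javascript
// Escape characters that are unsafe inside a double-quoted attribute value.
function escapeAttribute(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Turn [tag, value] pairs into self-closing (XHTML-style) meta elements,
// one per line.
function siteInfoToMeta(entries) {
  return entries
    .map(([name, content]) =>
      `<meta name="${escapeAttribute(name)}" content="${escapeAttribute(content)}"/>`)
    .join("\n");
}

const siteInfo = [
  ["url", "https://www.corbettreport.com/5thgen/"],
  ["creator", "i2p.schimon.paperclip"],
  ["charset-sourced", "UTF-8"],
  ["extracted-media-audio", "https://www.corbettreport.com/mp3/episode441_5th_gen.mp3?_=1"],
];

console.log(siteInfoToMeta(siteInfo));
```

Escaping `&` first matters: doing it last would re-escape the ampersands introduced by the other replacements.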

Recommended Userscripts

📜 DownloadAllContent

Fetch and download the main content of the current page, with special support for Chinese novels.

✍️ Edit Text on Webpage

Enable or disable the ability to edit any text on a given webpage by clicking a button in the top right corner of the screen.

🧼 WebEraser

Permanently erase parts of any webpage (annoyances, logos, ads, images, etc.) with just Ctrl + Left-Click.


Upcoming changes

  • Aggregate mode; select part by part and save them all at once into a document; toolbar always shown as long as mode isn't done or cancelled.
  • Editable elements (element.contentEditable = "true");
  • Remove elements;
  • Convert embedded elements (e.g. iframe) to links; (cancelled)
  • Save links of embedded elements (e.g. iframe) to meta tags; (cancelled)
  • Omit elements with no content (e.g. <tag></tag>); (done)
  • Save images; (cancelled; may be relevant to HTMLZ)
  • Multiple modes (Markdown, PDF, Screenshot);
  • Save to HTMLZ;
  • Bookmarklet.
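The item marked as done above, omitting elements with no content, can be illustrated on a simplified tree model. This is a sketch under stated assumptions and not the script's code: the node shape `{ tag, children }`, the `VOID_TAGS` set and the name `pruneEmpty` are all hypothetical. An element counts as empty here when, after pruning its children, it holds nothing but whitespace.

```javascript
// Elements that are empty by definition and are kept anyway
// (an assumed subset, chosen for this sketch).
const VOID_TAGS = new Set(["br", "hr"]);

// Hypothetical node model: text is a string, an element is { tag, children }.
// Returns the pruned node, or null when nothing of substance remains.
function pruneEmpty(node) {
  if (typeof node === "string") return node;
  if (VOID_TAGS.has(node.tag)) return node;

  const children = (node.children || [])
    .map(pruneEmpty)
    .filter((child) => child !== null);

  // Whitespace-only text does not count as content.
  const hasContent = children.some(
    (child) => typeof child !== "string" || child.trim() !== ""
  );
  return hasContent ? { ...node, children } : null;
}

const draft = {
  tag: "div",
  children: [
    { tag: "p", children: [] },                             // <p></p>
    { tag: "span", children: [" "] },                       // whitespace only
    { tag: "p", children: ["Kept text"] },
    { tag: "div", children: [{ tag: "b", children: [] }] }, // empty once <b> is gone
  ],
};

console.log(JSON.stringify(pruneEmpty(draft)));
// {"tag":"div","children":[{"tag":"p","children":["Kept text"]}]}
```

Pruning children before testing the parent lets wrappers that only contain empty elements disappear in the same pass.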

🦅 Designed for Falkon web browser

📱 Designed for postmarketOS Linux