Proper 'source code' Image Blocker & Replacer

§
Posted: 03 October 2023

Hi everybody. Here is a problem that is general, browser-agnostic and, to my knowledge, not solved by any add-on for any browser. I discovered user scripts recently and maybe the solution can be achieved that way (I don't know how to code). When you use any image-blocker add-on or a browser's built-in image blocker on a website, all images appear to be blocked. But if you then save the page with SingleFile, the legacy .maff page archiver, or a regular single-file .mhtml save, you will notice that the images that were blocked in the page view are nevertheless saved in the newly created file.

I can't find a way around it; the images always end up in the saved file. If I go offline, the page won't save. If I open the saved file offline with images blocked and save it again, the images still end up in the resaved file, just as when I saved the page online the first time.

I think that to resolve this, all image links in the page source should be replaced with invalid ones. (If you manually change an image link to an incorrect one in the browser's element inspector, that specific image will not be saved.) For example,
'https://exapmlepage.com/some_image.png' would be changed to
'https://0.0.0.0/exapmlepage.com/some_image.png'
This way the original link to the image is preserved, and the image is easy to retrieve later if needed by predictably correcting the link.

If this could be automated by a script for a predictable, easily editable list of file types (.jpg, .png, .gif, .mp4, etc.), it would at least become possible to save pages without image files embedded in them.
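A minimal sketch of the link rewriting described above, assuming it runs as a user script or is pasted into the browser console before the page is saved. The names `BLOCKED_EXTENSIONS`, `neutralizeUrl` and `restoreUrl` are hypothetical, and only plain `src` attributes are handled (not `srcset` or CSS backgrounds).

```javascript
// Hypothetical names; the 0.0.0.0 prefix follows the example in the post.
const BLOCKED_EXTENSIONS = ['jpg', 'jpeg', 'png', 'gif', 'webp', 'mp4']; // edit as needed
const DEAD_HOST = '0.0.0.0';

function hasBlockedExtension(url) {
    // Look only at the part before ? or #, so "pic.png?v=2" still counts.
    const path = url.split(/[?#]/)[0];
    const ext = path.slice(path.lastIndexOf('.') + 1).toLowerCase();
    return BLOCKED_EXTENSIONS.includes(ext);
}

function neutralizeUrl(url) {
    // https://host/path -> https://0.0.0.0/host/path
    const match = /^(https?:\/\/)(.+)$/.exec(url);
    if (!match || !hasBlockedExtension(url) || match[2].startsWith(DEAD_HOST + '/')) {
        return url; // not http(s), not a listed file type, or already rewritten
    }
    return `${match[1]}${DEAD_HOST}/${match[2]}`;
}

function restoreUrl(url) {
    // Undo the rewrite by dropping the dead host again.
    return url.replace(/^(https?:\/\/)0\.0\.0\.0\//, '$1');
}

// Only runs in a browser; skipped elsewhere.
if (typeof document !== 'undefined') {
    for (const el of document.querySelectorAll('img[src], video[src], source[src]')) {
        el.src = neutralizeUrl(el.src);
    }
}
```

Whether a given archiver then skips the rewritten links is not something this sketch can guarantee; it only automates the manual edit described in the post.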

---

But there is another problem. Although the first solution would enable saving pages without images, blocking the images can mess up the page structure and frames. To fix this (think of it as a separate script from the one above, dealing with the same type of problem in a different way), linked images should be exchanged for blank images of the same size. For example, a blank Full HD .png image (the color doesn't affect its size, so let's say 25% gray, 1920x1080px) is only about 400 bytes in file size. A saved page retaining all those blank images would therefore not be much bigger than one without any images at all, and the page's original format and frame structure would remain.
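One way to get a same-size blank image without hosting any file is sketched below. Note the swap: it uses an inline SVG `data:` URL rather than the .png described in the post, because the SVG text is nearly the same length for any pixel size. The function name `blankPlaceholder` and the gray value are assumptions.

```javascript
// Hypothetical helper: returns a data: URL for a solid-color SVG of the given size.
function blankPlaceholder(width, height, color = '#bfbfbf') {
    const svg =
        `<svg xmlns="http://www.w3.org/2000/svg" width="${width}" height="${height}">` +
        `<rect width="100%" height="100%" fill="${color}"/></svg>`;
    // Percent-encode so the markup is safe inside an attribute value.
    return `data:image/svg+xml,${encodeURIComponent(svg)}`;
}

const fullHd = blankPlaceholder(1920, 1080);
const icon = blankPlaceholder(16, 16);
// Only the digits of the size differ between the two URLs.
console.log(fullHd.length, icon.length);
```

The result could be assigned to an image's `src` once the original width and height are known.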

I have seen some websites that automatically resize their hosted images on request. For instance, the link to a certain image on such a site looks like this:
'somepage.com/some_image_200x200.png'. When you open that image in a new tab and change the size numbers '200x200' to any other value, the image is automatically resized. (The resizing happens on the server side: the server does not host a copy of the image in every possible size, there is only one image.) So when you change the image link to 'somepage.com/some_image_16x16.png' or 'somepage.com/some_image_10x10000.png', an image of that size opens in your browser tab.
Using this principle, we could replace all the image links on the page we want to save with that one image, resized over and over again, so that the page format is maintained because no image is actually missing from its frame.
All images (and videos) of any file type would be replaced with that one file format (.png), since only the size in pixels matters for not disrupting the page's look, not the file type.
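The size-in-the-URL principle can be sketched as a small string rewrite. This assumes the URL ends in a `_<width>x<height>.<extension>` token exactly as in the 'somepage.com' example; real sites use many different schemes, and `withSize` is a hypothetical name.

```javascript
// Hypothetical helper: swap the "_<w>x<h>" token that sits right before the extension.
function withSize(url, width, height) {
    return url.replace(/_(\d+)x(\d+)(\.[a-z0-9]+)$/i, `_${width}x${height}$3`);
}

const template = 'somepage.com/some_image_200x200.png';
console.log(withSize(template, 16, 16));    // somepage.com/some_image_16x16.png
console.log(withSize(template, 10, 10000)); // somepage.com/some_image_10x10000.png
```

For each image on the page, the template URL would be rewritten with that image's own width and height and used as the replacement link.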

Important: even in this case, the original link should be preserved in the page source after saving the page. The replacement image link should go in the 'img src=' attribute in the source code, and the original link should be stored in a 'data-original=' attribute, so it can be retrieved later from the saved page if needed.
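A minimal sketch of that attribute swap, assuming a DOM element (or anything with the same four attribute methods). `swapSource` and `restoreSource` are hypothetical names; 'data-original' is the attribute name proposed in the post.

```javascript
// Move the current src into data-original and point src at a placeholder.
function swapSource(el, placeholderUrl) {
    if (!el.hasAttribute('src')) return;
    // Keep only the first original, so running the script twice does not overwrite it.
    if (!el.hasAttribute('data-original')) {
        el.setAttribute('data-original', el.getAttribute('src'));
    }
    el.setAttribute('src', placeholderUrl);
}

// Put the original link back and drop the marker attribute.
function restoreSource(el) {
    if (!el.hasAttribute('data-original')) return;
    el.setAttribute('src', el.getAttribute('data-original'));
    el.removeAttribute('data-original');
}
```

In a browser this could be applied with `document.querySelectorAll('img[src]').forEach(img => swapSource(img, placeholderUrl))`. Note that `getAttribute('src')` returns the link as written in the page, which may be relative; reading `img.src` instead would store the absolute form.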

Those are my two bits about these problems. I hope someone wants to help, or at least share thoughts and suggestions for remedying these issues.

Best Regards

§
Posted: 20 October 2023

Translated with DeepL:
The image has to be loaded to know its width, so blocking the request and preserving the layout are not compatible.
If the image fails to load, its width and height cannot be set.
A placeholder image served from a URL (https://placehold.jp/150x150.png) gets downloaded.
I found that blob images are not downloaded.
This is the test code:


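javascript: (async () => {
    /* Block comments only: a bookmarklet may be collapsed onto one line. */
    /* Walk every element and keep each original reference in a data-* attribute. */
    for (const elm of document.querySelectorAll(`*`)) {
        /* Inline background images only; stylesheet rules are not covered. */
        if (elm.style.backgroundImage) {
            elm.dataset.originalBackgroundImage = elm.style.backgroundImage;
            elm.style.backgroundImage = ``;
        }
        if ([`IMG`, `VIDEO`].includes(elm.tagName) && elm.src) {
            elm.dataset.originalSrc = elm.src;
            /* Read the size before changing src; the image must already be loaded. */
            const { width, height } = elm;
            /* Otherwise a srcset candidate could still be used instead of src. */
            if (elm.srcset) {
                elm.dataset.originalSrcset = elm.srcset;
                elm.removeAttribute(`srcset`);
            }
            /* Note: a VIDEO cannot play this image; it only loses its original source. */
            elm.src = await createImageBlobURL(width, height);
        }
    }

    /* Draw a gray rectangle of the given size and return a blob: URL for it. */
    async function createImageBlobURL(width, height) {
        const canvas = document.createElement('canvas');
        /* toBlob() yields null for a 0x0 canvas, so use at least 1px per side. */
        canvas.width = Math.max(1, width);
        canvas.height = Math.max(1, height);
        const ctx = canvas.getContext('2d');
        ctx.fillStyle = 'gray';
        ctx.fillRect(0, 0, canvas.width, canvas.height);
        const mime = `image/jpeg`;
        const quality = 1.0;
        const blobURL = await new Promise(
            resolve => canvas.toBlob(
                (blob) => resolve(URL.createObjectURL(blob)),
                mime,
                quality
            )
        );
        return blobURL;
    }
})()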
