How to make multiple web.archive.org/save requests at once?

Posted: 2023-12-02
Edited: 2023-12-02

I have read their old API documentation, searched online, and analyzed a couple of other scripts that do the same thing, but they all make only a single fetch request, not multiple ones like I am trying to do. Is there a way to do this, or is there an API limitation I don't know about? Does the "concurrent captures limit (limit=3)" mean that I can only save 3 pages per minute?

Below is what I found out about their API.

Anonymous users have a lower concurrent captures limit (limit=3) compared to authenticated users (limit=5). The limit of daily captures for anonymous users is 5k. The size of screenshots is limited to 4MB; bigger screenshots are not allowed due to system overload. If a target site returns HTTP status=529 (bandwidth exceeded), we pause crawling that site for an hour. If a target site returns HTTP status=429 (too many requests), we pause crawling that site for a minute. All requests for the same host in that period get a relevant error message. Previously, we started these captures later, adding a delay of 20-30sec.
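Given the limit=3 noted above, one way to read it is "at most 3 captures in flight at once," not "3 per minute." Below is a minimal sketch (not from the original script) of capping in-flight work with a worker pool; `runWithLimit` is a hypothetical helper name, and the dummy tasks stand in for real capture calls (GM_xmlhttpRequest/fetch against `https://web.archive.org/save/...`):

```javascript
// Run async tasks with at most `limit` in flight at a time.
// Each "worker" repeatedly claims the next unstarted task until none remain,
// so no more than `limit` tasks ever run concurrently.
async function runWithLimit(tasks, limit) {
  const results = new Array(tasks.length);
  let next = 0; // shared index of the next unclaimed task
  async function worker() {
    while (next < tasks.length) {
      const i = next++;            // claim a task index
      results[i] = await tasks[i](); // run it and store the result in order
    }
  }
  // Start at most `limit` workers and wait for all of them to drain the queue.
  await Promise.all(Array.from({ length: Math.min(limit, tasks.length) }, worker));
  return results;
}

// Example with dummy tasks; real use would map page URLs to capture calls.
const demoTasks = [1, 2, 3, 4, 5].map(n => async () => n * 2);
runWithLimit(demoTasks, 3).then(results => console.log(results));
```

Results come back in input order even though completion order varies, which keeps the `index + 1` logging in the script below meaningful.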

    // Test URL: https://myanimelist.net/profile/hacker09
    // Fix HTTP 429 Too Many Requests by spacing captures 10s apart
    async function SaveToIA() {
      const urls = [/* relative page paths to archive (list elided in original) */];
      const SaveMALPages = await Promise.all(urls.map((url, index) => {
        return new Promise(resolve => {
          setTimeout(() => {
            console.log('request sent Pages Fetched ' + (index + 1));
            GM_xmlhttpRequest({ // Tampermonkey request; the call name was missing in the original
              url: `https://web.archive.org/save/${location.host}/` + url,
              method: 'GET',
              headers: {
                "content-type": "application/x-www-form-urlencoded"
              },
              onload: function(response) {
                console.log('request made Pages Fetched ' + (index + 1));
                if (response.status === 200) {
                  console.table([{'Pages Fetched': index + 1, "Archived page!": `https://web.archive.org/save/${location.host}/` + url, "Saved in": response.finalUrl}]);
                } else {
                  console.table([{'Pages Fetched': index + 1, "Archiving failed for page": `https://web.archive.org/save/${location.host}/` + url, "Status": response.status}]);
                  if (response.status === 429) {
                    console.error('Cool down! The I.A. is being rate limited!!');
                  }
                }
                resolve(response); // resolve so Promise.all can finish
              }
            });
          }, index * 10000); // stagger each capture by 10 seconds
        });
      }));

      await fetch(`https://api.allorigins.win/raw?url=https://anime.plus/${username}/queue-add`); // username is defined elsewhere in the script
      await fetch(`https://api.allorigins.win/raw?url=https://www.mal-badges.com/users/${username}/update`);

      const profileTextResponse = await (await fetch('https://myanimelist.net/editprofile.php')).text();
      const profileDocument = new DOMParser().parseFromString(profileTextResponse, 'text/html');
      GM_setValue("ProfileBBCodes", profileDocument.querySelectorAll("textarea")[1].value);
      console.log('program complete'); //close(); //Close the actual tab
    }
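The script above only logs when a 429 comes back; since the API notes say a 429 pauses crawling of that host for a minute, retrying after a wait is an option. This is a sketch under that assumption, with hypothetical names (`captureWithBackoff`, `doCapture` standing in for the real GM_xmlhttpRequest call):

```javascript
// Retry a capture after HTTP 429 with exponential backoff.
// doCapture is any function returning a promise of an object with a
// numeric `status` (e.g. a wrapped GM_xmlhttpRequest or fetch call).
async function captureWithBackoff(doCapture, maxRetries = 3, baseDelayMs = 60000) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const response = await doCapture();
    if (response.status !== 429) return response; // success or a non-rate-limit error
    // A 429 pauses crawling of that host for a minute (per the API notes),
    // so wait at least that long, doubling on each retry.
    await new Promise(r => setTimeout(r, baseDelayMs * 2 ** attempt));
  }
  throw new Error('Still rate limited after ' + maxRetries + ' retries');
}
```

Usage would be `await captureWithBackoff(() => saveOnePage(url))` inside the loop, instead of calling the capture directly.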

Posted: 2023-12-02
Edited: 2023-12-02

See the Google Doc "regstuff/Wayback Machine SPN2 API Docs".

You can also refer to this article, which does the same thing you're trying to do, though you'll need a translator to read it: https://qiita.com/yuki_2020/items/73307ddb2d286d79a5a9
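Per the SPN2 docs linked above, authenticated captures go through a POST to the save endpoint with an `Authorization: LOW accesskey:secret` header (keys from archive.org/account/s3.php), which also raises the concurrent limit to 5. A hedged sketch of building such a request; `buildSPN2Request` is a hypothetical helper, and the field names (`url`, `capture_all`) should be verified against the current docs:

```javascript
// Build fetch() options for an authenticated SPN2 capture request,
// as described in the SPN2 API docs. Returns a plain options object
// so the request shape can be inspected before sending.
function buildSPN2Request(targetUrl, accessKey, secretKey) {
  return {
    method: 'POST',
    headers: {
      'Accept': 'application/json',
      'Authorization': `LOW ${accessKey}:${secretKey}`, // S3-style API keys
      'Content-Type': 'application/x-www-form-urlencoded'
    },
    // capture_all=1 asks SPN2 to also capture error pages (per the docs)
    body: new URLSearchParams({ url: targetUrl, capture_all: '1' }).toString()
  };
}

// Usage (keys are placeholders):
// const resp = await fetch('https://web.archive.org/save', buildSPN2Request('https://myanimelist.net/profile/hacker09', ACCESS_KEY, SECRET_KEY));
```

The JSON response includes a job id that can be polled for capture status, which is how the docs suggest tracking multiple saves.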

Posted: 2023-12-02


Well, I'm pretty sure that it would capture way too much trash that I don't care about, and it would make the program take much longer to complete, so I don't think it would work.
