Google SERP Scraper

Scrape Google search results, then view, filter, and export them (JSON, CSV, Markdown, URL list). Configurable.

Author: StonedKhajiit
Daily installs: 1
Total installs: 2
Ratings: 0 0 0
Version: 0.1.0
Created: 16/05/2025
Updated: 16/05/2025
Size: 185 KB
License: MIT

Extract structured data from Google Search Engine Results Pages (SERPs) through a straightforward, configurable user interface. This script lets you scrape search results manually, view them as raw JSON or as a filterable list with a preview pane, and export them in several formats (JSON, CSV, Markdown, URL list).

Features:

  • Manual Scraping: Click a button to scrape results from the currently loaded Google SERP.
  • Floating UI Panel: A draggable, minimizable, and maximizable panel for all operations.
  • Dual View Modes:
    • JSON View: Inspect raw scraped data.
    • List View: Browse results in a list with an integrated preview pane for details.
  • Filtering: Filter scraped results by keywords across multiple fields.
  • Multiple Export Options:
    • Copy to Clipboard: JSON, URL List, Markdown.
    • Download as File: JSON, CSV, URL List (.txt), Markdown (.md).
  • Configurable Settings:
    • Custom Title Selector: Define your own CSS selector for result titles, with a built-in Selector Tester to verify and preview matches directly on the page.
    • Data Fetching Control: Choose which data fields to extract (Title, URL, Site Name, Breadcrumbs, Description, Highlighted Keywords, Date Info).
    • Export Field Customization: Select specific fields to include in CSV and Markdown exports.
    • UI Preferences: Dark Mode, show/hide preview pane, filter area, download actions.
    • Highlighting Options: Temporarily highlight scraped items on the page or highlight a selected list item on the page.
    • Debug mode and output options.
  • Context Menu for List Items: Right-click on a result in the list view for quick actions (copy specific fields, open URL, highlight on page).
  • URL Decoding: Automatically decodes URLs, including Punycode for internationalized domain names (IDNs).
  • Date Parsing: Attempts to parse various date formats (relative and absolute) from result descriptions.
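
The filtering and export features above can be illustrated with a minimal sketch. This is not the script's actual code: the result shape (title, url, description) and the function names filterResults, toCsv, and toMarkdown are assumptions made for illustration only.

```javascript
// Hypothetical result objects; the script's real field names may differ.
const results = [
  { title: 'Example Domain', url: 'https://example.com/', description: 'Illustrative "example" page, for docs' },
  { title: 'Other Site', url: 'https://example.org/a', description: 'Unrelated entry' },
];

// Keep items where every whitespace-separated keyword occurs in at least
// one of the given fields (case-insensitive).
function filterResults(items, query, fields) {
  const keywords = query.toLowerCase().split(/\s+/).filter(Boolean);
  return items.filter((item) =>
    keywords.every((kw) =>
      fields.some((f) => String(item[f] ?? '').toLowerCase().includes(kw))));
}

// Quote a CSV cell only when it contains a quote, comma, or line break,
// doubling embedded quotes (RFC 4180 style).
function csvCell(value) {
  const s = String(value ?? '');
  return /[",\r\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
}

// Build a CSV document with a header row, limited to the selected fields.
function toCsv(items, fields) {
  const rows = [fields, ...items.map((item) => fields.map((f) => item[f]))];
  return rows.map((row) => row.map(csvCell).join(',')).join('\r\n');
}

// Build a Markdown bullet list of links. Square brackets in titles are not
// escaped here, which a real exporter would need to handle.
function toMarkdown(items) {
  return items.map((r) => `- [${r.title}](${r.url})`).join('\n');
}

const matches = filterResults(results, 'example page', ['title', 'description']);
console.log(toCsv(matches, ['title', 'description']));
console.log(toMarkdown(results));
```

Passing the field list explicitly mirrors the "Export Field Customization" setting: the same array decides both the CSV header and the columns that are written.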

How to Use:

  1. After installing the script, a floating panel will appear on Google search results pages.
  2. Click the "Scrape Page" button to start scraping the currently visible results.
  3. Use the filter input, view toggle buttons, and copy/download links to manage your data.
  4. Click the gear icon (⚙️) in the panel's title bar to access detailed settings.

Important Notes:

  • This script does not automatically handle infinite scroll or paginated results. Scraping is performed on the content currently loaded on the page.
  • To scrape multiple pages or a long infinite scroll page at once:
    1. Use a browser extension such as Infy Scroll (https://github.com/sixcious/infy-scroll) to load all desired results onto a single page first. This is strongly recommended over scrolling by hand.
    2. Once all content is loaded (by Infy Scroll or by manual scrolling/navigation), click this script's "Scrape Page" button to process all loaded results.
  • Google's page structure can change, which might affect the script's ability to find results. If the script stops working correctly, try adjusting the "Title Element Selector" in the settings or report an issue.
  • This script is primarily tested and optimized for Google search sites in English, Japanese, and Traditional Chinese, especially concerning date parsing and some internal text indicators (like "Related Questions"). While it may work on other languages/regions, full functionality is not guaranteed.
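
To show why date parsing depends on the interface language, here is a minimal sketch of relative-date parsing for English, Japanese, and Traditional Chinese snippets. It is an illustration under assumptions, not the script's implementation: the name parseRelativeDate, the unit tables, and the patterns are all hypothetical, and absolute dates and month/year offsets are left out.

```javascript
// Milliseconds per supported unit.
const MS = {
  minute: 60 * 1000,
  hour: 3600 * 1000,
  day: 86400 * 1000,
  week: 7 * 86400 * 1000,
};

// Map unit words in each language to a canonical unit name (assumed table).
const UNIT_WORDS = {
  minute: 'minute', min: 'minute', hour: 'hour', day: 'day', week: 'week',
  '分': 'minute', '時間': 'hour', '日': 'day', '週間': 'week',   // Japanese
  '分鐘': 'minute', '小時': 'hour', '天': 'day', '週': 'week',   // Traditional Chinese
};

const PATTERNS = [
  // English, e.g. "3 days ago", "5 mins ago"
  /(\d+)\s*(minute|min|hour|day|week)s?\s+ago/i,
  // Japanese and Traditional Chinese, e.g. "5 時間前", "3 天前"
  /(\d+)\s*(分鐘|分|小時|時間|天|日|週間|週)\s*前/,
];

// Return a Date for the first relative expression found in text, or null.
// "now" is injectable so the function is deterministic in tests.
function parseRelativeDate(text, now = new Date()) {
  for (const re of PATTERNS) {
    const m = re.exec(text);
    if (m) {
      const unit = UNIT_WORDS[m[2].toLowerCase()];
      return new Date(now.getTime() - Number(m[1]) * MS[unit]);
    }
  }
  return null;
}

const now = new Date('2025-05-16T12:00:00Z');
console.log(parseRelativeDate('3 days ago', now));
console.log(parseRelativeDate('5 時間前', now));
console.log(parseRelativeDate('no date here', now));
```

A snippet in a language without a matching pattern simply yields null, which is one way the "full functionality is not guaranteed" caveat can show up in practice.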

Acknowledgements / Third-Party Code

  • This script bundles an implementation of Punycode.js to decode internationalized domain names (IDNs). Punycode.js was created by Mathias Bynens and is released under the MIT license. More information about the original library is available at https://github.com/mathiasbynens/punycode.js.
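
For readers curious what Punycode decoding involves, the following is a compact sketch of the decoding procedure from RFC 3492. It is written for illustration and is neither the code bundled in this script nor the Punycode.js API: decodeLabel and decodeHostname are hypothetical names, and overflow checks and most input validation are omitted.

```javascript
// Bootstring parameters for Punycode (RFC 3492, section 5).
const BASE = 36, T_MIN = 1, T_MAX = 26, SKEW = 38, DAMP = 700;
const INITIAL_BIAS = 72, INITIAL_N = 128;

// Bias adaptation (RFC 3492, section 6.1).
function adapt(delta, numPoints, firstTime) {
  delta = firstTime ? Math.floor(delta / DAMP) : delta >> 1;
  delta += Math.floor(delta / numPoints);
  let k = 0;
  while (delta > ((BASE - T_MIN) * T_MAX) >> 1) {
    delta = Math.floor(delta / (BASE - T_MIN));
    k += BASE;
  }
  return k + Math.floor(((BASE - T_MIN + 1) * delta) / (delta + SKEW));
}

// Map a basic character to its digit value: a-z => 0..25, 0-9 => 26..35.
function digitValue(ch) {
  const c = ch.charCodeAt(0);
  if (c >= 48 && c <= 57) return c - 22;
  if (c >= 65 && c <= 90) return c - 65;
  if (c >= 97 && c <= 122) return c - 97;
  throw new RangeError('Invalid Punycode digit: ' + ch);
}

// Decode one label with the "xn--" prefix already removed.
function decodeLabel(input) {
  const output = [];
  const lastDash = input.lastIndexOf('-');
  // Everything before the last delimiter is copied literally.
  if (lastDash > 0) {
    for (const ch of input.slice(0, lastDash)) output.push(ch.codePointAt(0));
  }
  let pos = lastDash > 0 ? lastDash + 1 : 0;
  let n = INITIAL_N, i = 0, bias = INITIAL_BIAS;
  while (pos < input.length) {
    const oldI = i;
    let w = 1;
    // Read one variable-length integer.
    for (let k = BASE; ; k += BASE) {
      if (pos >= input.length) throw new RangeError('Truncated Punycode input');
      const digit = digitValue(input[pos++]);
      i += digit * w;
      const t = k <= bias ? T_MIN : k >= bias + T_MAX ? T_MAX : k - bias;
      if (digit < t) break;
      w *= BASE - t;
    }
    const len = output.length + 1;
    bias = adapt(i - oldI, len, oldI === 0);
    n += Math.floor(i / len);
    i %= len;
    output.splice(i++, 0, n); // insert code point n at position i
  }
  return String.fromCodePoint(...output);
}

// Decode every "xn--" label of a hostname; other labels pass through.
function decodeHostname(hostname) {
  return hostname
    .split('.')
    .map((label) => (/^xn--/i.test(label) ? decodeLabel(label.slice(4)) : label))
    .join('.');
}

console.log(decodeHostname('xn--mnchen-3ya.de'));
console.log(decodeHostname('xn--wgv71a119e.jp'));
```

With this sketch, xn--mnchen-3ya.de decodes to münchen.de and xn--wgv71a119e.jp decodes to 日本語.jp. For anything beyond illustration, a maintained library such as Punycode.js is the safer choice.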