    Repositories list

    • Arquivo.pt home page web application
      CSS
      GNU General Public License v3.0
      0003 Updated Mar 21, 2025
    • The Viagens no tempo repository has some demos that use a timeline presentation about some institutions.
      HTML
      GNU General Public License v3.0
      1000 Updated Mar 10, 2025
    • Browsertrix is the hosted, high-fidelity, browser-based crawling service from Webrecorder designed to make web archiving easier and more accessible for all!
      TypeScript
      GNU Affero General Public License v3.0
      47000 Updated Mar 4, 2025
    • Serverless replay of web archives directly in the browser
      TypeScript
      GNU Affero General Public License v3.0
      67000 Updated Mar 4, 2025
    • Run a high-fidelity browser-based web archiving crawler in a single Docker container
      TypeScript
      GNU Affero General Public License v3.0
      98000 Updated Mar 4, 2025
    • Arquivo.pt Page Search System
      Java
      GNU General Public License v3.0
      2104 Updated Feb 25, 2025
    • wombat
      Wombat.js client-side rewriting library
      JavaScript
      GNU Affero General Public License v3.0
      32000 Updated Feb 11, 2025
    • Arquivo.pt's branding customizations for our instance of pywb.
      CSS
      GNU General Public License v3.0
      3204 Updated Jan 23, 2025
    • Arquivo.pt's main goal is the preservation of, and access to, web content that is no longer available online. During the development of the PWA IR (information retrieval) system we faced limitations in search speed, quality of results, scalability and usability. To cope with this, we modified the archive-access project (http://archive-access.source…
      Java
      GNU General Public License v3.0
      74310214 Updated Jan 22, 2025
    • SOLR imagesearch API repository
      Java
      GNU General Public License v3.0
      2132 Updated Jan 22, 2025
    • Functional tests for Arquivo.pt, developed with the Selenium framework
      Java
      Apache License 2.0
      4003 Updated Jan 22, 2025
    • A High-Fidelity Web Archiving Extension for Chrome and Chromium-based browsers!
      TypeScript
      GNU Affero General Public License v3.0
      67000 Updated Jan 10, 2025
    • CDXJ Indexing of WARC/ARCs
      Python
      Apache License 2.0
      13000 Updated Dec 10, 2024
    • warcio
      Streaming WARC/ARC library for fast web archive IO
      Python
      Apache License 2.0
      61000 Updated Dec 10, 2024
    • Image Search Indexing over web archived images using Apache Solr indexes.
      Java
      GNU General Public License v3.0
      3205 Updated Oct 14, 2024
    • The repository consists of a set of scripts used to extract data from APIs, for example the RCAAP API or the CienciaVitae API.
      Python
      Apache License 2.0
      0000 Updated Oct 2, 2024
    • Roff
      GNU General Public License v3.0
      1110 Updated Sep 17, 2024
    • Soft 404
      JavaScript
      GNU General Public License v3.0
      0100 Updated Jul 26, 2024
    • A wrapper around the pywb cdxj-indexer command-line tool that offers incremental and parallel indexing of a collection.
      Shell
      Apache License 2.0
      0001 Updated Jul 22, 2024
    • The PWA9609 test collection was created to support research on web archive information retrieval (WAIR).
      Python
      GNU General Public License v3.0
      0000 Updated Jun 4, 2024
    • Repository that will contain the code to provide real-time analytics of Arquivo.pt data
      Python
      Apache License 2.0
      0000 Updated Mar 12, 2024
    • iauploads
      Tools to upload Arquivo.pt contents to the Internet Archive (IA)
      Roff
      GNU General Public License v3.0
      0000 Updated Mar 8, 2024
    • YOLOv4, YOLOv4-tiny, YOLOv3, YOLOv3-tiny implemented in TensorFlow 2.0 and Android. Converts YOLO v4 .weights to TensorFlow, TensorRT and TFLite
      Python
      MIT License
      1.2k000 Updated Feb 22, 2024
    • Python
      GNU General Public License v3.0
      0000 Updated Feb 22, 2024
    • Repo that collects all scripts used to set up SolrCloud for images and text.
      Python
      GNU General Public License v3.0
      0000 Updated Feb 8, 2024
    • scripts
      Scripts for the maintenance of the Portuguese web archive
      Shell
      GNU General Public License v3.0
      0300 Updated Jan 16, 2024
    • memorial
      Redirects a couple of vhosts to arquivo.pt. We call this the memorial service.
      Python
      GNU General Public License v3.0
      1000 Updated Jan 10, 2024
    • Patching Arquivo.pt web-archived pages using Puppeteer to automate browser behavior
      JavaScript
      GNU General Public License v3.0
      0004 Updated Dec 11, 2023
    • Repository containing the service to extract URLs from PDFs or text
      Python
      Apache License 2.0
      1200 Updated Nov 14, 2023
    • Arquivo.pt Brozzler Environment
      Shell
      Apache License 2.0
      1000 Updated Nov 2, 2023