Fine-grained list of all tools, by content type and function.

Audio

Audio: Access

Tools that facilitate access to digital data by users.

  • Rescarta - The ResCarta Tools software empowers users to create non-proprietary digital objects with LOC standard METS, MODS, MIX and AudioMD metadata from existing TIFF, JPEG, PDF and WAV data through user-friendly interfaces. Digital collections can be created, indexed, displayed and validated using the software. Exports DC, OAI_DC formats for use in OAI-PMH servers.

Audio: Annotation

Tools that facilitate annotation of digital data by users.

  • Clipper - Clipper is a free open-source web application enabling researchers to create and share virtual clips without altering the original media files.

Audio: De-Duplication

Tools that enable the identification and/or removal of duplicate or similar files.

  • XcorrSound - The xcorrSound package compares sound waves using cross correlation.
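Cross correlation, as mentioned for xcorrSound above, slides one signal across another and scores the overlap at each offset; the offset with the highest score is where the two sounds line up best. The following is only a minimal pure-Python sketch of that idea, not xcorrSound's actual implementation. The function name and the toy sample values are assumptions for illustration, and real tools typically use FFT-based methods on normalised audio.

```python
def cross_correlation(a, b):
    """Return (best_lag, best_score) for sliding b across a.

    best_lag is the index in a where b[0] lines up when the raw
    (unnormalised) correlation score is highest.
    """
    best_lag, best_score = 0, float("-inf")
    # Try every offset at which the two sequences overlap by at least one sample.
    for lag in range(-(len(b) - 1), len(a)):
        score = 0.0
        for i, sample in enumerate(b):
            j = i + lag
            if 0 <= j < len(a):
                score += a[j] * sample
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag, best_score


# Hypothetical toy data: the clip is an exact excerpt of the reference.
reference = [0.0, 0.0, 1.0, 0.5, -0.5, 0.0]
clip = [1.0, 0.5, -0.5]
lag, score = cross_correlation(reference, clip)
print(lag, score)
```

A high peak at some offset suggests the clip occurs in the reference at that position, which is why the same technique serves both de-duplication and quality assurance.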

Audio: Disk Imaging

Tools that enable the capture, viewing or extraction of contents of a disk image (which is a computer file containing the contents and structure of a disk volume or an entire data storage device, such as a hard drive or floppy disk).

  • CDRDAO (CDR Disk At Once) - Cdrdao records audio or data CD-Rs in disk-at-once (DAO) mode based on a textual description of the CD contents.
  • Easy CD-DA Extractor - Easy CD-DA Extractor is a CD ripper, music converter, audio converter, metadata editor, and CD/DVD burning software.
  • Exact Audio Copy - Exact Audio Copy is an audio grabber for audio CDs using standard CD and DVD-ROM drives on Windows only.
  • IsoBuster - Recover data from CD, DVD, BD, HDD, Flash drive, USB stick, media card, SD and SSD.
  • Paranoia - "Use your CDROM drive to read audio tracks.... and have it actually work right!"

Audio: File Format Identification

Tools that enable the automatic identification of the file format of a particular file, typically by examining characteristic codes (often termed file format magic) in the file header.

  • Media conch - Media Conch is an implementation checker, policy checker and fixer for audiovisual files with focus on Matroska, LPCM and FFV1.
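The header-based identification described in this category's definition can be sketched as a lookup of well-known signature bytes. This is an illustrative sketch only: the function names are hypothetical, the signature list is a small hand-picked subset, and it does not reflect how any tool listed here works internally.

```python
def identify_format(header: bytes) -> str:
    """Guess an audio or container format from the first bytes of a file."""
    # WAV: a RIFF container whose form type (bytes 8-11) is "WAVE".
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "WAV"
    if header.startswith(b"fLaC"):
        return "FLAC"
    if header.startswith(b"OggS"):
        return "Ogg"
    if header.startswith(b"ID3"):
        return "MP3 (ID3v2 tag)"
    # EBML header, used by Matroska and WebM.
    if header.startswith(b"\x1a\x45\xdf\xa3"):
        return "EBML (Matroska/WebM)"
    return "unknown"


def identify_file(path: str) -> str:
    """Read only the first 12 bytes of a file and identify them."""
    with open(path, "rb") as handle:
        return identify_format(handle.read(12))


print(identify_format(b"RIFF\x24\x00\x00\x00WAVEfmt "))
```

Signature matching of this kind only suggests a format; confirming that a file actually conforms to the format is the job of the validation tools listed separately.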

Audio: File Format Migration

Tools that support the transformation of data from one file format to another.

  • Audio/Video to WAV Converter - This tool converts audio and video files to WAV format.
  • DBpoweramp Music Converter (dMC) - dBpoweramp Music Converter (dMC) is an audio conversion tool.
  • Easy CD-DA Extractor - Easy CD-DA Extractor is a CD ripper, music converter, audio converter, metadata editor, and CD/DVD burning software.
  • Exact Audio Copy - Exact Audio Copy is an audio grabber for audio CDs using standard CD and DVD-ROM drives on Windows only.
  • FFmpeg - FFmpeg is a complete, cross-platform solution to record, convert and stream audio and video.
  • MPG321 - mpg321 is a command-line mp3 player. mpg321 is used for frontends, as an mp3 player and as an mp3 to wave file decoder.

Audio: Metadata Extraction

Tools that support the extraction of metadata from files.

  • BWF MetaEdit - BWF MetaEdit permits embedding, validating, and exporting of metadata in Broadcast WAVE Format (BWF) files.
  • Easy CD-DA Extractor - Easy CD-DA Extractor is a CD ripper, music converter, audio converter, metadata editor, and CD/DVD burning software.
  • Exact Audio Copy - Exact Audio Copy is an audio grabber for audio CDs using standard CD and DVD-ROM drives on Windows only.
  • ExifTool - Properties extraction, identification, metadata editing
  • GetID3() - Extracts technical and embedded descriptive metadata from common multimedia file formats.
  • Mdqc - Tool for managing and comparing digital asset metadata
  • MediaInfo - Supplies technical and tag information about a video or audio file.

Audio: Metadata Processing

Tools that support the processing or management of metadata.

  • BWF MetaEdit - BWF MetaEdit permits embedding, validating, and exporting of metadata in Broadcast WAVE Format (BWF) files.
  • ExifTool - Properties extraction, identification, metadata editing
  • Mdqc - Tool for managing and comparing digital asset metadata
  • XMP metadata support in JabRef - With XMP support the JabRef team tries to bring the advantages of metadata to the world of reference managers.

Audio: Personal Archiving

Tools that support the preservation and archiving of data relating to individuals.

  • Rescarta - The ResCarta Tools software empowers users to create non-proprietary digital objects with LOC standard METS, MODS, MIX and AudioMD metadata from existing TIFF, JPEG, PDF and WAV data through user-friendly interfaces. Digital collections can be created, indexed, displayed and validated using the software. Exports DC, OAI_DC formats for use in OAI-PMH servers.

Audio: Policy

Tools that support the development and management of digital preservation policy.

  • Media conch - Media Conch is an implementation checker, policy checker and fixer for audiovisual files with focus on Matroska, LPCM and FFV1.

Audio: Preservation System

Tools that support the management and preservation of digital resources, typically performing a number of functions across the digital lifecycle such as ingest, storage, preservation action and access.

  • Libsafe - libsafe allows organizations to create a fully OAIS-compliant archive, including active and passive digital preservation workflows, and is particularly suited for master image files produced by digitization processes.
  • Rescarta - The ResCarta Tools software empowers users to create non-proprietary digital objects with LOC standard METS, MODS, MIX and AudioMD metadata from existing TIFF, JPEG, PDF and WAV data through user-friendly interfaces. Digital collections can be created, indexed, displayed and validated using the software. Exports DC, OAI_DC formats for use in OAI-PMH servers.

Audio: Quality Assurance

Tools that support quality checking of digital resources, identifying damaged, incomplete or low quality data. Typically used to identify damage introduced via processes such as format migration or digitisation.

  • MP3val - MP3val is a small, high-speed, free software tool for checking MPEG audio files' integrity.
  • Mdqc - Tool for managing and comparing digital asset metadata
  • XcorrSound - The xcorrSound package compares sound waves using cross correlation.

Audio: Rendering

Tools that support the rendering of digital resources so they can be viewed, printed, or otherwise accessed by users.

  • VLC Media Player - Cross-platform audio and video player based primarily on libavcodec.

Audio: Validation

Tools that support the validation of digital files, typically against a file format specification.

  • BWF MetaEdit - BWF MetaEdit permits embedding, validating, and exporting of metadata in Broadcast WAVE Format (BWF) files.
  • MP3val - MP3val is a small, high-speed, free software tool for checking MPEG audio files' integrity.
  • Media conch - Media Conch is an implementation checker, policy checker and fixer for audiovisual files with focus on Matroska, LPCM and FFV1.

Binary Data

Binary Data: Binary & Hexadecimal Editing

Tools for viewing and editing files in different views, such as binary or hexadecimal. These are typically known as hex editors.

  • Bless - Bless is a high quality, full featured hex editor.
  • Hex Workshop - The Hex Workshop Hex Editor by BreakPoint Software is a complete set of hexadecimal development tools for Microsoft Windows 2000 and later.
  • WxHexEditor - A free hex editor / disk editor
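To illustrate the kind of view a hex editor presents, here is a minimal sketch that renders bytes as offset, hexadecimal and printable-text columns. The function name `hex_dump` and the column layout are assumptions for illustration; the editors listed above each have their own layouts and many more features, including editing.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Format bytes as lines of: offset, hex values, printable characters."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{byte:02x}" for byte in chunk)
        # Show printable ASCII as-is and everything else as a dot.
        text_part = "".join(chr(byte) if 32 <= byte < 127 else "." for byte in chunk)
        # Pad the hex column so the text column stays aligned on short rows.
        lines.append(f"{offset:08x}  {hex_part:<{width * 3 - 1}}  {text_part}")
    return "\n".join(lines)


print(hex_dump(b"RIFF\x00\x01"))
```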

Binary Data: Forensic

Tools that support forensics related functions.

  • Hex Workshop - The Hex Workshop Hex Editor by BreakPoint Software is a complete set of hexadecimal development tools for Microsoft Windows 2000 and later.

Container

Container: Rendering

Tools that support the rendering of digital resources so they can be viewed, printed, or otherwise accessed by users.

  • 7-Zip - 7-Zip is a file archiver with a high compression ratio
  • Filzip - Filzip offers full support (add and extract) for ZIP (including Quake III's PK3), BH (BlakHole), CAB (Microsoft Cabinet), JAR (Java ARchive), LHA (LZH), TAR and GZIP (TAR.
  • Gzip - gzip produces files with a .gz extension. gunzip can decompress files created by gzip, compress or pack.
  • Tar - The Tar program provides the ability to create tar archives, as well as various other kinds of manipulation.
  • WinZip - WinZip is the world's most popular Windows Zip utility for file compression, file sharing, file encryption, and data backup.
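As a rough illustration of what the gzip and tar entries above do, the following sketch uses Python's standard `gzip` and `tarfile` modules rather than the command-line tools themselves. The member name `docs/readme.txt` and the payload are hypothetical, and everything is kept in memory so the example leaves no files behind.

```python
import gzip
import io
import tarfile

payload = b"example content\n"

# gzip compresses a single byte stream; the result starts with the bytes 1f 8b.
compressed = gzip.compress(payload)
restored = gzip.decompress(compressed)

# tar bundles named members into one archive; "w:gz" also gzips the archive.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
    info = tarfile.TarInfo(name="docs/readme.txt")
    info.size = len(payload)
    archive.addfile(info, io.BytesIO(payload))

# Reopen the archive to list its members and read one back.
buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r:gz") as archive:
    names = archive.getnames()
    extracted = archive.extractfile("docs/readme.txt").read()

print(names, extracted == payload)
```

The split of responsibilities is the point: tar preserves names and structure, while gzip only compresses, which is why the two are so often combined.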

Database

Database: Access

Tools that facilitate access to digital data by users.

  • SIARDexcerpt - SIARDexcerpt is a Java-based application that searches and extracts individual records of SIARD files.

Database: File Format Migration

Tools that support the transformation of data from one file format to another.

  • AccessToSiard - A collection of scripts to automatically convert MS Access files to the SIARD format.
  • CHRONOS - Database Retirement, Partial and Ongoing Database Archiving, Application Retirement.
  • CSV2SIARD - A tool to create SIARD containers from CSV files.
  • DANS (Data Archiving and Networked Services) DBF - DANS DBF Library is a Java library for reading and writing xBase database files.
  • DANS MIXED - Migration to Intermediate XML for Electronic Data.
  • Db-preservation-toolkit - Enables conversion between database formats or dumping from live database systems for the purposes of preservation.
  • DeepArc - Intended for preserving web sites from the back-end, this is a database-to-XML curation tool.
  • MIXED (Migration to Intermediate XML for Electronic Data) - MIXED (Migration to Intermediate XML for Electronic Data) is a web service that converts tabular data files such as spreadsheets and databases to the Standard Data Format for Preservation (SDFP), a supplier-independent XML format.
  • RODA DBML - Migrates databases to an XML schema, DBML. Can then provide access by dumping DBML to MySQL and showing it in phpMyAdmin.
  • SIARD Suite - SIARD Suite is a freeware tool for the conversion of contents of relational databases into the SIARD format.

Database: Metadata Processing

Tools that support the processing or management of metadata.

  • NESSTAR - Nesstar suite is an online publishing platform for organisations wishing to share datasets both internally and with the wider web.

Database: Preservation System

Tools that support the management and preservation of digital resources, typically performing a number of functions across the digital lifecycle such as ingest, storage, preservation action and access.

  • KOST-Val - KOST-Val is an open source validator for different file formats (TIFF, SIARD, PDF/A, JP2, JPEG) and Submission Information Package (SIP).
  • XArch - XArch is an archive management system that allows one to create, populate, and query archives of multiple database versions.

Database: Quality Assurance

Tools that support quality checking of digital resources, identifying damaged, incomplete or low quality data. Typically used to identify damage introduced via processes such as format migration or digitisation.

  • KOST-Val - KOST-Val is an open source validator for different file formats (TIFF, SIARD, PDF/A, JP2, JPEG) and Submission Information Package (SIP).
  • SIARD-VAL - SIARD-Val is an open source validator for SIARD files.
  • SIARDexcerpt - SIARDexcerpt is a Java-based application that searches and extracts individual records of SIARD files.

Database: Rendering

Tools that support the rendering of digital resources so they can be viewed, printed, or otherwise accessed by users.

Database: Service

  • NESSTAR - Nesstar suite is an online publishing platform for organisations wishing to share datasets both internally and with the wider web.

Database: Validation

Tools that support the validation of digital files, typically against a file format specification.

  • KOST-Val - KOST-Val is an open source validator for different file formats (TIFF, SIARD, PDF/A, JP2, JPEG) and Submission Information Package (SIP).
  • SIARD-VAL - SIARD-Val is an open source validator for SIARD files.

Database: Web Crawl

Tools that support the capture of data from the world wide web, typically by "crawling" links between resources.

  • DeepArc - Intended for preserving web sites from the back-end, this is a database-to-XML curation tool.

Disk Image

Disk Image: Backup

Tools that support the backing up of digital data to another storage location, typically in a scheduled manner.

Disk Image: Disk Imaging

Tools that enable the capture, viewing or extraction of contents of a disk image (which is a computer file containing the contents and structure of a disk volume or an entire data storage device, such as a hard drive or floppy disk).

  • AFF Open Source Computer Forensics Software - Tools for the creation of disk images, used in conjunction with the AFF open and extensible file format to store disk images and associated metadata.
  • DiscImageChef - Media dump software and disc image manager
  • Disktype - Tool for detecting the content format of a disk or disk image. It knows about common file systems, partition tables, and boot codes.
  • IsoBuster - Recover data from CD, DVD, BD, HDD, Flash drive, USB stick, media card, SD and SSD.
  • KryoFlux - Floppy disk controller software that accompanies a KryoFlux drive
  • Paranoia - "Use your CDROM drive to read audio tracks.... and have it actually work right!"
  • QPxTool - With QPxTool you can measure the quality of CDs and DVDs.

Disk Image: Forensic

Tools that support forensics related functions.

Disk Image: Metadata Extraction

Tools that support the extraction of metadata from files.

  • DiscImageChef - Media dump software and disc image manager
  • Disktype - Tool for detecting the content format of a disk or disk image. It knows about common file systems, partition tables, and boot codes.

Disk Image: Metadata Processing

Tools that support the processing or management of metadata.

  • Gumshoe - Search interface for metadata extracted from forensic disk images.

Disk Image: Preservation System

Tools that support the management and preservation of digital resources, typically performing a number of functions across the digital lifecycle such as ingest, storage, preservation action and access.

Disk Image: Rendering

Tools that support the rendering of digital resources so they can be viewed, printed, or otherwise accessed by users.

  • 7-Zip - 7-Zip is a file archiver with a high compression ratio

Document

Document: Access

Tools that facilitate access to digital data by users.

  • Library of Congress Newspaper Viewer - The Library of Congress Newspaper Viewer is a web application used to ingest and view digitized newspaper pages meeting the National Digital Newspaper Program specification.
  • MPP Viewer - MPP Viewer is a viewer for Microsoft Project files
  • Rescarta - The ResCarta Tools software empowers users to create non-proprietary digital objects with LOC standard METS, MODS, MIX and AudioMD metadata from existing TIFF, JPEG, PDF and WAV data through user-friendly interfaces. Digital collections can be created, indexed, displayed and validated using the software. Exports DC, OAI_DC formats for use in OAI-PMH servers.

Document: Data capture and Deposit

Tools that enable the capture and deposit of data.

  • Tabula - Extract tabular data from PDF files

Document: Decryption

Tools for recovering passwords or unlocking encrypted digital files.

  • Qpdf - QPDF is a command-line program that does structural, content-preserving transformations on PDF files

Document: Dependency Analysis

Tools for identifying essential information that resides externally to a digital object, or for identifying dependent processes such as which DLLs are required by a Windows process.

  • Dependency Discovery Tool - The Dependency Discovery Tool searches through binary office files (.doc, .xls and .ppt) and tries to find any documents or files that are linked to the document.
  • PDF Tools (by Didier Stevens) - Tools for parsing and analysing PDF documents

Document: Encryption Detection

Tools that support the detection of encryption or password protection in files.

  • Apache PDFBox - Java PDF library for creation, manipulation, validation and content extraction of PDF documents
  • Apache POI - the Java API for Microsoft Documents - The Apache POI Project's mission is to create and maintain Java APIs for manipulating various file formats based upon the Office Open XML standards (OOXML) and Microsoft's OLE 2 Compound Document format (OLE2).
  • EpubCheck - Validator for EPUB files
  • Flint - Validates a file against a policy, using common validation tools

Document: File Format Identification

Tools that enable the automatic identification of the file format of a particular file, typically by examining characteristic codes (often termed file format magic) in the file header.

  • Officeparser.py - officeparser.py is a Python script that parses the format of OLE compound documents used by Microsoft Office applications.

Document: File Format Migration

Tools that support the transformation of data from one file format to another.

  • Antiword - Antiword is a free MS Word reader for Linux and RISC OS.
  • Apache PDFBox - Java PDF library for creation, manipulation, validation and content extraction of PDF documents
  • Apache POI - the Java API for Microsoft Documents - The Apache POI Project's mission is to create and maintain Java APIs for manipulating various file formats based upon the Office Open XML standards (OOXML) and Microsoft's OLE 2 Compound Document format (OLE2).
  • Calibre - An e-book management tool, including viewer, migration, and file conversion features among others.
  • Catdoc & xls2csv - catdoc is a program that reads one or more Microsoft Word files and outputs text to standard output.
  • LuraDocument PDF Compressor - LuraDocument PDF Compressor is a document conversion engine.
  • MPP Viewer - MPP Viewer is a viewer for Microsoft Project files
  • PDFTron PDF-A Manager - PDF/A Manager is a PDF/A (ISO 19005) validation and conversion software.

Document: Metadata Extraction

Tools that support the extraction of metadata from files.

  • Apache PDFBox - Java PDF library for creation, manipulation, validation and content extraction of PDF documents
  • Apache POI - the Java API for Microsoft Documents - The Apache POI Project's mission is to create and maintain Java APIs for manipulating various file formats based upon the Office Open XML standards (OOXML) and Microsoft's OLE 2 Compound Document format (OLE2).
  • EpubCheck - Validator for EPUB files
  • Exempi - Exempi is a library for handling XMP metadata, based on the Adobe XMP SDK
  • IText - PDF library for manipulation, content extraction and creation
  • ODF Validator - ODF Validator is a tool that validates OpenDocument files and checks them for certain conformance criteria.
  • Officeparser.py - officeparser.py is a Python script that parses the format of OLE compound documents used by Microsoft Office applications.
  • PDF Tools (by Didier Stevens) - Tools for parsing and analysing PDF documents
  • Pdftk - PDF manipulation tool
  • Peepdf - peepdf is a Python tool to explore PDF files in order to find out if the file can be harmful or not.
  • Python XMP Toolkit - Library for working with XMP metadata, as well as reading/writing XMP metadata stored in many different file formats
  • Qpdf - QPDF is a command-line program that does structural, content-preserving transformations on PDF files
  • Xpdf - Open source PDF viewer that includes PDF information extractor and font analyzer

Document: Metadata Processing

Tools that support the processing or management of metadata.

  • CSV Validator - Validation of CSV files against user-defined schema
  • Exempi - Exempi is a library for handling XMP metadata, based on the Adobe XMP SDK
  • Python XMP Toolkit - Library for working with XMP metadata, as well as reading/writing XMP metadata stored in many different file formats
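Validating a CSV file against a user-defined schema, as the CSV Validator entry above describes, amounts to checking every cell against a rule for its column. The sketch below is a hedged illustration using only Python's standard library; the schema format (a dictionary of column names to check functions), the column names and the sample data are all assumptions and do not reflect CSV Validator's own schema language.

```python
import csv
import io
import re

# Hypothetical schema: each column name maps to a function returning True for valid values.
SCHEMA = {
    "identifier": lambda value: re.fullmatch(r"[A-Z]{2}\d{4}", value) is not None,
    "year": lambda value: value.isdigit() and 1000 <= int(value) <= 2100,
    "title": lambda value: value.strip() != "",
}


def validate_csv(text, schema):
    """Return a list of human-readable errors; an empty list means the CSV is valid."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [column for column in schema if column not in (reader.fieldnames or [])]
    if missing:
        return [f"missing column(s): {', '.join(missing)}"]
    errors = []
    # Data rows start on line 2 because line 1 holds the header.
    for line_no, row in enumerate(reader, start=2):
        for column, check in schema.items():
            value = row[column] or ""  # short rows yield None for absent cells
            if not check(value):
                errors.append(f"line {line_no}: invalid {column}: {value!r}")
    return errors


sample = "identifier,year,title\nAB1234,1999,Annual report\nab12,19x9,\n"
errors = validate_csv(sample, SCHEMA)
for error in errors:
    print(error)
```

Reporting every failing cell with its line number, rather than stopping at the first problem, makes it easier to correct a metadata file in one pass.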

Document: Personal Archiving

Tools that support the preservation and archiving of data relating to individuals.

  • Rescarta - The ResCarta Tools software empowers users to create non-proprietary digital objects with LOC standard METS, MODS, MIX and AudioMD metadata from existing TIFF, JPEG, PDF and WAV data through user-friendly interfaces. Digital collections can be created, indexed, displayed and validated using the software. Exports DC, OAI_DC formats for use in OAI-PMH servers.

Document: Preservation System

Tools that support the management and preservation of digital resources, typically performing a number of functions across the digital lifecycle such as ingest, storage, preservation action and access.

  • KOST-Val - KOST-Val is an open source validator for different file formats (TIFF, SIARD, PDF/A, JP2, JPEG) and Submission Information Package (SIP).
  • Libsafe - libsafe allows organizations to create a fully OAIS-compliant archive, including active and passive digital preservation workflows, and is particularly suited for master image files produced by digitization processes.
  • Rescarta - The ResCarta Tools software empowers users to create non-proprietary digital objects with LOC standard METS, MODS, MIX and AudioMD metadata from existing TIFF, JPEG, PDF and WAV data through user-friendly interfaces. Digital collections can be created, indexed, displayed and validated using the software. Exports DC, OAI_DC formats for use in OAI-PMH servers.

Document: Quality Assurance

Tools that support quality checking of digital resources, identifying damaged, incomplete or low quality data. Typically used to identify damage introduced via processes such as format migration or digitisation.

  • KOST-Val - KOST-Val is an open source validator for different file formats (TIFF, SIARD, PDF/A, JP2, JPEG) and Submission Information Package (SIP).

Document: Rendering

Tools that support the rendering of digital resources so they can be viewed, printed, or otherwise accessed by users.

  • Calibre - An e-book management tool, including viewer, migration, and file conversion features among others.
  • Xpdf - Open source PDF viewer that includes PDF information extractor and font analyzer

Document: Repair

Tools that support the repair of damaged or corrupted data.

  • Apache PDFBox - Java PDF library for creation, manipulation, validation and content extraction of PDF documents
  • Pdftk - PDF manipulation tool

Document: Validation

Tools that support the validation of digital files, typically against a file format specification.

  • 3-Heights(TM) PDF Validator - 3-Heights(TM) PDF Validator from PDF-Tools AG.
  • Apache PDFBox - Java PDF library for creation, manipulation, validation and content extraction of PDF documents
  • CSV Validator - Validation of CSV files against user-defined schema
  • EpubCheck - Validator for EPUB files
  • Flint - Validates a file against a policy, using common validation tools
  • KOST-Val - KOST-Val is an open source validator for different file formats (TIFF, SIARD, PDF/A, JP2, JPEG) and Submission Information Package (SIP).
  • ODF Validator - ODF Validator is a tool that validates OpenDocument files and checks them for certain conformance criteria.
  • PDF Tools (by Didier Stevens) - Tools for parsing and analysing PDF documents
  • PDFTron PDF-A Manager - PDF/A Manager is a PDF/A (ISO 19005) validation and conversion software.
  • VeraPDF - PDF/A validation tool

EBook

EBook: Encryption Detection

Tools that support the detection of encryption or password protection in files.

  • EpubCheck - Validator for EPUB files
  • Flint - Validates a file against a policy, using common validation tools

EBook: File Format Migration

Tools that support the transformation of data from one file format to another.

  • Calibre - An e-book management tool, including viewer, migration, and file conversion features among others.

EBook: Metadata Extraction

Tools that support the extraction of metadata from files.

EBook: Rendering

Tools that support the rendering of digital resources so they can be viewed, printed, or otherwise accessed by users.

  • Calibre - An e-book management tool, including viewer, migration, and file conversion features among others.

EBook: Validation

Tools that support the validation of digital files, typically against a file format specification.

  • EpubCheck - Validator for EPUB files
  • Flint - Validates a file against a policy, using common validation tools

Email

Email: De-Duplication

Tools that enable the identification and/or removal of duplicate or similar files.

  • Emailchemy - Converts proprietary emails to standard portable formats

Email: File Format Migration

Tools that support the transformation of data from one file format to another.

  • Emailchemy - Converts proprietary emails to standard portable formats
  • WMDecode - WMDecode is used for extracting files from winmail.

Email: File Management

Tools that support general file management activities such as viewing or renaming.

  • Emailchemy - Converts proprietary emails to standard portable formats

Email: File Recovery

Tools that support the recovery of data from damaged or corrupted storage devices such as disks.

  • Emailchemy - Converts proprietary emails to standard portable formats

Email: Metadata Extraction

Tools that support the extraction of metadata from files.

  • EPADD - ePADD is a software package developed by Stanford University's Special Collections & University Archives that supports archival processes around the appraisal, ingest, processing, discovery, and delivery of email archives.

Email: Metadata Processing

Tools that support the processing or management of metadata.

  • EPADD - ePADD is a software package developed by Stanford University's Special Collections & University Archives that supports archival processes around the appraisal, ingest, processing, discovery, and delivery of email archives.

Email: Personal Archiving

Tools that support the preservation and archiving of data relating to individuals.

  • Muse - A tool used for personal archiving of email.

Email: Preservation System

Tools that support the management and preservation of digital resources, typically performing a number of functions across the digital lifecycle such as ingest, storage, preservation action and access.

  • EPADD - ePADD is a software package developed by Stanford University's Special Collections & University Archives that supports archival processes around the appraisal, ingest, processing, discovery, and delivery of email archives.
  • GFI MailArchiver - GFI MailArchiver is an email archiving software that is the single solution source for your email management problems on Exchange Server.
  • InBoxer - InBoxer is a next generation email archiving, IM archiving, e-discovery, and policy management system.
  • Proofpoint Enterprise Archive: SaaS Email Archiving - Proofpoint Enterprise Archive is a SaaS email archiving solution that addresses three key challenges—eDiscovery, regulatory compliance and email storage management—without the headaches of managing archiving in-house.

Geospatial

Geospatial: Metadata Processing

Tools that support the processing or management of metadata.

Image

Image: Access

Tools that facilitate access to digital data by users.

  • IIPImage - IIPImage is an advanced high-performance imaging server and client for web-based streamed remote visualization of ultra resolution scientific imagery.
  • Library of Congress Newspaper Viewer - The Library of Congress Newspaper Viewer is a web application used to ingest and view digitized newspaper pages meeting the National Digital Newspaper Program specification.

Image: Data capture and Deposit

Tools that enable the capture and deposit of data.

  • Artivity - A tool for capturing contextual data produced during the creative process of artists and designers while working on a computer.

Image: De-Duplication

Tools that enable the identification and/or removal of duplicate or similar files.

  • Matchbox Tool - Matchbox: Duplicate detection tool for digital document collections.

Image: File Format Migration

Tools that support the transformation of data from one file format to another.

  • EXIF to DC XML normaliser - Extract EXIF data and normalise it to DC XML.
  • ImageMagick - ImageMagick® is a software suite to create, edit, compose, or convert bitmap images.
  • JJ2000 - Pure Java implementation of a JPEG2000 decoder
  • Kakadu - JPEG 2000 SDK, includes encoder/decoder
  • OpenJPEG - The OpenJPEG library is an open-source JPEG 2000 codec written in C language.

Image: Metadata Extraction

Tools that support the extraction of metadata from files.

  • EMET (Embedded Metadata Extraction Tool) - EMET is a stand-alone tool designed to extract metadata embedded in JPEG and TIFF files.
  • EXIF to DC XML normaliser - Extract EXIF data and normalise it to DC XML.
  • Exempi - Exempi is a library for handling XMP metadata, based on the Adobe XMP SDK
  • ExifTool - Properties extraction, identification, metadata editing
  • Jp2StructCheck - Simple JP2 file structure checker
  • Jpylyzer - JP2 validation + properties extraction
  • Mdqc - Tool for managing and comparing digital asset metadata
  • OpenJPEG - The OpenJPEG library is an open-source JPEG 2000 codec written in C language.
  • Python XMP Toolkit - Library for working with XMP metadata, as well as reading/writing XMP metadata stored in many different file formats

Image: Metadata Processing

Tools that support the processing or management of metadata.

  • Exempi - Exempi is a library for handling XMP metadata, based on the Adobe XMP SDK
  • ExifTool - Properties extraction, identification, metadata editing
  • Exiv2 - Exiv2 is a C++ library and a command line utility to manage image metadata.
  • ImageVerifier - ImageVerifier (IV for short) traverses a hierarchy of folders looking for image files to verify. It can verify TIFFs, JPEGs, PSDs, DNGs, and non-DNG raws (e.g., NEF, CR2).
  • Mdqc - Tool for managing and comparing digital asset metadata
  • Python XMP Toolkit - Library for working with XMP metadata, as well as reading/writing XMP metadata stored in many different file formats

Image: Preservation System

Tools that support the management and preservation of digital resources, typically performing a number of functions across the digital lifecycle such as ingest, storage, preservation action and access.

  • KOST-Simy - The KOST-Simy application is used to compare images.
  • KOST-Val - KOST-Val is an open source validator for different file formats (TIFF, SIARD, PDF/A, JP2, JPEG) and Submission Information Package (SIP).
  • Libsafe - libsafe allows organizations to create a fully OAIS-compliant archive, including active and passive digital preservation workflows, and is particularly suited for master image files produced by digitization processes.

Image: Quality Assurance

Tools that support quality checking of digital resources, identifying damaged, incomplete or low quality data. Typically used to identify damage introduced via processes such as format migration or digitisation.

  • AsTiffTagViewer - AsTiffTagViewer is a TIFF Tag Viewer application.
  • Bad Peggy - Scans for damaged images and photos.
  • Checkit tiff - A tool to validate TIFF files against a given configuration profile
  • Fingerdet - QA tool for detecting fingers on digitised pages
  • ImageVerifier - ImageVerifier (IV for short) traverses a hierarchy of folders looking for image files to verify. It can verify TIFFs, JPEGs, PSDs, DNGs, and non-DNG raws (e.g., NEF, CR2).
  • Jp2StructCheck - Simple JP2 file structure checker
  • Jpylyzer - JP2 validation + properties extraction
  • KOST-Simy - The KOST-Simy application is used to compare images.
  • KOST-Val - KOST-Val is an open source validator for different file formats (TIFF, SIARD, PDF/A, JP2, JPEG) and Submission Information Package (SIP).
  • Matchbox Tool - Matchbox: Duplicate detection tool for digital document collections.
  • Mdqc - Tool for managing and comparing digital asset metadata
  • TIFF-Val - TIFF-Val is an open source validator for TIFF files.

Image: Rendering

Tools that support the rendering of digital resources so they can be viewed, printed, or otherwise accessed by users.

  • ImageMagick - ImageMagick® is a software suite to create, edit, compose, or convert bitmap images.
  • IrfanView - IrfanView is a very fast, small, compact and innovative FREEWARE (for non-commercial use) graphic viewer for Windows 9x, ME, NT, 2000, XP, 2003, 2008, Vista, Windows 7.
  • JJ2000 - Pure Java implementation of a JPEG2000 decoder
  • Kakadu - JPEG 2000 SDK, includes encoder/decoder

Image: Repair

Tools that support the repair of damaged or corrupted data.

  • Fixit tiff - Fixes some issues in (potentially) baseline TIFFs, for example invalid datetime tags or wrong TIFF tag order.

Image: Validation

Tools that support the validation of digital files, typically against a file format specification.

  • Bad Peggy - Scans for damaged images and photos.
  • Checkit tiff - a tool to validate TIFF files against given configuration profile
  • ImageVerifier - ImageVerifier (IV for short) traverses a hierarchy of folders looking for image files to verify. It can verify TIFFs, JPEGs, PSDs, DNGs, and non-DNG raws (e.g., NEF, CR2).
  • Jp2StructCheck - Simple JP2 file structure checker
  • Jpylyzer - JP2 validation + properties extraction
  • KOST-Simy - The KOST-Simy application is used to compare images.
  • KOST-Val - KOST-Val is an open source validator for different file formats (TIFF, SIARD, PDF/A, JP2, JPEG) and for Submission Information Packages (SIPs).
  • TIFF-Val - TIFF-Val is an open source validator for TIFF files.

Image: Workflow and Lab Notebook Management

Tools that support the capture and management of research data as well as the details of the research activities which generated them.

  • Artivity - A tool for capturing contextual data produced during the creative process of artists and designers while working on a computer.

Project Management Data

Project Management Data: Access

Tools that facilitate access to digital data by users.

  • MPP Viewer - MPP Viewer is a viewer for Microsoft Project files

Project Management Data: File Format Migration

Tools that support the transformation of data from one file format to another.

  • MPP Viewer - MPP Viewer is a viewer for Microsoft Project files

Research Data

Research Data: Academic Social Networking

Tools that support making connections, sharing research and maximising the impact of digital data.

  • MyExperiment - myExperiment is an online social networking service aimed at scientific researchers; the site fosters collaboration by allowing members to share scientific workflows, experiment plans, and other digital objects.

Research Data: Access

Tools that facilitate access to digital data by users.

  • Rescarta - The ResCarta Tools software empowers users to create non-proprietary digital objects with LOC standard METS, MODS, MIX and AudioMD metadata from existing TIFF, JPEG, PDF and WAV data through user-friendly interfaces. Digital collections can be created, indexed, displayed and validated using the software. Exports DC, OAI_DC formats for use in OAI/PMH servers.

Research Data: Active Data Storage

Tools that support the storage, management, and ultimately the preservation, of evolving research data.

  • DataStage - DataStage is a flexible data storage system that provides controlled access, secure backup, and the ability to transfer selected files to a more permanent archiving facility.
  • Dataverse - The Dataverse is an open source web application to share, preserve, cite, explore and analyze research data.

Research Data: Annotation

Tools that facilitate annotation of digital data by users.

  • Clipper - Clipper is a free open-source web application enabling researchers to create and share virtual-clips without altering the original media files

Research Data: Backup

Tools that support the backing up of digital data to another storage location, typically in a scheduled manner.

  • Data Vault - A storage broker and front end for archiving research data that is no longer active but that does not have a need for open publication

Research Data: Citation and Impact Tracking

Tools that support the citation of data and the tracking of the impact of usage of that data.

  • DataCite - DataCite works with data centres to assign persistent identifiers to datasets using the Digital Object Identifier (DOI) infrastructure.

Research Data: Data Management Planning

Tools that support the development of research data management plans and related activities.

  • CARDIO - CARDIO is a benchmarking tool for data management strategy development
  • D-Net Software Kit - Software Kit creates a network of repositories that share the infrastructure services necessary to process and provide access to digital content.
  • DMAOnline (Data Management Administration Online) - Provides a single dashboard view of how various departments contribute to RDM activities and how an institution is performing in terms of its compliance with policies
  • DMPTool - DMPTool is an online service to enable researchers to create data management plans now required by many funding agencies, and to receive tailored institutional guidance to help them in the process.
  • DMPonline - DMPonline is the DCC's data management planning tool.

Research Data: Data Capture and Deposit

Tools that enable the capture and deposit of data.

  • Artivity - A tool for capturing contextual data produced during the creative process of artists and designers while working on a computer.
  • Tabula - Extract tabular data from PDF files

Research Data: Managing Active Research Data

Tools that enable researchers to manage data from its point of creation, facilitating its productive use in the present, but also establishing the support structures necessary to ensure its future survival.

  • CRunch - cRunch provides an infrastructure for exploratory data analysis with the statistical programming language and environment R
  • D-Net Software Kit - Software Kit creates a network of repositories that share the infrastructure services necessary to process and provide access to digital content.
  • DMPTool - DMPTool is an online service to enable researchers to create data management plans now required by many funding agencies, and to receive tailored institutional guidance to help them in the process.
  • DMPonline - DMPonline is the DCC's data management planning tool.
  • DataCite - DataCite works with data centres to assign persistent identifiers to datasets using the Digital Object Identifier (DOI) infrastructure.
  • DataStage - DataStage is a flexible data storage system that provides controlled access, secure backup, and the ability to transfer selected files to a more permanent archiving facility.
  • Dataverse - The Dataverse is an open source web application to share, preserve, cite, explore and analyze research data.
  • Kepler - Kepler is a scientific workflow modelling and management system that enables users, regardless of programming experience, to set up data analysis pipelines.
  • LabTrove - LabTrove is a blogging platform specifically designed for use in a research environment.
  • MyExperiment - myExperiment is an online social networking service aimed at scientific researchers; the site fosters collaboration by allowing members to share scientific workflows, experiment plans, and other digital objects.
  • Taverna - Taverna is a scientific workflow management system designed to assemble, run, document and share sequences of web services and scripts.

Research Data: Metadata Processing

Tools that support the processing or management of metadata.

  • CSV Validator - Validation of CSV files against user-defined schema
  • NESSTAR - Nesstar suite is an online publishing platform for organisations wishing to share datasets both internally and with the wider web.
  • ReDBox - ReDBox and Mint are two complementary applications designed to create, store, and provide access to research metadata.

Research Data: Organisational Audit

Tools that enable an audit of an organisation's capability with respect to preservation, typically relating to a maturity model.

  • CARDIO - CARDIO is a benchmarking tool for data management strategy development
  • DMAOnline (Data Management Administration Online) - Provides a single dashboard view of how various departments contribute to RDM activities and how an institution is performing in terms of its compliance with policies
  • Data Asset Framework - The Data Asset Framework (formerly the Data Audit Framework) provides organisations with the means to identify, locate, describe and assess how they are managing their research data assets.
  • OPD for RDM - An RDF based list of basic RDM infrastructure components to make this infrastructure more visible and easier to identify

Research Data: Persistent Identification

Tools that support the unique and persistent identification of files or intellectual entities.

  • DataCite - DataCite works with data centres to assign persistent identifiers to datasets using the Digital Object Identifier (DOI) infrastructure.

Research Data: Personal Archiving

Tools that support the preservation and archiving of data relating to individuals.

  • Rescarta - The ResCarta Tools software empowers users to create non-proprietary digital objects with LOC standard METS, MODS, MIX and AudioMD metadata from existing TIFF, JPEG, PDF and WAV data through user-friendly interfaces. Digital collections can be created, indexed, displayed and validated using the software. Exports DC, OAI_DC formats for use in OAI/PMH servers.

Research Data: Planning

Tools that support the planning of preservation activities.

  • CARDIO - CARDIO is a benchmarking tool for data management strategy development
  • D-Net Software Kit - Software Kit creates a network of repositories that share the infrastructure services necessary to process and provide access to digital content.
  • DMAOnline (Data Management Administration Online) - Provides a single dashboard view of how various departments contribute to RDM activities and how an institution is performing in terms of its compliance with policies
  • DMPTool - DMPTool is an online service to enable researchers to create data management plans now required by many funding agencies, and to receive tailored institutional guidance to help them in the process.
  • DMPonline - DMPonline is the DCC's data management planning tool.
  • Data Asset Framework - The Data Asset Framework (formerly the Data Audit Framework) provides organisations with the means to identify, locate, describe and assess how they are managing their research data assets.

Research Data: Preservation System

Tools that support the management and preservation of digital resources, typically performing a number of functions across the digital lifecycle such as ingest, storage, preservation action and access.

  • Data Vault - A storage broker and front end for archiving research data that is no longer active but that does not have a need for open publication
  • DataFlow - DataFlow is a two-stage data management infrastructure that is designed to allow researchers to work with, annotate, publish, and permanently store research data.
  • DataStage - DataStage is a flexible data storage system that provides controlled access, secure backup, and the ability to transfer selected files to a more permanent archiving facility.
  • Dataverse - The Dataverse is an open source web application to share, preserve, cite, explore and analyze research data.
  • ReDBox - ReDBox and Mint are two complementary applications designed to create, store, and provide access to research metadata.
  • Rescarta - The ResCarta Tools software empowers users to create non-proprietary digital objects with LOC standard METS, MODS, MIX and AudioMD metadata from existing TIFF, JPEG, PDF and WAV data through user-friendly interfaces. Digital collections can be created, indexed, displayed and validated using the software. Exports DC, OAI_DC formats for use in OAI/PMH servers.
  • The Dataverse Network Project - The Dataverse Network Project is an open-source application for publishing, citing and discovering research data.

Research Data: Service

  • DMPonline - DMPonline is the DCC's data management planning tool.
  • NESSTAR - Nesstar suite is an online publishing platform for organisations wishing to share datasets both internally and with the wider web.

Research Data: Storage

Tools that support the storage of digital resources, possibly in multiple locations to avoid loss of data due to hardware or other failures.

  • Data Vault - A storage broker and front end for archiving research data that is no longer active but that does not have a need for open publication
  • Dataverse - The Dataverse is an open source web application to share, preserve, cite, explore and analyze research data.

Research Data: Validation

Tools that support the validation of digital files, typically against a file format specification.

  • CSV Validator - Validation of CSV files against user-defined schema

Research Data: Workflow

Tools that support the orchestration and management of specific tools or processes in a workflow.

  • MyExperiment - myExperiment is an online social networking service aimed at scientific researchers; the site fosters collaboration by allowing members to share scientific workflows, experiment plans, and other digital objects.
  • Taverna - Taverna is a scientific workflow management system designed to assemble, run, document and share sequences of web services and scripts.

Research Data: Workflow and Lab Notebook Management

Tools that support the capture and management of research data as well as the details of the research activities which generated them.

  • Artivity - A tool for capturing contextual data produced during the creative process of artists and designers while working on a computer.
  • CRunch - cRunch provides an infrastructure for exploratory data analysis with the statistical programming language and environment R
  • Kepler - Kepler is a scientific workflow modelling and management system that enables users, regardless of programming experience, to set up data analysis pipelines.
  • LabTrove - LabTrove is a blogging platform specifically designed for use in a research environment.
  • MyExperiment - myExperiment is an online social networking service aimed at scientific researchers; the site fosters collaboration by allowing members to share scientific workflows, experiment plans, and other digital objects.
  • Taverna - Taverna is a scientific workflow management system designed to assemble, run, document and share sequences of web services and scripts.

Software

Software: Backup

Tools that support the backing up of digital data to another storage location, typically in a scheduled manner.

Software: Disk Imaging

Tools that enable the capture, viewing or extraction of contents of a disk image (which is a computer file containing the contents and structure of a disk volume or an entire data storage device, such as a hard drive or floppy disk).

Software: Emulation

Tools that enable the emulation or virtualisation of a hardware or software system on another system.

  • KEEP Emulation Framework - KEEP Emulation Framework (EF) allows users to view and interact with digital files that otherwise would require obsolete hardware and software.
  • Recompute - Automatically generates "playable" virtual machines from source code on github

Software: Metadata Extraction

Tools that support the extraction of metadata from files.

  • DiscImageChef - Media dump software and disc image manager
  • EXE Explorer - EXE Explorer reads and displays executable file properties and structure.

Software: Preservation System

Tools that support the management and preservation of digital resources, typically performing a number of functions across the digital lifecycle such as ingest, storage, preservation action and access.

Spreadsheet

Spreadsheet: Dependency Analysis

Tools for identifying essential information that resides externally to a digital object, or for identifying dependent processes such as which DLLs are required by a Windows process.

  • Dependency Discovery Tool - The Dependency Discovery Tool searches through binary office files (.doc, .xls and .ppt) and tries to find any documents or files that are linked to the document.

Spreadsheet: File Format Identification

Tools that enable the automatic identification of the file format of a particular file, typically by examining characteristic codes (often termed file format magic) in the file header.

  • Officeparser.py - officeparser.py is a Python script that parses the format of OLE compound documents used by Microsoft Office applications.

Spreadsheet: File Format Migration

Tools that support the transformation of data from one file format to another.

  • Lingfo - Lingfo provides a library for developers to use to extract information from Microsoft Excel spreadsheet files.
  • MIXED (Migration to Intermediate XML for Electronic Data) - MIXED (Migration to Intermediate XML for Electronic Data) is a web service that converts tabular data files such as spreadsheets and databases to the Standard Data Format for Preservation (SDFP), a supplier-independent XML format.
  • Ssconvert - ssconvert is a command line utility to convert spreadsheet files between various spreadsheet file formats.

Spreadsheet: Metadata Extraction

Tools that support the extraction of metadata from files.

  • Lingfo - Lingfo provides a library for developers to use to extract information from Microsoft Excel spreadsheet files.
  • Officeparser.py - officeparser.py is a Python script that parses the format of OLE compound documents used by Microsoft Office applications.

Spreadsheet: Metadata Processing

Tools that support the processing or management of metadata.

  • CSV Validator - Validation of CSV files against user-defined schema

Spreadsheet: Validation

Tools that support the validation of digital files, typically against a file format specification.

  • CSV Validator - Validation of CSV files against user-defined schema

Video

Video: Annotation

Tools that facilitate annotation of digital data by users.

  • Clipper - Clipper is a free open-source web application enabling researchers to create and share virtual-clips without altering the original media files

Video: Disk Imaging

Tools that enable the capture, viewing or extraction of contents of a disk image (which is a computer file containing the contents and structure of a disk volume or an entire data storage device, such as a hard drive or floppy disk).

  • IsoBuster - Recover data from CD, DVD, BD, HDD, Flash drive, USB stick, media card, SD and SSD.
  • Paranoia - "Use your CDROM drive to read audio tracks.... and have it actually work right!"

Video: File Format Identification

Tools that enable the automatic identification of the file format of a particular file, typically by examining characteristic codes (often termed file format magic) in the file header.

  • Media conch - Media Conch is an implementation checker, policy checker and fixer for audiovisual files, with a focus on Matroska, LPCM and FFV1.

Video: File Format Migration

Tools that support the transformation of data from one file format to another.

  • Audio/Video to WAV Converter - This tool converts audio and video files to WAV format.
  • FFmpeg - FFmpeg is a complete, cross-platform solution to record, convert and stream audio and video.
  • Open Video Converter - This tool is for video conversion, splitting and editing.

Video: Metadata Extraction

Tools that support the extraction of metadata from files.

  • ExifTool - Properties extraction, identification, metadata editing
  • GetID3() - Extracts technical and embedded descriptive metadata from common multimedia file formats.
  • Mdqc - Tool for managing and comparing digital asset metadata
  • MediaInfo - Supplies technical and tag information about a video or audio file.
  • NARA Video Frame Analyzer - NARA Video Frame Analyzer analyzes technical properties of individual frames of a video file in order to detect quality issues within digitized video files.

Video: Metadata Processing

Tools that support the processing or management of metadata.

  • DV Analyzer - DV Analyzer is a technical quality control and reporting tool that examines DV streams in order to report errors in the tape-to-file transfer process.
  • ExifTool - Properties extraction, identification, metadata editing
  • Mdqc - Tool for managing and comparing digital asset metadata

Video: Policy

Tools that support the development and management of digital preservation policy.

  • Media conch - Media Conch is an implementation checker, policy checker and fixer for audiovisual files, with a focus on Matroska, LPCM and FFV1.

Video: Preservation System

Tools that support the management and preservation of digital resources, typically performing a number of functions across the digital lifecycle such as ingest, storage, preservation action and access.

  • Libsafe - libsafe allows organizations to create a fully OAIS-compliant archive, including active and passive digital preservation workflows, and is particularly suited to master image files from digitization processes.

Video: Quality Assurance

Tools that support quality checking of digital resources, identifying damaged, incomplete or low quality data. Typically used to identify damage introduced via processes such as format migration or digitisation.

  • DV Analyzer - DV Analyzer is a technical quality control and reporting tool that examines DV streams in order to report errors in the tape-to-file transfer process.
  • Mdqc - Tool for managing and comparing digital asset metadata
  • NARA Video Frame Analyzer - NARA Video Frame Analyzer analyzes technical properties of individual frames of a video file in order to detect quality issues within digitized video files.
  • Qctools - Analyse digital video and detect corruption/artefacts

Video: Rendering

Tools that support the rendering of digital resources so they can be viewed, printed, or otherwise accessed by users.

  • VLC Media Player - Cross-platform audio and video player based primarily on libavcodec.

Video: Validation

Tools that support the validation of digital files, typically against a file format specification.

  • Media conch - Media Conch is an implementation checker, policy checker and fixer for audiovisual files, with a focus on Matroska, LPCM and FFV1.

Video: Web Crawl

Tools that support the capture of data from the world wide web, typically by "crawling" links between resources.

  • TubeKit - TubeKit is a toolkit for creating YouTube crawlers.

Web

Web: Access

Tools that facilitate access to digital data by users.

  • Wayback Machine - The Wayback Machine is a powerful search and discovery tool for use with collections of Web site "snapshots" collected through Web harvesting, usually with Heritrix (ARC or WARC files).

Web: Content Profiling

Tools that build a profile of the characteristics of digital content, typically by combining or analysing a number of sources of information such as extracted metadata and file format identifications.

Web: Discovery

Tools that facilitate the discovery of digital data by users.

  • Wayback Machine - The Wayback Machine is a powerful search and discovery tool for use with collections of Web site "snapshots" collected through Web harvesting, usually with Heritrix (ARC or WARC files).
  • Web Archive Discovery - Indexing and discovery tools for web archives.

Web: File Format Identification

Tools that enable the automatic identification of the file format of a particular file, typically by examining characteristic codes (often termed file format magic) in the file header.
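As a rough illustration of the "file format magic" idea in the definition above, the following Python sketch compares the first bytes of a file against a small table of well-known signatures. The table and the function names (identify_bytes, identify_file) are assumptions made for this example; real identification tools rely on far larger signature registries and also inspect offsets beyond the start of the file.

```python
# Minimal sketch of magic-number identification (illustrative only).
# The signature table is a small hand-picked sample, not a registry.
MAGIC_SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"%PDF-", "PDF"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"II*\x00", "TIFF (little-endian)"),
    (b"MM\x00*", "TIFF (big-endian)"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"WARC/", "WARC (uncompressed)"),
]


def identify_bytes(header):
    """Return a format name if the header starts with a known signature."""
    for signature, name in MAGIC_SIGNATURES:
        if header.startswith(signature):
            return name
    return None  # unknown: no signature in the sample table matched


def identify_file(path, read_size=16):
    """Read only the first few bytes of a file and identify them."""
    with open(path, "rb") as handle:
        return identify_bytes(handle.read(read_size))
```

Calling identify_file on a path returns a format name, or None when nothing in the sample table matches, which is the point at which a real tool would fall back to other evidence such as file extensions or container parsing.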

Web: File Format Migration

Tools that support the transformation of data from one file format to another.

  • DeepArc - Intended for preserving web sites from the back-end, this is a database-to-XML curation tool.
  • JWAT - Java Web Archive Toolkit

Web: Fixity

Tools that support the verification of file fixity, typically through the generation and validation of checksum based manifests.
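To make the checksum-manifest approach in the definition above concrete, here is a minimal Python sketch that records a SHA-256 digest for every file under a directory and later reports files that are missing or have changed. The function names (sha256_of, build_manifest, verify_manifest) and the choice of SHA-256 are assumptions for illustration, not the behaviour of any particular tool in this list.

```python
# Minimal sketch of checksum-manifest fixity checking (illustrative only).
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root, manifest):
    """Return (relative_path, problem) pairs; an empty list means all good."""
    root = Path(root)
    problems = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file():
            problems.append((rel, "missing"))
        elif sha256_of(target) != expected:
            problems.append((rel, "changed"))
    return problems
```

A typical pattern would be to call build_manifest at ingest, store the result alongside the content, and run verify_manifest on a schedule. Note that this sketch does not flag files added after the manifest was built.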

Web: Metadata Extraction

Tools that support the extraction of metadata from files.

  • JWAT - Java Web Archive Toolkit
  • Warctools - Command line tools and libraries for handling and manipulating WARC files (and HTTP contents)
  • Web Archive Discovery - Indexing and discovery tools for web archives.

Web: Metadata Processing

Tools that support the processing or management of metadata.

  • WCT (Web Curator Tool) - Web Curator Tool (WCT) is a workflow management application for selective web archiving.

Web: Quality Assurance

Tools that support quality checking of digital resources, identifying damaged, incomplete or low quality data. Typically used to identify damage introduced via processes such as format migration or digitisation.

Web: Service

  • Archive-It - Archive-It is the leading web archiving service for collecting and accessing cultural heritage on the web. It is a service provided by the Internet Archive.
  • WAS (Web Archiving Service) - The Web Archiving Service (WAS) is a Web-based curatorial tool that enables libraries and archivists to capture, curate, analyze, and preserve Web-based government and political information.

Web: Validation

Tools that support the validation of digital files, typically against a file format specification.

  • JWAT - Java Web Archive Toolkit
  • Warctools - Command line tools and libraries for handling and manipulating WARC files (and HTTP contents)

Web: Web Crawl

Tools that support the capture of data from the world wide web, typically by "crawling" links between resources.

  • Archive-It - Archive-It is the leading web archiving service for collecting and accessing cultural heritage on the web. It is a service provided by the Internet Archive.
  • ArchiveFacebook - ArchiveFacebook is a Firefox extension which allows individuals to save and manage Facebook web content.
  • DeepArc - Intended for preserving web sites from the back-end, this is a database-to-XML curation tool.
  • GNU Wget - Non-interactive network downloader
  • HTTrack - HTTrack is a website copying utility.
  • Heritrix - Heritrix is an open-source web crawler, allowing users to target websites they wish to include in a collection and to harvest an instance of each site.
  • NetarchiveSuite - NetarchiveSuite is a web archiving software package designed to plan, schedule and run web harvests of parts of the Internet.
  • NutchWAX - NutchWAX is software for indexing ARC files (archived Web sites gathered using Heritrix) for full text search.
  • SiteStory - SiteStory is a transactional web archive. It archives resources of a web server it is associated with.
  • Storytracker - Tools for tracking stories on news homepages
  • TubeKit - TubeKit is a toolkit for creating YouTube crawlers.
  • WAS (Web Archiving Service) - The Web Archiving Service (WAS) is a Web-based curatorial tool that enables libraries and archivists to capture, curate, analyze, and preserve Web-based government and political information.
  • WCT (Web Curator Tool) - Web Curator Tool (WCT) is a workflow management application for selective web archiving.
  • WarcManager - The WARC Manager is a web-based UI for managing and querying collections of web crawl data.
  • Warrick - Warrick is a free utility for reconstructing (or recovering) a website from web archives.
  • Wayback Machine - The Wayback Machine is a powerful search and discovery tool for use with collections of Web site "snapshots" collected through Web harvesting, usually with Heritrix (ARC or WARC files).

Web: Web Snapshot

Tools that support the capture of a static snapshot of a web page.

  • Khtml2png - khtml2png is a command line program to create screenshots of webpages.
  • Pearl Crescent Page Saver - Pearl Crescent Page Saver is an extension for Mozilla Firefox that lets you capture images of web pages, including Flash content.
  • Snagit - Snagit is screen capture software to create interesting training documents, collaborative design work, IT bug reports, and more.
  • WebShot - WebShot allows you to take screenshots of web pages and save them as full sized images or thumbnails.
  • Webkit2png - webkit2png is a command line tool that creates png screenshots of webpages.

~Not Content Type Specific~

~Not Content Type Specific~: Academic Social Networking

Tools that support making connections, sharing research and maximising the impact of digital data.

  • Mendeley - Mendeley is a combination web service and desktop application that allows users to create, manage, and share collections of references.
  • ResearchGate - ResearchGate is an online professional network for scientists and researchers, particularly employed by those wishing to follow and track the publication outputs of others in their field.

~Not Content Type Specific~: Access

Tools that facilitate access to digital data by users.

  • ArchivesSpace - ArchivesSpace is the next-generation web-based archives information management system, designed by archivists and supported by diverse archival repositories.
  • Archon - Archon automatically publishes archival descriptive information and digital archival objects in a user-friendly website.
  • CollectiveAccess - CollectiveAccess is web-based software to catalogue, manage, and publish museum and archival collections.
  • DSpace - DSpace is an institutional repository system which enables easy deposit, preservation, and access for all types of digital content.
  • Djatoka - djatoka is open source Java software that builds upon a rich set of APIs and libraries to provide a service framework for the dynamic dissemination of JPEG 2000 image files.
  • EPrints - EPrints is an open access digital repository software, which is intended to create a highly configurable web-based repository.
  • LOCKSS (Lots of Copies Keep Stuff Safe) - LOCKSS software allows libraries to create preserved digital collections out of materials that would otherwise be accessible only through a licensed academic subscription.
  • Omeka - Omeka is a free open source web-publishing platform for the display of library, museum, archives, and scholarly collections and exhibitions.
  • Recollection - Recollection is a free open source platform for generating and customizing views (interactive maps, timelines, facets, tag clouds) that allow scholars, librarians, and curators to explore digital collections.
  • Rosetta - Ex Libris Rosetta enables institutions to preserve and provide access to the collections in their care.
  • Simile Exhibit - Exhibit lets you easily create web pages with advanced text search and filtering functionalities, with interactive maps, timelines, and other visualizations.
  • SobekCM - SobekCM is a digital repository and digital scholarship/publishing system which enables easy deposit, preservation, and access for all types of digital content, tailored to the needs of galleries, libraries, archives, museums, scholars, and researchers.
  • The Open Video Digital Library Toolkit - The Open Video Digital Library Toolkit project is intended to provide museums, libraries and other institutions holding moving image collections tools to more easily create Web-based digital video libraries.
  • Voyeur - Voyeur is a web-based text analysis environment that can use texts in a variety of formats, from different locations to perform lexical analysis, export data to other tools, and embed live tools into remote websites.
  • Wayfinder - Wayfinder is a developing resource for students and researchers to use in browsing digital archives.

~Not Content Type Specific~: Backup

Tools that support the backing up of digital data to another storage location, typically in a scheduled manner.

  • Carbonite - an online backup service that automatically backs up documents, e-mails, music, photos, and settings. Info gathered early March 2013.
  • Chronopolis - "Chronopolis digital preservation network provides services for the long-term preservation and curation of America's digital holdings"
  • Dropbox - Dropbox is a free service that lets you bring all your photos, docs, and videos anywhere. This means that any file you save to your Dropbox will automatically save to all your computers, phones and even the Dropbox website. Dropbox also makes it super easy to share with others, whether you're a student or professional, parent or grandparent. Even if you accidentally spill a latte on your laptop, have no fear! You can relax knowing that Dropbox always has you covered, and none of your stuff will ever be lost.
  • Glacier (Amazon) - Amazon Glacier is a secure, durable, and extremely low-cost cloud storage service for data archiving and long-term backup.
  • SafeBack - SafeBack is used to create mirror-image (bit-stream) backup files of hard disks or to make a mirror-image copy of an entire hard disk drive or partition.

~Not Content Type Specific~: Benefits

Tools that enable the identification and articulation of the benefits of preservation and curation.

~Not Content Type Specific~: Binary & Hexadecimal Editing

Tools for viewing and editing files in different views, such as binary or hexadecimal. These are typically known as hex editors.

  • HxD - Free Hex- and Ram-Editor

~Not Content Type Specific~: Citation and Impact Tracking

Tools that support the citation of data and the tracking of the impact of usage of that data.

  • ImpactStory - ImpactStory (previously Total-Impact) allows researchers and organisations to gather a wide range of impact metrics about multiple forms of scholarly output.
  • Mendeley - Mendeley is a combination web service and desktop application that allows users to create, manage, and share collections of references.
  • ReaderMeter - ReaderMeter is a web-based service that compiles readership information about scientific content to create an estimate of the content's community impact.

~Not Content Type Specific~: Content Profiling

Tools that build a profile of the characteristics of digital content, typically by combining or analysing a number of sources of information such as extracted metadata and file format identifications.

  • Brunnhilde - Siegfried-based characterization of directories and disk images
  • C3PO - C3PO is a content profiling tool for visualization and preservation analysis
  • DROID Siegfried Sqlite Analysis Engine - Analysis and automatic generation of summary information from DROID output
  • Yara - Pattern matching tool

~Not Content Type Specific~: Costing

Tools that support the calculation or prediction of the cost of preservation or curation activities.

~Not Content Type Specific~: Data capture and Deposit

Tools that enable the capture and deposit of data.

  • Screen-scraper - screen-scraper is a tool for extracting data from websites.
  • WARCreate - Google Chrome browser extension for creating WARC files from web pages
  • Web Scraper Plus+ - Web Scraper Plus+ takes data from the web and puts it into a spreadsheet or database.

~Not Content Type Specific~: De-Duplication

Tools that enable the identification and/or removal of duplicate or similar files.

  • DROID Siegfried Sqlite Analysis Engine - Analysis and automatic generation of summary information from DROID output
  • FileVerifier++ - Windows utility for verifying file contents
  • Fslint - Set of utilities to find and clean various forms of lint on a filesystem, such as duplicate files, empty directories, and bad file names.
  • GNU Diffutils - GNU Diffutils is a package of several programs related to finding differences between files.
  • Java library implementing Pairtree - The PAIRTREE LIBRARY is a software library that supports the mapping between identifiers and filepaths according to the Pairtree Specification.
  • SSDeep - Recursive piecewise hashing tool
  • The DeDuplicator (Heritrix add-on module) - The DeDuplicator is an add-on module for Heritrix to reduce the amount of duplicate data collected in a series of snapshot crawls.
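
As a rough illustration of how exact duplicates can be identified, the sketch below groups the files under a directory by the SHA-256 digest of their contents. It is a minimal example and not the method used by any tool listed above; the function name find_duplicates is hypothetical, and it makes no attempt to detect files that are merely similar.

```python
import hashlib
import os
from collections import defaultdict


def find_duplicates(root):
    """Return groups of relative paths under root whose contents are byte-identical."""
    by_digest = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            # Reads each file whole for simplicity; large files would be
            # better hashed in chunks.
            with open(full, "rb") as fh:
                digest = hashlib.sha256(fh.read()).hexdigest()
            by_digest[digest].append(os.path.relpath(full, root))
    # Keep only digests shared by more than one file.
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Each returned group lists files that share a digest, so all but one copy in a group are candidates for removal after manual review.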

~Not Content Type Specific~: Decryption

Tools for recovering passwords or unlocking encrypted digital files.

  • AccessData Decryption Tools - This page gives information on AccessData Decryption Tools.
  • ElcomSoft - ElcomSoft offers numerous password recovery applications.
  • Password Recovery Software - Passware software recovers or resets passwords for Windows, Word, Excel, QuickBooks, Access, Acrobat, and more than 180 document types.

~Not Content Type Specific~: Dependency Analysis

Tools for identifying essential information that resides externally to a digital object, or for identifying dependent processes such as which DLLs are required by a Windows process.

  • Nuclear Processor - Process/module manager for Windows, with features such as Kill/Resume/Suspend thread of a process and unload DLL files
  • PERICLES Extraction Tool (PET) - A tool to capture contextual information in a sheer curation scenario

~Not Content Type Specific~: Discovery

Tools that facilitate the discovery of digital data by users.

  • EnCase eDiscovery - EnCase eDiscovery is the market leading e-discovery software that enables more efficient business process and significantly reduces legal risk and cost with a judicially accepted solution that provides everything from legal hold to first pass review and is scalable, defensible, and repeatable.
  • Project Blacklight - Blacklight is a free and open source ruby-on-rails based discovery interface (a.
  • SobekCM - SobekCM is a digital repository and digital scholarship/publishing system which enables easy deposit, preservation, and access for all types of digital content, tailored to the needs of galleries, libraries, archives, museums, scholars, and researchers.
  • TReSy - TReSy is an XML search engine oriented to text retrieval.
  • UpLib - UpLib is a personal digital library system that provides a long-term archival system with powerful search and a visually oriented retrieval mechanism, suitable for a wide variety of personal documents such as papers, photos, receipts, music, Web pages, books, clippings, and email.
  • Wayfinder - Wayfinder is a developing resource for students and researchers to use in browsing digital archives.
  • XPAT - The XPAT engine is an SGML/XML-aware search engine that the University of Michigan has deployed with an extremely diverse set of digital library resources.

~Not Content Type Specific~: Disk Imaging

Tools that enable the capture, viewing or extraction of contents of a disk image (which is a computer file containing the contents and structure of a disk volume or an entire data storage device, such as a hard drive or floppy disk).

  • CloneCD - CloneCD is the perfect tool to make backup copies of your music and data CDs, regardless of copy protection.
  • Dc3dd for computer forensics - dc3dd is a patched version of GNU dd with a number of features useful for computer forensics.
  • DriveImage XML - DriveImage XML is an easy to use and reliable program for imaging and backing up partitions and logical drives.
  • GetDriveInfo2 - GetDriveInfo2 is a Win32 program that examines the optical and removable media drives currently mounted on a computer, and returns information about those devices (in the case of optical devices it also returns information about any media currently mounted in the device).
  • IMAGE - IMAGE is a DOS application capable of generating either highly compressed or "flat" images for forensic analysis.
  • PhotoRescue - PhotoRescue is the best and fairest picture and data recovery solution for digital film - sd cards, compact flash, memory sticks, microdrive, etc.
  • Power ISO - PowerISO is a powerful CD/DVD image file processing tool, which allows you to open, extract, create, edit, compress, encrypt, split and convert ISO files, and mount these files with internal virtual drive.
  • Virtual CloneDrive - Virtual CloneDrive works and behaves just like a physical CD/DVD drive, but it exists only virtually.
  • Zlon HDD cloning and imaging - Zlon is a disk imaging tool.

~Not Content Type Specific~: Emulation

Tools that enable the emulation or virtualisation of a hardware or software system on another system.

  • Dioscuri - Dioscuri is a computer hardware emulator, specifically designed to be used as part of a digital preservation strategy.
  • IBM Digital Asset Preservation Tool - IBM's Digital Asset Preservation Tool is a proof-of-concept demonstration of the Universal Virtual Computer solution that provides long-term access to JPEG and GIF87a files.
  • JPC - JPC is the fast pure Java x86 PC emulator.
  • Kernel-based virtual machine - KVM (for Kernel-based Virtual Machine) is a full virtualization solution for Linux on x86 hardware containing virtualization extensions (Intel VT or AMD-V).
  • Linux-VServer - Linux-VServer provides virtualization for GNU/Linux systems.
  • OpenVZ wiki - OpenVZ is container-based virtualization for Linux.
  • VMware Player - VMware Player is the easiest way to run multiple operating systems at the same time on your PC.
  • VirtualBox - VirtualBox is a powerful x86 and AMD64/Intel64 virtualization product for enterprise as well as home use.
  • Windows Virtual PC - Windows XP Mode and Windows Virtual PC, available on Windows 7 Professional and Windows 7 Ultimate, allow you to run multiple Windows environments, such as Windows XP Mode, from your Windows 7 desktop.
  • Wine - Wine lets you run Windows software on other operating systems.
  • Xen - The Xen hypervisor, the powerful open source industry standard for virtualization, offers a powerful, efficient, and secure feature set for virtualization of x86, x86_64, IA64, ARM, and other CPU architectures.

~Not Content Type Specific~: Encryption Detection

Tools that support the detection of encryption or password protection in files.

  • FITS (File Information Tool Set) - FITS allows data curators to identify, validate, and extract technical metadata for the objects in their digital repository.
  • JHOVE (Harvard Object Validation Environment) - JHOVE provides functions to perform format-specific identification, validation, and characterization of digital objects.
  • JHOVE2 - JHOVE2 allows data curators to characterise the digital objects in their repositories.

~Not Content Type Specific~: File Copy

Tools that support the copying of files from one storage location to another, typically with facilities to verify the completeness of the copy and enable resumption of copying after an interruption.

  • BIL (BagIt Library) - BagIt Library is a Java software library that supports the creation, manipulation and validation of bags.
  • BagIt Transfer Utilities - BagIt transfer Utilities are a collection of tools developed for the purpose of validation and transfer of bags.
  • Cp Unix command - cp copies files (or, optionally, directories). Part of GNU coreutils.
  • Cryptcat - Cryptcat is a lightweight version of netcat with integrated transport encryption capabilities.
  • Dcfldd - dcfldd is an enhanced version of GNU dd with features useful for forensics and security.
  • Dd Unix command - This page gives information on using the dd Unix command.
  • XXCopy - XXCopy is an expanded version of Xcopy
  • Xcopy - Xcopy copies files and directories, including subdirectories.
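
The verification step mentioned in the description above can be sketched as follows: copy the file, then compare digests of the source and the destination. This is a minimal example under stated assumptions, not the behaviour of any listed tool; copy_and_verify and file_digest are hypothetical names, and resumption after an interruption is not covered.

```python
import hashlib
import shutil


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(src, dst):
    """Copy src to dst and raise OSError if the two digests differ."""
    expected = file_digest(src)
    shutil.copyfile(src, dst)
    actual = file_digest(dst)
    if actual != expected:
        raise OSError(f"copy of {src} to {dst} failed verification")
    return actual
```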

~Not Content Type Specific~: File Format Identification

Tools that enable the automatic identification of the file format of a particular file, typically by examining characteristic codes (often termed file format magic) in the file header.

  • Apache Tika - Java based tool for identifying file formats using signatures and extracting metadata and text content from documents.
  • DROID (Digital Record Object Identification) - DROID (Digital Record Object Identification) is a software tool developed to perform automated batch identification of file formats.
  • DUMPBIN Utility - The DUMPBIN utility, which is provided with the 32-bit version of Microsoft Visual C++, combines the abilities of the LINK, LIB, and EXEHDR utilities.
  • FIDO (Format Identification for Digital Objects) - A PRONOM based, command line, file format identification tool written in Python
  • FIDOO - A PRONOM based, online file format identification tool written in JavaScript and HTML5
  • FITS (File Information Tool Set) - FITS allows data curators to identify, validate, and extract technical metadata for the objects in their digital repository.
  • Fine Free File Command - This is the home page for the open source implementation of the file(1) command that ships with every free operating system (OpenBSD, Linux, NetBSD, FreeBSD, etc.)
  • Gvfs-info - gvfs-info - print information about files and directories
  • JHOVE (Harvard Object Validation Environment) - JHOVE provides functions to perform format-specific identification, validation, and characterization of digital objects.
  • JHOVE2 - JHOVE2 allows data curators to characterise the digital objects in their repositories.
  • Libmagic-dev - This library can be used to classify files according to magic number tests.
  • Libsharedmime - This is an implementation for libsharedmime.
  • NARA File Analyzer and Metadata Harvester - NARA File Analyzer and Metadata Harvester allows a user to analyze the contents of a file system or external drive and generates statistics about the contents of the contained directories.
  • Nanite - A friendly swarm of format-identifying robots
  • Ohcount - Analyses plain text files, looking for code (scripting languages etc.)
  • PRONOM Signature Development Utility - Output DROID compatible file format signature files using PRONOM syntax
  • Siegfried - A PRONOM based, command line, file format identification tool using Aho-Corasick matching and no buffer limits.
  • TrID File Identifier - TrID is a utility designed to identify file types from their binary signatures.
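
To make the idea of file format magic concrete, the sketch below matches the first bytes of a file against a small table of well-known signatures. The table is illustrative only and the function names are hypothetical; the tools listed above rely on much larger signature sets (PRONOM, for example) and on more elaborate matching rules.

```python
# Illustrative signature table: (byte offset, magic bytes, label).
# Only a handful of widely documented signatures are included.
SIGNATURES = [
    (0, b"\x89PNG\r\n\x1a\n", "PNG image"),
    (0, b"%PDF-", "PDF document"),
    (0, b"\xff\xd8\xff", "JPEG image"),
    (0, b"GIF87a", "GIF image (87a)"),
    (0, b"GIF89a", "GIF image (89a)"),
    (0, b"PK\x03\x04", "ZIP container"),
]


def identify_bytes(header):
    """Return the label of the first signature found in header, else 'unknown'."""
    for offset, magic, label in SIGNATURES:
        if header[offset:offset + len(magic)] == magic:
            return label
    return "unknown"


def identify_file(path, read_size=64):
    """Identify a file by reading only its first read_size bytes."""
    with open(path, "rb") as fh:
        return identify_bytes(fh.read(read_size))
```

A ZIP match is deliberately labelled as a container, because several document formats are packaged as ZIP files and share the same leading bytes; telling them apart requires looking inside the container.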

~Not Content Type Specific~: File Format Migration

Tools that support the transformation of data from one file format to another.

  • Archivematica - Archivematica is a digital preservation system that automates the process of preparing digital objects for ingest into a repository and an access system
  • CDS Convert - CDS Convert is a suite of tools that allow conversion of documents, presentations and images between different software formats.
  • DocMorph: Electronic Document Conversion - The U.S. National Library of Medicine's (NLM) document conversion tools make the exchange and use of biomedical library electronic information easier for librarians, library users, and the general public
  • MSIL Disassembler (Ildasm.exe) - The MSIL Disassembler is a companion tool to the MSIL Assembler (Ilasm.
  • Open Office - OpenOffice.org 3 is the leading open-source office software suite for word processing, spreadsheets, presentations, graphics, databases and more.
  • OpenXML/ODF Translator Add-in for Office - The goal for this project is to provide translators to allow for interoperability between applications based on ODF (OpenDocument) 1.
  • Oracle Outside In Technology - Outside In Technology is a suite of software development kits (SDKs) that provides developers with a comprehensive solution to access, transform and control the contents of over 500 unstructured file formats.
  • PREMIS in METS (PiM) Toolbox - PREMIS in METS Toolbox was developed to support the implementation of PREMIS in the METS container format.
  • Rosetta - Ex Libris Rosetta enables institutions to preserve and provide access to the collections in their care.
  • Xena - Detecting the file formats of digital objects; converting digital objects into open formats for preservation.

~Not Content Type Specific~: File Management

Tools that support general file management activities such as viewing or renaming.

  • BAT: BnfArcTools - BAT is a Perl package for processing Internet Archive ARC, DAT and CDX file format.
  • Bulk Rename Utility - Bulk Rename Utility is a free file renaming software for Windows. Bulk Rename Utility allows you to easily rename files and entire folders based upon extremely flexible criteria.
  • Dcfldd - dcfldd is an enhanced version of GNU dd with features useful for forensics and security.
  • DiskView - DiskView shows you a graphical map of your disk, allowing you to determine where a file is located or, by clicking on a cluster, seeing which file occupies it.
  • Explore2fs - Explore2fs is a GUI explorer tool for accessing ext2 and ext3 filesystems.
  • File Analyzer and Metadata Harvester V2 - The File Analyzer is a general purpose desktop (and command line) tool designed to automate simple, file-based operations. The File Analyzer assembles a toolkit of tasks a user can perform. The tasks that have been written into the File Analyzer code base have been optimized for use by libraries, archives, and other cultural heritage institutions.
  • Fslint - Set of utilities to find and clean various forms of lint on a filesystem, such as duplicate files, empty directories, and bad file names.
  • Java library implementing Pairtree - The PAIRTREE LIBRARY is a software library that supports the mapping between identifiers and filepaths according to the Pairtree Specification.
  • ReACT (Resource Audit and Comparison Tool) - A file audit and comparison tool using Microsoft Excel and VBA.
  • ReNamer - ReNamer is a very powerful and flexible file renaming tool.
  • The Rename - bulk renaming of files - Bulk renaming of files - free downloadable software
  • TreeSize Professional - disk space management software - Manage disk space and scan your hard disks.

~Not Content Type Specific~: File Recovery

Tools that support the recovery of data from damaged or corrupted storage devices such as disks.

  • Dd rescue - dd_rescue is suitable for rescuing data from a medium with errors, i.
  • Foremost - Foremost is a console program to recover files based on their headers, footers, and internal data structures.
  • GetDataBack - GetDataBack will recover your data if the hard drive's partition table, boot record, FAT/MFT or root directory are lost or damaged, data was lost due to a virus attack, the drive was formatted, fdisk has been run, a power failure has caused a system crash, files were lost due to a software failure, files were accidentally deleted.
  • Ontrack EasyRecovery - Ontrack EasyRecovery software products offer home users or businesses complete solutions for their data recovery, file repair and disk diagnostic needs.
  • PhotoRec - PhotoRec is file data recovery software designed to recover lost files including video, documents and archives from hard disks, CD-ROMs, and lost pictures (thus the Photo Recovery name) from digital camera memory.
  • PhotoRescue - PhotoRescue is the best and fairest picture and data recovery solution for digital film - sd cards, compact flash, memory sticks, microdrive, etc.
  • Recovery is Possible - Recovery Is Possible (RIP) is a CD or USB boot/rescue/backup/maintenance system.
  • Restorer Ultimate - Restorer Ultimate offers data recovery software.
  • Safecopy - low level data recovery tool
  • SalvageData Recovery - SalvageData Recovery software tools and products are designed to empower both IT professionals and average personal computer users with all the functionalities and features needed to successfully salvage and recover data files from any kind of logical data loss situation.
  • SpinRite - SpinRite is a magnetic storage data recovery, repair, and maintenance utility.
  • TestDisk - TestDisk is powerful free data recovery software that was primarily designed to help recover lost partitions and/or make non-booting disks bootable again when these symptoms are caused by faulty software, certain types of viruses or human error (such as accidentally deleting a Partition Table).
  • Unrm - unrm is a small shell utility that can, under some circumstances, recover almost 99% of your erased data (similar to DOS's undelete).
  • Windows data recovery with ZAR - ZAR is Windows data recovery software.

~Not Content Type Specific~: Fixity

Tools that support the verification of file fixity, typically through the generation and validation of checksum based manifests.

  • ACE (Audit Control Environment) - The Auditing Control Environment is a mature set of software designed to help libraries and archives prove their holdings are intact and trustworthy.
  • BIL (BagIt Library) - BagIt Library is a Java software library that supports the creation, manipulation and validation of bags.
  • BagIt Transfer Utilities - BagIt transfer Utilities are a collection of tools developed for the purpose of validation and transfer of bags.
  • Bagger - GUI application to facilitate the creation and verification of BagIt bags.
  • Cksum Unix command - cksum computes a cyclic redundancy check (CRC) checksum for each given file, or standard input if none are given
  • File Analyzer and Metadata Harvester V2 - The File Analyzer is a general purpose desktop (and command line) tool designed to automate simple, file-based operations. The File Analyzer assembles a toolkit of tasks a user can perform. The tasks that have been written into the File Analyzer code base have been optimized for use by libraries, archives, and other cultural heritage institutions.
  • FileVerifier++ - Windows utility for verifying file contents
  • Fixi - Fixi is a command-line utility that indexes, verifies, and updates checksum information for collections of files.
  • Fixity - Fixity monitoring for small-medium collections
  • Md5deep and hashdeep - md5deep is a set of programs to compute MD5, SHA-1, SHA-256, Tiger, or Whirlpool message digests on an arbitrary number of files. hashdeep is a program to compute, match, and audit hashsets.
  • Md5sum Unix command - md5sum computes a 128-bit checksum (or fingerprint or message-digest) for each specified file.
  • Md5summer - MD5summer is an application for Microsoft Windows 9x, NT, ME, 2000 and XP which generates and verifies md5 checksums.
  • NARA File Analyzer and Metadata Harvester - NARA File Analyzer and Metadata Harvester allows a user to analyze the contents of a file system or external drive and generates statistics about the contents of the contained directories.
  • Python checkm package - This is a Python implementation of the checkm specification.
  • Rhash - RHash (Recursive Hasher) is a console utility for computing and verifying hash sums of files.
  • SSDeep - Recursive piecewise hashing tool
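
The generation and validation of a checksum-based manifest can be sketched in a few lines. The example below is a minimal illustration, not the approach of any particular tool in this list; it assumes SHA-256 as the algorithm, keeps the manifest in memory as a dictionary rather than in a file, and uses hypothetical function names.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = sha256_of(full)
    return manifest


def verify_manifest(root, manifest):
    """Return {relative path: 'missing' or 'changed'} for files that fail the check."""
    problems = {}
    for rel, expected in manifest.items():
        full = os.path.join(root, rel)
        if not os.path.isfile(full):
            problems[rel] = "missing"
        elif sha256_of(full) != expected:
            problems[rel] = "changed"
    return problems
```

An empty result from verify_manifest means every file recorded in the manifest is still present with an unchanged digest; files added after the manifest was built are not reported.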

~Not Content Type Specific~: Forensic

Tools that support forensics related functions.

  • AFFLIB - The Advanced Forensics Format (AFF) and AFF Library (AFFLIB) are a joint development project of Simson L.
  • Autopsy Forensic Browser - graphical interface to the command line digital investigation tools in The Sleuth Kit
  • DataLifter - suite of tools "designed to assist with Computer Forensics, Information Auditing, Information Security and Data Recovery"
  • Dc3dd for computer forensics - dc3dd is a patched version of GNU dd with a number of features useful for computer forensics.
  • Dcfldd - dcfldd is an enhanced version of GNU dd with features useful for forensics and security.
  • Digital Intelligence Forensic Software - Digital Intelligence Forensic Software
  • EnCase Forensic (Guidance Software) - EnCase Forensic (Guidance Software)
  • FCCU GNU/Linux Forensic Boot CD - bootable CD with Linux and forensic tools
  • FTK (Forensic Toolkit) - Forensic Toolkit (AccessData)
  • Farmer's Boot CD (FBCD) - bootable CD with Linux and forensic tools
  • Foremost - Foremost is a console program to recover files based on their headers, footers, and internal data structures.
  • Forensic Acquisition Utilities - A collection of utilities and libraries intended for forensic or forensic-related investigative use in a modern Microsoft Windows environment.
  • Freeware Hex Editor XVI32 - XVI32 is a freeware hex editor running under Windows 95, Windows 98, Windows NT, Windows 2000, and Windows XP.
  • HashKeeper - Digital Evidence Laboratory specialists created the HashKeeper software in 1998 to expedite the analysis of electronic media by reducing the number of files to be analyzed during the course of an investigation.
  • Helix (e-fense) - bootable CD with Linux and forensic tools
  • I2 - i2 is a provider of intelligence and investigation management software for law enforcement, defense, national security and private sector organizations.
  • ILookPI - ILookPI provides a fully programmable IDE environment with customizable tool capabilities.
  • Index.dat Analyzer v2.5 - Index.dat Analyzer is a tool to view, examine and delete contents of index.dat files.
  • InfinaDyne - InfinaDyne's forensic products are focused on government and law enforcement examining various types of media and intent on collecting evidence in a thorough, secure and trustworthy manner.
  • KEA (Keyphrase Extraction Algorithm) - KEA is an algorithm for extracting keyphrases from text documents.
  • Libewf - Libewf is a library for support of the Expert Witness Compression Format (EWF). It supports both the SMART (EWF-S01) and EnCase (EWF-E01) formats.
  • MRU-Blaster - MRU-Blaster is a program made to do one large task - detect and clean MRU (most recently used) lists on your computer.
  • McAfee Free Tools - Free Tools [See specifically Forensic Tools]
  • Microsoft Office 2003 Add-in: Word Redaction v1.2 - Use the Word 2003 Redaction Add-in to hide text within Microsoft Office Word 2003 documents.
  • Microsoft Office 2003/XP Add-in: Remove Hidden Data - With this add-in you can permanently remove hidden data and collaboration data, such as change tracking and comments, from Microsoft Word, Microsoft Excel, and Microsoft PowerPoint files.
  • NSRL (National Software Reference Library) - The NSRL provides a large data set of metadata on computer files which can be used to identify the files and their provenance
  • OCFA (Open Computer Forensics Architecture) - Open Computer Forensics Architecture is a modular computer forensics framework.
  • Paraben - Paraben provides forensics tools.
  • PyFlag - FLAG (Forensic and Log Analysis GUI) is an advanced forensic tool for the analysis of large volumes of log files and forensic investigations.
  • RAID (Real-time Analytical Intelligence Database) - RAID is a relational database used to record key pieces of information and to quickly identify links among people, places, businesses, financial accounts, telephone numbers, and other investigative information.
  • RapidRedact - The RapidRedact product range provides fast, easy to use redaction tools for irreversibly blanking out (redacting) selected information, author's changes and hidden data from all electronic document types.
  • Redact-It - Provides Windows desktop and server redaction of PDF, Word, scanned TIFF images. Find, black out and remove content within documents, images or drawings.
  • Redax - Redax completely redacts (removes) text and graphics from the PDF page.
  • Regshot - Regshot is an open-source (GPL) registry compare utility that allows you to quickly take a snapshot of your registry and then compare it with a second one - done after doing system changes or installing a new software product.
  • Technology Pathways - Technology Pathways, LLC is a leading edge provider of computer security tools and services for the Corporate IT, government and legal communities.
  • The PERPOS Tools: User's Guide - The Archival Repository Tool (ART) is a prototype software tool designed to support archivists in accessing and describing file systems containing electronic records.
  • WinHex - WinHex is in its core a universal hexadecimal editor, particularly helpful in the realm of computer forensics, data recovery, low-level data processing, and IT security.
  • Windows IR/CF Tools - This page links to Windows IR/CF Tools.
  • Yara - Pattern matching tool

~Not Content Type Specific~: Managing Active Research Data

Tools that enable researchers to manage data from its point of creation, facilitating its productive use in the present, but also establishing the support structures necessary to ensure its future survival.

  • WebCite - WebCite is an on-demand web archiving service that takes snapshots of Internet-accessible digital objects at the behest of users, storing the data on their own servers and assigning unique identifiers to those instances of the material.

~Not Content Type Specific~: Metadata Extraction

Tools that support the extraction of metadata from files.

  • Apache Tika - Java based tool for identifying file formats using signatures and extracting metadata and text content from documents.
  • Brunnhilde - Siegfried-based characterization of directories and disk images
  • C3PO - C3PO is a content profiling tool for visualization and preservation analysis
  • DROID (Digital Record Object Identification) - DROID (Digital Record Object Identification) is a software tool developed to perform automated batch identification of file formats.
  • DROID Siegfried Sqlite Analysis Engine - Analysis and automatic generation of summary information from DROID output
  • DUMPBIN Utility - The DUMPBIN utility, which is provided with the 32-bit version of Microsoft Visual C++, combines the abilities of the LINK, LIB, and EXEHDR utilities.
  • FIDO (Format Identification for Digital Objects) - A PRONOM based, command line, file format identification tool written in Python
  • FIDOO - A PRONOM based, online file format identification tool written in JavaScript and HTML5
  • FITS (File Information Tool Set) - FITS allows data curators to identify, validate, and extract technical metadata for the objects in their digital repository.
  • File Analyzer and Metadata Harvester V2 - The File Analyzer is a general purpose desktop (and command line) tool designed to automate simple, file-based operations. The File Analyzer assembles a toolkit of tasks a user can perform. The tasks that have been written into the File Analyzer code base have been optimized for use by libraries, archives, and other cultural heritage institutions.
  • FileAlyzer - FileAlyzer allows a basic analysis of files (showing file properties and file contents in hex dump form) and is able to interpret common file contents like resources structures (like text, graphics, HTML, media and PE).
  • GNU libextractor - GNU libextractor is a library used to extract meta data from files of arbitrary type.
  • Index.dat Analyzer v2.5 - Index.dat Analyzer is a tool to view, examine and delete contents of index.dat files.
  • JHOVE (Harvard Object Validation Environment) - JHOVE provides functions to perform format-specific identification, validation, and characterization of digital objects.
  • JHOVE2 - JHOVE2 allows data curators to characterise the digital objects in their repositories.
  • Keith Humphreys' PhraseRate - PhraseRate is a program, developed by Keith Humphreys, for extracting a set of meaningful, attractive keywords and key phrases from a web page describing the content of that page.
  • MP3::Tag - MP3::Tag is a module for reading tags of MP3 audio files.
  • Metadata Extraction Tool - Metadata Extraction Tool automatically extracts a limited set of metadata from the headers of digital files.
  • NARA File Analyzer and Metadata Harvester - NARA File Analyzer and Metadata Harvester allows a user to analyze the contents of a file system or external drive and generates statistics about the contents of the contained directories.
  • Nanite - A friendly swarm of format-identifying robots
  • PERICLES Extraction Tool (PET) - A tool to capture contextual information in a sheer curation scenario
  • Pagelyzer - Suite of tools for detecting changes in web pages and their rendering
  • WordHoard - WordHoard is an application for the close reading and scholarly analysis of deeply tagged texts.

~Not Content Type Specific~: Metadata Processing

Tools that support the processing or management of metadata.

  • ArchivesSpace - ArchivesSpace is the next-generation web-based archives information management system, designed by archivists and supported by diverse archival repositories.
  • Archivists' Toolkit - The Archivists' Toolkit™, or the AT, is the first open source archival data management system to provide broad, integrated support for the management of archives.
  • Archon - Archon automatically publishes archival descriptive information and digital archival objects in a user-friendly website.
  • Collectus -- A Digital Object Collector Tool - The UVa Library's Collectus digital object collector tool allows users to collect image or text objects from a repository.
  • ContextMiner - ContextMiner is a framework to collect, analyze, and present the contextual information along with the data.
  • Curator's Workbench - Curator's Workbench is a tool that automates and streamlines the process of preparing collections of digital materials for submission to a repository
  • Duke Data Accessioner - Data Accessioner provides a graphical user interface to aid in migrating data from physical media to a dedicated file server, documenting the process and using MD5 checksums to identify any errors introduced in transfer.
  • File Analyzer and Metadata Harvester V2 - The File Analyzer is a general purpose desktop (and command line) tool designed to automate simple, file-based operations. The File Analyzer assembles a toolkit of tasks a user can perform. The tasks that have been written into the File Analyzer code base have been optimized for use by libraries, archives, and other cultural heritage institutions.
  • ICA-AtoM - ICA-AtoM allows organisations to create standards-based descriptions of their archival holdings and subsequently publish them to the Web.
  • Karen's Directory Printer - Karen's Directory Printer can print the name of every file on a drive, along with the file's size, date and time of last modification, and attributes (Read-Only, Hidden, System and Archive).
  • OpenWMS: Workflow Management System for Digital Objects - The OpenWMS is a platform-independent, open source, web-accessible system that can be used as a standalone application or integrated with other repository architectures by a wide range of organizations.
  • PAIRTREE Library - A software library that supports mapping between identifiers and file paths according to the Pairtree Curation Microservices Specification.
  • PREMIS in METS (PiM) Toolbox - PREMIS in METS Toolbox was developed to support the implementation of PREMIS in the METS container format.
  • Rosetta - Ex Libris Rosetta enables institutions to preserve and provide access to the collections in their care.
  • SobekCM - SobekCM is a digital repository and digital scholarship/publishing system which enables easy deposit, preservation, and access for all types of digital content, tailored to the needs of galleries, libraries, archives, museums, scholars, and researchers.
  • Tree - Tree displays the directory structure of a path or of the disk in a drive graphically.
  • Voyeur - Voyeur is a web-based text analysis environment that can use texts in a variety of formats, from different locations to perform lexical analysis, export data to other tools, and embed live tools into remote websites.
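
The identifier-to-filepath mapping mentioned in the PAIRTREE Library entry can be illustrated with a minimal Python sketch. It follows the two-step character cleaning and two-character splitting as described in the Pairtree specification; the function names clean_identifier and identifier_to_path are hypothetical and are not the PAIRTREE Library's own API. Only the directory path is produced; placing the object itself beneath that path is left out.

```python
# Characters that are hex-encoded as ^hh before the single-character
# substitutions are applied (per the Pairtree cleaning rules).
_HEX_ENCODE = set('"*+,<=>?\\^|')

# Single-character substitutions applied after hex-encoding.
_SUBSTITUTE = str.maketrans({"/": "=", ":": "+", ".": ","})


def clean_identifier(identifier: str) -> str:
    """Return the filesystem-safe form of an identifier."""
    out = []
    for byte in identifier.encode("utf-8"):
        ch = chr(byte)
        # Non-visible or non-ASCII bytes and the reserved characters
        # are written as a caret followed by two hex digits.
        if byte < 0x21 or byte > 0x7E or ch in _HEX_ENCODE:
            out.append(f"^{byte:02x}")
        else:
            out.append(ch)
    return "".join(out).translate(_SUBSTITUTE)


def identifier_to_path(identifier: str) -> str:
    """Split the cleaned identifier into two-character directory names."""
    cleaned = clean_identifier(identifier)
    parts = [cleaned[i:i + 2] for i in range(0, len(cleaned), 2)]
    return "/".join(parts) + "/"


print(identifier_to_path("ark:/13030/xt12t3"))
# prints ar/k+/=1/30/30/=x/t1/2t/3/
```

Short directory names keep any single directory from accumulating a very large number of entries, and the mapping can be reversed by joining the names and undoing the substitutions.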

~Not Content Type Specific~: Multi Format Rendering

Tools that support the rendering of a cross section of file format or content categories.

  • Quick View Plus - View virtually all the files and e-mail attachments you need, instantly without purchasing numerous software programs.

~Not Content Type Specific~: OCR

Tools that support the generation of text from bitmap images, otherwise known as Optical Character Recognition (OCR).

  • Goobi - Workflow Management Tool
  • Tesseract-ocr - Open source OCR engine, accepting uncompressed TIFF files as input

~Not Content Type Specific~: Organisational Audit

Tools that enable an audit of an organisation's capability with respect to preservation, typically relating to a maturity model.

  • DRAMBORA - DRAMBORA offers a quantifiable insight into the severity of risks faced by repositories right now, and an effective means for reporting these.
  • Embedding Repositories Self-Assessment Tool - Embedding Repositories Self-Assessment Tool is comprised of a series of questions designed to quantify the degree that a digital repository is ‘embedded’ within its institution – the extent to which both the organisation's research and its administrative culture recognise the repository’s value and take full advantage of its capacity.
  • NDSA Levels of Preservation - The "Levels of Digital Preservation" are a tiered set of recommendations for how organizations should begin to build or enhance their digital preservation activities.
  • RMCAS - RMCAS is an assessment tool for organisations wishing to map their current records management infrastructure against community best-practice.

~Not Content Type Specific~: Persistent Identification

Tools that support the unique and persistent identification of files or intellectual entities.

  • EZID - EZID (easy-eye-dee) makes it easy to create and manage unique, persistent identifiers.
  • WebCite - WebCite is an on-demand web archiving service that takes snapshots of Internet-accessible digital objects at the behest of users, storing the data on their own servers and assigning unique identifiers to those instances of the material.

~Not Content Type Specific~: Personal Archiving

Tools that support the preservation and archiving of data relating to individuals.

  • WARCreate - Google Chrome browser extension for creating WARC files from web pages

~Not Content Type Specific~: Planning

Tools that support the planning of preservation activities.

  • AIDA - Assessing Institutional Digital Assets: Self-assessment tool for describing institutional readiness and capabilities for digital asset management and digital preservation
  • DRAMBORA - DRAMBORA offers a quantifiable insight into the severity of risks faced by repositories right now, and an effective means for reporting these.
  • Digital Preservation Capability Maturity Model (DPCMM) - Maturity / gap analysis model for digital preservation
  • Digital Preservation Management Tools and Techniques - A toolset for developing standards-compliant digital preservation management documentation on an array of topics
  • Embedding Repositories Self-Assessment Tool - Embedding Repositories Self-Assessment Tool is comprised of a series of questions designed to quantify the degree that a digital repository is ‘embedded’ within its institution – the extent to which both the organisation's research and its administrative culture recognise the repository’s value and take full advantage of its capacity.
  • Goobi - Workflow Management Tool
  • HoliRisk - HoliRisk is a framework and online tool to support the development of a risk assessment based on principles from ISO31000.
  • NDSA Levels of Preservation - The "Levels of Digital Preservation" are a tiered set of recommendations for how organizations should begin to build or enhance their digital preservation activities.
  • PLATO - Plato is a preservation-planning tool for organisations charged with safeguarding digital materials.
  • RMCAS - RMCAS is an assessment tool for organisations wishing to map their current records management infrastructure against community best-practice.
  • SCOUT - A brief description
  • Tufts Submission-Agreement Builder Tool - SABT is a web-based tool that guides records creators and records managers through the process of creating submission agreements, both for single transfers and for standing submissions.

~Not Content Type Specific~: Policy

Tools that support the development and management of digital preservation policy.

  • Catalogue of Policy Elements - Supports creation of new preservation policies as well as planning and watch activities.
  • Digital Preservation Management Tools and Techniques - A toolset for developing standards-compliant digital preservation management documentation on an array of topics
  • HoliRisk - HoliRisk is a framework and online tool to support the development of a risk assessment based on principles from ISO31000.
  • OpenDOAR - OpenDOAR is a simple, web-based tool that guides repository administrators through the process of creating basic policies for the submission, re-use, and preservation of digital materials.

~Not Content Type Specific~: Preservation System

Tools that support the management and preservation of digital resources, typically performing a number of functions across the digital lifecycle such as ingest, storage, preservation action and access.

  • ADIGRES - ADIGRES is a powerful cross-platform Document Management System written in Java.
  • Archivematica - Archivematica is a digital preservation system that automates the process of preparing digital objects for ingest into a repository and an access system
  • CONTENTdm - CONTENTdm is a digital collection management system
  • CollectiveAccess - CollectiveAccess is web-based software to catalogue, manage, and publish museum and archival collections.
  • Curator's Workbench - Curator's Workbench is a tool that automates and streamlines the process of preparing collections of digital materials for submission to a repository
  • DAITSS - A digital preservation software application designed as a dark archive to service consortial and institutional preservation repositories in a multi-user environment. DAITSS is considered to be a first-party system.
  • DCape (ingest only) - "The goal of the DCAPE project is to build a distributed production preservation environment that meets the needs of archival repositories for trusted archival preservation services." (Note: This is a work in progress, see notes for more information)
  • DPSP (Digital Preservation Software Platform) - The DPSP is a collection of four software applications which support the goal of digital preservation.
  • DSpace - DSpace is an institutional repository system which enables easy deposit, preservation, and access for all types of digital content.
  • Digital Preservation Recorder - Digital Preservation Recorder (DPR) is free and open source software developed by the National Archives of Australia to aid in the long term preservation of digital records.
  • Duke Data Accessioner - Data Accessioner provides a graphical user interface to aid in migrating data from physical media to a dedicated file server, documenting the process and using MD5 checksums to identify any errors introduced in transfer.
  • EPrints - EPrints is an open access digital repository software, which is intended to create a highly configurable web-based repository.
  • Fedora Commons - Fedora provides the back-end foundation for digital repository systems responsible for managing and preserving all types of digital content.
  • HP Integrated Archive Platform - The HP Integrated Archive Platform (HP IAP) provides a solution for the long-term archival and disposition of information.
  • Hoppla - Hoppla is an archiving solution that combines back-up and fully automated migration services for data collections in small office environments.
  • IRODS (integrated Rule Oriented Data Systems) - iRODS software was designed to allow curators utilising heterogeneous storage and computing facilities to define policies without being concerned with the technical detail of how the system implements those policies and without having to respond to changes in technical infrastructure.
  • Invenio - Invenio is a free software suite enabling you to run your own digital library or document repository on the web.
  • KoLibRI (Kopal Library for Retrieval and Ingest) - The kopal Library for Retrieval and Ingest (koLibRI) represents a library of Java tools that have been developed for the interaction with the DIAS system of IBM within the kopal project.
  • LOCKSS (Lots of Copies Keep Stuff Safe) - LOCKSS software allows libraries to create preserved digital collections out of materials that would otherwise be accessible only through a licensed academic subscription.
  • Merritt Repository Service - Merritt is a new cost-effective repository service from the University of California Curation Center (UC3) that lets the UC community manage, archive, and share its valuable digital content.
  • OpenWMS: Workflow Management System for Digital Objects - The OpenWMS is a platform-independent, open source, web-accessible system that can be used as a standalone application or integrated with other repository architectures by a wide range of organizations.
  • Preservica - Preservica is a complete OAIS Digital Preservation system available on the cloud (hosted in US, EU or AUS) and on premise (Standard and Enterprise versions). It is trusted by over 50 organisations across 4 continents to preserve collections both large (>6 PB) and small (a few hundred KB).
  • Roda - RODA - Repository of Authentic Digital Objects
  • Rosetta - Ex Libris Rosetta enables institutions to preserve and provide access to the collections in their care.
  • SobekCM - SobekCM is a digital repository and digital scholarship/publishing system which enables easy deposit, preservation, and access for all types of digital content, tailored to the needs of galleries, libraries, archives, museums, scholars, and researchers.
  • The Hydra Project - Hydra is a multi-institutional, multi-functional, multi-purpose, technical and community framework.
  • The Open Video Digital Library Toolkit - The Open Video Digital Library Toolkit project is intended to provide museums, libraries and other institutions holding moving image collections tools to more easily create Web-based digital video libraries.
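
The Duke Data Accessioner entry above mentions using MD5 checksums to identify errors introduced in transfer. A minimal Python sketch of that general idea follows; it is not the Data Accessioner's implementation, and the names md5_of and find_transfer_errors are hypothetical. It assumes the source and target are two directory trees that should be identical after the transfer.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_transfer_errors(source_dir: Path, target_dir: Path) -> list[str]:
    """Return relative paths that are missing or differ in the target."""
    problems = []
    for source in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        relative = source.relative_to(source_dir)
        target = target_dir / relative
        # A file is flagged when it did not arrive or its digest changed.
        if not target.is_file() or md5_of(source) != md5_of(target):
            problems.append(relative.as_posix())
    return problems
```

MD5 is adequate for spotting accidental corruption; where deliberate tampering is a concern, a stronger digest such as SHA-256 (hashlib.sha256) can be swapped in without changing the structure.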

~Not Content Type Specific~: Quality Assurance

Tools that support quality checking of digital resources, identifying damaged, incomplete or low quality data. Typically used to identify damage introduced via processes such as format migration or digitisation.

  • File Analyzer and Metadata Harvester V2 - The File Analyzer is a general purpose desktop (and command line) tool designed to automate simple, file-based operations. The File Analyzer assembles a toolkit of tasks a user can perform. The tasks that have been written into the File Analyzer code base have been optimized for use by libraries, archives, and other cultural heritage institutions.
  • GNU Diffutils - GNU Diffutils is a package of several programs related to finding differences between files.
  • Goobi - Workflow Management Tool
  • Pagelyzer - Suite of tools for detecting changes in web pages and their rendering
  • ReACT (Resource Audit and Comparison Tool) - A file audit and comparison tool using Microsoft Excel and VBA.
  • SobekCM - SobekCM is a digital repository and digital scholarship/publishing system which enables easy deposit, preservation, and access for all types of digital content, tailored to the needs of galleries, libraries, archives, museums, scholars, and researchers.

~Not Content Type Specific~: Redaction

Tools that support the removal of selected information from digital files. Typically used for removal of sensitive information like telephone or credit card numbers from personal archives before providing access to users.

  • MRU-Blaster - MRU-Blaster is a program made to do one large task - detect and clean MRU (most recently used) lists on your computer.
  • Microsoft Office 2003 Add-in: Word Redaction v1.2 - Use the Word 2003 Redaction Add-in to hide text within Microsoft Office Word 2003 documents.
  • Microsoft Office 2003/XP Add-in: Remove Hidden Data - With this add-in you can permanently remove hidden data and collaboration data, such as change tracking and comments, from Microsoft Word, Microsoft Excel, and Microsoft PowerPoint files.
  • RapidRedact - The RapidRedact product range provides fast, easy to use redaction tools for irreversibly blanking out (redacting) selected information, author's changes and hidden data from all electronic document types.
  • Redact-It - Provides Windows desktop and server redaction of PDF, Word, scanned TIFF images. Find, black out and remove content within documents, images or drawings.
  • Redax - Redax completely redacts (removes) text and graphics from the PDF page.

~Not Content Type Specific~: Rendering

Tools that support the rendering of digital resources so they can be viewed, printed, or otherwise accessed by users.

  • Multivalent - Multivalent works on digital documents research and development.
  • Open Office - OpenOffice.org 3 is the leading open-source office software suite for word processing, spreadsheets, presentations, graphics, databases and more.
  • Quick View Plus - View virtually all the files and e-mail attachments you need, instantly without purchasing numerous software programs.

~Not Content Type Specific~: Secure Deletion

Tools that support deletion of data in a way that cannot be reversed, typically to avoid third parties stealing sensitive information from decommissioned or recycled hardware.

  • BCWipe - BCWipe data wiping software enables you to permanently delete selected files so that they can never be recovered or undeleted.
  • CCleaner - CCleaner is a tool for cleaning Windows PCs.
  • Darik's Boot And Nuke - Darik's Boot and Nuke ("DBAN") is a self-contained boot disk that securely wipes the hard disks of most computers.
  • Disk Utility - In Disk Utility in Mac OS X 10.
  • Eraser - Eraser is an advanced security tool for Windows which allows you to completely remove sensitive data from your hard drive by overwriting it several times with carefully selected patterns.
  • Ontrack Eraser Software - Ontrack Eraser software is an easy-to-use, highly flexible data erasure tool that erases all traces of data stored on a targeted media - ensuring that sensitive information does not fall into the wrong hands.
  • PDWIPE (Physical Drive WIPE) - PDWIPE (Physical Drive WIPE) is a standalone DOS utility to wipe (zero) an entire physical hard drive.
  • SDelete v1.51 - SDelete is a command line utility that takes a number of options.
  • Secure Deletion - Secure deletion involves the use of special software to ensure that when you delete a file, there really is no way to get it back again.
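
Several entries above describe overwriting data before deleting it. The following Python sketch illustrates that basic technique for a single file; it is not how any of the listed tools is implemented, and the function name overwrite_and_delete and the default of three passes are assumptions made for illustration. In-place overwriting gives no guarantee on SSDs with wear levelling, on journaling or copy-on-write file systems, or where backups and snapshots exist.

```python
import os
import secrets


def overwrite_and_delete(path: str, passes: int = 3) -> None:
    """Overwrite a file in place with random bytes, then remove it."""
    size = os.path.getsize(path)
    with open(path, "r+b") as handle:
        for _ in range(passes):
            handle.seek(0)
            remaining = size
            while remaining > 0:
                # Write in blocks of at most 1 MiB to bound memory use.
                block = min(remaining, 1 << 20)
                handle.write(secrets.token_bytes(block))
                remaining -= block
            # Ask the operating system to push each pass to the device.
            handle.flush()
            os.fsync(handle.fileno())
    os.remove(path)
```

For whole-disk decommissioning, a boot-disk tool of the kind listed above is the more appropriate choice, because file-level overwriting cannot reach slack space or remapped sectors.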

~Not Content Type Specific~: Service

  • Amazon Cloud - Amazon Cloud is an internet-based storage location designed to hold files indefinitely.
  • Carbonite - Carbonite is an online backup service that automatically backs up documents, e-mails, music, photos, and settings. Info gathered early March 2013.
  • Chronopolis - "Chronopolis digital preservation network provides services for the long-term preservation and curation of America's digital holdings"
  • Dropbox - Dropbox is a free service that lets you bring all your photos, docs, and videos anywhere. This means that any file you save to your Dropbox will automatically save to all your computers, phones and even the Dropbox website. Dropbox also makes it super easy to share with others, whether you're a student or professional, parent or grandparent. Even if you accidentally spill a latte on your laptop, have no fear! You can relax knowing that Dropbox always has you covered, and none of your stuff will ever be lost.
  • DuraCloud - DuraCloud is a hosted service that provides a centralised interface for organizations interested in using cloud storage as a part of their digital archiving and preservation programs.
  • Glacier (Amazon) - Amazon Glacier is a secure, durable, and extremely low-cost cloud storage service for data archiving and long-term backup.
  • Google Cloud - Google Cloud Storage allows users to store, access, and manage their data.
  • Preservica - Preservica is a complete OAIS Digital Preservation system available on the cloud (hosted in US, EU or AUS) and on premise (Standard and Enterprise versions). It is trusted by over 50 organisations across 4 continents to preserve collections both large (>6 PB) and small (a few hundred KB).
  • RackSpace - RackSpace provides cloud-based services to businesses of all sizes throughout the world.

~Not Content Type Specific~: Storage

Tools that support the storage of digital resources, possibly in multiple locations to avoid loss of data due to hardware or other failures.

  • Amazon Cloud - Amazon Cloud is an internet-based storage location designed to hold files indefinitely.
  • CERN Advanced STORage manager (CASTOR) - CASTOR, which stands for the CERN Advanced STORage manager, is a hierarchical storage management (HSM) system developed at CERN used to store physics production files and user files.
  • Carbonite - Carbonite is an online backup service that automatically backs up documents, e-mails, music, photos, and settings. Info gathered early March 2013.
  • Chronopolis - "Chronopolis digital preservation network provides services for the long-term preservation and curation of America's digital holdings"
  • DCape (ingest only) - "The goal of the DCAPE project is to build a distributed production preservation environment that meets the needs of archival repositories for trusted archival preservation services." (Note: This is a work in progress, see notes for more information)
  • Dropbox - Dropbox is a free service that lets you bring all your photos, docs, and videos anywhere. This means that any file you save to your Dropbox will automatically save to all your computers, phones and even the Dropbox website. Dropbox also makes it super easy to share with others, whether you're a student or professional, parent or grandparent. Even if you accidentally spill a latte on your laptop, have no fear! You can relax knowing that Dropbox always has you covered, and none of your stuff will ever be lost.
  • Glacier (Amazon) - Amazon Glacier is a secure, durable, and extremely low-cost cloud storage service for data archiving and long-term backup.
  • Google Cloud - Google Cloud Storage allows users to store, access, and manage their data.
  • Hoppla - Hoppla is an archiving solution that combines back-up and fully automated migration services for data collections in small office environments.
  • IRODS (integrated Rule Oriented Data Systems) - iRODS software was designed to allow curators utilising heterogeneous storage and computing facilities to define policies without being concerned with the technical detail of how the system implements those policies and without having to respond to changes in technical infrastructure.
  • LOCKSS (Lots of Copies Keep Stuff Safe) - LOCKSS software allows libraries to create preserved digital collections out of materials that would otherwise be accessible only through a licensed academic subscription.
  • Legacy Locker - Legacy Locker is a safe, secure repository for your vital digital property that lets you grant access to online assets for friends and loved ones in the event of loss, death, or disability.
  • RackSpace - RackSpace provides cloud-based services to businesses of all sizes throughout the world.
  • The DICE Storage Resource Broker (SRB) - The DICE Storage Resource Broker (SRB) supports shared collections that can be distributed across multiple organizations and heterogeneous storage systems.
  • The aDORe Federation - The aDORe Federation is a federated repository framework and reference implementation which aims to address many of the scalability issues experienced by large scale digital object repositories.

~Not Content Type Specific~: Validation

Tools that support the validation of digital files, typically against a file format specification.

  • FITS (File Information Tool Set) - FITS allows data curators to identify, validate, and extract technical metadata for the objects in their digital repository.
  • File Analyzer and Metadata Harvester V2 - The File Analyzer is a general purpose desktop (and command line) tool designed to automate simple, file-based operations. The File Analyzer assembles a toolkit of tasks a user can perform. The tasks that have been written into the File Analyzer code base have been optimized for use by libraries, archives, and other cultural heritage institutions.
  • JHOVE (Harvard Object Validation Environment) - JHOVE provides functions to perform format-specific identification, validation, and characterization of digital objects.
  • JHOVE2 - JHOVE2 allows data curators to characterise the digital objects in their repositories.
  • PREMIS in METS (PiM) Toolbox - PREMIS in METS Toolbox was developed to support the implementation of PREMIS in the METS container format.
  • W3C Markup Validation Service - This is the World Wide Web Consortium's validation tool.

~Not Content Type Specific~: Version Control

Tools that support the tracking of changes to digital files over time.

~Not Content Type Specific~: Web Crawl

Tools that support the capture of data from the world wide web, typically by "crawling" links between resources.

  • ContextMiner - ContextMiner is a framework to collect, analyze, and present the contextual information along with the data.
  • Curate.Us - With a simple click of the mouse, you can create visually compelling clips and quotes of web content that are easily embedded in blog posts, email, forums, and websites.
  • Find It! Keep It! - Find It! Keep It! is a tool to save and organise web content.
  • Heritrix plug-in for rich media capture - The Rich Media Capture module (RMC), developed in the LiWA (Living Web Archives) project, is designed to enhance the capturing capabilities of the crawler with regard to different multimedia content types.
  • Metaproducts - Metaproducts offers several commercial capture and off-line browsing tools.
  • PageVault - pageVault supports the archiving of all unique responses generated by a web server.
  • Pagelyzer - Suite of tools for detecting changes in web pages and their rendering
  • RARC (ARC replicator) - rARC is a distributed system that enables Internet users to provide storage space from their computers to replicate small parts of the archived data stored in the central repository of the Web archive.
  • Spadix software - Spadix Software can download websites from a starting URL, search engine results or web directories, and is able to follow external links.
  • Tennyson Maxwell Information Systems - Tennyson Maxwell Information Systems offers a variety of features to support multithreaded retrieval, password-protected access, filtering, batch capture, and management of derived databases.
  • The DeDuplicator (Heritrix add-on module) - The DeDuplicator is an add-on module for Heritrix to reduce the amount of duplicate data collected in a series of snapshot crawls.
  • The Nalanda iVia Focused Crawler - The Nalanda iVia Focused Crawler (NIFC) is a focused Web crawler.
  • WARCreate - Google Chrome browser extension for creating WARC files from web pages
  • WAXToolbar - WAXToolbar is a Firefox extension to help users with common tasks encountered surfing a web archive.
  • WERA (Web ARchive Access) - WERA (Web ARchive Access) is a freely available solution for searching and navigating archived web document collections.
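
The "crawling" of links between resources described at the top of this category can be sketched in Python as a breadth-first traversal. This is a generic illustration using only the standard library, not the approach of any tool listed here. The names LinkExtractor and crawl are hypothetical, and fetch is assumed to be a caller-supplied function that returns a page's HTML text, or None when the page cannot be retrieved. Politeness rules such as robots.txt handling and rate limiting are deliberately omitted.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect absolute URLs from the href attributes of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page's own URL.
                    self.links.append(urljoin(self.base_url, value))


def crawl(start_url, fetch, max_pages=100):
    """Visit pages breadth-first and return the URLs that were fetched."""
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        parser = LinkExtractor(url)
        parser.feed(html)
        for link in parser.links:
            # The seen set prevents queueing the same URL twice.
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

Passing fetch in as a parameter keeps the traversal logic separate from retrieval, so the same loop can be exercised against an in-memory dictionary of pages (for example pages.get) before it is pointed at live HTTP requests.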

~Not Content Type Specific~: Web Snapshot

Tools that support the capture of a static snapshot of a web page.

  • WARCreate - Google Chrome browser extension for creating WARC files from web pages

~Not Content Type Specific~: Workflow

Tools that support the orchestration and management of specific tools or processes in a workflow.

  • File Analyzer and Metadata Harvester V2 - The File Analyzer is a general purpose desktop (and command line) tool designed to automate simple, file-based operations. The File Analyzer assembles a toolkit of tasks a user can perform. The tasks that have been written into the File Analyzer code base have been optimized for use by libraries, archives, and other cultural heritage institutions.
  • Goobi - Workflow Management Tool