The E-ARK project: support for digital archiving. Björn Skog @ ES Solutions
European Archival Records and Knowledge Preservation. E-ARK was a three-year (1 February 2014 to 31 January 2017) multinational research project on e-archiving. The goal was to produce pan-European standards and methods for e-archiving, together with open-source tools available to everyone. www.essolutions.se
THE E-ARK PROJECT IS CO-FUNDED BY THE EUROPEAN COMMISSION UNDER THE ICT-PSP PROGRAMME www.eark-project.eu
E-archive concepts
Prepare IPs, create SIPs and submit SIPs to archival institution
Receive SIPs and SIP2AIP conversion for the preservation platform
Manage generation of IPs and support different archive policies
Archive objects according to archive policy and storage methods
Manage and administrate archive for long-term preservation
Manage and administrate regulations for the archive
Search for and query/order archived information
Provide DIPs based on AIP2DIP conversion rules
Provide access capabilities
OAIS
Processes and archival workflows
E-archive concept: deliveries to the e-archive, information supply, search/retrieve in the e-archive
System for long-term information supply (e-archive)
Deliveries: 1 continuous deliveries, 2 periodic deliveries, 3 individual deliveries
Pre-Ingest (4): package SIP, transport
Ingest (5): receive SIP, validate SIP, package AIP, deliver AIP
Search/retrieve: 6 continuous search/retrieve, 7 periodic search/retrieve, 8 individual search/retrieve
Pre-Access (9): search/retrieve, transport
Access (10): receive query, validate query, package response/DIP, deliver response/DIP
Other numbered elements in the diagram (11 to 16): archival description principles, storage, metadata, receipts (K), services
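The four Ingest steps named in the diagram (receive SIP, validate SIP, package AIP, deliver AIP) can be sketched as a small pipeline. This is only an illustration under stated assumptions: the `Package` class, all function names, the validation rule and the receipt format are hypothetical and are not taken from any E-ARK tool.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    """Hypothetical in-memory stand-in for an information package."""
    name: str
    kind: str  # "SIP" or "AIP"
    files: dict = field(default_factory=dict)  # relative path -> bytes


def receive_sip(name, files):
    """Step 1: receive the SIP and register its files."""
    return Package(name=name, kind="SIP", files=dict(files))


def validate_sip(sip):
    """Step 2: validate the SIP (assumed rule: it is a non-empty SIP)."""
    if sip.kind != "SIP":
        raise ValueError("expected a SIP")
    if not sip.files:
        raise ValueError("SIP contains no files")
    return sip


def package_aip(sip):
    """Step 3: package an AIP from the validated SIP."""
    return Package(name=sip.name, kind="AIP", files=dict(sip.files))


def deliver_aip(aip, storage):
    """Step 4: deliver the AIP to storage and return a receipt."""
    storage[aip.name] = aip
    return {"package": aip.name, "status": "stored"}


def ingest(name, files, storage):
    """Run the four steps in order and return the receipt."""
    sip = validate_sip(receive_sip(name, files))
    return deliver_aip(package_aip(sip), storage)


storage = {}
receipt = ingest("example-ip", {"content/data/report.pdf": b"%PDF"}, storage)
print(receipt)
```

Keeping each step as its own function mirrors the diagram: a failed validation stops the flow before an AIP is packaged, and the receipt corresponds to the receipts shown in the workflow.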
Process levels
Overall process: OAIS, conceptual model, etc.
General Process (GM): general archival workflow (any), etc.
General Archival Level (GAL): pilots, user stories, requirements, etc.
Detailed Archival Level (DAL): pilots, use cases, requirements, etc.
Technical level (TAL): based on DAL, with many variations of use cases.
Process - Pre-Ingest
Process - Ingest
Process - Access
Common Specifications
Metadata (administrative/descriptive) and package: METS, Preservation Metadata, BagIt, Encoded Archival Description, E-signature, Encoded Archival Context, package description
Content information types: Personnel, ERMS, Databases, Economics, Web, GIS, Journals, Dataset, Publication
Information Package components
Package: Information Package (METS)
Archival Metadata: Preservation Metadata (PREMIS), Authority Information (EAC-CPF), Archival Description (EAD)
Content Type / Application Profiles / System Structure Type: Records Management System (MoReq), Database (SIARD), Content Type Specification (to develop)
Content / Objects / Record: Metadata (e.g. manual; ex. PDF/A), Display Structure (e.g. stylesheet; ex. XSLT/XSL-FO), Digital Data Object 1 to Digital Data Object N (ex. TIFF, XML, PDF/A, ASCII)
Information Package logical and physical model
Logical model: an Information Package holds content (data, content metadata, documentation) together with package-level metadata and documentation.
Physical model: a folder IP_name (UUID) with ip.xml (METS) at the top level and folders for content (with data), metadata and documentation.
Files shown in the example physical model: ip.xml, cts.xml and cts.xsd (CTS), eac.xml, eac.xsd, ead.xml, ead.xsd, cs_ip_mets.xsd, premis.xml, premis.xsd, xlink.xsd.
SIP, AIP and DIP should be as transparent as possible.
Every IP is its own (S/A/D) IP and is described at least by its own METS and PREMIS file.
A copy of the administrative and descriptive metadata schemas should be added to the IP, preferably in the metadata folder (content information type metadata schemas (CITS) are preferably added together with content/data).
We prefer separate, linked XML files such as METS and PREMIS rather than embedded ones, since this simplifies operations on the XML files, but we do not exclude embedding.
schemaLocation in XML files should always point to an external URL, even if the schema is stored in the IP. Handling external URLs is a purely technical matter, solved for example with a fake DNS server.
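The rule that schemaLocation should always point to an external URL can be checked mechanically. A minimal sketch in Python, assuming the XML file is available as a string; the function name `non_external_schema_locations` and the sample METS root are hypothetical and not part of any E-ARK tool.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace of the xsi:schemaLocation attribute (XML Schema instance).
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"


def non_external_schema_locations(xml_text):
    """Return the schema locations that are not http(s) URLs.

    xsi:schemaLocation holds whitespace-separated pairs of
    namespace URI and schema location; only the locations are checked.
    """
    root = ET.fromstring(xml_text)
    offending = []
    for element in root.iter():
        value = element.get(f"{{{XSI_NS}}}schemaLocation")
        if not value:
            continue
        tokens = value.split()
        # Tokens alternate: namespace, location, namespace, location, ...
        for location in tokens[1::2]:
            if urlparse(location).scheme not in ("http", "https"):
                offending.append(location)
    return offending


# Hypothetical METS root, for illustration only.
SAMPLE = """<mets xmlns="http://www.loc.gov/METS/"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.loc.gov/METS/ http://www.loc.gov/standards/mets/mets.xsd
                          http://www.w3.org/1999/xlink metadata/xlink.xsd"/>"""

print(non_external_schema_locations(SAMPLE))
```

In the sample, the second pair points at a relative path inside the package, so it is reported; a location given as an external URL passes the check even when a copy of the schema is also stored in the IP.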
E-ARK benefits for Sweden
E-ARK project results, availability and benefits:
Project results are open and available to everyone
Project website - http://www.eark-project.com
Project deliverables (parts of) - http://www.dasboard.eu/
Specifications: IP (SIP, AIP, DIP) and CITS (SIARD, SMURF)
IP Common Specification is harmonized with FGS Paket
SIP Specification is an application of IP
AIP Specification is an application of IP
DIP Specification is an application of IP
CITS SIARD2 is a content information type specification for database deliveries
CITS SMURF (the Semantically Marked Up Record Format) is a collective name for content information type specifications for records, geodata and collections of unstructured files
Maturity model for, among other things, self-assessment
Maturity Assessment Tool: a web-based tool (see http://kc.dlmforum.eu)
Tools available as open source
Open source: the tools are under continuous development (see https://github.com/eark-project)
Thank you!