- Steve Marks, Nick Ruest, Graham Stewart & Amaz Taufique
Everyone has many of the same needs when looking at digital collections: digitization of collections, mixed types of content, preservation, etc.
Nick: Need a long-term solution:
- recognition of benefits of preservation by decision makers e.g. written into strategic planning
- process for selecting digital materials with long term value e.g. selection committee
- incentive for decision makers to preserve in the public interest e.g. available funds
- appropriate organization and governance of digital preservation activities
- mechanisms to secure an ongoing, efficient allocation of resources for digital preservation activities e.g. ongoing budget line
- timely actions to ensure access
Should this be done locally or consortially? Is it easier to do as a bigger team?
Graham: Professional roadblocks exist. Trying instead to look at technology that will help us build digital repositories, particularly because costs are going down slowly but needs are growing exponentially.
Open-source solutions with fast, active release cycles, and open licenses
- Trend 1: Server Virtualization – Very fast provisioning of machines.
- Trend 2: Configuration management / DevOps: allows automation of large IT systems, e.g. Puppet, Chef. Automate everything, wherever possible.
- Trend 3: Hardware commoditization and open source software. Intel-based servers are dominant in data centres, but there is nothing in particular to distinguish the products of different hardware vendors. The process is now all well defined. Hardware does not really matter anymore.
The open source hardware movement will encourage innovation, e.g. the Open Compute Project is providing hardware designs with open licenses.
Now moving to software that provides the features that hardware used to provide.
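The configuration management idea in Trend 2 (declare the desired state, let a tool converge each machine to it, and make repeated runs harmless) can be illustrated with a minimal sketch. This is plain Python, not Puppet or Chef syntax; the `converge` function and the package names are hypothetical and only model the idea.

```python
def converge(current: set, desired: set) -> list:
    """Bring `current` (packages on a node) in line with `desired`.

    Returns the list of actions taken. Running it again on an
    already-converged node returns an empty list (idempotence).
    """
    actions = []
    # Install whatever is declared but missing.
    for pkg in sorted(desired - current):
        current.add(pkg)
        actions.append(f"install {pkg}")
    # Remove whatever is present but not declared.
    for pkg in sorted(current - desired):
        current.discard(pkg)
        actions.append(f"remove {pkg}")
    return actions


# Hypothetical node state and desired state, for illustration only.
node = {"nginx", "telnet"}
wanted = {"nginx", "ntp"}
first_run = converge(node, wanted)
second_run = converge(node, wanted)
print(first_run)
print(second_run)
```

The second run taking no action is the property that makes it safe to apply the same configuration to many machines on a schedule.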
OCUL Storage Project
Amaz: What is the cost of storing large amounts of data (1.2 PB across OCUL)? It would cost ~$1 million per year. Thought instead: can we do this ourselves?
Using open source hardware and software, create storage nodes using commodity hardware.
- off the shelf consumer grade disks
- software replication
- fixity & self healing
- geographically distributed
- object storage: access data via HTTP, build applications using REST API, static content, all files have a URL
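The "fixity & self healing" item above can be sketched roughly as follows: keep a recorded checksum for each object, periodically re-hash every replica, and overwrite any copy that no longer matches from one that does. This is an illustration under stated assumptions, not the OCUL implementation or the behaviour of any particular storage software; the node names and the `check_and_heal` helper are hypothetical, and replicas are modelled as in-memory bytes rather than files on distributed nodes.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()


def check_and_heal(replicas: dict, expected_digest: str) -> list:
    """Verify every replica against the recorded digest.

    Replicas that fail the check are overwritten from an intact copy.
    Returns the names of the nodes that were repaired.
    """
    intact = [node for node, data in replicas.items()
              if sha256_hex(data) == expected_digest]
    if not intact:
        raise ValueError("no intact replica left to heal from")
    source = replicas[intact[0]]
    repaired = []
    for node, data in list(replicas.items()):
        if sha256_hex(data) != expected_digest:
            replicas[node] = source  # replace the damaged copy
            repaired.append(node)
    return repaired


# Hypothetical three-node layout with one silently corrupted copy.
original = b"thesis.pdf contents"
digest = sha256_hex(original)
copies = {"node-a": original, "node-b": b"thesis.pdf c0ntents", "node-c": original}
healed = check_and_heal(copies, digest)
print(healed)
```

Keeping the digest separate from the replicas is what lets the check distinguish a damaged copy from a good one; with geographically distributed nodes, the same comparison would run across sites.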
Can interact with it using a GUI, much like S/FTP. $420K for the 1.2 PB.
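To put the two figures from the talk side by side, the sketch below converts each to a cost per terabyte. It assumes decimal units (1 PB = 1000 TB). The notes do not say whether the $420K is a one-time or an annual cost, so the result is a per-TB comparison of the quoted numbers only, not a total cost of ownership.

```python
TB_PER_PB = 1000  # assumption: decimal petabytes


def cost_per_tb(total_cost: float, petabytes: float) -> float:
    """Cost in dollars per terabyte for a given total cost and capacity."""
    return total_cost / (petabytes * TB_PER_PB)


quoted = cost_per_tb(1_000_000, 1.2)   # ~$1 million per year figure
self_built = cost_per_tb(420_000, 1.2)  # $420K figure for the same 1.2 PB

print(f"quoted:     ${quoted:,.0f} per TB")
print(f"self-built: ${self_built:,.0f} per TB")
print(f"ratio:      {self_built / quoted:.0%}")
```

On these numbers the self-built option comes to roughly $350 per TB against roughly $833 per TB, a little over 40% of the quoted figure.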
Steve: Really important and cool.
Interesting because it is a solution to help with:
- Scholars Portal platforms
- Dataverse/other RDM initiatives
- new applications
- space in general
Lots of different functional requirements:
- autonomic digital preservation activities
- lots of different interfaces
- integration with HPC centres, because they have the expertise and are already working with researchers
Research frequently has to be stored somewhere and restricted to Canada. A lot of content will also be born digital rather than the paper preservation we’re used to.
- self-sufficiency – should not be ceding this responsibility to a corporation via RFP
We have to do this together. We need to solve problems in a coordinated way, but distribute what can/should be distributed. Doing this in public and in consultation with OCUL members, bringing back what happens to the community. Help each other achieve our goals. If this initiative is important and works across Ontario, we think it should go across Canada.