OR2015

Conference Agenda

Session Overview
P4A: Managing Research (and Open) Data
Wednesday, 10/Jun/2015: 10:30am - 12:30pm

Session Chair: Holly Mercer
Location: Regency A-D
350 seats


[24x7] Revisiting Self-Deposit of Scientific Data

Darren Hardy

Stanford University, United States of America

Sharing scientific data is increasingly valuable for reproducible science, further investigation, and innovation. To this end, repositories facilitate data sharing by making scholarly data available. We are at an impasse, however. Librarian-mediated approaches to self-deposit of scientific data are very resource-intensive, and the repository services provided to researchers are often limited. Self-deposit is a challenging use case because it encompasses data preparation, metadata description, upload, visualization, annotation, sharing, publication, access, rights, preservation, citation, and discovery services. This editorial suggests we revisit the value proposition we make for self-deposit and mitigate its resource-intensive workflows.

Hardy-24x7 Revisiting Self-Deposit of Scientific Data-106.pptx

[24x7] CERN Open Data and Data Analysis Preservation

Tibor Simko

CERN, Switzerland

We present the newly launched CERN Open Data Portal and related long-term data analysis preservation activities. Using the Invenio digital library platform and taking inspiration from OAIS preservation practices, the knowledge associated with successive data analysis steps is being captured for further reuse. The aim is to preserve not only information about research datasets, but also about the underlying user software and virtual machine platforms used to study them, together with any configuration parameters and high-level physics information associated with the analysis process. The CERN Open Data Portal disseminates selected primary and reduced datasets of LHC experiments and offers several high-level tools, such as interactive event display and histogram plotting interfaces, that allow the general public and data scientists to visualise and further work with the data. The ultimate goal of data analysis preservation efforts is to be able to reproduce an analysis even many years after its initial publication, so that the impact of preserved analyses can be extended through their future revalidation and recasting.

Simko-24x7 CERN Open Data and Data Analysis Preservation-123.pdf

[24x7] Integration and Adoption: An ORCID story

Graham Triggs, John Fearns

Symplectic, United Kingdom

As academic engagement with institutional repositories moves from “why should I do this?” to “good idea, but how can the Library make this easier for me?”, the need for consistent and unambiguous metadata has never been greater.

Metadata consistency includes the unambiguous identification of authors, editors, supervisors, and other contributors to repository objects, but until the launch of ORCID there was no common means of unambiguously identifying authors.

In this presentation, we will explain how Imperial College London, the first institution to integrate a research information management system with an institutional repository, enabled over 1,200 research-active staff to claim an ORCID within a week, and how it subsequently automated the harvest of data from ORCID to help populate Imperial’s institutional repository with verified metadata.

Triggs-24x7 Integration and Adoption-176.pdf

Islandora as an access system for iRODS managed information packages

Kilian Amrhein, Marco Klindt

Zuse Institute Berlin (ZIB), Germany

Accessing information packages with Islandora is straightforward, but less so when they reside within a federated data management environment. In our case, dissemination information packages live in the Fedora object store for immediate access. The archival information packages are stored safely in a hierarchical storage infrastructure managed by iRODS and are only accessible for administrative and preservation purposes. We present a data model that supports both use cases using a single Islandora instance. To integrate with iRODS, we developed an Islandora module to display and deliver data and metadata from the storage location. This solution also allows us to extend the system with further preservation workflow actions that will be required in the future.

Amrhein-Islandora as an access system for iRODS managed information packages-32_a.pdf
Amrhein-Islandora as an access system for iRODS managed information packages-32_b.pptx

Databrary: A research-centered repository for video data

Andrew Gordon, Dylan A. Simon, Lisa Steiger

New York University, United States of America

As a research data repository, Databrary focuses specifically on the storage, discoverability, and sharing of video-based datasets within the developmental and learning sciences. Storing video presents its own opportunities and challenges, the latter of which include research subject privacy and the difficulty of creating and storing, in a standardized fashion, metadata that comes from different research projects. Databrary has implemented policies and practices within a functioning web application that meet both the needs of researchers and the preservation and access needs involved in sharing these datasets into the future. The lessons learned thus far in developing Databrary offer a viable model for establishing practices and workflows for gathering and organizing research data that lift the burden from researchers and also have the potential to feed into established library systems for broader findability and accessibility.


The Hydra Common Data Model

Esme Cowles¹, Robert Sanderson², Jon Stroop³

¹University of California, San Diego; ²Stanford University; ³Princeton University

One of the many successes of the Hydra community is the fundamental notion from which its name is derived: the concept of many interfaces (“heads”) on top of a single repository (the “body”). The recent release of Fedora 4, with its internal RDF-centric model, has spurred efforts toward a community-wide model of collections and works, so that the heads can be sure that the body will behave as they expect it to. That model has been designed and vetted by the Hydra community, and its architecture and initial implementations will be presented in this paper.

Cowles-The Hydra Common Data Model-102.pdf