OR2015

Conference Agenda



Session Overview
P3D: Developer Track 2
Tuesday, 09/Jun/2015:
3:30pm - 5:30pm

Session Chair: Claire Knowles
Location: Network
100 Seats


Metadata Extraction as a Service

William Gunn

Mendeley, United Kingdom

Generating metadata records for IR deposits should not have to be a manual process. I'll demo some technology that Mendeley has developed, which uses machine learning to automatically extract article metadata from PDFs, and show how this can be used as a service within your own IR. I might not have a working DSpace plugin to show by then, but I'll have something appropriately hacky. The catalog enrichment part might be the most interesting and unique thing for attendees. For more on that, see: https://krisjack.wordpress.com/2015/03/12/how-well-does-mendeleys-metadata-extraction-work/

Gunn-Metadata Extraction as a Service-242.pdf

Doing DevOps for a Perfect Repository Environment

Glen Horton

University of Cincinnati Libraries, United States of America

The University of Cincinnati implemented their Hydra-based repository, Scholar@UC, using a DevOps approach. DevOps is a development methodology that promotes communication and collaboration between the software developers and IT operations. The partnership between the Libraries’ developers and UC's IT staff fostered a stable and robust hosting environment for the repository.

Glen will discuss the journey UC took from code to deployment and highlight what worked and what didn't work so well. He will also share the many tools used to develop Scholar@UC's hosting and deployment environment including Vagrant, Puppet, GitHub, and Bamboo. Glen will also explain the importance of communication and share the steps UC took to make sure everyone had a clear understanding of what needed to get done.

Horton-Doing DevOps for a Perfect Repository Environment-232.pdf

Archidora: Leveraging Archivematica preservation services with an Islandora front-end

Justin Simpson

Artefactual Systems, Inc., Canada

Archidora was co-developed by the Islandora developers at Discovery Garden and the Archivematica developers at Artefactual Systems, and was sponsored by the University of Saskatchewan Libraries. Stated simply, files uploaded to Islandora pass from Fedora to Archivematica, where they are processed for preservation. Once the archival packages are stored, Islandora is notified. This presentation will describe the current workflow and discuss the opportunities it creates for developing features such as PREMIS and DDI integration, Fedora support, and integrity checks.

Time for presentation: 20 minutes


Data Citation Box

Jozef Mišutka

Charles University in Prague, Czech Republic

Citing submissions is important, but data submissions do not yet have an established citation format.

We have created a citation service, based on the DSpace OAI-PMH endpoint implementation, which returns citations of resources specified by PID in a desired format, such as simple HTML-styled text. We display the citation box in the DSpace item view and also in external applications.
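As a rough illustration of what a request to such a service might look like, the sketch below builds a standard OAI-PMH GetRecord URL for a resource identified by PID. The base URL, the example PID, and the use of "html" as a metadataPrefix are assumptions made for illustration; the abstract does not specify the actual endpoint or its parameters.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the abstract does not give the real service URL.
BASE_URL = "https://repository.example.org/oai/cite"


def citation_request_url(pid, fmt="html", base_url=BASE_URL):
    """Build an OAI-PMH GetRecord URL for one resource in one format.

    `verb`, `identifier` and `metadataPrefix` are standard OAI-PMH
    request arguments; using "html" as a prefix is an assumption.
    """
    params = {
        "verb": "GetRecord",
        "identifier": pid,
        "metadataPrefix": fmt,
    }
    # urlencode percent-encodes reserved characters in the PID.
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder PID, not a real handle.
    print(citation_request_url("hdl:0000/example"))
```

An external application could fetch this URL and embed the returned text wherever the citation box should appear.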

Mišutka-Data Citation Box-241.pdf

Publishing Datasets from an Open Access Repository As Linked Data

Hui Zhang

Oregon State University, United States of America

Exposing research data from traditional repository systems such as DSpace as Linked Data has numerous benefits, such as increasing visibility, promoting open access, and interlinking datasets with elements such as authors. For a successful implementation, it is crucial to preserve object structure (e.g., hierarchical and related objects) in Linked Data in addition to bibliographic metadata.
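To make the point about object structure concrete, here is a minimal sketch that emits N-Triples for a dataset and its files, using the Dublin Core terms hasPart and isPartOf to keep the hierarchy alongside the title. The URIs and the choice of Dublin Core terms are assumptions for illustration, not the data model used in the demonstration.

```python
DCTERMS = "http://purl.org/dc/terms/"


def escape_literal(text):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def dataset_triples(dataset_uri, title, file_uris):
    """Return N-Triples lines for a dataset, its title, and its files.

    Each file is linked in both directions so the parent/child
    structure survives outside the repository.
    """
    lines = [f'<{dataset_uri}> <{DCTERMS}title> "{escape_literal(title)}" .']
    for file_uri in file_uris:
        lines.append(f"<{dataset_uri}> <{DCTERMS}hasPart> <{file_uri}> .")
        lines.append(f"<{file_uri}> <{DCTERMS}isPartOf> <{dataset_uri}> .")
    return lines


if __name__ == "__main__":
    # Example URIs are placeholders, not real repository identifiers.
    for line in dataset_triples(
        "http://example.org/dataset/1",
        "Sample dataset",
        ["http://example.org/file/1", "http://example.org/file/2"],
    ):
        print(line)
```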

This demonstration showcases a case study of migrating datasets from DSpace to a newly developed institutional repository built with Hydra technology, including lessons learned from data modeling and approaches to metadata cleanup and controlled vocabulary enrichment. The estimated time for the demonstration is between 10 and 15 minutes.

Zhang-Publishing Datasets from an Open Access Repository As Linked Data-234.pdf