OR2015

Conference Agenda

Overview and details of the sessions of this conference.

Session Overview
Date: Monday, 08/Jun/2015
9:00am - 12:30pm Workshop 05: Fedora Committers Meeting

Fedora Committers Meeting

Andrew Woods

DuraSpace, United States of America

Open Repositories represents an annual opportunity to bring current and prospective Fedora developers together to review, discuss, and share:

- current initiatives

- upcoming roadmap

- design issues

- collaboration opportunities

- etc

Although this meeting is open to community developers interested in joining the Fedora effort, this is a working/planning session and not a Fedora tutorial.

Woods-Fedora Committers Meeting-40.pdf
1:30pm - 5:00pm Workshop 10: Tutorial: Archivematica Digital Preservation Workflows

Tutorial: Archivematica digital preservation workflows

Courtney Mumma

Artefactual Systems, Inc., Canada

This is a hands-on tutorial of the Archivematica digital preservation management systems, with a particular focus on workflow flexibility. Archivematica is free and open-source and, taken together with AtoM (AccessToMemory), enables end-to-end curation of digital materials from ingest to access and management. Attendees will have the opportunity to ingest a variety of digital object formats through Archivematica, creating METS.xml with PREMIS preservation metadata and some simple Dublin Core. After processing the objects, the users will be able to upload the Archival Information Packages with logs and metadata to Archival Storage and to upload access versions and metadata to AtoM for access and management. The systems fulfill all of the requirements of OAIS-compliance, with added strategies for research data curation, archival arrangement, and preservation planning. Attendees will navigate the systems using their own laptops, and there will be time allotted for discussion of diverse workflow management.


Date: Tuesday, 09/Jun/2015
11:00am - 12:30pm P1D: Developer Track 1
Session Chair: Adam Field

Munging your data in Java with five times less code

Liz Krznarich, Laura Paglione, Rob Peters

ORCID, United States of America

Approx. Duration: 10-15mins

Java has some useful tools for creating XML/JSON APIs, and for storing data in relational databases such as Postgres. Like many similar apps, ORCID uses JAXB to represent XML/JSON as Java classes, and JPA to do the equivalent job for data from database tables.

To implement the ORCID REST API, however, we need to seamlessly translate from one to the other (in both directions!) - how do we do this without invoking a monstrous gob of code?! Enter Orika, a Java library that we use to reduce our JAXB to JPA mapping code from nearly 1000 lines to under 200.

This presentation will include:

- Basic introduction of Orika and brief demo

- Comparison of Orika to other tools

- Advanced customizations

- Examples showing how Orika is used in ORCID
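
The mapping problem described in this abstract can be sketched in a few lines. The sketch below is not Orika and not ORCID's code; it is a minimal Python analogy, assuming two hypothetical classes (`ApiWork` in the API-facing role, `WorkEntity` in the database-facing role) and a hypothetical `FieldMapper` helper. It shows the general idea of declarative mapping: same-named fields are copied by default, and only the renamed fields are declared, once, for both directions.

```python
from dataclasses import dataclass, fields


@dataclass
class ApiWork:
    """Hypothetical API-side class (the role JAXB classes play in the talk)."""
    title: str
    put_code: str
    year: int


@dataclass
class WorkEntity:
    """Hypothetical database-side class (the role JPA entities play in the talk)."""
    title: str
    id: str
    year: int


class FieldMapper:
    """Copies same-named fields by default; renamed fields are declared once."""

    def __init__(self, left, right, renames=None):
        self.left = left
        self.right = right
        self.renames = dict(renames or {})                 # left name -> right name
        self.reverse = {v: k for k, v in self.renames.items()}

    @staticmethod
    def _convert(source, target_cls, names):
        # Read every field of the source and rename it where a rule exists.
        values = {names.get(f.name, f.name): getattr(source, f.name)
                  for f in fields(source)}
        return target_cls(**values)

    def to_right(self, obj):
        return self._convert(obj, self.right, self.renames)

    def to_left(self, obj):
        return self._convert(obj, self.left, self.reverse)


mapper = FieldMapper(ApiWork, WorkEntity, renames={"put_code": "id"})
entity = mapper.to_right(ApiWork(title="Example", put_code="42", year=2015))
round_trip = mapper.to_left(entity)
```

The single `renames` table replaces hand-written copy code in each direction, which is where a declarative mapper saves lines.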

Krznarich-Munging your data in Java with five times less code-230.rtf

Microservices with Docker and Go

Richard Wincewicz

University of Edinburgh, United Kingdom

Technologies like Docker are becoming more common and allow developers to speed up their development workflows and get robust code into production much faster. Docker is written in Go, and creating microservices in Go allows for very lightweight components that are easy to deploy. Connecting these together with a messaging system creates an environment that is flexible, distributed and elastic.

My presentation will involve demonstrating a number of Go-based microservices that work in concert to process content ingested into a Fedora repository. I will also demonstrate how the system can be scaled to deal with increased traffic. I anticipate that this will take around 20 mins to explain and demonstrate.
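
The pattern of small workers connected by a message queue can be sketched as follows. This is not the presenter's Go code; it is a minimal in-process Python illustration, assuming a hypothetical `run_workers` helper and using a thread-safe queue in place of a real messaging system. Scaling up corresponds to raising `worker_count`.

```python
import queue
import threading


def run_workers(items, handler, worker_count):
    """Process items with a pool of workers that all pull from one queue.

    Hypothetical helper: items must not be None, because None is used as
    the "no more work" sentinel.
    """
    tasks = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            item = tasks.get()
            if item is None:          # sentinel: this worker can stop
                return
            outcome = handler(item)
            with lock:                # protect the shared result list
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for item in items:
        tasks.put(item)
    for _ in threads:                 # one sentinel per worker
        tasks.put(None)
    for thread in threads:
        thread.join()
    return results


processed = run_workers(["a", "b", "c"], str.upper, worker_count=2)
```

Results arrive in completion order, not submission order, so callers that need ordering should sort or tag the messages.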

Wincewicz-Microservices with Docker and Go-235.pptx

DSpace UI enhancements, visualizations and Python scripting

Ivan Masar


After launching our DSpace-based repository for research outputs, we added many small but convenient UI features, such as citation counts, a SHERPA/RoMEO status checker, RefWorks export, a citation generator and Ex Libris bX (related articles), which move us from a barebones repository a few steps towards a perfect one. We'll show what technology is behind them, which APIs were used and how they were integrated into XMLUI.

Furthermore, we spiced up the repository with a few visualizations which make it understandable at a glance, even though they present data also available elsewhere. The d3 JavaScript library was used as a building block.

Last but not least, we used Python to build some of the above and more. Although using Java instead would be possible, Python made the prototyping faster and more fun. I'll show several ways you can use Python to work with the DSpace Java API and Solr.
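
As one illustration of working with Solr from Python, the sketch below builds a select URL with the standard library only. It is not taken from the presentation: the base URL, the core name `search` and the field name `author_keyword` are assumptions for illustration, and `build_solr_query` is a hypothetical helper. The query parameters themselves (`q`, `rows`, `wt`, `facet`, `facet.field`) are standard Solr parameters.

```python
from urllib.parse import urlencode


def build_solr_query(base_url, core, query, facet_field=None, rows=10):
    """Return a Solr select URL; hypothetical helper, no request is sent."""
    params = [("q", query), ("rows", str(rows)), ("wt", "json")]
    if facet_field:
        # Ask Solr to count documents per value of the given field.
        params += [("facet", "true"), ("facet.field", facet_field)]
    return f"{base_url.rstrip('/')}/{core}/select?{urlencode(params)}"


# Assumed local Solr location, core name and field name.
url = build_solr_query(
    "http://localhost:8080/solr",
    "search",
    "title:repository",
    facet_field="author_keyword",
    rows=0,
)
```

The resulting URL can be fetched with `urllib.request.urlopen` and decoded with `json.load` against a running Solr instance.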

Masar-DSpace UI enhancements, visualizations and Python scripting-236.pdf

Vagrant-DSpace Live Demo

Hardy Joseph Pottinger

University of Missouri, United States of America

Vagrant is a tool for building complete development environments. With it, you can set up a development environment quickly, reproducibly, and in a way that is readily sharable with others. Vagrant-DSpace harnesses this power to help developers quickly ramp up to working with DSpace, and to facilitate sharing that work with others. Don't believe it? Let me show you. I will live demo Vagrant-DSpace in action. If you bring your notebook and have a Vagrant Cloud login, maybe we can even do some impromptu pair programming? Let's do this.

Pottinger-Vagrant-DSpace Live Demo-82.pdf

SobekCM : A true standards-based, structured, user-friendly approach to APIs and open data

Mark Sullivan

Sobek Digital Hosting & Consulting, LLC, United States of America

The Open Source SobekCM digital repository is approaching its tenth year of development and third year of being released Open Source. Recently, the SobekCM development community has focused on a major architectural revolution, moving towards configurable micro-services and a clear separation between the engine and standard web interface, to enable greater ease for installation and administration while adding support for research data. SobekCM continues to represent a unified, structured approach while retaining its commitment to standards-compliance, retaining core METS/MODS and embracing other metadata formats. In addition, community involvement around the software has continued to develop, with the official community framework draft released in early 2015.

This presentation will introduce the new REST APIs and show ways to “hack the API” to quickly build a new user interface over the SobekCM engine to replace the standard, included user interface. The presentation will showcase the API for searching and sharing research data which will include building a research data portal. The presentation will cover issues encountered and solutions developed when implementing changes to increase configurability, modularity, and customization while remaining true to the core set of beliefs that founded SobekCM.

3:30pm - 5:30pm P3D: Developer Track 2
Session Chair: Claire Knowles

Metadata Extraction as a Service

William Gunn

Mendeley, United Kingdom

Generating metadata records for IR deposits should not have to be a manual process. I'll demo some technology that Mendeley has developed which uses machine learning to automatically extract article metadata from PDFs and show how this can be used as a service within your own IR. I might not have a working DSpace plugin to show by then, but I'll have something appropriately hacky. The catalog enrichment part might be the most interesting and unique thing for attendees; for more on that, see: https://krisjack.wordpress.com/2015/03/12/how-well-does-mendeleys-metadata-extraction-work/

Gunn-Metadata Extraction as a Service-242.pdf

Doing DevOps for a Perfect Repository Environment

Glen Horton

University of Cincinnati Libraries, United States of America

The University of Cincinnati implemented their Hydra-based repository, Scholar@UC, using a DevOps approach. DevOps is a development methodology that promotes communication and collaboration between the software developers and IT operations. The partnership between the Libraries’ developers and UC's IT staff fostered a stable and robust hosting environment for the repository.

Glen will discuss the journey UC took from code to deployment and highlight what worked and what didn't work so well. He will also share the many tools used to develop Scholar@UC's hosting and deployment environment including Vagrant, Puppet, GitHub, and Bamboo. Glen will also explain the importance of communication and share the steps UC took to make sure everyone had a clear understanding of what needed to get done.

Horton-Doing DevOps for a Perfect Repository Environment-232.pdf

Archidora: Leveraging Archivematica preservation services with an Islandora front-end

Justin Simpson

Artefactual Systems, Inc., Canada

Archidora was co-developed by the Islandora developers Discovery Garden and the Archivematica developers Artefactual Systems, and sponsored by the University of Saskatchewan Libraries. Stated simply, files uploaded to Islandora pass from Fedora to Archivematica, where they are processed for preservation. Once the archival packages are stored, Islandora is notified. This presentation would describe the current workflow, as well as discuss the opportunities it creates for development of features like PREMIS and DDI integration, Fedora support and integrity checks.

Time for presentation: 20 minutes


Data Citation Box

Jozef Mišutka

Charles University in Prague, Czech Republic

Citing submissions is important, but there is not yet an established format for citing data submissions.

We have created a citation service, based on the DSpace OAI-PMH endpoint implementation, which returns citations for resources specified by PID in a desired format, such as simple HTML-styled text. We display the citation box in the DSpace item view and also in external applications.
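
The formatting step of such a service can be sketched as below. This is not the authors' implementation and does not talk to OAI-PMH; it is a minimal Python illustration, assuming a hypothetical `format_citation` function, a hypothetical record layout (`creators`, `year`, `title`, `publisher`) and a made-up PID. It shows how one metadata record might be rendered as either plain text or simple HTML.

```python
from html import escape


def format_citation(record, pid, fmt="text"):
    """Render one metadata record as a citation string.

    Hypothetical function and record layout; the citation style is an
    assumption, not an established standard.
    """
    authors = "; ".join(record["creators"])
    year = record["year"]
    title = record["title"]
    publisher = record["publisher"]
    if fmt == "text":
        return f"{authors} ({year}). {title}. {publisher}. {pid}"
    if fmt == "html":
        # Escape every value so metadata cannot inject markup.
        return (
            f'<span class="citation">{escape(authors)} ({year}). '
            f"<i>{escape(title)}</i>. {escape(publisher)}. {escape(pid)}</span>"
        )
    raise ValueError(f"unsupported format: {fmt}")


# Made-up record and PID, for illustration only.
record = {
    "creators": ["Doe, Jane", "Roe, Richard"],
    "year": 2015,
    "title": "Example Corpus <v1>",
    "publisher": "Example Repository",
}
text_citation = format_citation(record, "hdl:0000/example")
html_citation = format_citation(record, "hdl:0000/example", fmt="html")
```

Keeping the formatter independent of the lookup makes it easy to reuse the same output in an item view and in external applications.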

Mišutka-Data Citation Box-241.pdf

Publishing Datasets from an Open Access Repository As Linked Data

Hui Zhang

Oregon State University, United States of America

Exposing research data from traditional repository systems such as DSpace as Linked Data has numerous benefits, such as increasing visibility, promoting open access, and interlinking datasets with elements such as authors. For a successful implementation, it is crucial to preserve object structure (e.g., hierarchical and related) in Linked Data in addition to bibliographic metadata.

This demonstration showcases a case study of migrating datasets from DSpace to a newly developed institutional repository built with Hydra technology, including lessons learned from data modeling and approaches for metadata cleanup and controlled vocabulary enrichment. The estimated time for the demonstration is between 10 and 15 minutes.
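
Preserving hierarchical structure alongside bibliographic metadata can be sketched as follows. This is not the presenter's data model; it is a minimal Python illustration, assuming a hypothetical nested dictionary layout, made-up `example.org` URIs and a hypothetical `to_ntriples` function. It emits N-Triples using the Dublin Core terms `title`, `hasPart` and `isPartOf`.

```python
DCTERMS = "http://purl.org/dc/terms/"


def literal(value):
    """Quote a string as a literal (only backslash and quote are escaped here)."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(uri, node):
    """Emit a title triple for the node and part/whole triples for each child.

    Hypothetical function and input layout: node is a dict with a "title"
    and an optional "parts" dict that maps child URIs to child nodes.
    """
    lines = [f"<{uri}> <{DCTERMS}title> {literal(node['title'])} ."]
    for child_uri, child in node.get("parts", {}).items():
        lines.append(f"<{uri}> <{DCTERMS}hasPart> <{child_uri}> .")
        lines.append(f"<{child_uri}> <{DCTERMS}isPartOf> <{uri}> .")
        lines.extend(to_ntriples(child_uri, child))   # recurse into the hierarchy
    return lines


# Made-up dataset with one file, for illustration only.
dataset = {
    "title": "Survey data",
    "parts": {"http://example.org/file/1": {"title": "Raw responses"}},
}
triples = to_ntriples("http://example.org/dataset/1", dataset)
```

Stating the relation in both directions keeps the hierarchy navigable from either the dataset or the file.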

Zhang-Publishing Datasets from an Open Access Repository As Linked Data-234.pdf

Date: Thursday, 11/Jun/2015
11:00am - 12:30pm DSP1: DSpace Interest Group 1: DSpace Strategic Plan and Road Map
Session Chair: Sean Thomas

DSpace Long Term RoadMap / Strategic Direction

Tim Donohue (1), Maureen Walsh (2), Jonathan Markow (1)

(1) DuraSpace, United States of America; (2) Ohio State University

Over the past few years, the DSpace project has made great strides towards establishing a longer term roadmap.

In 2013, we held a series of “Vision Discussions” to begin brainstorming the vision for DSpace's future. This resulted in a high level Vision Document, as well as a community survey of needs and use cases to achieve that vision.

In 2014, we analyzed data from that community survey (into a very “rough” high-level plan), and the DSpace Community Advisory Team began to flesh out more detailed Use Cases that DSpace should strive to achieve.

In 2015, in conjunction with the new DSpace Steering Group and Leadership Group, we are now working towards drafting a longer term RoadMap. The goal of this RoadMap would be to attempt to schedule out a clear plan for achieving the most common community use cases, while also pointing out opportunities for institutions to collaborate on additional features or needs.

This session will summarize the path we’ve taken towards achieving this long term Roadmap, as well as introduce an early draft of the long term Roadmap for community feedback. There will be an opportunity for open discussion / Q&A during this session.

Donohue-DSpace Long Term RoadMap Strategic Direction-149_a.pdf
Donohue-DSpace Long Term RoadMap Strategic Direction-149_b.ppt
1:30pm - 3:00pm DSP2A: DSpace Interest Group 2A: DSpace 5 / Managing Research (and Open) Data
Session Chair: Maureen Walsh

Introducing DSpace 5

Kim Shepherd (1), Hardy Pottinger (2), Tim Donohue (3)

(1) The University of Auckland, New Zealand; (2) University of Missouri; (3) DuraSpace Ltd

This presentation will cover the recent release of DSpace 5.0, its contributors, new features, improvements and significant bugfixes, as well as a look towards future plans.

DSpace 5.0 was released on January 21st 2015 after months of hard work by the release team, DSpace developers and committers, DCAT, DuraSpace and the entire DSpace community.

New features and improvements will be discussed and demonstrated, such as:

* A new responsive web theme for XMLUI: Mirage 2

* Improved, streamlined update process for DSpace repositories

* Batch import items via the web UI

* PDF coverpage generation

* Integration with systems such as ORCID, SHERPA/RoMEO

* REST API improvements

...and much more!

As well as highlights of contributions and contributors, time will be set aside for an extended Q&A session where the audience and attending contributors can ask questions and start discussions about new features, upgrades, and ideas for future releases.

Shepherd-Introducing DSpace 5-110_a.pdf
Shepherd-Introducing DSpace 5-110_b.pptx

Durable Item Relations for DSpace 6

Tom Van Gulck (1), Jan Lievens (1), Bram Luyten (2)

(1) Flemish Government Department of Environment, Nature and Energy; (2) @mire, Belgium

The hierarchical DSpace datamodel has long been recognized as a limiting factor for using DSpace in contexts other than typical institutional repository services. This proposal presents a new contribution to the DSpace 6 development facilitating the creation of durable item relations in DSpace. The contribution allows repository managers to break away from the tightly defined hierarchical structure and enables a variety of new use cases for the DSpace platform.

Unlike past proposals with similar ambitions, including the DSpace 2 prototype work, the proposed approach and contribution have been fully developed and are operational today. The functionality was established in such a way that backwards compatibility with the standard DSpace datamodel has been preserved. As a result, the inclusion of the work into the DSpace codebase does not present the community with new constraints or limitations that would hinder adoption.

These developments have been undertaken by the Flemish Government Department of Environment, Nature and Energy. The main motivation for these developments was the need for representing complex objects in DSpace, while preserving the possibility to apply granular access controls on items and bitstreams.

Van Gulck-Durable Item Relations for DSpace 6-94_a.pdf

What does it take to add data to my repository?

Ryan Scherle

Dryad Digital Repository

Researchers are increasingly motivated to make their data available for future use. Repositories are an ideal location to store such data. However, data is now being placed in repositories that were not designed for data, and in some cases, the repositories were explicitly designed for other purposes. Repository staff do not always have the necessary training or tools to handle data, and repository policies do not always reflect the realities of data.

This talk will review the challenges unique to storing and managing data within a repository. In particular, it will focus on the factors that repository staff must consider when determining the policies that govern acceptance of new data for their repository and long-term management of data in their repository.

Scherle-What does it take to add data to my repository-209_a.pptx
3:30pm - 5:00pm DSP3A: DSpace Interest Group 3A: Review Workflow Workshop
Session Chair: Bram Luyten

DSpace review workflow: the next generation

Andrea Schweer

The University of Waikato, New Zealand

This interactive session invites managers of DSpace repositories to share how they are using the DSpace review workflow currently, what issues they encounter around the review workflow and what areas of functionality are missing or not quite right in the current implementation. DSpace developers are invited to attend the session to learn from their end users.

Schweer-DSpace review workflow-89.pdf