Output – IncReASe: Final Report – self archiving rates

Title: IncReASe Final Report

Pages: 10-11

Date Released: 30 April 2009
URI for Output: http://eprints.whiterose.ac.uk/increase/increase_finalreportv1.pdf

Summary of contents:
“It is often stated that, worldwide, the spontaneous level of self-archiving is around 10-15% (i.e. about 15% of published articles are made openly available by their authors).[Harnard (2006), Björk, B-C., Roosr, A. & Lauri, M. (2008)] We found similar levels of archiving: 16% of questionnaire respondents link to local, open copies of their work; 19% link to external copies – though often these are not openly accessible. Having said this, much of the self-archived content on web sites is working papers, reports and conference papers; the % of published journal papers spontaneously self-archived (on personal web sites or in any repository) by White Rose authors is likely to be lower than 15%. Of course, there is considerable variation between subject disciplines. This highlights the immediate potential value of open access repositories but also, perhaps, underlines the scale of the cultural change required – even after several years of institutional repository development – to engage researchers in active dissemination of their outputs.”

Comments:
This provides further evidence for the percentage figures commonly quoted for self-archiving. One consequence of this figure (even within a now established repository) is the challenge faced by institutions seeking to comply with funders’ deposit mandates.

Output – IncReASe: Final Report – bulk import

Title: IncReASe: Final Report

Page: 12-13, 21

Date Released: 30 April 2009
URI for Output: http://eprints.whiterose.ac.uk/increase/increase_finalreportv1.pdf

Summary of contents:
The project looked at importing from departmental bibliographic databases and from other departmental bibliographic collections (some of which were created explicitly for this purpose).

“It is interesting to note that the department preferred on balance to create their own local database and upload material en masse at the end of the summer. Similar suggestions have been made from time to time by other departments even though creating an additional collection system involves more work at the local level. For example, we have been asked to provide an Excel template to allow data to be collected ready for periodic bulk import into the repository. Though this approach may seem counterintuitive, local academics and administrators have suggested that, for some departments, this [local collection] may be a more sustainable method of data collection. Such solutions may be worth considering, perhaps as an interim measure, where sustained self-archiving activity is proving particularly elusive – though could prove counterproductive overall.”

It is also of note that a number of departments already had their own bibliographic management tools, some of which could export in formats that are directly importable into EPrints via plugins (DOI, EndNote, BibTeX, Multiline Excel and PubMed ID). More details on the use of the plugins are available: http://eprints.whiterose.ac.uk/increase/plugins.html [page 18 notes that one difficulty with using DOI material from CrossRef is the lack of author data; as a result “We have used CrossRef as a base source of metadata but not to enhance metadata in records already created within the repository.”]

The project notes that some of the desire to use other tools may be sidestepped by future developments that better integrate repository deposit into researchers’ workflows and by the introduction of research information/management systems.

From the conclusions:
“There are likely to be personal and departmental sources of metadata suitable for bulk import at most /all HEIs. The metadata within such systems may well be inconsistent and incomplete. We found import to be more time-consuming than we hoped. A high degree of manual intervention was required: mainly to supplement incomplete metadata or add full publication details to imported “in press” items. Unless effective ways can be found to automatically check and improve bulk metadata this type of import may be a false economy and may not be the best way to grow the repository sustainably nor to embed into researchers’ workflow. An alternative approach would be to identify sources of pre-quality checked metadata – possibly from commercial sources – to create a back-catalogue of publication metadata.”
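
The kind of automatic check the conclusions call for might look something like the following. This is only a minimal Python sketch, not anything the project describes building: the field names (`title`, `authors`, `year`, `journal`, `status`) and the list of required fields are assumptions made for illustration, and a real repository schema will differ. It separates records that look complete from those that would still need the manual intervention the report mentions, including "in press" items.

```python
# Hypothetical required fields; a real repository schema will differ.
REQUIRED_FIELDS = ("title", "authors", "year", "journal")


def find_problems(record):
    """Return a list of human-readable problems for one imported record."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or value == "" or value == []:
            problems.append(f"missing {field}")
    # "In press" items need full publication details added later.
    if str(record.get("status", "")).strip().lower() == "in press":
        problems.append("in press: publication details to be completed")
    return problems


def triage(records):
    """Split records into those ready for import and those needing review."""
    ready, review = [], []
    for record in records:
        problems = find_problems(record)
        if problems:
            review.append((record, problems))
        else:
            ready.append(record)
    return ready, review


# Invented sample data, for illustration only.
sample = [
    {"title": "Paper A", "authors": ["Smith, J."], "year": 2008,
     "journal": "Journal X"},
    {"title": "Paper B", "authors": [], "year": 2009, "journal": "",
     "status": "In press"},
]
ready, review = triage(sample)
print(len(ready), "ready;", len(review), "for manual review")
for record, problems in review:
    print(record["title"], "->", "; ".join(problems))
```

A check like this only detects gaps; it does not fill them, so it would reduce rather than remove the manual effort the report describes.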

Comments:
This again highlights a concern that alternative solutions may hinder the adoption of self-archiving.

[I think] The project’s experience that departments may opt to run their own bibliographic systems is an important reminder that there is no single solution to archiving Open Access copies, and that information in one place does not equate to information in one system.

It demonstrates the effective use of a number of plugins for the EPrints software to successfully import data.

Output – IncReASe: Final report – self-archiving

Title: IncReASe Final Report

Page: 11, 21-22

Date Released: 30 April 2009

URI for Output: http://eprints.whiterose.ac.uk/increase/increase_finalreportv1.pdf

Summary of contents:
“Our observations suggest that conditions likely to improve self-deposit are:
(i) keeping things as simple as possible from the author’s perspective
(ii) always asking for the author’s final version of a work (… “Accepted Version” suggested by The VERSIONS project …)
(iii) facilitating capture of the work at the point of acceptance for publication. …
(iv) providing central support to monitor uploaded files and seek copyright clearance where required
(v) reminding authors to deposit: this could be a periodic reminder, or could be linked to a publication “event” such as a publication being indexed in a bibliographic database
(vi) highlighting the impact of deposit through the regular provision of usage data”

From the conclusions:
“There is probably no simple “optimum” deposit point for research outputs; however, in the short term, capturing papers at the point of acceptance for publication is probably the most realistic option. The emergence of desktop capture/deposit tools may facilitate earlier capture and assist with version control. Capturing the most appropriate version of a work continues to be an issue; all efforts should be made to inform researchers about the “accepted version” and its importance in the open access landscape. It is likely to be helpful to instil this awareness in early career researchers and PhD students by including open access / scholarly communication elements in training.”

Comments:
Based on their survey work and interviews, these are the project’s suggestions for supporting the self-archiving process, which remains an ongoing challenge even with mandates. The list provides workflow advice and suggests what software tools are needed.

Output – IncReASe: Final Report – research management

Title: IncReASe: Final Report

Page: 14, 21-22

Date Released: 30 April 2009
URI for Output: http://eprints.whiterose.ac.uk/increase/increase_finalreportv1.pdf

Summary of contents:
All three universities participating in WRRO have begun to examine research management systems, with differing results.
University of Sheffield have put their working group’s findings on hold pending more information about the REF but are investigating systems.
University of York is currently scoping a Research Information System (WRRO is likely to have a significant role).
University of Leeds has selected a system. Their RIS system “will [probably] become the primary ingest route for both metadata and full text”. As yet workflows and staffing (including any involvement of repository or library staff) for the metadata creation in this new system are unclear. As the source of metadata and primary point of contact with academic/ research staff this has the potential to greatly benefit the integration of WRRO into the publication process.

From the conclusions:
“Our discussion with researchers suggests that a comprehensive service – essentially, a publication database – is probably an easier sell than a pure “open access” repository (echoing the conclusions previously drawn by, for example, the TARIS project); its raison d’être is clearer and the possibility for providing services back to researchers in the form of full listings of research and detailed information on traffic to individual works, is increased. Currently, this is not the direction being taken by WRRO; rather, because other central services are likely to fulfil the publication database function, the emphasis remains on external dissemination of open access outputs.”
“Capturing grant and project data relevant to research outputs is likely to increase in importance; this data can help maximise the value of repository content for both research and administrative purposes.”

Comments:
The RIS system at Leeds does of course also raise potential difficulties for metadata quality.

All three institutions involved in WRRO are actively moving towards some form of CRIS system; in all institutions this will impact significantly on the role and prominence of the repository. It is not clear how positive or negative this impact will be. What is clear is that for institutional repositories covering the area of scholarly communications, CRIS systems are very likely to change what they do and how they operate.

Output – SAFIR: Requirements Specification – Scenarios

Title: Digital Library Project (SAFIR): Requirements Specification

Pages: 14-15
Date Released: 07 March 2008

URI for Output: https://vle.york.ac.uk/bbcswebdav/xid-89716_3

Summary of contents:
Five scenarios are presented for the use of a multimedia repository.
Each is focused around a key type of use; they are:

  • finding image materials
  • sharing resources, advice and guidance
  • streaming
  • archival collections
  • video materials

Comments:

The five scenarios are relevant to the growing knowledge/innovation base connected to the e-Framework.

Output – VIF:The results of the VIF user requirements study – datasets

Title: VIF:The results of the VIF user requirements study

Pages: webpage (summary of http://www.lse.ac.uk/library/vif/Versioning_Issues_-_Discussion_Paper.doc)
Date Released:

URI for Output: http://www.lse.ac.uk/library/vif/Problem/research.html

Summary of contents:
“VIF carried out further research into repositories that already contain some datasets, and investigated how these datasets are managed. Because this is a currently limited field, and because repository systems are not primarily configured to deal with such objects, we found that repository staff:

* Avoid versioning issues wherever possible by only keeping the most recent version. Older versions are deleted. This contrasts with how older version of other types of object are usually treated.
* By doing this, potential issues about which version people are citing becomes a problem.
* Have not found satisfactory ways to describe or indicate the relationship that a particular set of data holds to other related research outputs that are held in the repository.”

Comments:

This practice, if widespread beyond the survey group, represents a significant challenge that needs to be addressed (possibly by tool/repository plugin development). Successfully citing and sharing datasets requires a stable and identifiable versioning system.
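
As an illustration of what a stable and identifiable versioning system could mean in practice, here is a minimal Python sketch. It is not drawn from VIF or from any repository software: the class name `DatasetRecord`, the identifier format (`dataset-42/v1`) and the `related_outputs` field are all hypothetical. The point it shows is simply that versions are appended rather than overwritten, so an identifier cited earlier still resolves to the same data.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """A dataset whose versions are appended, never deleted (hypothetical)."""
    base_id: str                                   # e.g. "dataset-42" (invented)
    versions: list = field(default_factory=list)   # one description per version
    related_outputs: list = field(default_factory=list)  # ids of linked outputs

    def add_version(self, description):
        """Store a new version and return its stable, citable identifier."""
        self.versions.append(description)
        return f"{self.base_id}/v{len(self.versions)}"

    def resolve(self, version_id):
        """Return the description for a previously issued version identifier."""
        prefix = f"{self.base_id}/v"
        number = version_id[len(prefix):]
        if not version_id.startswith(prefix) or not number.isdigit():
            raise KeyError(version_id)
        index = int(number) - 1
        if not 0 <= index < len(self.versions):
            raise KeyError(version_id)
        return self.versions[index]


# Invented example data.
record = DatasetRecord("dataset-42")
record.related_outputs.append("article-7")
first = record.add_version("survey responses as first deposited")
second = record.add_version("survey responses with corrected coding")
print(first, "->", record.resolve(first))
print(second, "->", record.resolve(second))
```

Because the first version is retained, a citation to `dataset-42/v1` remains meaningful after the second version is deposited, which is the property lost when older versions are deleted.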

Output – VIF:The results of the VIF user requirements study – taxonomy

Title: VIF:The results of the VIF user requirements study

Pages: webpage (summary of http://www.lse.ac.uk/library/vif/Versioning_Issues_-_Discussion_Paper.doc)
Date Released:

URI for Output: http://www.lse.ac.uk/library/vif/Problem/research.html

Summary of contents:
“Many free text comments remarked that whilst the idea is a sound one in principle, implementing such a taxonomy [of versions] would be virtually impossible without some sort of enforcing body. Also, getting community agreement on the terminology used would be difficult due to the often polarised standpoints of publishers and information professionals. Insulating the vocabulary chosen from the pre-established terminology and bias of certain camps would clearly be a very serious undertaking.”

Comments:

This review of feedback received through the survey highlights the probable difficulties inherent in any proposed common/standard set of terms for versions of digital assets.

The contentiousness of agreeing a taxonomy of versions for articles can also be seen in the variety of responses to NISO’s work on journal article versions  (http://www.niso.org/publications/rp/RP-8-2008.pdf).

Output – VIF:The results of the VIF user requirements study – formats

Title: VIF:The results of the VIF user requirements study

Pages: webpage (summary of http://www.lse.ac.uk/library/vif/Versioning_Issues_-_Discussion_Paper.doc)
Date Released:

URI for Output: http://www.lse.ac.uk/library/vif/Problem/research.html

Summary of contents:
“There is an awareness by information professionals of a trend towards a wider range of object types being created. When asked what types of material they currently stored in their repositories, 95.4% of information professionals claimed that they currently store, or plan to store, text documents with many also stating that they store, or plan to store, audio files (73.6%), datasets (77.9%), images (83.3%), learning objects (46.5%) and video files (75.3%). This can be seen to be especially positive, especially in the context of the results of the academics survey, which suggested a large number of researchers either already create or intend to create audio files (47.2%), datasets (68%), images (72.5%), learning objects (74.6%) and video files (57.6%). As expected, the vast majority also intend to continue working with text documents.”

Comments:
This is survey data giving a snapshot of the content types stored by repositories and the content types created by academics; it provides one comparison between current ‘supply’ (what can be stored) and ‘demand’ (what users want to store), which informs the sector.
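
The comparison can be made explicit with a little arithmetic on the percentages quoted above. The short Python sketch below subtracts the academics' figure from the information professionals' figure for each content type. The variable names are illustrative, and because the two percentages come from different survey populations the difference is only a rough indication, not a measured gap.

```python
# Percentages as quoted in the VIF summary above.
stored_by_repositories = {
    "audio files": 73.6,
    "datasets": 77.9,
    "images": 83.3,
    "learning objects": 46.5,
    "video files": 75.3,
}
created_by_academics = {
    "audio files": 47.2,
    "datasets": 68.0,
    "images": 72.5,
    "learning objects": 74.6,
    "video files": 57.6,
}

# Positive gap: more repositories store (or plan to store) the type than
# academics create (or intend to create) it; negative gap: the reverse.
gaps = {
    content_type: round(
        stored_by_repositories[content_type] - created_by_academics[content_type], 1
    )
    for content_type in stored_by_repositories
}

for content_type, gap in gaps.items():
    print(f"{content_type:17} {gap:+.1f}")

shortfalls = [content_type for content_type, gap in gaps.items() if gap < 0]
print("Demand exceeds supply for:", ", ".join(shortfalls))
```

On these figures, learning objects are the only type where the proportion of academics creating the material exceeds the proportion of repositories storing or planning to store it.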

The figures for non-textual materials being stored (or about to be stored) by repositories seem quite high given comparable stats from OpenDOAR:

[Figure: Content Types in OpenDOAR Repositories - Worldwide. From: OpenDOAR]

Output – VIF – VIF user requirements study: repository purpose

Title: VIF:The results of the VIF user requirements study

Pages: webpage (summary of http://www.lse.ac.uk/library/vif/Versioning_Issues_-_Discussion_Paper.doc)
Date Released:

URI for Output: http://www.lse.ac.uk/library/vif/Problem/research.html

Summary of contents:
“The two groups did diverge on the perceived purpose of repositories. The academics we surveyed were very clear about their wish to only make the finished version of their output ultimately available and free text comments (often even in answers to questions on different subjects) showed that they considered repositories were useful to highlight latest research, but not necessarily to preserve the body of research. This contrasts directly with the wishes of information professionals, who overwhelmingly wanted to store all available versions.”

Comments:
This finding highlights a potential difference of opinion between information professionals and academics about what the repository is for. It lends support to the idea that preservation may not be perceived by academics as a key function of a repository (though the counter-example of Hull, RepoMMan etc., should be noted).

Output – CAIRO: Cairo Use Cases

Title: Cairo use cases: a survey of user scenarios applicable to the Cairo ingest tool

Pages: all
Date Released: 21 May 2007

Summary of contents:
The CAIRO “project will develop a tool for ingesting complex collections of born-digital materials, with basic descriptive, preservation and relationship metadata, into a preservation repository.” The tool is designed to aggregate and interface with other tools and so reduce the computing skills overhead on archivists wanting to create AIPs. p3
“This document outlines a set of [55] use cases describing the different interactions users of the Cairo tool have with that tool. The use cases also describe the behaviour of the tool in response to those user interactions.”p5

URI for Output: http://cairo.paradigm.ac.uk/projectdocs/cairo_project_use_cases_pv1.pdf

Comments:
This document provides a selection of use cases that have shaped the development of an ingest tool. As such they inform not only this tool but also software/service development more generally and institutional preservation planning.