Output – IncReASe: Final Report – web pages

Title: IncReASe: Final Report

Page: 13

Date Released: 30 April 2009
URI for Output: http://eprints.whiterose.ac.uk/increase/increase_finalreportv1.pdf

Summary of contents:
“Analysis of individual researcher publication pages revealed a good deal of inconsistency of formatting, including within individual publication lists. The idea of “scraping” publication metadata from researcher pages is attractive, but the reality is quite challenging.”
“The Perl code written for one author could not be reused with another and would need tweaking every time.”

More details of the issues encountered are available at http://eprints.whiterose.ac.uk/increase/scraper.html

The project notes that the AIR project is investigating more sophisticated approaches to this problem, using machine-learning algorithms: http://clg.wlv.ac.uk/projects/AIR/
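The reuse problem the report describes can be illustrated with a minimal Python sketch (the project itself used Perl). The two citation lines and the pattern below are hypothetical examples, not taken from the report or from any real researcher page. They only show how a pattern tuned to one author's layout fails silently on another's.

```python
import re

# Hypothetical citation lines, as they might appear on two researchers' pages.
AUTHOR_A = 'Smith, J. (2007) "A study of repositories." Journal of Examples, 12(3), 45-67.'
AUTHOR_B = "J. Smith. A study of repositories. Journal of Examples 12 (2007), pp. 45-67"

# Pattern tuned to author A's layout: Authors (Year) "Title." remainder
PATTERN_A = re.compile(
    r'^(?P<authors>[^(]+?)\s*\((?P<year>\d{4})\)\s*"(?P<title>[^"]+)"\s*(?P<rest>.*)$'
)


def parse_citation(line):
    """Return a dict of fields, or None if the line does not fit author A's layout."""
    match = PATTERN_A.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Drop the full stop that author A places inside the quotation marks.
    fields["title"] = fields["title"].rstrip(".")
    return fields


parsed_a = parse_citation(AUTHOR_A)  # fields extracted
parsed_b = parse_citation(AUTHOR_B)  # None: the layout differs, so nothing is extracted
```

Each new layout would need its own pattern, which is consistent with the report's remark that the code "would need tweaking every time".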

Output – IncReASe: Final Report – bulk import

Title: IncReASe: Final Report

Page: 12-13, 21

Date Released: 30 April 2009
URI for Output: http://eprints.whiterose.ac.uk/increase/increase_finalreportv1.pdf

Summary of contents:
The project looked at importing from departmental bibliographic databases and from other departmental bibliographic collections (some of which were created explicitly for this purpose).

“It is interesting to note that the department preferred on balance to create their own local database and upload material en masse at the end of the summer. Similar suggestions have been made from time to time by other departments even though creating an additional collection system involves more work at the local level. For example, we have been asked to provide an Excel template to allow data to be collected ready for periodic bulk import into the repository. Though this approach may seem counterintuitive, local academics and administrators have suggested that, for some departments, this [local collection] may be a more sustainable method of data collection. Such solutions may be worth considering, perhaps as an interim measure, where sustained self-archiving activity is proving particularly elusive – though could prove counterproductive overall.”

It is also of note that a number of departments already had their own bibliographic management tools, some of which could export in formats that are directly importable into EPrints via plugins (DOI, EndNote, BibTeX, Multiline Excel and PubMed ID). More details on the use of the plugins are available at http://eprints.whiterose.ac.uk/increase/plugins.html. [Page 18 notes that one difficulty with using DOI material from CrossRef is the lack of author data; as a result, “We have used CrossRef as a base source of metadata but not to enhance metadata in records already created within the repository.”]

The project notes that some of the desire to use other tools may be sidestepped by future developments that better integrate repository deposit into researchers’ workflows and by the introduction of research information/management systems.

From the conclusions:
“There are likely to be personal and departmental sources of metadata suitable for bulk import at most /all HEIs. The metadata within such systems may well be inconsistent and incomplete. We found import to be more time-consuming than we hoped. A high degree of manual intervention was required: mainly to supplement incomplete metadata or add full publication details to imported “in press” items. Unless effective ways can be found to automatically check and improve bulk metadata this type of import may be a false economy and may not be the best way to grow the repository sustainably nor to embed into researchers’ workflow. An alternative approach would be to identify sources of pre-quality checked metadata – possibly from commercial sources – to create a back-catalogue of publication metadata.”
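As an illustration of what automatically checking bulk metadata might look like before import, here is a minimal Python sketch. It is not the project's method. The column names, the required-field list and the sample rows are all assumptions made for the example, and a real spreadsheet template would define its own.

```python
import csv
import io

# Hypothetical column names for a departmental spreadsheet export.
REQUIRED_FIELDS = ("title", "authors", "year", "journal")


def find_incomplete(rows):
    """Yield (row_number, problems) for records that would need manual attention."""
    for number, row in enumerate(rows, start=1):
        problems = [f for f in REQUIRED_FIELDS if not (row.get(f) or "").strip()]
        # "In press" items lack full publication details, so flag them as well.
        if (row.get("year") or "").strip().lower() == "in press":
            problems.append("year (in press)")
        if problems:
            yield number, problems


# Made-up sample data standing in for a CSV export of the spreadsheet.
SAMPLE = """title,authors,year,journal
A study of repositories,"Smith, J.",2007,Journal of Examples
Untitled draft,,in press,
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
report = list(find_incomplete(rows))
for number, problems in report:
    print(f"row {number}: needs {', '.join(problems)}")
```

A check like this only finds gaps. Filling them in is the manual work the report says made import more time-consuming than hoped.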

There is again a highlighted concern about alternative solutions impacting on the adoption of self-archiving.

[I think] The project’s experience that departments may opt to run their own bibliographic systems is an important reminder that there is no single solution to archiving Open Access copies, and that information in one place does not equate to information in one system.

The project also demonstrates the effective use of a number of plugins for the EPrints software to import data successfully.

Output – IESR – Latest Additions RSS feed

Title: IESR Latest Additions RSS feed

Date Released: Unknown

URI for Output: http://iesr.ac.uk/feeds/latestadditions.xml

Summary of contents:

Allows applications or users to subscribe to the 10 latest resources added to IESR.
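A minimal Python sketch of how an application might read such a feed is given below. It assumes the feed is plain, un-namespaced RSS 2.0, which has not been checked against the IESR feed; an RSS 1.0 (RDF) feed would need namespace-aware lookups. The sample document and its URLs are made up for the example.

```python
import xml.etree.ElementTree as ET


def latest_items(rss_text, limit=10):
    """Return (title, link) pairs for up to `limit` items in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        title = item.findtext("title", default="").strip()
        link = item.findtext("link", default="").strip()
        items.append((title, link))
        if len(items) == limit:
            break
    return items


# Made-up sample feed; a real client would fetch the feed text over HTTP first.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Latest additions (sample)</title>
    <item>
      <title>Example resource one</title>
      <link>http://example.org/resource/1</link>
    </item>
    <item>
      <title>Example resource two</title>
      <link>http://example.org/resource/2</link>
    </item>
  </channel>
</rss>"""

for title, link in latest_items(SAMPLE_FEED):
    print(title, link)
```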

Additional information:


Output – IESR – OpenURL Link-To Resolver

Title: IESR OpenURL Link-To Resolver

Date Released: Unknown

URI for Output: http://iesr.ac.uk/use/openurl/

Summary of contents:

“The IESR OpenURL ‘Link-To’ Resolver service provides retrieval of IESR XML records for single entities using Z39.88-2004, OpenURL Framework, syntax. Currently support for OpenURL syntax is limited allowing retrieval by identifier only using a Key/Encoded Value (KEV) request inline by HTTP. Values must be URL-encoded.”
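A KEV request by identifier could be put together along the following lines. This is a sketch under stated assumptions: `url_ver` and `rft_id` are standard Z39.88-2004 KEV keys, but whether the IESR resolver expects exactly these keys is not confirmed by the description above, and the resolver address and identifier used here are hypothetical placeholders.

```python
from urllib.parse import urlencode


def build_kev_request(resolver_base, identifier):
    """Build an inline (HTTP GET) OpenURL KEV request for a single identifier.

    Assumption: the resolver accepts the standard `url_ver` and `rft_id` keys.
    """
    params = {
        "url_ver": "Z39.88-2004",  # version of the OpenURL Framework
        "rft_id": identifier,      # identifier of the entity being requested
    }
    # urlencode URL-encodes each value, as the service description requires.
    return resolver_base + "?" + urlencode(params)


# Hypothetical resolver address and identifier, for illustration only.
url = build_kev_request("http://resolver.example.org/openurl", "http://example.org/id/123")
print(url)
```

The identifier is itself a URI in this example, so its colon and slashes are percent-encoded in the resulting query string.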

Additional information:


Output – IESR – Registry m2m Interfaces (OAI-PMH, SRU/W and Z39.50)

Title: IESR Registry Machine to Machine Interfaces (OAI-PMH, SRU/W and Z39.50)

Date Released: Unknown

URI for Output: http://iesr.ac.uk/use/oaipmh/, http://iesr.ac.uk/use/sru/, http://iesr.ac.uk/use/z3950/

Summary of contents:

These outputs provide m2m interfaces for cross-, meta- or federated search applications.
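To show what a client of these interfaces sends, here is a minimal Python sketch that builds an OAI-PMH `ListRecords` request and an SRU `searchRetrieve` request. The parameter names come from the OAI-PMH and SRU 1.1 protocols. The endpoint addresses are hypothetical placeholders: the URIs listed above may be documentation pages rather than service endpoints, so the real base URLs are not assumed here.

```python
from urllib.parse import urlencode


def oai_list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is exclusive: no other arguments except verb.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def sru_search_url(base_url, cql_query, maximum_records=10):
    """Build an SRU 1.1 searchRetrieve request URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "maximumRecords": maximum_records,
    }
    return base_url + "?" + urlencode(params)


# Hypothetical endpoints, for illustration only.
oai_url = oai_list_records_url("http://oai.example.org/oai")
sru_url = sru_search_url("http://sru.example.org/sru", "repository")
print(oai_url)
print(sru_url)
```

A federated search application would issue requests like these and parse the XML responses. Z39.50 is a binary protocol and is not covered by this sketch.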

Additional information:


Output – IESR – IESR Registry

Title: IESR Registry

Date Released: Unknown

URI for Output: http://iesr.ac.uk/service/iesrbrowse?type=new

Summary of contents:

“IESR is a resource discovery tool intended to benefit the UK academic community. Access is through web and machine interfaces or search plug-ins.”

Additional information:

There are a number of machine to machine interfaces to the IESR. The link above is to the web interface aimed at users.


Project – OpenDoar

Project Name: OpenDoar

Programme Name: Digital Repositories Programme 2005-7

Strand: Information Environment, e-Administration

JISC Project URI: http://www.jisc.ac.uk/whatwedo/programmes/digitalrepositories2005/opendoar.aspx

Project URI: http://www.opendoar.org/

Start Date: 1st Jan 2005

End Date: 30th June 2006


Contact Name and Role: Bill Hubbard, Project Manager

Brief project description:

“OpenDOAR will categorise and list the wide variety of Open Access research archives that have grown up around the world. Such repositories have mushroomed over the last 2 years in response to calls by scholars and researchers worldwide to provide open access to research information.

OpenDOAR will provide a comprehensive and authoritative list of institutional and subject-based repositories, as well as archives set up by funding agencies – like the National Institutes for Health in the USA or the Wellcome Trust in the UK and Europe. Users of the service will be able to analyse repositories by location, type, the material they hold and other measures. This will be of use both to users wishing to find original research papers and for third-party “service providers”, like search engines or alert services, which need easy to use tools for developing tailored search services to suit specific user communities.”


  • Descriptive list of open access repositories of relevance to academic research.
  • Comprehensive & authoritative list for end users wishing to find particular types of, or specific repositories.
  • Comprehensive, structured and maintained list with clear update and self-regulation protocols to enable development of the list.
  • Prominent international role in the organisation of and access to open access repository services.
  • Supporting Open Access outreach and advocacy endeavours within institutions and globally.
  • Survey the growing field of academic open access research repositories and categorise them in terms of locale, content and other measures.

Project – Fedorazon

Project Name: Fedorazon

Short Project Name: Fedorazon

Programme Name:  Repositories and Preservation

Strand: Repositories Start-up and Enhancement projects

JISC Project URI: http://www.jisc.ac.uk/whatwedo/programmes/reppres/sue/fedorazon.aspx

Project URI: http://www.ukoln.ac.uk/repositories/digirep/index/Fedorazon

Start Date:  1 October 2007

End Date: 31 March 2008

Governance: JISC IE

Contact Name and Role: David Flanders (Project Manager)

Brief project description:

The aim of the Fedorazon project is to enhance the content of repositories throughout the UK’s HE and FE sector by providing solutions for the scalability of repositories as they grow in size and complexity. It looks to remove the “hardware” barriers involved in launching and maintaining a repository. It will accomplish this initially by enabling the use of Fedora Commons repository software on top of Amazon’s virtual servers (EC2 & S3). Because these servers are pre-configured, any HE/FE institution can “rent” Amazon server space and launch its own secure Fedora repository without having to configure a local server within the institution. In short, institutions can launch their repository service on the same day they decide to have one, and without hiring a “hardware” expert. Overall, the project will begin to assess the cost-effectiveness of this kind of set-up and recommend best practice to other repository departments.

Name of Trawler: Mahendra Mahey


Output – UHRA – University of Hertfordshire Research Archive

Title: The University of Hertfordshire Research Archive

Date Released: Approx September 2007

URI for Output: https://uhra.herts.ac.uk/dspace/

Summary of contents:

The main output from this project is the establishment of the University of Hertfordshire Research Archive. It is described as “... a showcase of the research produced by the University of Hertfordshire staff (copyright permitting) which is freely available over the web” and “... provides a simple interface to enable researchers to self-archive the full text of their published work with just a few quick and easy steps.”

Additional information:


The archive/repository appears to be fully functional and contains 2556 items as at 30th January 2007.

Output – KULTUR – Environmental Assessment of the University of the Arts, London

Title: Environmental Assessment of the University of the Arts, London
Number of pages or page numbers: pp 6-7
Section: Summary

Date Released: 8th April 2008

URI for Output: http://kultur.eprints.org/docs/UUAL%20profile%208%20april%20online%20version.pdf

Summary of contents:

The summary section has a few useful observations with respect to repositories in the Arts sector:

“The opportunities for a repository at UAL are great since there is a wealth of research being produced at all levels within the University. At the same time the sheer amount of research and research active staff can present its own problems. The targeting of key research staff, the enlisting of research centres/units and the research offices are essential for the success of the project. Advocacy from the top and from the bottom is needed but this can only really be effective by establishing good relationships and links with relevant University bodies and staff. We need to identify just what a repository can do for each group and advocate along those lines … Populating the demonstrator with a good number of pieces of research will help the project become more attractive and viable to research staff. The interface and the software itself will also play a large part in any success.”

Additional information: