Output – IncReASe: Final Report – web pages

Title: IncReASe: Final Report

Page: 13

Date Released: 30 April 2009
URI for Output: http://eprints.whiterose.ac.uk/increase/increase_finalreportv1.pdf

Summary of contents:
“Analysis of individual researcher publication pages revealed a good deal of inconsistency of formatting, including within individual publication lists. The idea of “scraping” publication metadata from researcher pages is attractive, but the reality is quite challenging.”
“The Perl code written for one author could not be reused with another and would need tweaking every time.”

More details of the issues encountered are available at http://eprints.whiterose.ac.uk/increase/scraper.html

The project notes that the AIR project is investigating more sophisticated approaches to this problem, using machine-learning algorithms: http://clg.wlv.ac.uk/projects/AIR/
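The reuse problem the report describes can be illustrated with a small sketch. This is not the project's scraper (which was written in Perl); the two publication entries, the pattern, and the function name are hypothetical, chosen only to show how a pattern tuned to one researcher's list format fails on another's.

```python
import re

# Two hypothetical publication-list entries, formatted the way two different
# researchers might lay out the same paper (illustrative data only).
AUTHOR_A_ENTRY = "Smith, J. (2007) Open access in practice. Journal of Examples, 12(3), 45-67."
AUTHOR_B_ENTRY = "J. Smith, 'Open access in practice', Journal of Examples 12 (2007) 45-67"

# A pattern tuned to author A's layout: "Authors (Year) Title. Venue, ..."
PATTERN_A = re.compile(
    r"^(?P<authors>.+?) \((?P<year>\d{4})\) (?P<title>.+?)\. (?P<venue>[^,]+),"
)


def scrape(entry, pattern=PATTERN_A):
    """Return a metadata dict if the entry matches the pattern, else None."""
    match = pattern.match(entry)
    return match.groupdict() if match else None


record_a = scrape(AUTHOR_A_ENTRY)  # fields extracted as intended
record_b = scrape(AUTHOR_B_ENTRY)  # None: same paper, different layout
```

Handling author B would need a second pattern, and a third for the next researcher, which is the "tweaking every time" the report refers to.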

Output – IncReASe: Final Report – self-archiving

Title: IncReASe: Final Report

Page: 11, 21-22

Date Released: 30 April 2009

URI for Output: http://eprints.whiterose.ac.uk/increase/increase_finalreportv1.pdf

Summary of contents:
“Our observations suggest that conditions likely to improve self-deposit are:
(i) keeping things as simple as possible from the author’s perspective
(ii) always asking for the author’s final version of a work (… “Accepted Version” suggested by The VERSIONS project …)
(iii) facilitating capture of the work at the point of acceptance for publication. …
(iv) providing central support to monitor uploaded files and seek copyright clearance where required
(v) reminding authors to deposit: this could be a periodic reminder, or could be linked to a publication “event” such as a publication being indexed in a bibliographic database
(vi) highlighting the impact of deposit through the regular provision of usage data”

From the conclusions:
“There is probably no simple “optimum” deposit point for research outputs; however, in the short term, capturing papers at the point of acceptance for publication is probably the most realistic option. The emergence of desktop capture/deposit tools may facilitate earlier capture and assist with version control. Capturing the most appropriate version of a work continues to be an issue; all efforts should be made to inform researchers about the “accepted version” and its importance in the open access landscape. It is likely to be helpful to instil this awareness in early career researchers and PhD students by including open access / scholarly communication elements in training.”

These are the project’s suggestions for supporting the self-archiving process, based on its survey work and interviews. Self-archiving remains an ongoing challenge even with mandates. The list itself provides workflow advice and suggests what software tools are needed.

Project – HILT Phase IV

High-Level Thesaurus (Phase IV)

Short Project Name: HILT IV

Programme Name: Repositories and Preservation Programme

Strand: Shared Infrastructure Services

Brief project description:
Pilot terminology service to assist users of the IE with the discovery of the appropriate resources by subject browse and search.

“‘HILT phase IV: Transition to Service Testbed and Future Requirements Study’ aims to research, investigate and develop pilot solutions for problems pertaining to cross-searching multi-subject scheme information environments, as well as providing a variety of other terminological searching aids.

HILT phase IV will build on the work of phase III by moving HILT to a transition to service phase. This will allow an initial entry-level service to be built, tested for user requirements and retrieval effectiveness, refined in line with the findings, and extended to permit the use of a range of distributed terminology services for interoperability. It will also allow an examination of the level of need and interest amongst JISC services in respect of an operational service and, if appropriate, a scoping of the costs and requirements of a future operational phase of the service.

HILT phase IV will also conduct a parallel programme of research into selected topics germane to terminology services, as well research into the costs and requirements of an initial entry-level operational service and any future extension of this.”

JISC Project URI: http://www.jisc.ac.uk/whatwedo/programmes/reppres/sharedservices/hilt2

Project URI: http://hilt.cdlr.strath.ac.uk/index.html

Start Date: 1st June 2002

End Date: 29th January 2009

Governance: JIIE

Contact Name and Role: Dennis Nicholson (Project Manager)

Name of Trawler: John

Available Outputs:
The following demonstrator services are available:

  • HILT SOAP client
  • Demo of SRU
  • HILT2 Emulation
  • Vocabulary Browse/Search
  • Lucene Spell Checker
  • Wordnet
  • BUBL Search (example of embedding toolkit elements in a service)
  • OCLC client


Project – Fedorazon

Project Name: Fedorazon

Short Project Name: Fedorazon

Programme Name: Repositories and Preservation

Strand: Repositories Start-up and Enhancement projects

JISC Project URI: http://www.jisc.ac.uk/whatwedo/programmes/reppres/sue/fedorazon.aspx

Project URI: http://www.ukoln.ac.uk/repositories/digirep/index/Fedorazon

Start Date: 1 October 2007

End Date: 31 March 2008

Governance: JISC IE

Contact Name and Role: David Flanders (Project Manager)

Brief project description:

The aim of the Fedorazon project is to enhance the content of repositories throughout the UK’s HE and FE sector by providing solutions for the scalability of repositories as they grow in size and complexity. It looks to remove the “hardware” barriers involved in launching and maintaining a repository. It will accomplish this initially by enabling the use of Fedora Commons repository software on top of Amazon’s virtual servers (EC2 & S3). Because these servers are pre-configured, any HE/FE institution can “rent” Amazon server space and launch its own secure Fedora repository without having to configure a local server within the institution. In short, institutions can launch their repository service on the same day they decide to have one, and without hiring a “hardware” expert. Overall, the project will begin to assess the cost effectiveness of this kind of set-up and recommend best practice to other repository departments.

Name of Trawler: Mahendra Mahey

Project – CAIRO

Project Name: Complex Archive Ingest for Repository Objects

Short Project Name: CAIRO

Brief project description:
“This project will develop a tool for ingesting complex collections of born-digital materials, with basic descriptive, preservation and relationship metadata, into a preservation repository. The proposal is based on needs identified by the JISC-funded Paradigm project and the Wellcome Library’s Digital Curation in Action project and is a key building block in our strategy to develop digital repository architectures which can support the development of digital collections. This tool will be tested on personal digital collections already accessioned by the partner institutions and will provide an open-source tool for use by others with similar requirements. The project will produce technical and user support documentation and promote the tool with relevant audiences.”


Programme Name: Repositories and Preservation Programme

Strand: Tools and Innovation

JISC Project URI: http://www.jisc.ac.uk/whatwedo/programmes/reppres/cairo.aspx
Project URI: http://cairo.paradigm.ac.uk/about/index.html

Start Date: 2006-10-01

End Date: 2008-08-31

Governance: Integrated Information Environment Committee (JIIE)

Contact Name and Role: Susan Thomas, Project Manager

Name of Trawler: John

Project – MR CUTE 2

Project Name: Moodle Repository Create, Upload, Tag and Embed (MR-CUTE 2)

Programme Name: Repositories and Preservation Programme

Strand: Matching funding for digital repository development

JISC Project URI: http://www.jisc.ac.uk/whatwedo/programmes/programme_rep_pres/repositories_sue/mrcute.aspx

Project URI: http://www.learningobjectivity.com/mrcute/

Start Date: 2008

End Date: March 2009

Governance: Repositories and preservation advisory group

Contact Name and Role: Peter Kilcoyne, Project Manager

Brief project description: MrCute is an acronym for Moodle Repository Create Upload Tag Embed, and is intended as an optional Moodle module and block that allows direct and straightforward access to institutional and other repositories of online learning materials. MrCute 1 was released as a Moodle extension at the end of March 2008. Phase 2 will be released at the end of March 2009, but it is hoped that it will be available for testing some time before that.

MrCute is intended primarily for use within the United Kingdom, and some portions may only function fully in the UK, but it is released under an Open Source licence for use anywhere in the world.

Name of Trawler: Mahendra Mahey

Output – FAR – Adding Shibboleth to DSpace With CASAK

Output Name: Output – FAR – Adding Shibboleth to DSpace With CASAK

Title: Adding Shibboleth to DSpace With CASAK
Number of pages or page numbers: webpage

Date Released: unknown

URI for Output: https://dspace.far-project.lse.ac.uk/casak-shib.html (report)

URI for patch: https://saffron.caret.cam.ac.uk/svn/projects/far/trunk/

Summary of contents:

CASAK is a patch for DSpace that allows container authentication to be integrated into DSpace. It provides a number of mapping and filtering facilities to transform data provided by the container into a form suitable for DSpace. These facilities are generally inferior to those available within the Shibboleth Service Provider (SP), but for simple cases the configuration is perhaps simpler.

Configuring CASAK to use Shibboleth is therefore simpler than many use-cases of CASAK.

The approach used here contrasts with that used in MAMS in that CASAK is not Shibboleth specific, and has been designed to make a range of container auth solutions possible with CASAK. The purpose of this was to increase the long-term sustainability of the patch by extending its application to as broad a user-base as possible.
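The general idea of mapping and filtering container-supplied data can be sketched as follows. This is an illustration only and does not reproduce CASAK's actual configuration or code: the attribute names, the target field names, and the function are all assumptions made for the example.

```python
# Hypothetical mapping from container-supplied attribute names to
# repository-side user fields; this is not CASAK's real configuration.
ATTRIBUTE_MAP = {"mail": "email", "givenName": "firstname", "sn": "lastname"}


def map_container_attributes(attributes, attribute_map=ATTRIBUTE_MAP):
    """Rename known attributes, drop unknown or empty ones, lower-case the email."""
    user = {}
    for source_name, target_name in attribute_map.items():
        value = attributes.get(source_name, "").strip()
        if value:  # filter: skip attributes that are missing or blank
            user[target_name] = value
    if "email" in user:
        user["email"] = user["email"].lower()
    return user


# Example input, as a container might supply it after authentication
# (hypothetical values).
example = map_container_attributes(
    {
        "mail": " J.Smith@Example.ac.uk ",
        "givenName": "Jo",
        "sn": "Smith",
        "unscoped-affiliation": "staff",  # not in the map, so it is dropped
    }
)
```

Keeping the mapping in a plain table, with no reference to any one authentication system, reflects the stated design goal of supporting a range of container authentication solutions.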