Project – CLAReT

Project Name: CLAReT (Contextualised Learning Repository Tools)

Programme Name: Repositories and Preservation Programme

Strand: Information Environment

JISC Project URI: http://www.jisc.ac.uk/whatwedo/programmes/reppres/tools/claret.aspx
Project URI: http://www.claret.ecs.soton.ac.uk/index.htm

Start Date: 1 Oct 2006
End Date: 31 Oct 2007

Governance: JIIE

Contact Name and Role: Dr. Yvonne Howard, Project Manager

Brief project description:
“CLARET will develop a prototype web service that can enable visual exploration and the use of social bookmarking of contextual metadata in a repository context.
The CLARET project will also explore the relationship between Learning Object Contextual metadata and the teaching and learning context as defined by working with practitioners.”

Name of Trawler: John

Outputs:

Comments
Faroes
Please note this project led to the subsequent FAROES project

Project – MR CUTE 2

Project Name: Moodle Repository Create, Upload, Tag and Embed (MR-CUTE 2)

Programme Name: Repositories and Preservation Programme

Strand: Matching funding for digital repository development

JISC Project URI: http://www.jisc.ac.uk/whatwedo/programmes/programme_rep_pres/repositories_sue/mrcute.aspx

Project URI: http://www.learningobjectivity.com/mrcute/

Start Date: 2008

End Date: March 2009

Governance: Repositories and preservation advisory group

Contact Name and Role: Peter Kilcoyne, Project Manager

Brief project description: MrCute is an acronym for Moodle Repository Create Upload Tag Embed, and is intended as an optional Moodle module and block which allow direct and straightforward access to institutional and other repositories of online learning materials. MrCute 1 was released as a Moodle extension at the end of March 2008. Phase 2 will be released at the end of March 2009, but it is hoped that it will be available for testing some time before that.

MrCute is intended primarily for use within the United Kingdom, and some portions may only function fully in the UK, but it is released under an Open Source licence for use anywhere in the world.

Name of Trawler: Mahendra Mahey

Output – MetaTools – Final Report – Summary

Output Name: Output – MetaTools – Final Report

Title: MetaTools – Final Report
Number of pages or page numbers: 98 pages
Section:

Date Released: 24 November 2008

URI for Output: http://ie-repository.jisc.ac.uk/258/

Summary of contents:

Automatic metadata generation has sometimes been posited as a solution to the ‘metadata bottleneck’ that repositories and portals are facing as they struggle to provide resource discovery metadata for a rapidly growing number of new digital resources. Unfortunately there is no registry or trusted body of documentation that rates the quality of metadata generation tools or identifies the most effective tool(s) for any given task. The aim of the first stage of the project was to remedy this situation by developing a framework for evaluating tools used for the purpose of generating Dublin Core metadata. A range of intrinsic and extrinsic metrics (standard tests or measurements) that capture the attributes of good metadata from various perspectives were identified from the research literature and evaluated in a report.

A test program was then implemented using metrics from the framework. It evaluated the quality of metadata generated from 1) Web pages (HTML) and 2) scholarly works (PDF) by four of the more widely-known metadata generation tools – Data Fountains, DC-dot, SamgI, and the Yahoo! Term Extractor. The intention was also to test PaperBase, a prototype for generating metadata for scholarly works, but its developers ultimately preferred to conduct tests in-house. Some interesting comparisons with their results were nonetheless possible and were included in the stage 2 report. It was found that the output from Data Fountains was generally superior to that of the other tools that the project tested. However, the output from all of the tools was considered to be disappointing and markedly inferior to the quality of metadata that Tonkin and Muller report that PaperBase has extracted from scholarly works. Overall, the prospects for generating high-quality metadata for scholarly works appear to be brighter because of their more predictable layout. It is suggested that JISC should particularly encourage research into auto-generation methods that exploit the structural and syntactic features of scholarly works in PDF format, as exemplified by PaperBase, and strongly consider funding the development of tools in this direction.

In the third stage of the project SOAP and RESTful Web Service interfaces were developed for three metadata generation tools – Data Fountains, SamgI and Kea. This had a dual purpose. Firstly, the creation of an optimal metadata record usually requires the merging of output from several tools, each of which, until now, had to be invoked separately because of the ad hoc nature of their interfaces. As Web services, they will be available for use in a network such as the Web with well-defined interfaces that are implementation-independent. These services will be exposed for use by clients without them having to be concerned with how the service will execute their requests. Repositories should be able to plug them into their own cataloguing environments and experiment with automatic metadata generation under more ‘real-life’ circumstances than hitherto. Secondly, and more importantly (in view of the relatively poor quality of current tools), they enabled the project to experiment with the use of a high-level ontology for describing metadata generation tools. The value of an ontology being used in this way should be felt as higher quality tools (such as PaperBase?) emerge.

The high-level ontology is part of a MetaTools system architecture that consists of various components to describe, register and discover services. Low-level definitions within a service ontology are mapped to higher-level human-understandable semantic descriptions contained within a MetaTools ontology. A user interface enables service providers to register their service in a public registry. This registry is used by consumers to find services that match certain criteria. If the registry has such a service, it provides the consumer with a contract and an endpoint address for that service. The terms in the MetaTools ontology can, in turn, be part of a higher-level ontology that describes the preservation domain as a whole. The team believes that an ontology-aided approach to service discovery, as employed by the MetaTools project, is a practical solution. A stage 3 technical report was also written.
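
The register-and-discover flow described above can be sketched roughly as follows. This is an illustrative sketch only, not the MetaTools implementation: the class names, the matching criteria (input format and generated elements), the tool names and the example.org addresses are all assumptions, and the ontology mapping itself is not modelled.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ServiceDescription:
    """A registered metadata generation service (all fields are hypothetical)."""
    name: str
    endpoint: str               # address a consumer would call
    contract: str               # pointer to an interface description
    input_formats: frozenset    # formats the tool accepts, e.g. {"html", "pdf"}
    output_elements: frozenset  # metadata elements the tool can generate


@dataclass
class Registry:
    """In-memory stand-in for a public service registry."""
    services: list = field(default_factory=list)

    def register(self, service):
        # A provider adds a description of its service.
        self.services.append(service)

    def find(self, input_format, required_elements):
        # A consumer asks for services that accept the format
        # and generate every required element.
        wanted = frozenset(required_elements)
        return [
            s for s in self.services
            if input_format in s.input_formats and wanted <= s.output_elements
        ]


registry = Registry()
registry.register(ServiceDescription(
    name="tool-a",
    endpoint="http://example.org/tool-a",
    contract="http://example.org/tool-a/contract",
    input_formats=frozenset({"html", "pdf"}),
    output_elements=frozenset({"title", "subject", "description"}),
))
registry.register(ServiceDescription(
    name="tool-b",
    endpoint="http://example.org/tool-b",
    contract="http://example.org/tool-b/contract",
    input_formats=frozenset({"html"}),
    output_elements=frozenset({"title"}),
))

# The consumer receives a contract and an endpoint for each matching service.
for match in registry.find("pdf", ["title", "subject"]):
    print(match.name, match.contract, match.endpoint)
```

In this sketch a query that no registered service satisfies simply returns an empty list.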

Output – FAR – Adding Shibboleth to DSpace With CASAK

Output Name: Output – FAR – Adding Shibboleth to DSpace With CASAK

Title: Adding Shibboleth to DSpace With CASAK
Number of pages or page numbers: webpage

Date Released: unknown

URI for Output: https://dspace.far-project.lse.ac.uk/casak-shib.html (report)

URI for patch: https://saffron.caret.cam.ac.uk/svn/projects/far/trunk/

Summary of contents:

CASAK is a patch for DSpace which allows container auth to be integrated into DSpace. It provides a number of mapping and filtering facilities to transform data provided by the container into a form suitable for DSpace. These facilities are generally inferior to those available within the SP, but for simple cases the configuration is perhaps simpler.

Configuring CASAK to use Shibboleth is therefore simpler than many use-cases of CASAK.

The approach used here contrasts with that used in MAMS in that CASAK is not Shibboleth specific, and has been designed to make a range of container auth solutions possible with CASAK. The purpose of this was to increase the long-term sustainability of the patch by extending its application to as broad a user-base as possible.

Output – FAR – LDAP server containing sample users with these attributes

Output Name: Output – FAR – LDAP server containing sample users with these attributes

Title: LDAP server containing sample users with these attributes
Number of pages or page numbers: webpage

Date Released: 29 June 2008

URI for Output: https://gabriel.lse.ac.uk/twiki/bin/view/Projects/FAR/FARProjectDemonstratorArchitecture

Summary of contents:

Technical description of the FAR Demonstrator Architecture.

Output – FAR – FAR attribute requirements report

Output Name: Output – FAR – FAR attribute requirements report

Title: FAR attribute requirements report
Number of pages or page numbers: webpage

Date Released: 19 August 2008

URI for Output: https://gabriel.lse.ac.uk/twiki/bin/view/Projects/FAR/AttributeUseReport

Summary of contents:

  • Requirements for specific attributes to be used for Federated Access Management (FAM)-mediated access to repositories are not necessary because repository products use groups for access control already
  • It is recommended that group membership information is structured in attribute stores on a per-user basis (with a user object containing a list of groups of which the user is a member) rather than solely on a per-group basis (with a group object containing a list of members)
  • The use of groups for authorisation means that it should be possible for FAM to continue to be applicable with the introduction of new features to repository software
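
The difference between the two layouts can be illustrated with a small sketch. The group names, user names and helper functions below are hypothetical and are not taken from the FAR report. The point shown is that a per-user record yields a user's groups directly, whereas a per-group layout has to be scanned or inverted first.

```python
# Per-group layout: each group object lists its members (hypothetical data).
groups = {
    "library-staff": ["alice", "bob"],
    "thesis-examiners": ["bob"],
}


def invert_to_per_user(per_group):
    """Build the per-user layout: each user object lists the groups it belongs to."""
    per_user = {}
    for group, members in per_group.items():
        for user in members:
            per_user.setdefault(user, []).append(group)
    return per_user


def may_access(user_groups, allowed_groups):
    """Group-based authorisation: allow if the user is in any permitted group."""
    return not set(user_groups).isdisjoint(allowed_groups)


users = invert_to_per_user(groups)

# With the per-user layout, a user's memberships are a single lookup.
print(users["bob"])
print(may_access(users["alice"], {"thesis-examiners"}))
```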

Output – PRIMO – Working Multimedia Repository

Output Name: Output – PRIMO – Working Multimedia Repository

Title: Working Multimedia Repository

URI for Output: http://primo.sas.ac.uk/eprints/

Summary of contents: Working multimedia repository

Output – ART – An ontology methodology and CISP the proposed Core Information about Scientific Papers

Output Name: Output – ART – An ontology methodology and CISP (Core Information about Scientific Papers)

Title: An ontology methodology and CISP the proposed Core Information about Scientific Papers
Number of pages or page numbers: 26 pages

Date Released: December 2007

URI for Output: http://www.aber.ac.uk/compsci/Research/bio/art/publications/ReportCISPshort.pdf

Summary of contents:

This report contains details about CISP and the results from the online survey, and sets out the benefits of adopting an ontology methodology when producing metadata.

This report has two main goals:

  • To introduce a new formalism for the description of scientific papers, CISP (the Core Information about Scientific Papers);
  • To attract more attention to ontologies as a valuable methodology for developing metadata.

The report demonstrates the advantages of an ontology methodology for developing metadata by applying it to the analysis of the Dublin Core metadata (DC). An ontology approach makes it possible to detect potential weaknesses in the representation of the DC terms. Such weaknesses include overlap in the semantic meaning between the terms, logically incoherent representation of temporal and spatial relations, and incoherence in the representation of content. An ontology can also suggest improvements to the DC.
The report describes an ontology methodology to construct CISP metadata about the content of papers. It makes use of EXPO, an ontology of experiments proposed at the University of Wales, Aberystwyth, as a core ontology, and DOLCE (a Descriptive Ontology for Linguistic and Cognitive Engineering), developed at the Laboratory for Applied Ontology, the Institute of Cognitive Science and Technology, Italy, as an upper-level ontology.
CISP is a defined set of leaf classes from these ontologies. It includes such key classes as <Goal of investigation>, <Object of investigation>, <Research method>, <Result>, <Conclusion>.

CISP can be used to generate abstracts and summaries of papers and also to facilitate storage and retrieval of information. CISP will constitute the basis for the ART tool. The latter is an authoring tool for the semantic annotation of papers stored in digital repositories. ART is intended for the semi-automatic annotation of data and metadata describing the scientific investigation represented in a research paper. ART will also be able to aid in the expression of research results directly in both a human and machine readable format, through the composition of text using ontology-based templates and stored typical key phrases.
To find out more about ontology methodology refer to chapters 2 and 3.
To learn about the proposed CISP metadata you can start reading from chapter 4 onwards.
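
As a rough illustration of how the key CISP classes named above might be held as a record and turned into a simple summary, consider the sketch below. It is not the CISP formalism or the ART tool: the field names merely mirror the five key classes listed in the summary, and the record structure, the function and the sample values are assumptions made for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class CispRecord:
    """Hypothetical flat record holding one value per key CISP class."""
    goal_of_investigation: str
    object_of_investigation: str
    research_method: str
    result: str
    conclusion: str


def summarise(record):
    """Render the record as a labelled, line-per-class summary."""
    lines = []
    for f in fields(record):
        label = f.name.replace("_", " ").capitalize()
        lines.append(f"{label}: {getattr(record, f.name)}")
    return "\n".join(lines)


# Placeholder values only; they do not describe any real paper.
record = CispRecord(
    goal_of_investigation="placeholder goal",
    object_of_investigation="placeholder object",
    research_method="placeholder method",
    result="placeholder result",
    conclusion="placeholder conclusion",
)
print(summarise(record))
```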

Output – ART – Semantic Annotation of Papers: Interface & ENrichment Tool (SAPIENT)

Output Name: Output – ART – Semantic Annotation of Papers: Interface & ENrichment Tool (SAPIENT)

Title: Semantic Annotation of Papers: Interface & ENrichment Tool (SAPIENT)
Date Released: 13 October 2008

URI for Output: http://www.aber.ac.uk/compsci/Research/bio/art/sapient/

Summary of contents:

The first release of SAPIENT, the ART tool for the annotation of general scientific papers, has been circulated to annotators.

Output – LIFE2 – Economic evaluation of LIFE methodology

Output Name: Output – LIFE2 – Economic evaluation of LIFE methodology

Title: Economic evaluation of LIFE methodology
Number of pages or page numbers: 26 pages
Section:

Date Released:

URI for Output: http://eprints.ucl.ac.uk/7684

Summary of contents: Validation of the economic modelling and methodology for the Lifecycle and Generic Preservation formulae developed in Phase 1 of the LIFE project, with technical and presentational development of the models. Cloudlake Consulting Oy carried out this evaluation. The major conclusions are on page 16:

All in all there seem to be two major application areas for the LIFE models:

  • Institutional repositories, which span a range of object types that are likely to populate the IR of a particular university.
  • Specialised collections of national libraries and similar organisations, which have a national and sometimes legal obligation to long-term archiving.
In the latter case it seems more sensible to apply the model to individual collections than to the totality of objects stored in say a national library. In such a case it is important for a national library, which works within budget restrictions, to be able to compare the long term preservation costs of different collections, in order to make informed priority decisions. This is in contrast with the Institutional Repository Case.

An important point which could have far-reaching consequences for the parameters of the model is how institutional repositories (which are numerous) are going to solve the preservation management issue. In contrast to national institutions such as the British Library, universities would gain very obvious benefits from sharing resources for preservation, for instance via consortia, outsourcing, using external service providers etc. A good case is for instance the technology watch function included in the model. One can argue if there is a need for every university to duplicate this effort. A more sensible approach would be for certain service providers to assume the responsibility for issuing guidelines.

Additional information:

Comments: