Output – HILTIV: demonstrators

Title: HILT phase IV demonstrators

Pages: webpage
Date Released: June 2008
URI for Output: http://hilt.cdlr.strath.ac.uk/hilt4/demonstrators.html

Summary of contents:
HILTIV has produced a number of service demonstrators examining “cross-searching multi-subject scheme information environments”. They are demonstrators in the sense that the mappings being used cover only selected sections of the vocabulary schemes.

They have a:


They have also embedded the demonstrator service within BUBL (a catalogue of resources relating to Library and Information Sciences) http://bubl.ac.uk/hilt4.htm
They have also created some form of client for OCLC http://linuxserv.cdlr.strath.ac.uk/~anuj/cgi-bin/hilt4/oclc_client.cgi

Comments:

Output – IESR – Registry m2m Interfaces (OAI-PMH, SRU/W and Z39.50)

Title: IESR Registry Machine to Machine Interfaces (OAI-PMH, SRU/W and Z39.50)

Date Released: Unknown

URI for Output: http://iesr.ac.uk/use/oaipmh/, http://iesr.ac.uk/use/sru/, http://iesr.ac.uk/use/z3950/

Summary of contents:

These outputs provide m2m interfaces for cross, meta- or federated search applications.
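As a rough illustration of what a machine-to-machine request looks like, the sketch below builds an SRU searchRetrieve URL of the kind a cross-search application might send. The parameter names (operation, version, query, maximumRecords) come from the SRU 1.1 standard; the endpoint address is a placeholder, since the actual IESR service URL is not recorded here.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the real IESR SRU service address is not given in this post.
SRU_BASE_URL = "http://example.org/iesr/sru"


def build_sru_search_url(cql_query, max_records=10, base_url=SRU_BASE_URL):
    """Return an SRU 1.1 searchRetrieve URL for the given CQL query."""
    params = {
        "operation": "searchRetrieve",  # standard SRU operation name
        "version": "1.1",
        "query": cql_query,             # a CQL query; a single term is valid CQL
        "maximumRecords": str(max_records),
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_sru_search_url("chemistry", max_records=5))
```

A portal would fetch this URL and parse the XML response; that step is omitted because the record schema is not described in this post.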

Additional information:

Comments:

Output – IESR – HTML Plug-in

Title: IESR Registry HTML Plug-in

Date Released: Unknown

URI for Output: http://iesr.ac.uk/use/htmlplugin/

Summary of contents:

“The IESR Search HTML Plug-in allows you to add an IESR search to your website in order to discover new electronic resources. The plug-in is a simple HTML search box that shows results on the IESR website.”

Additional information:

You need a basic knowledge of HTML and CSS plus the ability to edit webpages. Simply add the following HTML, CSS and Javascript to your webpage to create the search box.

Comments:

Project – HILT Phase IV

High-Level Thesaurus (Phase IV)

Short Project Name: HILT IV

Programme Name: Repositories and Preservation Programme

Strand: Shared Infrastructure Services

Brief project description:
Pilot terminology service to assist users of the IE with the discovery of the appropriate resources by subject browse and search.

“‘HILT phase IV: Transition to Service Testbed and Future Requirements Study’ aims to research, investigate and develop pilot solutions for problems pertaining to cross-searching multi-subject scheme information environments, as well as providing a variety of other terminological searching aids.

HILT phase IV will build on the work of phase III by moving HILT to a transition to service phase. This will allow an initial entry-level service to be built, tested for user requirements and retrieval effectiveness, refined in line with the findings, and extended to permit the use of a range of distributed terminology services for interoperability. It will also allow an examination of the level of need and interest amongst JISC services in respect of an operational service and, if appropriate, a scoping of the costs and requirements of a future operational phase of the service.

HILT phase IV will also conduct a parallel programme of research into selected topics germane to terminology services, as well research into the costs and requirements of an initial entry-level operational service and any future extension of this. ”

JISC Project URI: http://www.jisc.ac.uk/whatwedo/programmes/reppres/sharedservices/hilt2

Project URI: http://hilt.cdlr.strath.ac.uk/index.html

Start Date: 1st June 2002

End Date: 29th January 2009

Governance: JIIE

Contact Name and Role: Dennis Nicholson (Project Manager)

Name of Trawler: John

Available Outputs:
The following demonstrator services are available:

  • HILT SOAP client
  • Demo of SRU
  • HILT2 Emulation
  • Vocabulary Browse/Search
  • Lucene Spell Checker
  • Wordnet
  • BUBL Search (example of embedding toolkit elements in a service)
  • OCLC client

Comments:

Project – NAMES

Project Name: Names: Pilot national name and factual authority service

Programme Name: Repositories and Preservation Programme

Strand: Information Environment

JISC Project URI: http://www.jisc.ac.uk/whatwedo/programmes/reppres/sharedservices/names.aspx

Project URI: http://names.mimas.ac.uk/

Start Date: 1st May 2007

End Date: 30th September 2008

Governance: RPAG

Contact Name and Role: Amanda Hill, Project Manager

Brief project description:

“The project is scoping the requirements of UK institutional and subject repositories for a service that will reliably and uniquely identify individuals and institutions.

A prototype service is under development to test the various processes involved. This includes determining the most appropriate data format, setting up a test database, mapping data from different sources, populating the database with records and testing the use of the data.

This will provide important information about the future usefulness of a name authority service for institutional and subject-based repositories, and other applications beyond the repository sector.”

Outputs:

Comments:

The prototype service is available as of 13th Jan 2009, but some development work is still to be done (according to the last paragraph of the Names blog).

Project – IESR

Project Name: Information Environment Service Registry (IESR)

Programme Name: Information Environment

Strand: Shared Infrastructure Services programme

JISC Project URI: http://www.jisc.ac.uk/whatwedo/services/mimas/iesr.aspx

Project URI: http://www.iesr.ac.uk/

Start Date: 1st Jan 2003

End Date: 31st March 2009

Governance: JISC Integrated Information Environment committee?

Contact Name and Role: Vic Lyte, Project Manager

Brief project description:

“The IESR has been developed to provide a registry of information about electronic resources that are of value to teachers, researchers and learners. The IESR project is part of JISC’s Shared Services Programme.

The aim is to create a reliable source of information that other applications, such as portals, can freely access through machine-to-machine protocols, in order to help their end users discover resources of assistance to them.

The IESR contains information about the resources themselves, technical details about how to access the resources, and contact details for the resource providers. For resource providers the IESR will hold a master description of their electronic resources, to which other potential users of the resources may be directed.

The registry is held in an XML repository using Cheshire information retrieval software.”

Outputs:

Project – CLAReT

Project Name: CLAReT (Contextualised Learning Repository Tools)

Programme Name: Repositories and Preservation Programme

Strand: Information Environment

JISC Project URI: http://www.jisc.ac.uk/whatwedo/programmes/reppres/tools/claret.aspx
Project URI: http://www.claret.ecs.soton.ac.uk/index.htm

Start Date: 1 Oct 2006
End Date: 31 Oct 2007

Governance: JIIE

Contact Name and Role: Dr. Yvonne Howard, Project Manager

Brief project description:
“CLARET will develop a prototype web service that can enable visual exploration and the use of social bookmarking of contextual metadata in a repository context.
The CLARET project will also explore the relationship between Learning Object Contextual metadata and the teaching and learning context as defined by working with practitioners. ”

Name of Trawler: John

Outputs:

Comments:
Please note that this project led to the subsequent FAROES project.

Output – RIOJA – Costs and sustainability: overlay journal costs

Title: Repository Interface for Overlaid Journal Archives: costs estimates and sustainability issues

Pages: 12-16
Summary of contents:

The report sets out the costs for each of the four identified core functions of a journal:
https://rrtsynthesis.wordpress.com/2008/11/05/output-rioja-cost-estimates-and-sustainability-overlay-model/

Registration

First copy costs – editorial board, assigning reviewers etc., support and admin –
“Consultants in SQW Limited (2004) reported that first copy costs for a good to high quality journal are estimated around – average price – $1500 ($1650 including first copy and fixed costs).”
However, “Harnad (2000) …indicates that conducting the peer review electronically and for papers residing in an open access archive could cost about 1/3 less of the actual page cost.”
arXiv uses a low cost system of endorsement ($1-5 per item) in which previous submitters vouch for the relevancy of new work.

Certification
p13-p14 Note survey finding that there is no consensus on the issue of open or closed peer review.
p14 “King & Tenopir (2000) list the following activities in article processing: manuscript receipt processing, initial disposition decision making, identifying reviewers or referees, review processing, subject editing, special graphic and other preparation, formatting, copy editing, processing author approval, indexing, coding. There are also significant indirect costs – costs not directly associated with a particular process, such as administrative and managerial costs. Rowland reports costs for peer review in the range of $200-$400 per paper, including administrative support, for a journal with rejection rate of 50%.”

Awareness
p14-15 This covers current awareness and related dissemination tools and activities. The survey has noted the importance of such functions to the community using arXiv. RIOJA comments p15 “The awareness functions provided by arXiv and other repositories could clearly reduce central overheads for a repository-overlaid journal.” (see comment)

Archiving
This section briefly discusses how an overlay journal would need to ensure the preservation of accepted content. The section has little specific discussion of the practicalities and problems of preservation outside of the context of arXiv but does make a very useful suggestion in that on demand printing services are available ‘for printing paper versions of the journal’s issues at a cost of less than $250, including shipping and handling.’ (p16)

Comments:
Awareness – while it’s true that a repository’s alerting services could reduce the need for an overlay journal to have such services, there is a tension here that the project doesn’t note (afaik): not offering these services could significantly reduce the overlay journal’s visibility and identity, with a direct effect on the ‘impact’ of the journal. However, in the context of a journal based on items from a single repository (such as arXiv) the point is well made that this service is carried out anyway. [not clear if project thinks this though]

Archiving – though this costing largely sidesteps the issue of digital preservation, the suggestion that (at least in the short term) copies of record could be printed on demand appears to be a significantly less expensive option than the current printing process.

Throughout the costing there is a heavy reliance on the ‘unique’ / ‘mature’ context of arXiv. It is not yet clear which of these characteristics of arXiv has had the stronger effect.

This examination of costing contributes to identifying relevant shared infrastructure services and assessing their feasibility.

Date Released: July 2008

URI for Output: http://eprints.ucl.ac.uk/12562/1/12562.pdf

Output – RIOJA – Journal Repository APIs

Title: RIOJA Journal-Repository APIs

Page: web page with related files

Summary of contents:
The page contains an introduction to and details about the XML APIs developed by RIOJA.
The APIs cover:
“Repository APIs

  • getRepositoryInfo: Get information about the repository
  • validateAuthor: Author validation
  • getMetadata: Metadata exchange
  • setStatus: Journal status change notification
  • getStatus: Get journal status from repository
  • getTrackbacks: Get repository trackbacks
  • getStatistics: Get repository statistics
  • getCurrentPaperInfo: Get current state of paper

Journal APIs

  • getJournalInfo: Get information about the journal
  • getStatus: Get submission status within the journal
  • submit: Submit to journal from repository”

The final section is of note:

  • Author managed papers are assumed by these APIs.
  • ‘Why not just use OAI?
  1. OAI is designed for metadata harvesting not as an immediate-response API to be used interactively. For RIOJA journals need to be able to get metadata dynamically during the submission process; OAI allows “wait” responses which require re-querying at a later time.
  2. RIOJA allows for extraction of information before a paper is public on a repository (e.g. for integrated submission) before it would be available by OAI
  3. RIOJA needs to track paper versions carefully. OAI does not include detailed paper version information.
    There should be a well-defined format for titles and abstracts with text formatting and equations so that they can display the same way on the journal and repository without further editing; OAI is very flexible but very vague.
  4. RIOJA requires several other functions apart from extracting Metadata ‘

This rationale is also laid out in the final report, page 9: http://eprints.ucl.ac.uk/12562/1/12562.pdf
This section also details similarities between some of RIOJA’s desired functionality and that subsequently developed in SWORD.
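The “wait” behaviour mentioned in reason 1 corresponds, in OAI-PMH, to an HTTP 503 response carrying a Retry-After header. A minimal sketch of how a harvesting client might decide when to re-query is given below; the function name and the 60 second fallback are illustrative assumptions, not anything taken from RIOJA or from a particular repository.

```python
def seconds_to_wait(status_code, headers, default_delay=60):
    """Return how many seconds to wait before re-querying, or None if no wait is needed.

    `headers` is assumed to be a plain dict using the exact key "Retry-After".
    """
    if status_code != 503:
        # Any other status is not a "wait" response.
        return None
    retry_after = headers.get("Retry-After", "").strip()
    # Retry-After may also be an HTTP date; this sketch only parses the seconds form
    # and falls back to default_delay (an arbitrary choice) otherwise.
    if retry_after.isdigit():
        return int(retry_after)
    return default_delay


if __name__ == "__main__":
    print(seconds_to_wait(503, {"Retry-After": "120"}))
```

An interactive submission form cannot easily sit through such a delay, which is the substance of RIOJA’s first objection.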

Comments:

As commented on the software, the APIs currently require custom versions of the repository and journal software. If this functionality doesn’t make it into a core release, using and supporting these APIs is unlikely to be sustainable.

The case to replace an accepted API has to be very strong.

Looking at the reasons OAI-PMH is deemed unsuitable:

  1. no. 1 – I can see the case for this given the model of overlay journal that the project has chosen (simultaneous deposit to repository and submission for overlay publication). Although I’m not convinced this is the only possible workflow, this reason in itself is probably sufficient to examine alternative APIs.
  2. no. 2 – this seems to be more to do with repository configuration than OAI-PMH as such. There is no reason that a repository can’t be configured for secure OAI-PMH metadata harvesting. Again a lot is predicated on a particular workflow.
  3. no. 3 – this is just plain wrong. Versioning information is a question that relates to bibliographic metadata; OAI-PMH is a harvesting protocol. You can put whatever information you want into OAI-PMH. The observation is true for the simple Dublin Core metadata (oai_dc) that an OAI-PMH compliant repository must provide, but this does not prevent you adding any other sets of information. It’s like saying you need an alternative to roads because your car needs LPG not petrol. It is certainly true that the information you get from most existing repositories (via OAI-PMH) does not support the needs of an overlay journal but this, in itself, is not a reason to replace OAI-PMH.
  4. no. 4 – impossible to comment.
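To make the third point concrete: an OAI-PMH request names the metadata format it wants through the metadataPrefix argument, so a repository can expose a richer format alongside the mandatory oai_dc. The sketch below builds two GetRecord requests for the same item. The verb and argument names are standard OAI-PMH; the endpoint, the record identifier and the “versioned_md” prefix are made up for illustration.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and identifier, used only to show the request shape.
OAI_BASE_URL = "http://example.org/oai"
EXAMPLE_IDENTIFIER = "oai:example.org:1234"


def build_get_record_url(identifier, metadata_prefix="oai_dc", base_url=OAI_BASE_URL):
    """Return an OAI-PMH GetRecord URL asking for one record in one metadata format."""
    params = {
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Mandatory simple Dublin Core.
    print(build_get_record_url(EXAMPLE_IDENTIFIER))
    # A hypothetical richer format that could carry version information.
    print(build_get_record_url(EXAMPLE_IDENTIFIER, metadata_prefix="versioned_md"))
```

Whether a given repository actually offers a format with version details is a configuration question, which is the distinction drawn in point 3.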

I’m not suggesting that there isn’t a need to use something other than OAI-PMH for an overlay journal, but, aside from the first point (which could be significant enough in itself, but is predicated on a particular workflow), the summary arguments for implementing an entirely new API are weak. In the Final Report possible overlap with SWORD (which developed independently while the project was running) is noted. It doesn’t examine SRW/SRU interfaces.

I’d suggest this is a good example of an infrastructure encountering the limits of accepted protocols, but also of a development setting itself up for long term problems by effectively requiring 3rd party support for custom APIs [though, please note: this is not a comment on the effectiveness of these APIs for their purpose as a pilot/ proof of concept service].

Date Released: 2008

URI for Output: http://cosmologist.info/xml/APIs.html

Output – RIOJA – Overlay Journal software products

Title: RIOJA Project Software Products

Page: web page with related files

Summary of contents: the page points to downloads for ‘modified software packages to start journals based on existing repositories, and to start new API-enabled ePrints-based repositories.’

The software is a modified version of OJS journal system (2.1.1). Modifications include:
‘ * Quick validated submission of papers using existing repository IDs
* Keyword-based matching system to speed assignment of referees and editors
* Continuous publication, using links to accepted versions of the paper hosted on the repository.
* Removal of parts not needed in an overlay journal (print publication, etc) ‘

The software retains OJS’ OAI functionality. As it stands the software can be used with the arXiv repository or any modified ePrints installation (a modified version of ePrints 3 is also available for download).

Sample journals based on the software are linked from http://arxivjournal.org/ . Of the journals on this page, the first two are linked explicitly to the project: cosmology continuous and cosmology issued. Cosmology continuous has a current issue (from July 2008); none of the other pages associated with these journals have any content. The other four journals seem to be experimental only, either having no content or being labelled for developers and requiring a login.

There are a number of outstanding bugs noted.

Comments:
A model of software development that relies upon releasing modified versions of software both for data provider and overlay journal is probably going to struggle to be sustained and adopted.
From what I can tell RIOJA has demonstrated the concept of an overlay journal in this field well but unless their work makes it into core releases of the software they’re adopting (an issue not discussed here) their work will remain as proof of concept.

The issues encountered in this software development are relevant to any infrastructure project developing or heavily modifying software.

Date Released: 2008

URI for Output: http://arxivjournal.org/rioja/