Output – IRep for NTU – Final Report: Interoperability

Title: IRep for NTU – Final Report

Number of pages or page numbers: 15-17, 21

Section: Interoperability and integration with other institutional systems and external systems; Implications and Recommendations

Date Released: 18/07/2008

URI for Output: http://www.ntu.ac.uk/irep/63815.doc

Summary of contents:

Discussion of how the repository integrates with particular existing systems and functions.
‘Integration with eSearch (Metalib): eSearch is the [established NTU] library federated search system
… users [can] perform ‘All Fields’ and ‘Title’ searches directly from the eSearch interface and also allowed thumbnails to be returned for display within this interface where available. … one minor drawback with returning IRep records through eSearch such as if multiple object manifestations are attached to a single record, only a link to the primary object is returned.’

‘Integration with SFX: Our approach with the “Research and Scholarly” collection was to ingest into the repository the totality of the 9000 over records from the Research Publications Database regardless whether or not we had the full text items (or equivalent) to ingest as well.’ The decision was based on the lack of access to local full text in the short term and probable publishers’ restrictions on some pre-prints in the longer term. It was implemented ‘using the SFX OpenURL link server from Ex Libris already available at NTU to allow us providing context-sensitive linking to full-text article’ to present ‘ “the most appropriate” location of an object’. This is, however, DOI dependent and as a result its use is currently limited. [Note full text statistics: https://rrtsynthesis.wordpress.com/2008/09/30/output-ntu-irep-for-ntu-final-report-technical-implementation/ ]
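To make the DOI dependency concrete, the following is a minimal sketch of how a link to an OpenURL resolver might be built from a DOI. It is not taken from the report: the resolver address `SFX_BASE_URL` and the function `build_openurl` are hypothetical, and only the `url_ver` and `rft_id` keys from the OpenURL 1.0 (Z39.88-2004) key/value format are used.

```python
from urllib.parse import urlencode

# Hypothetical resolver address; the report does not give NTU's SFX base URL.
SFX_BASE_URL = "https://sfx.example.ac.uk/sfx_local"


def build_openurl(doi, base_url=SFX_BASE_URL):
    """Return an OpenURL 1.0 (KEV) link for a DOI, or None if no DOI is held."""
    if not doi:
        # Without a DOI this sketch has nothing to hand to the resolver,
        # which mirrors the limitation noted in the summary.
        return None
    params = {
        "url_ver": "Z39.88-2004",     # OpenURL 1.0 version tag
        "rft_id": f"info:doi/{doi}",  # the referent, identified by its DOI
    }
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_openurl("10.1000/182"))
    print(build_openurl(None))
```

A record with no DOI returns `None` here, so no context-sensitive link can be offered for it.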

‘Integration with NTU Researcher web profiles via X-Server: … IRep [may] completely replace the [Research Publications Database]. One major feature of the RPD is to publish a publications list for each researcher on their NTU profile web pages. This is supporting the RAE requirements.’ This functionality has been demonstrated and tested but is not being implemented before the current RAE review period ends.

The learning materials system was being replaced during the project so interoperability testing was not possible.

Interoperability with OAI-PMH

  • Google and sitemaps

‘However by the time we went live with the repository in early May 2008, Google had released a statement saying that they were retiring support for OAI-PMH in Sitemaps [13]. This was quite a drawback for the project as one of the main pieces of feedback we received from the academics when presenting them about IRep was the clear benefit offered by the repository of having their publications found by search engines.’
Ex Libris are developing a workaround (database cloned as static web pages [functional but !!]).

  • Registry services

Registered with OpenDOAR.
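For readers unfamiliar with OAI-PMH, a harvester retrieves records by sending HTTP requests with a `verb` parameter to the repository's base URL. The sketch below builds a `ListRecords` request for simple Dublin Core (`oai_dc`). The endpoint `OAI_BASE_URL` and the function name are assumptions for illustration; the report does not state the IRep OAI-PMH address.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; the report does not state the IRep OAI-PMH base URL.
OAI_BASE_URL = "https://irep.example.ac.uk/oai"


def list_records_url(base_url=OAI_BASE_URL, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords request URL.

    from_date is an optional UTC date (YYYY-MM-DD) for selective harvesting.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    print(list_records_url())
    print(list_records_url(from_date="2008-05-01"))
```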

Metadata
‘We found it was a great advantage to have librarian cataloguers ensuring compliance and monitoring the metadata quality as they were already familiar manipulating such data within other systems. However the cataloguers felt it was a step backward to use Dublin Core as the metadata schema due to its lack of richness in comparison to other schemas they were used to manipulate. However the project team purposely selected the Dublin Core schema in part due to its simplicity for the ease of manipulation by people having no expertise in this kind of data. … Yet the use of the Dublin Core schema brought another issue in terms of interoperability with SFX due to the lack of qualification and the ambiguity in the fields.’ (p21)
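One way to picture the ambiguity mentioned in the quote: in unqualified Dublin Core, several different kinds of identifier can share the same `dc:identifier` element, so a link resolver has to guess which value (if any) is a DOI. The sketch below is an illustration under that assumption, not the project's method; the sample values and the function `find_doi` are invented.

```python
import re

# A DOI starts with "10.", a numeric registrant code, a slash, then a suffix.
DOI_PATTERN = re.compile(r"(10\.\d{4,9}/\S+)")


def find_doi(identifiers):
    """Return the first DOI found among unqualified dc:identifier values, else None."""
    for value in identifiers:
        match = DOI_PATTERN.search(value)
        if match:
            return match.group(1)
    return None


# Invented sample record: three dc:identifier values with no qualifier to say
# which is a local URL, which is an ISSN and which is a DOI.
sample_identifiers = [
    "http://irep.example.ac.uk/R/12345",
    "0000-0000",
    "doi:10.1000/182",
]

if __name__ == "__main__":
    print(find_doi(sample_identifiers))
```

A richer or qualified schema would label each identifier explicitly, so this kind of pattern matching would not be needed.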

Comments:

No specific details about theses, data, or funding councils are mentioned (though they are within scope, so the general findings are of some relevance).

There is no mention of any other specific external OAI-PMH based services – harvesters??

Selection of metadata schema based on interoperability/ common ground rather than required (local) functionality?

Output – NTU: IRep for NTU – Final Report: Copyright and Intellectual Property management

Title: IRep for NTU – Final Report

Pages: 12-13

Section: Copyright and Intellectual Property management

Date Released: 18/07/2008

URI for Output: http://www.ntu.ac.uk/irep/63815.doc

Summary of contents:

NTU produced an evaluation of current IPR practice in the university and reviewed the possible use of Creative Commons licensing. It notes differences in practice between research and teaching materials within the institution. [The evaluation of the CC licences is interesting, and similar evaluations have been done extensively in other projects (e.g. Rightscom’s project for JISC Legal), but the implications for the university may be of wider use:]
‘Creative Commons licences – Consequences for the University if proposal adopted
• All University materials containing third party content that are selected for deposit in the IRep would need to be checked carefully to ensure that is permissible to licence that content using Creative Commons.
• University created materials that contain third party content licensed under the Creative Commons attribution, non-commercial, share alike (by-nc-sa) licence would also need to be licensed under the same licence.
• The two University copyright policies, the Copyright in Learning and Teaching Materials and the Intellectual Property policy, would both need amending to reflect this change in policy.
• As the licences are non-exclusive the University would be able to licence commercial use of the materials if desired if either of the two most restrictive licences were used.’

Comments:
I think the findings for the university point to the following issues:

  • Significant IPR clearing workload
  • Non-trivial need to change policies
  • Concerns over potential impact on commercialization

Output – NTU: IRep for NTU – Final Report: Implementation

Title: IRep for NTU – Final Report

Number of pages or page numbers: 6-11, 14-15, 19, 27

Section: Implementation, Outputs and Results, Appendix B – Statistics

Date Released: 18/07/2008

URI for Output: http://www.ntu.ac.uk/irep/63815.doc

Summary of contents:

The report outlines a system specification identifying the standards and features required for NTU’s needs (p6-7).
There is a careful delineation of the technical processes of system deployment (p9-11).

NTU outlines the variety of types of material to be managed and groups them functionally into collections (p7-8):

  • Research and Scholarly Collection
  • Learning and Teaching Collection
  • Corporate Collection

Each of these collections required different workflows and administrative processes, and had different owners (p14-15).

The Research and Scholarly Collection ‘is the most significant and contained at the launch over 9500 records ingested from the University’s Research Publications Database.’ Many of these are not full text, though. (p19)

There are fewer than 25 assets in each of the other collections (p27) [!]

There are 990 full text items – however 933 of these are for ‘Web files (.html; .php, et al) [Note – current policy is to link to remote files in such cases rather than store local copies of web files]’ (p27)
[!]

Comments:

NTU’s careful recording has provided a useful checklist of issues to consider when selecting a repository system and preparing to install it.

The careful policy development around the different collections highlights differing requirements of different content types.

Many problems with non-research materials will not really have surfaced, given the low numbers.

Full text, non-remote (i.e. ingested, not available elsewhere, stored by the IR) = 57 files
It is not clear who holds these links. If many of them point to publisher sites, the question of how the IR is better than the existing research database may not be answerable from the project as it stood at the point of reporting.
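The figure of 57 locally held files follows from the two numbers reported on p27. As a quick check, using only those reported figures (the variable names are illustrative):

```python
full_text_items = 990  # full text items reported (p27)
web_file_items = 933   # of which web files, linked to rather than stored locally

# Items actually ingested and stored by the repository.
locally_held = full_text_items - web_file_items

# Share of the full text items that are held locally, as a percentage.
share_locally_held = round(100 * locally_held / full_text_items, 1)

if __name__ == "__main__":
    print(locally_held)         # 57
    print(share_locally_held)
```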

Output – NTU: IRep for NTU – Final Report: – Institutional Context

Title: IRep for NTU – Final Report

Number of pages or page numbers: 4, 5

Section: Background, Methodology

Date Released: 18/07/2008

URI for Output: http://www.ntu.ac.uk/irep/63815.doc

Summary of contents:

P4. Project arose out of identified need to have a digital asset management system.
‘We identified that digital assets such as theses, published papers, pre- and post-prints, and other e-scholarly works:

• Can be difficult to find;
• Duplicated across the University;
• Stored in an ad-hoc manner;
• Not compliant with copyright requirements;
• Deficient in appropriate metadata and open standards/access requirements;
• Under-exposed or not exposed at all;
• Cannot be searched via a single search engine;
• Cannot be ingested into a centrally management system.’

P5 NTU ‘created six workstreams (Procurement, Content, Technical, Business Processes, Communication & Training) … While the procurement process remained with the senior member of the Libraries and Learning Resources (LLR) department and was completed fairly early in the life cycle of the project, the other workstreams were a collaboration between LLR and other departments … LLR has well established close links with the University’s central Information Technology (IT) department (Information Systems) which supported with the technical and implementation process of the project. University’s Educational Development Unit and the academic schools and departments provided input and support on the content aspect. The teams of specialists within LLR would manage the workflows associated with content ingest ongoing maintenance of the repository, including the application of appropriate metadata standards.’

Comment:

Project background provides example reasons why a repository can assist.

Processes were developed collaboratively to account for stakeholders’ skills and differing internal and external needs.

Output – INSPECT: Framework for the definition of significant properties – content types

Title: Framework for the definition of significant properties

Number of pages or page numbers: pages 18-43

Section: 3. Initial analysis of file types

Date Released: 05/02/2008

URI for Output: http://www.significantproperties.org.uk/documents/wp33-propertiesreport-v1.pdf

Summary of contents:

This substantive section of the report considers the application of the significant properties framework to four file types – Structured text (p18-23); Email (p24-35); Digital audio (p36-43); and Raster Images (p44-45).

The sections on structured text, email, and digital audio provide visual maps of properties to the significant properties framework (as detailed at https://rrtsynthesis.wordpress.com/2008/09/30/output-inspect-framework-for-the-definition-of-significant-properties-purpose/).

Significant properties are listed in detail for Email and Digital Audio

It is not clear if the section on raster images is complete.

Comments:

This is a valuable piece of work but appears incomplete; its use in any synthesis will be at the level of knowing it exists and has been carried out – any use of this requires detailed engagement with the base document.

Output – INSPECT: Framework for the definition of significant properties – purpose and definition

Title: Framework for the definition of significant properties

Number of pages or page numbers: pages 5-8

Section: 1.4 A framework for recording significant properties

Date Released: 05/02/2008

URI for Output: http://www.significantproperties.org.uk/documents/wp33-propertiesreport-v1.pdf

Summary of contents:

“The creation of a significant properties framework fulfils several purposes. It enables the institution to:
1) Analyse and catalogue the significant properties of a digital Record;
2) Review significant properties associated with an existing digital Record;
3) Assess the relative value of the property for the re-creation of the Record;
4) Quantitatively measure the value associated with the property;
5) Validate that the value associated with the property is correct.”

These pages then provide a definition of ‘a common set of information elements, … that may be used by an institution to identify the properties considered to be important and indicate the quality thresholds that must be met.’

Output – INSPECT: Significant Properties Report – overview

Title: Significant Properties Report

Number of pages or page numbers: 10 pages

Section:

Date Released: 10 April 2007

URI for Output: http://www.significantproperties.org.uk/documents/wp22_significant_properties.pdf

Summary of contents:

Report provides a brief background and overview of the key concepts in digital preservation (including migration and authenticity) and a review of existing work on ‘significant properties’.

‘Use the term ‘significant properties’ in preference to either ‘significant characteristics’ or ‘essence’, although we regard the terms as essentially interchangeable.’

‘For the purposes of this project we define ‘significant properties’ as: the characteristics of digital objects that must be preserved over time in order to ensure the continued accessibility, usability, and meaning of the objects.’

‘The significant properties of digital objects fall into 5 categories:
• content, eg. text, image, slides, etc.
• context, eg. who, when, why.
• [rendering], eg. font and size, colour, layout, etc.
• structure, eg. embedded files, pagination, headings, etc.
• behaviour, eg. hypertext links, updating calculations, active links, etc.’

The category ‘appearance’ in this report was changed to ‘rendering’ in subsequent work (http://www.significantproperties.org.uk/documents/wp33-propertiesreport-v1.pdf p15). That page in the later report also provides fuller definitions of these 5 terms.

Output – IncReASe: Questionnaire

Title: Questionnaire – summary

Number of pages or page numbers: web page

Section:

Date Released: Post March 2008

URI for Output: http://eprints.whiterose.ac.uk/increase/quest_summary.html

Summary of contents:

Of 330 respondents:

  • 70% hadn’t heard of the institutional repository (WRRO)
  • 65% were unaware of funders’ Open Access policies
  • 57% did not know if there was an Open Access repository for their discipline
  • 78% did not submit details of their publications to any systems outside the university

‘Top three services that might encourage people to use WRRO:

  1. Statistics about publications (62%)
  2. Links to papers from personal websites (57%)
  3. RSS feeds (43%)’

Output – IncReASe: Database Prevalence Report – Full text issues

Title: Database Prevalence Report

Page numbers: 1, 5, 6

Section: Key findings

Date Released: April 2008

URI for Output: http://eprints.whiterose.ac.uk/increase/milestone10_database_prevalence_report_v1.0.pdf

Summary of contents:

‘Most full text is distributed across the web pages of individual researchers; researchers may be using the local content management system to organise their files – but often the files are located on local drives.’ p1

‘Most full text collections could be described as grey literature.’ p1

Percentages of full text available are low: 12% for working papers in one institution, but mostly under 4% available. p5

Output – IncReASe: Database Prevalence Report – Motivations

Title: Database Prevalence Report

Number of pages or page numbers: 1, 6

Section: Key findings, 3.1.6 Disciplinary Differences

Date Released: April 2008

URI for Output: http://eprints.whiterose.ac.uk/increase/milestone10_database_prevalence_report_v1.0.pdf

Summary of contents:

‘On the one hand, researchers who archive on personal pages may be interested in depositing into WRRO; on the other hand, sometimes those who archive on their own websites do so without regard for copyright laws. They may feel the repository offers them less freedom to archive than their current site’ p1

‘Maths and Computing were the disciplines with the strongest self-archiving behaviours.’ p1, 6