Output – VIF: Embedding Versioning

Title: VIF: Embedding Versioning Information in an Object

Pages: webpage
Date Released: Jan2008

URI for Output: http://www.lse.ac.uk/library/vif/Framework/Object/index.html

Summary of contents:
Because it is entirely possible for users to access objects within a repository without ever seeing any metadata, VIF recommends that some versioning information is embedded into the object and suggests the following:

“It is strongly recommended that at least one of the following solutions to embed versioning information into object is advocated and used systematically within a repository:

1. ID Tags and Properties Fields
2. Cover Sheet
3. Filename
4. Watermark”

The framework provides some further details about each of these. In essence – use something and use it consistently.

Comments:
Supporting interoperability, especially interoperability over time, relies on being able to distinguish between versions. It may be simplistic to regard ‘use something consistently’ as a standard, but if a repository at least follows its own standard, this provides a starting point for interoperability.
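
To illustrate the filename option, a minimal Python sketch follows. The naming pattern (title slug, version number, date, status label) and the function name `versioned_filename` are assumptions made for this example, not a convention prescribed by VIF; the point is only that one pattern is applied consistently across a repository.

```python
import re
from datetime import date


def versioned_filename(title: str, version: int, modified: date,
                       label: str, extension: str) -> str:
    """Build a filename that carries versioning information.

    Hypothetical pattern: <slug>_v<NN>_<YYYY-MM-DD>_<label>.<ext>
    """
    # Reduce the title to lowercase words joined by hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{slug}_v{version:02d}_{modified.isoformat()}_{label}.{extension}"


example = versioned_filename("Why Versioning Matters", 2,
                             date(2008, 5, 9), "draft", "pdf")
print(example)
```

Because every part of the name is generated by one function, a user who downloads the file without seeing any metadata can still tell which version it is.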

Output – VIF: Version labels or taxonomies

Title: VIF: Version labels or taxonomies

Pages: webpage
Date Released: May 2008

URI for Output: http://www.lse.ac.uk/library/vif/Framework/Essential/taxonomies.html

Summary of contents:
“Clarity of versions is important; but the terminology, even for just articles, is not static or decided. Consistent usage within one repository, possibly for particular items may be achievable, such as at LSE, but care should be taken in their use and their implementation should be supported by clear policy and definition.

Explicit definition of vocabulary used is a minimum requirement if taxonomies are used.”

Suggested taxonomies are those produced by the following projects:

Comments:
In light of their survey results (noted here: https://rrtsynthesis.wordpress.com/2009/02/03/output-vifthe-results-of-the-vif-user-requirements-study-taxonomy/ ) VIF had already noted that widespread consistency in the use of any taxonomy is unlikely. Here they have recommended some appropriate standard taxonomies and noted that within a single repository consistent use may be possible.
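
The requirement for an explicit definition of vocabulary can be sketched as a controlled list in which every label carries its own definition. The four labels and their wording below are hypothetical and stand in for whatever taxonomy a repository adopts; they are not taken from VIF or from any of the suggested projects.

```python
from enum import Enum


class VersionLabel(Enum):
    """Hypothetical local vocabulary; each label carries its definition."""
    DRAFT = "Work in progress, not yet submitted for publication."
    SUBMITTED = "Version sent to a publisher, before peer review."
    ACCEPTED = "Version accepted for publication, after peer review."
    PUBLISHED = "Version as formally published."


def describe(label_name: str) -> str:
    """Return 'label: definition', rejecting terms outside the vocabulary."""
    try:
        label = VersionLabel[label_name.upper()]
    except KeyError:
        allowed = ", ".join(item.name.lower() for item in VersionLabel)
        raise ValueError(
            f"Unknown version label {label_name!r}; use one of: {allowed}"
        ) from None
    return f"{label.name.lower()}: {label.value}"


print(describe("accepted"))
```

Rejecting unknown terms is one way to keep usage consistent within a single repository, which is the level at which VIF suggests consistency is achievable.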

Output – VIF: Dates

Title: VIF:Dates

Pages: webpage
Date Released: May 2008

URI for Output: http://www.lse.ac.uk/library/vif/Framework/Essential/dates.html

Summary of contents:

“# If there is ever only one option for a date, then it is critical that you find a way to make it clear which date you are referring to. You should agree on what the most relevant to your repository is, apply it consistently and provide information to users about it.
# VIF recommends is that if you only use one date, it should be Date Modified (by the author, not the repository) and wherever possible, this should be accompanied by a description of who made the changes and why.
# A key thing to remember when considering which date to use to enhance version identification, is that it should relate to the object at hand, not to the repository or to an understanding of the workflow.”

Comments:
The VIF project provides guidance about the often ambiguous use of dates in repositories.
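
A minimal sketch of the recommendation above, in Python: a single date is stored together with a statement of which date it is, who made the change and why. The class name, field names and sample values are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ModificationDate:
    """A date that states what it means, who made the change and why."""
    value: date
    date_type: str    # which date this is, e.g. "Date Modified"
    modified_by: str  # who changed the object (the author, not the repository)
    reason: str       # why the change was made

    def display(self) -> str:
        return (f"{self.date_type}: {self.value.isoformat()} "
                f"({self.modified_by}; {self.reason})")


stamp = ModificationDate(date(2008, 5, 1), "Date Modified",
                         "author", "corrected table 2 after review")
print(stamp.display())
```

Showing the date type alongside the value means a user never has to guess whether a date refers to modification, deposit or publication.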

Output – VIF: Versioning Information – identification

Title: VIF: Essential Versioning Information

Pages: webpage
Date Released: May 2008

URI for Output: http://www.lse.ac.uk/library/vif/Framework/Essential/index.html

Summary of contents:
As well as author and title information, VIF recommends that the clear identification of an item should be supported by as much of the following information as possible. The pieces of information should be exposed by “embedding them into an object or storing them in metadata”.

“1. Defined dates
2. Identifiers
3. Version numbering
4. Version labels or Taxonomies
5. Text description”

Comments:
The VIF provides further details about each of these types of information. Where there are key recommendations, they are listed in this blog as separate entries.

The consistent provision of this set of information would better enable repository services to locate appropriate copies from aggregated copies of the same item.
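
As a sketch of how a service might use this information, the Python below holds the five pieces alongside author and title and picks the most recent copy from an aggregated set. The record layout, the identifiers and the ranking rule (highest version number, then latest date modified) are assumptions for illustration, not part of the framework.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class VersionRecord:
    title: str
    author: str
    date_modified: Optional[date] = None   # 1. defined date
    identifier: Optional[str] = None       # 2. identifier
    version_number: Optional[int] = None   # 3. version numbering
    version_label: Optional[str] = None    # 4. version label or taxonomy term
    description: Optional[str] = None      # 5. text description


def latest_copy(records: List[VersionRecord]) -> Optional[VersionRecord]:
    """Pick the copy with the highest version number; date modified breaks ties.

    Records without a version number cannot be ranked and are skipped.
    """
    ranked = [r for r in records if r.version_number is not None]
    if not ranked:
        return None
    return max(ranked,
               key=lambda r: (r.version_number, r.date_modified or date.min))


copies = [
    VersionRecord("Paper", "A. Author", date(2008, 1, 10), "repo-a:1", 1, "draft"),
    VersionRecord("Paper", "A. Author", date(2008, 4, 2), "repo-b:7", 2, "accepted"),
    VersionRecord("Paper", "A. Author", identifier="repo-c:3"),
]
print(latest_copy(copies).identifier)
```

The third copy shows the cost of missing information: without a version number it cannot be compared with the others at all.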

Output – VIF:The results of the VIF user requirements study – datasets

Title: VIF:The results of the VIF user requirements study

Pages: webpage (summary of http://www.lse.ac.uk/library/vif/Versioning_Issues_-_Discussion_Paper.doc)
Date Released:

URI for Output: http://www.lse.ac.uk/library/vif/Problem/research.html

Summary of contents:
“VIF carried out further research into repositories that already contain some datasets, and investigated how these datasets are managed. Because this is a currently limited field, and because repository systems are not primarily configured to deal with such objects, we found that repository staff:

* Avoid versioning issues wherever possible by only keeping the most recent version. Older versions are deleted. This contrasts with how older version of other types of object are usually treated.
* By doing this, potential issues about which version people are citing becomes a problem.
* Have not found satisfactory ways to describe or indicate the relationship that a particular set of data holds to other related research outputs that are held in the repository.”

Comments:

This practice, if widespread beyond the survey group, represents a significant challenge that needs to be addressed (possibly by tool/repository plugin development). Successfully citing and sharing datasets requires a stable and identifiable versioning system.
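
One possible shape for such a system is sketched below in Python: an append-only history in which older versions are never deleted and each deposit returns a reference that keeps resolving to the same content. The class, the method names and the `<id>/v<n>` reference format are hypothetical and do not describe any existing repository software.

```python
from typing import Dict, List


class DatasetHistory:
    """Append-only store: every deposited version is kept and numbered."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[bytes]] = {}

    def deposit(self, dataset_id: str, content: bytes) -> str:
        """Add a new version and return a citable reference, e.g. 'survey/v2'."""
        versions = self._versions.setdefault(dataset_id, [])
        versions.append(content)
        return f"{dataset_id}/v{len(versions)}"

    def retrieve(self, reference: str) -> bytes:
        """Resolve a reference produced by deposit() to the exact content cited."""
        dataset_id, _, version = reference.rpartition("/v")
        return self._versions[dataset_id][int(version) - 1]


history = DatasetHistory()
first = history.deposit("survey", b"raw responses")
second = history.deposit("survey", b"cleaned responses")
print(first, second)
```

A citation made against the first reference still retrieves the original data after the second deposit, which is what is lost when only the most recent version is kept.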

Output – VIF:The results of the VIF user requirements study – taxonomy

Title: VIF:The results of the VIF user requirements study

Pages: webpage (summary of http://www.lse.ac.uk/library/vif/Versioning_Issues_-_Discussion_Paper.doc)
Date Released:

URI for Output: http://www.lse.ac.uk/library/vif/Problem/research.html

Summary of contents:
“Many free text comments remarked that whilst the idea is a sound one in principle, implementing such a taxonomy [of versions] would be virtually impossible without some sort of enforcing body. Also, getting community agreement on the terminology used would be difficult due to the often polarised standpoints of publishers and information professionals. Insulating the vocabulary chosen from the pre-established terminology and bias of certain camps would clearly be a very serious undertaking.”

Comments:

This review of feedback received through the survey highlights the probable difficulties inherent in any proposed common/standard set of terms for versions of digital assets.

The contentiousness of agreeing a taxonomy of versions for articles can also be seen in the variety of responses to NISO’s work on journal article versions (http://www.niso.org/publications/rp/RP-8-2008.pdf).

Output – VIF:The results of the VIF user requirements study – formats

Title: VIF:The results of the VIF user requirements study

Pages: webpage (summary of http://www.lse.ac.uk/library/vif/Versioning_Issues_-_Discussion_Paper.doc)
Date Released:

URI for Output: http://www.lse.ac.uk/library/vif/Problem/research.html

Summary of contents:
“There is an awareness by information professionals of a trend towards a wider range of object types being created. When asked what types of material they currently stored in their repositories, 95.4% of information professionals claimed that they currently store, or plan to store, text documents with many also stating that they store, or plan to store, audio files (73.6%), datasets (77.9%), images (83.3%), learning objects (46.5%) and video files (75.3%). This can be seen to be especially positive, especially in the context of the results of the academics survey, which suggested a large number of researchers either already create or intend to create audio files (47.2%), datasets (68%), images (72.5%), learning objects (74.6%) and video files (57.6%). As expected, the vast majority also intend to continue working with text documents.”

Comments:
Survey data giving a snapshot of the content types stored by repositories and the content types created by academics; it provides one comparison between current ‘supply’ (what can be stored) and ‘demand’ (what users want to store), which informs the sector.

The figures for non-textual materials being (or about to be) stored by repositories seem quite high given comparable stats from OpenDOAR:

[Chart: Content Types in OpenDOAR Repositories - Worldwide. From: OpenDOAR]

Output – VIF – VIF user requirements study: repository purpose

Title: VIF:The results of the VIF user requirements study

Pages: webpage (summary of http://www.lse.ac.uk/library/vif/Versioning_Issues_-_Discussion_Paper.doc)
Date Released:

URI for Output: http://www.lse.ac.uk/library/vif/Problem/research.html

Summary of contents:
“The two groups did diverge on the perceived purpose of repositories. The academics we surveyed were very clear about their wish to only make the finished version of their output ultimately available and free text comments (often even in answers to questions on different subjects) showed that they considered repositories were useful to highlight latest research, but not necessarily to preserve the body of research. This contrasts directly with the wishes of information professionals, who overwhelmingly wanted to store all available versions.”

Comments:
A finding which highlights a potential difference of opinion between information professionals and academics about what the repository is there for. This lends support to the idea that preservation may not be perceived by academics as a key function of a repository (though the counter-example of Hull, with RepoMMan etc., should be noted).

Output – VIF – VIF website: versioning issues

Title: VIF: why versioning matters

Pages: webpage
Date Released:

URI for Output: http://www.lse.ac.uk/library/vif/Problem/importance.html

Summary of contents:
The project notes the common versioning issues that repositories face.

  • “Confusion over whether an article is the published version, a copy that is identical in content to this but unformatted, a draft version, an edited version and so on.
  • Repository searches yielding many results which ostensibly appear to refer to the same item, but actually vary in terms of content, formatting or propriety file type.
  • Research work with multiple authors being deposited in different places at different stages of development without guidance as to which is authoritative or most recent.
  • Multimedia items being handled poorly by repositories that treat them as text, and their relationship to other objects that form part of the research project being undefined by the repository.
  • Vastly inconsistent approach of different repository software packages and implementations in how versions are dealt with.”

Comments:

Although this is intended to provide the context of the project, it also provides a succinct introduction to the survey findings and the problems repositories face.

Project – VIF

Project Name: Version Identification Framework

Short Project Name: VIF

Brief project description:
“Continuing from the work of the VERSIONS project the project will provide a common infrastructure for the naming and understanding of issues relating to versions of scholarly works.
The results of an online survey of repository users about current use of digital objects and about the versioning questions that arise will be used to inform a draft framework, to be developed through an expert working group comprising of members from the project partners and other key stakeholders.
The Version Identification Framework will be recommended to the JISC and digital repository communities through a community acceptance plan and a dissemination campaign.”

Outputs:

Programme Name: Repositories and Preservation Programme

Strand: Discovery to Delivery

JISC Project URI: http://www.jisc.ac.uk/whatwedo/programmes/reppres/vif.aspx

Project URI: http://www.lse.ac.uk/library/vif/

Start Date: 2007-07-10

End Date: 2008-05-09

Governance:

Contact Name and Role: Jenny Brace, Project manager

Name of Trawler: John