Discussion of terminology broadened to consider the multiple roles of a repository and how these fit with the vision of enhanced scholarly communication. Following two years of intense repository activity, the role of repositories seems diffuse, and the lack of clarity about objectives is having a real impact on those working in this area.
- institutional repositories need to meet the local requirements of their users (as content creators) as well as the wider objectives of other stakeholders (searchers, institutions, funders)
- the ‘boundaries’ of what constitutes a repository are unclear
- repositories, particularly institutional repositories, need to interact with other systems to achieve the objectives associated with them
“a richer scholarly communication environment, based on open access to, and re-use of, scholarly materials. The phrase ‘scholarly communication’ is used here in its richest sense to include the life-cycle of information and knowledge from research to learning.”
In general terms this vision is still considered valid. However, the vision is being interpreted in different ways, from a fairly narrow focus on open access to a much more wide-ranging change in the publication process and the research environment.
Indeed, open access itself is understood differently, varying from a concentration on peer-reviewed journal articles, to providing access to previously hidden grey literature,1 to making accessible versions of work in progress. Similarly, there are different emphases on what richer scholarly communication incorporates, from a narrow interpretation as an increase in electronic journals, to a revolution in scholarly publishing patterns using a more Web-based environment (Open Science, scholarly social networks and less emphasis on journal articles, centralised stores of shared learning materials).
Repositories might play various roles in future visions of scholarly communication, whether repositories as we know them now or more innovative types. Existing work in Open Science, Grid and Virtual Research Environments provides pointers to change in the scientific research process (see Carole Goble’s presentation to the British Library Board2 and David de Roure’s presentation at Repository Fringe 2008 as mentioned in section 3.2 above).
This Review was not intended to look at this wider picture of the future of scholarly communication in detail. However, a very brief preliminary outline is given here, which might suggest how further scenarios could be developed.
Scholarly content is created, stored and shared on the Web. Despite the increase in less formal communication, for researchers the peer reviewed paper is still the key communication unit of record and status, and reward systems are based on peer reviewed outputs.
The amount of informal Web-based communication is increasing and, where it is deemed worthwhile, it is captured in repositories containing wikis, blog entries, etc.
Open Science is the norm, with data produced in labs typically being stored automatically and made open for re-use.
The primary level of social interaction for researchers, teachers and learners is at the group level (cross-institutional project, institutional research, lab or teaching group). A typical workflow is for a small group to produce content which is then shared with a few trusted individuals in a wider group for informal peer review. As confidence in the content grows it is shared with a wider audience and undergoes formal peer review. Repositories support these various levels of sharing, resulting in an open access content item.
The group also requires access to existing resources for ‘re-use’ in support of the creation of new output.3 A group repository enables access to research papers and data for re-use and, for the teacher, access to existing learning materials for re-use. Students can build their own curriculum from open educational resources repositories.
Repository content would reside in various types and ‘levels’ of repository, with content gravitating to the appropriate level for preservation depending on the value of the content. Some repositories would reference content stored elsewhere, others would have a preservation function storing complete objects.
The storage of large data sets may be ‘in the cloud’ as an alternative to storage in HE institutions or data centres.
Recommendation 3: Given the changes in the education, Web and scholarly environments, we need to articulate an updated vision of richer scholarly communication. The vision should be based on the scholarly life-cycle from experiment through to research, teaching and learning. The various roles repositories might play should be mapped out, particularly as regards management of digital resources, open access and re-use.
The 2006 vision has been associated with several different high level objectives. From the start, the objectives of the JISC repository funding programmes were broad, covering improved asset management and the need for preservation as well as the promotion of open access.
Whilst repositories have a role in supporting each of these objectives, there are many other systems, as well as policy and organisational issues, involved. There is a danger that, if these various objectives are not sufficiently differentiated, the role of a repository in relation to any particular objective becomes unclear. Further distinguishing and prioritising these high level objectives and related activity would help individual institutional repositories to set their own priorities and relate to wider activity.
Recommendation 4: Communications about repositories should emphasise higher level objectives. It would be helpful for JISC to differentiate high level objectives and subsume repository activity under those objectives. There needs to be a shift in emphasis from the ‘repository’ to the objective. JISC should ensure calls and funded activity relate to particular objectives rather than solely to ‘repositories’.
In the context of these multiple high level objectives, repositories are being established within particular institutions, each institution with its own community and requirements. In these institutions the repository will need to meet the requirements of that institutional community. The stakeholders may be researchers, acting as experimenters, data users, collaborators or authors; or teachers, as creators of learning materials. Other stakeholders may be information managers. High level objectives need to be related to, and aligned with, the local requirements of the institution.
It is often the case that a repository is set up as a pilot on the basis of a high level global objective, but with unknown detailed local requirements. The local repository will quickly need to meet local requirements in order to achieve short term success. There is a range of requirements that the repository might fulfil. The Repositories Support Project site in its pages on ‘making a case for a repository’4 lists nine benefits for researchers, ten for institutions and two for funders, thus illustrating the variety of local requirements there might be for establishing an IR. The varied motivations for implementation will result in different repository functionality. In addition there is likely to be further diversity in requirements for functionality for different types of content. The complexity inherent in this mix of objectives, and the challenge it presents to repository managers, is well documented by Dorothea Salo.5 Progress depends on institutions integrating IRs into their wider strategy for digital collections and digital information management.
In order to engage content creators within the institution, objectives need to be presented to users in terms of familiar processes and tools. Ideally objectives will be met by identifying common points in a large number of workflows that repositories could hook into. This acknowledges that content creators are likely to be motivated by systems that streamline their own familiar tasks rather than by higher level objectives. This is supported by a case study from the California Digital Library6 and a recent report by Carole Palmer7 showing that creators are motivated by their own requirements for higher impact and increased citation rather than by open access as such. In the IdeaScale discussion this is expressed as the view that a repository needs to be defined in terms of researcher workflow, supporting the work of the user and creating repository content as part of this process.
The available software tools and JISC funded activity need to support these diverse local institutional requirements. Work is needed to provide flexible solutions; at the moment solutions tend to be divided between distinct platforms (eprints repository, learning materials repository, digital library, data store, preservation archive).
Recommendation 5: Tools and project outputs should be presented in terms of solutions to local institutional repository requirements.
Differing views were expressed in the consultation exercise on how rich in functionality a repository should be, which leads to questions about where the boundaries of a repository lie. Repositories were characterised variously, from rich multi-functional environments (see Chris Rusbridge’s Research Repository Systems) to simple back-end data stores with added value services located elsewhere.8 Whilst the overall ambition is similar, i.e. to support the researcher’s workflow in an integrated way, this variance in views leads to misunderstanding as to what a repository should be achieving.
Once again it would be helpful for repository activity to be subsumed under higher level objectives e.g. developing Virtual Research Environments, managing content for Virtual Learning Environments. JISC’s late 2008 funding call sought to address this issue by encouraging integration of the e-Research and IE themes in project bids.
Recommendation 6: Repository activity needs to be joined up more closely with other forward looking work such as the JISC Virtual Research Environment activity, open science, data sharing, preservation. At the same time at the local level there needs to be support for integration of repositories with institutional systems (Current Research Information Systems, Research Excellence Framework, Virtual Learning Environment, author identity systems).
- Talis talks with Herbert van de Sompel about SFX, OAI, and Repositories. http://blogs.talis.com/xiphos/2008/09/06/talis-talks-with-herbert-van-de-sompel-about-sfx-oai-and-repositories/
- Carole Goble, The Future of Research (Science and Technology), presentation at British Library Board Awayday, 23 September 2008, http://www.slideshare.net/dullhunk/the-future-of-research-science-and-technology-presentation
- The term re-use refers to manipulation and adding of value to existing research outputs and research data. For example journal article re-use might include citation and text mining. Research data re-use might include re-analysis of original data, combination or comparison of original and additional data, data mining.
- http://rsp.ac.uk/repos/justification
- Salo, Dorothea. Innkeeper at the Roach Motel. Library Trends 57:2 (Fall 2008). http://digital.library.wisc.edu/1793/22088
- RSP Web site. eScholarship repository case study. http://www.rsp.ac.uk/repos/casestudies/california.php
- Carole Palmer et al. Identifying factors of success in CIC institutional repository development. Andrew Mellon Foundation, August 2008. https://www.ideals.uiuc.edu/handle/2142/8981
- Chris Rusbridge has summarised much of this discussion on the Digital Curation blog. http://digitalcuration.blogspot.com/2008/08/comments-on-negative-click-research.html