Repositories Through the Looking Glass
Andy Powell, from Eduserv Foundation (an educational charity based in Bath)
This presentation had a lot of useful views for us as we approach the new philosophy of our whole ECM environment.
So far (and at this conference) digital repositories seem mostly on the academic agenda in university libraries. Not much seems to be recognised regarding the challenges facing cultural institutions, so maybe we can learn something from the academic experience?
Powell has some cynical views re repositories.
He started by giving us his background with Dublin Core – he has been involved from the early days, especially in developing web-based metadata generation tools. (The Abstract Model was discussed zzzzzzzzz.)
Then he moved on to the JISC Information Environment – again aimed largely at the tertiary and further education sector. Most UK digital repositories are based in this environment. They use all the expected standards, but the environment has missed or ignored the Web: web architecture in particular is missing from most digital library spaces.
Eduserv has worked with the UK Science Museum, and they started by modelling the infrastructure behind the repository (similar to what we have done with ECM). Eduserv then developed a repository for the museum, which they called a Web Content Management (WCM) system.
Serving stuff on the web is still missing from the JISC Road Maps. He was very positive about open access and what it will do to scholarly publishing – it isn't an "if", but a "when" – it will happen!
Repositories (to date) are mostly focussed on deposit, not on serving the web. WCMs are essential if repositories are to be used. Concepts such as search engine optimization are essential too (not just having federated search within the environment).
He briefly touched on the "REST" architectural style – it focusses on resources and global identifiers.
Is the focus just to be on the institutional repositories or a global environment?
Web 2.0 means: the new "prosumer", remote applications, social-ness & exposed APIs; plus diffusion (e.g. blogs, etc.) and "concentration" (via Lorcan Dempsey's recent writings at OCLC) – hosting services that are global in scale, e.g. Flickr, Technorati and maybe del.icio.us? He mentioned those using Amazon S3 hosting. Social networks are critical, particularly for research purposes, and this needs global services.
Future – what would a web 2.0 repository look like? He said it would look like Slideshare. Not many in the audience seemed to be using it. You can share, embed, tag, favourite, etc. Other attributes he suggested: a high quality web-based document viewer; tagging; visible to Google; RSS; Amazon S3 (infrastructural services); support for social groups; global in scale. BUT – it doesn't support preservation or complex workflows, and it doesn't expose rich metadata – so what? Are they really needed (in this system)? How can these needs be met without destroying everything? I think that is the problem we have made for ourselves with some of our CMS – wanting them to be all things to all users and forgetting their most critical tasks.
One way forward may well be using SWAP – the Scholarly Works Application Profile. It is used to describe eprints – scholarly works/publications held in repositories. It is based on FRBR – Functional Requirements for Bibliographic Records. (See also this Dempsey blog post.) Simple Dublin Core (the metadata standard/protocol) doesn't do this: SWAP is all about relationships between things, not just a flat description. But it may all be too complex in the end. Can we just encourage users to tag, instead of doing deep formal cataloguing that nobody ever sees and few outside the institution ever use? This has effectively distorted much of our work (in the AWM), which seems lost to any users, even on our own site. Rich cataloguing records are locked away inside some of our CMS and NEVER exposed on the web. THIS IS FUNDAMENTALLY WRONG!
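To make the "flat versus relationships" point a bit more concrete for myself, here is a rough Python sketch. It is only my illustration, not anything Powell showed: the class and field names, the example URIs and the sample data are all made up, and the layering is just a simplified take on the FRBR idea of a work that has versions, which have formats, which have copies.

```python
from dataclasses import dataclass, field
from typing import List

# Flat, Simple-Dublin-Core-style description: one record of name/value pairs.
# It cannot say how a draft relates to the published version or to other formats.
flat_record = {
    "title": "Example paper",
    "creator": "A. Author",
    "format": "application/pdf",
    "identifier": "http://repository.example.org/123/final.pdf",  # made-up URI
}


# Relationship-based description (hypothetical names, loosely FRBR-like layering).
@dataclass
class Copy:
    uri: str  # where one particular copy can be retrieved


@dataclass
class Manifestation:
    media_type: str  # a particular format of a version
    copies: List[Copy] = field(default_factory=list)


@dataclass
class Expression:
    version: str  # e.g. author's draft, published version
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class ScholarlyWork:
    title: str
    creator: str
    expressions: List[Expression] = field(default_factory=list)


def all_copy_uris(work: ScholarlyWork) -> List[str]:
    """Walk work -> versions -> formats -> copies and collect every URI."""
    return [
        copy.uri
        for expression in work.expressions
        for manifestation in expression.manifestations
        for copy in manifestation.copies
    ]


work = ScholarlyWork(
    title="Example paper",
    creator="A. Author",
    expressions=[
        Expression("author's draft", [
            Manifestation("application/pdf",
                          [Copy("http://repository.example.org/123/draft.pdf")]),
        ]),
        Expression("published version", [
            Manifestation("application/pdf",
                          [Copy("http://repository.example.org/123/final.pdf")]),
            Manifestation("text/html",
                          [Copy("http://repository.example.org/123/final.html")]),
        ]),
    ],
)

for uri in all_copy_uris(work):
    print(uri)
```

The flat record can only point at one file, while the structured version can answer questions like "which copies of the published version exist, and in what formats?". The price is the extra modelling effort, which is exactly the complexity worry raised above.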
We need to learn more from Web 2.0 about what works on the web as most repositories are not working (particularly re sharing on the web). They are not marrying up with the social networks that researchers actually use. Slideshare gets by with almost no formal metadata, just by using tags and links between resources.
Open access is important – making content available on the web. Policy needs to reflect this. We still focus on deposit, not putting resources on the web.
Andy's closing message was for us to think about resource orientation, not services – digital libraries ignore this at their peril.
Questions & further discussions:
Warwick Cathro (NLA) suggested that institutional repositories can account for needs such as preservation and richer identification (which is what we are aiming at to some extent), but Powell said that that world has not yet been built, at least not in the UK. Building the social layer is beyond that model.
Physicists seem to be sharing their knowledge and research in arXiv.org and they maintain their affiliations.
Why do we still publish as PDF? It is like still working on paper, and it runs counter to the mainstream web. Why not XHTML, with embedded links and microformats? Citation is another huge area and is still at odds with how the web works, yet even WordPress allows for this with an app.
Stuart Weibel from OCLC suggested that researchers are too lazy and won't do what is needed re deposit and identification of resources. But Powell remains optimistic that the low cost of sharing a presentation on Slideshare brings massive benefits in terms of knowledge sharing. It is intuitive and obvious and can work with little encouragement. It is second best to say “you must do this”. Systems must be more intuitive than that! (I think this is a key message for us and the new practices and protocols we will be setting up and using within our ECM.)
Re the future of scholarly publishing, Powell said that Open Access is just an inevitable change. We see it in the music industry already. Researchers can make their stuff free on the web. Yes, people will still want to buy and want to publish in journals. Maybe they'll be different, but something will change. National funding bodies seem unable to fund global networks for researchers, so publishers are starting to step into that space, building “Facebooks” for researchers. What impact will blogging have on this? Probably an increasing one.
[My apologies for this long post, but this is the one paper not yet provided to us online or via the CD we received at Rego. I had to take these rough notes during his presentation. I thought it was pretty relevant to us as we approach ECM implementation.]