Archive for the ‘Charleston Conference 2011’ Category

Greg Tananbaum conducted the annual live version of his “I Hear the Train A Comin’” column in Against the Grain. His guests this year were Anne Kenney, Cornell University Librarian, and Kevin Guthrie, Founder of JSTOR and ITHAKA.

(L-R) Greg Tananbaum, Anne Kenney, Kevin Guthrie

Here is an edited transcript of their conversation:

GT: What are the biggest challenges facing the library community over the next 2-5 years? What has to give?

AK: Our materials budgets show us a path forward. The percent going to e-content has doubled from 30% in 2004 to 60% now. We have been surprised by the diversity of our holdings in the past; moving forward we will see more homogeneity. But our organizational arrangements are still heavily oriented towards physical services. Something must give as more and more electronic services become available.

KG: The notion of libraries and publishers as adversaries is not appropriate. It is really about the author and reader and a system to serve them. The allegiance to the intermediary structures must give because much restructuring is happening. It is very hard for the existing actors to give up on how they do things to allow freedom of reinvestment.

AK: The archive is moving towards being seen as a public good worthy of public support. We are moving toward a model of providing support from around the world.

KG: There is downward pressure on pricing of content. The question is what value are you adding on to that. Everybody in the space between the author and reader must figure out how they can contribute value.

GT: What aspect of the vendor-institutional relationship do publishers misunderstand?

AK: We tend to think that publishers see us as a sales channel, with less understanding of our mediation goals, providing access, preservation. There is a stronger relationship between the library and reader than would be considered.

GT: Same question for libraries?

KG: Librarians go into that profession because they are not looking to go into business, so there is a challenge understanding the business aspects. When you build scale, you build huge costs. Value is more challenging. How does a library value the materials it is getting? It is very difficult to measure value.

AK: We want the same rights for electronic materials as for physical ones, for example, e-lending of materials. Libraries see this as publishers trying to curb their traditional roles. The hidden environment of seeking information wherever it is found is justification for that. In the music environment, performers are using concerts for income. We have no similar process in libraries.

KG: There is a tendency to stay with the status quo. The concept of owning something has changed. Publishers sold books to libraries and they were loaned, but there was friction in that. All the former players will not necessarily survive to the new world. Collaboration between those helping the authors write and those helping them distribute is important.

GT: Circumstances and media have changed, which allows us an opportunity to revisit how we are operating.

AK: We are moving beyond the silos of publishers as well as the silos of libraries. Preservation is an area where publishers and libraries need to do much more work as we move towards licenses and not owning material. We are at real risk of losing access. We need mechanisms in place to preserve content. Publishers’ activities are insufficient so far.

KG: There is a tremendous amount to do with e-journals. New formats are coming and it is important that we are investing in those solutions.

GT: What should we take away from current legal cases?

AK: We need to understand the issues associated with what is appropriate for digital access to material. Libraries are in the business of respecting agreements, contracts, and rights. If we do not do that, we will lose our sense of trust, which we will not let go lightly. We must respect privacy of use.

KG: The old adage that “nothing in newspapers and blogs is true” is indeed true! It is amazing how far from the truth some of the published articles are. We have benefitted greatly as a country from the respect for the rule of law. If we do not like the law, we advocate changes, and that is a great thing. Libraries have been good stewards of their responsibilities, and we must respect that.

GT: What game changer will we be talking about in Charleston in 2014?

KG: Books in electronic form are the big game changer of the present moment. We do not have broad access to books yet; when we do, everything will change. The Google Books initiative signaled that it was possible to digitize 15 million books. What if everything in a library is available electronically? That changes the way we operate. We are not there yet; those books are not yet as widely available as people want them to be.

AK: There are many different game changers. How do we manage the long continuum of scholarly communication? The outcome of the Google settlement is a main game changer, as is the development of the Hathi Trust. Over 60 institutions and consortia are participating, and with 10 million volumes, it is in the company of the elite of ALA libraries. Now we have the ability to search across the content of all those volumes. We are moving towards new forms of reading, where we can mine information in new ways. Researchers will still look at physical books as we move toward orphan works becoming available. How do we as a community keep things lightweight and work together and not diminish the role of the individual institution but enhance it? The future will be in more pre- and post-collaborative activity.

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor

 

(L-R) T. Scott Plutchak, University of Alabama; Paul Courant, University of Michigan; H. Frederick Dylla, American Institute of Physics

The Friday plenary session opened with an executive roundtable discussion. The panel consisted of 2 people in charge of large organizations who spent many years in research and who represent one audience that librarians want to serve. They reported on a Scholarly Publishing Roundtable, which met to discuss issues and find common ground, then issued a report, which is on the AAU website. The issues are complicated and need to be balanced. Access only is not worth very much.

Here is an edited transcript of the conversation:

PC: The conclusions of the Scholarly Publishing Roundtable will be useful to define policies around public availability to publicly funded work. There is a remarkable heterogeneity in the ways things come to be published, and we must recognize that or we will be in serious trouble.

FD: For 3-1/2 centuries, publishers and librarians worked together. In the last 10 years, they had a fractious debate. The Roundtable changed the tone of the debate on public access. The President’s Science Office released some overriding principles to be followed and requested input on publishing practices.

SP: The “version of record” issue is a concern. The author’s final manuscript may be useful for immediate needs, but may not suffice as time goes on and revisions are made.

PC: That issue is complicated now as we produce everything electronically. You can lose the version of record easily in an unstable world. In the humanities, there is no version of record because of the continuing integration of multimedia, updates, etc. Versions of articles never become stable. What is the library’s role in this environment? The library and publisher begin to look very similar. Do we want to preserve the entire record? Do we want just a sample of it?

SP: This is particularly important in the healthcare field because lawyers often want a particular edition for a legal case. Traditional publishing in physics and similar disciplines is stable. Why?

FD: It goes back to the mimeograph machine, when early versions were sent around to colleagues. Now we fax documents to others. The physics community is well knit and has a half century of experience with collaboration. Some papers have hundreds of authors, so peer review is done internally.

PC: Economics is like physics and has always had a similar preprint culture of circulating papers before they are reviewed. We should not believe that a model that works in one world will work in all worlds.

SP: Science and scholarship is becoming more and more siloed and more interdisciplinary.

PC: The silo to the next piece of work is extremely important, but it may be more useful when it jumps over into another discipline.

FD: PLoS ONE is an interesting example. By forming an interdisciplinary and wide open journal, it has become the largest journal in the world. Many of us just use Google and go right to the abstract, without needing the indexing and other things on top of the article. We as publishers must be working on accurate discovery tools to help users locate articles.

SP: PLoS ONE is the first real game changer in publishing because it has shifted the process of peer review. One wonders what will happen to the rest of the journal space.

PC: I expect vertical alliances of journals. BEPress has set up various categories of journals, but articles are only reviewed once and then the journal where the article will be published is selected.

FD: This is just another corner of the publishing ecosystem. The diversity of publishing is one thing I admire.

SP: Findability is the thing that worries me most about PLoS ONE. There is so much of interest that is being published that the challenge is not to separate the interesting from the uninteresting but to find the really important things among all the interesting ones.

FD: Our most important customers are the authors and readers. Everybody else serves them.

SP: The use of social networking may help researchers broaden their circle of colleagues beyond what they are aware of.

FD: Collexis, now part of Elsevier, set up something similar for biomedical researchers. We have UniPHY for physics. It is not too successful yet, but it is a good start.

SP: We are at least a generation or two removed from a true digital culture that parallels today’s print culture. The technical challenges are very solvable, but the economic and legal issues are much more difficult.

PC: The technology is actually quite good; we have become very good at transmitting large data files. But doing so requires payments and a set of arrangements different from anything we have seen.

FD: Costs and benefits must be a very important part of the equation. Someone must pay for the infrastructure.

SP: You both seem very optimistic about where we are going. Is that accurate? Where are the bright spots?

FD: I think the diverse group at the Roundtable showed the way for us to work through these problems. We all agreed on a set of principles for scholarly publishing.

PC: It’s now very inexpensive to copy and distribute work. That’s very good because new things that were formerly unimaginable are possible.

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor

 

 

Robert Darnton and Rachel Frick

In a session on the Digital Public Library of America (DPLA), Robert Darnton, Director of the Harvard University Library, likened the concept to Thomas Jefferson’s observation that often the use of something does not diminish its value. For example, using one candle to light another spreads light and does not diminish the value of the first candle. This idea acquired a 21st century luster with the spread of the Internet. The use of information does not diminish its value. Public good benefits the entire citizenry and one citizen’s benefit does not diminish another’s; it is not a zero-sum game. However, in considering these concepts, we must not lose sight of the fact that the acquisition of knowledge as a public good is not without cost. (Someone had to purchase Jefferson’s candle!)

Moving on to the present, Darnton noted that the DPLA is an opportunity to realize the Enlightenment goals upon which our country was founded. Google tried to establish a major digital library and demonstrated that today’s technology could be used to create a new kind of library which, in principle, could contain all the books in existence. But Darnton observed that Google Book Search is an example of a good idea gone bad because of copyright problems and the infringement alleged by the Authors Guild. Google did not pursue a legal case which (if they won) would have provided a significant public benefit; instead they chose a commercial approach and negotiated a settlement with the Guild. The settlement was rejected by a Federal Court. So the time has come to create a digital library to make our cultural heritage available to the entire world.

In 2010, a steering committee was formed to provide guidance. Working groups were set up and produced a way to create a master plan, which was presented to the public last week.

Here are some of Darnton’s thoughts about features of the DPLA:

1. The DPLA will be a distributed system aggregating collections from research libraries and institutions. It will not be a single database but will consist primarily of books in the public domain from the Hathi Trust, the Internet Archive, and digitized collections made by large libraries independent of Google. Government sources are also rich sources; all 50 states have digitized their major newspapers and given them to the Library of Congress. These can be given to DPLA. Because of copyright law, most current literature will not be in the DPLA. DPLA’s mission should be defined to make its service distinct from public libraries. Darnton suggests that the DPLA exclude anything published in the last 5-10 years.

2. At its launch in April 2013, the DPLA will probably contain a basic stock and will grow as funding permits. It will be interoperable with major digital libraries in other countries (it has already made an agreement to cooperate with Europeana). The example of Europeana suggests the bare minimum funding needed to get the DPLA going. Brewster Kahle of the Internet Archive estimates a cost of 30 cents to digitize a page, or $300 million to digitize the contents of a large library, but others think the cost is nearer to $1/page. The DPLA will grow in accordance with its budget, which nobody knows yet. If a coalition contributed $100M/year, a great library could be created in a decade. DPLA will cooperate with Europeana, which estimates 5 Million Euros/year for its operating costs.

3. The DPLA must respect copyright. The first copyright law struck a balance between authors and publishers by providing limitations on the term of copyright. The current limit tips the balance toward private commercial interests. Every book published since 1923 is now covered by copyright, regardless of whether it has been renewed, and many owners are unknown, which has led to the orphan works problem. The DPLA could try to reach an agreement between authors and publishers of books that have gone out of print.

4. The DPLA steering committee established a contest to develop a technical infrastructure. The technical subcommittee will develop a draft prototype to go into operation when the DPLA is launched.

5. A governance committee has only begun to study the administrative issues of the DPLA. The present interim leadership at Harvard will continue until the final DPLA comes into existence. It will serve a very broad and diverse community and is meant to serve the entire country, so it probably will not be at any elitist institution. Most people think it should not be part of the Federal Government, to keep it free from political pressures.
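
The cost figures in item 2 can be checked with a little arithmetic. The sketch below is only a back-of-the-envelope calculation, not anything the DPLA itself published: the dollar amounts come from the talk, while the function names, constant names, and the implied page count (derived here from Kahle's two numbers) are assumptions made for illustration.

```python
# Back-of-the-envelope check of the digitization figures quoted in item 2.
# Only the dollar amounts come from the talk; every name below is illustrative.

KAHLE_COST_PER_PAGE = 0.30          # dollars per page (Brewster Kahle's estimate)
HIGH_COST_PER_PAGE = 1.00           # dollars per page (the higher estimate)
LARGE_LIBRARY_COST = 300_000_000    # dollars to digitize a large library at Kahle's rate
FUNDING_PER_YEAR = 100_000_000      # dollars per year from the hypothetical coalition


def implied_pages(total_cost: float, cost_per_page: float) -> int:
    """Number of pages a budget covers at a given per-page cost."""
    return round(total_cost / cost_per_page)


def years_to_fund(total_cost: float, funding_per_year: float) -> float:
    """Years of funding needed to cover a total cost."""
    return total_cost / funding_per_year


# Page count implied by Kahle's figures (an inference, not a number from the talk).
pages = implied_pages(LARGE_LIBRARY_COST, KAHLE_COST_PER_PAGE)
# Cost of the same number of pages at the higher per-page estimate.
high_total = pages * HIGH_COST_PER_PAGE

print(f"Pages implied by Kahle's figures: {pages:,}")
print(f"Same pages at $1/page: ${high_total:,.0f}")
print(f"Years at $100M/year (30 cents/page): {years_to_fund(LARGE_LIBRARY_COST, FUNDING_PER_YEAR):.0f}")
print(f"Years at $100M/year ($1/page): {years_to_fund(high_total, FUNDING_PER_YEAR):.0f}")
```

Under these assumptions, $100M per year would cover one large library's worth of pages in about 3 years at 30 cents per page and about 10 years at $1 per page. The calculation covers digitization only and ignores operating costs.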

Rachel Frick, Director of the Digital Library Federation, summarized the operational plans of the DPLA for the next 18 months.

  • Where possible, existing free or open source code will be used.
  • The DPLA will be freely accessible for others to port or replicate.
  • Metadata is the core of the discovery framework. It will aggregate existing library data and operate in a global data environment. All metadata will be freely available except where it would violate personal privacy.
  • Content will incorporate all formats, not just books. It will begin with already digitized works in the public domain. It will grow with orphan works.
  • Tools and services like APIs will provide enhanced uses of the content. The platform will be open to public innovation and enable the creation of new tools and services. With Europeana it will share an interoperable data model and source code.
  • Community will be a participatory platform that supports users and developers who wish to reuse content and metadata.
  • A discovery layer will provide access to secondary sources.

More information is available on the DPLA website.

The DPLA beta-sprint was an aggregate of metadata from 1400+ collections in 44 states. Visit dpla.granger.illinois.edu to see it. 60 organizations submitted letters of intent, and several were chosen to demonstrate their systems at the DPLA plenary conference.

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor

My thanks to Carol Tenopir, University of Tennessee, for her contributions to this post.

 

Clifford Lynch (L) and Lee Dirks (R)

Clifford Lynch, Director of the Coalition for Networked Information (CNI), and Lee Dirks from Microsoft Research gave a wonderful presentation in the final plenary session on the first day of the Charleston Conference.

Lynch began by enumerating some serious problems with the present system of scholarly communication in science.  These are not economic problems. They include:

  • Scale: scientific publishing is getting bigger and bigger–a scientific paper is published every 1 or 2 minutes.
  • Speed: We are under constant pressure to make research and discovery move faster. Achieving speed is often at high cost. We have huge problems with filtering and validating work.
  • Access: One of the hopeful possibilities for getting a handle on these problems is doing computation on the literature. But getting access to enough data is difficult because it is in silos.
  • Communication: There is a growing disconnect with practices and norms in scholarly work and how communication is operating. An excellent book about this is The Fourth Paradigm, which is available for free download.

More and more science is data- and computation-intensive and relies on communications among geographically dispersed people. Some systems are starting to look at this–myExperiment is a system that lets researchers make their data accessible for sharing.

We must get past designing articles with the same old presentation of science where there are major issues of reproducibility, adding other people’s work, and recognition of data as a primary input and output of scientific inquiry. We need to manage data and integrate it into the traditional scholarly literature.

Lee Dirks from Microsoft Research followed and enumerated 7 platforms for open research that have started to emerge in the last 12 to 18 months.  They facilitate collaborative research with academia, and particularly scholarly communication. Many of them are open source, and have an API for sharing.  Here are Lee’s slides describing each one (I thank him for providing me with these copies and giving me permission to post them here).

These platforms all integrate into the scholarly communication life cycle, as shown here.

Scholarly communication life cycle

 

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor

 

Jennifer Bazeley and Aaron Shrimplin

Aaron Shrimplin and Jennifer Bazeley from Miami University Libraries (the Miami in Ohio, not the one in Florida) have evaluated their relationship with e-books, conducting a survey of 735 users’ attitudes toward e-books. Respondents fell into 4 classes:

  • Book lovers have an inherent affinity for printed books.
  • Technophiles are interested in the possibilities of new technology for reading books.
  • Pragmatists see the pros and cons of both print and electronic forms of books.
  • Printers prefer print books and have specific difficulties with the usability or readability of e-books.

The library is planning on ramping up its e-book collections, but there are many issues and more questions than answers. So the library conducted a preliminary study of the 2008 Springer e-book collection and its use over a 3-year period.

The Springer collection is divided into 12 subject collections. The e-books have no DRM, and the owner has perpetual access–an attractive consideration. E-books and journals are on the same site and are searchable together. At Miami, the collection can be accessed through the OhioLINK electronic book center (EBC) or directly through Springer’s site. The study compiled usage of 2,529 e-books published from 2008-2010 used on both platforms. Only 23% of the titles had been used, and if this trend continues 54% of Miami’s e-books will be unused after 6 years. Usage followed the well known 80/20 rule (Pareto Principle): 20% of used titles accounted for 80% of the downloads. The Long Tail effect was also observed: of the infrequently used titles, about half had 3 uses or fewer. A few high-use titles dominate the statistics; the most used title had 28% of the total uses over 3 years. Professional books, monographs, and especially textbooks accounted for the most usage. Computer science was the most heavily used subject area. Past usage predicted future usage; trends observed in 2008 continued in 2009 and 2010.

Platform matters: e-books that are cross-searchable with journals are appealing, especially for pragmatists and technophiles.
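
Two of the figures above can be reproduced with simple arithmetic. The sketch below is an illustration only: it assumes the 54% projection is a straight-line extrapolation of the observed 23% (the presenters' actual method was not described), and the download counts are invented to show how an 80/20 check works; they are not the Miami data. All function names are hypothetical.

```python
def unused_share_after(years: float, used_share_observed: float, observed_years: float) -> float:
    """Linear extrapolation: assume the same share of titles gets a first use in each period."""
    used = used_share_observed * (years / observed_years)
    return max(0.0, 1.0 - used)


def top_share_of_downloads(downloads: list[int], top_fraction: float = 0.2) -> float:
    """Share of all downloads that comes from the top `top_fraction` of *used* titles."""
    used = sorted((d for d in downloads if d > 0), reverse=True)
    if not used:
        return 0.0
    top_n = max(1, round(len(used) * top_fraction))
    return sum(used[:top_n]) / sum(used)


# 23% of titles used in 3 years, projected to 6 years under the linear assumption.
projected_unused = unused_share_after(6, used_share_observed=0.23, observed_years=3)
print(f"Projected unused after 6 years: {projected_unused:.0%}")

# Invented per-title download counts (zeros are unused titles), for illustration only.
hypothetical_downloads = [500, 300, 50, 40, 30, 25, 20, 15, 12, 8, 0, 0]
share = top_share_of_downloads(hypothetical_downloads)
print(f"Top 20% of used titles account for {share:.0%} of downloads")
```

A straight-line projection is the simplest reading of "if this trend continues"; a model in which first-use rates decline over time would leave a larger share of titles unused.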

Don Hawkins
Columnist, Information Today and Conference Circuit Blog Editor