Category Archives: cataloging

More on Serials and Linked Data

Last year I wrote an article on serials, FRBR, and linked data in the Journal of Library Metadata. My main goal was to re-think how libraries can make connections between articles and the journals in which they’re published using linked data. I used the FRBR model to link the article and the journal together at the Item level, envisioning both the article and the journal being positioned as Works.

I never felt entirely happy with my model, but I couldn’t figure out a better way at the time. I recognized several months ago that my thinking, when I wrote the article, was limited because I was focused on trying to create some kind of symmetry in the model.

Recently, I came up with another way to think about connecting journals and their articles, still using the FRBR model, and I think this makes a lot more sense. In my original article, I looked at the journal from a FRBR perspective and saw each individual issue of a journal as the Item in the FRBR hierarchy. But it was awkward, and I don’t think it worked particularly well.

In re-imagining this, however, I realized that an individual issue of a journal is really an expression of that journal.

[Figure: Serials FRBR model linking articles and journals, showing the FRBR hierarchy for a journal and an article in that journal]

The journal itself (“The New Yorker,” “The Paris Review,” “The New England Journal of Medicine”) is a work; it is a conceptual thing that has no expression outside of the issues that are published as part of its run. Each issue that is published is another expression of that journal. Similarly, if you think of an article as a work, it is published as an expression in a particular issue of a journal.
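To make that concrete, here is a minimal sketch of the model in Python using rdflib and the FRBR Core vocabulary. The URIs, entity names, and property choices below are my own illustrative assumptions, not something from the published article.

```python
# A minimal sketch of the proposed model using rdflib and the FRBR Core
# vocabulary (http://purl.org/vocab/frbr/core#). The URIs and property
# choices here are illustrative assumptions, not a published profile.
from rdflib import Graph, Namespace

FRBR = Namespace("http://purl.org/vocab/frbr/core#")
EX = Namespace("http://example.org/")  # hypothetical local namespace

g = Graph()
g.bind("frbr", FRBR)

journal_work = EX["journal/the-new-yorker"]            # the journal as a Work
issue_expr = EX["journal/the-new-yorker/2013-03-04"]   # one issue as an Expression
article_work = EX["article/some-article"]              # the article as a Work
article_expr = EX["article/some-article/as-published"] # its published Expression

# The journal Work is realized through each published issue (Expression).
g.add((journal_work, FRBR.realization, issue_expr))
# The article Work is realized through the version published in that issue.
g.add((article_work, FRBR.realization, article_expr))
# The published article is a part of the issue in which it appears.
g.add((issue_expr, FRBR.part, article_expr))

print(g.serialize(format="turtle"))
```

The Turtle output makes the two parallel Work-to-Expression links, and the part relationship between the issue and the published article, easy to see at a glance.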

I think this model works much more organically, and makes a lot more sense than the model I was originally trying to force into place because I was fixated on symmetry.

The other question I asked in the article was how we can deal with journal changes using linked data in the FRBR model. Merges, splits, and title changes can still create problems for someone in a library trying to find a particular resource. But I think linked data itself can solve this problem, without us needing to change the FRBR model by creating something like “super works” or “journal families.” We have a good way of linking former and succeeding titles together, but it doesn’t work as well when our metadata is contained in independent catalogs. However, if our “records” exist on the web and are openly linked, we can link to a former or succeeding title even if it’s not held in our own unique collection.
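As a rough illustration of that last point, here is a small sketch (again in Python with rdflib) of linking a succeeding title to its former title using Dublin Core’s dcterms:replaces, where the former title is identified only by an external URI that we don’t host. Both URIs are placeholders I made up.

```python
# A minimal sketch of linking a succeeding title to the title it supersedes
# with Dublin Core terms, even when the former title lives in someone else's
# data. Both URIs below are made-up placeholders.
from rdflib import Graph, Namespace, URIRef
from rdflib.namespace import DCTERMS

EX = Namespace("http://example.org/")

g = Graph()
g.bind("dcterms", DCTERMS)

current_title = EX["journal/new-title"]
# Hypothetical URI published by another library or a shared bibliographic hub.
former_title = URIRef("http://other-library.example.edu/journal/old-title")

# dcterms:replaces points from the succeeding title to the one it supersedes;
# a consumer can follow the link whether or not we hold the older run ourselves.
g.add((current_title, DCTERMS.replaces, former_title))

print(g.serialize(format="turtle"))
```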

I don’t know if an idea like this will be picked up by the people who are currently arguing about the models we should use in a linked data environment. I suspect it’s too simplistic for them, which is exactly what makes it appealing to me; catalogers seem to like to make things as complicated as possible. But I felt that the niggling annoyance about my previously published model disappeared when I started thinking about linking resources together this way.

I’d love to hear your thoughts. Do you think this model makes sense?

The Catalog is Dead

In the five years I’ve been a part of libraryland, one conversation seems never to die: How can we improve our online catalogs? There are so many ways to approach this question, from a user-interface perspective or a metadata perspective, a software architecture perspective or a task-based perspective. But there is an assumption underneath all of them that not enough people have questioned, and that I think deserves more consideration. Do we even need our online catalogs anymore?

The primary issue raised around our catalogs is their general user-unfriendliness, from the terminology that we use to the overload of hard-to-decipher information that they display. Because of how library metadata is organized and siloed, our catalogs don’t provide access to the entirety of the resources available, and while modern discovery systems go some way toward alleviating that problem, they aren’t perfect and often just add to the chaos in the discovery landscape (especially since we’re all so keen to brand them and give them unique names).

The fact is that when a person is looking for information, they aren’t likely to go to the library first. We might have just the resource they’re looking for, but if that resource doesn’t surface in a Google search, they aren’t going to know about it. Students might be more likely to turn to the library for information, largely because they’re encouraged to do so by their professors, but they’re still much more likely to start online, outside of the library’s web presence. Clearly, what we need to do is to get our information out of our little corners of the web and into the wider web.

This is the promise of linked data. Yes, it’s The Buzzword of 2013 already. Every conference I attend has double the number of linked data presentations as the one before. And a lot of people still seem kind of fuzzy on what a linked data system would look like for libraries. I think part of the problem is that we’re still thinking about our System, and the promise of linked data is that the closed library discovery system could finally disappear.

I’m not saying our internal management systems would go away. Obviously, we still need those. But our users don’t need those, at least not for discovery. If we use linked data principles to describe our holdings, ALL of our collections and not just the items that are in our catalogs, that information will finally be able to be surfaced through web searches. Imagine searching for a book title in Google, and seeing not only the familiar Amazon link that pops up, but also a link to your local library, with a call number, availability information, and maybe even a link to Request an item, right there from the search results screen. Imagine searching on a subject topic using your favored search engine, whatever it might be, and finding a handful of resources at your local library right near the top of the list, including books, available journal articles, and archival material.

So far the focus of discussions around linked data has been on putting bibliographic metadata on the web. And that’s great, but once that’s done, it’s done. We don’t all need to do that. What we need to do is put our ownership information on the web. We need to link our ownership and availability information to a centralized bibliographic database (or a few), and make it available on the web for indexing by search engines.
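As a very rough sketch of what that ownership and availability information might look like on the web, here is a schema.org-flavored JSON-LD holdings statement built in Python. The identifiers, the hub URI, and the property mappings (especially using sku for the call number) are all illustrative assumptions on my part, not an established library profile.

```python
# A rough sketch of a machine-readable holdings statement: schema.org-flavored
# JSON-LD tying a centrally described Book to one library's call number and
# availability. Identifiers and property choices are illustrative assumptions.
import json

holding = {
    "@context": "http://schema.org",
    "@type": "Offer",
    "itemOffered": {
        "@type": "Book",
        # Points at a shared bibliographic description rather than
        # repeating it locally (placeholder URI).
        "@id": "http://bibliographic-hub.example.org/work/12345",
        "name": "Example Title",
    },
    "offeredBy": {
        "@type": "Library",
        "name": "Example University Library",
    },
    "availability": "http://schema.org/InStock",        # i.e., on the shelf
    "sku": "Z666.5 .E93 2013",                           # call number, loosely mapped
    "url": "http://library.example.edu/request/12345",   # request link
}

print(json.dumps(holding, indent=2))
```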

Why do we still force our users to come to our special web sites to find what they are looking for? Why do we still keep thousands of copies of the same bibliographic metadata in thousands of databases around the world? Our data isn’t findable or usable right now by the people we’re ostensibly collecting it for. Our primary concern, as we talk about updating our metadata and our systems and our bibliographic framework, should be how we can get all of this wonderful stuff that we own, and provide to people for free, out on the web where they are already living and working. Let’s stop trying to reinvent our own little corner of the web, and join in the game where everyone else is playing.

The Library of the Future?

During the closing keynote speech at LITA National Forum this year, Sara Houghton encouraged us to engage in a little thought exercise. She wanted us to imagine the ideal library of the future, without the limitations of what we believe is possible or what we’re currently doing. We should, she suggested, set aside some time to think about what we would like a library to be 20 or 30 or 40 years from now.

There are many different kinds of libraries, and their futures will look vastly different from one another. My experience has been almost exclusively in academic libraries since I entered college at 18 years old. I’ve worked in academic libraries since 2007, and yeah, I have a lot of ideas about what the ideal academic library of the future should be. I work in technical services (metadata and systems), so I do tend to think about how the background systems will enable public-facing services. But here, I don’t want to be limited by what I think the technology can do, so I’m focusing more on what we collectively will do, in the big picture.

To begin, I do believe that the library of the future will still inhabit a physical space. Our libraries will be beautiful, welcoming, well-lit spaces where students will come to work independently and in collaboration with other students. Library spaces will be flexible, to accommodate groups working together and students quietly reading and studying. Our library spaces will still hold physical collections, and patrons will still come into the library to access those collections, though they won’t be as large, or used in exactly the same ways. Our physical collections will likely be historical collections, special collections, and archives. Print collections may or may not circulate, and librarians will be on hand to help students new to primary source and historical collections access and interpret the materials they are working with (for example, historical government documents and maps, which won’t be searchable and manipulable in the ways students are used to). Physical collections will more likely than not be unique to a library, and will likely also have been digitized (or be in the process of being digitized).

Reference and Research Librarians will be an important part of the library services, though traditional reference desks may not. Students and researchers will have a relationship with their department’s librarian from the moment they enter the institution, and their librarian will be available in a variety of ways (email, text, chat, or whatever new communication mechanisms pop up) to assist with research, data organization and management, and as a liaison to other key services like writing centers and tutors. Librarians will frequently meet with students and researchers in their offices and in other locations on campus (computer labs, cafes, faculty offices). Librarians will teach information literacy embedded in the curriculum, from introductory composition classes to senior year thesis seminars. Rather than offering one-shot classes and hoping for the best, the information literacy curriculum will be built into the overall learning curriculum, and will expand over the course of a student’s time at the university, teaching the skills needed in a layered and integrated way. Librarians will also offer faculty workshops on data management and data and information literacy. Where possible, librarians will work collaboratively with faculty to enable faculty to teach information literacy to students, in a train the trainer model.

Most current resources will be accessed digitally, including monographs, fiction titles, journals, data sets, current government documents, and reference resources. All members of the university community will read on the digital device of their choice, and digital titles will be available in many formats to accommodate the technology used by the community.

Nearly all academic resources, like University Press monographs, journals, data sets, government documents, and reference sources, will be open access. The digital files themselves will rarely be hosted in the library itself or “owned” by the library, unless the library itself has digitized or published the resource. Users will be able to freely download these resources, read them on their device of choice, and annotate them however they wish. The library will serve as a curator of open access resources. Librarians who are familiar with the school’s fields of study and the resources required for teaching and research will build the library “collections” by curating links to resources hosted elsewhere, and providing access to resources hosted locally.

For titles that are not open access (contemporary fiction and non-academic titles), libraries will purchase (not lease) access to digital files, as well as a copy to be archived locally for preservation. Costs will largely be based on size of institution and download statistics, but cost models will be transparent and consistent. This will not be the bulk of future academic library collection development.

Funds that have previously been spent licensing access to academic journals will be spent instead on funding open access publication. Researchers at a library’s institution will apply for library grants to pay their own open access fees for publication. Libraries will also act as publishers, either in conjunction with a University press or on their own, if their institution doesn’t have an existing press. Libraries will host open access journals that are managed and edited by faculty, and will have publishing departments that acquire, edit, and provide access to journal and monograph titles in an open access model.

Libraries will also be the homes of subject repositories where feasible. Rather than merely collecting research, libraries will become the publishers of research in a global network of open access research publications and repositories. Libraries will also house institutional repositories, although these will focus almost entirely on preserving the administrative records of the institution, rather than the research outputs.

In addition to the Research Services and Publishing departments, libraries will have a Collection and Curation department. These librarians will be responsible for maintaining the physical collections, but also for curating links to appropriate resources. They will be responsible for ensuring access to external servers and managing relationships with other libraries and publishers. Additionally, they’ll be responsible for creating the metadata for locally hosted and published resources, digitizing local resources, and ensuring that access to local servers and digital collections is stable.

The ILS of the future will be almost unrecognizable compared to its current incarnations. It will integrate resources and metadata from the web, serve local metadata and resources back to the web, and act as a workflow manager and statistics-gathering tool.

Cataloging will be a very different activity, and will again consist of curating and collecting metadata from around the web to provide access, as well as creating local metadata for digital and print resources and making it available on the web. Discovery will happen through locally-aware search engines. Users will set a library preference in their browsers or with their preferred search site that will prioritize the resources curated by their library. But since most library resources will be freely available, discovery doesn’t necessarily have to happen through the library, and users can choose to prioritize a library entirely separate from their campus.

In the future, academic libraries won’t be stand-alone institutions, providing collections solely to their own patrons. The Library, instead, will be a global network of information, publications, and research, each library contributing to the whole by publishing, digitizing, and creating metadata. Our roles won’t be as gatekeepers, but as creators of scholarly resources and facilitators of scholarly communication. Our local services will consist of assistance to researchers, helping them gather the information they need to do research, and then helping them find the right place to publish it. Libraries, not for-profit publishers and journal aggregators, will power the scholarly communication engine. Our role will be not only to provide access, but to ensure preservation of the scholarly record, in all formats.

Of course there are details that I haven’t covered here. This is, after all, just a thought exercise. But if I were to use this vision as a source for strategic planning, I’d probably think seriously about how the library could become involved in scholarly communication and publishing at my institution. I’d put a lot of energy into re-modeling Research services. And I’d be actively engaged with the library community in building new models for metadata discovery and cataloging that are not based on local systems and local records. I’d apply for digitization grants to start digitizing local special collections. And I’d be actively engaged in global work to change the tenure model and reform copyright.

What is your ideal future library? Are there things in this vision of mine that make you cringe? That you absolutely can’t imagine happening? What do you think we should do now to create the library vision of your dreams?

Thoughts after ALA Midwinter

I got back from ALA Midwinter on Tuesday night, and after taking a day to ponder all the things I heard and discussed over the long weekend, I wanted to quickly write up a few observations and thoughts. I’m trying to take an overall approach, rather than detailing each session I attended, as I have in the past. I didn’t go to as many presentation sessions as I usually do: I had some committee meetings to attend, and I was trying to do a much better job at balancing conference stuff with my own need for down time.

There were three sessions I attended that shaped my general impressions of what’s going on right now in libraryland: The Cataloging Norms Interest Group session with Diane Hillman, Susan Massey, and Roman Panchyshyn, OCLC’s presentation on the changes they’re making to FirstSearch, and another OCLC presentation (by Kathryn Harnish) on the WorldShare platform and the underlying theories behind OCLC’s strategic vision.

It should come as no surprise to anyone that the word that’s been ringing through my mind since I came home is “change.” Yes, everyone is talking about change. Because, duh, we are all going to be facing a crap ton of it in the coming years. The timing of the Harvard Libraries’ announcement about restructuring was kind of fortuitous: Many librarians were talking about change and change management all weekend. Nearly every presentation I heard during the meeting was, at its root, about change.

Diane Hillman gave a great talk to the Cataloging Norms Interest Group about linked data, “From Records to Statements.” And what she said was, basically, “Hey catalogers, get ready, because everything you know will be different.” I think she did a terrific job of explaining how the future of data differs from our existing practices, and why our existing practices won’t serve us well going forward. Cataloging isn’t going to be about record creation and management anymore, and catalogers need to adapt and learn new skills. Metadata work going forward is going to be about aggregating data, working with programmers and developing new methods for handling and using data, modeling and documenting best practices, and evaluating and analyzing data. We’ll be working to create new tools to work with large amounts of metadata: We can’t think about bibliographic metadata on a piece-by-piece basis anymore. We need to make massive changes in our basic conceptual models, and the faster we do it, the better.

The OCLC presentations I attended were also squarely focused on change: It sounds like OCLC itself is heading in a new strategic direction, and I think that’s a great thing. Kathryn Harnish’s presentation on the WorldShare Platform was well done and interesting, and I’ll probably end up talking more specifically about some of the things she discussed in a separate post. But the big takeaway for me is that OCLC is shifting the frame around what they’re doing. They’re thinking about data on a large scale, and about how libraries can use that data in new ways, to improve effectiveness and to cooperate in ever more meaningful ways. I think it’s fantastic. The only thing I have to say, though, is that in both the presentations and my one-on-one conversations with OCLC folks, I wish there had been less jargon and more solid information. They could definitely work a bit on transparency. As I like to remind myself, OCLC is OUR organization; it’s our cooperative. It would be nice if it didn’t sometimes feel like they were trying to sell it to us.

I think most of the librarians I know are aware of the need for significant transformations in the way we work. We have, after all, been talking about this for a long time. Libraries are notoriously slow about adopting new practices, and this worries me. We do not live in a slow world. But I feel hopeful that the constant murmur around change I heard at ALA is a sign that we know we have to pick up the pace, get on the ball, get our shit together, whatever metaphor you prefer for hurry-up-and-make-good, people!

Some things I think librarians should do in the coming months to start getting themselves and their organizations ready for change (self included):

  • Start learning about linked data and RDF. The W3C Library Linked Data Incubator Group Final Report is a great place to start.
  • Read The Age of the Platform by Phil Simon, and think about how data management works outside of libraries (PS – I tried to link to WorldCat for that book, but I couldn’t find it using a keyword, title, OR author search; if OCLC can’t fix stuff like that, their ideas about platform-driven development are kind of meaningless.)
  • Learn about change management. There are best practices, no matter your role or position in an organization.
  • Start thinking about your own skills and strengths, and your weaknesses. Come up with a plan for learning something new this year. Codeyear has been great fun for me so far. Position yourself well for the changes that are bound to come in your organization, rather than waiting for some kind of training to come from on high.
  • Read about RDA, if you haven’t already. Even if you’re not a cataloger, it will help to understand how library metadata is being conceptualized.

These are just a few things off the top of my head that I want to do this year; I’m sure some of you have more and better ideas for how you are getting ready for big ol’ fancy changes in libraryland. I’d love to hear them. Are there other big themes you’re hearing and seeing in the profession right now? If you went to ALA, what are you thinking about this week, now that you’re home?

Know Your MARC

I’m deep in the midst of a project that involves dealing with a lot of MARC records, and I know I’ve said it many times before, but seriously? How is it possible that this “standard” involves so many un-standard elements? Sure, a lot of the core data really is in the same place in every record (unless there have been some serious cataloging snafus in a library over a long period of time, which, well, isn’t that infrequent). But there is still so much digging and poking and searching I have to do to find things like local bibliographic record numbers and holdings data. Some of this has to do with variations among systems. Some has to do with libraries where systems weren’t well implemented. Some has to do with indifferent catalogers. Sad, but true.

What has really been surprising, though, is how often librarians don’t really know what their MARC records look like. OK, I guess it’s not that surprising when you think about it. Librarians hardly ever work with raw MARC records. We work with library systems and software like Connexion that mask the MARC behind graphical interfaces and user-friendly field labels. In fact, one of my biggest surprises when I got my first library systems job was how difficult it is to look at an actual MARC record, or to see a MARC record translated into something human readable. We just don’t have a lot of software that does this.

Librarians also often don’t know what their library system will export. You’d think a system would just export everything. Behind the scenes, those systems should be translating various data fields into MARC tags, and every piece of data you’ve entered into a record should come out somewhere in MARC, right? Not so, my friends. Different systems are set up to export different pieces of data, and sometimes to export different data in different ways for different purposes. And then, of course, some systems force you to pay extra if you want to be able to export the data you need, if it wasn’t set up that way from the beginning. Brilliant.

If you’re a technical services librarian, and you’re not 100 percent sure what your MARC records look like or what your system is capable of spitting out, I say go ahead and experiment. In fact, I hope anyone who comes across this post and doesn’t know what their data really looks like will immediately start exporting whatever they can. Download something like MarcEdit (which is, actually, the only piece of relatively user-friendly software I know of that will let you look at MARC records in a human-readable way). Export whatever you can, in as many different ways as you can. Use MarcEdit to convert it to MARC Maker format (a readable mnemonic format designed by the Library of Congress) and start poking around. You can’t break anything once you’ve pulled the records out. You might be surprised what you can (and can’t) get out of your library system.
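If you’d rather poke at the file programmatically, a few lines of Python with the pymarc library will do the same kind of thing. This is just a minimal sketch, assuming your export landed in a binary MARC file I’m calling export.mrc.

```python
# A minimal sketch for eyeballing an exported MARC file with pymarc,
# assuming the export produced a binary MARC file called export.mrc.
from pymarc import MARCReader

with open("export.mrc", "rb") as fh:
    reader = MARCReader(fh)
    for record in reader:
        if record is None:
            # Skip anything pymarc couldn't parse.
            continue

        # Printing a record gives a human-readable, mnemonic-style view,
        # one tagged field per line.
        print(record)

        # Spot-check the fields you care about, e.g. whether the export
        # actually carried over any 852 holdings/location fields.
        for field in record.get_fields("852"):
            print("Holdings:", field)
        print("-" * 40)
```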

There is no excuse for us to be so clueless about our own records, and the tools we have available to deal with them. Looking at your records outside of your ILS is the first step to really understanding how shareable and usable they are (or aren’t). Once we know what we’re working with, we’re much more capable of knowing how to make it better.