The CPC team is excited to be headed west! Not permanently, but we are packing our bags for the Digital Library Federation (DLF) Forum, being held in Vancouver, British Columbia, in October.
Connecting Presidential Collections was selected as a Snapshot project update. In 2013, we gave a poster presentation on CPC at the DLF Forum in Denver, CO. At that point, the project was mostly theoretical: we had only created a very basic beta website (which looked very different from the site today). We had 6 partnerships and only about 25,000 items in CPC. Today CPC has 12 partnerships with more than 260,000 items covering 32 of the 43 presidents. We have come a long way!
Of course, we will cover these updates in our talk, but more importantly, we will talk about the surprises that have come up and the lessons we have learned in the 2 years we have been working on this 3-year grant project sponsored by the IMLS and the Miller Center. We will also cover the work we have been doing to smooth out the rough (and varying) edges of our partners’ metadata so that it can all play nicely together in our Solr index.
We will post the slides from our talk after October. Hope to see some of you there!
We’re hard at work these days preparing to showcase the CPC project at the 2014 Presidential Sites and Libraries Conference in Little Rock, June 2-5. We’ll be presenting a breakout session at 11am on Tuesday, June 3, and are so excited to share our work and learn about other projects.
(President Clinton is presenting that same evening–we do hope he won’t be intimidated by our work!)
Today is a big day for our project: Connecting Presidential Collections is up and running at PresidentialCollections.org. Although the beta website has been live for about 2 weeks, we are just announcing it today after some last-minute tweaking. We are excited about this first step in what we hope will be a lasting project.
This launch marks the end of the IMLS-funded grant project that we began last year. We will now begin applying for more funding in order to grow the website with more partner organizations and digital collections. We also hope to fine-tune the functionality of the site in future development. We have learned many lessons in this process (a post on that topic will follow soon, probably more than one), and there are changes and tweaks we would like to make. For example, the facets on the site are currently not very helpful to users because the metadata varies so much from partner to partner. We will need to standardize the facet values so that they can be applied consistently across the metadata and give users meaningful categories to filter on.
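To make the facet problem concrete, here is a minimal sketch of one way such standardization could work: a lookup table that folds partner-specific spellings into a single facet label before counting. The alias table, the `type` field name, and the labels are all hypothetical and are not taken from the CPC site or its partners' data.

```python
from collections import Counter

# Hypothetical alias table: variant values found in partner metadata,
# lowercased, mapped to one shared facet label.
FACET_ALIASES = {
    "letter": "Correspondence",
    "letters": "Correspondence",
    "correspondence": "Correspondence",
    "speech": "Speech",
    "speeches": "Speech",
    "photo": "Photograph",
    "photograph": "Photograph",
    "photographs": "Photograph",
}


def normalize_facet(raw_value, aliases=FACET_ALIASES, fallback="Other"):
    """Return the shared facet label for a raw partner value."""
    key = raw_value.strip().lower()
    return aliases.get(key, fallback)


def facet_counts(records, field="type"):
    """Count records per normalized facet label."""
    return Counter(normalize_facet(r.get(field, "")) for r in records)


if __name__ == "__main__":
    sample = [
        {"type": "Letters"},
        {"type": " letter "},
        {"type": "Speech"},
        {"type": "Map"},  # not in the alias table, so it falls into "Other"
    ]
    print(facet_counts(sample))
```

Anything the table does not recognize lands in a catch-all bucket, which at least makes the unmapped values easy to find and review.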
But for today, we will bask in our beta, warts and all!
We are learning so much in the final 6 weeks of this grant project. One lesson is that an XML mapping built from sample data does not carry over cleanly to the actual data. I created the initial XML mapping document from sample data, but it did not match the maps that resulted from the actual data. In many cases, the differences were not major, just tweaks here and there. In some cases, the partners sent me data in a different format or had refined it based on our conversations about the sample data.
Even so, I do not regret the sample mapping, because it taught me a lot about XML and Dublin Core. There are some real limitations with Dublin Core. For example, it does not have an obvious field for transcripts of letters or speeches, or even a full-text field. It also does not have a natural field for To: and From: in correspondence. We were able to adapt it in most cases, but we also had the luxury of not including metadata that did not fit the Dublin Core fields. Since we are offering users an abbreviated record for any given digital object and we want them to go to the partner organizations’ websites for more information, I had no problem leaving out metadata that did not fit within the Dublin Core fields.
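As a rough illustration of the mapping approach described above, the sketch below maps one partner's XML record onto Dublin Core field names and silently drops anything without a mapping. The partner element names (`doc_title`, `recipient`, `transcript`, and so on) and the sample record are invented for this example; the real partner formats differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-partner map: partner element name -> Dublin Core field.
# Elements with no entry here (for example a recipient or a transcript)
# are simply left out of the abbreviated record.
PARTNER_MAP = {
    "doc_title": "title",
    "author": "creator",
    "summary": "description",
    "repository": "publisher",
    "date_written": "date",
    "url": "source",
}


def to_dublin_core(record_xml, field_map=PARTNER_MAP):
    """Map the child elements of one XML record to Dublin Core fields."""
    root = ET.fromstring(record_xml)
    dc = {}
    for child in root:
        dc_field = field_map.get(child.tag)
        text = (child.text or "").strip()
        if dc_field and text:
            # Dublin Core fields are repeatable, so collect values in lists.
            dc.setdefault(dc_field, []).append(text)
    return dc


SAMPLE = """<record>
  <doc_title>Letter to a friend</doc_title>
  <author>Hypothetical Author</author>
  <recipient>Hypothetical Recipient</recipient>
  <transcript>Full text of the letter</transcript>
  <date_written>1801-03-04</date_written>
</record>"""

if __name__ == "__main__":
    print(to_dublin_core(SAMPLE))
```

Keeping the map as plain data per partner means that when the actual data differs from the sample data, only the table needs tweaking, not the code.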
Still, as this project goes forward, we might need to consider whether Dublin Core is the best choice as our metadata standard. We have had some discussions with our consultants about whether Dublin Core really works well with Solr. One consultant suggested that we use Solr’s native data structure because it seemed well matched to our data. We had long since decided to go forward with Dublin Core, so for the purposes of this beta, we kept to our original path. But I think it is a good idea to revisit that assumption in subsequent phases of this project to see if we might want to switch from Dublin Core to Solr’s native data structure or even something else.
To provide some measure of comparison, the fields that our consultant suggested using included:
full_text – anything you want to be used in searching.
The fields that we are using are:
So clearly we have more fields to include using Dublin Core, but the only fields used uniformly across all partner collections were Title, Creator, Description, Publisher, Date, and Source. I am not sure how much the other fields add. Perhaps going forward we should consider using minimal metadata fields so that users are more likely to click through to partner organizations. However, I think it is a balancing act: providing enough information about a digital object to be useful, but not so much that we give it all away.
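One way to check which fields really are used across every partner is to intersect the sets of populated fields per collection. The sketch below is only an illustration: it assumes each record is a dictionary of Dublin Core field names to values, and it treats a field as "used" by a partner if at least one of that partner's records has a non-empty value for it. The partner names and records are made up.

```python
def uniformly_used_fields(collections):
    """Return the fields populated in at least one record of every partner.

    `collections` maps a partner name to a list of record dicts
    (field name -> value). Empty values do not count as used.
    """
    common = None
    for records in collections.values():
        used = set()
        for record in records:
            used.update(field for field, value in record.items() if value)
        common = used if common is None else common & used
    return sorted(common or [])


if __name__ == "__main__":
    # Hypothetical partners and records, for illustration only.
    example = {
        "partner_a": [{"title": "A letter", "creator": "Someone", "rights": ""}],
        "partner_b": [{"title": "A speech"}, {"creator": "Someone else", "subject": "Trade"}],
    }
    print(uniformly_used_fields(example))
```

A stricter definition, such as requiring the field in every record rather than in at least one, would shrink the list further, so the choice of definition matters when deciding which fields to keep.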
One of the many parts of this project that I am working on is trying to find a developer to help us get the beta product up and running. Although the Miller Center has an awesome web team with a great web designer and an incredible web developer, we need to find someone outside the Miller Center to create the beta product for this project. Our web developer is working on another project that is taking most of his time, and our web designer is kept very busy with all the internal work here. So I am looking for a developer.
Since we are considering using Blacklight and Solr, we are looking for a developer who knows those two products or is proficient with Ruby on Rails. I have talked to a few good candidates, but it seems that life in the world of developers is busy, and they haven’t had time in their schedules for our project. Luckily, the situation is looking up, as we have a meeting with a promising developer tomorrow.
I am cautiously optimistic and will write more as decisions are made.