Almost a week after Code4Lib 2013, I’m still not sure it has all sunk in. Every year I look at the conference program and wonder if the sessions will actually interest me, but I go anyway and get blown away. This year I was even more excited, since I was the program committee lead, helped with sponsorship, prepared a lightning talk, and decided to be the opening MC. Continue reading “Code4Lib 2013: Reflection & Thoughts”
Tag: c4l13
Code4Lib Day 3: Morning Notes
Hands off! Best Practices and Top Ten Lists for Code Handoffs
- Naomi Dushay, Stanford University Library
Code handoffs are never smooth. Ever.
Ratio of time spent reading vs. writing code: 10:1.
The Truck Test
- what if you were run over by a truck and someone else had to take over?
- need to code so a stranger can read it and understand it
The Boy Scout Rule
- “Leave the code cleaner than you found it”
- need to maintain your code
- otherwise you’re part of the problem
It’s More Than Code
- naming should make sense: servers, scripts, everything
- config files should not point to boxes
- tools chosen can be the problem
- should you be rolling this on your own?
- something similar has probably been done before
- some think if you write code really well, then you don’t need to comment. Not true.
- documentation and comments are there to inform, explain, clarify, and warn; they also need maintenance
- READMEs should make sense
- tests are code, should also think about readability of these
- failures should be addressed ASAP
- KISS – Keep It Simple Stupid
- DRY – don’t repeat yourself
Readable Code
- follow conventions
- meaningful names: variable, method, class, file
- small, single purpose methods
Cleverness that reduces readability isn’t clever.
Sources
- Clean Code: A Handbook of Agile Software Craftsmanship by Robert Martin
- Refactoring: Improving the Design of Existing Code by Martin Fowler et al.
The Care and Feeding of a Crowd
- Shawn Averkamp, University of Iowa
- Matthew Butler, University of Iowa
DIY History
- transcribe items in collection
- omeka + scripto + mediawiki
- still in development: want to add social media aspects/integration
- err sorry, brain temporarily sort of died. See slides and I’ll go get a cookie to recharge
How to be an effective evangelist for your open source project Creating a Commons
- Bess Sadler, Stanford University Library
Lost a member of our community this year: Aaron Swartz
- helped to define Creative Commons licenses
- 3 versions: machine, human, and lawyer readable
- code4lib should apply the same principle
- shared engineering practices are becoming more and more important
- investment that’s worth it
- please get code contributors to sign a contributor license agreement
- can determine whether contract allows participation
- don’t want to lose informal sharing, but law cases have happened and we need to protect ourselves
Building Code
- what are we building?
- we are building a culture, a commons
- Fedora4lib – came early and rented a house together
- Hydra = a community
- cultivate a place where we can
- teaching at Ruby on Rails workshops – too big a job to leave to a small group of people
Hacker Epistemology
- how is knowledge acquired?
- how do we decide what’s true?
- collaboration with disregard of conventional mental thinking
Building the Community
- need to expand and include everyone who wants to join
- more steps in building a more inclusive community
- adopted a code of conduct, because it was a good idea and makes an explicit statement
- need to let other people know that we’re trying
- “We are all imposters.” – just acknowledge it, we all feel that way, but bolster ourselves
- allow ourselves to be seen even when there’s no guarantee of success
- we can support each other
- cannot be accomplished alone
- want to craft a process for submitting issues
Thank you, code4lib!
The End
And that’s it! Until 2014.

Code4Lib Day 3: Lightning Talks
Mark Matienzo – Wielding the Whip: Affect, Archives, & Ontological Fusion
- need to talk about emotion – a lot of things going on in my life
- inspired by Archives by Emotion
- why do we not acknowledge that archives are based on emotion, stories
- facebook ‘like’ is not an emotion
- losing connections to materials and history when not thinking about the platform and how that affects our stories
- can we write stories about our collections?
- should be using existing linked data to make those narratives
- let’s build this
- Slides
- Full Write-up
Jason Casden and Cory Lown – My #HuntLibrary
- student engagement platform
- how do students use the space?
- what do they choose to document? – using instagram
- student-driven archival selection
- making use, harvesting social media
- Implementation:
- moderated, responsive
- use for public display, can be interactive, including larger public display
- inspired by kitten war: built image battle
- calculate popularity score
- also about preservation, collect images
- go.ncsu.edu/MyHunt
Steven Anderson – JavaScript Streaming Clientside Checksumming w/ HTML5 file upload
- basically using JavaScript to view the checksum
- and then when files are uploaded
- Demo: hydratest.bpl.org:3000
- On GitHub
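The demo itself is in JavaScript, but the general idea of streaming checksumming (hashing a file chunk by chunk instead of loading it whole) can be sketched in Python. The function name, the chunk size, and the choice of SHA-256 are assumptions for illustration, not details from the talk.

```python
import hashlib
import io


def streaming_checksum(stream, chunk_size=64 * 1024, algorithm="sha256"):
    """Hash a binary stream chunk by chunk so the whole file never sits in memory."""
    digest = hashlib.new(algorithm)
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty read means the stream is exhausted
            break
        digest.update(chunk)
    return digest.hexdigest()


# An in-memory stream stands in for an uploaded file here.
data = b"code4lib" * 100_000
print(streaming_checksum(io.BytesIO(data)))
```

The same checksum computed on the server after upload can then be compared with the client-side value to confirm the file arrived intact.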
Will Hicks – Metadata Entry Beyond Usability
- think back to volunteering to donate blood, etc. what value did it give you?
- ~300 individuals creating metadata
- stats have bumped up with a new interface
- but 10% of the stuff is hidden
- “usable” forms are great, but little “ownership” and lack of domain expertise
- what if we applied the ideas used in social media?
- invite with openID, personalize projects, badges, stats, visualization, etc.
Kelly Lucas – Drupal OPAC Recipe
- the poor man’s Blacklight or Vufind with Solr backend
- process:
- MARC record dump into Solr using SolrMARC
- install Search API, Search API Solr, Facet API, Sarnia and Views from Drupal.org
- configure Sarnia + FacetAPI (boosts, enable facets)
- create a view, add some fields, create an exposed filter (full text search box)
- slap some facets to the side of the page
- issues: new/updated records require direct connection to ILS
- Slides
Karen Coyle – Nerd Poetry
- cowboy poetry – told around the camp fire
Mark Redar – Django Dublin Core App
- plugin app
- on Github
- RecordExpress – lightweight, easy to use esp. for those not familiar with XML
James Stuart – Taming Email
- a really big problem
- part of your job
- here’s how to tame email
- don’t organize: organizing your email is like alphabetizing your recycling
- mailstrom.co
- turn keyboard shortcuts on
- shortcutFoo
- don’t be distracting
- bookmark
- Also take a look at Making Thunderbird More Gmail-y
No Break! Have a Cute Animal Anyway

Code4Lib Day 3: Closing Keynote – Gordon Dunsire
Granularity in Library Linked Open Data
Fractals
- self-similar at all levels of granularity
- each circle represents things that look very similar (snowflake-looking pattern, but at different sizes)
- characteristic of fractals
- cannot determine level: all levels are equal, some more equal than others
Multi-Faceted Granularity
- What is described by a bibliographic record? or a single statement?
- What is the level of description? How complete is it? e.g. AACR2
- How detailed is the schema used? How dumb? – especially relevant right now. The more detailed, the higher level of granularity possible.
- Semantic constraints? Unconstrained?
Resource Description Framework – Linked Data
- Triple: This resource | has intended audience | Juvenile
- Subject / Predicate / Object
- do each of these parts have granularity?
- higher/lower level, but should talk about coarse or fine grained granularity
Subject: What is the Statement About?
- we can focus on describing an article / resource / work, then think about coarser or finer granularity:
- coarser: consortium collection / RDF map
- library collection / digital collection
- super-aggregate journal title / journal index
- aggregate: issue / festschrift
- focus on describing an article / resource / work
- component: section / graphics / page
- sub-component: paragraph / markup
- finer: word rdf/xml
- uri / node
Predicate: What is the Aspect Described?
- similar coarse/fine breakdown:
- membership category
- access to resource
- access to content
- suitability rating
- audience and usage
- audience
- audience of audio-visual material
- diagram: possible audience map (partial) – unconstrained version to avoid collisions of isbd/dct/schema/rda/m21/frbrer
- different links can be made while still retaining proper semantic links
- currently constructing just one giant graph
What is the Aspect Described?
- coarse to fine:
- resource record
- manifestation record
- title and s.o.r
- title statement
- title of manifestation
- title word
- first word of title
- why do librarians need so many titles? Why not just use dublin core title and be done with it? Because we need it to do our work e.g. spine title to browse
- title = string identifier
- RDA: what to do with this? how do we apply these needs?
- possible semantic map (partial) – I won’t even try to reproduce this
- need to take into account names and ranges
- make it more difficult, but more powerful
Semantic Reasoning: The Sub-Property Ladder
- this is where the graph becomes useful and property
- machines can’t reason, so we’re demantic the semantics such that we can give the rules to machines to process our data
- semantic rule:
- if property1 sub-property of property2;
- then data triple: resource property1 “string”
- implies data triple: resource property2 “string”
- otherwise, data triple remains the same
- simple enough for computer to carry out
- doesn’t matter how complex the map actually is, because it can still do it in matters of seconds
- machine entailment: isbd: “has title proper” (finer) -> dct: “has title” (coarser)
- might sound simple, but it is making a computer do inference
- ‘dumb(ing)-up’: data has been lost, but still meaningful – moved from one schema to another
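The semantic rule above can be sketched in a few lines of Python. This is a minimal sketch, not the speaker's implementation: triples are plain tuples, the isbd-to-dct step comes from the notes, and the second rung of the ladder (to a `unc:` property) is a hypothetical mapping added only to show chaining.

```python
# Sub-property map: finer property -> coarser property.
SUB_PROPERTY_OF = {
    "isbd:has title proper": "dct:has title",  # step mentioned in the notes
    "dct:has title": "unc:has title",          # hypothetical second rung
}


def entail(triples, sub_property_of):
    """Return the input triples plus every triple implied by the sub-property ladder."""
    inferred = set(triples)
    queue = list(triples)
    while queue:
        resource, prop, value = queue.pop()
        parent = sub_property_of.get(prop)
        if parent is None:
            continue  # no coarser property: the triple remains the same
        implied = (resource, parent, value)
        if implied not in inferred:  # also guards against cycles in the map
            inferred.add(implied)
            queue.append(implied)
    return inferred


data = {("resource1", "isbd:has title proper", "Example title")}
for triple in sorted(entail(data, SUB_PROPERTY_OF)):
    print(triple)
```

Each pass only looks up one property in a dictionary, which is why the size of the overall map matters little: the machine applies the same simple rule until nothing new is implied.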
Data Triples from Multiple Schema / Entailed from Sub-Property Map / from Property Domains
- frbrer: “has intended audience” – “primary school”
- isbd: “has note on use or audience” – “for ages 5-9”
- rda: “intended audience (work)” – “for children aged 7-”
- m21: “target audience” -> m21terms: -> “Juvenile”
- definition attached to the vocabulary
- also talking about granularity
- can map the sub-property to top level of unc: “has note on use or audience”
- “is a” frbrer: “work”, isbd: “resource”, rda: “work” – rda and frbr schema actually separate, not semantically linked even though vocabulary is similar and RDA is based on FRBR
- once stabilized can be drawn from each other
What is the Aspect Described?
- coarser to finer:
- creator
- author
- screenwriting
- animation screenwriting
- children’s cartoon screenwriting
- different controlled vocabulary
- graph of RDA for author/creator/screenwriting in relation to work and agent
- graph of same thing, but for dc for creator and agent
- what is the semantic relationship between the dct creator and the rda creator?
- marcrel author maps to dc contributor, not creator – what is the relationship between rda author and marcrel author?
- decision from 2005, needs to be reappraised and reviewed
- relationship between dc creator and dc contributor?
- how does lcsh “screenwriters” fit?
Machine-Generated Granularity
- also has issues
- e.g. full-text indexing: down to the word level
- BabelNet: A very large multilingual ontology
- can get quite complex and granular
User-Generated Granularity
- users can actually generate useful metadata
- can use statistical methods to remove extremes and come back with consensus
- going to cause granularity problems e.g. “OK for my kids (7 and 9)”, “Too childish for me (age 14)”
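One simple statistical method of the kind mentioned above is a trimmed mean: sort the user-supplied values, drop a share of the lowest and highest, and average what is left. The sketch below assumes users have suggested a minimum age for a resource; the function name, the 20% trim, and the sample numbers are all made up for illustration.

```python
from statistics import mean


def trimmed_consensus(values, trim_fraction=0.2):
    """Drop the lowest and highest trim_fraction of values, then average the rest.

    trim_fraction is assumed to be below 0.5 so that something is left to average.
    """
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)
    kept = ordered[k:len(ordered) - k] if k else ordered
    return mean(kept)


# Made-up ages suggested by users, including two outliers (2 and 14).
suggested_min_ages = [7, 8, 7, 9, 8, 2, 14, 8, 7, 9]
print(trimmed_consensus(suggested_min_ages))
```

Note that this only yields a consensus number; free-text statements such as the two quoted above would first have to be reduced to comparable values, which is where the granularity problem shows up.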
KISS
- keep it simple, stupid
- keep it simple and stupid?
- data model is very simple: triples!
- in terms of complexity, actually very simple
- but metadata content is complex
- and therefore, resource discovery is complex
- complex structure of application of simple rules, similar in the hard sciences and math
- simplicity is elegance
AAA
- Anyone can say anything about any thing
- someone will say something about every thing
- in every conceivable way
- and then constrained linguistically
OWA
- open world assumption: the absence of a statement is not a statement of non-existence
Will it get so granular that it becomes too complex?
And the rest is science
Break Time

Code4Lib Day 2: Lightning Talks
Demian Katz – gamebooks.org, Geeby-Deeby, and the Dime Novel Bibliography Project.
- interactive fiction books
- Made a really big bibliography
- people sending in more
- put it in a big database
- open-source project to adapt backend
- MySQL db that models various types of entities, relationships
- backend system to edit entities, enter data, edit relationships
Rachel Frick – LODLAM Summit 2013 and Challenge
- linked open data in library, archive, and museum
- had challenge to win cash prize
- can still submit, just fill out form and make/submit video
Kenny Ketner – Occam’s Reader
- allow libraries to lend ebooks to each other using document delivery model
- compatible with ILLiad
- no training, no workflow change
- can add formats incrementally (started with PDF)
- basic interface that discourages abuse
- access previously inaccessible resources
- in alpha testing
- Slides
Al Cornish – Orbis Cascade Alliance Shared ILS Project
- primary service is consortial services
- move to new system beyond traditional ILS
- single shared system (vs. currently 37)
- explore collaborative technical services, collection development
- Alma + Primo selected, migration in progress
Makoto Okamoto – Crowd Funding for Library in Japan
- culture of donation changed a lot after 3.11
- share experience and metrics
- key to success is setting up appropriate ticket
William Denton – Code4Lib 2013 Augmented Reality View in Layar
- picking points of interest from two spots
- 1: Google Places map
- 2: Twitter search API – of the ones that are geolocated
- Web service in Ruby and Sinatra, hosted on Heroku
- Rainbows End by Vernor Vinge – go read it.
- Slides
Rosalyn Metz – What I learned while I was away
- learned about planning, budget, and time
- time is the most important thing
- track your time, might be spending too much time on things e.g. don’t spend more than 10 mins on a single email
- can demonstrate where need help
- Slides
Nettie Lagace – Recent Cool Fun NISO Activities
- ResourceSync Framework Specification
- Bibliographic Roadmap Initiative
- Slides
Chuck Koscher – Fundref
- list of funders
- what articles given certain funding
Andromeda Yelton – Five Conversations About Coding
- computer science majors in 1995: yardstick of who is cooler than who dependent on the most arcane knowledge
- boston python workshop 2012: women friendly course. Expecting to be judged.
- chad nelson, monday night. It’s not free
- bess sadler, yesterday. We have a problem with insecurity
- important to recognize our limitations, but have this imaginary yardstick
- ever done coding? majority. think coder? 1/2
Jeremy Morse – mPach: Publishing directly into HathiTrust
- sorry, didn’t quite get this one
Rob Dumas – Git in Five Minutes
- source control software
- accountability – record of changes over time
- can keep branching and merging
- Going from SVN to Git
- Slides
That’s all for today.

Code4Lib Day 2: Afternoon Notes
De-sucking the Library User Experience
- Jeremy Prevost, Northwestern University
Libraries hate library users. If we didn’t, our websites wouldn’t suck.
Discovery
- if a user can’t find it, why do you own it?
- spend a lot of money on acquiring resources or access to them
- want to allow them to find them
- Good: works like Google from the user’s perspective
- Bad: needs to know how it works to make it work e.g. need to know MARC; can only find known items
- live examples: Ex Libris Voyager vs. Primo
- Voyager: no relevant results even using boolean ‘AND’
- Primo: can use boolean or not, relevant results – de-sucked!
Requesting Item
- Request information/user experience also sucks
- Prepopulated info, request item if not available – de-sucked!
Renew Item
- consistency
- made interfaces consistent – de-sucked!
Mobile
- not going away
- no mobile until mid-2007 for iPhone
- jQuery mobile – Apr 2010 – but updating two sites sucks, no support for tablets
- Mar 2013: responsive design, bootstrap
Libraries don’t hate library users!
- start with something that you would enjoy using
Google Analytics, Event Tracking and Discovery Tools
- Emily Lynema, North Carolina State University Libraries
- Adam Constabaris, North Carolina State University Libraries
How to track in-page events. Decide which events to track, push to Google.
Event Tracking Use Cases
- hidden or externally AJAX events e.g. facets, tabs
- internal links that occur in multiple places e.g. request item
- external links
Examples
- Catalog: click on tabs twice as much as everything else; full text used a lot; browse graphical < text because of placement; about half request item even though in 2 different places
- Summon: trying to track what they could track. Paging more popular than facets
Implementation
- GA API script
- jQuery API
- HTML5 Data Attributes: data-* for use by scripts
- decide what to track
- basic technique
- Summon gets harder. Have to get it in the code. more selectors
Debugging & Testing
- set up safety net first
- know the debugger
- use the GA debug
- test a lot
Actions speak louder than words: Analyzing large-scale query logs to improve the research experience
- Raman Chandrasekar, Serials Solutions
- Susan Price, Serials Solutions
Single unified index for all the items from all libraries’ collections.
RMF Goals
- observe and log user actions e.g. queries, filters, click patterns
- compute quality of search results e.g. user behaviour
- analyze data to improve search results and enhance research experience
Data-Driven Documents: Visualizing library data with D3.js
- Bret Davidson, North Carolina State University Libraries
Why D3?
- uses technologies that you already know
- capable library – pre-built path generations, well maintained etc.
- community – documentation, training available
- might not use it because of the learning curve, or because you don’t need something this complex
Examples
- suma – space assessment toolkit
- show visualization real time, tables, and CSV file
HTML5 Video Now!
- Jason Ronallo, North Carolina State University Libraries
Yes! Also, slides/presentation.
Here’s Why
- Flash video cannot be run on most mobile/tablets
How it Works
- uses video HTML tag
- use simple fallback – download if can’t view
- problem: browsers cannot decide on single codec to use; codec war
- solution: multiple sources: mp4, webm
- use poster attribute as “screenshot” and don’t have to download video right away
- add type attribute to say which format to use; can be very explicit
- only one video per page please!
- properties exposed in JavaScript
- can add custom controls, more info for users
- events that you can listen for e.g. timeupdate to update time in a video; update wording e.g. which floor
- analytics: play, pause, seek, ended
- can do visualization of engagement
- can style with CSS
- track for subtitles
Polyfills and Advantages
- provide video controls
- flash fallback
- progressive download and range requests
Future of Media on the Web
- DRM looks to be coming
- Popcornjs – can do annotation
- Web Audio API – mix audio, filters, etc.

Code4Lib Day 2: Morning Notes
REST IS Your Mobile Strategy
- Richard Wolf, University of Illinois at Chicago
- Slides
- Raw Material
REST
- Representational State Transfer – a methodology developed alongside HTTP 1.1
- clients request representations of resources from servers – typically a document
- basically turns into an API
Examples
- New York Times – Congress API
- Chicago Transit
iOS Development
- need to know: Xcode, Objective-C, Cocoa Touch, Provisioning
- Xcode – Apple developer, like Visual Studio or Eclipse
- Objective-C – strict superset of C
- Cocoa Touch – frameworks to talk to iOS, similar to Ruby on Rails
- UIKit
- Provisioning Portal – annoying paperwork
OCLC Classify API
- give it an item, tell you how it’s classified including call number
Process
- use Rested – a Mac tool that grabs API information and gives you the raw output
- Xcode – create a new basic project
- go from XML to Objective-C
- use RestKit – maps XML to Objective-C
- use PaintCode – create GUI
- hire an artist
- Apple App Review Process
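The "go from XML to objects" step is the same in any language. As a rough Python analogue of what RestKit does for Objective-C, the sketch below maps an XML response onto a typed object. The XML shape, element names, and sample values are hypothetical placeholders, not the actual OCLC Classify response format.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical response; the real Classify API schema is not reproduced here.
SAMPLE_RESPONSE = """<classification>
  <title>Example Title</title>
  <author>Doe, Jane</author>
  <callNumber scheme="ddc">000.0</callNumber>
</classification>"""


@dataclass
class Classification:
    title: str
    author: str
    call_number: str
    scheme: str


def parse_classification(xml_text):
    """Map the (assumed) XML elements and attributes onto a Classification object."""
    root = ET.fromstring(xml_text)
    call = root.find("callNumber")
    return Classification(
        title=root.findtext("title", default=""),
        author=root.findtext("author", default=""),
        call_number=(call.text or "") if call is not None else "",
        scheme=call.get("scheme", "") if call is not None else "",
    )


print(parse_classification(SAMPLE_RESPONSE))
```

Once the response is an object rather than raw XML, the rest of the app (GUI, storage) no longer needs to know anything about the API's wire format.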
Librobot App
- in the store by April 2nd
Why REST Matters – What are the Major Milestones
- math formula – importance of technology can be determined by the amount of money involved in a court case
- Personal Computers
- The Internet
- Mobility
- Build an API – ask for ideas, and apps will come.
Take Away
- you have interesting data
- make an API
- If we build it, they will come for it!
All Teh Metadatas Re-Revisited
- Esme Cowles, UC San Diego Library
- Matt Critchlow, UC San Diego Library
- Bradley Westbrook, UC San Diego Library
Continues the story from last year.
Needs
- more consistent data
- maintain syntax of hierarchical subjects
- improve support for complex objects
- align more strongly with the digital libraries community – most important
User Stories
- to understand requirements of administration and researchers
Sorry, I had to take a brain break and got a little lost. I’m also going to blame twitter and IRC for distracting me. Take a look at the slides:
Implementation
- DAMS Repository – new version of lightweight repository, with APIs
- Manager – separate and uses the API
- Public Access System – new frontend in Hydra, great community
Timeline
- release in summer
- code now available on Github
Browser/Javascript Integration Testing with Ruby
- Jessie Keck, Stanford University
- Slides
The Problem
- needed to test JavaScript
- especially since using progressive enhancement
- site works without JavaScript, then more features with JavaScript
- mistakes happen e.g. killed navigation
Some Solution(s)
- Watir == Web Application Testing in Ruby
- built on watir-webdriver
- Capybara – RSpec/Cucumber driver
- ability to test responsive design
- webkit integration available
- personally like Capybara syntax (vs. Watir)
- automated test that there is JavaScript bug e.g. automatically test that facets working
Gotchas
- might want to use Watir Rails
- transactional fixtures
Linked Open Communism: Better discovery through data dis- and re- aggregation
- Corey A Harper, New York University
How to shut up about linked data and actually build something.
Context
- context, the narrative of the library/archive
- user stories
Death of Browse
- discovery systems don’t use authority control
- browse broken as UI design
- rich data in authorities disconnected
The Idea Implemented
- take EAD records, blow them up, take headings to match MARC records
- pull people, corporations, and topic – pull info from DBpedia
- index in Solr
- slower than would like
- On Github but is buggy
Solr Update
- Erik Hatcher, LucidWorks
Sorry, but we don’t use Solr, and anyone really interested I think can look up information on the update. e.g. Apache Solr Release Notes
Check out the slides:
Break Time

Ask Anything
Who’s faculty? Half Faculty – small handful who care about being faculty
Planning and pilot phase of bringing together all resources of types. How to decide what to use and where to start?
Normalizing records from MARC to Solr. Want help with format.
How many have library degrees? 2/3 do, 1/3 don’t
Code4Lib – archiving our stuff? Talk to Mark/anarchivist. Mailing list is archived on the university server. Mirrored on post. Regular basis, dumped to media forward.
Goals of BIBFRAME? Replacing/superseding MARC.
First-timers to c4lcon? majority of room. All? < 20
Anyone collecting social media on behalf of user community or collection building purposes? Going to be a lightning talk tomorrow.
Anyone from a theology library? ~5 ppl
Want to know successful examples of gamification to support information literacy by @maccabeelevine e.g. Lemontree
Glossary of technology and stacks. On code4lib wiki? A guide for the perplexed. We can work on it.
Who is using graph databases? 2-3 ppl
Using DSpace? 25-30 FedoraCommons? 25-30 Hydra? 10-15
This conference working for you? Almost everyone.
What do people think of the wiki? One idea is to move it over to github code4lib account.
From the federal government? 3
Anyone interested in integrating TSM into Solr? anarchivist says he knows people
How many non-library degree people considering getting one? 2
How many have project managers as their title? ~12 Public? 5 Academic? rest
CodeRead – looking at PyMARC (sp?). Anyone else looking into this?
Didn’t get all the questions, but that’s most of them.
Lunch Time
Code4lib Day 1: Lightning Talks
Cynthia Ng – RULA Bookfinder
- Link to the full write-up
Julien Gibert – Turning a Solr Response into a RDF file
- Theses.fr
- Sorry, this went by me, plus I was busy running back to my seat
Bill Dueber – Datamart Report Generator at UMich
- actually talking about spreadsheets
- want to support data-driven decision making, but it’s boring, and canned reports tend not to do it
- can end up in substring hell
- solution: build data warehouse
- took Aleph oracle COBOL store, removed insanity and put it in another oracle database
- funds and inventory reports now possible
- running 20-25 reports a week
- more than when we ran it by hand, and saves lots of time
Jonathan Rochkind – bento_search
- Ruby on Rails gem
- external search services e.g. Google books
- federated e.g. primo, eds, ebscohost, scopus, worldcat, google books
- can use whatever you want, just need to add it
- can customize to have link resolver
- github.com/jrochkind/sample_megasearch/
- much more functionality
Masao Takaku – saveMLAK project for two years
- came out of the effort to save museum, library, archive, kominkan (community centre) after the big earthquake
- gather information of facilities in damaged area using a wiki
- coordinate activities to rebuild
- efforts are still continuing
Jon Stroop – Loris Image Server
- define syntax for image access
- can specify width/height, part of image, quality
- Talk link
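Loris implements the IIIF Image API, in which the region, size, rotation, quality, and format of the requested image are all path segments of the URL. A minimal sketch of building such a URL follows, assuming the `{identifier}/{region}/{size}/{rotation}/{quality}.{format}` pattern; the base URL and identifier are hypothetical, and the accepted quality names differ between versions of the specification.

```python
from urllib.parse import quote


def image_url(base, identifier, region="full", width=None, height=None,
              rotation=0, quality="default", fmt="jpg"):
    """Build an IIIF-style image request URL (a sketch, not Loris's own code)."""
    if width is None and height is None:
        size = "full"
    else:
        # "w," scales to a width, ",h" to a height, "w,h" to exact dimensions.
        size = f"{width or ''},{height or ''}"
    encoded_id = quote(identifier, safe="")  # identifiers must be URL-encoded
    return f"{base.rstrip('/')}/{encoded_id}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Hypothetical server and image: a 500x500 crop from the top-left corner, scaled to 200px wide.
print(image_url("http://example.org/loris", "page 1.jp2",
                region="0,0,500,500", width=200))
```

Because the whole request lives in the URL, any client that can build a string can ask for a thumbnail, a crop, or a full-size image without a separate API call.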
Ross Singer – How are you managing copyright?
- lazy attempt at crowd-sourced business development
- copyright is complicated
- there are standard licenses, but then there are a lot of exclusions and exceptions
- still, roughly the same model
- management already being done in some capacity by the universities
- but in US/Canada there is fair dealing and fair use
- Slides
Eric Nord – Candybars for Bugs
- Harold B. Lee Library
- worked on maps in library
- pop up map
- will give candy bar if found error
- only had to give away 18
- have a ‘report a problem’ with this item
- builds the idea to power the patron
Megan O’Neill Kudzia – Games for Pedagogy in the Library
- working with faculty
- a lot of interest, but no opportunity to talk about it
- purchasing games on an ask basis
- working out how to make accessible, in catalogue
- licensing issues for PC/console games
Geoffrey Boushey – GEDI Reference App for InterLibrary Loan
- General Electronic Document Interchange (ISO Standard)
- used by Ariel
- headers added to a file when sent from one institution to another
- basis for making an easy to use tool so different ILL systems can communicate with each other
- on Github
George Campbell – three.js: 3D Objects in the browser
- used to have to use flash or flip through images
- can now use interactive 3D graphics
- can scale, add text/images, move
John Sarnowski – Audio Archiving with Full Text Search
- ResCarta Toolkit
- display and play audio
- add metadata
- use conversion tool
- embeds into XML portion
- final file can then be searched
- words can be highlighted just like a text file
That’s the end of Day 1! Join us tomorrow. Time for a nap.
Code4Lib Day 1: RULA Bookfinder: Getting People to Books Fast! Lightning Talk
Not a New Problem
- mapping the shelf where an item is located
- common implementation: stackmap
- but paid
- implementation into catalogue similar, add button to click on to get map
What’s Different
- full screen = bigger map
- links to video tutorials in lightbox/fancybox
- share map through link, and via email
- automatically prioritize by loan period (regular vs. reference only) and availability, while still showing you the other locations
- responsive
- integrated search, built for mobile
- shelf signage (though seen 1-2 other libraries doing this as well)
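The automatic prioritization by loan period and availability amounts to sorting the copies of an item before choosing which map to show first. The sketch below is an assumption about how that could look, not the Bookfinder code: the `Copy` fields and the exact ordering rule (availability first, then regular loan over reference only) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Copy:
    location: str
    loan_period: str  # "regular" or "reference" (assumed values)
    available: bool


def prioritize(copies):
    """Put available, regular-loan copies first while keeping every location in the list."""
    # False sorts before True, so available and regular-loan copies come first.
    return sorted(copies, key=lambda c: (not c.available, c.loan_period != "regular"))


copies = [
    Copy("Floor 5 Reference", "reference", True),
    Copy("Floor 8 Stacks", "regular", False),
    Copy("Floor 7 Stacks", "regular", True),
]
for copy in prioritize(copies):
    print(copy.location)
```

Nothing is filtered out, which matches the notes: the best option is shown first and the other locations remain visible.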
Are People Using It?
- Launched mid-Nov, Dec exam period, Jan first real indication
- desktop ~2/3, mobile ~1/3 of usage
What Do Users Think?
- Demo’ed at Learning Commons Open House just before launch – a lot of positive feedback
- Usability Study
- most agree/strongly agree: easy to understand, easy to follow, prefer having shelf number
- a couple didn’t like look/colours of floor plan, but most still liked it
- most importantly: level of frustration = lower
Code4Lib Day 1: Afternoon Notes
Practical Relevance Ranking for 10 million books
- Tom Burton-West, University of Michigan Library
Search Challenges
- multilingual, 400+ languages
- OCR quality varies
- very long documents
- books are different from what they normally have
Relevance Ranking
- how to score, weigh
- default algorithm ranks very short documents very high
- needed to tune/customize parameters
- average document size is ~30 times larger
- did prelim testing with Solr4 and didn’t see the same problem, but need more testing
- dirty OCR complicates things, as well as language
- occurrence of words in specific chapters vs. whole book – should we index parts of books?
- similar issue with other objects e.g. bound journals, dictionaries & encyclopedias
- difficulty too is inconsistent metadata, breakdowns of articles/chapters/etc. will be inconsistent
- creating a testing plan and adding click logs
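Why would a default algorithm rank very short documents so high? A sketch, assuming the "default algorithm" is Lucene's classic TF-IDF similarity, where term frequency is dampened by a square root and divided by the square root of the field length. The numbers below are invented, and idf and boosts are left out because they are the same for both documents.

```python
import math


def classic_field_score(term_freq, doc_length):
    """sqrt(tf) times 1/sqrt(length), in the style of Lucene's classic similarity."""
    tf = math.sqrt(term_freq)
    length_norm = 1.0 / math.sqrt(doc_length)
    return tf * length_norm


# A 300-word pamphlet mentioning the term twice vs. a 100,000-word book mentioning it 200 times.
short_doc = classic_field_score(term_freq=2, doc_length=300)
long_book = classic_field_score(term_freq=200, doc_length=100_000)
print(short_doc > long_book)  # the pamphlet outranks the book
```

With whole books as documents, the length penalty swamps the term frequency, which is the kind of behaviour that tuning the parameters is meant to correct.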
n Characters in Search of an Author
- Jay Luker, IT Specialist, Smithsonian Astrophysics Data System
- Slides
Goal of a search is to match user input to metadata. e.g. author names
Building the next generation of the ADS 2.0. Trying to increase recall without sacrificing precision.
Requirements
- match UTF-8 e.g. matching ASCII version to versions with diacritics/markings
- match more or less information e.g. first name initial but without triggering substring matching
- need to work with hand curated synonyms e.g. pseudonyms, maiden/married name
Solving the Problem
- normalization – strip out punctuation, rearrange name parts – based on whether a comma is entered
- generate name part variations to whatever can be realistically expected
- transliteration – use index introspection for list of synonyms
- expand user queries at each step:
- user searches
- normalize
- name part vars
- transliteration
- name parts vars of transliterated entries
- curated synonyms
- transliteration of anything added
- name part variations to catch everything
- assembled into large boolean query
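The first steps of the expansion above (normalize, generate name part variations, assemble a boolean query) can be sketched as follows. This is an illustration only, not the ADS implementation: it stands in for transliteration by stripping diacritics, skips curated synonyms entirely, and the `author` field name and query syntax are assumptions.

```python
import unicodedata


def normalize(name):
    """Strip diacritics and trailing dots; return (last, [first parts]) in lower case."""
    ascii_name = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    if "," in ascii_name:  # "Last, First M." form
        last, _, rest = ascii_name.partition(",")
    else:                  # "First M. Last" form
        *rest_parts, last = ascii_name.split()
        rest = " ".join(rest_parts)
    firsts = [p.strip(".").lower() for p in rest.split() if p.strip(".")]
    return last.strip().lower(), firsts


def name_variations(name):
    """Generate the fuller and shorter forms a user might realistically mean."""
    last, firsts = normalize(name)
    variations = {last}
    if firsts:
        variations.add(f"{last}, {firsts[0]}")
        variations.add(f"{last}, {firsts[0][0]}")  # first initial, matched as a whole token
        if len(firsts) > 1:
            variations.add(f"{last}, {' '.join(firsts)}")
            variations.add(f"{last}, {' '.join(p[0] for p in firsts)}")
    return sorted(variations)


def boolean_query(name, field="author"):
    """Assemble the variations into one large OR query."""
    return " OR ".join(f'{field}:"{v}"' for v in name_variations(name))


print(boolean_query("Müller, Jörg"))
```

Matching each variation as an exact phrase is what lets a first initial find fuller names without falling back to substring matching.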
Implementation
- Python/JavaScript prototype
- actual – Solr/Lucene
Evolving Towards a Consortium MARCR BIBFRAME Redis Datastore
- Jeremy Nelson, Colorado College, jeremy.nelson@coloradocollege.edu
- Sheila Yeh, University of Denver
I think this presentation speaks for itself.
Journal Article: Building a Library App Portfolio with Redis and Django
Hybrid Archival Collections Using Blacklight and Hydra
- Adam Wead, Rock and Roll Hall of Fame and Museum
- Presentation
The centre of everything is the Solr index. Blacklight puts everything into Solr. Library materials are easy enough, but archival collections use EAD, which covers many items (not just one item, as is typical of MARC).
Extended Blacklight to search EAD
- index collections and single items from a collection
- search results include books, entire collections, and items from collections
Digital Content
- kept in Fedora – objects described using Rubys
- use Hydra to manage the content in Fedora – manages RDF relationships
- indexes into Solr
- need to relate Fedora content to its archival collection
- content originates from sources in collection, and part of series
- collection metadata already exists in Solr
- create RDF representations of collections
- Hydra queries Solr for collection metadata
- creates objects for series, subseries, items
Issues
- terrible Solr performance for series, 500+ items
- no EAD “round tripping” – EAD can go into Solr, but not back out
- currently 60% complete
Citation search in SOLR and second-order operators
- Roman Chyla, Astrophysics Data System
Sorry, I don’t have notes for this. My brain is a bit fried by this point. Will post link when I get it.
Break Time
Breakout Sessions – reports will be available on the wiki
Next Up – lightning talks
