Almost a week after Code4Lib 2013, I’m still not sure it has all sunk in. Every year I look at the conference program and wonder whether the sessions will actually interest me, but I go anyway and get blown away. I was even more excited this year because I led the program committee, helped with sponsorship, prepared a lightning talk, and decided to be the opening MC.
Code4Lib Day 3: Morning Notes
Hands off! Best Practices and Top Ten Lists for Code Handoffs
- Naomi Dushay, Stanford University Library
Code handoffs are never smooth. Ever.
Ratio of time spent reading vs. writing code: 10:1.
The Truck Test
- what if you were run over by a truck and someone else had to take over?
- need to code so a stranger can read it and understand it
The Boy Scout Rule
- “Leave the code cleaner than you found it”
- need to maintain your code
- otherwise you’re part of the problem
It’s More Than Code
- naming should make sense: servers, scripts, everything
- config files should not point to boxes
- tools chosen can be the problem
- should you be rolling this on your own?
- it has probably been done before
- some think if you write code really well, then you don’t need to comment. Not true.
- Documentation and comments are there to inform, explain, clarify, and warn; they need maintenance too
- readme’s should make sense
- tests are code, should also think about readability of these
- failures should be addressed ASAP
- KISS – Keep It Simple Stupid
- DRY – don’t repeat yourself
Readable Code
- follow conventions
- meaningful names: variable, method, class, file
- small, single purpose methods
Cleverness that reduces readability isn’t clever.
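As a small illustration of the points above (not from the talk; the task, the names, and the fine amount are all made up), here is the same job written twice in Python: once as a compact one-liner, and once with meaningful names and small, single-purpose functions.

```python
# Hypothetical example: total the overdue fines per patron.
# Each loan is a (patron, days_overdue) pair; DAILY_FINE is an assumed value.
DAILY_FINE = 0.25


def clever(loans):
    # Compact, but the reader has to unpack everything at once.
    return {p: sum(max(d, 0) * DAILY_FINE for q, d in loans if q == p) for p, _ in loans}


def fine_for_loan(days_overdue):
    """Return the fine for one loan; loans that are not overdue cost nothing."""
    return max(days_overdue, 0) * DAILY_FINE


def fines_by_patron(loans):
    """Return a dict mapping each patron to the total of their fines."""
    totals = {}
    for patron, days_overdue in loans:
        totals[patron] = totals.get(patron, 0.0) + fine_for_loan(days_overdue)
    return totals


loans = [("ada", 4), ("grace", 0), ("ada", -2), ("grace", 8)]
print(fines_by_patron(loans))
```

Both versions give the same totals; the second one is the one a stranger could pick up after the truck.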
Sources
- Clean Code: A Handbook of Agile Software Craftsmanship by Robert Martin
- Refactoring: Improving the Design of Existing Code by Martin Fowler et al.
The Care and Feeding of a Crowd
- Shawn Averkamp, University of Iowa
- Matthew Butler, University of Iowa
DIY History
- transcribe items in collection
- omeka + scripto + mediawiki
- still in development: want to add social media aspects/integration
- err sorry, brain temporarily sort of died. See slides and I’ll go get a cookie to recharge
How to be an effective evangelist for your open source project Creating a Commons
- Bess Sadler, Stanford University Library
Lost a member of our community this year: Aaron Swartz
- helped to define Creative Commons licenses
- 3 versions: machine, human, and lawyer readable
- code4lib should apply the same principle
- shared engineering practices are becoming more and more important
- investment that’s worth it
- please get code contributors to sign a contributor license agreement
- can determine whether contract allows participation
- don’t want to lose informal sharing, but law cases have happened and we need to protect ourselves
Building Code
- what are we building?
- we are building a culture, a commons
- Fedora4lib – came early and rented a house together
- Hydra = a community
- cultivate a place where we can
- teaching at Ruby on Rails workshops – too big a job to leave to a small group of people
Hacker Epistemology
- how is knowledge acquired?
- how do we decide what’s true?
- collaboration with disregard of conventional mental thinking
Building the Community
- need to expand and include everyone who wants to join
- more steps in building a more inclusive community
- adopted a code of conduct, because it was a good idea and it makes an explicit statement
- need to let other people know that we’re trying
- “We are all imposters.” – just acknowledge it, we all feel that way, but bolster ourselves
- allow ourselves to be seen even when there’s no guarantee of success
- we can support each other
- cannot be accomplished alone
- want to craft a process for submitting issues
Thank you, code4lib!
The End
And that’s it! Until 2014.

Code4Lib Day 3: Lightning Talks
Mark Matienzo – Wielding the Whip: Affect, Archives, & Ontological Fusion
- need to talk about emotion – a lot of things going on in my life
- inspired by Archives by Emotion
- why do we not acknowledge that archives are based on emotion, stories
- facebook ‘like’ is not an emotion
- losing connections to materials and history when not thinking about the platform and how that affects our stories
- can we write stories about our collections?
- should be using existing linked data to make those narratives
- let’s build this
- Slides
- Full Write-up
Jason Casden and Cory Lown – My #HuntLibrary
- student engagement platform
- how do students use the space?
- what do they choose to document? – using instagram
- student-driven archival selection
- making use, harvesting social media
- Implementation:
- moderated, responsive
- use for public display, can be interactive, including larger public display
- inspired by kitten war: built image battle
- calculate popularity score
- also about preservation, collect images
- go.ncsu.edu/MyHunt
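The talk did not go into how the popularity score is calculated, so the following is only a guess at the general idea: a minimal Python sketch that scores each image by the share of battles it has won. The function name and the sample image IDs are invented for illustration.

```python
from collections import defaultdict


def popularity_scores(battles):
    """Return {image_id: share of battles won} from (winner, loser) pairs."""
    wins = defaultdict(int)
    appearances = defaultdict(int)
    for winner, loser in battles:
        wins[winner] += 1
        appearances[winner] += 1
        appearances[loser] += 1
    return {image: wins[image] / appearances[image] for image in appearances}


# Made-up battle results: each pair is (winner, loser).
battles = [("atrium", "stairs"), ("atrium", "bookbot"), ("bookbot", "stairs")]
scores = popularity_scores(battles)
for image, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{image}: {score:.2f}")
```

A real system would probably also weigh how many battles an image has been in, so that one lucky win does not put a new image at the top.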
Steven Anderson – JavaScript Streaming Clientside Checksumming w/ HTML5 file upload
- basics: using JavaScript to view a file’s checksum
- and then when files are uploaded
- Demo: hydratest.bpl.org:3000
- On GitHub
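The demo does this in the browser with JavaScript; the same streaming idea, sketched in Python instead, is to feed the file through the hash in chunks so the whole thing never has to sit in memory. The function name, the chunk size, and the choice of SHA-256 are my assumptions, not details from the talk.

```python
import hashlib
import io


def streaming_checksum(stream, algorithm="sha256", chunk_size=64 * 1024):
    """Hash a file-like object chunk by chunk and return the hex digest."""
    digest = hashlib.new(algorithm)
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty read means the end of the stream
            break
        digest.update(chunk)
    return digest.hexdigest()


# Stand-in for a file: hash it once "before upload" and once "after upload".
payload = b"code4lib" * 100_000
before_upload = streaming_checksum(io.BytesIO(payload))
after_upload = streaming_checksum(io.BytesIO(payload))
print(before_upload == after_upload)
```

Comparing the two digests is what tells you the file arrived intact.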
Will Hicks – Metadata Entry Beyond Usability
- think back to volunteering to donate blood, etc. what value did it give you?
- ~300 individuals creating metadata
- stats have bumped up with a new interface
- but 10% of the stuff is hidden
- “usable” forms are great, but little “ownership” and lack of domain expertise
- what if we applied the ideas used in social media?
- invite with openID, personalize projects, badges, stats, visualization, etc.
Kelly Lucas – Drupal OPAC Recipe
- the poor man’s Blacklight or Vufind with Solr backend
- process:
- MARC record dump into Solr using SolrMARC
- install search API, Search API Solr, Facet API, Sarnia and Views from Drupal.org
- configure Sarnia + FacetAPI (boosts, enable facets)
- create a view, add some fields, create an exposed filter (full text search box)
- slap some facets to the side of the page
- issues: new/updated records require direct connection to ILS
- Slides
Karen Coyle – Nerd Poetry
- cowboy poetry – told around the camp fire
Mark Redar – Django Dublin Core App
- plugin app
- on Github
- RecordExpress – lightweight, easy to use esp. for those not familiar with XML
James Stuart – Taming Email
- a really big problem
- part of your job
- here’s how to tame email
- don’t organize: organizing your email is like alphabetizing your recycling
- mailstrom.co
- turn keyboard shortcuts on
- shortcutFoo
- don’t be distracting
- bookmark
- Also take a look at Making Thunderbird More Gmail-y
No Break! Have a Cute Animal Anyway

Code4Lib Day 3: Closing Keynote – Gordon Dunsire
Granularity in Library Linked Open Data
Fractals
- self-similar at all levels of granularity
- each circle represents a set of things that look very similar (a snowflake-like pattern, but at different sizes)
- characteristic of fractals
- cannot determine level: all levels are equal, some more equal than others
Multi-Faceted Granularity
- What is described by a bibliographic record? or a single statement?
- What is the level of description? How complete is it? e.g. AACR2
- How detailed is the schema used? How dumb? – especially relevant right now. The more detailed, the higher level of granularity possible.
- Semantic constraints? Unconstrained?
Resource Description Framework – Linked Data
- Triple: This resource | has intended audience | Juvenile
- Subject / Predicate / Object
- do each of these parts have granularity?
- higher/lower level, but should talk about coarse or fine grained granularity
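To make the triple structure concrete, here is a minimal Python sketch of my own (not from the keynote). The `ex:` identifiers are placeholders, and `match` is a hypothetical helper that treats `None` as a wildcard.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    pattern = (subject, predicate, obj)
    return [
        t for t in triples
        if all(want is None or want == have for want, have in zip(pattern, t))
    ]


# "This resource | has intended audience | Juvenile", plus one more statement.
triples = [
    Triple("ex:resource1", "ex:hasIntendedAudience", "Juvenile"),
    Triple("ex:resource1", "ex:hasTitle", "Example title"),
]
print(match(triples, predicate="ex:hasIntendedAudience"))
```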
Subject: What is the Statement About?
- we can focus on describing an article / resource / work, then think about coarser or finer granularity:
- coarser: consortium collection / RDF map
- library collection / digital collection
- super-aggregate journal title / journal index
- aggregate: issue / festschrift
- focus on describing an article / resource / work
- component: section / graphics / page
- sub-component: paragraph / markup
- finer: word / rdf/xml
- uri / node
Predicate: What is the Aspect Described?
- similar coarse/fine breakdown:
- membership category
- access to resource
- access to content
- suitability rating
- audience and usage
- audience
- audience of audio-visual material
- diagram: possible audience map (partial) – unconstrained version to avoid collisions of isbd/dct/schema/rda/m21/frbrer
- different links can be made while still retaining proper semantic links
- currently constructing just one giant graph
What is the Aspect Described?
- coarse to fine:
- resource record
- manifestation record
- title and s.o.r
- title statement
- title of manifestation
- title word
- first word of title
- why do librarians need so many titles? Why not just use dublin core title and be done with it? Because we need it to do our work e.g. spine title to browse
- title = string identifier
- RDA: what to do with this? how do we apply these needs?
- possible semantic map (partial) – I won’t even try to reproduce this
- need to take into account names and ranges
- make it more difficult, but more powerful
Semantic Reasoning: The Sub-Property Ladder
- this is where the graph becomes useful
- machines can’t reason, so we’re defining the semantics such that we can give the rules to machines to process our data
- semantic rule:
- if property1 sub-property of property2;
- then data triple: resource property1 “string”
- implies data triple: resource property2 “string”
- otherwise, data triple remains the same
- simple enough for computer to carry out
- doesn’t matter how complex the map actually is, because it can still do it in a matter of seconds
- machine entailment: isbd: “has title proper” (finer) -> dct: “has title” (coarser)
- might sound simple, but it’s making a computer do inference
- ‘dumb(ing)-up’: data has been lost, but it is still meaningful – moved from one schema to another
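The semantic rule above is simple enough to sketch in a few lines of Python. This is my own illustration, not the speaker’s code: the property names are written as made-up CURIEs, only the isbd -> dct step comes from the notes, and the extra `ex:label` rung is a hypothetical addition to show the ladder being climbed more than one step.

```python
# Sub-property map: finer property -> coarser property.
SUB_PROPERTY_OF = {
    "isbd:hasTitleProper": "dct:title",
    "dct:title": "ex:label",  # hypothetical extra rung, not from the talk
}


def entail(triples, sub_property_of):
    """Return the input triples plus every triple implied by the ladder."""
    inferred = set(triples)
    for subject, predicate, value in triples:
        seen = {predicate}
        # Climb from finer to coarser until there is no parent property left.
        while predicate in sub_property_of:
            predicate = sub_property_of[predicate]
            if predicate in seen:  # guard against a cyclic map
                break
            seen.add(predicate)
            inferred.add((subject, predicate, value))
    return inferred


data = [
    ("ex:book1", "isbd:hasTitleProper", "Example title"),
    ("ex:book1", "ex:pageCount", "381"),  # no sub-property rule: stays as-is
]
for triple in sorted(entail(data, SUB_PROPERTY_OF)):
    print(triple)
```

Each step up the ladder keeps the string but loses the finer meaning of the property, which is the "dumbing-up" described above.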
Data Triples from Multiple Schema / Entailed from Sub-Property Map / from Property Domains
- frbrer: “has intended audience” – “primary school”
- isbd: “has note on use or audience” – “for ages 5-9”
- rda: “intended audience (work)” – “for children aged 7-“
- m21: “target audience” -> m21terms: -> “Juvenile”
- definition attached to the vocabulary
- also talking about granularity
- can map the sub-property to top level of unc: “has note on use or audience”
- “is a” frbrer: “work”, isbd: “resource”, rda: “work” – rda and frbr schema actually separate, not semantically linked even though vocabulary is similar and RDA is based on FRBR
- once stabilized can be drawn from each other
What is the Aspect Described?
- coarser to finer:
- creator
- author
- screenwriting
- animation screenwriting
- children’s cartoon screenwriting
- different controlled vocabulary
- graph of RDA for author/creator/screenwriting in relation to work and agent
- graph of same thing, but for dc for creator and agent
- what is the semantic relationship between the dct creator and the rda creator?
- marcrel author maps to dc contributor, not creator – what is the relationship between rda author and marcrel author?
- decision from 2005, needs to be reappraised and reviewed
- relationship between dc creator and dc contributor?
- how does lcsh “screenwriters” fit?
Machine-Generated Granularity
- also has issues
- e.g. full-text indexing: down to the word level
- BabelNet: A very large multilingual ontology
- can get quite complex and granular
User-Generated Granularity
- users can actually generate useful metadata
- can use statistical methods to remove extremes and come back with consensus
- going to cause granularity problems e.g. “OK for my kids (7 and 9)”, “Too childish for me (age 14)”
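The keynote did not say which statistical methods it had in mind. One common option is a trimmed mean: sort the values, drop a fraction from each end, and average what is left. Here is a minimal Python sketch, with invented "suitable from age" values standing in for user-supplied ratings.

```python
import statistics


def trimmed_mean(values, trim_fraction=0.1):
    """Drop the lowest and highest trim_fraction of values, then average the rest."""
    ordered = sorted(values)
    cut = int(len(ordered) * trim_fraction)
    kept = ordered[cut:len(ordered) - cut] if cut else ordered
    return statistics.mean(kept)


# Made-up user ratings: most cluster around 7-9, with two extremes (1 and 40).
suggested_ages = [7, 8, 7, 9, 8, 7, 8, 40, 1, 8]
print(statistics.mean(suggested_ages))  # pulled upward by the outlier
print(trimmed_mean(suggested_ages))     # closer to what most users said
```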
KISS
- keep it simple, stupid
- keep it simple and stupid?
- data model is very simple: triples!
- in terms of complexity, actually very simple
- but metadata content is complex
- and therefore, resource discovery is complex
- complex structure arising from the application of simple rules, as in the hard sciences and math
- simplicity is elegance
AAA
- Anyone can say anything about any thing
- someone will say something about every thing
- in every conceivable way
- and then constrained linguistically
OWA
- open world assumption: the absence of a statement is not a statement of non-existence
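One way to picture the difference, as a minimal Python sketch of my own (the triple and the function names are illustrative): under a closed-world reading a missing statement counts as false, while under the open-world assumption it is simply unknown.

```python
# The only statement we happen to hold about this resource.
STATED = {("ex:book1", "ex:hasIntendedAudience", "Juvenile")}


def closed_world(triple):
    """Closed world: anything not stated is treated as false."""
    return triple in STATED


def open_world(triple):
    """Open world: True if stated, otherwise None (unknown), never False."""
    return True if triple in STATED else None


missing = ("ex:book1", "ex:hasIntendedAudience", "Adult")
print(closed_world(missing))  # False: "we have no such statement, so it is not so"
print(open_world(missing))    # None: "nobody has said so yet"
```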
Will it get so granular that it becomes too complex?
And the rest is science
Break Time

Code4Lib Day 2: Lightning Talks
Demian Katz – gamebooks.org, Geeby-Deeby, and the Dime Novel Bibliography Project.
- interactive fiction books
- Made a really big bibliography
- people sending in more
- put it in a big database
- open-source project to adapt backend
- MySQL db that models various types of entities, relationships
- backend system to edit entities, enter data, edit relationships
Rachel Frick – LODLAM Summit 2013 and Challenge
- linked open data in library, archive, and museum
- had challenge to win cash prize
- can still submit, just fill out form and make/submit video
Kenny Ketner – Occam’s Reader
- allow libraries to lend ebooks to each other using document delivery model
- compatible with ILLiad
- no training, no workflow change
- can add formats incrementally (started with PDF)
- basic interface that discourages abuse
- access previously inaccessible resources
- in alpha testing
- Slides
Al Cornish – Orbis Cascade Alliance Shared ILS Project
- primary service is consortial services
- move to new system beyond traditional ILS
- single shared system (vs. currently 37)
- explore collaborative technical services, collection development
- Alma + Primo selected, migration in progress
Makoto Okamoto – Crowd Funding for Library in Japan
- culture of donation changed a lot after 3.11
- share experience and metrics
- key to success is setting up appropriate ticket
William Denton – Code4Lib 2013 Augmented Reality View in Layar
- picking points of interest from two spots
- 1: Google Places map
- 2: Twitter search API – of the ones that are geolocated
- Web service in Ruby and Sinatra, hosted on Heroku
- Rainbows End by Vernor Vinge – go read it.
- Slides
Rosalyn Metz – What I learned while I was away
- learned about planning, budget, and time
- time is the most important thing
- track your time, might be spending too much time on things e.g. don’t spend more than 10 mins on a single email
- can demonstrate where you need help
- Slides
Nettie Lagace – Recent Cool Fun NISO Activities
- ResourceSync Framework Specification
- Bibliographic Roadmap Initiative
- Slides
Chuck Koscher – Fundref
- list of funders
- which articles resulted from which funding
Andromeda Yelton – Five Conversations About Coding
- computer science majors in 1995: yardstick of who is cooler than who dependent on the most arcane knowledge
- boston python workshop 2012: women friendly course. Expecting to be judged.
- chad nelson, monday night. It’s not free
- bess sadler, yesterday. We have a problem with insecurity
- important to recognize our limitations, but have this imaginary yardstick
- ever done coding? majority. think coder? 1/2
Jeremy Morse – mPach: Publishing directly into HathiTrust
- sorry, didn’t quite get this one
Rob Dumas – Git in Five Minutes
- source control software
- accountability – record of changes over time
- can keep branching and merging
- Going from SVN to Git
- Slides
That’s all for today.
