linked data – Page 2 – Learning (Lib)Tech

Code4lib Day 2: Lightning Talks

Scott Hanrath – Zotero and SHERPA/RoMEO API mashup

quick and dirty way to filter a collection of articles by publisher policies
use Zotero and SHERPA/RoMEO APIs to tag articles with publisher policies
workflow?
zotero plugin?
Code on github
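
The tagging idea can be sketched roughly as follows. This is not the speaker's code: the item dictionaries only loosely mirror Zotero's item JSON, the `romeo:` tag prefix and the ISSNs are made up for illustration, and a real version would fetch items from the Zotero API and policies from the SHERPA/RoMEO API rather than use in-memory data.

```python
from typing import Dict, List


def tag_items_with_policy(items: List[dict], policies_by_issn: Dict[str, str]) -> List[dict]:
    """Return copies of the items, adding a 'romeo:<policy>' tag when the ISSN is known.

    `policies_by_issn` is a hypothetical lookup table (ISSN -> RoMEO colour)
    standing in for calls to the SHERPA/RoMEO API.
    """
    tagged = []
    for item in items:
        policy = policies_by_issn.get(item.get("ISSN", ""))
        tags = list(item.get("tags", []))  # copy so the input is not mutated
        if policy is not None:
            tags.append({"tag": f"romeo:{policy}"})
        tagged.append({**item, "tags": tags})
    return tagged


def filter_by_policy(items: List[dict], policy: str) -> List[dict]:
    """Keep only the items carrying the tag for the given policy."""
    wanted = {"tag": f"romeo:{policy}"}
    return [item for item in items if wanted in item.get("tags", [])]


# Made-up sample data for illustration only.
articles = [
    {"title": "Article A", "ISSN": "1111-1111", "tags": []},
    {"title": "Article B", "ISSN": "2222-2222", "tags": [{"tag": "to-read"}]},
]
policies = {"1111-1111": "green", "2222-2222": "white"}
green_articles = filter_by_policy(tag_items_with_policy(articles, policies), "green")
```

Keeping the policy as an ordinary tag means the filtering can also be done inside Zotero itself once the tags are written back.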

David Walker – Basic Learning Tool Interoperability (LTI) Protocol

Need LMS to pull all the relevant library information, items, etc.
In LMS, register library tool as if it were a native building block
When inserted into a course, the tool appears in a little iframe
Hidden form elements post to the tool, carrying course data and security information (OAuth)
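
The hidden-form launch can be sketched as below, as I understand Basic LTI: the LMS collects launch fields, signs them with OAuth 1.0 (HMAC-SHA1), and posts them to the tool as hidden inputs. The consumer key, secret, IDs, and tool URL are placeholders, and the sketch skips details such as URL normalization, so treat it as an illustration rather than a compliant implementation.

```python
import base64
import hashlib
import hmac
from html import escape
from urllib.parse import quote


def _enc(value):
    # OAuth 1.0 percent-encoding: only unreserved characters stay as-is.
    return quote(str(value), safe="-._~")


def launch_params(consumer_key, resource_link_id, context_id, nonce, timestamp):
    """Collect the fields for a basic launch (all IDs here are illustrative)."""
    return {
        "lti_message_type": "basic-lti-launch-request",
        "lti_version": "LTI-1p0",
        "resource_link_id": resource_link_id,
        "context_id": context_id,  # identifies the course
        "oauth_consumer_key": consumer_key,
        "oauth_nonce": nonce,
        "oauth_timestamp": str(timestamp),
        "oauth_signature_method": "HMAC-SHA1",
        "oauth_version": "1.0",
    }


def sign_launch(url, params, consumer_secret):
    """Return a copy of params with an HMAC-SHA1 oauth_signature added.

    Simplification: the URL is assumed to be already normalized
    (lowercase scheme and host, no default port, no query string).
    """
    pairs = sorted((_enc(k), _enc(v)) for k, v in params.items())
    normalized = "&".join(f"{k}={v}" for k, v in pairs)
    base_string = "&".join(["POST", _enc(url), _enc(normalized)])
    key = _enc(consumer_secret) + "&"  # a launch carries no token secret
    digest = hmac.new(key.encode("utf-8"), base_string.encode("utf-8"), hashlib.sha1).digest()
    signed = dict(params)
    signed["oauth_signature"] = base64.b64encode(digest).decode("ascii")
    return signed


def hidden_form(url, signed):
    """Render the signed fields as the hidden form the LMS would post."""
    fields = "\n".join(
        f'  <input type="hidden" name="{escape(k)}" value="{escape(str(v))}">'
        for k, v in signed.items()
    )
    return f'<form action="{escape(url)}" method="post">\n{fields}\n</form>'
```

The tool recomputes the signature with the shared secret and rejects the launch if it does not match, which is what lets it trust the course data in the form.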

Peter Murray – Introducing FOSS4LIB.org

Lyrasis’ response to survey on what librarians wanted
open source adoption is still in the early adopters stage
thus, website was created
determine whether OSS is right for the library including cost
help to select software
Call to action: register packages, releases, events, providers

Mark Matienzo – I’ve Got Good News

C4L11: fiwalk with me – using open source digital forensics software to support pre-ingest work
update of work since then
pluggable
could integrate anything
two working plugins: virus scanner, file format identification against PRONOM
Code on github
BitCurator

Mike Durbin – Edge Cases – Digitizing and delivering undescribed items in EAD

should automate as much of the workflow as possible
items are selected for digitization and scanned; a spreadsheet with ID and sequence is created or updated; image files are named according to ID/sequence
files are put in for automated processing, including quality control; they are pushed into the master file archive and ingested into Fedora, and an e-mail is sent to the collection manager
Finally, publication

Ryuuji Yoshimoto – Introducing CALIL.JP, scraping/mashup all of OPACs in JAPAN! PDF Slides

OPACs have no API
so start scraping OPACs, fighting with dirty HTML
2 months to scrape 200+ OPACs
CALIL.JP
real-time holdings by ISBN through the CALIL API, which returns XML or JSON
item information from Amazon and Google
now have many third-party apps e.g. browser extension
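
A holdings lookup by ISBN might look something like the sketch below. The endpoint, parameter names, and response shape are assumptions for illustration and should be checked against the CALIL API documentation; the library names and statuses are made up, and no network call is made.

```python
import json
from urllib.parse import urlencode

CHECK_URL = "https://api.calil.jp/check"  # assumed endpoint


def build_check_url(app_key, isbn, system_ids):
    """Build a holdings query URL (parameter names are assumptions)."""
    query = {
        "appkey": app_key,
        "isbn": isbn,
        "systemid": ",".join(system_ids),
        "format": "json",
    }
    return f"{CHECK_URL}?{urlencode(query)}"


def holdings(response_text, isbn):
    """List (system, library, status) tuples from an assumed JSON response shape."""
    data = json.loads(response_text)
    rows = []
    for system_id, info in data.get("books", {}).get(isbn, {}).items():
        for library, status in (info.get("libkey") or {}).items():
            rows.append((system_id, library, status))
    return sorted(rows)


# Hypothetical response, used here instead of a live request.
sample = json.dumps({
    "books": {
        "4834000826": {
            "System_A": {"status": "OK", "libkey": {"Central": "Available", "North": "On loan"}},
            "System_B": {"status": "OK", "libkey": {}},
        }
    }
})
rows = holdings(sample, "4834000826")
```

Separating URL building from response parsing keeps the parsing testable without an application key.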

Kåre Fiedler Christiansen – Chucking all the software components in a library together to present recorded radio and tv

built MPEG -> streaming server
website -> cool design
cool design, website, streaming server, access control -> cool website
except lawyers, oh noes!
PDF Slides

Joel Richard – Introducing Macaw: Metadata Collection Tool for Book-like things

digitizing lots of book-like things including pamphlets
most libraries send items to the Internet Archive and then to the Biodiversity Heritage Library
but some items too large to fit on usual scanning hardware
had to use camera, but had to add metadata
Macaw collects metadata but doesn’t really do workflow
two views: thumbnails and list
take data from wherever, using Z39.50 or CSV, into Macaw
custom export from Macaw, including Internet Archive, the library
each piece is modular
Code on Google Code

Rachel Frick – LOD-LAM Incubator Project

Linked Open Data for Libraries, Archives, and Museums
lightweight approach in terms of funding and consultation
timeline: March – May = recruit panel, fundraising, open comment

Mao Tsunekawa – Project Shizuku : Making Friends in Libraries

Shizuku 2.0
software development project supporting encounters among library users
not recommending books, recommending users instead
visualize circulation data for finding other users reading the same books
can share history of reading books
developing Baron which allows searching OPAC and then making friends
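
The idea of recommending users rather than books can be sketched with a few lines over circulation records. This is only an illustration of the general approach, not how Shizuku is implemented; the loan records and user names are made up.

```python
from collections import defaultdict
from itertools import combinations


def shared_readers(loans):
    """Map each pair of users to the set of books both have borrowed.

    `loans` is an iterable of (user, book) pairs.
    """
    readers_by_book = defaultdict(set)
    for user, book in loans:
        readers_by_book[book].add(user)
    shared = defaultdict(set)
    for book, readers in readers_by_book.items():
        for a, b in combinations(sorted(readers), 2):
            shared[(a, b)].add(book)
    return dict(shared)


def suggest_users(loans, user):
    """Rank other users by the number of books they have in common with `user`."""
    counts = {}
    for (a, b), books in shared_readers(loans).items():
        if user in (a, b):
            other = b if a == user else a
            counts[other] = len(books)
    # Most shared books first; ties broken alphabetically.
    return sorted(counts, key=lambda u: (-counts[u], u))


# Made-up circulation data.
loans = [
    ("aki", "B1"), ("ben", "B1"),
    ("aki", "B2"), ("ben", "B2"), ("chie", "B2"),
    ("dai", "B3"),
]
suggestions = suggest_users(loans, "aki")
```

A real service would also need users to opt in before their reading history is shared.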

Keith Folsom – Archivists’ Toolkit Database Server on an Amazon EC2 Instance

multi-institutional
hosting on a small Amazon EC2 instance
Ubuntu/MySQL
single open port
download kit with PuTTY
going out of pilot

Rebecca Jones – Call for Services

Innovative Interfaces
provide SQL access
working on RESTful services
What services would people like to have?
Live Beta in March

Code4lib Day 2 Morning: Notes & TakeAways

I didn’t take full notes on all the presentations. I like to just sit back and listen to some of the presentations, especially if there are a lot of visuals, but I do have a few notes.

Full Notes for the following sessions:

Building Research Applications with Mendeley

by William Gunn, Mendeley

Number of tweets a PLoS article gets is a better predictor of number of citations than impact factor.
Mendeley makes science more collaborative and transparent. Great to organize papers and then extract and aggregate research data in the cloud.
Can use impact factor as a relevance ranking tool.
Linked Data is currently by citation, but they now also have tag co-occurrences, etc.
Link to slides.

NoSQL Bibliographic Records: Implementing a Native FRBR Datastore with Redis

No notes. Instead, have the link to the presentation complete with what looks like speaker notes.

Ask Anything!

Things not taught in library school: all the important things, social skills, go talk to the professor directly if you want to get into CS classes.
Memento project and UK Archives inserting content for their 404s.
In response to librarians lamenting loss of physical books, talk to faculty in digital humanities to present data mining etc., look at ‘train based’ circulations, look at ebook stats.
Take a look at libcatcode.org for library cataloguers learning to code, as well as Code Year hosted by Codecademy.

Code4lib Day 1 Morning: HTML5, Microdata and Schema.org (and other takeaways)

I did not take notes on everything, in part because some of it was very technical and it can be hard to take notes, but here are some takeaways from the morning:

Version Control: Use it, Git or Mercurial. Doesn’t need to be code, can be data too. – Description and Slides
Take library data and make it available to users, can’t expect them to search for it.
Linked Data doesn’t need to be a huge project. Start small.
Why RDF? It’s flexible with easy addition of new attributes or classes, and works cleanly with an iterative approach.
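
The flexibility point can be illustrated with plain Python tuples standing in for RDF triples (no RDF library is used here). The `ex:` identifiers are made up; the `dc:`, `foaf:`, and `rdf:` names are written as prefixed strings only for readability.

```python
# Each statement is a (subject, predicate, object) triple.
triples = {
    ("ex:book1", "dc:title", "Example Title"),
    ("ex:book1", "dc:creator", "ex:person1"),
}


def describe(graph, subject):
    """Collect everything said about one subject as {predicate: {objects}}."""
    description = {}
    for s, p, o in graph:
        if s == subject:
            description.setdefault(p, set()).add(o)
    return description


# Iterating later: a new attribute or a new class is just another triple.
# Nothing already stored has to be restructured.
triples.add(("ex:book1", "dc:subject", "Linked data"))
triples.add(("ex:person1", "rdf:type", "foaf:Person"))

book = describe(triples, "ex:book1")
```

Contrast this with a fixed table, where adding a subject column or a person type would mean changing the schema first.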

HTML5 Microdata and Schema.org

by Jason Ronallo

Beyond getting a good ranking, we need to provide rich results, i.e. rich snippets. Some digital collections have been providing rich snippets already, such as NCSU Libraries.

How do we get this?

embedded semantic markup
HTML5 Semantics include nav, header, article, section, footer
HTML5 Microdata is a syntax for annotating content to communicate meaning of data to machines
similar to RDFa and other embedded metadata syntaxes
Microdata can be extracted as tree-based JSON and is also accessible through a DOM API

For example:

<div itemscope itemtype="http://schema.org/Organization" itemref="logo">
<a itemprop="url" href="http://code4lib.org/">
<span itemprop="name">Code4Lib</span>
</a>
</div>
where: itemscope = the element is about something (an item)
itemtype = type of item
itemprop = properties of the item

For the user, there is no difference as display is the same. This provides a complete data model.
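
To show what the tree-based JSON looks like, here is a minimal extraction sketch using only Python's standard library. It is a simplification, not a full microdata parser: it ignores `itemref` and `itemid`, and the output shape (`items`, `type`, `properties`) follows my understanding of the JSON form described in the microdata specification.

```python
import json
from html.parser import HTMLParser

VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}
# Elements whose property value comes from an attribute rather than their text.
VALUE_ATTRS = {"a": "href", "area": "href", "link": "href", "img": "src",
               "meta": "content"}


class MicrodataParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.items = []  # top-level items found so far
        self.stack = []  # one frame per open element

    def _current_item(self):
        for frame in reversed(self.stack):
            if frame["item"] is not None:
                return frame["item"]
        return None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        owner = self._current_item()
        prop = attrs.get("itemprop")
        frame = {"tag": tag, "item": None, "prop": None, "owner": owner, "text": []}
        if "itemscope" in attrs:
            item = {"type": (attrs.get("itemtype") or "").split(), "properties": {}}
            frame["item"] = item
            if prop and owner is not None:
                owner["properties"].setdefault(prop, []).append(item)  # nested item
            else:
                self.items.append(item)
        elif prop and owner is not None:
            if tag in VALUE_ATTRS:
                value = attrs.get(VALUE_ATTRS[tag]) or ""
                owner["properties"].setdefault(prop, []).append(value)
            else:
                frame["prop"] = prop  # value is the element's text, set on close
        if tag not in VOID_TAGS:
            self.stack.append(frame)

    def handle_data(self, data):
        for frame in self.stack:
            if frame["prop"] is not None:
                frame["text"].append(data)

    def handle_endtag(self, tag):
        if tag not in [f["tag"] for f in self.stack]:
            return  # stray closing tag
        while self.stack:
            frame = self.stack.pop()
            if frame["prop"] is not None:
                text = "".join(frame["text"]).strip()
                frame["owner"]["properties"].setdefault(frame["prop"], []).append(text)
            if frame["tag"] == tag:
                break


def extract_microdata(html):
    parser = MicrodataParser()
    parser.feed(html)
    parser.close()
    return {"items": parser.items}


SAMPLE = """
<div itemscope itemtype="http://schema.org/Organization" itemref="logo">
<a itemprop="url" href="http://code4lib.org/">
<span itemprop="name">Code4Lib</span>
</a>
</div>
"""

if __name__ == "__main__":
    print(json.dumps(extract_microdata(SAMPLE), indent=2))
```

For the sample above, the result is a single item of type `http://schema.org/Organization` with `url` and `name` properties, each holding a list of values.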

Schema.org is a one-stop shop for vocabulary in describing items on the web.

Apologies, I did not take extensive notes on it, but to read more, check out the slides below or the Code4lib article he wrote.