Code4lib Day 2: Mobile Breakout Notes

Just a few notes if anyone is interested:

digital collections tool: wolf walk, digital images
- geolocation using JavaScript to make it available in HTML5
- find it easier in app store, means more people use app
HTML5 more clunky with jQuery mobile
native app smoother, especially Google Maps
mobile site: only force on homepage and opt out using query string
mobile app/site needs to be interoperable
designing: mobile framework better at bringing out ideas vs. developing web version of a website
how do you build up services? concentrate on what is needed on mobile devices
should have just what you need and do it well while taking advantage of mobile aspects e.g. bluetooth, GPS
how best to build?: REST layer on top of what’s available
time resource: if you know Objective-C, then it’s just adding functions
voice use: comfort level? accurate enough? difficulty with quiet/study areas/floors?
phone tap to reserve seats/room, other application?
staff use in stacks? Shelflister: barcode scanner inserts into web form for shelf reading or collection development, including circulation data
mobile hours: just give today’s hours or closed (give tomorrow’s hours)
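
The "today's hours, or closed plus tomorrow's hours" idea could be sketched roughly as follows. This is only an illustration: the `HOURS` schedule and the `mobile_hours` function are hypothetical names and values, not something shown in the session.

```python
from datetime import date, time, timedelta

# Hypothetical weekly schedule: weekday index (Monday = 0) -> (open, close),
# or None when the library is closed all day.
HOURS = {
    0: (time(8, 0), time(22, 0)),
    1: (time(8, 0), time(22, 0)),
    2: (time(8, 0), time(22, 0)),
    3: (time(8, 0), time(22, 0)),
    4: (time(8, 0), time(18, 0)),
    5: (time(10, 0), time(17, 0)),
    6: None,
}


def format_range(hours):
    """Render an (open, close) pair as HH:MM-HH:MM."""
    opens, closes = hours
    return f"{opens:%H:%M}-{closes:%H:%M}"


def mobile_hours(today):
    """Return the single line a mobile hours page would show for `today`."""
    todays = HOURS[today.weekday()]
    if todays is not None:
        return f"Today: {format_range(todays)}"
    # Closed today: show tomorrow's hours instead, if there are any.
    tomorrow = HOURS[(today + timedelta(days=1)).weekday()]
    if tomorrow is None:
        return "Closed today"
    return f"Closed today. Tomorrow: {format_range(tomorrow)}"


print(mobile_hours(date.today()))
```

The point of the design is that the mobile view answers one question (can I go now?) rather than reproducing the full weekly table.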

Code4lib Day 2 Afternoon: Notes and Takeaways

An afternoon of more presentations, which were a bit more technical in terms of data indexing, storage, and use. As a result, there are no detailed posts, but here are a few notes and takeaways.

Be careful when you try to parse a bunch of files you download from the web. Parse and store, distribute up front, and build a test index first.
Making Software Work – read it
The results of one study are not the truth.
It’s hard to make a study repeatable.
Does agile work? That’s the wrong question. Really, when does bug fixing have the highest cost?
High-risk bugs should be done as early as possible, instead of the easy bugs.
What language? Depends on the problem.
Make developer happiness hours. (block off time with no meetings)
Give people open sight lines instead of high cubicle walls.
Be as transparent as possible (e.g. JIRA) including progress.
Put projects into short iteration cycles.
No code without passing tests!
Slides (PDF) for the last talk: Practical Agile: What’s Working for Stanford, Blacklight, and Hydra by Naomi Dushay

In-browser Data Storage and Me

by Jason Casden, North Carolina State University

Suma: data collection application using in-browser storage.
Indexed Database API (aka IndexedDB, formerly WebSimpleDB) is where things seem to be going, but browser support is limited.
Web (DOM) Storage is basically universally supported.
Web SQL DB is still a viable option.
lawnchair: an object store with adapters for a long list of DBs/APIs.
persistence.js: asynchronous JavaScript object-relational mapper and adapters are being built out. Can be used with node.js and MySQL.

Slides

Code4lib Day 2 Morning: Notes & TakeAways

I didn’t take full notes on all the presentations. I like to just sit back and listen to some of the presentations, especially if there are a lot of visuals, but I do have a few notes.

Full Notes for the following sessions:

Building Research Applications with Mendeley

by William Gunn, Mendeley

Number of tweets a PLoS article gets is a better predictor of number of citations than impact factor.
Mendeley makes science more collaborative and transparent. Great to organize papers and then extract and aggregate research data in the cloud.
Can use impact factor as a relevance ranking tool.
Linked Data right now by citation, but now have tag co-occurrences, etc.
Link to slides.

NoSQL Bibliographic Records: Implementing a Native FRBR Datastore with Redis

No notes. Instead, have the link to the presentation complete with what looks like speaker notes.

Ask Anything!

Things not taught in library school: all the important things, social skills, go talk to the professor directly if you want to get into CS classes.
Memento project and UK Archives inserting content for their 404s.
In response to librarians lamenting loss of physical books, talk to faculty in digital humanities to present data mining etc., look at ‘train based’ circulations, look at ebook stats.
Take a look at libcatcode.org for library cataloguers learning to code as well as Code Year hosted by Codecademy.

Code4lib Day 2: How People Search the Library from a Single Search Box

by Cory Lown, North Carolina State University

While there is only one search box, typically there are multiple tabs, which is especially true of academic libraries.

73% of searches from the home page start from the default tab
which was actually opposite of usability tests

Home grown federated search includes:

catalog
articles
journals
databases
best bets (60 hand crafted links based on most frequent queries e.g. Web of Science)
spelling suggestions
loaded links
FAQs
smart subjects

Show top 3-4 results with link to full interface.

Search Stats

From Fall 2010 and Spring 2011: ~739k searches, 655k click-throughs

By section:

7.8% best bets (sounds very little, but actually a lot for 60 links)
41.5% articles, 35.2% books and media, 5.5% journals, ~10% everything else
23% looking for other things, e.g. library website
for articles: 70% first 3 results, other 30% see all results
catalogue use is fairly stable over time, but article use peaks at the end of term
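
To see why 7.8% is "a lot for 60 links", here is a quick back-of-the-envelope calculation. It assumes the by-section percentages are shares of the 655k click-throughs, which the notes do not state explicitly.

```python
# Figures reported in the notes above (both are rounded in the source).
searches = 739_000
click_throughs = 655_000
best_bet_share = 0.078  # assumption: 7.8% is a share of click-throughs
best_bet_links = 60

clicks_per_search = click_throughs / searches
best_bet_clicks = click_throughs * best_bet_share
clicks_per_best_bet = best_bet_clicks / best_bet_links

print(f"click-throughs per search: {clicks_per_search:.2f}")
print(f"best-bet click-throughs: {best_bet_clicks:,.0f}")
print(f"click-throughs per best-bet link: {clicks_per_best_bet:,.0f}")
```

Under that assumption, each hand-crafted link accounts for roughly 850 click-throughs over the two terms.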

How do you make use of these results?

Top search terms are fairly stable over time. You can make the top queries work well for people (~37k) by using the best bets.
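
A best-bets lookup can be as simple as a hand-maintained table keyed on the normalized query. The sketch below is an assumption about how such a table might work, not the NCSU implementation; `BEST_BETS`, `normalize`, `best_bet`, and the example.org URLs are all made up for illustration.

```python
# Hypothetical best-bets table: normalized query -> (label, URL).
BEST_BETS = {
    "web of science": ("Web of Science", "https://example.org/databases/web-of-science"),
    "jstor": ("JSTOR", "https://example.org/databases/jstor"),
}


def normalize(query):
    """Lowercase and collapse whitespace so trivial variants match."""
    return " ".join(query.lower().split())


def best_bet(query):
    """Return the (label, URL) best bet for a query, or None if there is none."""
    return BEST_BETS.get(normalize(query))


print(best_bet("  Web of   Science "))
print(best_bet("organic chemistry"))
```

Because the top search terms are stable over time, a small table like this can stay useful without constant upkeep.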

Single/default search signals that our search tools will just work.

It’s important to consider what the default search box doesn’t do, and doubly important to rescue people when they hit that point.

Dynamic results drive traffic. When a few actual results were shown, use of the catalogue for books went up a lot compared to just suggesting that people use the catalogue.

Collecting Data

Custom log is being used right now by tracking searches (timestamp, action, query, referrer URL) and tracking click-throughs. An alternative might be to use Google Analytics.
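
A custom log with those four fields might look something like this. It is a minimal sketch that assumes a CSV log and the action names "search" and "click"; the actual log format used at NCSU was not described.

```python
import csv
import io
from collections import Counter
from datetime import datetime, timezone

# The four fields mentioned in the talk.
FIELDS = ["timestamp", "action", "query", "referrer"]


def log_event(writer, action, query, referrer, when=None):
    """Append one search or click-through event to the log."""
    when = when or datetime.now(timezone.utc)
    writer.writerow({
        "timestamp": when.isoformat(),
        "action": action,  # assumed values: "search" or "click"
        "query": query,
        "referrer": referrer,
    })


def top_queries(rows, n=10):
    """Count the most frequent search queries, ignoring case and outer spaces."""
    counts = Counter(
        row["query"].strip().lower() for row in rows if row["action"] == "search"
    )
    return counts.most_common(n)


# Usage: write a few events to an in-memory log, then read them back.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writeheader()
log_event(writer, "search", "Web of Science", "https://example.org/")
log_event(writer, "click", "Web of Science", "https://example.org/search")
buffer.seek(0)
print(top_queries(csv.DictReader(buffer)))
```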

For more, see the slides below or read the C&RL Article Preprint.

Code4lib Day 2: Discovering Digital Library User Behavior with Google Analytics

by Kirk Hess, University of Illinois Urbana-Champaign

Why Google Analytics?

free
JavaScript based
small tracking image (visible via Firebug) = mostly users not bots
works across domains
easy to integrate with existing system
API

Some useful things in the interface:

heat map
content drill down – click on page and see where users went from there
visitor flow
events

Export Data Using API

Analytics API
Java or JavaScript (presumably any language, actually)
export any field into a database for further analysis (in this case MySQL db)
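
Once the fields are exported, loading them into a database is straightforward. The sketch below uses Python's built-in `sqlite3` as a stand-in for the MySQL database mentioned in the talk, and the `(page_path, day, views)` rows are hypothetical; it does not call the Analytics API itself.

```python
import sqlite3


def store_pageviews(conn, rows):
    """Store exported (page_path, day, views) rows, replacing earlier exports."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pageviews ("
        "page_path TEXT, day TEXT, views INTEGER, "
        "PRIMARY KEY (page_path, day))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO pageviews (page_path, day, views) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()


def total_views(conn):
    """Return (page_path, total views) pairs, most viewed first."""
    return conn.execute(
        "SELECT page_path, SUM(views) FROM pageviews "
        "GROUP BY page_path ORDER BY SUM(views) DESC, page_path"
    ).fetchall()


# Usage with made-up rows standing in for an API export.
conn = sqlite3.connect(":memory:")
store_pageviews(conn, [
    ("/item/1", "2012-02-07", 3),
    ("/item/1", "2012-02-08", 2),
    ("/item/2", "2012-02-08", 1),
])
print(total_views(conn))
```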

Analyze Data

Which items are popular?
How many times was an item viewed?
Downloaded?
Effective collection size – see if people seeing/using
found typically, many things are not popular
discover a lot of other things about users
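
One way to put a number on "effective collection size" is to count the items that were viewed at least once. That definition is an assumption made for this sketch (the talk did not give a formula), and the function name and sample data are hypothetical.

```python
def effective_collection_size(all_item_ids, view_counts, min_views=1):
    """Return (number of items used, share of the collection used).

    An item counts as used when it has at least `min_views` recorded views.
    """
    items = set(all_item_ids)
    used = {item for item in items if view_counts.get(item, 0) >= min_views}
    share = len(used) / len(items) if items else 0.0
    return len(used), share


# Usage with made-up data: four items, two of which were ever viewed.
collection = ["img-001", "img-002", "img-003", "img-004"]
views = {"img-001": 12, "img-003": 1}
print(effective_collection_size(collection, views))
```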

Next Steps

found, need to change site design
change search weighting
- allow users to sort by popularity (based on previous data)
- recommender system – think Amazon
add new tracking/new repositories
analyze webstats – hard to look at direct access
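
Sorting by popularity could reuse the view counts already collected. A minimal sketch, assuming results arrive as a list of item IDs in relevance order and that `view_counts` is a hypothetical mapping built from the earlier tracking data:

```python
def sort_by_popularity(results, view_counts):
    """Order item IDs by past view count, most viewed first.

    Python's sort is stable, so items with equal counts keep their
    original (relevance) order.
    """
    return sorted(
        results,
        key=lambda item_id: view_counts.get(item_id, 0),
        reverse=True,
    )


# Usage with made-up item IDs and counts.
print(sort_by_popularity(["a", "b", "c", "d"], {"b": 5, "c": 5, "a": 1}))
```

Items with no recorded views fall to the end, so a real system would probably want some way to surface new material as well.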

Moving away from JavaScript-based tracking since a lot of mobile devices don’t support it.

The event analysis code has been posted on GitHub; the code for adding events to links will be added later to his GitHub account.