Tag: statistics

2013 Blog Stats in Review
This year’s annual report was not nearly as exciting as last year’s. While the busiest day’s top post was unrelated, once again February 13th was the top view date thanks to Code4Lib.
Year 2012 in review
The WordPress.com stats helper monkeys prepared a 2012 annual report for this blog.
Here’s an excerpt:
4,329 films were submitted to the 2012 Cannes Film Festival. This blog had 32,000 views in 2012. If each view were a film, this blog would power 7 Film Festivals.
No surprise that the Twitter and WordPress plugin posts got the most views. However, the blog hit a high of 590 views in a day due to the Code4Libcon 2012 Opening Keynote post.
I also didn’t quite realize that I had made 89 posts in 2012, since my goal is one per week!
Code4lib Day 2 Afternoon: Notes and Takeaways
An afternoon of more presentations, which were a bit more technical in terms of data indexing, storage, and use. As a result, there are no detailed posts, but here are a few notes and takeaways.
- Be careful when you try to parse a bunch of files you download from the web. Parse and store, distribute up front, and build a test index first.
- Making Software Work – read it
- The results of one study are not the truth.
- It’s hard to make a study repeatable.
- Does agile work? That’s the wrong question. The better question is: when does bug fixing have the highest cost?
- High-risk bugs should be tackled as early as possible, rather than the easy bugs first.
- What language? Depends on the problem.
- Make developer happiness hours. (block off time with no meetings)
- Give people open sight lines instead of high cubicle walls.
- Be as transparent as possible (e.g. JIRA) including progress.
- Put projects into short iteration cycles.
- No code without passing tests!
- Slides (PDF) for the last talk: Practical Agile: What’s Working for Stanford, Blacklight, and Hydra by Naomi Dushay
In-browser Data Storage and Me
by Jason Casden, North Carolina State University
- Suma: data collection application using in-browser storage.
- Indexed Database API (aka IndexedDB, formerly WebSimpleDB) is where things seem to be going, but browser support is limited.
- Web (DOM) Storage is basically universally supported (see the sketch below).
- Web SQL Database is still a viable option.
- lawnchair: a simple object store, with adapters for a long list of databases/APIs.
- persistence.js: an asynchronous JavaScript object-relational mapper; adapters are being built out. Can be used with node.js and MySQL.
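Not from the talk, but here is a minimal sketch of the kind of offline-friendly data collection that Suma-style in-browser storage enables, using the Web (DOM) Storage API noted above; the “suma-counts” key, record shape, and endpoint are invented for illustration.

```javascript
// Sketch only: record a count locally so data collection keeps working
// offline, then flush everything once the device can reach the server.
// The 'suma-counts' key, record fields, and endpoint are assumptions.
function recordCount(location, count) {
  var log = JSON.parse(localStorage.getItem('suma-counts') || '[]');
  log.push({ location: location, count: count, time: Date.now() });
  localStorage.setItem('suma-counts', JSON.stringify(log));
}

function flushCounts(endpoint) {
  var log = localStorage.getItem('suma-counts');
  if (!log) { return; }
  var xhr = new XMLHttpRequest();
  xhr.open('POST', endpoint);
  xhr.setRequestHeader('Content-Type', 'application/json');
  xhr.onload = function () { localStorage.removeItem('suma-counts'); };
  xhr.send(log);
}
```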
Slides
Code4lib Day 2: How People Search the Library from a Single Search Box
by Cory Lown, North Carolina State University
While there is only one search box, typically there are multiple tabs, which is especially true of academic libraries.
- 73% of searches from the home page start from the default tab
- which was actually the opposite of what usability tests suggested
Home grown federated search includes:
- catalog
- articles
- journals
- databases
- best bets (60 hand-crafted links based on the most frequent queries, e.g. Web of Science)
- spelling suggestions
- loaded links
- FAQs
- smart subjects
Show top 3-4 results with link to full interface.
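A rough, hypothetical sketch of the single-box pattern described above (fan the query out to each section and keep only the top few hits from each); the section list, the “/search/<section>” endpoints, and the function names are assumptions, not NCSU’s actual QuickSearch code.

```javascript
// Sketch only: query each section endpoint in parallel and keep the top
// few results from each, leaving a link to the full interface for the rest.
var SECTIONS = ['catalog', 'articles', 'journals', 'databases', 'bestbets'];

function quickSearch(query, perSection) {
  var requests = SECTIONS.map(function (section) {
    return fetch('/search/' + section + '?q=' + encodeURIComponent(query))
      .then(function (res) { return res.json(); })
      .then(function (results) {
        return { section: section, top: results.slice(0, perSection) };
      });
  });
  return Promise.all(requests);
}

// e.g. quickSearch('web of science', 3).then(renderSections);
```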
Search Stats
Across Fall 2010 and Spring 2011: ~739k searches and 655k click-throughs.
By section:
- 7.8% best bets (sounds like very little, but is actually a lot for 60 links)
- 41.5% articles, 35.2% books and media, 5.5% journals, ~10% everything else
- 23% looking for other things, e.g. library website
- for articles: 70% click one of the first 3 results; the other 30% go to “see all results”
- catalogue use is fairly stable over time, but article searching peaks at the end of term
How do you make use of these results?
Top search terms are fairly stable over time. You can make the top queries work well for people (~37k) by using the best bets.
Single/default search signals that our search tools will just work.
It’s important to consider what the default search box doesn’t do, and doubly important to rescue people when they hit that point.
Dynamic results drive traffic. When a few actual results were shown, use of the catalogue for books went up a lot compared to merely suggesting that people use the catalogue.
Collecting Data
A custom log is being used right now, tracking searches (timestamp, action, query, referrer URL) and click-throughs. An alternative might be to use Google Analytics.
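Something along these lines would capture the fields mentioned; the “/log” endpoint and field names here are guesses for illustration, not the actual NCSU logger.

```javascript
// Sketch only: log a search or a click-through with a timestamp, action,
// query, referrer, and (for clicks) the target URL. Endpoint is assumed.
function logEvent(action, query, target) {
  var record = {
    timestamp: new Date().toISOString(),
    action: action,          // e.g. 'search' or 'click'
    query: query,
    referrer: document.referrer,
    target: target || null   // clicked result URL, if any
  };
  var xhr = new XMLHttpRequest();
  xhr.open('POST', '/log');
  xhr.setRequestHeader('Content-Type', 'application/json');
  xhr.send(JSON.stringify(record));
}
```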
For more, see the slides below or read the C&RL Article Preprint.
Code4lib Day 2: Discovering Digital Library User Behavior with Google Analytics
by Kirk Hess, University of Illinois Urbana-Champaign
Why Google Analytics?
- free
- JavaScript based
- small tracking image (visible via Firebug), so it mostly counts users, not bots
- works across domains
- easy to integrate with existing systems (see the snippet below)
- API
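For context, integrating it amounts to dropping the standard asynchronous ga.js snippet of the era into the page templates; “UA-XXXXX-Y” below is a placeholder property ID.

```javascript
// The classic asynchronous Google Analytics (ga.js) snippet: it queues
// commands on _gaq and loads the tracking script without blocking the page.
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-XXXXX-Y']);
_gaq.push(['_trackPageview']);

(function () {
  var ga = document.createElement('script');
  ga.type = 'text/javascript';
  ga.async = true;
  ga.src = ('https:' === document.location.protocol ? 'https://ssl' : 'http://www') +
           '.google-analytics.com/ga.js';
  var s = document.getElementsByTagName('script')[0];
  s.parentNode.insertBefore(ga, s);
})();
```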
Some useful things in the interface:
- heat map
- content drill-down – click on a page and see where users went from there
- visitor flow
- events
Export Data Using API
- Analytics API
- Java or JavaScript (presumably anything, actually)
- export any field into a database for further analysis (in this case a MySQL database; see the sketch below)
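A hypothetical sketch of that export step: pull per-page pageview counts from the Core Reporting API (v3) so the rows can be loaded into something like MySQL. The profile ID and access token are placeholders, and the OAuth setup is omitted.

```javascript
// Sketch only: fetch pageviews per page path for a date range. The
// response's 'rows' array can then be inserted into a database table.
var params = [
  'ids=ga:12345678',                 // placeholder profile (view) ID
  'start-date=2012-01-01',
  'end-date=2012-12-31',
  'metrics=ga:pageviews',
  'dimensions=ga:pagePath',
  'access_token=YOUR_OAUTH_TOKEN'    // placeholder token
].join('&');

fetch('https://www.googleapis.com/analytics/v3/data/ga?' + params)
  .then(function (res) { return res.json(); })
  .then(function (data) {
    // Each row is [pagePath, pageviews]; store for further analysis.
    console.log(data.rows);
  });
```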
Analyze Data
- Which items are popular?
- How many times was an item viewed?
- Downloaded?
- Effective collection size – see whether people are actually seeing/using items
- typically found that many things are not popular
- discover a lot of other things about users
Next Steps
- found they need to change the site design
- change search weighting
- allow users to sort by popularity (based on previous data)
- recommender system – think Amazon
- add new tracking/new repositories
- analyze webstats – hard to look at direct access
Moving away from JavaScript-based tracking since a lot of mobile devices don’t have JavaScript.
The event analysis code has been posted on GitHub, and the code for adding events to links will be added to his GitHub account later.
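To give a rough idea of what adding events to link code can look like with classic ga.js, here is a sketch; the “item-download” class and the category/action names are assumptions, not from the talk.

```javascript
// Sketch only: report clicks on download links as Google Analytics events
// (category, action, label) so downloads show up under Content > Events.
var links = document.querySelectorAll('a.item-download');
for (var i = 0; i < links.length; i++) {
  links[i].addEventListener('click', function () {
    _gaq.push(['_trackEvent', 'Repository', 'Download', this.href]);
  });
}
```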