Code4LibBC Day 1: Lightning Talks Part 2

Code4LibBC mornings are all about lightning talks. Here’s Part 2.

AtoM’s XML-to-XSLT conversion feature for creating user-friendly PDF finding aids

Dan Gillean

What is AtoM? Access to Memory
migrating SFU Archives and Special Collections to Archivematica and AtoM
researchers want to browse whole descriptions, so these are turned into PDF finding aids
transform EAD output into PDF
EAD + XSLT + XSL-FO / Apache FOP = PDF
a job scheduler generates the PDF asynchronously when an admin requests it
next steps: dev project underway: print full lower-level descriptions; make available in PDF, RTF, and TXT; make options configurable for AtoM; custom style sheets; will be included in AtoM 2.2 release (2015)
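
To make the EAD + XSLT + XSL-FO / Apache FOP = PDF step concrete, here is a minimal sketch of how such a transformation could be driven from a script. It is not AtoM's actual implementation: the function names and the file names `ead.xml`, `ead-pdf.xsl`, and `finding-aid.pdf` are hypothetical, and it assumes the Apache FOP command-line launcher is installed and on the PATH as `fop`.

```python
import subprocess
from pathlib import Path


def build_fop_command(ead_xml, stylesheet, pdf_out, fop_bin="fop"):
    """Build the Apache FOP command line for EAD -> XSL-FO -> PDF.

    FOP applies the XSLT stylesheet to the XML input and renders the
    resulting XSL-FO document as a PDF in a single invocation.
    """
    return [
        fop_bin,
        "-xml", str(ead_xml),     # EAD export of the archival description
        "-xsl", str(stylesheet),  # XSLT that turns EAD into XSL-FO
        "-pdf", str(pdf_out),     # rendered finding aid
    ]


def generate_finding_aid(ead_xml, stylesheet, pdf_out):
    """Run FOP and return the PDF path.

    Raises subprocess.CalledProcessError if the transformation fails.
    """
    subprocess.run(build_fop_command(ead_xml, stylesheet, pdf_out), check=True)
    return Path(pdf_out)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(build_fop_command("ead.xml", "ead-pdf.xsl", "finding-aid.pdf")))
```

In a setup like the one described in the talk, a call such as `generate_finding_aid(...)` would be handed to the job scheduler rather than run during a web request, since rendering a large finding aid can take a while.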

Web Scraping for Fun and Profit

Mark Jordan

normally you would need to walk through the page structure to pull out the information you want
why scrape? No API to content, want very specific data, but don’t care about visual layout
why not? there is an API to content, and sometimes it won’t work on a site
real examples: convert author pages in SFU’s IR to linked data; gathering info about OJS journals; getting a full list of CONTENTdm collections (but didn’t work due to use of JavaScript)
some little scripts to grab titles. Code on GitHub
Non-Coding Approaches to Scraping, e.g. import.io, ScraperWiki
Browser plugins: Web Scrape (Chrome), Outwit Hub (Firefox)
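
As an illustration of what a "little script to grab titles" might look like, here is a minimal sketch using only the Python standard library. It is not the speaker's code from GitHub: the class and function names are hypothetical, and the sample HTML is made up.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text inside the page's <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only keep text that appears between <title> and </title>.
        if self._in_title:
            self._parts.append(data)

    @property
    def title(self):
        return "".join(self._parts).strip()


def extract_title(html):
    """Return the <title> text of an HTML document, or '' if there is none."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title


if __name__ == "__main__":
    # Made-up sample page, for illustration only.
    sample = "<html><head><title> Example Journal </title></head><body></body></html>"
    print(extract_title(sample))
```

In practice the HTML string would come from fetching a page, for example with `urllib.request.urlopen(url).read().decode()`. As noted for the CONTENTdm case, this approach only sees the markup the server sends, so content that is built in the browser with JavaScript will not be found.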

Dethe Elza

Webmaker – tools to help people create things for the web and learn to code
embedded journalist
Software Carpentry to teach scientists to code
Hive Vancouver – unite all the learning activities in a city e.g. after school clubs
it’s a catalogue, and hosts events
1st event was a pop-up event at VPL
Vancouver-based or web-based learning resources can go up on the website