Big Data
- 90% of the world’s data was created in the last 2 years
- can tell us much that other information cannot
- emphasizes the need for analysis and interpretation
- your data is mined and used to make decisions for you, even more so in the future
- to prepare, know that big data will affect data management, discovery tools, new jobs, revised skills requirements, and revised infrastructures
- the businesses that come out ahead will be the ones that have the most data and know how best to use it
Data Visualization
Patrick Cain
- examples: homicide map, number of registered organ donors, bedbugs, drunk driving, crowdsourced neighbourhood maps
- providing information on what is otherwise a closed database
- problem is analyzing the data and drawing conclusions
- made people look at their community/neighbourhood in a different way
- more examples: includes information on free visualization tools
- a lot of public documents are behind a wall
- finding the balance between potential harm and making data open
- the data is imperfect, but it helps to tell a story
- have information on where the data comes from
- raw data gives you more power, can ask your own questions, but still have the difficulty of interpreting the data
The Changing Face of Toronto Public Library’s Data
Alan Harnum
- patrons do the usual: borrow, use reference services, ask questions, attend programs, use computers and wireless, visit the website, etc.
- end up with a lot of transactional data, but it’s not all one big system, and not always available right away, in the form we want it, or at all
- conceptualization and modelling of data
- usage patterns are changing: digital borrow increasing, data needs are changing, policy needs are changing
- understanding data relationships: what influences what, e.g. how many people use the library only for wireless?
- need for real-time data to improve service planning, responsiveness to inquiries, decision making
- policy implications
- privacy – need to balance wanting to collect data with protecting customer privacy (datasets can be deanonymized)
- service delivery – robust data, but without losing what patrons like
- evaluation and measurement – how to use effectively while remembering that data is not the only decision making tool
- most important to remember: not everything that counts is counted, and not everything that is counted counts
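The wireless-only question above can be sketched as a small set comparison over transaction records. This is only an illustration: the patron IDs, the service names, and the `only_users_of` helper are made up, and real transactional data would live in several systems and need pseudonymized IDs to respect the privacy concerns noted above.

```python
from collections import defaultdict

# Hypothetical (patron_id, service) records; not real TPL data.
transactions = [
    ("p1", "wireless"),
    ("p1", "borrow"),
    ("p2", "wireless"),
    ("p3", "program"),
    ("p4", "wireless"),
    ("p4", "wireless"),
    ("p5", "borrow"),
]


def services_by_patron(records):
    """Map each patron to the set of services they used."""
    services = defaultdict(set)
    for patron_id, service in records:
        services[patron_id].add(service)
    return dict(services)


def only_users_of(records, service):
    """Return patrons whose only recorded service is `service`."""
    per_patron = services_by_patron(records)
    return sorted(p for p, used in per_patron.items() if used == {service})


wireless_only = only_users_of(transactions, "wireless")
print(wireless_only)  # ['p2', 'p4']
```

The same helper answers the question for any other single service, which is the kind of "what influences what" query that is hard to ask when the data sits in separate systems.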
Lunch Time
Lightning Talks
Big Data in Libraries
MJ Suhonos
- not actually as big in libraries: LoC: 1.9 million, Europeana: 20 million
- big data varies depending on the capacity
- we think of it as really complicated, but it is not actually that complex
- big data = cumbersome, out of our reach
- we don’t have to use the old tools, there are new ones
- have new opportunities
- cloud is not a magic bullet, just another tool – can do it in a more flexible way
- less about size and more about freedom and new opportunities because we didn’t have the tools in the past
- increasing the capacity around you
- can increase the discoverability of the long tail
- how to improve tools a little bit, all over the place, to solve problems we couldn’t before
- linked data is metadata infrastructure
- open data is policy infrastructure
- “The cloud is a lie”
Engagement and Impact of Twitter by Canadian Libraries
Angela Hamilton & Sarah Forbes
- found mostly analytic tools for marketers and for profit companies
- used a tool to pull tweets: 38k
- looking at the content, so no way to automate
- coding considerations: retweets, mentions, content type, tone, hashtags, links and media
- should have double coded to be more accurate
- should have gotten more help
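The point about double coding can be made concrete with an agreement measure. Below is a minimal sketch of Cohen's kappa, a standard statistic for two coders labelling the same items. It is not something the speakers said they used, and the tone labels in the example are invented.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who labelled the same items in the same order."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(coder_a)
    # Share of items where the two coders chose the same label.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's label frequencies.
    counts_a = Counter(coder_a)
    counts_b = Counter(coder_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical label throughout
    return (observed - expected) / (1 - expected)


# Invented tone codes for eight tweets, one list per coder.
tone_a = ["pos", "pos", "neu", "neg", "neu", "pos", "neu", "neg"]
tone_b = ["pos", "neu", "neu", "neg", "neu", "pos", "pos", "neg"]
print(round(cohens_kappa(tone_a, tone_b), 3))  # 0.619
```

A value of 1 means perfect agreement and 0 means agreement no better than chance, so a low score on a subjective field like tone would flag codes that need clearer definitions.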
On Dentographs
William Denton
- DDC in checkerboard to visualize depth and breadth of the collection
- particularly good for collection comparisons
- LC in mountain version
- could do animation of how it changes year to year or day to day
- if using internal data, could do it based on circulation or holds
- by doing visualization, know what to do next time
- going from one medium to another, can extend
- Presentation Write up
- Code4Lib Journal Article
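The checkerboard idea can be roughed out in a few lines. This sketch assumes a 10 by 10 grid (rows are the DDC main class, columns are the division digit) and a text rendering. The layout, shading thresholds, and sample call numbers are assumptions for illustration and may differ from the dentographs shown in the talk.

```python
def ddc_counts(call_numbers):
    """Count items per DDC main class (row) and division digit (column)."""
    grid = [[0] * 10 for _ in range(10)]
    for number in call_numbers:
        digits = number.strip()[:3]
        # Skip anything that does not start with a three-digit class number.
        if len(digits) < 3 or not (digits.isascii() and digits.isdigit()):
            continue
        grid[int(digits[0])][int(digits[1])] += 1
    return grid


def render(grid):
    """Draw the grid as text: '.' empty, 'o' one or two items, '#' three or more."""
    def shade(count):
        if count == 0:
            return "."
        if count < 3:
            return "o"
        return "#"
    return "\n".join("".join(shade(c) for c in row) for row in grid)


# Invented call numbers standing in for a collection export.
sample = ["025.04", "025.1", "020", "510.2", "813.54", "813.6", "819"]
grid = ddc_counts(sample)
print(render(grid))
```

Feeding in circulation or holds records instead of holdings would give the usage-based version mentioned above, and rendering two collections side by side gives the comparison.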
Open Data Policy in Canada
Tracey Lauriault
- examples: Open North, Hacking Health, Treaty Process, Residential School Map, municipal quality of life, AAAS remote sensing
- where to get data
- Federal Data: opendata.gc.ca, Research Data Canada, data.gc.ca, Canadian International Development Agency
- Provinces: Ontario, Alberta, Quebec, BC, Saskatchewan
- Cities: 36 cities right now
- Community Data Portal
- data advocacy
- Community Data Canada
- Canadian Council on Social Development
- Data Liberation Initiative
- data policy
- not a lot of funding that requires data management
- Canadian Institutes of Health Research – encourages it, but it is not policy
- Open Government Resolution by Office of the Information Commissioner of Canada – unfortunately doesn’t have a lot of power
- GeoConnections policy primers and guidelines
- Open Government Partnership – but only federal
- CIPPIC – does everything in Canada
- should make public anything that is publicly funded
Privacy by Design: Big Privacy for Big Data
Michelle Chibba
- Ontario’s Information and Privacy Commissioner
- philosophy: consultation, cooperation, and collaboration
- confidentiality not the same as privacy
- privacy is all about the individual, and individual rights
- if unique, persistent, and linked to individual then it’s personally identifiable information (PII)
- people must be able to trust that organizations will manage their information properly
- forget the content, the metadata is what identifies the person
- most importantly, the information is persistent
- if any digitized information can be intercepted and recreated into understandable information, then it is a record
- actually fairly easy to deanonymize information based on a few data points
- good data security is not the same thing as privacy
- most privacy breaches remain unknown
- need privacy by design
- proactive
- default
- embedded
- full functionality
- security
- visible and transparent
- can de-identify data using proper techniques based on the objectives/needs (see “Dispelling the Myths Surrounding De-identification”)
- data comanagement: accountability, minimization (collect little, use central registry), security, access
- UI design concepts tied to transparency and trust focused on context, awareness, discoverability, comprehension
- big data touching on privacy, example: casinos connecting structured and unstructured data to spot a card counter, a relative of an employee, or other relationships with the casino
- need to be able to use both unstructured and structured data to connect the dots
- features for next-generation sensemaking systems: full attribution, data tethering, analytics on anonymized data, tamper-resistant audit logs, false negative favouring methods, self-correcting false positives, information transfer accounting
- your identity is your most valuable possession
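The note above that a few data points are enough to deanonymize can be illustrated by counting how many records share each combination of seemingly harmless fields. This is a minimal sketch of a k-anonymity style check, which is one common de-identification measure and not necessarily the technique described in the talk. The records and field names are invented.

```python
from collections import Counter

# Invented records with names already removed; not real data.
records = [
    {"postal_prefix": "M5V", "birth_year": 1980, "gender": "F"},
    {"postal_prefix": "M5V", "birth_year": 1980, "gender": "F"},
    {"postal_prefix": "M5V", "birth_year": 1975, "gender": "M"},
    {"postal_prefix": "M4K", "birth_year": 1980, "gender": "F"},
    {"postal_prefix": "M4K", "birth_year": 1980, "gender": "F"},
    {"postal_prefix": "M4K", "birth_year": 1962, "gender": "M"},
]


def group_sizes(rows, quasi_identifiers):
    """Count how many rows share each combination of quasi-identifier values."""
    return Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)


def k_anonymity(rows, quasi_identifiers):
    """Size of the smallest group; k == 1 means at least one row is unique."""
    if not rows:
        raise ValueError("no rows to check")
    return min(group_sizes(rows, quasi_identifiers).values())


def unique_rows(rows, quasi_identifiers):
    """Rows that nobody else matches on the chosen fields."""
    sizes = group_sizes(rows, quasi_identifiers)
    return [r for r in rows if sizes[tuple(r[q] for q in quasi_identifiers)] == 1]


fields = ["postal_prefix", "birth_year", "gender"]
print(k_anonymity(records, fields))           # 1
print(len(unique_rows(records, fields)))      # 2
print(k_anonymity(records, ["postal_prefix"]))  # 3
```

With only the postal prefix, every record hides in a group of three. Adding birth year and gender leaves two people unique, which is the effect the talk warns about and the reason de-identification has to be matched to the objectives and needs.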