Tuesday, March 27, 2012

National Statistical Offices: Independent, Identical, Simultaneous Actions Thousands of Miles Apart

Several weeks ago, at the initiative of Brian Pink, the Australian statistician, leaders of the government statistical agencies from Australia, Canada, New Zealand, the United Kingdom, and the United States held a summit meeting to identify common challenges and share information about current initiatives.  While there had been casual sharing of partial information in previous years among these leaders, this event was unprecedented.
The five countries share languages and some cultural features; they vary in size and in the organization of their statistical systems.  They also vary in the current health of their national economies, their regional economic foci, and key social and political issues.   None of them have population registers with mandatory updating features.  The legal frameworks of the countries’ statistical systems give different powers to the chief statistician.
While meetings of this character happen periodically in many sectors, the findings of the meeting were notable on one dimension – the five countries’ statisticians report that the strategic activities now being mounted are very nearly identical.  They perceive the same likely future challenges for central government statistical agencies, and they are making similar organizational changes to prepare for the future.  While they vary in specific current innovations, the components of the full future vision are remarkably similar.
Ingredients of the future vision:
  1. The volume of data generated outside the government statistical systems is increasing much faster than the volume of data collected by the statistical systems; almost all of these data are digitized in electronic files.
  2. As this occurs, the leaders expect that the relative cost, timeliness, and effectiveness of the agencies' traditional survey and census approaches may become less attractive.
  3. Blending together multiple available data sources (administrative and other records) with traditional surveys and censuses (using paper, internet, telephone, face-to-face interviewing) to create high quality, timely statistics that tell a coherent story of economic, social and environmental progress must become a major focus of central government statistical agencies.
  4. This requires efficient record linkage capabilities, the building of master universe frames that act as core infrastructure to the blending of data sources, and the use of modern statistical modeling to combine data sources with highest accuracy.
  5. Agencies will need to develop the analytical and communication capabilities to distill insights from more integrated views of the world and impart a stronger systems view across government and private sector information.
  6. There are growing demands from researchers and policy-related organizations to analyze the micro-data collected by the agencies, to extract more information from the data.
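
As a rough illustration of item 4, the sketch below links an administrative extract to a master frame by exact match on a normalized key. It is a minimal sketch, not a description of any agency's method: the field names (name, postcode, unit_id, turnover), the sample records, and the choice of deterministic matching are all assumptions made for the example. Production record linkage typically adds probabilistic scoring and clerical review.

```python
import re
import unicodedata


def normalize(text):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^a-z0-9 ]", " ", no_accents.lower())
    return " ".join(cleaned.split())


def link_key(record):
    """Build a matching key from name and postcode (hypothetical fields)."""
    return (normalize(record["name"]), normalize(record["postcode"]).replace(" ", ""))


def link_to_frame(frame, source):
    """Attach each source record to the frame unit sharing its key, if any."""
    index = {link_key(unit): unit["unit_id"] for unit in frame}
    linked, unlinked = [], []
    for record in source:
        unit_id = index.get(link_key(record))
        if unit_id is None:
            unlinked.append(record)  # left for fuzzy matching or manual review
        else:
            linked.append({**record, "unit_id": unit_id})
    return linked, unlinked


# Hypothetical master frame and administrative extract.
frame = [
    {"unit_id": 1, "name": "Café Åkerblom Ltd.", "postcode": "AB1 2CD"},
    {"unit_id": 2, "name": "Northern Rail Services", "postcode": "EF3 4GH"},
]
admin_records = [
    {"name": "CAFE AKERBLOM LTD", "postcode": "ab12cd", "turnover": 120000},
    {"name": "Unknown Trader", "postcode": "ZZ9 9ZZ", "turnover": 5000},
]

linked, unlinked = link_to_frame(frame, admin_records)
```

Here the first administrative record links to frame unit 1 despite differences in case, accents, and spacing, while the second finds no match and is set aside.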

Monday, March 26, 2012

Forbes: Improving Decision Making in the World of Big Data

Christopher Frank, Contributor
Fearlessly tackling analytic issues to improve decision making.

3/25/2012 @ 11:42AM

Improving Decision Making in the World of Big Data

‘Big Data’ is having its 15 minutes of fame. It is the hot term referring to the increasingly large datasets of information being amassed as a result of our social, mobile, and digital world. In the past 12 months, the use of the term in the U.S. has increased 1,211% on the internet.

Lack of data is not the issue. Lack of strategy is.

IBM estimates that every day 2.5 quintillion bytes of data are created – so much that 90% of the data in the world today has been created in the last two years. It is mind-boggling. The irony is we have more information available, but we feel less informed.

Read more.....

Monday, March 19, 2012

Visualize This: The FlowingData Guide to Design, Visualization, and Statistics

From: http://book.flowingdata.com/

A book by Nathan Yau, who writes for FlowingData. Visualize This is a practical guide on visualization and how to approach real-world data. The book is published by Wiley and is available on Amazon and other major online booksellers.


Chapter 1 — Telling Stories with Data
Chapter 2 — Handling Data
Chapter 3 — Choosing Tools to Visualize Data
Chapter 4 — Visualizing Patterns over Time
Chapter 5 — Visualizing Proportions
Chapter 6 — Visualizing Relationships
Chapter 7 — Spotting Differences
Chapter 8 — Visualizing Spatial Relationships
Chapter 9 — Designing with a Purpose
There are lots of books on visualization that describe best practices and design concepts, but what do you do when it comes time for you to actually make something?
If you don't know how to use the software in front of you, the abstract isn't all that useful. And with growing amounts of data, it's becoming more important to be able to make sense of and communicate with it all.
In Visualize This, Nathan Yau teaches you how to create graphics that tell stories with real data, and you'll have fun in the process. Learn to make statistical graphics in R, design in Illustrator, and create interactive graphics in JavaScript and Flash & Actionscript.
Yau draws from his experience as a graduate student in statistics and his work with major news organizations for an engaging, data-first approach. After all, visualization is about the data it's based on.
Chapters group examples and tutorials by data type and take you through the process of data exploration and analysis, to visuals, and finally, to a graphic that is fit for publication for print and online.
Read the book cover-to-cover, or keep it on your desk as a reference for your data projects. Pages are in full color with tons of graphics to inspire and to help you learn visually.

Open data: Jimmy Wales and the Man from Sweden - Web Exclusive Article - Significance Magazine

Author: Julian Champkin
Jimmy Wales at the Gottlieb Duttweiler Awards Show, 2011. Image by Thomas Entzeroth (photographer) on behalf of Gottlieb Duttweiler Institute/Wikimedia.
Last week we learned that Jimmy Wales, the founder of Wikipedia, is to act as an unpaid adviser to the UK Government on opening up data to the public. His background is in involving people in creating public content, so the appointment is to be welcomed. The more data we all can see, the more open our government, and the better-served is democracy.
But data is not always enough on its own. The government already releases great quantities of data about its spending; every item of local government spending over £500, for example, is available to the public by law; go to www.data.gov.uk to find it. Health and education data are there as well. The equivalent site for US government data is www.data.gov. The problem is, though, that what you will usually find at these sites are huge spreadsheets of numbers – pounds or dollars spent, or numbers of people in various categories – and it is not remotely obvious what the numbers mean or what story they tell.
For data to have meaning it must be interpreted, and that is the rub. BBC Radio Four’s Today programme on Tuesday this week had an item on the issue of open data which included an interview with Hans Rosling. Rosling, of course, is the Swedish statistician who made statistics a hit on prime time television with his Joy of Stats series last year. You can read a Significance interview with him here. BBC Radio Four’s interviewer asked him if there can be such a thing as too much data; you can hear his answer here.
Hans Rosling. Image by Tobias Andersson Åkerblom.
Rosling’s point was that making data openly available breaks the monopoly of the state. He checks train delays on his commute into Stockholm each morning, using an app which taps in to live data on which trains are running late. The app has improved his life; but, he says, governments would not have thought of writing the app. It is the availability of data that makes it possible, but some genius then wrote the app to use that data; in other words, interpreted it for the public. ‘It is the value added to the data which makes it useful to the public, not the data itself.’
There are of course issues of privacy. In Sweden and Norway, how much every citizen earns is public knowledge, because the government publishes everyone’s tax returns. ‘Britain is a bit old-fashioned not to do that’ he says. But an individual’s health records, he says, should not be publicly available. ‘It is no-one’s business but my own what the results of my HIV test turn out to be.’
How to decide what should be public? ‘Judge each case on its merits’ he says. Can you have too much data? ‘Can you have too many books? No. You just have to know what you have.’

Friday, March 16, 2012

Standards for statistical data dissemination: a wish list

View more PowerPoint from Xavier Badosa

The digitization of information exchange processes has led many industries to define standards for the B2B side of the value chain, covering the conversations between key partners. The agencies involved in statistical production are no exception and need to agree on standards that can be used in the exchange of data and metadata between them. However, before these standards have been fully adopted, new needs have arisen that stress the importance of machine-readable formats for the reuse of public sector information. Open data initiatives have usually found a strategic ally in the statistical offices because timeliness, punctuality and accessibility are part of the code of practice in official statistics. This has increased the need for standards not only for data exchange between organizations specializing in statistical production but also for dissemination to third parties. The presentation will try to address the requirements that the dissemination standards should meet in this new context.

Saturday, March 3, 2012

A new Info Space on Statistical Data and Metadata eXchange (SDMX)

To help you get familiar with the SDMX international standard for data and metadata exchange, Eurostat has opened a new SDMX Info Space offering quick reference guides, news, updates on implementation activities, an inventory of available software tools and links to training tutorials.
Through the SDMX Info Space you can play or download the new set of self-learning videos and student books which explain SDMX from A to Z. We suggest starting with Welcome to SDMX (why and how SDMX can help you). After watching this video, you will know why and how SDMX is useful in your work with statistical data and metadata.
The Info Space is available directly from the Eurostat website.