Wednesday, July 25, 2012

Computing for data analysis

July 24, 2012 to Statistics by Nathan Yau
If you want to learn visualization, you should learn data. To learn data, you should learn statistics. Where to begin? The free analysis courses offered on Coursera by Johns Hopkins professors are probably a good place to start. Currently available: Computing for Data Analysis with biostatistics professor Roger D. Peng, and Data Analysis with Jeff Leek, also a biostatistics professor.

Tuesday, July 17, 2012

ScienceDaily - Getting to the Bottom of Statistics: Software Utilizes Data from the Internet for Interpreting Statistics

ScienceDaily (July 16, 2012) — Interpreting the results of statistical surveys, e.g., Transparency International's corruption indices, is not always a simple matter. As Dr. Heiko Paulheim of the Knowledge Engineering Group at the TU Darmstadt's Computer Sciences Dept. put it, "Although methods that will unearth explanations for statistics are available, they are confined to utilizing data contained in the statistics involved. Further, background information will not be taken into account. That is what led us to the idea of applying data-mining methods that we had been studying here to the semantic web in order to obtain further, background information that will allow us to learn more from statistics."

The "Explain-a-LOD" tool that Paulheim developed accesses linked open data (LOD), i.e., enormous compilations of publicly available, semantically linked data accessible on the Internet, and, from that data, automatically formulates hypotheses regarding the interpretation of arbitrary types of statistics. To start off, the statistics to be interpreted are read into Explain-a-LOD. Explain-a-LOD then automatically searches the pools of linked open data for associated records and adds them to the initial set. Paulheim explained that, "If, for example, the country 'Germany' is listed in the corruption-index data, LOD records that contain information on Germany will be identified and further attributes, such as its population, its membership in the EU and OECD, or the total number of companies domiciled there, generated. Attributes that are unlikely to yield useful hypotheses will be automatically deleted in order to reduce the volumes of such enriched statistics."
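The enrich-then-prune idea described above can be sketched in a few lines of Python. This is only an illustration, not Explain-a-LOD's actual code: the `LOD` dictionary is a toy stand-in for a linked open data source, the scores are placeholders rather than real index values, and the pruning rule (drop attributes that are constant or unique per row) is an assumed heuristic, since the article does not say how the tool decides which attributes are unlikely to be useful.

```python
from typing import Dict, List

# Toy stand-in for a linked open data source (hypothetical, not a real LOD query).
LOD: Dict[str, Dict[str, object]] = {
    "Germany": {"eu_member": True, "oecd_member": True, "label": "Germany", "is_country": True},
    "France": {"eu_member": True, "oecd_member": True, "label": "France", "is_country": True},
    "Brazil": {"eu_member": False, "oecd_member": False, "label": "Brazil", "is_country": True},
}

# Statistics to be interpreted; the scores are placeholders, not real index values.
STATS: List[Dict[str, object]] = [
    {"country": "Germany", "score": 1.0},
    {"country": "France", "score": 2.0},
    {"country": "Brazil", "score": 3.0},
]


def enrich(rows, lod):
    """Add every attribute the LOD source holds for a row's country to that row."""
    return [{**row, **lod.get(row["country"], {})} for row in rows]


def prune(rows, keep=("country", "score")):
    """Drop attributes that cannot explain anything under the assumed heuristic:
    those with one value for all rows, or a different value for every row."""
    attributes = {key for row in rows for key in row} - set(keep)
    useless = set()
    for attribute in attributes:
        distinct = len({repr(row.get(attribute)) for row in rows})
        if distinct <= 1 or distinct == len(rows):
            useless.add(attribute)
    return [{k: v for k, v in row.items() if k not in useless} for row in rows]


enriched = prune(enrich(STATS, LOD))
for row in enriched:
    print(row)
```

In this toy run, `label` (unique per row) and `is_country` (constant) are removed, while the membership flags survive as candidate explanations for the scores.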

Explain-a-LOD helps to interpret statistics such as the Corruption Perceptions Index by Transparency International. (Credit: Diagram: Transparency International)

Tuesday, July 10, 2012

RSSeNews - ‘Big Data’ event now available to view

The Big Data Opportunity event, chaired last week (3 July 2012) by RSS executive director Hetan Shah and featuring cabinet minister Francis Maude, is now available to view on YouTube.
The event was hosted by think tank Policy Exchange, following the publication of its report on how the government could make better use of big data in the public sector. Francis Maude gave a speech about the government’s vision for Big Data and how it could boost efficiency and improve public service delivery.
In his speech, Maude said that an estimated £37 billion a year was lost across the public sector ‘through debt, fraud and error – in a large part because of our failure to share data effectively’. He confirmed that the government was to adopt World Wide Web inventor Tim Berners-Lee’s ‘5 Star Scheme’ for assessing the usability of data, and to appoint a privacy expert ‘to make sure we bring in the latest expertise on privacy measures’.
Maude’s speech was followed by speeches from James Petter, UK managing director of EMC – the event’s sponsor – and Chris Yiu, head of digital government at Policy Exchange. The speeches were followed by a Q&A discussion chaired by Hetan Shah.
Shah said: ‘This event shows there are real opportunities which can come from the open data agenda, but that there are a number of key issues, including privacy, data quality and the capability of civil servants, which need to be addressed.’

Thursday, July 5, 2012

Weave - Web-based Analysis and Visualization Environment

Weave (BETA 1.0) is a new web-based visualization platform designed to enable visualization of any available data by anyone for any purpose. Weave is an application development platform supporting multiple levels of user proficiency – novice to advanced – as well as the ability to integrate, disseminate and visualize data at “nested” levels of geography.

Weave has been developed at the Institute for Visualization and Perception Research of the University of Massachusetts Lowell in partnership with the Open Indicators Consortium, a fifteen-member national collaborative of public and nonprofit organizations working to improve access to more and higher-quality data.


US demographics by county
The UN Millennium Development Goals in Weave, with an automatic regression line on the scatterplot
Using WMS to overlay satellite images in Map tool
Advanced shape streaming to support parcel level data

Tuesday, July 3, 2012

International Open Government Data Conference: July 6-12 (Virtual and in Washington DC)

July 10-12, 2012
12:30 GMT / 8:30 am ET
We’re expecting over 400 people from more than 40 countries. We’ve got technologists, government officials, civil society organizations and the private sector coming together for three days, with over 100 speakers sharing their experiences, stimulating new ideas, and demonstrating the power of putting open data to work.
Conference topics include: 
- What do successful open government data initiatives look like worldwide?
- How are countries getting started with open data and what are their challenges?
- What are the tools, technologies and platforms for managing and releasing open data?
- How can you engage citizens, the public and private sectors around open government data?
- What's the future of open government data?
Presenters include:
- Caroline Anstey | Managing Director, World Bank
- Steven VanRoekel | Federal Chief Information Officer, United States
- Todd Park | Chief Technology Officer, United States
- Stela Mocan | Executive Director, e-Government Center, Moldova
- Mario Spinelli | Secretary for Corruption Prevention and Strategic Information, Brazil
- Carlos Viniegra | Chief Information Officer, Mexico
- David Eaves | Open Government Advocate, Canada
- David McClure | Office of Citizen Services and Innovative Technologies, United States
- Nathan Eagle | Chief Executive Officer, Jana
- Dr. Rufus Pollock | Co-Founder, Open Knowledge Foundation
- Alexander B. Howard | Government 2.0 Washington Correspondent, O'Reilly Media
...and many more!
All plenaries, keynotes and sessions will be webcast on this page and we’ll have a series of hosts running the liveblog below. If you're in Washington, D.C., you can still register to attend the conference.
Join the liveblog and follow #IOGDC on Twitter.
A live blog and webcast will be available at

July 6-7: Lightning talks on the best uses of open data around the world! All virtual, at the following times:
July 6 from 7:00-11:00 am (EST):
(World time clock: )

July 7 from 2:00-6:00 pm (EST):
(World time clock: )

July 9: Tutorial on Open Data from 1:00-4:30 pm (EST)
(World time clock: )
A live blog and webcast may be available at

July 10-12: International Open Government Data Conference, held at the World Bank Headquarters at 1818 H Street NW, Washington DC. The full agenda is available here:
World time clock: (each day)
A live blog and webcast will be available at

NZ.Stat beta release is now available

Today Statistics New Zealand launched, as a beta, its branded release of the OECD.Stat data warehouse product.
The release was developed over the last two years in collaboration with the OECD, ABS and Istat, in line with the rigorous needs of a statistical agency.


What is NZ.Stat?

NZ.Stat is a free web tool that will soon be replacing Table Builder. NZ.Stat will allow you to:
  • download large volumes of data (up to 100,000 cells in Excel, and 1,000,000 in CSV format)
  • search for items within dimensions
  • view metadata alongside the table (rather than in a separate window).
NZ.Stat is powered by software provided by the OECD, which is also used by other statistical agencies around the world.
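
As a rough guide to the download limits listed above, the following Python sketch checks which formats a table would fit in. It assumes the cell count is the product of the number of items selected in each dimension and that "up to" is inclusive; the function names are made up for illustration and NZ.Stat may count cells differently.

```python
import math

# Limits quoted in the NZ.Stat announcement.
EXCEL_MAX_CELLS = 100_000
CSV_MAX_CELLS = 1_000_000


def cell_count(dimension_sizes):
    """Assumed cell count: product of the items selected in each dimension."""
    return math.prod(dimension_sizes)


def download_formats(dimension_sizes):
    """Return the formats whose quoted limit the table fits within."""
    cells = cell_count(dimension_sizes)
    formats = []
    if cells <= EXCEL_MAX_CELLS:
        formats.append("Excel")
    if cells <= CSV_MAX_CELLS:
        formats.append("CSV")
    return formats


# A hypothetical selection: 16 regions x 20 years x 500 categories = 160,000 cells.
print(cell_count([16, 20, 500]), download_formats([16, 20, 500]))
```

Under these assumptions the example selection is too large for Excel but fits comfortably in CSV.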

What is the beta release?

The beta release allows you to preview NZ.Stat, but there are still some issues to resolve. We are still working on enhancing the tool. Please tell us what you think of NZ.Stat.
Table Builder remains the authoritative source of tabular data, and all new tables or updates will appear in Table Builder first.