Access to Statistics
Tuesday, September 22, 2015
What's the Difference Between Data Science and Statistics?
From: https://www.udemy.com/data-science/#article
Not long ago, the term "data science" meant nothing to most people -- even those who worked in data. A likely response to the term was: "Isn't that just statistics?"
These days, data science is hot. The job of "data scientist" was referred to by the Harvard Business Review as the "Sexiest Job of the 21st Century." Why did data science come to exist? And just what is it that distinguishes data science from statistics?
The very first line of the American Statistical Association's definition of statistics is "Statistics is the science of learning from data..." Given that the words "data" and "science" appear at the very start of this definition, one might assume that data science is simply a rebranding of statistics. A number of Twitter humorists certainly have:
"A data scientist is a statistician who lives in San Francisco"
"Data Science is statistics on a Mac."
While there's a grain of truth in these jokes, the reality is more complicated. Data science, and its differentiation from statistics, has deep roots in the history of computers.
Statistics was primarily developed to help people deal with pre-computer data problems like testing the impact of fertilizer in agriculture, or figuring out the accuracy of an estimate from a small sample. Data science emphasizes the data problems of the 21st Century, like accessing information from large databases, writing code to manipulate data, and data visualization.
Image: A computer from the 1960s.
Friday, August 22, 2014
UNECE-coordinated work relating to Big Data
From: http://www1.unece.org/stat/platform/display/bigdata/Big+Data+in+Official+Statistics;jsessionid=AE7DF06FDB27C80A30DACD65F6BDADB6
- Preliminary results of the survey "Skills necessary for people working with Big Data in Statistical Organisations". More detailed analysis will be prepared by October 2014.
- International collaboration project on The Role of Big Data in the Modernisation of Statistical Production - This project, overseen by the High-Level Group for the Modernisation of Statistical Production and Services, will run during 2014, and will:
- identify, examine and provide guidance for statistical organizations to act upon the main strategic and methodological issues that Big Data poses for the official statistics industry
- demonstrate the feasibility of efficient production of both novel products and ‘mainstream’ official statistics using Big Data sources, and the possibility to replicate these approaches across different national contexts
- facilitate the sharing across organizations of knowledge, expertise, tools and methods for the production of statistics using Big Data sources.
- Position paper 'What does Big Data mean for official statistics?' (March 2013), drafted for the High-Level Group for the Modernisation of Statistical Production and Services (HLG).
Thursday, July 3, 2014
Flowing Data: Data science, big data, and statistics – all together now
From: http://flowingdata.com/2014/07/02/data-science-big-data-and-statistics-all-together-now/
JULY 2, 2014 | STATISTICS
Terry Speed, an emeritus professor of statistics at the University of California, Berkeley, gave an excellent talk on how statisticians can play nice with big data and data science. Usually these talks go in the direction of saying data science is statistics. This one is more on the useful, non-snarky side.
Tuesday, April 15, 2014
The Guardian - Big data and open data: what's what and why does it matter?
From: http://www.theguardian.com/public-leaders-network/2014/apr/15/big-data-open-data-transform-government
Both types of data can transform the world, but when government turns big data into open data it's especially powerful
Joel Gurin, New York University
Guardian Professional, Tuesday 15 April 2014 10.49 BST
Big data and the newer phenomenon of open data are closely related, but they're not the same. Open data brings a perspective that can make big data more useful, more democratic, and less threatening.
While big data is defined by size, open data is defined by its use. Big data is the term used to describe very large, complex, rapidly-changing datasets. But those judgments are subjective and dependent on technology: today's big data may not seem so big in a few years when data analysis and computing technology improve.
Open data is accessible public data that people, companies, and organisations can use to launch new ventures, analyse patterns and trends, make data-driven decisions, and solve complex problems. All definitions of open data include two basic features: the data must be publicly available for anyone to use, and it must be licensed in a way that allows for its reuse. Open data should also be relatively easy to use, although there are gradations of "openness". And there's general agreement that open data should be available free of charge or at minimal cost.
The relationship between big data and open data
Source: Joel Gurin
This Venn diagram maps the relationship between big data and open data, and how they relate to the broad concept of open government. More....
Friday, March 21, 2014
Report of MPs’ inquiry into UK statistics and open data published - StatsLife
Access to public sector data must never be sold or given away, and should be made open by default, according to a report on Statistics and Open Data published on 17 March 2014 by the Public Administration Select Committee (PASC).
One of the report's key recommendations is that data should be made 'open' by default, i.e. accessible to all, free of restrictions on use or redistribution, and in a digital, machine-readable format. 'There should be a presumption that restrictions on government data releases should be abolished,' the report notes. 'It may be necessary to exempt certain data sets from this presumption, but this should be on a case-by-case basis.' The report also said that charging for some data may occasionally be appropriate, 'but this should become the exception rather than the rule.'
Saturday, March 15, 2014
Why the wealthiest countries are also the most open with their data - Washington Post
From: http://www.washingtonpost.com/blogs/wonkblog/wp/2014/03/14/why-the-wealthiest-countries-are-also-the-most-open-with-their-data/?tid=hpModule_79c38dfc-8691-11e2-9d71-f0feafdd1394
The Oxford Internet Institute this week posted a nice visualization of the state of open data in 70 countries around the world, reflecting the willingness of national governments to release everything from transportation timetables to election results to machine-readable national maps. The tool is based on the Open Knowledge Foundation's Open Data Index, an admittedly incomplete but telling assessment of who willingly publishes updated, accurate national information on, say, pollutants (Sweden) and who does not (ahem, South Africa).
Tally up the open data scores for these 70 countries, and the picture looks like this, per the Oxford Internet Institute's interactive visualization:
That's Great Britain in the lead at left, followed by the U.S., Denmark, Norway, the Netherlands and Australia. Each segment in the above chart corresponds to a country's score on one of the component metrics (election results, government budget, etc.). The orange outlier in that left group is Israel. Meanwhile, Kenya, Yemen and Bahrain are among the countries at the far right. More.....
Monday, March 10, 2014
VB News - Statwing picks up funding from data science luminary Hammerbacher
From: http://venturebeat.com/2014/01/30/statwing-picks-up-funding-from-data-science-luminary-hammerbacher/
Above: A correlation as shown in Statwing's software.
Image Credit: Statwing
January 30, 2014 3:01 PM
Jordan Novet
Big data projects are trendy, but they can be hard to pull off.
Venture capitalists understand the problem. They've been betting on startups like DataHero and Chartio that aim to make analysis and visualization of data fast and simple. Now another startup, Statwing, has revealed new backing, and it comes from a leading figure in the big data world: Jeff Hammerbacher, a cofounder of the fast-growing big data company Cloudera.
Statwing uses a clean point-and-click interface, as opposed to a clunky and overly complicated tool like Microsoft Excel. Users can drop in data from a spreadsheet and get clear, plain-language statements about what they're looking at, alongside visualizations and high-level statistics.
Sunday, March 2, 2014
BusinessNewsDaily Reference: What is Statistical Analysis?
From: http://www.businessnewsdaily.com/6000-statistical-analysis.html
By Chad Brooks, BusinessNewsDaily Contributor | February 28, 2014 12:07am ET
In an effort to organize their data and predict future trends based on the information, many businesses rely on statistical analysis.
While organizations have lots of options on what to do with their big data, statistical analysis is a way for it to be examined as a whole, as well as broken down into individual samples.
The online technology firm TechTarget.com describes statistical analysis as an aspect of business intelligence that involves the collection and scrutiny of business data and the reporting of trends.
Thursday, January 9, 2014
Statistics eXplorer accessible for education and research
From: http://ncva.itn.liu.se/explorer?l=en
Statistics eXplorer integrates many common InfoVis and GeoVis methods required to make sense of statistical data, uncover patterns of interest, gain insight, tell a story and, finally, communicate knowledge. Statistics eXplorer was developed on a component architecture and includes a wide range of visualization techniques, enhanced with various interaction techniques and interactive features, to support better data exploration and analysis. It also supports multiple linked views and a snapshot mechanism for capturing discoveries made during the exploratory data analysis process, which can be used to share the knowledge gained.
More.....
Tuesday, January 7, 2014
Text Mining: The Next Data Frontier - Scientific Computing
From: http://www.scientificcomputing.com/articles/2014/01/text-mining-next-data-frontier#.UswIHNLuLTo
Mon, 01/06/2014 - 2:04pm
Mark Anawis
By some estimates, 80 percent of available information occurs as free-form text
Figure 1: Text Mining and Related Fields
Josiah Stamp said: “The individual source of the statistics may easily be the weakest link.” Nowhere is this more true than in the new field of text mining, given the wide variety of textual information. By some estimates, 80 percent of available information occurs as free-form text, which, before the development of text mining, had to be read in its entirety for information to be extracted from it. Text mining has been applied to spam filtering, fraud detection, sentiment analysis, trend identification and authorship attribution.
Text mining can be defined as the analysis of semi-structured or unstructured text data. The goal is to turn text information into numbers so that data mining algorithms can be applied. It arose from the related fields of data mining, artificial intelligence, statistics, databases, library science, and linguistics (Figure 1).
Saturday, December 21, 2013
The Best Data Visualizations of 2013 - Gizmodo
From: http://gizmodo.com/the-best-data-visualizations-of-2013-1485611407
Visualization continues to mature and to focus more on the data than on novel designs and sheer size. People improved on existing forms and got better at analysis. Readers seemed more ready and eager to explore more data at a time. Fewer spam graphics landed in my inbox. So, all in all, 2013 was a pretty good year for data and visualization. Let's have a look back.
Tuesday, December 10, 2013
The use of registers in the context of EU–SILC - Eurostat
From: http://epp.eurostat.ec.europa.eu/cache/ITY_OFFPUB/KS-TC-13-004/EN/KS-TC-13-004-EN.PDF
The European Union Statistics on Income and Living Conditions (EU-SILC) instrument is the main data source on income, poverty, social exclusion and living conditions in Europe. It provides the data for the calculation of the Europe 2020 social inclusion target and further EU flagship indicators in the social field. In the current financial and economic crisis, the pressure for timelier and more comprehensive data on poverty and social exclusion has become very acute. In view of the flexibility of the EU-SILC instrument, which allows countries to combine survey and administrative data source(s), and given the advantages of administrative data in terms of burden, cost and survey error reduction, a broader use of registers, and in particular register income data, for EU-SILC is envisaged among Member States.
Friday, December 6, 2013
China turns to big data to gauge inflation - China Daily
2013-12-06 13:42
Through teaming up with high-tech companies, China's National Bureau of Statistics will start using big data technology to improve the collection, processing and production of the country's consumer price index, a key gauge of inflation.
Xian Zude, chief statistician with the NBS, said in an interview with xinhuanet.com on Wednesday that his bureau will use big data to achieve a "breakthrough" in the census of the CPI.
He added that the bureau will include data from Chinese e-commerce companies in official statistics, in an effort to bring the CPI census to the next level and to ease the time-consuming work of those conducting the censuses and surveys.
Monday, December 2, 2013
Solving Big Data’s big skills shortage - The Conversation
From: http://theconversation.com/solving-big-datas-big-skills-shortage-20352
The skills required to tap Big Data include statistics, mathematics, computer science and engineering. (Image: Shutterstock.com)
According to analyst firm vpnMentor, Big Data is at the portion of the hype cycle called the “peak of inflated expectations”.
The business world is awash with all sorts of claims about the magic of Big Data and how it will transform industries by increasing productivity and profits and opening up opportunities that nobody even knew existed.
But this will only happen if companies are able to hire enough people who actually understand what Big Data is, how to collect it, and how to preserve it. Computing and analytical skills are also required to get Big Data to reveal its hidden secrets and to visualise it in novel ways. And there, unfortunately, is the rub: there are simply not enough data scientists, people with the required skills, to satisfy this demand.
The shortfall in Big Data experts is set to rise: in the UK alone, one digital industries employer body has predicted a need for 69,000 of these experts over the next five years. This claim is not original. Back in 2011, McKinsey & Co was claiming a US shortfall of 140,000 to 190,000 Big Data experts by 2018.
The shortfall in Big Data experts is being manifested in a number of ways. The first and most obvious is through recruiters casting an ever-widening net in their search for appropriate talent.
There is some agreement that Big Data analysis and data visualisation require skills in computing as well as statistics and mathematics. This has meant that university graduates in statistics, computer science and engineering have been the main source of potential employees.
Friday, November 1, 2013
Charts that changed the world—way before big data
By: Eric Rosenbaum | CNBC.com
Source: Tableau Software
Some years ago I was sent by an employer to a one-day course taught by Yale professor emeritus Edward Tufte on presenting data and information. This was a decade before big data was a buzz term and companies in the business of data visualization, like Tableau Software, were going public to a very enthusiastic investor market.
Data visualization is a growing field in which massive amounts of data, measured in quantities reaching exabytes, are crunched by an ever-increasing number of Silicon Valley servers and ultimately presented in visual displays. How the intersection of data, analytics and business evolves is an open question. But according to Tableau CEO Christian Chabot, in a presentation to data geeks earlier this year, some of the world's greatest thinkers gained tremendous insight and changed the world simply by organizing and deciphering basic data sets in new ways.
UK Government publishes data capability strategy
From: StatsLife
The government, in partnership with the recently formed Information Economy Council, has just published its Data Capability Strategy, which outlines how the opportunities in open/big data might be utilised for the benefit of all.
Titled 'Seizing the Data Opportunity', the strategy begins with the premise that the increasing significance of data is 'one of the greatest opportunities and challenges facing policymakers today.' It cites figures from the Centre for Economics and Business Research estimating that the big data marketplace could create 58,000 new jobs in the UK between 2012 and 2017.
Read more...
Wednesday, October 30, 2013
Simply Statistics Unconference on the Future of Statistics
From: http://www.youtube.com/watch?v=Y4UJjzuYjfM&feature=share
Twitter flow: https://twitter.com/search?q=%23futureofstats&src=typd
Tuesday, September 10, 2013
Essential Collection of Visualisation Resources
From: http://www.visualisingdata.com/index.php/resources/
Here is a collection of some of the most important, effective, useful and practical data visualisation tools. The content covers the many different resources used to create and publish visualisations, tools for working with colour, packages for handling data, places to obtain data, the most influential books, and educational programmes and qualifications in visualisation itself.
Data and visualisation tools
More ......