ScienceDaily (July 16, 2012) — Interpreting the results of statistical surveys, e.g., Transparency International's corruption indices, is not always a simple matter. As Dr. Heiko Paulheim of the Knowledge Engineering Group at the TU Darmstadt's Computer Science Department put it, "Although methods that unearth explanations for statistics are available, they are confined to utilizing data contained in the statistics themselves; background information is not taken into account. That is what led us to the idea of applying data-mining methods that we had been studying here to the semantic web, in order to obtain further background information that allows us to learn more from statistics."
The "Explain-a-LOD" tool that Paulheim developed accesses linked open data (LOD), i.e., enormous compilations of publicly available, semantically linked data accessible on the Internet, and, from that data, automatically formulates hypotheses regarding the interpretation of arbitrary types of statistics. To start off, the statistics to be interpreted are read into Explain-a-LOD. Explain-a-LOD then automatically searches the pools of linked open data for associated records and adds them to the initial set. Paulheim explained: "If, for example, the country 'Germany' is listed in the corruption-index data, LOD records that contain information on Germany will be identified, and further attributes, such as its population, its membership in the EU and OECD, or the total number of companies domiciled there, will be generated. Attributes that are unlikely to yield useful hypotheses will be automatically deleted in order to reduce the volume of such enriched statistics."
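The workflow described above can be sketched in a few lines of code. The following is a hypothetical illustration, not Explain-a-LOD's actual implementation: the toy index values, the stand-in "LOD" attribute table, the zero-variance pruning rule, and the correlation-based hypothesis ranking are all assumptions made for the sake of the example.

```python
# Illustrative sketch of the Explain-a-LOD idea: (1) read a statistic,
# (2) enrich each entity with background attributes, (3) prune attributes
# unlikely to yield hypotheses, (4) rank the rest as candidate explanations.
# All data and attribute names below are invented for demonstration.

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5

# Step 1: the statistic to interpret (toy corruption-index scores).
index = {"A": 80, "B": 60, "C": 40, "D": 20}

# Step 2: stand-in for attributes retrieved from linked open data.
lod = {
    "A": {"gdp_per_capita": 50000, "eu_member": 1, "on_earth": 1},
    "B": {"gdp_per_capita": 35000, "eu_member": 1, "on_earth": 1},
    "C": {"gdp_per_capita": 20000, "eu_member": 0, "on_earth": 1},
    "D": {"gdp_per_capita": 8000,  "eu_member": 0, "on_earth": 1},
}

# Step 3: prune attributes that cannot help (here: zero variance).
countries = list(index)
all_attrs = sorted({a for feats in lod.values() for a in feats})
kept = [a for a in all_attrs
        if len({lod[c][a] for c in countries}) > 1]

# Step 4: rank surviving attributes by |correlation| with the statistic.
hypotheses = sorted(
    ((a, pearson([lod[c][a] for c in countries],
                 [index[c] for c in countries])) for a in kept),
    key=lambda t: abs(t[1]), reverse=True)

for attr, r in hypotheses:
    print(f"{attr}: r = {r:.3f}")
```

In this toy run the constant attribute is discarded in step 3, and the remaining attributes are ranked as hypotheses by how strongly they correlate with the index; the real tool draws its attributes from live LOD sources and uses more elaborate filtering and hypothesis generation.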
Further information: http://www.ke.tu-darmstadt.de/resources/explain-a-lod