Monday, 30 September 2013

Renewable electricity sites mapped

Interactive maps of UK renewable electricity generating sites are published by the Department of Energy and Climate Change and RenewableUK. I've had a go at creating a slightly different style of interactive map with the DECC data, with circles sized in proportion to electricity generating capacity.

You can view one type of generation at a time by hovering over the legend. The north-south differences (for example in wind, hydro and solar generation) are striking but not completely surprising given the differences in climate and landscape.

The map uses D3 and Leaflet, and the code is based on this page by Mike Bostock.
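
The post doesn't say whether capacity maps to circle radius or to circle area. As a minimal sketch in Python (not the D3 code behind the map), assuming area is what should scale with capacity, the radius then grows with the square root of capacity. The site names, capacities and the 30-pixel maximum radius are made up for illustration.

```python
import math


def circle_radius(capacity_mw, max_capacity_mw, max_radius_px=30.0):
    """Return a radius in pixels so that circle AREA is proportional to capacity."""
    if capacity_mw < 0 or max_capacity_mw <= 0:
        raise ValueError("capacity must be non-negative and the maximum positive")
    # Area scales with radius squared, so take the square root of the ratio.
    return max_radius_px * math.sqrt(capacity_mw / max_capacity_mw)


# Hypothetical capacities in MW (not DECC figures).
sites = {"Site A": 400.0, "Site B": 100.0, "Site C": 25.0}
largest = max(sites.values())
radii = {name: circle_radius(mw, largest) for name, mw in sites.items()}

for name, radius in radii.items():
    print(f"{name}: {radius:.1f} px")
```

With this scaling, a site with a quarter of the largest capacity gets half the radius, so its circle covers a quarter of the area.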

Thursday, 26 September 2013

A big list of Office for National Statistics publications

I've put together a categorised list of over 100 regularly-published Office for National Statistics releases, covering topics including the economy, demography, and health. It's hopefully helpful as a quick overview of the breadth of information that's published.

Monday, 23 September 2013

Sankey Diagrams

Sankey diagrams are useful for showing flows, such as energy flows or movements of people. My favourite example is the Energy Flow Chart, produced annually by the UK's Department of Energy and Climate Change.
Versions of the chart going back to 1974 are available from the National Archives. It's interesting to see the shift from coal to gas and the reduction in energy consumption by industry over this period. Note that different units are used in the 1974 and 2012 charts; 1 toe equals approximately 397 therms (DUKES 2013 page 229). 
DUKES Annex H includes detailed flow charts for individual fuels.
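
The unit conversion mentioned above can be checked with a few lines of Python. This is a sketch that assumes the standard definitions of 41.868 GJ per tonne of oil equivalent and roughly 105.5 MJ per therm; the function names are illustrative only.

```python
TOE_IN_GJ = 41.868          # 1 tonne of oil equivalent in gigajoules
THERM_IN_GJ = 0.105505585   # 1 therm in gigajoules (about 105.5 MJ)


def toe_to_therms(toe):
    """Convert tonnes of oil equivalent to therms."""
    return toe * TOE_IN_GJ / THERM_IN_GJ


def therms_to_toe(therms):
    """Convert therms to tonnes of oil equivalent."""
    return therms * THERM_IN_GJ / TOE_IN_GJ


print(round(toe_to_therms(1)))  # rounds to 397, matching the figure quoted above
```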

There's an entire blog on Sankey diagrams, which is useful for inspiration and for advice on design. The site includes a list of software for creating the diagrams.

The D3 Sankey plugin is fairly easy to use if you know some JavaScript; see Mike Bostock's example. The 2012 UK flow chart (above) was created in Adobe InDesign.
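
Most of the work in a Sankey diagram is getting the flows into a nodes-and-links structure. The Python sketch below builds the kind of JSON used in Mike Bostock's example, assuming the input is a list of named nodes plus links that refer to nodes by index. The flow names and values are invented for illustration and are not taken from the DECC chart.

```python
import json


def flows_to_sankey(flows):
    """Convert (source, target, value) tuples into a nodes/links dictionary."""
    names = []   # node names in order of first appearance
    index = {}   # node name -> position in `names`
    links = []
    for source, target, value in flows:
        for name in (source, target):
            if name not in index:
                index[name] = len(names)
                names.append(name)
        links.append({"source": index[source], "target": index[target], "value": value})
    return {"nodes": [{"name": name} for name in names], "links": links}


# Hypothetical flows in arbitrary units.
flows = [
    ("Gas", "Power stations", 20.0),
    ("Coal", "Power stations", 30.0),
    ("Power stations", "Electricity", 18.0),
    ("Power stations", "Conversion losses", 32.0),
]
graph = flows_to_sankey(flows)
print(json.dumps(graph, indent=2))
```

In this made-up example, the flows into "Power stations" sum to the flows out of it, which is worth checking before drawing a diagram.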

Friday, 20 September 2013

A map of the Welsh Index of Multiple Deprivation

I published a map of deprivation in Wales yesterday, using data from the Welsh Index of Multiple Deprivation 2011.
In my previous posts on deprivation maps, I have focused on the design of the maps rather than the important information that they contain. This is simply because I feel more qualified to comment on the presentation than on the data itself.

Alasdair Rae's site on the Scottish index has an explanation page which is useful for gaining a better understanding of deprivation maps in general. The Welsh Government's WIMD 2011 page includes a guidance document which explains how the deprivation indices are calculated.

The Joseph Rowntree Foundation yesterday published a report on poverty and social exclusion in Wales.

Wednesday, 18 September 2013

World Development Indicators

World Development Indicators is a database compiled by the World Bank, containing 1289 indicators (by my count) in the categories of world view, people, environment, economy, states and markets, and global links. The database covers the world, regions, and countries, contains annual data, and is updated in April, July, September and December each year. Edit: There are also some updates between these dates; see here for details.

Data visualisation tools are available for a selection of the indicators, including Google Public Data Explorer. There are also nice mobile apps for viewing the indicators.

There are a few options for downloading data:
  • The Databank is an online tool for selecting, viewing and downloading data.
  • The WDI package for R can be used to search for and download data directly to R. See the README page for a quick tutorial.
  • There is also a zip file containing all of the data and metadata in csv format (39 MB download). This could be useful if you plan to use the WDI often.
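
If you go down the zip file route, the data needs reshaping before it is easy to plot. The Python sketch below assumes the CSV is in wide format, with four identifying columns followed by one column per year. That layout, the sample country names and the values are assumptions for illustration, so check them against the file you actually download.

```python
import csv
import io

ID_COLUMNS = ("Country Name", "Country Code", "Indicator Name", "Indicator Code")

# Made-up sample in the assumed wide layout.
SAMPLE = "\n".join([
    "Country Name,Country Code,Indicator Name,Indicator Code,2010,2011,2012",
    'Exampleland,EXL,"Population, total",SP.POP.TOTL,100,110,',
    'Otherland,OTH,"Population, total",SP.POP.TOTL,,50,55',
])


def wide_to_long(csv_text):
    """Turn one-column-per-year rows into one record per country, indicator and year."""
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        for column, cell in record.items():
            # Skip identifier columns and anything that is not a year.
            if not column or column in ID_COLUMNS or not column.strip().isdigit():
                continue
            # Empty cells are treated as missing values.
            if cell is None or cell.strip() == "":
                continue
            rows.append({
                "country": record["Country Code"],
                "indicator": record["Indicator Code"],
                "year": int(column),
                "value": float(cell),
            })
    return rows


long_rows = wide_to_long(SAMPLE)
for row in long_rows:
    print(row)
```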

Tuesday, 17 September 2013

A map of the Scottish Index of Multiple Deprivation

The screenshot below shows my interactive map of SIMD 2012, which is based on the design I used previously for a map of London.

I am aware of three existing interactive maps that use this data set:
  • The Scottish Government's map has many features, including the ability to show changes over time. The map's downside is that it uses only a relatively small portion of the browser window.
  • Alasdair Rae's site was obviously a major source of inspiration for mine. The map's tooltips show change in rank over time, and the site also has some advice on interpreting the maps.
  • Oliver O'Brien's map uses an unusual and effective technique of shading only buildings.
I've also created a visualisation of the SIMD ranks of data zones in each local authority area, based on the barcode charts in the Scottish Government's SIMD publication.

Monday, 16 September 2013

Eurostat data in R

Eurostat, the statistical office of the European Union, collects and publishes national and regional data for Europe. The site's features include Regional Statistics Illustrated, a data tool with interactive maps and charts, and Statistics Explained articles.

It's possible to download data from the Eurostat database to carry out your own analysis and visualisation. There are two possible approaches to this:
  1. The browse/search page. Within this, the Database sections are customisable, while the Tables are ready-made. The advantage of downloading data by this route is that very little processing is required to get the data in the correct shape. The disadvantage is that you need to use a web browser to get data rather than downloading straight to software such as R. But the process can be made very quick by using bookmarks.
  2. From the bulk data downloads section, you can download full datasets for use in statistical software.
I'll begin by discussing option 1. When you click on a dataset from the browse/search page, a new window opens (see screenshot). In this window, you can choose the variables, countries etc that you are interested in.

To find out more about the dataset, click "Explanatory texts (metadata)".

It's possible to bookmark your data selection for future use.

To download the data for use in a stats package, click the Download button and choose to download in csv format.

This is an example script for plotting data downloaded from Eurostat. Here is the result:

For downloading from the bulk facility, this is a useful blog post by Johannes Kutsam. Here is an example script I wrote using Johannes's function. The result is very similar to the chart above, although for some reason estimated values seem to be missing from the downloaded data.
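
One possible explanation for the missing estimates, not confirmed for this script, is that values in the bulk files can carry a trailing flag letter (for example "e" for estimated), and a plain numeric conversion turns such cells into missing values. The Python sketch below assumes cells look like "9.7 e" for a flagged value and ":" for not available. The function name and sample values are made up for illustration.

```python
import re

# A cell is either ":" (not available) or a number, optionally followed by flag letters.
CELL = re.compile(r"^\s*(:|-?\d+(?:\.\d+)?)\s*([a-z]*)\s*$")


def parse_cell(cell):
    """Split a raw cell into (value, flags), keeping flagged values instead of dropping them."""
    match = CELL.match(cell)
    if match is None:
        raise ValueError(f"unrecognised cell: {cell!r}")
    number, flags = match.groups()
    value = None if number == ":" else float(number)
    return value, flags


# Hypothetical series, keyed by year.
raw_series = {"2010": "9.6 ", "2011": "9.7 e", "2012": ": "}
parsed = {year: parse_cell(cell) for year, cell in raw_series.items()}

for year, (value, flags) in parsed.items():
    print(year, value, flags or "-")
```

Keeping the flag alongside the value also makes it possible to style estimated points differently in a chart.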

Wednesday, 11 September 2013

Using the Nomis API with R for regional labour market data

Nomis, a site run by the University of Durham on behalf of the Office for National Statistics, is a fantastic source for labour market statistics, including local data.

The site's features include regional profiles and web-based tools for querying the Nomis database. In this post, I'll briefly discuss the Nomis API, which is useful for downloading the latest data to stats software, web apps etc. Spencer Hedger, who works on the Nomis team, has written several helpful blog posts on using the API. Also see the API reference pages which are linked to from your Nomis account page.
Chart created using R and Nomis API

This is a quick example of downloading data to R using an API link. The API link was generated using the process described in Spencer Hedger's blog post; see the link above. The resulting chart is shown to the right.

The API has several format options in addition to CSV, such as Google Visualisation JSON and KML.
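
An API link is just a URL with the dataset and the query encoded in it, so it can be built in code as well as copied from the site. The Python sketch below assumes the CSV link has the shape shown in `BASE` plus a dataset id and a query string. The dataset id and geography code are placeholders, and the exact parameters for a given dataset should be taken from the API reference pages.

```python
from urllib.parse import urlencode

# Assumed base address for dataset downloads; check it against the API reference.
BASE = "https://www.nomisweb.co.uk/api/v01/dataset"


def nomis_csv_url(dataset_id, **query):
    """Build a CSV download link from a dataset id and query parameters."""
    return f"{BASE}/{dataset_id}.data.csv?{urlencode(query)}"


# Placeholder values; replace them with a real dataset id and geography code.
url = nomis_csv_url("DATASET_ID", geography="GEOGRAPHY_CODE")
print(url)
```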

Wednesday, 4 September 2013

ONS time series data in R

The datasets on the Office for National Statistics website contain thousands of economic time series. It's straightforward to select series by hand and download a CSV file of the data. Better still, because these CSV downloads have a consistent naming pattern, you can download the variables of your choice into a program like R if you know the variable and dataset codes you're looking for.

I've created this R script as an example of downloading data directly to R. It only takes a few lines of code to download the data to R and remove the metadata at the end of the file. The script also adds a bit of additional metadata to make the numbers easier to work with, and reshapes the data to make it tidy. Finally, it plots a few employment rate charts:
employment rate
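
The cleaning steps described above can also be sketched in Python (this is not the R script itself). The sketch assumes that the file has a header row of series codes, data rows labelled like "2012", "2012 Q4" or "2013 JAN", and then a blank line followed by metadata. The series codes and values are made up for illustration.

```python
import csv
import io
import re

# Period labels: a year, optionally followed by a quarter or a three-letter month.
PERIOD = re.compile(r"^(\d{4})(?: (Q[1-4]|[A-Z]{3}))?$")

# Made-up sample in the assumed layout.
SAMPLE = "\n".join([
    ",AAAA,BBBB",
    "2012,71.0,8.0",
    "2012 Q4,71.4,7.8",
    "2013 JAN,71.5,",
    "",
    '"AAAA","Hypothetical employment rate"',
    '"BBBB","Hypothetical unemployment rate"',
])


def tidy_series(csv_text):
    """Drop the trailing metadata and return one record per series and period."""
    # Everything after the first blank line is treated as metadata.
    data_part = csv_text.replace("\r\n", "\n").strip().split("\n\n")[0]
    reader = csv.reader(io.StringIO(data_part))
    codes = next(reader)[1:]
    records = []
    for row in reader:
        if not row:
            continue
        match = PERIOD.match(row[0].strip())
        if match is None:
            continue
        year, sub = match.groups()
        if sub is None:
            frequency = "annual"
        elif sub.startswith("Q"):
            frequency = "quarterly"
        else:
            frequency = "monthly"
        for code, cell in zip(codes, row[1:]):
            if cell.strip():  # skip empty cells
                records.append({
                    "series": code,
                    "period": row[0].strip(),
                    "year": int(year),
                    "frequency": frequency,
                    "value": float(cell),
                })
    return records


records = tidy_series(SAMPLE)
for record in records:
    print(record)
```

Storing the frequency as its own column makes it easy to filter to, say, quarterly values before plotting.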

A pitfall of this approach to downloading data is that some variables such as GDP are published in different datasets depending on the month (preliminary estimate, second estimate etc.).

Rather than getting the data directly from ONS, you could try the Quandl API (HT: jamesz). I've only just started using Quandl, but it seems like a great resource and has data from hundreds of sources.