{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from IPython.display import display, HTML\n", "\n", "import pandas as pd\n", "\n", "from jinja2 import Template\n", "\n", "from bokeh.models import (\n", " ColumnDataSource, Plot, Circle, Range1d, \n", " LinearAxis, HoverTool, Text,\n", " SingleIntervalTicker, Slider, CustomJS\n", ")\n", "from bokeh.palettes import Spectral6\n", "from bokeh.plotting import vplot\n", "from bokeh.resources import JSResources\n", "from bokeh.embed import file_html\n", "\n", "from data import process_data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Setting up the data\n", "The plot animates as the slider moves, showing the data over time from 1964 to 2013. We can think of each year as a separate static plot; when the slider moves, a CustomJS callback swaps in the data source that drives the plot.\n", "\n", "We could use bokeh-server to drive this change, but since the data is not too big we can instead pass all the datasets to the JavaScript at once and switch between them on the client side.\n", "\n", "This means we need to build one data source for each year of data, so that the slider can switch between them. We build them and add them to a dictionary `sources`, keyed by the year prefixed with a `_`." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [], "source": [ "fertility_df, life_expectancy_df, population_df_size, regions_df, years, regions = process_data()\n", "\n", "sources = {}\n", "\n", "region_color = regions_df['region_color']\n", "region_color.name = 'region_color'\n", "\n", "# Build one ColumnDataSource per year, stored under the key '_<year>'\n", "for year in years:\n", " fertility = fertility_df[year]\n", " fertility.name = 'fertility'\n", " life = life_expectancy_df[year]\n", " life.name = 'life'\n", " population = population_df_size[year]\n", " population.name = 'population'\n", " new_df = pd.concat([fertility, life, population, region_color], axis=1)\n", " sources['_' + str(year)] = ColumnDataSource(new_df)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`sources` looks like this:\n", "\n", "```\n", "{'_1964': ,\n", " '_1965': ,\n", " '_1966': ,\n", " '_1967': ,\n", " '_1968': ,\n", " '_1969': ,\n", " '_1970': ,\n", " '_1971': ...\n", "\n", "```\n", "\n", "We will pass this dictionary to the CustomJS callback as its `args`. As a result, our JavaScript will have a variable called, for example, `_1964` that refers to the corresponding ColumnDataSource. Note that we need the `_` prefix because JS identifiers cannot begin with a digit.\n", "\n", "Finally, we construct a string that we can insert into our JavaScript code to define an object.\n", "\n", "The string looks like this: `{1964: _1964, 1965: _1965, ...}`\n", "\n", "Note that the keys of this object are the years as integers and the values are references to our ColumnDataSources from above. In our JS code we therefore have an object that stores all of our ColumnDataSources, and we can look them up by year. 
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [], "source": [ "dictionary_of_sources = dict(zip([x for x in years], ['_%s' % x for x in years]))\n", "js_source_array = str(dictionary_of_sources).replace(\"'\", \"\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Build the plot" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Set up the plot\n", "xdr = Range1d(1, 9)\n", "ydr = Range1d(20, 100)\n", "plot = Plot(\n", " x_range=xdr,\n", " y_range=ydr,\n", " title=\"\",\n", " plot_width=800,\n", " plot_height=400,\n", " outline_line_color=None,\n", " toolbar_location=None, \n", ")\n", "AXIS_FORMATS = dict(\n", " minor_tick_in=None,\n", " minor_tick_out=None,\n", " major_tick_in=None,\n", " major_label_text_font_size=\"10pt\",\n", " major_label_text_font_style=\"normal\",\n", " axis_label_text_font_size=\"10pt\",\n", "\n", " axis_line_color='#AAAAAA',\n", " major_tick_line_color='#AAAAAA',\n", " major_label_text_color='#666666',\n", "\n", " major_tick_line_cap=\"round\",\n", " axis_line_cap=\"round\",\n", " axis_line_width=1,\n", " major_tick_line_width=1,\n", ")\n", "\n", "xaxis = LinearAxis(ticker=SingleIntervalTicker(interval=1), axis_label=\"Children per woman (total fertility)\", **AXIS_FORMATS)\n", "yaxis = LinearAxis(ticker=SingleIntervalTicker(interval=20), axis_label=\"Life expectancy at birth (years)\", **AXIS_FORMATS) \n", "plot.add_layout(xaxis, 'below')\n", "plot.add_layout(yaxis, 'left')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Add the background year text\n", "We add this first so it is below all the other glyphs" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Add the year in background (add before circle)\n", "text_source = ColumnDataSource({'year': ['%s' % years[0]]})\n", "text = Text(x=2, y=35, 
text='year', text_font_size='150pt', text_color='#EEEEEE')\n", "plot.add_glyph(text_source, text)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Add the bubbles and hover\n", "We add the bubbles using the Circle glyph. We start from the first year of data and that is our source that drives the circles (the other sources will be used later).\n", "\n", "plot.add_glyph returns the renderer, and we pass this to the HoverTool so that hover only happens for the bubbles on the page and not other glyph elements." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Add the circle\n", "renderer_source = sources['_%s' % years[0]]\n", "circle_glyph = Circle(\n", " x='fertility', y='life', size='population',\n", " fill_color='region_color', fill_alpha=0.8, \n", " line_color='#7c7e71', line_width=0.5, line_alpha=0.5)\n", "circle_renderer = plot.add_glyph(renderer_source, circle_glyph)\n", "\n", "# Add the hover (only against the circle and not other plot elements)\n", "tooltips = \"@index\"\n", "plot.add_tools(HoverTool(tooltips=tooltips, renderers=[circle_renderer]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Add the legend\n", "\n", "Finally we manually build the legend by adding circles and texts to the upper-right portion of the plot." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "text_x = 7\n", "text_y = 95\n", "for i, region in enumerate(regions):\n", " plot.add_glyph(Text(x=text_x, y=text_y, text=[region], text_font_size='10pt', text_color='#666666'))\n", " plot.add_glyph(Circle(x=text_x - 0.1, y=text_y + 2, fill_color=Spectral6[i], size=10, line_color=None, fill_alpha=0.8))\n", " text_y = text_y - 5" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Add the slider and callback\n", "Next, we add the slider widget and the JS callback code, which changes the data of `renderer_source` (powering the bubbles) and the data of `text_source` (powering the background year text). In the code below, the callback updates each source by calling set() on its `data`; it makes no separate trigger() call. `slider`, `renderer_source` and `text_source` are all available in the JS because we add them to the `args` of the CustomJS callback.\n", "\n", "It is the combination of `sources = %s` (filled in with `js_source_array`) in the JS and `CustomJS(args=sources, ...)` that lets us look up, by year, the JS version of our Python-made ColumnDataSource." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# JS run on every slider change: swap in the selected year's data\n", "code = \"\"\"\n", " var year = slider.get('value'),\n", " sources = %s,\n", " new_source_data = sources[year].get('data');\n", " renderer_source.set('data', new_source_data);\n", " text_source.set('data', {'year': [String(year)]});\n", "\"\"\" % js_source_array\n", "\n", "callback = CustomJS(args=sources, code=code)\n", "# Start the slider on the first year so it matches the initial plot\n", "slider = Slider(start=years[0], end=years[-1], value=years[0], step=1, title=\"Year\", callback=callback, name='testy')\n", "# Expose the remaining objects to the JS under these names\n", "callback.args[\"renderer_source\"] = renderer_source\n", "callback.args[\"slider\"] = slider\n", "callback.args[\"text_source\"] = text_source" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Embed in a template and render\n", "Finally, we use vplot to stick together the chart and the slider, and embed the result in our custom Jinja template using `file_html`.\n", "\n", "We then display the resulting HTML in IPython." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "scrolled": false }, "outputs": [], "source": [ "# Stick the plot and the slider together\n", "layout = vplot(plot, slider)\n", "\n", "# Open our custom template\n", "with open('gapminder_template.jinja', 'r') as f:\n", " template = Template(f.read())\n", "\n", "# Use inline resources\n", "js_resources = JSResources(mode='inline') \n", "html = file_html(layout, resources=(js_resources, None), title=\"Bokeh - Gapminder Bubble Plot\", template=template)\n", "\n", "display(HTML(html))" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.1" } }, "nbformat": 4, "nbformat_minor": 0 }