The Libraries of ipythonblocks.org

In this post I’ll describe the libraries used by ipythonblocks.org to turn requests into web pages and JSON to send back to users. In some future posts I’ll describe how it’s actually put on the internet. If you’re curious about the code you can see it on GitHub.

Back End

The back end consists of GET and POST REST endpoints for ipythonblocks to talk to and handlers for the site itself: main and about pages, a random grid redirect, and the individual grid views. In all there are about six handlers for all of ipythonblocks.org.

Framework

ipythonblocks.org is such a simple site that any lightweight framework could probably handle it. I went with Tornado mainly because I’ve used it before and I like the way applications are designed using Tornado. That it includes a template engine and a high performance web server are also pluses. If I’d not used Tornado, Flask and Jinja2 would have been my second choice.

Database

Choosing a database was something of an agonizing decision. You can choose from SQL, NoSQL, and key-value stores; and within each of those you have many more choices. I like the simplicity of working with schema-less databases like MongoDB, and I was very intrigued by RethinkDB, but in the interest of having a simple setup that allowed me to focus on developing app logic I ended up using sqlite. I use the dataset library to take care of some of the SQL overhead (like table creation) so that I can combine the simplicity of sqlite with a more NoSQL-like interface.

At some point I may want to move to another database, especially one running on a dedicated machine so that swapping the application server can be done without worrying about the database. When I get to that point I’ll probably take another look at RethinkDB and see if it’s ready for my application.

To avoid database lookups of recently visited pages I’m using memcached and talking to it from Python via the pylibmc library.
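As a rough sketch of that caching idea (not the site's actual code), the usual pattern is to check the cache first and fall back to the database only on a miss. Everything here is hypothetical: `DictCache` is an in-memory stand-in that mimics the `get`/`set` shape of a memcached client, and `fetch_grid` and `load_from_db` are made-up names.

```python
class DictCache:
    """In-memory stand-in with the get/set shape of a memcached client."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        # Return None on a miss, as memcached clients typically do.
        return self._store.get(key)

    def set(self, key, value):
        self._store[key] = value


def fetch_grid(grid_id, cache, load_from_db):
    """Return a grid, consulting the cache before the database."""
    key = "grid:{}".format(grid_id)
    grid = cache.get(key)
    if grid is None:
        grid = load_from_db(grid_id)  # cache miss: do the real lookup
        cache.set(key, grid)
    return grid


db_calls = []


def load_from_db(grid_id):
    # Hypothetical database lookup; records each call so they can be counted.
    db_calls.append(grid_id)
    return {"id": grid_id, "width": 5, "height": 5}


cache = DictCache()
first = fetch_grid(42, cache, load_from_db)
second = fetch_grid(42, cache, load_from_db)  # served from the cache
```

After both calls `db_calls` holds a single entry, since the second request never reaches the database.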

Logging

Python’s built-in logging can certainly get the job done, but its interface has some rough edges I don’t like. Configuration can be painful for sophisticated cases and any kind of structured logging requires custom formatting. I think Twiggy is a much more “Pythonic” approach to logging with simpler configuration and built-in structured logging. ipythonblocks.org was my first time using Twiggy and I’d use it again. (Though it is unfortunately not Python 3 compatible at this time.)

Other

Requests to the POST endpoint are validated using jsonschema. This provides protection for the app against incorrectly configured requests and can be used as a kind of documentation on what requests should look like.
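To give a feel for what that validation looks like, here is a minimal sketch using jsonschema. The schema is an invented example, not the one ipythonblocks.org uses; the field names `width`, `height`, and `secret` and the helper `is_valid_post` are assumptions for illustration.

```python
import jsonschema

# Hypothetical schema for a posted grid; the real one will differ.
GRID_POST_SCHEMA = {
    "type": "object",
    "properties": {
        "width": {"type": "integer", "minimum": 1},
        "height": {"type": "integer", "minimum": 1},
        "secret": {"type": "boolean"},
    },
    "required": ["width", "height"],
}


def is_valid_post(payload):
    """Return True if the decoded JSON payload matches the schema."""
    try:
        jsonschema.validate(payload, GRID_POST_SCHEMA)
    except jsonschema.ValidationError:
        return False
    return True


ok = is_valid_post({"width": 3, "height": 2})
wrong_type = is_valid_post({"width": "3", "height": 2})
missing_field = is_valid_post({"height": 2})
```

The schema doubles as documentation: anyone reading it can see which fields a request needs and what types they should be.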

I use the hashids library to turn the integer SQL IDs of grid entries into short strings, as in http://ipythonblocks.org/zcezcM. This is a URL form people are familiar with and it allows the implementation of “secret” grid posts that have public URLs but are difficult to find unless someone gives you the URL.

Users of ipythonblocks can include code with their posted grids and I use Pygments to highlight the syntax of the code and format it for HTML. Pygments is decent enough to escape HTML included in the posted code so I don’t have to worry about that breaking the page rendering. The color scheme used is Base16 Chalk Light via https://github.com/idleberg/base16-pygments.
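A small sketch of the Pygments step, assuming the posted code is Python and using the default HTML formatter (the site's actual lexer and formatter options may differ):

```python
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import PythonLexer

# Posted code that happens to contain HTML markup inside a string.
code = 'print("<b>hello</b>")'

# highlight() returns an HTML fragment with the source wrapped in spans.
html = highlight(code, PythonLexer(), HtmlFormatter())
```

The markup inside the string comes out as `&lt;b&gt;`, so it shows up as text on the page instead of being interpreted by the browser.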

Finally, I use ipythonblocks itself to turn grid data into rendered HTML via the same methods used by the IPython Notebook.

Front End

The back end renders and delivers static HTML to browsers (or JSON to ipythonblocks) so there isn’t much fancy going on in the front end. I use CSS media queries to adjust the site margins for small screens, and on the front page I use Pure CSS grids to make a responsive three-column layout that collapses to a single column on small screens.

ipythonblocks.org uses the Source family of fonts from Adobe delivered by Google Fonts.

Announcing ipythonblocks.org

Way back…

About a year ago, inspired by Greg Wilson, I wrote ipythonblocks as a fun way for students (and anyone else!) to practice writing Python with immediate, step-by-step, visual feedback about what their code is doing. When I’ve taught using ipythonblocks it has always been a hit; people love making things they can see. And after making things, people love to share them.

Sometime last year Tracy Teal suggested I make a site where students could post their work from ipythonblocks, share it, and even grab the work of others to remix. Today I’m happy to announce that that site is live: ipythonblocks.org.

How it works

With the latest release of ipythonblocks students can use post_to_web and from_web methods to interact with ipythonblocks.org. post_to_web can include code cells from the notebook so the creation process can be shared, not just the final result. from_web can pull a grid from ipythonblocks.org for a student to remix locally. See this notebook for a demonstration.

Thank you

There are many people to thank for helping to make ipythonblocks.org possible. Thanks to Tracy Teal for the original idea, thanks to Rackspace and Jesse Noller for providing hosting, and thanks to Kyle Kelley for helping with ops and deployment. Most of all, thanks to my family for putting up with me working at a startup and taking on projects.

Broadcasting IPython Notebooks

A useful feature of the IPython Notebook is that you can set the server to broadcast so that others on your local network can see the server and your notebooks. This is especially nice as a teacher so that students can load your notebooks as you work, copy text out of them, and see them in their entirety instead of just what you have on screen. Here’s the outline of what to do; the full post has detailed instructions:

  1. Create an IPython profile with a password for the Notebook server.
  2. Figure out your IP address on the local network.
  3. Launch IPython in broadcast + read-only mode using your new profile.
  4. Have your students navigate to your Notebook server.
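For step 2, one way to find your address from Python is sketched below. This is a best-effort trick, not part of IPython: `local_ip` is a made-up helper, and the address it connects to is an arbitrary documentation-range address chosen only so the OS picks an outgoing interface.

```python
import socket


def local_ip():
    """Best-effort guess at this machine's address on the local network."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Connecting a UDP socket sends no packets; it only asks the OS
        # which local interface it would route through.
        sock.connect(("192.0.2.1", 80))
        return sock.getsockname()[0]
    except OSError:
        # No usable route (e.g. not on any network): fall back to loopback.
        return "127.0.0.1"
    finally:
        sock.close()


ip = local_ip()
print(ip)
```

On machines with several network interfaces this may not pick the one your students can reach, so it is worth checking against your system's network settings.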

You—Yes, You—Should Go to SciPy 2013

The SciPy 2013 conference is coming up on June 24-29 in Austin, Texas and you should go. Here are some good reasons:

You’ll learn something. There are beginning, intermediate, and advanced tutorial tracks this year, so it would be pretty much impossible for you not to learn something at those. I’ll even be there with Katy Huff teaching a tutorial on version control and testing. Even if you can’t make it to the tutorials there will be lots of great talks and BOF sessions.

There are domain specific mini-symposia. If your field is represented you can go for a concentrated dose of relevant talks and to meet other Python users in your field. Here are the specific domains this year:

  • Astronomy & astrophysics
  • Bio-informatics
  • GIS – Geospatial Data Analysis
  • Medical imaging
  • Meteorology, climatology, and atmospheric and oceanic science

It’ll be fun! The scientific Python community is chock full of really nice people. Even if you’re new and just learning how to use Python you’ll meet people who are eager to talk and make you feel welcome. (If you find this is not the case, email me or tweet me and I will see if I can help.)

Diversity at SciPy

I’ve been going to SciPy since 2010 and every year the attendees and speakers have been disappointingly white and male. Last year Andy Terrel and I chided the conference organizers about this and it looks like this year the organizers (which include Andy) are actually trying to do something about diversity: there is a Diversity Statement, a Code of Conduct, and PyLadies will be there as a community sponsor.

If you’re not sure about SciPy because you’re worried you won’t fit in or won’t be welcome I want to be the first to tell you that you don’t need to worry and that you should come. Everyone who comes to SciPy has agreed to abide by the Code of Conduct and the conference organizers are there to help if you experience any problems. SciPy is a conference for everyone and having a more diverse community is good for all of us.

Data Provenance with GitPython

Data Provenance

When running scientific software it is considered a best practice to automatically record the versions of all the software you use. This practice is sometimes referred to as recording the provenance of the results and helps make your analysis more reproducible. Almost all software libraries will have a version number that you can somehow access from your own software. For example, NumPy’s version number is recorded in the variable numpy.__version__ and most Python packages will have something similar. Python’s version is in the variable sys.version (and, alternatively, sys.version_info).
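As a small sketch of what recording versions can look like, the snippet below gathers the Python version plus the installed versions of a few packages into one dictionary. It uses the standard library's `importlib.metadata` (Python 3.8+) rather than each package's `__version__` attribute, which is a swapped-in approach; `package_version` and `collect_versions` are hypothetical helper names.

```python
import platform
import sys
from importlib import metadata


def package_version(name):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def collect_versions(packages):
    """Build a record of the Python version and the given package versions."""
    record = {"python": platform.python_version()}
    for name in packages:
        record[name] = package_version(name)
    return record


versions = collect_versions(["numpy", "scipy"])
print(versions)
```

A record like this can be written alongside your results so the environment that produced them is never in doubt.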

However, a lot of personal or lab software doesn’t have a version number. The software might change so fast and be modified by so many people that manually incremented version numbers aren’t very practical. There’s still hope in this situation, though, if the software is under version control. (Your software is under version control, isn’t it?) In Subversion the keyword properties feature is often used to record provenance. There isn’t a compatible feature in Git, but for Python software in Git repositories we can engineer a provenance solution using the GitPython package.

Returning to Previous States with Git

When you make a commit in Git the state of the repository is recorded and given a label based on a hash of the commit data. We can use the commit hash to return to any recorded state of the repository using the “git checkout” command. This means that if you know the commit hash of your software when you created a certain set of results, you can always set your software back to that state to reproduce the same results. Very handy!

Recording the Commit Hash

When you import a Python module, code at the global level of the module is actually executed. This is often used to set global variables within the module, which is what we’ll do here. GitPython lets us interact with Git repos from Python and one thing we can do is query a repo to get the commit hash of the current “HEAD”. (HEAD is a label in Git pointing to the latest commit of whatever state the repository is currently in.)

What we can do with that is make it so that when our software modules are imported they set a global variable containing the commit hash of their HEAD at the time the software was run. That hash can then be inserted into data products as a record of the software version used to create them. Here’s some code that gets and stores the hash of the HEAD of a repo:

from git import Repo
MODULE_HASH = Repo('/path/to/repo/').head.commit.hexsha

If the module we’re importing is actually inside a Git repo we can use a bit of Python magic to get the HEAD hash without manually listing the path to the repo:

import os.path
from git import Repo
repo_dir = os.path.abspath(os.path.dirname(__file__))
MODULE_HASH = Repo(repo_dir).head.commit.hexsha

(__file__ is a global variable Python automatically sets in imported modules.)
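Once a module has its hash, inserting it into a data product can be as simple as storing it next to the results. The sketch below assumes JSON output; the `MODULE_HASH` value is a placeholder standing in for the one GitPython would provide, and `with_provenance` and the `software_commit` key are made-up names.

```python
import json

# Placeholder; in a real module this would come from GitPython as above.
MODULE_HASH = "0123456789abcdef0123456789abcdef01234567"


def with_provenance(results, commit_hash):
    """Wrap results in a record that names the software commit used."""
    return {"software_commit": commit_hash, "results": results}


record = with_provenance({"mean": 4.2}, MODULE_HASH)
text = json.dumps(record, indent=2, sort_keys=True)
print(text)
```

Anyone reading the file later can run git checkout with the recorded hash to get back the code that made it.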

Versioned Data

Some data formats, especially those that are text based, can be easily stored in version control. If you can put your data in a Git repo then the same strategy as above can be used to get and store the HEAD commit of the data repo when you run your analysis, allowing you to reproduce both your software and data states during later runs. If your data does not easily fit into Git it’s still a good idea to record a unique identifier for the dataset, but you may need to develop that yourself (such as a simple list of all the data files that were used as inputs).
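For data that lives outside Git, one possible home-grown identifier is a hash over the names and contents of the input files. This is only a sketch of that idea under my own assumptions (SHA-256, file base names included, order ignored); `dataset_fingerprint` is a hypothetical helper and the demo files are throwaway examples.

```python
import hashlib
import os
import tempfile


def dataset_fingerprint(paths):
    """Hash file names and contents into one identifier for the dataset."""
    digest = hashlib.sha256()
    # Sort by file name so the listing order does not change the result.
    for path in sorted(paths, key=os.path.basename):
        digest.update(os.path.basename(path).encode("utf-8"))
        digest.update(b"\0")
        with open(path, "rb") as f:
            # Read in chunks so large data files do not need to fit in memory.
            for chunk in iter(lambda: f.read(65536), b""):
                digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name, text in [("a.csv", "1,2,3\n"), ("b.csv", "4,5,6\n")]:
        path = os.path.join(tmp, name)
        with open(path, "w") as f:
            f.write(text)
        paths.append(path)

    before = dataset_fingerprint(paths)
    same = dataset_fingerprint(list(reversed(paths)))

    # Changing any input file changes the fingerprint.
    with open(paths[0], "a") as f:
        f.write("7,8,9\n")
    after = dataset_fingerprint(paths)
```

Recording this value with your results tells you later whether the inputs on disk are still the ones you analyzed.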

Install Scientific Python on Mac OS X

These instructions detail how I install the scientific Python stack on my Mac. You can always check the Install Python page for other installation options.

I’m running the latest OS X Mountain Lion (10.8) but I think these instructions should work back to Snow Leopard (10.6). These instructions differ from my previous set primarily in that I now use Homebrew to install NumPy, SciPy, and matplotlib. I do this because Homebrew makes it easier to compile these with non-standard options that work around an issue with SciPy on OS X.

I’ll show how I install Python and the basic scientific Python stack:

  • NumPy
  • SciPy
  • matplotlib
  • IPython
  • pandas

If you need other libraries they can most likely be installed via pip and any dependencies can probably be installed via Homebrew.

Command Line Tools

The first order of business is to install the Apple command line tools. These include important things like development headers, gcc, and git. Head over to developer.apple.com/downloads, register for a free account, and download (then install) the latest “Command Line Tools for Xcode” for your version of OS X.

If you’ve already installed Xcode on Lion or Mountain Lion then you can install the command line tools from the preferences. If you’ve installed Xcode on Snow Leopard then you already have the command line tools.

Homebrew

Homebrew is my favorite package manager for OS X. It builds packages from source, intelligently re-uses libraries that are already part of OS X, and encourages best practices like installing Python packages with pip.

To install Homebrew paste the following in a terminal:

ruby -e "$(curl -fsSL https://raw.github.com/mxcl/homebrew/go)"

The brew command and any executables it installs will go in the directory /usr/local/bin, so you want to make sure that goes at the front of your system’s PATH. As long as you’re at it, you can also add the directory where Python scripts get installed. Add the following line to your .profile, .bash_profile, or .bashrc file:

export PATH=/usr/local/bin:/usr/local/share/python:$PATH

At this point you should close your terminal and open a new one so that this PATH setting is in effect for the rest of the installation.

Python

Now you can use brew to install Python:

brew install python

Afterwards you should be able to run the commands

which python
which pip

and see

/usr/local/bin/python
/usr/local/bin/pip

for each, respectively. (It’s also possible to install Python 3 using Homebrew: brew install python3.)

NumPy

It is possible to use pip to install NumPy, but I use a Homebrew recipe to avoid some problems with SciPy. The recipe isn’t included in stock Homebrew, though; it requires “tapping” two other sources of Homebrew formulae:

brew tap homebrew/science
brew tap samueljohn/python

You can learn more about these at their respective repositories.

With those repos tapped you can almost install NumPy, but first you’ll have to use pip to install nose:

pip install nose

I compile NumPy against OpenBLAS to avoid a SciPy issue. Compiling OpenBLAS requires gfortran, which you can get via Homebrew:

brew install gfortran
brew install numpy --with-openblas

SciPy

And then you’re ready for SciPy:

brew install scipy --with-openblas

matplotlib

matplotlib generally installs just fine via pip but the custom Homebrew formula takes care of installing optional dependencies too:

brew install matplotlib

IPython

You’ll want Notebook support with IPython and that requires some extra dependencies, including ZeroMQ via brew:

brew install zeromq
pip install jinja2
pip install tornado
pip install pyzmq
pip install ipython

pandas

Pandas should install via pip:

pip install pandas

Testing It Out

The most basic test you can do to make sure everything worked is to open up an IPython session and type in the following:

import numpy
import scipy
import matplotlib
import pandas

If there are no errors then you’re ready to get started! Congratulations and enjoy!
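If you would like a slightly more informative check, here is a small sketch that tries each import and reports the version it finds. `check_stack` is a hypothetical helper, not part of any of these libraries, and it assumes each package exposes a `__version__` attribute (it reports "unknown" otherwise).

```python
import importlib


def check_stack(names=("numpy", "scipy", "matplotlib", "pandas")):
    """Map each package name to its version, or None if it fails to import."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None
        else:
            report[name] = getattr(module, "__version__", "unknown")
    return report


report = check_stack()
for name, version in sorted(report.items()):
    print(name, version if version is not None else "NOT INSTALLED")
```

Any package listed as not installed points you back to the matching step above.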

PyCon 2013 Review

PyCon 2013 was my first PyCon and it was, bar none, the best conference I’ve ever been to. And it wasn’t just the free Raspberry Pi or the Wreck-it-Ralph swag from Disney or the fact that I stood next to Guido for a minute during the poster session. No, PyCon is just good people. The Python community is diverse and accepting, and I can’t list all the awesome, kind people I met there.

There were, unfortunately, disappointments, but what other tech conference has a sold-out full-day education summit, or raises $10k for a community group, or raises $6k for cancer research and the John Hunter Memorial fund with a 5k fun run? And PyCon attendees were 20% women! It’s amazing to have been a part of a conference where community, generosity, and outreach were put front and center. I tried to do my small part by giving people directions during the tutorials.

Anyway, on to the specifics of what I did:

Tutorials

The first tutorial I went to was called “A beginner’s introduction to Pydata: how to build a minimal recommendation engine”. The intent of the tutorial was to introduce NumPy and pandas. I was hoping to learn some pandas-fu but I found the material poorly organized and didn’t feel like I was getting a good idea of why/when to use particular pandas features. The video for this one doesn’t seem to be up yet.

The second tutorial I went to was called “Bayesian statistics made simple” and this one was awesome! I was comfortable with Bayesian stats beforehand but a refresher never hurts and the instructor (Allen Downey) gave terrific explanations. He had a little Bayesian stats library for us to use in the programmatic examples, which was fun. (Though I had to re-compile NumPy and SciPy to get it to work. It used the one little corner of SciPy that’s often broken on Macs.) If you’re interested in learning more, Downey is working on a new book called Think Bayes that you can read for free, Fernando Perez has posted his notebook from the course, and you can watch the video.

Education Summit

The PyCon Education Summit brought together educators from all kinds of backgrounds from K-12 teachers to those teaching adults. I went due to my interest as an instructor for Software Carpentry. Most of the discussion focused on teaching Python/computation in long-form courses to people who have zero programming experience.

I didn’t take much concrete away from the summit, but I was impressed with the sheer level of energy going into the Python/education nexus. There are many people out there experimenting with Python in education and developing lessons that use Python. There are also a lot of user groups around the country (like the Boston Python Workshop) that are actively working to bring new people into the Python world. Many people do this in their spare time! That’s the kind of community devotion I love about Python.

I gave a five minute lightning talk at the summit that was part preview of my PyCon talk and part demo of ipythonblocks. The Notebooks for that are at nbviewer.ipython.org/5165758.

Talks

The first and most important thing you should know about the talks is that they were all recorded and the videos are online. There were about a million concurrent talk sessions and I’m still catching up on all the great stuff I missed. I highly recommend starting with the opening/closing statements from Jesse Noller and the Raspberry Pi keynote from Eben Upton.

I think there were standing ovations during each of those. And then there were the great regular talks I saw in person:

  • Python: A Toy Language by Dave Beazley
    • Do not miss a chance to see Dave Beazley talk. You will be thoroughly entertained and leave wondering why you do such boring things with your code. Here Beazley talks about using Python to control a hobby CNC mill.
  • How the Internet works by Jessica McKellar
    • Learn about the underlying structure and protocol of the web!
  • Awesome Big Data Algorithms by Titus Brown
    • Titus gives a great introduction to some algorithms and data structures that help deal with Data of Unusual Size. Also check out his blog post on the talk with links to his notebooks.
  • Who’s there? – Home Automation with Arduino/Raspberry Pi by Rupa Dachere
    • Rupa tells us how she built an automated front door camera. This talk was standing room only!
  • What Teachers Really Need from Us by Selena Deckelmann
    • Selena relates her experience getting to know teachers and how we as developers can best help them.

My Talk

I gave a talk titled “Teaching with the IPython Notebook” that focused on how the IPython Notebook can help students learning Python. (Primarily by simplifying their interface to Python.) It seemed to go well and I’m really glad I did it! The video is up and my presentation notebook is at nbviewer.ipython.org/5165431.

Posters

I stopped by Simeon Franklin’s poster about making Python more beginner friendly and I was really impressed with the level of interest surrounding the topic. Even Guido was there seriously engaged in this discussion. With engagement of this magnitude at that level I think we’ll see people putting serious effort into making Python more user friendly right out of the box, which will be wonderful.

Observations

As Wes McKinney noted on Twitter, there were two things everywhere at PyCon this year: the IPython Notebook and Raspberry Pis. It seemed like every other talk and tutorial was using the Notebook, and it’s no surprise: the Notebook is fantastic for presenting code plus supporting material and then sharing it. It’s a major boon to Python.

Everyone at the conference (plus some kids who came for free tutorials) left with a Raspberry Pi. These amazing little computers enable all kinds of projects, often attached to an Arduino for talking to hardware. In Eben Upton’s keynote I learned that the “Pi” in “Raspberry Pi” is for Python since much of the system is built on Python. The site raspberry.io has been set up as a community of projects that use RPis but I’m sure Googling turns up a ton more. A small, cheap, low powered, easy to program computer just has so many possibilities! I haven’t had a chance to start hacking on mine yet but I’m looking forward to it!

Thanks

A big thanks to STScI for sending me. Thanks to Greg Wilson for suggesting the talk idea and thanks to Titus Brown and Ethan White for looking over my proposal.

Heading to PyCon

Today I’m flying out to Santa Clara for PyCon! This will be my first one and I’m really excited!

Tutorials

On Wednesday I’m taking a couple of data oriented tutorials: Introduction to Pydata and Bayesian statistics made simple. I already know quite a bit on those topics but I’m looking forward to learning more about pandas and PyTables, and it can’t hurt to brush up on my statistics.

Education Summit

On Thursday I’m taking part in the first ever Python Education Summit. There should be a lot to learn about teaching with Python and I’m curious to see what is done in venues other than Software Carpentry.

Talks

I haven’t yet decided on all of the talks I want to see but Daniel Greenfield’s PyCon guide has given me some good ideas. My own talk on Teaching with the IPython Notebook is on Saturday at 1:55 PM. The talks immediately after mine look interesting; the first is on building scientific applications with Python and the second is about Numba, a tool for speeding up numeric calculations.

Job Fair

There are a ton of companies signed up for the PyCon Job Fair, and I’ll definitely be there with my resume!

Say Hi!

I’m looking forward to meeting a lot of people at PyCon! If you would like to meet up at the conference drop me a line on App.net, Twitter, Google+, or by email. See you there!

A Styled HTML Document from Markdown

There are many, many command line converters for turning Markdown into HTML, but for the most part these make HTML fragments, not full documents with CSS styling. That’s fine most of the time (e.g. when I’m writing blog posts), but sometimes I want a full, pretty document so I can print it out (typically for presentation notes).

To fill this hole I put together a small script that converts Markdown and wraps the HTML result in a template that includes Bootstrap CSS. I set the fonts to sans-serif and monospace so that they are taken from the defaults for your browser, making it easier to use your favorite fonts.

The script requires the Python libraries Python Markdown, mdx_smartypants (a Python-Markdown extension), and Jinja2.

#!/usr/bin/env python
"""Convert a Markdown file into a complete, Bootstrap-styled HTML document."""
import argparse
import sys

import jinja2
import markdown

# Page template; the converted Markdown is substituted for {{content}}.
TEMPLATE = """<!DOCTYPE html>
<html>
<head>
    <link href="http://netdna.bootstrapcdn.com/twitter-bootstrap/2.3.0/css/bootstrap-combined.min.css" rel="stylesheet">
    <style>
        body {
            font-family: sans-serif;
        }
        code, pre {
            font-family: monospace;
        }
        h1 code,
        h2 code,
        h3 code,
        h4 code,
        h5 code,
        h6 code {
            font-size: inherit;
        }
    </style>
</head>
<body>
<div class="container">
{{content}}
</div>
</body>
</html>
"""


def parse_args(args=None):
    d = 'Make a complete, styled HTML document from a Markdown file.'
    parser = argparse.ArgumentParser(description=d)
    parser.add_argument('mdfile', type=argparse.FileType('r'), nargs='?',
                        default=sys.stdin,
                        help='File to convert. Defaults to stdin.')
    parser.add_argument('-o', '--out', type=argparse.FileType('w'),
                        default=sys.stdout,
                        help='Output file name. Defaults to stdout.')
    return parser.parse_args(args)


def main(args=None):
    args = parse_args(args)
    md = args.mdfile.read()
    # Convert the Markdown to an HTML fragment, then wrap it in the template.
    extensions = ['extra', 'smarty']
    html = markdown.markdown(md, extensions=extensions, output_format='html5')
    doc = jinja2.Template(TEMPLATE).render(content=html)
    args.out.write(doc)


if __name__ == '__main__':
    sys.exit(main())

ipythonblocks – A Visual Tool for Practicing Python

Learning to program and learning the basics of control flow can be tricky business for novices. I wanted to make something that provided immediate, visual feedback to students as they practice things like for loops and if statements so they can see precisely what their code is (or isn’t) doing. So I wrote ipythonblocks.

The IPython Notebook makes it possible to display rich representations of Python objects using HTML (among other things). That allowed me to make a Python object whose representation in the Notebook is a colored table. Students can index into the table to change the color properties of individual table cells and then immediately display their changes.

With ipythonblocks instructors can give coding problems like ‘turn every block in the third column red’ or ‘turn every blue block green’ and by displaying their blocks students can see right away whether their code is having the desired effect.
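To show the mechanism without ipythonblocks itself, here is a toy sketch of an object that the Notebook would display as a colored table. `TinyGrid` is invented for illustration and is not how ipythonblocks is implemented; the one real piece is `_repr_html_`, the method name the Notebook looks for when displaying an object.

```python
class TinyGrid:
    """A toy grid of RGB colors that renders as an HTML table."""

    def __init__(self, width, height, fill=(0, 0, 0)):
        self.width = width
        self.height = height
        self._cells = [[fill for _ in range(width)] for _ in range(height)]

    def __getitem__(self, index):
        row, col = index
        return self._cells[row][col]

    def __setitem__(self, index, color):
        row, col = index
        self._cells[row][col] = tuple(color)

    def _repr_html_(self):
        # The Notebook calls this method to get the rich representation.
        rows = []
        for row in self._cells:
            cells = "".join(
                '<td style="width:20px;height:20px;'
                'background-color:rgb({},{},{})"></td>'.format(*color)
                for color in row
            )
            rows.append("<tr>{}</tr>".format(cells))
        return "<table>{}</table>".format("".join(rows))


# "Turn every block in the third column red."
grid = TinyGrid(4, 3)
for row in range(grid.height):
    grid[row, 2] = (255, 0, 0)

html = grid._repr_html_()
```

In a Notebook, leaving `grid` as the last expression in a cell would show the table directly, so a student sees the red column as soon as the loop is right.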

Check out the demo notebook to see ipythonblocks in action.
