Announcing ipythonblocks.org

Way back…

About a year ago, inspired by Greg Wilson, I wrote ipythonblocks as a fun way for students (and anyone else!) to practice writing Python with immediate, step-by-step, visual feedback about what their code is doing. When I’ve taught using ipythonblocks it has always been a hit: people love making things they can see. And after making things, people love to share them.

Sometime last year Tracy Teal suggested I make a site where students could post their work from ipythonblocks, share it, and even grab the work of others to remix. Today I’m happy to announce that that site is live: ipythonblocks.org.

How it works

With the latest release of ipythonblocks, students can use the post_to_web and from_web methods to interact with ipythonblocks.org. post_to_web can include code cells from the notebook so the creation process can be shared, not just the final result. from_web can pull a grid from ipythonblocks.org for a student to remix locally. See this notebook for a demonstration.

Thank you

There are many people to thank for helping to make ipythonblocks.org possible. Thanks to Tracy Teal for the original idea, thanks to Rackspace and Jesse Noller for providing hosting, and thanks to Kyle Kelley for helping with ops and deployment. Most of all, thanks to my family for putting up with me working at a startup and taking on projects.

Broadcasting IPython Notebooks

A useful feature of the IPython Notebook is that you can set the server to broadcast so that others on your local network can see the server and your notebooks. This is especially nice for teachers: students can load your notebooks as you work, copy text out of them, and see them in their entirety instead of just what you have on screen. Here’s the outline of what to do, with detailed instructions below:

  1. Create an IPython profile with a password for the Notebook server.
  2. Figure out your IP address on the local network.
  3. Launch IPython in broadcast + read-only mode using your new profile.
  4. Have your students navigate to your Notebook server.

Continue reading “Broadcasting IPython Notebooks”

Data Provenance with GitPython

Data Provenance

When running scientific software it is considered a best practice to automatically record the versions of all the software you use. This practice is sometimes referred to as recording the provenance of the results and helps make your analysis more reproducible. Almost all software libraries will have a version number that you can somehow access from your own software. For example, NumPy’s version number is recorded in the variable numpy.__version__ and most Python packages will have something similar. Python’s version is in the variable sys.version (and, alternatively, sys.version_info).
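
As a rough sketch of what recording versions could look like, the snippet below gathers the Python version and the versions of a few named packages into one dictionary. It uses the standard library's importlib.metadata (Python 3.8 and later) rather than reading each package's __version__ attribute, and the function name collect_versions is made up for this illustration.

```python
import platform
import sys
from importlib import metadata


def collect_versions(package_names):
    """Return a dict of version strings for Python and the named packages.

    collect_versions is a hypothetical helper, not part of any library.
    """
    versions = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    for name in package_names:
        try:
            # Looks up the version recorded in the installed package metadata.
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    print(collect_versions(["numpy"]))
```

The resulting dictionary can be stored next to your results so that each output carries a record of the environment that produced it.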

However, a lot of personal or lab software doesn’t have a version number. The software might change so fast and be modified by so many people that manually incremented version numbers aren’t very practical. There’s still hope in this situation, though, if the software is under version control. (Your software is under version control, isn’t it?) In Subversion the keyword properties feature is often used to record provenance. There isn’t a compatible feature in Git, but for Python software in Git repositories we can engineer a provenance solution using the GitPython package.

Returning to Previous States with Git

When you make a commit in Git the state of the repository is recorded and given a label based on a hash of the commit data. We can use the commit hash to return to any recorded state of the repository using the “git checkout” command. This means that if you know the commit hash of your software when you created a certain set of results, you can always set your software back to that state to reproduce the same results. Very handy!

Recording the Commit Hash

When you import a Python module, code at the global level of the module is actually executed. This is often used to set global variables within the module, which is what we’ll do here. GitPython lets us interact with Git repos from Python and one thing we can do is query a repo to get the commit hash of the current “HEAD”. (HEAD is Git’s label for the commit that is currently checked out.)

We can use this so that, when our software modules are imported, they set a global variable containing the commit hash of their HEAD at the time the software was run. That hash can then be inserted into data products as a record of the software version used to create them. Here’s some code that gets and stores the hash of the HEAD of a repo:

from git import Repo
# Look up the hash of the commit HEAD points to when this module is imported.
MODULE_HASH = Repo('/path/to/repo/').head.commit.hexsha

If the module we’re importing is actually inside a Git repo we can use a bit of Python magic to get the HEAD hash without manually listing the path to the repo:

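import os.path
from git import Repo
# Directory containing this module file.
module_dir = os.path.abspath(os.path.dirname(__file__))
# Search upward from there so modules in subdirectories of the repo also work.
MODULE_HASH = Repo(module_dir, search_parent_directories=True).head.commit.hexsha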

(__file__ is a global variable Python automatically sets in imported modules.)
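
One possible way to insert the hash into a data product is to wrap the results in a small record before saving. This is only a sketch: the helper name with_provenance and the layout of the record are assumptions made for illustration, and the hash is passed in as a plain string so the example runs without GitPython installed.

```python
import json
from datetime import datetime, timezone


def with_provenance(results, module_hash):
    """Wrap results in a record that also stores the software commit hash.

    with_provenance and the record layout are hypothetical, for illustration.
    """
    return {
        "provenance": {
            "commit": module_hash,
            "created": datetime.now(timezone.utc).isoformat(),
        },
        "results": results,
    }


if __name__ == "__main__":
    # In real use the second argument would be the MODULE_HASH set at import.
    record = with_provenance({"mean": 1.5}, "0123abc")
    print(json.dumps(record, indent=2))
```

Anyone reading the saved file later can take the stored commit value and check out that state of the software to rerun the analysis.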

Versioned Data

Some data formats, especially those that are text based, can be easily stored in version control. If you can put your data in a Git repo then the same strategy as above can be used to get and store the HEAD commit of the data repo when you run your analysis, allowing you to reproduce both your software and data states during later runs. If your data does not easily fit into Git it’s still a good idea to record a unique identifier for the dataset, but you may need to develop that yourself (such as a simple list of all the data files that were used as inputs).
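
For data that does not live in Git, a home-grown identifier might look something like the sketch below. It goes one step beyond a plain list of input files by also hashing each file's contents, which is an addition for illustration and not something the approach above requires. The function names are hypothetical.

```python
import hashlib
import json


def file_sha256(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def dataset_manifest(paths):
    """Map each input file path to the hash of its contents."""
    return {p: file_sha256(p) for p in sorted(str(p) for p in paths)}


def dataset_id(manifest):
    """Collapse a manifest into one identifier for the whole dataset."""
    payload = json.dumps(manifest, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```

Storing either the manifest or the single identifier with your results gives you a way to check later whether the input files have changed.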

Teaching with ipythonblocks at UW

I’ve got a blog post up over on the Software Carpentry blog about trying out ipythonblocks in the classroom for the first time. Summary: it was a hit! The students really got a lot out of being able to immediately see the result of their code. We also did a lot of “what do you think this will do?”, which I think helped get the students thinking a bit more computationally. Some of the more advanced students even struck off on their own making their own designs instead of just sitting there bored.

I’m really looking forward to using ipythonblocks again at my next boot camps in May, and I hope others get some use out of it in the meantime!

Software Carpentry California Tour 2012

Over the course of eight days in October I taught at three Software Carpentry boot camps in California. It was utterly exhausting and tremendously rewarding and hopefully I’ll do it again sometime. I want to take a moment to post a little feedback from the third boot camp and mention some of the things I learned on the trip.

Continue reading “Software Carpentry California Tour 2012”

Teaching with the IPython Notebook

For a few months now I’ve been using the IPython Notebook as my primary teaching tool for Python topics. Within Software Carpentry we’re also switching over to using the Notebook for both in-person bootcamps and our online repository of material. Ethan White and I put together a post on this topic on the Software Carpentry blog and now Titus Brown has blogged with his own thoughts. We’ve put in a PyCon proposal for a panel on this topic in 2013.

The IPython developers have to be given a huge amount of credit for putting together the Notebook and the rest of IPython. The Notebook especially is quite a feat: a top-notch research/engineering/teaching tool all in one. And they aren’t resting on their laurels: they have a ton of ideas in mind for the Notebook in the future, including a slide-show mode. I’m definitely looking forward to seeing what they’ve got!

As with many open source projects, the IPython developers struggle to find the time and funding to write their software. If any open source project is helping with your job or your research you can easily help by citing the software in your papers and in public on social media or blogs. This gives the developers more ammunition the next time they’re writing grants, so please make your support known!

Software Carpentry at Johns Hopkins

This week Joshua Smith and I hosted a Software Carpentry boot camp at Johns Hopkins University in Baltimore. We also had awesome teaching help from Sasha Wood and Mike Droettboom.

We opted for a small class since this was our first time hosting a boot camp and because of space limitations. Based on the rate at which people signed up for the class it didn’t seem like there was massive local demand anyway, but we were pleasantly surprised when we had a student from Brooklyn, NY and a student commuting from Virginia. There is definitely some existing demand for the skills Software Carpentry offers and I’m glad we could put on an accessible boot camp for those people. Most of the rest of the students were physics and astronomy grad students or post-docs from JHU and STScI.

The boot camp was broken into four half-day courses: shell, Python, version control, and software engineering. Mike and I co-taught the Python and software engineering sessions.

The overall feedback from the students was quite positive and I’m looking forward to doing this again. (Here is the requisite good/bad Software Carpentry feedback post: http://software-carpentry.org/2012/06/feedback-from-johns-hopkins/.) Below I have some notes on the sections I taught, plus some overall thoughts.

Continue reading “Software Carpentry at Johns Hopkins”

Essential Everyday Python Links

This week I’ll be teaching beginning Python at a Software Carpentry bootcamp in Toronto and I’m planning to leave the students with my most frequently visited Python links. This is strictly core Python, no third-party packages.

What are your most visited core Python references?

Updated:

  • Prasanth suggests Doug Hellmann’s Python Module of the Week. Doug covers most of the modules in the Python standard library with nice descriptions and examples.