You—Yes, You—Should Go to SciPy 2013

The SciPy 2013 conference is coming up on June 24-29 in Austin, Texas, and you should go. Here are some good reasons:

You’ll learn something. There are beginning, intermediate, and advanced tutorial tracks this year, so it would be pretty much impossible for you not to learn something at those. I’ll even be there with Katy Huff teaching a tutorial on version control and testing. Even if you can’t make it to the tutorials, there will be lots of great talks and BOF sessions.

There are domain-specific mini-symposia. If your field is represented, you can go for a concentrated dose of relevant talks and meet other Python users in your field. Here are the specific domains this year:

  • Astronomy & astrophysics
  • Bio-informatics
  • GIS – Geospatial Data Analysis
  • Medical imaging
  • Meteorology, climatology, and atmospheric and oceanic science

It’ll be fun! The scientific Python community is chock full of really nice people. Even if you’re new and just learning how to use Python you’ll meet people who are eager to talk and make you feel welcome. (If you find this is not the case, email me or tweet me and I will see if I can help.)

Diversity at SciPy

I’ve been going to SciPy since 2010, and every year the attendees and speakers have been disappointingly white and male. Last year Andy Terrel and I chided the conference organizers about this, and it looks like this year the organizers (who include Andy) are actually trying to do something about diversity: there is a Diversity Statement, a Code of Conduct, and PyLadies will be there as a community sponsor.

If you’re not sure about SciPy because you’re worried you won’t fit in or won’t be welcome, I want to be the first to tell you that you don’t need to worry and that you should come. Everyone who comes to SciPy has agreed to abide by the Code of Conduct, and the conference organizers are there to help if you experience any problems. SciPy is a conference for everyone, and having a more diverse community is good for all of us.


SciPy 2012

Last week was the SciPy 2012 conference in Austin, TX. I’ve attended for the last three years, and it’s a good event for the tutorials on scientific Python packages, for seeing what crazy fields people are using Python in (brain-robotics interaction!), for hacking on your favorite project, and for meeting the maintainers of some of the packages you use every day. I recommend it especially to people who are new to scientific Python, as you’ll learn a lot your first year. All of the tutorials and talks from SciPy are recorded and should be up on the web soon. You can also check out the #scipy2012 stream on Twitter.

Tutorials

This year I went to tutorials on scikit-learn, HDF5 with PyTables, pandas, and IPython:

scikit-learn

scikit-learn is the package for machine learning in Python. I haven’t had occasion to use it, but if I ever need to do ML, scikit-learn will be the first tool I reach for. The documentation is excellent and the interfaces seem simple and logical. Jake VanderPlas’ excellent tutorial is online at http://astroml.github.com/sklearn_tutorial/.
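
To give a flavor of that interface, here is a minimal sketch that is not taken from the tutorial. The toy data and the choice of a nearest-neighbors classifier are assumptions made purely for illustration; the point is only the fit-then-predict pattern that the package's estimators share.

```python
from sklearn.neighbors import KNeighborsClassifier

# Made-up one-dimensional training data: small values are class 0,
# large values are class 1.
X = [[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]]
y = [0, 0, 0, 1, 1, 1]

# Estimators are configured in the constructor, trained with fit(),
# and applied to new samples with predict().
clf = KNeighborsClassifier(n_neighbors=3)
clf.fit(X, y)

# Classify two new samples, one near each cluster.
pred = clf.predict([[1.5], [10.5]])
print(pred)
```

Swapping in a different model generally means changing only the import and the constructor line; the fit and predict calls stay the same.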

HDF5 with PyTables

HDF5 is a data format that can handle pretty much any kind of data, and when paired with PyTables it seems to be especially useful for working efficiently with very large datasets. One nice feature of HDF5 is that you can put many different arbitrary datasets in one file and attach separate metadata to every node of the file.
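
The sketch below shows what "many datasets in one file, with metadata on every node" can look like using the current PyTables API. The file name, group and array names, and attribute values are all hypothetical, and the arrays are tiny stand-ins for real data.

```python
import os
import tempfile

import numpy as np
import tables

# Hypothetical output file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "example.h5")

with tables.open_file(path, mode="w") as h5:
    # A group is a node too, so it can carry its own metadata.
    obs = h5.create_group("/", "observations")
    obs._v_attrs.instrument = "hypothetical-telescope"

    # One dataset inside the group, with a units attribute.
    temps = h5.create_array(obs, "temperature", np.array([280.5, 281.0, 279.8]))
    temps.attrs.units = "K"

    # An unrelated dataset in the same file, with different metadata.
    counts = h5.create_array("/", "counts", np.arange(5))
    counts.attrs.description = "unrelated integer data"

with tables.open_file(path, mode="r") as h5:
    node = h5.get_node("/observations/temperature")
    values = node.read()        # the stored array
    units = node.attrs.units    # metadata attached to this node
    instrument = h5.get_node("/observations")._v_attrs.instrument

print(values, units, instrument)
```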

pandas

pandas has been skyrocketing in popularity lately and for good reason. It has extremely slick indexing and slicing, handling of missing data, and intelligent data alignment. The DataFrame object supports SQL-like merging and joining. Time handling is really amazing: pandas supports time zones and daylight saving time, and it has really rich support for different periods and frequencies. If you’re working with time series or tabular data, you should really give pandas a try.
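
A small sketch of the alignment, missing-data, and merging features mentioned above. The series, dates, and column names are made up for illustration.

```python
import pandas as pd

# Two daily series whose dates only partially overlap.
a = pd.Series([1.0, 2.0, 3.0],
              index=pd.date_range("2012-07-16", periods=3, freq="D"))
b = pd.Series([10.0, 20.0, 30.0],
              index=pd.date_range("2012-07-17", periods=3, freq="D"))

# Arithmetic aligns on the date index; a date present in only one
# series produces a missing value (NaN) in the result.
total = a + b

# Missing values are handled explicitly, here by filling with zero.
filled = total.fillna(0.0)

# SQL-like inner join of two small tables on a shared key column.
left = pd.DataFrame({"key": ["x", "y", "z"], "lval": [1, 2, 3]})
right = pd.DataFrame({"key": ["y", "z", "w"], "rval": [20, 30, 40]})
joined = pd.merge(left, right, on="key", how="inner")

print(total)
print(joined)
```

Only the two overlapping dates get a real sum, and only the keys present in both tables survive the inner join.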

IPython

I’ve written about IPython before. The notebook is the big star these days and they demoed some really cool features, including making interactive plots with JavaScript. All of their tutorial notebooks are available on GitHub and I recommend you check them out if you’re curious about all the crazy stuff the notebook can do.

Talks

All of the talks are listed on the SciPy website. Many of the talks tend to be very domain-specific, but the keynotes are always interesting. This year John Hunter talked about the growth of matplotlib and Joshua Bloom talked about how he has used Python throughout his astronomy career, from machine learning to remotely controlling telescopes.

The theme of SciPy 2012 seemed to be high-performance computing. This has been the case in years past, but while in previous years the focus was on the cloud or GPUs, this year seemed to be about serial optimization using fancy compilers, especially LLVM. Numba and Julia seemed to be the stars of the conference in this respect. (Julia isn’t actually related to Python, but the language syntax does look a lot like Python. People seemed especially excited about the prospect of calling Julia from Python much as you can call C from Python. Anything to not have to work in C…)

One thing that caught my eye during the lightning talks was pipe-o-matic, a Python tool for making data pipelines. It’s still coming together but I could see it being useful. Pipelines are defined in simple text files and the executables and their versions are fully specified. This means you can archive/version control your pipelines along with your data to track which versions of which programs were used to make the data. pipe-o-matic supposedly has full support for resuming or restarting pipelines after errors. It looks simple and lightweight.

Astronomy Mini-Symposium

This year there were topical mini-symposia on geophysics, astronomy, meteorology, and bioinformatics. The astronomy talks included AstroPy, astroML for machine learning in astronomy, and the yt visualization toolkit for astronomical simulations. yt makes some seriously beautiful visualizations.

It was great to have the topical symposium and see what others are doing with Python in astronomy. I hope these sessions keep showing up at SciPy and other places. Maybe AAS should have some Python sessions?

Sprints

This was my first year participating in the sprints, which are basically hack days where project developers organize groups of programmers to tackle certain issues in their projects. There were groups working on scikit-learn, scikit-image, AstroPy, matplotlib, IPython, and more. I ended up working on my own project, brewer2mpl, which I will discuss in another blog post.

Other Thoughts

Each time I’ve been to SciPy over the last three years I’ve been struck by how few women are there (~2% of attendees). Maybe I shouldn’t be shocked to see so few women at a conference on scientific programming, but it’s really pathetic, and it would be nice to see the organizers make an effort to recruit a more diverse group of attendees. I think a significant fraction of the women at SciPy 2012 were astronomers. Keep up the good work, astronomy!
