More Commits via the GitHub API

I wrote a bit ago about making commits via the GitHub API. That post outlined making changes in two simplified situations: making changes to a single file and making updates to two existing files at the root of the repository. Here I show a more general solution that allows arbitrary changes anywhere in the repo.

I want to be able to specify a repo and branch and say "here are the contents of files that have changed or been created and here are the names of files that have been deleted, please take all that and this message and make a new commit for me." Because the GitHub API is so rudimentary when it comes to making commits that will end up being a many-stepped process, but it’s mostly the same steps repeated many times so it’s not a nightmare to code up. At a high level the process goes like this:

  • Get the current repo state from GitHub
    • This is the names and hashes of all the files and directories, but not the actual file contents.
  • Construct a local, malleable representation of the repo
  • Modify the local representation according to the given updates, creations, and deletions
  • Walk though the modified local "repo" and upload new/changed files and directories to GitHub
    • This must be done from the bottom up because a change at the low level means every directory above that level will need to be changed.
  • Make a new commit pointed at the new root tree (I’ll explain trees soon.)
  • Update the working branch to point to the new commit

This blob post is readable as an IPython Notebook at I’ve also reproduced the notebook below. Read More »