Remote ipython notebooks

I use ipython notebooks a lot: for interactive tinkering, for calculations with units, for interactive control of long-running processes, for easy parallelization (details to come), and to generate plots for all my papers. When I can, I run the notebooks on a server instead of my rather wimpy laptop; apart from the obvious advantages of more cores, more disk space, and more RAM, this also means I can leave things running for ages. It's all very convenient, but it requires some setup. As with the ssh configuration, I've found myself explaining the setup to several different people recently, so I thought I'd put it here. (Still working on that cute-kittens post.)

The first step is to get ipython notebooks installed. It may well be part of your server's distribution, or you might end up using pip and virtualenv (you might want to do it this way anyway; I use virtualenvwrapper too). The only thing to watch out for is that sufficiently old ipython setups don't auto-save your notebooks; this was terrible, but it has been fixed for ages now.

Most of the ipython options are conveniently set up in an ipython profile; to create one, run
ipython profile create
which should create a directory ~/.config/ipython/profile_default/ full of python files. Each such file contains default settings with fairly elaborate descriptions. For the notebook settings, you want ipython_notebook_config.py.

I usually set:
c.NotebookApp.open_browser = False
since the server can't really open a browser window on my laptop anyway. 

Also:
c.NotebookApp.port = 9888
where 9888 stands for whichever port I've set up forwarding for in my ssh configuration. (I should say that if your server has publicly-accessible ports you can avoid forwarding and set up HTTPS access, but all of my servers require me to go through an ssh portal host, so I just use port forwarding and a password.) If this port is taken, the notebook will try this port plus one, two, et cetera, so if you're going to set up ssh port forwarding it makes sense to forward several consecutive ports. You'll also want to make sure the number you pick isn't one that other users on the same machine are already using.
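
If you want to see which port the notebook is likely to land on before you start it, something like the following can be run on the server. It is only a sketch of the "try the next port up" idea described above, not ipython's own code; the function name first_free_port and the tries parameter are made up for this example.

```python
import socket


def first_free_port(base, tries=5, host="127.0.0.1"):
    """Return the first port in base, base+1, ... that can be bound, or None.

    Hypothetical helper: it only mimics the idea of stepping upward from
    the configured port, it is not what the notebook server itself runs.
    """
    for port in range(base, base + tries):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            try:
                s.bind((host, port))
            except OSError:
                continue  # something is already bound here, try the next port
            return port
    return None


if __name__ == "__main__":
    # 9888 is the example port used in the configuration above.
    print(first_free_port(9888))
```

If this prints something other than the port you configured, someone (possibly you, in another tmux window) already holds that port, and the ssh forwarding needs to cover the port that is actually free.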

You'll also want to set the password with this option, though you'll need to change the value:
c.NotebookApp.password = \
    u'sha1:70ff5f0a3de4:10400d2fd44981f583b718e824b5dd242a289847'
The way to generate a hashed password is to start a non-notebook ipython session, import passwd from IPython.lib, and call it. The reason it can't be a notebook is that it will prompt you for a password (twice) and then print out the hash. Yes, it's annoying to have yet another password, but since the notebook is accessed through a TCP port, there's no per-user access control even if you're using the default of listening only to localhost. So other users of the same machine can connect to your notebook servers if you don't use a password.
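
To make the hash string less mysterious: as far as I understand the format, it is three colon-separated fields, namely the algorithm name, a random hex salt, and the hex digest of the password followed by the salt. The sketch below reproduces that layout with nothing but the standard library. Treat it as an illustration under that assumption; make_hash and check_hash are names I made up, and for a real config you should still use IPython.lib.passwd.

```python
import hashlib
import secrets


def make_hash(passphrase, algorithm="sha1", salt_len=12):
    """Build an 'algorithm:salt:digest' string (assumed layout, see above)."""
    salt = secrets.token_hex(salt_len // 2)  # salt_len hex characters
    h = hashlib.new(algorithm)
    h.update(passphrase.encode("utf-8") + salt.encode("ascii"))
    return ":".join((algorithm, salt, h.hexdigest()))


def check_hash(stored, passphrase):
    """Recompute the digest from the stored salt and compare."""
    algorithm, salt, digest = stored.split(":", 2)
    h = hashlib.new(algorithm)
    h.update(passphrase.encode("utf-8") + salt.encode("ascii"))
    return h.hexdigest() == digest


if __name__ == "__main__":
    stored = make_hash("not my real password")
    print(stored)
    print(check_hash(stored, "not my real password"))
```

The point of the salt is that two people with the same password end up with different strings in their config files, so the hash in this post is no use to anyone as a lookup key.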

I always use inline plots, and it takes substantial jiggery-pokery (I use a forwarding X proxy) to make pop-up plots work on a persistent notebook session, so I set inline plots as the default:
c.IPKernelApp.pylab = "inline"
You can change modes from within a notebook, so if you have successfully jiggered and poked this doesn't limit your options.

Once you have the config files set up, I recommend starting a tmux or screen session on the server. The notebook server process will live inside a window here so that it survives disconnection. This is also where stray output winds up: normal python processes send their output to the notebook, but other programs may not. So I cd to the directory that will contain all my notebooks for this project and just run ipython notebook. (Sadly the notebook browser isn't, or at least wasn't, very flexible; you basically have to run one notebook server, on a different port, for each project you want to have open at once.) Then I ask my browser to load http://localhost:9888/ (where 9888 is again the port number you chose and that is being forwarded through ssh). This should bring up the notebook server, where you can create notebooks or open existing ones.

A notebook lives half in your browser and half on the server. Fortunately notebooks are now auto-saved fairly regularly, and the "save" button triggers the recording of a checkpoint that you can go back to (though only the most recent one; you may still want to use git). So you can think of the notebook as residing on the server. If your connection drops, you can just reconnect and the notebook should come right back. If the browser window goes away, you'll get the notebook as of last auto-save (usually less than a minute ago). In either case the ipython kernel retains the state of all computations as you left it.

Connecting to the notebook is just a matter of sshing into the machine and then remembering which port this particular notebook server is on (usually I use one port number per machine; I could probably bookmark this in the browser). I'm not sure what happens if you have the same notebook open in two browser windows or on two machines, since both try to autosave regularly. But since access depends on the ssh connection, if you find you left a notebook open in a browser in another city you can always kill that ssh connection and kick the other browser off.
