Mathematics Tool Kit - Working on @CSS blocker about dealing with library paths in MTK - Open Design Engine

Working on @CSS blocker about dealing with library paths in MTK

So, Chris is working on YVN-10 (Create unit tests for the PressureVessel class and integrate them into its MTK document as the harness for the reported test results). As part of that work he has been reorganizing the Yavin project directory into something that looks like this:

Yavin/
- lib/
  - chamber/
    - pressure_vessel_calcs.py: the Yavin module used to size the chamber's wall thickness
- documentation/
  - MTK notebook files
- tests/
  - chamber/
    - pressure_vessel_test_case_dataset.py: data set for pressure vessel test cases and documentation
    - test_math.py: unit tests for the pressure vessel calcs class

So, why am I talking about this over in MTK? Well, this new structure highlights a problem in MTK: when Chris goes to import the pressure vessel calcs class (or anything else from parallel directories) into the notebook(s) in the documentation directory, he gets import errors. This is blocking his progress. At this week's #EngineerSpeak hangout I agreed to help Chris look for some ways to get around this issue.

At the heart of this block is the fact that each Jupyter notebook gets its own IPython kernel, running as its own process, with its current working directory set to the notebook's directory. If you want to import libraries from the notebook directory or one of its sub-directories, everything works great. But if you want to import from a parent or sibling directory, you are out of luck. After poking around how libraries are searched for in Python, I found this little gem: Python looks for modules in the directories listed in sys.path, and that list can be changed at run time. So, I know we can easily add a directory to the search path; we just need to choose one and then have a reliable way to tell the notebook to use it.
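To make the search path point concrete, here is a minimal sketch of the effect. It builds a throwaway directory with a small package and shows the import failing until the directory is placed on sys.path. The names mtk_demo_lib and calcs are hypothetical stand-ins, not part of Yavin or MTK.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway project directory containing one small package.
# mtk_demo_lib and calcs are hypothetical names used only for this demo.
project = Path(tempfile.mkdtemp())
package = project / "mtk_demo_lib"
package.mkdir()
(package / "__init__.py").write_text("")
(package / "calcs.py").write_text("ANSWER = 42\n")


def try_import(name):
    """Return the imported module, or None if Python cannot find it."""
    importlib.invalidate_caches()
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# The project directory is not on the search path yet, so this fails.
before = try_import("mtk_demo_lib.calcs")

# Add the project directory to the front of the search path and retry.
sys.path.insert(0, str(project))
after = try_import("mtk_demo_lib.calcs")
```

The same mechanism should apply inside a notebook kernel: once the right directory is on sys.path, imports from sibling directories resolve like any other package import.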

In terms of choosing one, I think the top level project directory (Yavin/ in the example we are working with) makes the most sense. Now, as for finding it, I am thinking we borrow from the library search process in Python and require that a marker file be placed in this directory to identify it as the top level project directory (say __mtk_project_root__.py, which we can later use to hold project specific notebook initialization commands). Then we just need to include some initialization code, run for each notebook kernel, that searches up the kernel's directory structure for this file and adds the directory where it is found to the Python sys.path.

I tested out this idea today with some inline code in a couple of notebooks (see below). The code is clearly just draft code (it should use directory objects instead of strings, it should be bulletproofed for when the __mtk_project_root__.py is not found, etc). But, in a test environment, it proved the technique works for a few different directory structures, including one like Chris's example.

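import os, sys

# Marker file that identifies the top level project directory
PROJECT_ROOT_FILE = '__mtk_project_root__.py'
searchPath = ''
cwd = os.getcwd()

# Climb one directory at a time until the marker file is found
# (draft: this never terminates if the marker file does not exist)
while not os.path.isfile(cwd + '/' + searchPath + PROJECT_ROOT_FILE):
    searchPath = searchPath + '../'

# Put the located directory at the front of the module search path
libPath = os.path.normpath(cwd + '/' + searchPath)
sys.path.insert(0, libPath)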
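The draft has two admitted gaps: it works on strings rather than directory objects, and it has no exit when the marker file is missing. One possible way to close both is sketched below using pathlib. The function names find_project_root and add_project_root_to_path are my own for illustration, not existing MTK code, and returning None when no marker is found is just one reasonable choice.

```python
import sys
from pathlib import Path

PROJECT_ROOT_FILE = "__mtk_project_root__.py"


def find_project_root(start=None, marker=PROJECT_ROOT_FILE):
    """Return the nearest directory at or above `start` that holds `marker`.

    Returns None when the filesystem root is reached without a match.
    """
    start = Path(start or Path.cwd()).resolve()
    # Check the starting directory first, then each parent in turn.
    for candidate in (start, *start.parents):
        if (candidate / marker).is_file():
            return candidate
    return None


def add_project_root_to_path(start=None):
    """Put the project root at the front of sys.path (once) and return it."""
    root = find_project_root(start)
    if root is not None and str(root) not in sys.path:
        sys.path.insert(0, str(root))
    return root
```

Because the loop walks a finite list of parents, it always terminates, and the caller can decide what to do when None comes back (warn, fall back to the notebook directory, and so on).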

So, the question becomes how to make sure this gets run every time a new notebook is started. Well, IPython has a configuration option called startup files: scripts placed in a profile's startup directory are run each time IPython (and so each notebook kernel) starts. I think we could easily insert an improved version of the code above into a startup script and be good to go. Later, once the script has found __mtk_project_root__.py, it could also execute that file to pick up project custom settings.
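As a rough sketch of what such a startup script might look like: the file below would live in a profile's startup directory (for the default profile, something like ~/.ipython/profile_default/startup/00-mtk-project-root.py, where the file name is my own invention). It assumes the kernel's working directory at startup is the notebook's directory. The function name init_mtk_project and the choice to copy only public names from the marker file are assumptions for illustration.

```python
import runpy
import sys
from pathlib import Path

MARKER = "__mtk_project_root__.py"


def init_mtk_project(start=None):
    """Find the project root, add it to sys.path, and run the marker file.

    Returns the globals defined by the marker file, or None if no marker
    file exists at or above `start`.
    """
    start = Path(start or Path.cwd()).resolve()
    for directory in (start, *start.parents):
        marker = directory / MARKER
        if marker.is_file():
            if str(directory) not in sys.path:
                sys.path.insert(0, str(directory))
            # Execute the marker file so it can carry project settings.
            return runpy.run_path(str(marker))
    return None


# IPython runs startup files in the user's namespace, so copying the marker
# file's public names here makes them visible in the notebook.
_settings = init_mtk_project()
if _settings:
    globals().update(
        {name: value for name, value in _settings.items()
         if not name.startswith("_")}
    )
```

If no marker file is found, the script does nothing, so notebooks outside any MTK project would start as usual.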

The last thing this made me think about was IPython configurations. I wonder if we should start planning on setting up and shipping an MTK configuration (for stuff like this, initializing Pint, ensuring our set of notebook extensions are loaded, etc). The idea would be when you launch the MTK GUI, it would set its IPython to use the MTK configuration so we have what we need, but don't clobber IPython for other uses. I will have to do some more research on configurations and see what all we can really do with them.