Using Theano on Mavericks with a Virtualenv

lmjohns3, 5 November 2014

I updated my laptop to Mavericks some time ago, and decided to switch to Python 3 at the same time. I’d learned that all of my trusted numeric libraries already worked with Python 3, and several of its language features (yield from, true division by default, etc.) were appealing. So almost immediately after installing Mavericks, I set up a Python 3 virtualenv:

brew install virtualenv
virtualenv --python python3 ~/.py/py3

and installed all the usual suspects:

pip install numpy scipy matplotlib sympy ipython ...

This worked beautifully. It turns out that every package that I use for my everyday work runs just fine in Python 3.

But when the time came to install Theano, things were a little less rosy. After installing CUDA and making sure the paths were correct, I was a bit disappointed, but not totally surprised, to find that Theano didn’t seem to work in this environment.

Any time I started up a Python program that used Theano, I would get an enormous error message in my terminal window: a dump of a roughly 4,000-line C program (apparently the source for Theano’s cuda_ndarray module, judging by the output below), followed by a number of compiler errors and warnings.

The messages closest to the bottom of the output were:

clang: warning: -framework CoreFoundation: 'linker' input unused
ld: warning: -pie being ignored. It is only used when linking a main executable
Undefined symbols for architecture x86_64:
  "_PyModule_Create2", referenced from:
      _PyInit_cuda_ndarray in tmpxft_00009a16_00000000-16_mod.o
  "_PyUnicode_AsUTF8", referenced from:
      CudaNdarray_CreateArrayObj(CudaNdarray*, _object*) in tmpxft_00009a16_00000000-16_mod.o
      CudaNdarray_TakeFrom(CudaNdarray*, _object*) in tmpxft_00009a16_00000000-16_mod.o
  "_PyUnicode_FromString", referenced from:
      put_in_dict(_object*, char const*, int) in tmpxft_00009a16_00000000-16_mod.o
      GetDeviceProperties(_object*, _object*) in tmpxft_00009a16_00000000-16_mod.o
      CudaNdarray_active_device_name(_object*, _object*) in tmpxft_00009a16_00000000-16_mod.o
      CudaNdarray_get_dtype(CudaNdarray*, void*) in tmpxft_00009a16_00000000-16_mod.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)

['/usr/local/cuda/bin/nvcc', '-shared', '-g', '-O3', '-m64', '-Xcompiler',
'-DCUDA_NDARRAY_CUH=md67f7c8a21306c67152a70a88a837011,-fPIC', '-Xlinker',
'-rpath,/Users/leif/.theano/compiledir_Darwin-13.4.0-x86_64-i386-64bit-i386-3.4.2-64/cuda_ndarray',
'-I/Users/leif/.py/py3/lib/python3.4/site-packages/theano/sandbox/cuda',
'-I/Users/leif/.py/py3/lib/python3.4/site-packages/numpy/core/include',
'-I/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/include/python3.4m',
'-o', '/Users/leif/.theano/compiledir_Darwin-13.4.0-x86_64-i386-64bit-i386-3.4.2-64/cuda_ndarray/cuda_ndarray.so',
'mod.cu', '-L/usr/local/cuda/lib', '-lcublas', '-lcudart',
'-L/usr/local/Cellar/python/2.7.8_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/config',
'-ldl', '-lpython2.7', '-Xcompiler', '-framework,CoreFoundation', '-Xlinker', '-pie']
ERROR (theano.sandbox.cuda): Failed to compile cuda_ndarray.cu:
('nvcc return status', 1, 'for cmd', '/usr/local/cuda/bin/nvcc -shared -g -O3 -m64
-Xcompiler -DCUDA_NDARRAY_CUH=md67f7c8a21306c67152a70a88a837011,-fPIC -Xlinker
-rpath,/Users/leif/.theano/compiledir_Darwin-13.4.0-x86_64-i386-64bit-i386-3.4.2-64/cuda_ndarray
-I/Users/leif/.py/py3/lib/python3.4/site-packages/theano/sandbox/cuda
-I/Users/leif/.py/py3/lib/python3.4/site-packages/numpy/core/include
-I/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/include/python3.4m
-o /Users/leif/.theano/compiledir_Darwin-13.4.0-x86_64-i386-64bit-i386-3.4.2-64/cuda_ndarray/cuda_ndarray.so
mod.cu -L/usr/local/cuda/lib -lcublas -lcudart
-L/usr/local/Cellar/python/2.7.8_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/config
-ldl -lpython2.7 -Xcompiler -framework,CoreFoundation -Xlinker -pie')
WARNING (theano.sandbox.cuda): CUDA is installed, but device gpu0 is not available

For a while I thought this was some awful error having to do with linker flags and the nvcc compiler. What’s worse, I couldn’t find anyone online with the same error message, only a few reports of similar errors when trying to use, e.g., the boost libraries, all of which seemed to be fixed by correcting library paths on the system in question.

I finally ran across this Stack Overflow question, however, and suddenly I noticed that I’d missed the real culprit in the compiler output. I was actually having a linker path problem, as indicated by the following flag in the nvcc command:

-L/usr/local/Cellar/python/2.7.8_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/config

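The linker was trying to pull in the Python 2 libraries from outside my virtualenv! The module had been compiled against the Python 3.4 headers (the -I flags), but the link step asked for -lpython2.7. Symbols such as PyModule_Create2 and PyUnicode_AsUTF8 exist only in the Python 3 C API, so a Python 2.7 library could never resolve them.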
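In hindsight, a mismatch like this can be spotted mechanically rather than by squinting at a wall of flags. Here is a minimal sketch of that idea. It is not part of Theano or nvcc: mismatched_python_flags is a hypothetical helper of my own, and it assumes that include and library paths name the interpreter version as "pythonX.Y", as they do in the command above.

```python
import re
import sys

# Matches "python3.4", "python2.7", ... and captures the "X.Y" part.
_PY_VERSION = re.compile(r"python(\d+\.\d+)")


def mismatched_python_flags(args, running=None):
    """Return the -I/-L/-l flags that name a Python version other than `running`.

    `args` is a list of compiler arguments, like the one Theano prints when
    nvcc fails. `running` defaults to the current interpreter's "X.Y" version.
    """
    if running is None:
        running = "{}.{}".format(*sys.version_info[:2])
    bad = []
    for arg in args:
        if not arg.startswith(("-I", "-L", "-l")):
            continue  # only include paths, library paths and libraries matter
        versions = set(_PY_VERSION.findall(arg))
        if versions and versions != {running}:
            bad.append(arg)
    return bad


if __name__ == "__main__":
    # A shortened copy of the argument list from the error message above.
    nvcc_args = [
        "-I/Users/leif/.py/py3/lib/python3.4/site-packages/numpy/core/include",
        "-L/usr/local/cuda/lib", "-lcublas", "-lcudart",
        "-L/usr/local/Cellar/python/2.7.8_2/Frameworks/Python.framework"
        "/Versions/2.7/lib/python2.7/config",
        "-ldl", "-lpython2.7",
    ]
    for flag in mismatched_python_flags(nvcc_args, running="3.4"):
        print("suspicious flag:", flag)
```

Run against the arguments from my error message, this flags exactly the two Python 2.7 entries: the -L path into the Homebrew Python 2.7 install and -lpython2.7.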

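A little googling revealed this virtualenv bug and provided the fix: virtualenv had not installed a python-config script, so the python-config found on my PATH (which is presumably where those linker flags came from) belonged to the Python 2 install outside the virtualenv. You need to symlink or copy python-config into your virtualenv’s bin directory! In my case, I created three symlinks, to parallel the existing setup for the python3.4 binary: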

ln -s /usr/local/bin/python3.4-config ~/.py/py3/bin/
ln -s python3.4-config ~/.py/py3/bin/python3-config
ln -s python3.4-config ~/.py/py3/bin/python-config
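To confirm that the symlinks took effect, it helps to check which python-config the shell would actually pick up. The following is a rough sketch under a few assumptions: is_inside and check_python_config are hypothetical helpers of my own, the script is run with the virtualenv's interpreter so that sys.prefix points at the virtualenv, and symlinks are deliberately not resolved, because the links above point back to /usr/local/bin.

```python
import os
import shutil
import sys


def is_inside(path, prefix):
    """True if `path` lives under directory `prefix` (symlinks not resolved)."""
    path = os.path.abspath(path)
    prefix = os.path.abspath(prefix)
    return path.startswith(prefix.rstrip(os.sep) + os.sep)


def check_python_config(name="python-config", prefix=sys.prefix, search_path=None):
    """Locate `name` on the search path and report whether it is under `prefix`.

    Returns (found_path, inside_prefix); found_path is None if nothing is found.
    `search_path` defaults to the PATH environment variable.
    """
    found = shutil.which(name, path=search_path)
    if found is None:
        return None, False
    return found, is_inside(found, prefix)


if __name__ == "__main__":
    found, ok = check_python_config()
    if found is None:
        print("no python-config on PATH")
    elif ok:
        print("python-config comes from this environment:", found)
    else:
        print("python-config comes from outside this environment:", found)
```

If the last message appears, the linker flags are likely still coming from a different Python than the one running the script.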

Et voilà, that did it!