I’ve been using scikit-learn over the past few weeks on a project. While developing and analyzing the data, I just needed to get work done without the hassle of a complex installation, and the Ubuntu image on EC2 provided just that. Now that the project is ready to be deployed, I need to install scikit-learn on the default Amazon Linux AMI. As I learned, installing scikit-learn is not trivial. It only has two dependencies, but those dependencies have dependencies of their own, and you have to sift through the documentation of at least five packages to truly understand what needs to be installed and in what order. So I decided to brush up on my writing skills, dust off the old blog, and pen a simple guide that I can reference later. I’ll explain what the scikit-learn dependencies are and how to install them on the Amazon Linux AMI, specifically image ami-1624987f.
requirements
First, we need Python version 2.6 or greater installed. The two main requirements are NumPy and SciPy, each of which has its own dependencies, listed below.
- NumPy
  - C compiler (gcc)
  - Fortran compiler (gfortran)
  - Python header files (2.4.x - 3.2.x)
  - BLAS or LAPACK (strongly recommended)
- SciPy
  - NumPy
  - Complete LAPACK library
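
Before installing anything, it can help to see which of these pieces are already on the machine. The following is only a rough sketch, not part of the original setup: the function name `check_prerequisites` is made up for illustration, and it assumes a Python 3 interpreter (it relies on `shutil.which` and `importlib.util.find_spec`, which are newer than the 2.6 minimum mentioned above). It does not try to detect BLAS or LAPACK, since those are system libraries that are not easy to probe from Python.

```python
import importlib.util
import os
import shutil
import sys
import sysconfig


def check_prerequisites():
    """Return a dict mapping each prerequisite name to True (found) or False.

    Hypothetical helper for illustration only. BLAS/LAPACK are not checked.
    """
    # Directory where this interpreter expects its C header files to live.
    include_dir = sysconfig.get_paths()["include"]
    return {
        "python >= 2.6": sys.version_info >= (2, 6),
        # Compilers are looked up on the PATH, the same way a shell would.
        "gcc": shutil.which("gcc") is not None,
        "gfortran": shutil.which("gfortran") is not None,
        # Python.h is only present when the Python development headers are installed.
        "Python.h": os.path.isfile(os.path.join(include_dir, "Python.h")),
        # find_spec returns None when a top-level package is not installed.
        "numpy": importlib.util.find_spec("numpy") is not None,
        "scipy": importlib.util.find_spec("scipy") is not None,
    }


if __name__ == "__main__":
    for name, found in check_prerequisites().items():
        print("%-15s %s" % (name, "ok" if found else "MISSING"))
```

Run as a script, it prints one line per prerequisite, so anything marked MISSING is what still needs to be installed.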
