Merge pull request #2 from matt-graham/master

Some fixes and extra clarifications to environment set up instructions
Pawel Swietojanski 2015-09-28 00:16:01 +01:00
commit beb0cdd0cb


@@ -11,91 +11,112 @@
"\n", "\n",
"# Setting up the software\n", "# Setting up the software\n",
"\n", "\n",
"Within this course we are going to work with python (using some auxiliary libraries like numpy and scipy). Depending on the infrastracture and working environment (e.g. DICE), root permission may not be not availabe so the packages cannot be installed in a default locations. A convenient python configuration, which allows to install and update third party libraries easily using package manager, are so called virtual environments. Those can be also used, to work (and test) the code with different versions of software.\n", "Within this course we are going to work with python (using some auxiliary libraries like numpy and scipy). Depending on the infrastracture and working environment (e.g. DICE), root permission may not be not available so the packages cannot be installed in default locations. A convenient python configuration, which allows us to install and update third party libraries easily using package manager, are so called virtual environments. They can be also used to work (and test) the code with different versions of software.\n",
"\n", "\n",
"## Instructions for Windows\n", "## Instructions for Windows\n",
"\n", "\n",
"The fastest way to get working setup on Windows is to install Anaconda (http://www.continuum.io) package. It's a python environment with precompiled most popular scientific libraries. It also works on MacOS, but numpy is not linked without a fee to a numerical library, hence for MacOS we recommend the following procedure.\n", "The fastest way to get working setup on Windows is to install Anaconda (http://www.continuum.io) package. It's a python environment with precompiled versions of the most popular scientific python libraries. It also works on MacOS, but numpy is not linked without a fee to a numerical library, hence for MacOS we recommend the following procedure.\n",
"\n", "\n",
"## Instructions for MacOS\n", "## Instructions for MacOS\n",
"\n", "\n",
"<ul>\n", " * Install macports following instructions at https://www.macports.org/install.php\n",
"<li>Install macports following instructions at https://www.macports.org/install.php</li>\n", " * Install the relevant python packages in macports\n",
"<li>Install the relevant python packages in macports\n",
"<ul>\n",
"<li> sudo port install py27-scipy +openblas </li>\n",
"<li> sudo port install py27-ipython +notebook </li>\n",
"<li> sudo port install py27-notebook </li>\n",
"<li> sudo port install py27-matplotlib </li>\n",
"<li> sudo port select --set python python27 </li>\n",
"<li> sudo port select --set ipython2 py27-ipython </li>\n",
"<li> sudo port select --set ipython py27-ipython </li>\n",
"</ul>\n",
"</ul>\n",
"\n", "\n",
"Make sure that your $PATH has /opt/local/bin before /usr/bin so you pick up the version of python you just installed\n", " ```\n",
" sudo port install py27-scipy +openblas\n",
" sudo port install py27-ipython +notebook\n",
" sudo port install py27-notebook\n",
" sudo port install py27-matplotlib\n",
" sudo port select --set python python27\n",
" sudo port select --set ipython2 py27-ipython\n",
" sudo port select --set ipython py27-ipython\n",
" ```\n",
"\n",
"Make sure that your `$PATH` has `/opt/local/bin` before `/usr/bin` so you pick up the version of python you just installed.\n",
"\n", "\n",
"## Instructions for DICE:\n", "## Instructions for DICE:\n",
"\n", "\n",
"### Configuring virtual environment (the generic way)\n", "### Configuring virtual environment (the generic way)\n",
"\n", "\n",
"<ul>\n", " * `git clone https://github.com/pypa/virtualenv`\n",
"<li>git clone https://github.com/pypa/virtualenv</li>\n", " * Enter the cloned repository and type `virtualenv.py --python /usr/bin/python2.7 --no-site-packages --prefix=~/mlpractical`\n",
"<li>Enter the cloned repository and type $\\texttt{virtualenv.py --python /usr/bin/python2.7 --no-site-packages --prefix=~/mlpractical}$ </li>\n", " * Activate the environment by typing `source ~/mlpractical/bin/activate` (to leave the virtual environment one may type `decativate`). Environments need to be activated every time ones start the new session (unless you do this explicitly in the shell starting scripts, i.e. `~/.bashrc`).\n",
"<li>Activate the environment by typing $\\texttt{source ~/mlpractical/bin/activate}$ (to leave the virtual environment one may type $\\texttt{decativate}$). Environments need to be activated every time ones start the new session (unless you do this explicitly in the shell starting scripts, i.e. ~/.bashrc).\n",
"</ul>\n",
"\n", "\n",
"### Configuring virtual environment (more comfy DICE wrapper)\n", "### Configuring virtual environment (more comfy DICE wrapper)\n",
"\n", "\n",
"DICE comes with a handy virtual environment wrapper, called $\\texttt{mkvirtualenv}$, which allows to simplify a bit the above process, to use it:\n", "DICE comes with a handy virtual environment wrapper, called $\\texttt{mkvirtualenv}$, which allows to simplify a bit the above process, to use it:\n",
"\n", "\n",
"<ul>\n", " * `source /usr/bin/virtualenvwrapper.sh` (add this also to `~/.bashrc` script so it is available automatically every time you ssh to the grid)\n",
"<li>$\\texttt{source /usr/bin/virtualenvwrapper.sh}$ (add this also to $\\texttt{~/.bashrc}$ script so its available automatically every time you ssh to the grid)</li>\n", " * Then type `mkvirtualenv mlpractical --python /usr/bin/python2.7` (this will create an environment under ~/.virtualenvs/mlpractical)\n",
"<li>Then type $\\texttt{mkvirtualenv mlpractical --python /usr/bin/python2.7}$ (this will create an environment under ~/.virtualenvs/mlpractical)</li>\n", " * To activate the environment you can use `workon` script that comes with the wrapper. Simply type: `workon mlpractical`\n",
"<li>To activate the environment you can use $\\texttt{workon}$ script that comes with the wrapper. Simply type: $\\texttt{workon mlpractical}$</li>\n",
"</ul>\n",
"\n", "\n",
"Then, before you follow next, install/upgrade the following packages:\n", "Then, before you follow next, install/upgrade the following packages:\n",
"\n", "\n",
"pip install --upgrade pip <br/>\n", "```\n",
"pip install setuptools <br/>\n", "pip install --upgrade pip\n",
"pip install setuptools --upgrade <br/>\n", "pip install setuptools\n",
"pip install ipython <br/>\n", "pip install setuptools --upgrade\n",
"pip install ipython\n",
"pip install notebook\n", "pip install notebook\n",
"```\n",
"\n", "\n",
"### Installing numpy\n", "### Installing numpy\n",
"\n", "\n",
"Note, having virtual environment properly installed one may go and type `pip install numpy`, though this will most likely lead to the suboptimal configuration where numpy is linked to ATLAS numerical library, which on DICE is compiled in multi-threaded mode. This means whenever numpy use BLAS accelerated computations (using ATLAS), it will use <b>all</b> the available cores at the given machine. This happens because ATLAS can be compiled to either run computations in single *or* multi threaded modes. However, contrary to some other backends, the latter does not allow to use an arbitrary number of threads (specified by the user prior to computation). This is highly suboptimal, as the potential speed-up resulting from paralleism depends on many factors like the communication overhead between threads, the size of the problem, etc.. Using all cores for our exercises is not-necessary.\n", "Note, having virtual environment properly installed one may then run `pip install numpy` to use pip to install numpy, though this will most likely lead to the suboptimal configuration where numpy is linked to ATLAS numerical library, which on DICE is compiled in multi-threaded mode. This means whenever numpy use BLAS accelerated computations (using ATLAS), it will use **all** the available cores at the given machine. This happens because ATLAS can be compiled to either run computations in single *or* multi threaded modes. However, contrary to some other backends, the latter does not allow to use an arbitrary number of threads (specified by the user prior to computation). This is highly suboptimal, as the potential speed-up resulting from paralleism depends on many factors like the communication overhead between threads, the size of the problem, etc. Using all cores for our exercises is not-necessary.\n",
"\n", "\n",
"For which reason, we are going to compile our own version of BLAS package, called *OpenBlas*. It allows to specify the number of threads manually by setting an environmental variable OMP_NUM_THREADS=N, where N is a desired number of parallel threads (please use 1 by default).\n", "For which reason, we are going to compile our own version of BLAS package, called *OpenBlas*. It allows to specify the number of threads manually by setting an environmental variable OMP_NUM_THREADS=N, where N is a desired number of parallel threads (please use 1 by default). You can set an environment variable in the current shell by running\n",
"\n",
"```\n",
"export OMP_NUM_THREADS=1\n",
"```\n",
"\n",
"(note the lack of spaces around the equals sign and use of `export` to define an environment variable which will be available in sub-shells rather than just a variable local to the current shell).\n",
"\n", "\n",
"#### OpenBlas\n", "#### OpenBlas\n",
"\n", "\n",
"To install OpenBlas library type:\n", "To install OpenBlas library run:\n",
"<ul>\n", "\n",
"<li>$\\texttt{git clone git://github.com/xianyi/OpenBLAS }$</li>\n", "```\n",
"<li>$ \\texttt{cd OpenBLAS}$ </li>\n", "git clone git://github.com/xianyi/OpenBLAS\n",
"<li>$ \\texttt{make}$</li>\n", "cd OpenBLAS\n",
"<li>$ \\texttt{make PREFIX=/path/to/OpenBLAS install}$ </li>\n", "make\n",
"<li>Add $\\texttt{/path/to/OpenBLAS/lib}$ to LD_LIBRARY_PATH environmental variable (do it in ~/.bashrc by `export` LD_LIBRARY_PATH=\"\\$LD_LIBRARY_PATH:/path/to/OpenBLAS/lib\") </li>\n", "make PREFIX=/path/to/OpenBLAS/lib install\n",
"</ul>\n", "```\n",
"\n",
"Once OpenBLAS is finished compiling we need to ensure the compiled shared library files in the `lib` subdirectory are available to the shared library loader. This can be done by appending the absolute path to the `lib` subdirectory to the `LD_LIBRARY_PATH` environment variable. To ensure this changes persist we will change the bash start up file `~/.bashrc` by opening it in a text editor and adding the following line\n",
"\n",
"```\n",
"export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/OpenBLAS/lib\n",
"```\n",
"\n",
"After you have edited `.bashrc` run\n",
"\n",
"```\n",
"source ~/.bashrc\n",
"source ~/mlpractical/bin/activate\n",
"```\n",
"\n",
"to rerun the bash start up script make sure the new environment variable is available in the current shell and then reactivate the virtual environment.\n",
"\n", "\n",
"#### Numpy\n", "#### Numpy\n",
"\n", "\n",
"<code>\n", "To install `numpy` linked against the OpenBLAS libraries we just compiled, first run the following\n",
"\n",
"```\n",
"wget http://downloads.sourceforge.net/project/numpy/NumPy/1.9.2/numpy-1.9.2.zip\n", "wget http://downloads.sourceforge.net/project/numpy/NumPy/1.9.2/numpy-1.9.2.zip\n",
"unzip numpy-1.9.2.zip\n", "unzip numpy-1.9.2.zip\n",
"cd numpy-1.9.2\n", "cd numpy-1.9.2\n",
"echo \"[openblas]\" >> site.cfg\n", "echo \"[openblas]\" >> site.cfg\n",
"echo \"library_dirs = /path/to/OpenBlas/lib\" >> site.cfg\n", "echo \"library_dirs = /path/to/OpenBlas/lib\" >> site.cfg\n",
"echo \"include_dirs = /path/to/OpenBLAS/include\" >> site.cfg\n", "echo \"include_dirs = /path/to/OpenBLAS/include\" >> site.cfg\n",
"</code>\n",
"\n",
"python setup.py build --fcompiler=gnu95\n", "python setup.py build --fcompiler=gnu95\n",
"```\n",
"\n", "\n",
"Assuming the virtual environment is activated, the below command will install numpy in a desired space (~/.virtualenvs/mlpractical/...):\n", "Assuming the virtual environment is activated, the below command will install numpy in a desired space (`~/.virtualenvs/mlpractical/...`):\n",
"\n", "\n",
"```\n",
"python setup.py install\n", "python setup.py install\n",
"```\n",
"\n", "\n",
"\n", "\n",
"### Installing remaining packages and running tests\n", "### Installing remaining packages and running tests\n",
@@ -317,7 +338,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython2", "pygments_lexer": "ipython2",
"version": "2.7.9" "version": "2.7.6"
} }
}, },
"nbformat": 4, "nbformat": 4,