.. _dev_start_guide:
=====================
Developer Start Guide
=====================
Resources
=========
See :ref:`theano_community` for a list of Theano resources. The
following groups/mailing-lists are especially useful to Theano
contributors: `theano-dev`_, `theano-buildbot`_, and `theano-github`_.
.. _theano-dev: https://groups.google.com/group/theano-dev
.. _theano-github: https://groups.google.com/group/theano-github
.. _theano-buildbot: https://groups.google.com/group/theano-buildbot
To get up to speed, you'll need to
- Learn some non-basic Python to understand what's going on in some of the
trickier files (like tensor.py).
- Go through the `NumPy documentation`_.
- Learn to write reStructuredText_ for epydoc_ and Sphinx_.
- Learn about how unittest_ and nose_ work
.. _Sphinx: http://sphinx.pocoo.org/
.. _reStructuredText: http://docutils.sourceforge.net/rst.html
.. _epydoc: http://epydoc.sourceforge.net/
.. _NumPy documentation: http://docs.scipy.org/numpy/
.. _unittest: http://docs.python.org/library/unittest.html
.. _nose: http://somethingaboutorange.com/mrl/projects/nose/
Installation and configuration
==============================
To obtain developer access: register with `GitHub
`_ and create a fork of `Theano
`_.
This will create your own Theano project on GitHub, referred later
as "YourProfile/Theano", or "origin", from which you will be able to
contribute to the original Theano/Theano, also called "central".
Create a local copy
-------------------
Clone your fork locally with
.. code-block:: bash
git clone git@github.com:YOUR_GITHUB_LOGIN/Theano.git
For this URL to work, you must set your public ssh keys inside your
`github account setting `_.
From your local repository, your own fork on GitHub will be called "origin".
Then, add a reference to the original ("central") Theano repository with
.. code-block:: bash
git remote add central git://github.com/Theano/Theano.git
You can choose another name than "central" to reference Theano/Theano
(for instance, NumPy uses "upstream"), but this documentation will stick
to "central."
You can then test your installation of Theano by following the steps of
:ref:`testing_installation`.
Using your local copy
---------------------
To update your library to the latest revision, you should have a local branch
that tracks central/master. You can add one (named "trunk" here) with:
.. code-block:: bash
git fetch central
git branch trunk central/master
Once you have such a branch, in order to update it, do:
.. code-block:: bash
git checkout trunk
git pull
Keep in mind that this branch should be "read-only": if you want to
patch Theano, you should work in another branch, like described in the
:ref:`dev_workflow` section below.
Configure Git
-------------
On your local machine, you need to configure git with basic informations:
.. code-block:: bash
git config --global user.email you@yourdomain.example.com
git config --global user.name "Your Name Comes Here"
You can also instruct git to use color in diff. For this, you need to
add those lines in the file ~/.gitconfig
.. code-block:: cfg
[color]
branch = auto
diff = auto
interactive = auto
status = auto
.. _dev_workflow:
Development Workflow
====================
Start a new local branch
------------------------
When working on a new feature in your own fork, start from an up-to-date copy
of the `master` branch (the principal one) of the central repository
(Theano/Theano on GitHub):
.. code-block:: bash
git fetch central
git checkout -b my_shiny_feature central/master
.. note::
This last line is a shortcut for:
.. code-block:: bash
git branch my_shiny_feature central/master
git checkout my_shiny_feature
Submit your changes to the central repository
---------------------------------------------
Once your code is ready for others to review, you need to push your
branch to your github fork first:
.. code-block:: bash
git push -u origin my_shiny_feature
Then, go to your fork's github page on the github website, select your
feature branch and hit the "Pull Request" button in the top right
corner. This will signal the maintainers that you wish to submit your
changes for inclusion in central/master.
If you don't get any feedback, bug us on the theano-dev mailing list.
Address reviewer comments
-------------------------
Your pull request will be reviewed by members of the core development
team. If your branch is not directly accepted, the reviewers will use
GitHub's system to add "notes", either general (on the entire commit),
or "line notes", relative to a particular line of code.
In order to have the pull request accepted, you may have to answer
the reviewer's questions, you can do that on GitHub.
You may also have to edit your code to address their concerns. Some
of the usual requests include fixing typos in comments, adding or
correcting comments, adding unit tests in the test suite. In order to
do that, you should continue your edits in the same branch you used (in
this example, "my_shiny_feature"). For instance, if you changed your
working branch, you should first:
.. code-block:: bash
git checkout my_shiny_feature
Then, edit your code, and test it appropriately (see
:ref:`quality_contributions` below), and push it again to your GitHub
fork, like the first time (except the ``-u`` option is only needed the
first time):
.. code-block:: bash
git push origin my_shiny_feature
The pull request to the central repository will then be automatically
updated by GitHub. However, the reviewers will not be automatically
notified of your revision, so it is advised to reply to the comments on
GitHub, to let them know that you have submitted a fix.
.. _quality_contributions:
Tips for Quality Contributions
==============================
Coding Style Auto Check
-----------------------
In Theano, we use the same coding style as the `Pylearn
`_
project, except that we don't use the numpy docstring standard.
The principal thing to know is that we follow the
`PEP 8 `_ coding style.
We use git hooks provided in the project `pygithooks
`_ to validate that commits
respect pep8. This happens when each user commits, not when we
push/merge to the Theano repository. Github doesn't allow us to have
code executed when we push to the repository. So we ask all
contributors to use those hooks.
For historic reason, we currently don't have all files respecting pep8.
We decided to fix everything incrementally. So not all files respect it
now. So we strongly suggest that you use the "increment" pygithooks
config option to have a good workflow. See the pygithooks main page
for how to set it up for Theano and how to enable this option.
Setting up your Editor for PEP8
-------------------------------
Here are instructions for :ref:`Vim ` and :ref:`Emacs
`. If you have similar instructions for other text editors
or IDE, please let us know and we will update this documentation.
.. _vim_pep8:
Vim
~~~
Detection of warnings and errors is done by the `pep8`_ script
(or `flake8`_, that also checks for other things, like syntax
errors). Syntax highlighting and general integration into Vim is done by
the `Syntastic`_ plugin for Vim.
To install flake8, simply run::
pip install flake8
You can use ``easy_install`` instead of ``pip``, and ``pep8`` instead of
``flake8`` if you prefer. The important thing is that the ``flake8`` or
``pep8`` executable ends up in your ``$PATH``.
To install Syntastic, according to its documentation, the easiest way is
to install `pathogen.vim`_ first.
Here's a relevant extract of pathogen.vim's installation instructions:
Install to ``~/.vim/autoload/pathogen.vim``. Or copy and paste::
mkdir -p ~/.vim/autoload ~/.vim/bundle; \
curl -so ~/.vim/autoload/pathogen.vim \
https://raw.github.com/tpope/vim-pathogen/HEAD/autoload/pathogen.vim
If you don't have ``curl``, use ``wget -O`` instead.
By the way, if you're using Windows, change all occurrences of ``~/.vim``
to ``~\vimfiles``.
Add this to your vimrc::
call pathogen#infect()
Now any plugins you wish to install can be extracted to a subdirectory
under ``~/.vim/bundle``, and they will be added to the ``'runtimepath'``.
Now, we can install Syntastic. From the installation instructions:
.. code-block:: bash
cd ~/.vim/bundle
git clone https://github.com/scrooloose/syntastic.git
Then reload vim, run ``:Helptags``, and check out ``:help syntastic.txt``.
From now on, when you save into a Python file, a syntax check will be
run, and results will be displayed using Vim's `quickfix`_ mechanism
(more precisely, a location-list). A few useful commands are:
* Open the list of errors: ``:lopen``, that can be abbreviated in ``:lop``
(denoted ``:lop[en]``).
* Close that list: ``:lcl[ose]``.
* Next error: ``:lne[xt]``.
* Previous error: ``:lp[revious]``.
Once you fix errors, messages and highlighting will still appear in the
fixed file until you save it again.
We can also configure the ``~/.vimrc`` to make it easier to work with Syntastic.
For instance, to add a summary in the status bar, you can add::
set statusline+=%{SyntasticStatuslineFlag()}
To bind F2 and F3 to navigate to previous and next error, you can add::
map :lprevious
map :lnext
You can prefix those by ``autocmd FileType python`` if you want these
bindings to work only on Python files.
.. _pep8: http://pypi.python.org/pypi/pep8
.. _flake8: http://pypi.python.org/pypi/flake8
.. _Syntastic: https://github.com/scrooloose/syntastic/
.. _pathogen.vim: https://github.com/tpope/vim-pathogen
.. _quickfix: http://vimdoc.sourceforge.net/htmldoc/quickfix.html#quickfix
.. _emacs_pep8:
Emacs
~~~~~
There is an **excellent** system to configure emacs for Python:
`emacs-for-python
`_. It gathers many
emacs config into one, and modifies them to behave together nicely. You
can use it to check for pep8 compliance and for Python syntax errors.
To install it on Linux, you can do like this:
.. code-block:: bash
cd
git clone https://github.com/gabrielelanaro/emacs-for-python.git ~/.emacs.d/emacs-for-python
Then in your ``~/.emacs`` file, add this:
.. code-block:: common-lisp
;; Mandatory
(load-file "~/.emacs.d/emacs-for-python/epy-init.el")
(add-to-list 'load-path "~/.emacs.d/emacs-for-python/") ;; tell where to load the various files
;; Each of them enables different parts of the system.
;; Only the first two are needed for pep8, syntax check.
(require 'epy-setup) ;; It will setup other loads, it is required!
(require 'epy-python) ;; If you want the python facilities [optional]
(require 'epy-completion) ;; If you want the autocompletion settings [optional]
(require 'epy-editing) ;; For configurations related to editing [optional]
;; [newer version of emacs-for-python]
(require 'epy-nose) ;; For shortcut to call nosetests [optional]
;; Define f10 to previous error
;; Define f11 to next error
(require 'epy-bindings) ;; For my suggested keybindings [optional]
;; Some shortcut that do not collide with gnome-terminal,
;; otherwise, "epy-bindings" define f10 and f11 for them.
(global-set-key [f2] 'flymake-goto-prev-error)
(global-set-key [f3] 'flymake-goto-next-error)
;; Next two lines are the checks to do. You can add more if you wish.
(epy-setup-checker "pyflakes %f") ;; For python syntax check
(epy-setup-checker "pep8 -r %f") ;; For pep8 check
.. note::
The script highlights problematic lines. This can make part of the
line not readable depending on the background. To replace the line
highlight by an underline, add this to your emacs configuration
file:
;; Make lines readable when there is an warning [optional]
(custom-set-faces
'(flymake-errline ((((class color)) (:underline "red"))))
'(flymake-warnline ((((class color)) (:underline "yellow")))))
Unit tests
----------
When you submit a pull request, your changes will automatically be
tested via Travis-CI. This will post the results of the tests with a
little icon next to your commit. A yellow circle means the tests are
running. A red X means the tests failed and a green circle means the
tests passed.
Just because the tests run automatically does not mean you shouldn't
run them yourself to make sure everything is all right. You can run
only the portion you are modifying to go faster and have travis to
make sure there are no global impacts.
Also, if you are changing GPU code, travis doesn't test that, because
there are no GPUs on the test nodes.
To run the test suite with the default options, you can follow the
instructions of :ref:`testing_installation`.
Each night we execute all the unit tests automatically, with several
sets of options. The result is sent by email to the `theano-buildbot`_
mailing list.
For more detail, see :ref:`metadocumentation_nightly_build`.
To run all the tests with the same configuration as the buildbot, run
this script:
.. code-block:: bash
theano/misc/do_nightly_build
This script accepts arguments that it forwards to nosetests. You can
run only some tests or enable pdb by giving the equivalent nosetests
parameters.
More Advanced Git Usage
=======================
You can find information and tips in the `numpy development
`_
page. Here are a few.
Cleaning up branches
--------------------
When your pull request has been merged, you can delete the branch from
your GitHub fork's list of branches. This is useful to avoid having too
many branches staying there. Deleting this remote branch is achieved
with:
.. code-block:: bash
git push origin :my_shiny_feature
This lines pushes to the "origin" repository (your fork of Theano on
GitHub), into the branch "my_shiny_feature", an empty content (that's
why there is nothing before the colon), effectively removing it.
The branch will still be present in your local clone of the repository.
If you want to delete it from there, too, you can run:
.. code-block:: bash
git branch -d my_shiny_feature
Amending a submitted pull request
---------------------------------
If you want to fix a commit already submitted within a pull request
(e.g. to fix a small typo), before the pull request is accepted, you can
do it like this to keep history clean:
.. code-block:: bash
git checkout my_shiny_feature
git commit --amend
git push origin my_shiny_feature:my_shiny_feature
Do not abuse that command, and please use it only when there are only
small issues to be taken care of. Otherwise, it becomes difficult to
match the comments made by reviewers with the new modifications.
In the general case, you should stick with the approach described above.
Cleaning up history
-------------------
Sometimes you may have commits in your feature branch that
are not needed in the final pull request. There is a `page
`_ that talks about
this. In summary:
* Commits to the trunk should be a lot cleaner than commits to your
feature branch; not just for ease of reviewing but also
because intermediate commits can break blame (the bisecting tool).
* `git merge --squash` will put all of the commits from your feature branch into one commit.
* There are other tools that are useful if your branch is too big for one squash.
Add another distant repository
------------------------------
To collaborate with another user on some feature he is developing, and
that is not ready for inclusion in central, the easiest way is to use a
branch of their Theano fork (usually on GitHub).
Just like we added Theano/Theano as a remote repository, named
"central", you can add (on your local machine) a reference to their fork
as a new remote repository. REPO_NAME is the name you choose to name
this fork, and GIT_REPO_PATH is the URL of the fork in question.
.. code-block:: bash
git remote add REPO_NAME GIT_REPO_PATH
Then, you can create a new local branch (LOCAL_BRANCH_NAME) based on
a specific branch (REMOTE_BRANCH_NAME) from the remote repository
(REPO_NAME):
.. code-block:: bash
git checkout -b LOCAL_BRANCH_NAME REPO_NAME/REMOTE_BRANCH_NAME
Other tools that can help you
=============================
* `cProfile `_: time profiler that work at function level.
* `Yep `_: A module for profiling compiled extensions.
* `autopep8 `_: A tool that automatically formats Python code to conform to the PEP 8 style guide.
* `line_profiler `_: Line-by-line profiler.
* `memory_profiler `_: memory profiler
* `runsnake `_: Gui for cProfile(time profiler) and Meliae(memory profiler)
* `Guppy `_: Supports object and heap memory sizing, profiling and debugging.
* `hub `_: A tool that adds github commands to the git command line.
* `git pull-requests `_: Another tool for git/github command line.