Chainer version 2 was released on June 1, 2017.
#Chainer v2.0.0 has been released! Memory reduction (33% in ResNet), API clean up, and CuPy as a separate package. https://t.co/xRrmZAlJWT
— Chainer (@ChainerOfficial) June 1, 2017
This post is a summary of what you need to change in your code for Chainer development. Details of the changes are described in the official document.
- 1 Installation change
- 2 Global configuration is introduced
- 3 Input size of the Link can be omitted
- 4 init_scope closure can be used for Link, Chain initialization
- 5 Function spec change (GRU, LSTM etc)
- 6 Optimizer spec change
- 7 Internal Change for better performance
- 8 Summary
Installation change
The CuPy module has become an independent package. The reason is that CuPy is a GPU version of NumPy: it can be used for many kinds of numerical computation and is not specific to Chainer.
To setup Chainer,
- If you are using only the CPU, the following is enough, as before:

pip install chainer

- If you want the benefit of a GPU, you need to set up CUDA and install CuPy separately:

pip install chainer
pip install cupy
[NOTE] Also, if you have multiple GPUs, you can install NCCL before installing chainer and cupy to make use of them.
It is important to note that NO source code change is necessary for your Chainer development: Chainer imports CuPy only when it is installed in your environment.
Global configuration is introduced
chainer.global_config and the thread-local config chainer.config are introduced to control Chainer's behavior. The config includes several flags; see the official document for details. The train and enable_backprop flags are the important ones to remember.

Function behavior can be controlled through the chainer.config.train flag instead of passing it as a function argument. I will just cite the official doc for the example:
Consider the following model definition and the code to call it in test mode written for Chainer v1.
```python
# Chainer v1
import chainer.functions as F

class MyModel(chainer.Link):
    ...
    def __call__(self, x, train=True):
        return f(F.dropout(x, train=train))

m = MyModel(...)
y = m(x, train=False)
```
In Chainer v2, it should be updated into the following code:
```python
# Chainer v2
import chainer.functions as F

class MyModel(chainer.Link):
    ...
    def __call__(self, x):
        return f(F.dropout(x))

m = MyModel(...)
with chainer.using_config('train', False):
    y = m(x)
```
The volatile flag of the Variable class, used in Chainer v1, is removed in v2. Instead you can use the chainer.config.enable_backprop flag to turn backpropagation on and off. To disable backprop, there is the utility function chainer.no_backprop_mode():
```python
x = chainer.Variable(x)
with chainer.no_backprop_mode():
    y = model(x)
```
Input size of the Link can be omitted
Let me just show an example. In Chainer v1:

conv1 = L.Convolution2D(None, 16, 3)

In Chainer v2 it can also be written as follows (the Chainer v1 notation is still possible):

conv1 = L.Convolution2D(16, 3)
This is available with links that take an input size argument, such as L.Linear and L.Convolution2D.
init_scope closure can be used for Link, Chain initialization
- New-style parameter registration APIs are added to Link
- New-style child link registration APIs are added to Chain
When you define your own Link or Chain, init_scope() can be used to register parameters and child links. This writing style is recommended because IDEs (PyCharm etc.) can index the attributes as local variables and show type hints. You can still use the conventional (Chainer v1) initialization as well.
Below is an example of defining a Chain class, from the official doc.
For example, the following chain initialization code
```python
# Chainer v1
class MyMLP(chainer.Chain):
    def __init__(self):
        super(MyMLP, self).__init__(
            layer1=L.Linear(None, 20),
            layer2=L.Linear(None, 30),
        )
    ...
```
is recommended to be updated as follows.
```python
# Chainer v2
class MyMLP(chainer.Chain):
    def __init__(self):
        super(MyMLP, self).__init__()
        with self.init_scope():
            self.layer1 = L.Linear(20)
            self.layer2 = L.Linear(30)
```
Function spec change (GRU, LSTM etc)
This change affects those who are working in the NLP (natural language processing) field. The behavior of the GRU and LSTM links has changed: for example, L.GRU, which was stateless in v1, is stateful in v2 (the v1 behavior is now provided by L.StatelessGRU). See the upgrade guide for the full list.
Optimizer spec change
Some deprecated optimizer methods, e.g. zero_grads(), are removed; use cleargrads() instead.
Internal Change for better performance
This does not require any change in your development code, but it is good to know:

Memory efficiency enhancement: when creating a computational graph for backpropagation, a Function no longer keeps the Variable's array data itself but only a reference to it.

Lazy type checking is introduced to speed up type checks.
Summary

- The CuPy module becomes an independent package: it needs to be installed separately if you use a GPU.
- The global config chainer.config is introduced; Chain call behavior is switched by

with chainer.using_config('train', False):
    ...

- The volatile flag of Variable is removed; use

with chainer.no_backprop_mode():
    ...

- Your custom Chain class can be initialized with a with self.init_scope(): statement:

with self.init_scope():
    self.l1 = L.Linear(100)
    ...

I noticed that many with statements are used in Chainer v2 code.