Chainer version 2 was released on June 1, 2017.
> #Chainer v2.0.0 has been released! Memory reduction (33% in ResNet), API clean up, and CuPy as a separate package. https://t.co/xRrmZAlJWT
> — Chainer (@ChainerOfficial) June 1, 2017
This post summarizes what you need to change in your code when upgrading to Chainer v2. The detailed changes are described in the official documentation.
Installation change
The CuPy module has become an independent package. The reason is that CuPy is a GPU version of NumPy: it can be used for many kinds of array computation, not only for Chainer.
To set up Chainer:
- If you are using only the CPU, the installation is the same as before:

```
pip install chainer
```

- If you want the benefit of the GPU, you need to set up CUDA and install CuPy separately:

```
pip install chainer
pip install cupy
```
[NOTE] Also, if you have multiple GPUs, you can install NCCL before installing chainer and cupy in order to use MultiProcessParallelUpdater.
It is important to note that NO source code change is necessary for your Chainer development: Chainer imports CuPy only when it is installed in your environment.
Global configuration is introduced
The global config chainer.global_config and the thread-local config chainer.config have been introduced to control Chainer's behavior.
The configuration includes the following flags:
chainer.config.cudnn_deterministic
chainer.config.debug
chainer.config.enable_backprop
chainer.config.keep_graph_on_report
chainer.config.train
chainer.config.type_check
chainer.config.use_cudnn
See the official documentation for details.
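To illustrate the mechanism (this is a pure-Python sketch, not Chainer's actual implementation; `ThreadLocalConfig` and `using_config` here are illustrative re-creations), a thread-local config with a context manager can be written like this:

```python
import threading
from contextlib import contextmanager

# A minimal sketch of a thread-local configuration object,
# mimicking the idea behind chainer.config / chainer.using_config.
class ThreadLocalConfig(threading.local):
    def __init__(self):
        self.train = True            # default: training mode
        self.enable_backprop = True  # default: build the backprop graph

config = ThreadLocalConfig()

@contextmanager
def using_config(name, value):
    """Temporarily override a config attribute, restoring it on exit."""
    old = getattr(config, name)
    setattr(config, name, value)
    try:
        yield
    finally:
        setattr(config, name, old)

print(config.train)            # True by default
with using_config('train', False):
    print(config.train)        # False inside the with block
print(config.train)            # restored to True afterwards
```

Because the state lives in a thread-local object and the override is scoped to a `with` block, each thread sees its own flags and the default is always restored, even if an exception is raised.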
I think the train and enable_backprop flags are the most important to remember.
chainer.config.train
Function behavior can now be controlled via the chainer.config.train flag instead of passing it as a function argument. Citing the example from the official doc:
Example
Consider the following model definition and the code to call it in test mode written for Chainer v1.
```python
# Chainer v1
import chainer.functions as F

class MyModel(chainer.Link):
    ...

    def __call__(self, x, train=True):
        return f(F.dropout(x, train=train))

m = MyModel(...)
y = m(x, train=False)
```
In Chainer v2, it should be updated into the following code:
```python
# Chainer v2
import chainer.functions as F

class MyModel(chainer.Link):
    ...

    def __call__(self, x):
        return f(F.dropout(x))

m = MyModel(...)
with chainer.using_config('train', False):
    y = m(x)
```
chainer.config.enable_backprop
The volatile flag of the Variable class, used in Chainer v1, is removed in v2. Instead, you can use the chainer.config.enable_backprop flag to turn backpropagation on and off.
To disable backprop, the utility function chainer.no_backprop_mode() is available:
```python
x = chainer.Variable(x)
with chainer.no_backprop_mode():
    y = model(x)
```
Input size of the Link can be omitted
Let me just show an example.
In Chainer v1:

```python
conv1 = L.Convolution2D(None, 16, 3)
```
In Chainer v2, it can also be written as follows (the Chainer v1 notation still works too):

```python
conv1 = L.Convolution2D(16, 3)
```
This shorthand is available for several links; see the official documentation for the full list.
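The idea behind this shorthand is that parameter allocation can be deferred until the first forward pass, when the input size becomes known. A minimal NumPy sketch of such lazy initialization (the `LazyLinear` class here is a hypothetical illustration, not Chainer's code):

```python
import numpy as np

# A minimal sketch of a link that defers weight allocation until the
# first forward pass, so the input size need not be given up front.
class LazyLinear:
    def __init__(self, out_size):
        self.out_size = out_size
        self.W = None  # allocated lazily on first call

    def __call__(self, x):
        if self.W is None:
            in_size = x.shape[1]  # infer input size from the first batch
            self.W = np.random.randn(in_size, self.out_size) * 0.01
        return x.dot(self.W)

layer = LazyLinear(16)
y = layer(np.zeros((4, 8)))  # in_size is inferred as 8 here
print(y.shape)               # (4, 16)
```

After the first call the weight shape is fixed, so later batches must have the same feature dimension, which matches the behavior you would expect from `None` as the input size in Chainer v1.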
The init_scope context manager can be used for Link and Chain initialization
- New-style parameter registration APIs are added to Link
- New-style child link registration APIs are added to Chain
When you define your own Link or Chain class, init_scope() can be used to register a Parameter or a child Link.
This style is recommended because IDEs (PyCharm etc.) can index the attributes and show type hints. The conventional (Chainer v1) initialization still works as well.
Below is an example of defining a Chain class, from the official doc.
Example
For example, the following chain initialization code
```python
# Chainer v1
class MyMLP(chainer.Chain):
    def __init__(self):
        super(MyMLP, self).__init__(
            layer1=L.Linear(None, 20),
            layer2=L.Linear(None, 30),
        )
    ...
```
is recommended to be updated as follows.
```python
# Chainer v2
class MyMLP(chainer.Chain):
    def __init__(self):
        super(MyMLP, self).__init__()
        with self.init_scope():
            self.layer1 = L.Linear(20)
            self.layer2 = L.Linear(30)
```
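To see how a scope like this can work, here is a rough pure-Python sketch of the mechanism: attribute assignments inside the scope are intercepted and recorded in a registry. The `MiniChain` class is illustrative only; Chainer's actual Link/Chain implementation differs.

```python
from contextlib import contextmanager

# A rough sketch of the init_scope mechanism: attributes assigned inside
# the scope are recorded as children via __setattr__ interception.
class MiniChain:
    def __init__(self):
        object.__setattr__(self, '_children', {})
        object.__setattr__(self, '_within_init_scope', False)

    @contextmanager
    def init_scope(self):
        object.__setattr__(self, '_within_init_scope', True)
        try:
            yield
        finally:
            object.__setattr__(self, '_within_init_scope', False)

    def __setattr__(self, name, value):
        if self._within_init_scope:
            self._children[name] = value  # register as a child link
        object.__setattr__(self, name, value)

c = MiniChain()
with c.init_scope():
    c.l1 = 'Linear(20)'   # registered as a child
c.note = 'not registered'  # plain attribute, outside the scope
print(sorted(c._children))  # ['l1']
```

This explains why ordinary attribute assignment outside the scope does not register a child: only assignments made while the scope flag is active are recorded.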
Function spec change (GRU, LSTM etc)
This change mainly affects those working in the NLP (Natural Language Processing) field.
The behavior of the GRU and LSTM functions has changed.
Optimizer spec change
Some deprecated optimizer methods, e.g. zero_grads(), have been removed.
Internal Change for better performance
This does not require any change in your code, but it is good to know:
Memory efficiency enhancement
When creating a computational graph for backpropagation, a Function no longer keeps the Variable's array data itself but only a reference to it.
Speed enhancement
Lazy type checking has been introduced to speed up type checks.
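To illustrate the general idea of a "lazy" check (a sketch of the concept only, not Chainer's implementation; `expect` and `check_linear_input` are made-up names), the expensive part of a type check, such as formatting a detailed error message, can be deferred until a check actually fails:

```python
# Sketch of lazy type checking: the costly error-message construction is
# wrapped in a lambda and evaluated only when the condition is violated.
def expect(condition, lazy_message):
    if not condition:
        raise TypeError(lazy_message())  # message built only on failure

def check_linear_input(x_shape, in_size):
    expect(len(x_shape) == 2,
           lambda: 'expected 2-d input, got shape %r' % (x_shape,))
    expect(x_shape[1] == in_size,
           lambda: 'expected in_size %d, got %d' % (in_size, x_shape[1]))

check_linear_input((4, 8), 8)       # passes silently, no string is built
try:
    check_linear_input((4, 8), 16)
except TypeError as e:
    print(e)                        # expected in_size 16, got 8
```

On the common success path nothing but the cheap comparisons runs, which is where the speedup comes from.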
Summary
- The CuPy module becomes an independent package: it needs to be installed separately when using the GPU.
- The global config chainer.config is introduced. Function/Chain call behavior is switched by the chainer.config.train flag:

```python
with chainer.using_config('train', False):
    ...
```

- The volatile flag of Variable is removed; use the chainer.config.enable_backprop flag instead:

```python
with chainer.no_backprop_mode():
    ...
```

- Your custom Link and Chain classes can be initialized with a with self.init_scope(): statement:

```python
with self.init_scope():
    self.l1 = L.Linear(100)
    ...
```
I noticed that the with statement is used a lot in Chainer v2 code.