Chainer v2 released: differences from v1

Chainer version 2 was released on June 1, 2017:

#Chainer v2.0.0 has been released! Memory reduction (33% in ResNet), API clean up, and CuPy as a separate package. https://t.co/xRrmZAlJWT

— Chainer (@ChainerOfficial) June 1, 2017

This post is a summary of what you need to change in your code for Chainer v2 development. The detailed changes are described in the official document.

Installation change

The CuPy module has become an independent package. The reason is that CuPy is a GPU version of NumPy: it can be used for many kinds of array computation, and is not specific to Chainer.

To set up Chainer:

  1. If you are using only the CPU, this is enough, as before:
pip install chainer
  2. If you want to get the benefit of a GPU, you need to set up CUDA and install CuPy separately:
pip install chainer
pip install cupy

[NOTE] Also, if you have multiple GPUs, you can install NCCL before installing chainer and cupy to use MultiProcessParallelUpdater.

It is important to note that NO source code change is necessary for your Chainer development. Chainer imports CuPy only when it is installed in your environment.
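
As a minimal sketch of what this means in practice, code written against chainer.cuda.get_array_module (which returns cupy for GPU arrays when CuPy is installed, and numpy otherwise) keeps a single code path for both backends:

import numpy as np
import chainer

def scaled_tanh(x):
    # get_array_module returns cupy when x is a GPU array and
    # CuPy is installed; otherwise it returns numpy
    xp = chainer.cuda.get_array_module(x)
    return 0.5 * xp.tanh(x)

# Runs on the CPU as-is; the same function works on cupy arrays too
y = scaled_tanh(np.linspace(-1.0, 1.0, 5, dtype=np.float32))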

Global configuration is introduced

A global config chainer.global_config and a thread-local config chainer.config are introduced to control Chainer's behavior.

The config includes these flags:

  • chainer.config.cudnn_deterministic
  • chainer.config.debug
  • chainer.config.enable_backprop
  • chainer.config.keep_graph_on_report
  • chainer.config.train
  • chainer.config.type_check
  • chainer.config.use_cudnn

See the official document for details.

I think the train flag and the enable_backprop flag are the most important to remember.
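
As a quick sketch of how the two configs interact (behavior as described in the official document):

import chainer

# chainer.config is thread-local and falls back to
# chainer.global_config for keys that are not set locally
print(chainer.config.train)          # True by default

chainer.global_config.train = False  # change the process-wide default
print(chainer.config.train)          # now False

# using_config overrides a flag only inside the with block
with chainer.using_config('train', True):
    print(chainer.config.train)      # True
print(chainer.config.train)          # False again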

chainer.config.train

Function behavior can be controlled by the chainer.config.train flag instead of passing it as a function argument. I will just cite the example from the official document:

Example

Consider the following model definition and the code to call it in test mode written for Chainer v1.

# Chainer v1
import chainer.functions as F

class MyModel(chainer.Link):
    ...

    def __call__(self, x, train=True):
        return f(F.dropout(x, train=train))

m = MyModel(...)
y = m(x, train=False)

In Chainer v2, it should be updated into the following code:

# Chainer v2
import chainer.functions as F

class MyModel(chainer.Link):
    ...

    def __call__(self, x):
        return f(F.dropout(x))

m = MyModel(...)
with chainer.using_config('train', False):
    y = m(x)

chainer.config.enable_backprop

The volatile flag of the Variable class, used in Chainer v1, has been removed in v2.

Instead, you can use the chainer.config.enable_backprop flag to turn backpropagation on and off.

To disable backprop, there is also a utility function chainer.no_backprop_mode():

x = chainer.Variable(x)
with chainer.no_backprop_mode():
    y = model(x)
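
In practice the two flags are often combined during evaluation; a minimal sketch (model and x are assumed to be defined as above):

import chainer

# Typical evaluation in v2: run dropout etc. in test mode and
# skip constructing the computational graph at the same time
with chainer.using_config('train', False), chainer.no_backprop_mode():
    y = model(x)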

Input size of the Link can be omitted

Let me just show an example.

In Chainer v1

conv1=L.Convolution2D(None, 16, 3)

In Chainer v2 it can also be written as follows (the Chainer v1 notation is still possible):

conv1=L.Convolution2D(16, 3)

This is available for the links that take an input size, such as L.Linear and L.Convolution2D; see the official document for the full list.
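
To illustrate, here is a minimal sketch showing that the omitted input size is inferred at the first forward pass (the shapes are my own example values):

import numpy as np
import chainer.links as L

conv1 = L.Convolution2D(16, 3)   # in_channels omitted
x = np.zeros((1, 8, 32, 32), dtype=np.float32)
y = conv1(x)                     # in_channels is resolved to 8 here
print(y.shape)                   # (1, 16, 30, 30)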

init_scope context manager can be used for Link and Chain initialization

When you define your own Link or Chain class, init_scope() can be used to register Parameters or child Links.

This writing style is recommended because IDEs (PyCharm etc.) can index the attributes and show type hints. You can still use the conventional (Chainer v1) initialization as well.

Below is an example of defining a Chain class, from the official document.

Example

For example, the following chain initialization code

# Chainer v1
class MyMLP(chainer.Chain):
    def __init__(self):
        super(MyMLP, self).__init__(
            layer1=L.Linear(None, 20),
            layer2=L.Linear(None, 30),
        )
    ...

is recommended to be updated as follows.

# Chainer v2
class MyMLP(chainer.Chain):
    def __init__(self):
        super(MyMLP, self).__init__()
        with self.init_scope():
            self.layer1 = L.Linear(20)
            self.layer2 = L.Linear(30)
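
As a runnable completion of the sketch above (the ReLU forward pass and the shapes are my own assumptions, not part of the official example):

import numpy as np
import chainer
import chainer.functions as F
import chainer.links as L

class MyMLP(chainer.Chain):
    def __init__(self):
        super(MyMLP, self).__init__()
        with self.init_scope():
            # assigning inside init_scope registers these as child links
            self.layer1 = L.Linear(20)   # input size inferred lazily
            self.layer2 = L.Linear(30)

    def __call__(self, x):
        h = F.relu(self.layer1(x))
        return self.layer2(h)

model = MyMLP()
y = model(np.zeros((4, 10), dtype=np.float32))
print(y.shape)   # (4, 30)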


Function spec change (GRU, LSTM etc)

This change mainly affects those working in the NLP (Natural Language Processing) field.

The behavior of the GRU and LSTM links has changed; for example, L.GRU is now stateful by default, while the Chainer v1 behavior is available as L.StatelessGRU. See the official document for details.

Optimizer spec change

Some deprecated optimizer methods, e.g. zero_grads(), have been removed.
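
For example (model here is assumed to be your Link/Chain; in v2 gradients are reset through the link itself):

# Chainer v1 (removed in v2)
# optimizer.zero_grads()

# Chainer v2: clear gradients on the model itself
model.cleargrads()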

Internal Change for better performance

This does not require any change in your code, but is good to know:

Memory efficiency enhancement

When creating a computational graph for backpropagation, a Function does not keep the Variable's array data itself but only keeps a reference to it.

Speed enhancement

Lazy type checking is introduced to speed up type checks.
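
Relatedly, type checking can also be turned off entirely through the config introduced above (a sketch; model and x are assumed):

import chainer

# Skip input type checks for extra speed, e.g. in a tuned training loop
with chainer.using_config('type_check', False):
    y = model(x)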

Summary

  • The CuPy module has become an independent package: it needs to be installed separately when using a GPU.
  • A global config, chainer.config, is introduced
    • Function and Chain call behavior is switched by the chainer.config.train flag.
with chainer.using_config('train', False):
    ...
  • The volatile flag of Variable has been removed; use the chainer.config.enable_backprop flag instead
with chainer.no_backprop_mode():
    ...
  • Your custom Link and Chain classes can be initialized using the with self.init_scope(): statement.
with self.init_scope():
    self.l1 = L.Linear(100)
    ...

I noticed that the with statement is used a lot in Chainer v2 code.
