Chainer class introduction

Chainer is a deep learning library. You can use it to implement currently popular network architectures such as CNNs (Convolutional Neural Networks) and RNNs (Recurrent Neural Networks).

* This post was written in July 2016 with Chainer version 1.10. Chainer is still under active development, so some of the functionality described here may change in the future.

Variable, functions, links and Chain

First, please read Introduction to Chainer. To summarize, the input-output relationship of a deep neural network is maintained internally as a computational graph, which is constructed using Variable, Functions, Links and Chain. Once the deep neural network is constructed, forward/backward propagation can be executed for training.
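To make the idea of a computational graph more concrete, here is a toy sketch in plain Python. It is not Chainer's implementation: ToyVariable and everything in it are hypothetical names made up for this illustration. It only shows the general idea that the forward pass records how each value was computed, and the backward pass walks that record to obtain gradients.

```python
# Toy illustration of a computational graph. This is NOT Chainer's code;
# ToyVariable is a hypothetical class invented for this sketch.
class ToyVariable:
    """Holds a value, its gradient, and a record of how it was computed."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        # Each entry is (parent_variable, local_gradient_wrt_that_parent).
        self.parents = parents

    def __add__(self, other):
        # d(a + b)/da = 1 and d(a + b)/db = 1
        return ToyVariable(self.data + other.data,
                           ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        # d(a * b)/da = b and d(a * b)/db = a
        return ToyVariable(self.data * other.data,
                           ((self, other.data), (other, self.data)))

    def backward(self):
        """Propagate gradients from this node back to its inputs."""
        order, seen = [], set()

        def visit(node):
            # Depth-first walk that lists parents before children.
            if id(node) in seen:
                return
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)

        visit(self)
        self.grad = 1.0
        # Process the output first, then work back towards the inputs.
        for node in reversed(order):
            for parent, local_grad in node.parents:
                parent.grad += node.grad * local_grad


x = ToyVariable(2.0)
w = ToyVariable(3.0)
b = ToyVariable(1.0)
y = x * w + b   # forward pass: the graph is recorded as the expression runs
y.backward()    # backward pass: gradients flow back through the record
print(y.data, x.grad, w.grad, b.grad)
```

Chainer's real Variable does far more than this (arrays, GPU support, many more operations), but the forward-then-backward flow is the same in spirit.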

  • Variable
    A Variable is used as the input to Functions, Links and Chain.
    Ex. The code below declares a Variable x, which can be passed as an argument to Functions, Links and Chain.
    x = Variable(np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32))
  • Functions
    Functions define calculations on Variables. To perform a calculation on a Variable, use Functions instead of the standard math library. Note that you can also use the arithmetic operators (+, -, *, / etc.) on Variables.
    Ex 1. F.sum(x) calculates the sum of the array x.
    Ex 2. F.sigmoid(x) applies the sigmoid function element-wise.
  • Links
    A Link combines Functions with parameters (which may be tuned during the training phase). A Link is a single building block of the network, and can often be thought of as one layer of the neural network.
    Ex 1. L.Linear(3, 2) defines a fully connected layer from 3 units to 2 units.
    Ex 2. L.Convolution2D(8, 16, 3) defines a convolutional layer from 8 input channels (feature maps) to 16 output channels (feature maps) with a 3×3 kernel (the filter size of the convolution).
  • Chain
    A Chain consists of Links and Functions. It is used to define a deep neural network.
    Chain itself is a subclass of Link.
    Ex (explained in detail later). The chain “basic_cnn_small” below consists of 3 convolutional layers, and uses leaky_relu and clipped_relu as activation functions.
class basic_cnn_small(Chain):
    """
    Basic CNN model.
    The network consists of convolutional layers with leaky ReLU activations
    (clipped ReLU on the output layer).
    """
    def __init__(self, inout_ch):
        super(basic_cnn_small, self).__init__(
            conv1=L.Convolution2D(in_channels=inout_ch, out_channels=8, ksize=3, stride=1),
            conv2=L.Convolution2D(8, 16, 3, stride=1),
            conv3=L.Convolution2D(16, inout_ch, 3, stride=1),
        )
        self.train = True

    def __call__(self, x, t=None):
        self.clear()

        h = F.leaky_relu(self.conv1(x), slope=0.1)
        h = F.leaky_relu(self.conv2(h), slope=0.1)
        h = F.clipped_relu(self.conv3(h), z=1.0)
        if self.train:
            self.loss = F.mean_squared_error(h, t)
            return self.loss
        else:
            return h

    def preprocess_x(self, x_data):
        """
        model specific preprocessing
        :param x_data:
        :return:
        """
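        # NOTE: image_processing and total_padding are not defined in this
        # snippet; they are assumed to come from the surrounding project code.
        scaled_x = image_processing.nearest_neighbor_2x(x_data)
        return image_processing.image_padding(scaled_x, total_padding // 2)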

    def clear(self):
        self.loss = None
        # self.accuracy = None

After the Chain class is defined, it can be instantiated (here, into the variable model):

import arch.basic_cnn_small as model_arch
model = model_arch.basic_cnn_small(inout_ch=inout_ch)

These classes (and NumPy, which is used for the array in the Variable example) are imported as follows:

import numpy as np
from chainer import Variable, Chain
import chainer.functions as F
import chainer.links as L
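As a rough check on what basic_cnn_small does to its input, the sketch below works through the arithmetic in plain Python (no Chainer needed). The helper names and the 32-pixel input size are assumptions made up for this illustration. The output-size formula is the standard one for a convolution without dilation, and the two activations follow their usual definitions with the slope and z values used in the chain above.

```python
def conv_out_size(in_size, ksize, stride=1, pad=0):
    """Output length along one spatial axis of a convolution."""
    return (in_size + 2 * pad - ksize) // stride + 1


def leaky_relu(x, slope=0.1):
    """Pass positive values through; scale negative values by `slope`."""
    return x if x >= 0 else slope * x


def clipped_relu(x, z=1.0):
    """Clamp the value into the range [0, z]."""
    return min(max(x, 0.0), z)


size = 32  # hypothetical input height/width, chosen only for illustration
for _ in range(3):  # conv1, conv2, conv3: ksize=3, stride=1, no padding
    size = conv_out_size(size, ksize=3, stride=1)
print(size)  # each 3x3 convolution trims 2 pixels, so 6 pixels in total

print(leaky_relu(-2.0), clipped_relu(1.7), clipped_relu(-0.3))
```

Because each unpadded 3×3 convolution shrinks the image by 2 pixels, the three layers together remove 6 pixels per axis. This is presumably why preprocess_x pads the input beforehand, although the actual value of total_padding is not shown in this post.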


Comment: the whole Chain definition is written in Python; Chainer does not use any separate definition file for the neural network. This is in contrast to some other well-known machine learning libraries such as Caffe, which uses prototxt files to define the network. This design comes from one of Chainer’s core concepts, the “define-by-run” scheme, in which the network is defined on the fly as the code runs, so you don’t need to pre-define the neural network model before executing the code.
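One practical consequence of “define-by-run” is that ordinary Python control flow can decide what the network does for each input. The following is a plain-Python sketch of that idea, not Chainer code: forward is a hypothetical function, and the trace list merely stands in for the graph that a framework would record while the code runs.

```python
def forward(x, max_steps=10):
    """Halve x until it drops below 1.0, recording each operation as it runs."""
    trace = []
    h = x
    for _ in range(max_steps):
        if h < 1.0:  # data-dependent decision made during the forward pass
            break
        h = h * 0.5
        trace.append("halve")
    return h, trace


out_a, trace_a = forward(8.0)
out_b, trace_b = forward(1.5)
print(out_a, len(trace_a))
print(out_b, len(trace_b))
```

The two calls record different numbers of operations, even though the "model" is the same function. With a separate, static definition file, this kind of input-dependent structure is harder to express.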

Optimizer and Serializer

Optimizer and Serializer act as helpers for the Chain class. They provide convenient functionality for training a model and for saving/loading it.

  • Optimizer
    It helps to train the parameters of the model (Chain).
    * The parameters are defined in the Links of the Chain.
    Ex 1. optimizer = optimizers.SGD() prepares an optimizer for the stochastic gradient descent method.
    Ex 2. optimizer = optimizers.Adam(alpha=0.0001) prepares an optimizer for the Adam method.
  • Serializer
    It provides methods to save/load a model built with the Chain class.
    Ex 1. Save the model (after training): serializers.save_npz('my.model', model)
    Ex 2. Load the model (for inference): serializers.load_npz(model_load_path, model)
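To give a feel for what an optimizer does with the parameters, here is a minimal plain-Python sketch of the basic stochastic gradient descent update rule (parameter minus learning rate times gradient). sgd_update is a hypothetical helper written for this post, not part of Chainer, and the one-parameter quadratic loss is just a toy problem.

```python
def sgd_update(params, grads, lr=0.01):
    """One SGD step: move each parameter against its gradient."""
    return [p - lr * g for p, g in zip(params, grads)]


# Toy problem: minimise loss(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w = [0.0]
for _ in range(200):
    grads = [2.0 * (w[0] - 3.0)]
    w = sgd_update(w, grads, lr=0.1)

print(w[0])  # approaches the minimiser w = 3
```

In Chainer the optimizer applies an update of this kind to every parameter held by the Links of the Chain, using the gradients computed by backward propagation. Adam follows the same pattern but adapts the step size for each parameter.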
