MNIST inference code

MNIST inference

We already learned how to write training code in Chainer; the last task is to use this trained model to run inference (prediction) on the test MNIST images. Inference code usually has the following structure:

1. Prepare the input data
2. Instantiate the trained model
3. Load the trained model
4. Feed the input data into the loaded model to get the inference result

You have already learned all the necessary pieces, so it is easy. See the source code for details.

Prepare input data

For MNIST, this takes only one line

  Instantiate the trained model and load the model

Here, note that the model can be loaded only after instantiating it. This model must have […]

Continue reading →

Chainer family

Recently, several sub-libraries for Chainer have been released.

ChainerRL (RL: Reinforcement Learning)

A deep reinforcement learning library. Quoting: recent state-of-the-art deep reinforcement learning algorithms are implemented, including

- A3C (Asynchronous Advantage Actor-Critic)
- ACER (Actor-Critic with Experience Replay) (only the discrete-action version for now)
- Asynchronous N-step Q-learning
- DQN (including Double DQN, Persistent Advantage Learning (PAL), Double PAL, and Dynamic Policy Programming (DPP))
- DDPG (Deep Deterministic Policy Gradients) (including SVG(0))
- PGT (Policy Gradient Theorem)

How to install

github repository: ChainerRL – Deep Reinforcement Learning Library

ChainerCV (CV: Computer Vision)

An image processing library for deep learning training. Common data-augmentation methods are implemented. How to install

github repository, document

ChainerMN (MN: Multi Node) […]

Continue reading →

Chainer version 2 – updated part

Chainer version 2 is planned to be released in April 2017. A pre-release version is already available; install it with this command

The biggest change is that CuPy (roughly, a GPU version of NumPy) becomes an independent package and is provided separately.

Reference: Chainer v2 alpha from Seiya Tokui

Continue reading →

Writing organized, reusable, clean training code using Trainer module

Training code abstraction with Trainer

Until now, I was implementing the training code in a “primitive” way, to explain what kind of operations go on during deep learning training (※). However, the code can be written much more cleanly using the Trainer modules in Chainer.

※ The Trainer modules have been available since version 1.11, and some open source projects are implemented without Trainer. So knowing the training implementation without the Trainer module also helps in understanding those codes.

Motivation for using Trainer

We can notice that there are many “typical” operations widely used in machine learning, for example: iterating minibatch training, with minibatches sampled randomly; separating the train […]

Continue reading →

Design patterns for defining model

Machine learning consists of a training phase and a predict/inference phase, and what the model needs to calculate differs between them:

- Training phase: calculate the loss (between the output and the target)
- Predict/inference phase: calculate the output

I often see the following two patterns used to manage this.

Predictor – Classifier framework

See ( and are also implemented in the Predictor – Classifier framework). Two Chain classes, “Predictor” and “Classifier”, are used in this framework.

- Training phase: the Predictor’s output is fed into the Classifier to calculate the loss.
- Predict/inference phase: only the Predictor’s output is used.

Predictor

The Predictor simply calculates the output based on the input.


Classifier

The Classifier “wraps” the predictor’s output y to […]

Continue reading →

Refactoring MNIST training

In the previous section, we learned a minimum implementation of the training code for MNIST. Now, let’s refactor the code.

argparse

argparse is used to make the script configurable: the user can pass variables when executing the code. The code below is added to the training code

Then these variables become configurable when executing the code from the console, and they can be accessed through the parsed arguments (e.g. args.batchsize, args.epoch, etc.). For example, to set the GPU device number to 0,


or, with an added “=”, it works the same way

You can also see what options are available using the --help flag, or simply -h.
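A minimal sketch of such a script is below. The flag names mirror the ones mentioned above; an explicit argument list is passed to `parse_args` purely for illustration (normally argparse reads `sys.argv`):

```python
import argparse

# Hypothetical flags, matching the variables mentioned above
parser = argparse.ArgumentParser(description='MNIST training')
parser.add_argument('--batchsize', '-b', type=int, default=100)
parser.add_argument('--epoch', '-e', type=int, default=20)
parser.add_argument('--gpu', '-g', type=int, default=-1)

args = parser.parse_args(['--gpu', '0'])   # like: python train.py --gpu 0
print(args.gpu, args.batchsize)            # 0 100

args2 = parser.parse_args(['--gpu=0'])     # the "=" form parses identically
```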

  Reference: argparse document   […]

Continue reading →

MNIST training with Multi Layer Perceptron

Training MNIST

You have already studied the basics of Chainer and the MNIST dataset. Now we can proceed to the MNIST classification task. We want to create a classifier that classifies an MNIST handwritten image into its digit. In other words, the classifier gets an array representing an MNIST image as input and outputs its label.

※ Chainer contains modules called Trainer, Iterator, and Updater, which make your training code more organized. It is quite nice to write your training code with them, using higher-level syntax. However, this abstraction makes it difficult to understand what is going on during training. For those who want to learn deep learning in more detail, I think […]

Continue reading →

MNIST dataset introduction


MNIST dataset

The MNIST (Mixed National Institute of Standards and Technology) database is a dataset of handwritten digits, distributed by Yann LeCun’s THE MNIST DATABASE of handwritten digits website. (Wikipedia) The dataset consists of pairs of a “handwritten digit image” and a “label”. The digit ranges from 0 to 9, giving 10 patterns in total.

- handwritten digit image: a grayscale image of size 28 × 28 pixels.
- label: the actual digit this handwritten image represents, from 0 to 9.

The MNIST dataset is widely used for “classification” and “image recognition” tasks. It is considered a relatively simple task, and is often used as the “Hello world” program of machine learning […]
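The structure of one sample can be illustrated with plain NumPy (a dummy image is used here; the real images come from the dataset itself, where they are commonly handled flattened to 784 values):

```python
import numpy as np

# A dummy sample with the same structure as one MNIST entry: (image, label)
image = np.zeros((28, 28), dtype=np.float32)   # grayscale, pixel values in [0, 1]
label = 5                                      # the digit this image represents (0-9)

flat = image.reshape(-1)    # 28 * 28 = 784 values when flattened
print(flat.shape)           # (784,)
```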

Continue reading →

Why Chainer?

I will list the good points of Chainer, as one Chainer enthusiast’s opinion.

Features

Easy environment setup

Environment setup is easy: execute the single command pip install chainer and that’s all. Some deep learning frameworks are written in C/C++ and require you to build them yourself; it can take hours just to set up your development environment.

Easy to debug

Chainer is fully written in Python, so you can see the stack trace when an error is raised. You can print logs during the neural network calculation, which cannot be done with many other frameworks where the neural network model needs to be pre-compiled (static-graph frameworks).

Flexible

Chainer is flexible […]

Continue reading →

Chainer basic module introduction 2

This post is just a copy of chainer_module2.ipynb on github; you can execute it interactively using jupyter notebook. Advanced memos are written as “Note”; you can skip them on a first reading. In the previous tutorial, we learned about Variable, Link, Function, and Chain. Let’s try training the model (Chain) in this tutorial. In this section, we will learn about:

- Optimizer – optimizes/tunes the internal parameters to fit the target function
- Serializer – handles saving/loading the model (Chain)

Other chainer modules are explained in later tutorials.

Training

What we want to do here is regression analysis (Wikipedia). Given a set of inputs x and their outputs y, we would like to construct a model (function) […]

Continue reading →