Refactoring MNIST training

In the previous section, we learned a minimum implementation of the training code for MNIST. Now, let’s refactor the code. argparse: argparse is used to make a script configurable, so that the user can pass values for variables when executing the code. The code below is added to the training code.
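As a minimal sketch of what such an argparse block can look like: the attribute names batchsize, epoch and gpu come from the text of this post, while the function name, the short flags, the default values and the help strings are assumptions made for illustration.

```python
import argparse


def parse_training_args(argv=None):
    """Parse command-line options for a training script (illustrative sketch).

    argv=None makes argparse read the real command line (sys.argv[1:]);
    passing a list of strings is convenient for trying things out.
    """
    parser = argparse.ArgumentParser(description='MNIST training example')
    # Default values below are placeholders, not taken from the original post.
    parser.add_argument('--batchsize', '-b', type=int, default=50,
                        help='Number of images in each mini-batch')
    parser.add_argument('--epoch', '-e', type=int, default=20,
                        help='Number of sweeps over the training dataset')
    parser.add_argument('--gpu', '-g', type=int, default=-1,
                        help='GPU device ID (negative value means CPU)')
    return parser.parse_args(argv)


# Options that are not passed keep their default values.
args = parse_training_args(['--batchsize', '32', '--gpu', '0'])
print(args.batchsize, args.epoch, args.gpu)
```

In a real script you would call parse_training_args() with no argument, so that the values come from the console.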

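These variables then become configurable when executing the code from the console, and they can be accessed in the code as attributes of args (e.g. args.batchsize, args.epoch). For example, the GPU device number can be set to 0 by passing the option name followed by a space and the value; joining the option name and the value with “=” works the same way.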

You can also see which options are available by using the --help option, or simply -h.
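The behaviour described above can be checked without a console by handing argparse the tokens directly. This is only a sketch: the option name --gpu and its short form -g are assumptions, since the actual flag definitions are not shown in this excerpt.

```python
import argparse

parser = argparse.ArgumentParser(description='MNIST training example')
# Hypothetical option; the name and default are assumptions for illustration.
parser.add_argument('--gpu', '-g', type=int, default=-1,
                    help='GPU device ID (negative value means CPU)')

# Each list stands for the tokens typed after the script name on the console.
space_form = parser.parse_args(['--gpu', '0'])
equals_form = parser.parse_args(['--gpu=0'])
short_form = parser.parse_args(['-g', '0'])
print(space_form.gpu, equals_form.gpu, short_form.gpu)

# format_help() returns the text that -h / --help would print before exiting.
help_text = parser.format_help()
print(help_text)
```

All three forms produce the same args.gpu value, which is why the “=” variant can be used interchangeably.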

  Reference: argparse document   […]


MNIST training with Multi Layer Perceptron


Training MNIST: You have already studied the basics of Chainer and the MNIST dataset. Now we can proceed to the MNIST classification task. We want to create a classifier that classifies an MNIST handwritten image into its digit. In other words, the classifier takes an array which represents an MNIST image as input and outputs its label. ※ Chainer contains modules called Trainer, Iterator and Updater, which make your training code more organized. It is quite nice to write your training code with them in a higher-level syntax. However, their abstraction makes it difficult to understand what is going on during training. For those who want to learn deep learning in more detail, I think […]


MNIST dataset introduction


MNIST dataset: The MNIST (Modified National Institute of Standards and Technology) database is a dataset of handwritten digits, distributed on Yann LeCun’s website, THE MNIST DATABASE of handwritten digits (see also Wikipedia). The dataset consists of pairs of a “handwritten digit image” and a “label”. Digits range from 0 to 9, meaning 10 patterns in total. handwritten digit image: a gray scale image of size 28 x 28 pixels. label: the actual digit that the handwritten digit image represents, an integer from 0 to 9. The MNIST dataset is widely used for “classification” and “image recognition” tasks. These are considered relatively simple tasks, and MNIST is often used as the “Hello world” program in machine learning […]
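To make the (image, label) structure concrete, here is a small sketch in plain Python. It does not load the real dataset: the sample is synthetic (an all-zero image), and the helper names make_blank_sample and flatten are hypothetical.

```python
IMAGE_SIZE = 28   # MNIST images are 28 x 28 pixels
NUM_CLASSES = 10  # digits 0 to 9


def make_blank_sample(label):
    """Return a synthetic (image, label) pair shaped like an MNIST sample."""
    if not 0 <= label < NUM_CLASSES:
        raise ValueError('label must be an integer from 0 to 9')
    # A gray scale image is a 2-D grid with one intensity value per pixel.
    image = [[0.0] * IMAGE_SIZE for _ in range(IMAGE_SIZE)]
    return image, label


def flatten(image):
    """Flatten a 2-D image into a 1-D vector of pixel values."""
    return [pixel for row in image for pixel in row]


image, label = make_blank_sample(7)
vector = flatten(image)
print(len(image), len(image[0]), len(vector), label)
```

Flattening matters because a fully connected network takes a 1-D vector as input, so each 28 x 28 image becomes 784 values.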


Why Chainer?

I will list the good points of Chainer, as the opinion of one Chainer enthusiast. Features. Easy environment setup: Environment setup is easy; executing one command, pip install chainer, is all it takes. Some deep learning frameworks are written in C/C++ and require you to build them on your own, so it can take hours just to set up your development environment. Easy to debug: Chainer is fully written in Python, and you can see the stack trace log when an error is raised. You can print out logs during the neural network calculation, which cannot be done with many other frameworks where the neural network model needs to be pre-compiled (static graph frameworks). Flexible: Chainer is flexible […]


Chainer basic module introduction 2

This post is just a copy of chainer_module2.ipynb on GitHub; you can execute it interactively using Jupyter Notebook. Advanced memos are written as “Note”; you can skip them on a first reading. In the previous tutorial, we learned Variable, Link, Function and Chain. Let’s try training the model (Chain) in this tutorial. In this section, we will learn Optimizer – optimizes/tunes the internal parameters to fit the target function; Serializer – handles saving/loading the model (Chain). Other Chainer modules are explained in later tutorials. Training: What we want to do here is regression analysis (Wikipedia). Given a set of inputs x and outputs y, we would like to construct a model (function) […]
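To illustrate what “tuning internal parameters to fit the target function” means, here is a hand-written gradient descent for a straight-line model y = w * x + b. This is a sketch in plain Python, not Chainer’s Optimizer API; the function name fit_line, the learning rate, the iteration count and the synthetic data are all assumptions.

```python
def fit_line(xs, ys, lr=0.05, n_iter=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iter):
        # Gradients of mean((w * x + b - y) ** 2) with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Move each parameter a small step against its gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Synthetic data generated from the target function y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]
w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))
```

An optimizer in a deep learning framework automates the update step for every parameter of the model, usually with more elaborate update rules than this plain one.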


Chainer basic module introduction

This post is just a copy of chainer_module.ipynb on GitHub; you can execute it interactively using Jupyter Notebook. Advanced memos are written as “Note”; you can skip them on a first reading. In this tutorial, the basic Chainer modules are introduced and explained: Variable, Link, Function and Chain. Other Chainer modules are explained in later tutorials. Initial setup: Below is a typical import statement for Chainer modules.
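The import statement itself does not appear in this excerpt. As an assumption based on the modules named in this series (Variable, Link, Function, Chain, optimizers, serializers), a typical set might look like the sketch below. The try/except guard and the CHAINER_AVAILABLE flag are additions so that the snippet also runs where Chainer is not installed.

```python
try:
    import numpy as np
    import chainer
    from chainer import Variable, Function, Link, Chain, ChainList
    from chainer import optimizers, serializers
    import chainer.functions as F  # parameter-free functions (e.g. F.relu)
    import chainer.links as L      # layers with parameters (e.g. L.Linear)
    CHAINER_AVAILABLE = True
except ImportError:
    # Chainer (or NumPy) is not installed in this environment.
    CHAINER_AVAILABLE = False

print('Chainer available:', CHAINER_AVAILABLE)
```

The aliases F and L are a common convention in Chainer code; in your own notebook you can drop the guard and keep only the import lines.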

Variable: A Chainer variable can be created by the Variable constructor, which creates a chainer.Variable class object. When I write Variable, it means Chainer’s Variable class. Please do not confuse it with the usual noun “variable”. Note: the reason why Chainer needs to […]


Chainer modules

Comparing with caffe: If you are using Caffe, it is easy to get accustomed to Chainer modules. Several of their functionalities are similar; the table below shows the correspondence.

Chainer | Caffe | Comment
datasets | Data layers | Input data can be formatted to this class for the model input. It covers most of the use cases of input data structure.
variable | blob | It is an input and output of functions/links/Chain.
functions | layers | The framework supports widely used functions in deep learning, e.g. sigmoid, tanh, ReLU.
links | layers | The framework supports widely used layers in deep learning, e.g. Linear layer, Convolutional layer.
Chain | net | links and functions (layers) are joined to […]


English summary of “Chainer Advent Calendar 2016”

I will summarize Chainer Advent Calendar 2016 to introduce what kinds of deep learning projects are going on in Japan. Chainer: The deep learning framework Chainer is developed by the Japanese company Preferred Networks, and its community is quite active in Japan. Since interesting projects are presented and discussed there, I would like to introduce them by summarizing them in English. Advent Calendar: In Japan, there is a popular programming blog website called Qiita, and an “Advent Calendar” on Qiita is an event where interested people can join and write blog posts on a specific theme from Dec 1 to Dec 25. Advent Calendar 2016: Chainer Advent Calendar 2016. If you want to read the […]


Chainer class introduction


Chainer is a library for deep learning. You can implement currently popular networks, e.g. CNN (Convolutional Neural Network), RNN (Recurrent Neural Network), etc. Chainer official document * This post was written in July 2016 with Chainer version 1.10; Chainer is still in active development, and some of the functionality may change in the future. Variable, functions, links and Chain: First, please read Introduction to Chainer. To summarize, the input-output relationship of a deep neural network is maintained internally as a computational graph, which is constructed using Variable, functions, links and Chain. Once the deep neural network is constructed, forward/backward propagation can be executed for training. Variable: Variable will be used […]


SeRanet: Quick start guide

This post explains the SeRanet project, super resolution software based on deep learning. Preparation. Dependencies – third party libraries. Install python, pip: The software is written in Python, and I’m using Python version 2.7.6. If you are using Ubuntu 14.04, python2 is pre-installed by default, so you don’t need to install it explicitly. The version of Python can be checked by typing 

 in the terminal. If you can’t find python, then try below.

To install third party Python libraries, the pip command is often used. To install pip, type the command below in the command line

Install popular libraries (numpy, scipy, matplotlib): numpy, scipy and matplotlib are widely used for data processing in Python. […]
