MNIST training with Multi Layer Perceptron

  Training MNIST You already studied basics of Chainer and MNIST dataset. Now we can proceed to the MNIST classification task. We want to create a classifier that classifies MNIST handwritten image into its digit. In other words, classifier will get array which represents MNIST image as input and outputs its label. ※ Chainer has some library class called Trainer, Iterator, Updater, which makes your training code more organized. However, its abstraction makes difficult to understand what is going on during the training. For those who want to learn deep learning in more detail, I think it is nice to know “primitive way” of training. Therefore, I decided to darely […]

Continue reading →

MNIST dataset introduction


  MNIST dataset MNIST (Mixed National Institute of Standards and Technology) database is dataset for handwritten digits, distributed by Yann Lecun’s THE MNIST DATABASE of handwritten digits website. Wikipedia The dataset consists of pair, “handwritten digit image” and “label”. Digit ranges from 0 to 9, meaning 10 patterns in total. handwritten digit image: This is gray scale image with size 28 x 28 pixel. label : This is actual digit number this handwritten digit image represents. It is either  0 to 9.   MNIST dataset is widely used for “classification”, “image recognition” task. This is considered as relatively simple task, and often used for “Hello world” program in machine learning […]

Continue reading →

Why Chainer?

I will list up good points of Chainer as an opinion from one Chainer enthusiast. Features Easy environment setup Environment setup is easy, execute one command pip install chainer that’s all and easy. Some deep learning framework is written in C/C++ and requires to build by your own. It will take hours to only setup your develop environment. Easy to debug Chainer is fully written in python, and you can see the stack trace log when error raised. You can print out log during the neural network calculation, which cannot be done with many other framework where the neural network model needs to be pre-compiled (static graph framework). Flexible Chainer is  flexible […]

Continue reading →

Chainer basic module introduction 2

This post is just a copy of chainer_module2.ipynb on github, you can execute interactively using jupyter notebook. Advanced memo is written as “Note”. You can skip reading this for the first time reading. In previous tutorial, we learned Variable Link Function Chain Let’s try training the model (Chain) in this tutorial.In this section, we will learn Optimizer – Optimizes/tunes the internal parameter to fit to the target function Serializer – Handle save/load the model (Chain) For other chainer modules are explained in later tutorial. Training What we want to do here is regression analysis (Wikipedia).Given set of input x and its output y, we would like to construct a model (function) […]

Continue reading →

Chainer basic module introduction

  This post is just a copy of chainer_module.ipynb on github, you can execute interactively using jupyter notebook. Advanced memo is written as “Note”. You can skip reading this for the first time reading.  In this tutorial, basic chainer modules are introduced and explained Variable Link Function Chain For other chainer modules are explained in later tutorial. Initial setup Below is typecal import statement of chainer modules.

  Variable Chainer variable can be created by Variable constructor, which creates chainer.Variable class object. When I write Variable, it means chainer’s class for Variable. Please do not confuse with the usual noun of “variable”. Note: the reason why chainer need to […]

Continue reading →

Chainer modules

    Comparing with caffe If you are using caffe, it is easy to get accustomed to chainer modules. Several variable’s functionality is similar and you can see below table for its correspondence. Chainer Caffe Comment datasets Data layers Input data can be formatted to this class for the model input.It covers most of the use case of input data structure. variable blob It is an input and output of functions/links/Chain. functions layers Framework supports widely used functions in deep learning. Ex sigmoid, tanh, ReLU etc. links layers Framework supports widely used layers in deep learning. Ex Linear layer, Convolutional layer etc. Chain net links and functions (layers) are jointed to […]

Continue reading →