# CIFAR-10, CIFAR-100 training with Convolutional Neural Network

Writing your CNN model This is an example of a small Convolutional Neural Network definition, CNNSmall

I also made a slightly bigger CNN, called CNNMedium.

It is useful to know the computational cost of a convolution layer, which is approximated as $$H_I \times W_I \times CH_I \times CH_O \times k^2$$ where $CH_I$ is the input image channel count, $CH_O$ the output image channel count, $H_I$ the input image height, $W_I$ the input image width, and $k$ the kernel size (assumed to be the same for width and height). In the above CNN definitions, the size of the channel is bigger for […]
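As a quick sanity check, the formula above can be evaluated directly. The concrete numbers below (a 3×3 convolution taking a 32×32 RGB image to 32 output channels) are illustrative choices, not values from the article:

```python
def conv_cost(h_i, w_i, ch_i, ch_o, k):
    """Approximate multiply-accumulate count of a convolution layer:
    H_I * W_I * CH_I * CH_O * k^2 (ignoring stride/padding effects)."""
    return h_i * w_i * ch_i * ch_o * k ** 2

# 32x32 RGB input (CH_I = 3) -> 32 output channels, 3x3 kernel
cost = conv_cost(32, 32, 3, 32, 3)
print(cost)  # 884736, i.e. ~0.88 M multiply-accumulates
```

Doubling the output channel count doubles the cost, which is why channel sizes dominate the budget of deeper layers.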

# CIFAR-10, CIFAR-100 dataset introduction

Source code is uploaded on github. CIFAR-10 and CIFAR-100 are small image datasets with classification labels. They are widely used as easy image classification tasks/benchmarks in the research community. Official page: CIFAR-10 and CIFAR-100 datasets. In Chainer, the CIFAR-10 and CIFAR-100 datasets can be obtained with a built-in function. Setup code:

CIFAR-10 The chainer.datasets.get_cifar10 method is prepared in Chainer to get the CIFAR-10 dataset. The dataset is automatically downloaded from https://www.cs.toronto.edu only the first time, and its cache is used from the second time onward.

The dataset structure is quite similar to the MNIST dataset: it is a TupleDataset. train[i] represents the i-th data; there are 50000 training data. The test data structure is the same, with 10000 […]

# Understanding convolutional layer

Source code is uploaded on github. The sample image is obtained from PEXELS. What is the difference between a convolutional layer and a linear layer? What kind of intuition lies behind using convolutional layers in deep neural networks? This hands-on shows some effects of the convolutional layer, to provide some intuition about what convolutional layers do.

The above type of diagram often appears in the convolutional neural network field. The figure below explains its notation. A cuboid represents the “image” array, where this image might not be a meaningful picture. The horizontal axis represents the channel number, the vertical axis the image height, and the depth axis the image width, respectively.   Convolution layer – […]

# Basic image processing tutorial

Basic image processing for deep learning. Refer to github for the source code. The sample image is obtained from PEXELS. If you are not familiar with image processing, you can read this article before going on to convolutional neural networks. OpenCV is an image processing library which supports loading an image in numpy.ndarray format, saving an image, converting image color formats (RGB, YUV, grayscale, etc.), resizing, and other useful image processing functionality. To install OpenCV, execute `$ conda install -c https://conda.binstar.org/menpo -y opencv3`

Loading and saving images cv2.imread for loading an image, cv2.imwrite for saving an image, plt.imshow for plotting, and plt.savefig for saving a plot image. The OpenCV image format is usually 3-dimensional (or 2-dimensional if […]

# MNIST inference code

We have already learned how to write the training code in Chainer; the last task is to use this trained model for inference (prediction) on the test input MNIST images. The inference code structure usually becomes as follows: prepare the input data, instantiate the trained model, load the trained model, and feed the input data into the loaded model to get the inference result. You have already learned the necessary stuff, and it is easy. See inference_mnist.py for the source code. Prepare input data For MNIST, it is easy, in one line

Instantiate the trained model and load the model

Here, note that the model can be loaded only after instantiating the model. This model must have […]

# Chainer family

Recently, several sub-libraries for Chainer have been released.

ChainerRL (RL: Reinforcement Learning) Deep Reinforcement Learning library. Cited from http://chainer.org/general/2017/02/22/ChainerRL-Deep-Reinforcement-Learning-Library.html: recent state-of-the-art deep reinforcement learning algorithms are implemented, including

- A3C (Asynchronous Advantage Actor-Critic)
- ACER (Actor-Critic with Experience Replay) (only the discrete-action version for now)
- Asynchronous N-step Q-learning
- DQN (including Double DQN, Persistent Advantage Learning (PAL), Double PAL, Dynamic Policy Programming (DPP))
- DDPG (Deep Deterministic Policy Gradients) (including SVG(0))
- PGT (Policy Gradient Theorem)

How to install

github repository: ChainerRL – Deep Reinforcement Learning Library.   ChainerCV (CV: Computer Vision) Image processing library for deep learning training. Common data augmentations are implemented. How to install

github repository, document. ChainerMN (MN: Multi Node) […]

# Chainer version 2 – updated part

Chainer version 2 is planned to be released in April 2017. A pre-release version is already available; install it with this command

The biggest change is that CuPy (roughly, a GPU version of NumPy) becomes independent and is provided separately.   Reference: Chainer v2 alpha from Seiya Tokui

# Writing organized, reusable, clean training code using Trainer module

Training code abstraction with Trainer Until now, I was implementing the training code in a “primitive” way to explain what kind of operations are going on in deep learning training (※). However, the code can be written in a much cleaner way using the Trainer modules in Chainer. ※ Trainer modules have been implemented since version 1.11, and some open source projects are implemented without Trainer. So it also helps in understanding those codes to know the training implementation without the Trainer module. Motivation for using Trainer We can notice that there are many “typical” operations widely used in machine learning, for example iterating minibatch training, with minibatches sampled randomly Separate train […]

# Design patterns for defining model

Machine learning consists of a training phase and a predict/inference phase, and what the model needs to calculate differs between them. Training phase: calculate the loss (between the output and the target). Predict/inference phase: calculate the output. To manage this, I often see the below 2 patterns.   Predictor – Classifier framework See train_mnist_2_predictor_classifier.py (train_mnist_1_minimum.py and train_mnist_4_trainer.py are also implemented in the Predictor – Classifier framework). 2 Chain classes, “Predictor” and “Classifier”, are used in this framework. Training phase: the Predictor’s output is fed into the Classifier to calculate the loss. Predict/inference phase: only the Predictor’s output is used.   Predictor The Predictor simply calculates the output based on the input.

Classifier The Classifier “wraps” the predictor’s output y to […]

# Refactoring MNIST training

In the previous section, we learned the minimum implementation (train_mnist_1_minimum.py) of the training code for MNIST. Now, let’s refactor the code. See train_mnist_2_predictor_classifier.py. argparse argparse is used to make the script configurable. The user can pass variables when executing the code. The below code is added to the training code

Then, these variables are configurable when executing the code from the console, and they can be accessed by args.xxx (e.g. args.batchsize, args.epoch, etc.). For example, to set GPU device number 0,

or

or even adding “=”, which works the same

You can also see what options are available using the --help command or simply -h.
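A minimal argparse sketch in the spirit described above (the flag names -b/-e/-g and their defaults are assumptions modeled on typical Chainer example scripts, not necessarily the article's exact code), showing that the long form, the short form, and the "=" form all parse to the same value:

```python
import argparse

parser = argparse.ArgumentParser(description='Chainer example: MNIST')
parser.add_argument('--batchsize', '-b', type=int, default=100)
parser.add_argument('--epoch', '-e', type=int, default=20)
parser.add_argument('--gpu', '-g', type=int, default=-1)

# All three invocation styles set the GPU device to 0:
for argv in (['--gpu', '0'], ['-g', '0'], ['--gpu=0']):
    args = parser.parse_args(argv)
    print(args.gpu)  # 0 each time
```

parse_args with no argument reads sys.argv in a real script; passing a list here just makes the three command-line variants easy to demonstrate in one run.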

Reference: argparse document   […]