Write predict code using concat_examples


  This tutorial corresponds to the 03_custom_dataset_mlp folder in the source code.   We trained the model on our own dataset, MyDataset, in the previous post; now let's write the prediction code. Source code: predict_custom_dataset1.py predict_custom_dataset2.py   Prepare test data It is easy for the model to fit the training data, so we will check how well the model fits the test data. 

I used the same seed (=13) to extract the train and test data used in the training phase.   Load trained model

The procedure to load the trained model is: instantiate the model (which is a subclass of Chain; here, it is MyMLP), then send the parameters to the GPU if […]
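The batching performed by concat_examples can be sketched with NumPy. This is a conceptual sketch only, with names of its own; the real chainer.dataset.concat_examples also handles padding and CPU/GPU transfer.

```python
import numpy as np

# Conceptual sketch of chainer.dataset.concat_examples:
# a list of examples is turned into batched arrays.
def concat_examples_sketch(batch):
    if isinstance(batch[0], tuple):
        # A list of (x, t) tuples becomes a tuple of batched arrays.
        return tuple(np.stack([example[i] for example in batch])
                     for i in range(len(batch[0])))
    return np.stack(batch)

examples = [(np.zeros(2, dtype=np.float32), np.float32(0.5)) for _ in range(4)]
x_batch, t_batch = concat_examples_sketch(examples)
print(x_batch.shape, t_batch.shape)  # (4, 2) (4,)
```

The batched arrays can then be fed to the model in one forward pass instead of looping over examples.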

Continue reading →

Training code for MyDataset


  This tutorial corresponds to the 03_custom_dataset_mlp folder in the source code. We prepared our own dataset, MyDataset, in the previous post. The training procedure for this dataset is almost the same as for MNIST training. The differences from the MNIST dataset are: this task is a regression task (estimating a final real “value”) instead of a classification task (estimating the probability of a category), and the training data and validation/test data are not split in our custom dataset.   Model definition for regression task training Our task is to estimate the real value “t” given the real value “x”, which is categorized as a regression task. Wikipedia: Regression analysis We often use mean squared error as the loss function, namely, $$ L = \frac{1}{D}\sum_i^D (t_i - […]
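The excerpt cuts the equation off, but mean squared error is the standard regression loss, L = (1/D) Σᵢ (tᵢ − yᵢ)². In Chainer this is provided as F.mean_squared_error; the NumPy version below is a sketch of the same formula.

```python
import numpy as np

# Mean squared error: L = (1/D) * sum_i (t_i - y_i)^2
# (Chainer provides F.mean_squared_error; this NumPy version just
# illustrates the formula.)
def mean_squared_error(y, t):
    return np.mean((t - y) ** 2)

t = np.array([1.0, 2.0, 3.0])  # targets
y = np.array([1.0, 2.5, 2.0])  # predictions
print(mean_squared_error(y, t))  # (0 + 0.25 + 1.0) / 3 = 0.41666...
```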

Continue reading →

Create dataset class from your own data with DatasetMixin

  This tutorial corresponds to the 03_custom_dataset_mlp folder in the source code. In the previous chapter we learned how to train a deep neural network using the MNIST handwritten digits dataset. However, the MNIST dataset was prepared by the Chainer utility library, and you might now wonder how to prepare a dataset when you want to use your own data for a regression/classification task. Chainer provides the DatasetMixin class to let you define your own dataset class.   Prepare Data In this task, we will try a very simple regression task. Our own dataset can be generated by create_my_dataset.py. 

  This script will create a very simple CSV file named “data/my_data.csv”, with columns named “x” and “t”. “x” indicates the input value […]

Continue reading →

Predict code for simple sequence dataset

  The predict code is easy; it is implemented in predict_simple_sequence.py. First, construct the model and load the trained model parameters,

  Then we only specify the first index (which corresponds to a word ID), primeindex, and generate the next index. We can generate further indices repeatedly, feeding each generated index back in.

The result is the following; the model successfully generates the sequence. Predicted sequence:  [1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 9, 9, 1, 2, 2, 3, 3, 3, […]
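The repeated feed-back step can be sketched as a small loop. Here `predict_next` stands in for the trained RNN's one-step prediction (in the post this would call the model and take the argmax of its output); a toy rule is substituted so the sketch runs on its own.

```python
# Sketch of the autoregressive generation loop described above.
def generate(predict_next, primeindex, length):
    seq = [primeindex]
    index = primeindex
    for _ in range(length - 1):
        index = predict_next(index)  # feed the last output back in
        seq.append(index)
    return seq

# Toy stand-in predictor (NOT the trained model): cycle through 1..9.
print(generate(lambda i: i % 9 + 1, 1, 5))  # [1, 2, 3, 4, 5]
```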

Continue reading →

Chainer v2 released: difference from v1

  Chainer version 2 was released on June 1, 2017.  #Chainer v2.0.0 has been released! Memory reduction (33% in ResNet), API clean up, and CuPy as a separate package. https://t.co/xRrmZAlJWT — Chainer (@ChainerOfficial) June 1, 2017 This post is a summary of what you need to change in your code for Chainer development. Detailed changes are described in the official document: Upgrade Guide from v1 to v2   Installation change The CuPy module has become independent. The reason is that CuPy is a GPU version of NumPy; it can be used for many kinds of numerical computation, and is not specific to Chainer. CuPy official website CuPy github To set up Chainer, if you are using only the CPU, […]
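With CuPy split out in v2, the installation becomes two separate pip packages:

```shell
# Chainer v2 setup: CuPy is now a separate package.
pip install chainer   # CPU-only use needs just this
pip install cupy      # add CuPy separately for GPU support
```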

Continue reading →

Training RNN with simple sequence dataset


  We learned in the previous post that an RNN is expected to have the ability to remember sequence information. Let's do an easy experiment to check this before trying an actual NLP application. Simple sequence dataset I prepared a simple script to generate a simple integer sequence as follows, Source code: simple_sequence_dataset.py

Its output is, [1 2 2 3 3 3 4 4 4 4 5 5 5 5 5 6 6 6 6 6 6 7 7 7 7 7 7 7 8 8 8 8 8 8 8 8 9 9 9 9 9 9 9 9 9 1 2 2 …, 9 9 9] So the number […]
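The pattern above, where each digit k from 1 to 9 appears k times before the cycle restarts, can be reproduced with NumPy. This is a sketch of the idea; the actual simple_sequence_dataset.py may differ.

```python
import numpy as np

# One cycle: each digit k in 1..9 repeated k times (45 numbers per cycle).
one_cycle = np.repeat(np.arange(1, 10), np.arange(1, 10))
# Repeat the cycle to form a longer dataset.
data = np.tile(one_cycle, 3)
print(one_cycle[:9])  # [1 2 2 3 3 3 4 4 4]
```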

Continue reading →

Recurrent Neural Network (RNN) introduction


[Update 2017.06.11] Add chainer v2 code   How can we deal with sequential data in a deep neural network? This formulation is especially important in the natural language processing (NLP) field. For example, text is made of a sequence of words. If we want to predict the next word from a given sentence, the probability of the next word depends on the whole past sequence of words. So, the neural network needs an ability to “remember” the past sentence to predict the next word. In this chapter, the Recurrent Neural Network (RNN) and Long Short Term Memory (LSTM) are introduced to deal with sequential data. Recurrent Neural Network (RNN) A Recurrent Neural Network is similar to a Multi Layer […]
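The "remembering" is realized by a hidden state that is fed back at every step. In common notation (not taken verbatim from the post):

$$ h_t = \tanh(W_{xh} x_t + W_{hh} h_{t-1} + b_h), \qquad y_t = W_{hy} h_t + b_y $$

The output \( y_t \) thus depends on the whole past input sequence through \( h_{t-1} \).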

Continue reading →

CIFAR-10, CIFAR-100 inference code


  The code structure of the inference/predict stage is quite similar to the MNIST inference code; please read that post for a precise explanation. Here, I will simply show the code and its results. CIFAR-10 inference code The code is uploaded on github as predict_cifar10.py.

This outputs the following result. You can see that even this small CNN successfully classifies most of the images. Of course this is just a simple example, and you can improve the model accuracy by tuning the deep neural network!   CIFAR-100 inference code In the same way, the code is uploaded on github as predict_cifar100.py. CIFAR-100 is in general more difficult than CIFAR-10 because there are more classes to classify but there exist […]

Continue reading →

CIFAR-10, CIFAR-100 training with Convolutional Neural Network

  [Update 2017.06.11] Add chainer v2 code Writing your CNN model This is an example of a small Convolutional Neural Network definition, CNNSmall

  I also made a slightly bigger CNN, called CNNMedium,

  It is nice to know the computational cost of a Convolution layer, which is approximated as, $$ H_I \times W_I \times CH_I \times CH_O \times k ^ 2 $$ \( CH_I \) : input image channels \( CH_O \) : output image channels \( H_I \) : input image height \( W_I \) : input image width \( k \) : kernel size (assuming the same for width & height)   In the above CNN definitions, the size of […]
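A worked example of this approximation makes the scale concrete. The layer sizes below (32×32 RGB input, 32 output channels, 3×3 kernel) are illustrative choices, not taken from the post's CNN definitions.

```python
# Approximate convolution cost: H_I * W_I * CH_I * CH_O * k^2
# Example layer: 32x32 RGB input, 32 output channels, 3x3 kernel
# (illustrative numbers, not the post's).
H_I, W_I, CH_I, CH_O, k = 32, 32, 3, 32, 3
cost = H_I * W_I * CH_I * CH_O * k ** 2
print(cost)  # 884736 multiply-accumulate operations
```

Doubling the kernel size quadruples the cost, while doubling either channel count only doubles it, which is why small kernels with many channels are common.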

Continue reading →

CIFAR-10, CIFAR-100 dataset introduction


  Source code is uploaded on github. CIFAR-10 and CIFAR-100 are small image datasets with classification labels. They are widely used for easy image classification tasks/benchmarks in the research community. Official page: CIFAR-10 and CIFAR-100 datasets In Chainer, the CIFAR-10 and CIFAR-100 datasets can be obtained with a built-in function. Setup code: 

  CIFAR-10 The chainer.datasets.get_cifar10 method is provided in Chainer to get the CIFAR-10 dataset. The dataset is automatically downloaded from https://www.cs.toronto.edu only the first time, and its cache is used from the second time on.

The dataset structure is much the same as the MNIST dataset: it is a TupleDataset. train[i] represents the i-th data point; there are 50000 training examples. The test data structure is the same, with 10000 […]
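The TupleDataset access pattern can be illustrated with dummy data. CIFAR-10 images have shape (3, 32, 32); only a handful of zero-filled stand-ins are created here, so this shows the indexing behaviour, not the real download.

```python
import numpy as np

# Dummy stand-ins with the CIFAR-10 per-image shape (3, 32, 32);
# the real dataset has 50000 train / 10000 test examples.
images = np.zeros((5, 3, 32, 32), dtype=np.float32)
labels = np.zeros(5, dtype=np.int32)
train = list(zip(images, labels))  # behaves like TupleDataset indexing

x, t = train[0]        # train[i] -> (image, label) tuple
print(x.shape, t)      # (3, 32, 32) 0
```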

Continue reading →