MNIST training with Multi Layer Perceptron

Training MNIST: You have already studied the basics of Chainer and the MNIST dataset. Now we can proceed to the MNIST classification task. We want to create a classifier that classifies an MNIST handwritten image into its digit. In other words, the classifier takes an array representing an MNIST image as input and outputs its label. ※ Chainer has library classes called Trainer, Iterator, and Updater, which make your training code more organized. However, this abstraction makes it difficult to understand what is going on during training. For those who want to learn deep learning in more detail, I think it is good to know the "primitive way" of training. Therefore, I decided to daringly […]
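As a minimal sketch of the kind of classifier this post builds (the layer sizes such as n_units=100 are illustrative assumptions, not values from the post), a multi-layer perceptron in Chainer might look like:

```python
import chainer
import chainer.functions as F
import chainer.links as L

class MLP(chainer.Chain):
    """Simple 3-layer perceptron: 784-dim MNIST input -> 10 class scores."""
    def __init__(self, n_units=100, n_out=10):
        super(MLP, self).__init__()
        with self.init_scope():
            self.l1 = L.Linear(None, n_units)  # input size inferred on first call
            self.l2 = L.Linear(None, n_units)
            self.l3 = L.Linear(None, n_out)

    def __call__(self, x):
        h = F.relu(self.l1(x))
        h = F.relu(self.l2(h))
        return self.l3(h)  # raw class scores (softmax applied in the loss function)
```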

Continue reading →

MNIST dataset introduction


MNIST dataset: The MNIST (Mixed National Institute of Standards and Technology) database is a dataset of handwritten digits, distributed via Yann LeCun's THE MNIST DATABASE of handwritten digits website (Wikipedia). The dataset consists of pairs of a "handwritten digit image" and a "label". The digit ranges from 0 to 9, meaning 10 patterns in total. Handwritten digit image: a grayscale image of size 28 x 28 pixels. Label: the actual digit that the handwritten image represents, from 0 to 9. The MNIST dataset is widely used for "classification" and "image recognition" tasks. It is considered a relatively simple task, and it is often used as the "Hello world" program of machine learning […]
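As a hedged illustration of the dataset's shape, using Chainer's built-in loader (which the full post may or may not use), something like the following shows the 28 x 28 = 784-dimensional images and their 0 to 9 labels:

```python
from chainer.datasets import get_mnist

# get_mnist() downloads MNIST on first use and returns (train, test) datasets;
# each element is a tuple (image, label)
train, test = get_mnist()
image, label = train[0]
print(image.shape)  # (784,): a 28 x 28 grayscale image flattened to float32 values in [0, 1]
print(label)        # an integer between 0 and 9
```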

Continue reading →

Why Chainer?

I will list the good points of Chainer, as an opinion from one Chainer enthusiast. Features. Easy environment setup: Environment setup is easy; execute one command, pip install chainer, and that's all. Some deep learning frameworks are written in C/C++ and require you to build them yourself, and it can take hours just to set up your development environment. Easy to debug: Chainer is fully written in Python, and you can see the stack trace when an error is raised. You can print logs during the neural network calculation, which cannot be done with many other frameworks where the neural network model needs to be pre-compiled (static graph frameworks). Flexible: Chainer is flexible […]

Continue reading →

Chainer basic module introduction 2

This post is just a copy of chainer_module2.ipynb on GitHub; you can execute it interactively using Jupyter Notebook. Advanced memos are written as "Note"; you can skip them on a first reading. In the previous tutorial, we learned Variable, Link, Function, and Chain. Let's try training the model (Chain) in this tutorial. In this section, we will learn Optimizer (optimizes/tunes the internal parameters to fit the target function) and Serializer (handles saving/loading the model (Chain)). Other Chainer modules are explained in later tutorials. Training: What we want to do here is regression analysis (Wikipedia). Given a set of inputs x and their outputs y, we would like to construct a model (function) […]
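As a rough sketch of what Optimizer and Serializer do (the toy data and learning rate here are assumptions, not the tutorial's actual example), a minimal regression loop might look like:

```python
import numpy as np
import chainer.functions as F
import chainer.links as L
from chainer import optimizers, serializers

# A one-layer linear model y = W x + b for a toy regression problem
model = L.Linear(1, 1)

# Optimizer: tunes the model's internal parameters to reduce the loss
optimizer = optimizers.SGD(lr=0.01)
optimizer.setup(model)

x = np.array([[1.0], [2.0], [3.0]], dtype=np.float32)
t = np.array([[2.0], [4.0], [6.0]], dtype=np.float32)

for epoch in range(100):
    y = model(x)
    loss = F.mean_squared_error(y, t)
    model.cleargrads()   # reset gradients from the previous iteration
    loss.backward()      # backpropagation
    optimizer.update()   # gradient descent step

# Serializer: save and later reload the trained parameters
serializers.save_npz('regression.npz', model)
serializers.load_npz('regression.npz', model)
```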

Continue reading →

Chainer basic module introduction

This post is just a copy of chainer_module.ipynb on GitHub; you can execute it interactively using Jupyter Notebook. Advanced memos are written as "Note"; you can skip them on a first reading. In this tutorial, basic Chainer modules are introduced and explained: Variable, Link, Function, and Chain. Other Chainer modules are explained in later tutorials. Initial setup: Below is a typical import statement for Chainer modules.
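The import block itself was dropped from this excerpt; a typical set of Chainer imports (the exact lines in the notebook may differ) looks like:

```python
import numpy as np
import chainer
from chainer import Function, Variable, optimizers, serializers
import chainer.functions as F
import chainer.links as L
```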

Variable: A Chainer variable can be created with the Variable constructor, which creates a chainer.Variable class object. When I write Variable, it means Chainer's Variable class; please do not confuse it with the usual noun "variable". Note: the reason why Chainer needs to […]
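For example (a minimal sketch with illustrative values), creating a Variable and backpropagating through it:

```python
import numpy as np
from chainer import Variable

# Variable wraps a numpy (or cupy) array and records operations for backprop
x = Variable(np.array([1.0, 2.0, 3.0], dtype=np.float32))
y = x ** 2 + 1

# For a non-scalar output, set the initial gradient before calling backward()
y.grad = np.ones(3, dtype=np.float32)
y.backward()

print(x.data)  # the underlying array: [1. 2. 3.]
print(x.grad)  # dy/dx = 2x -> [2. 4. 6.]
```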

Continue reading →

Chainer modules

Comparing with Caffe: If you are using Caffe, it is easy to get accustomed to Chainer's modules. The functionality of several components is similar, and you can see the table below for the correspondence.

| Chainer | Caffe | Comment |
| --- | --- | --- |
| datasets | Data layers | Input data can be formatted into this class for the model input. It covers most use cases of input data structure. |
| variable | blob | An input and output of functions/links/Chain. |
| functions | layers | The framework supports widely used functions in deep learning, e.g. sigmoid, tanh, ReLU. |
| links | layers | The framework supports widely used layers in deep learning, e.g. Linear layer, Convolutional layer. |
| Chain | net | Links and functions (layers) are joined to […] |

Continue reading →

English summary of "Chainer Advent Calendar 2016"

I will summarize the Chainer Advent Calendar 2016 to introduce what kind of deep learning projects are going on in Japan. Chainer: The deep learning framework Chainer is developed by the Japanese company Preferred Networks, and the community is quite active in Japan. Since interesting projects are presented and discussed there, I would like to introduce them by summarizing them in English. Advent Calendar: In Japan, there is a popular programming blog website called Qiita, and an "Advent Calendar" on Qiita is an event where interested people can join and write a blog post on a specific theme between Dec 1 and Dec 25. Advent Calendar 2016: Chainer Advent Calendar 2016. If you want to read the […]

Continue reading →

python print, string manipulation summary

Version difference. Python 3.x: print() is a function and no longer a keyword; () must be used.

Python 2.x: print is a statement (keyword), so () is not necessary.

The above syntax cannot be used in Python 3! But you can use a __future__ import if you want to use print as a function for Python 3 compatibility.
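To illustrate the difference and the compatibility import (a hedged example, not code from the original post):

```python
# Python 2 only: print is a statement, so parentheses are not required
# print "hello"

# Works in Python 3, and in Python 2 once the __future__ import is in effect
from __future__ import print_function
print("hello")
```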

Summary: It is better to use print() from now on for future compatibility. Ref: What is __future__ in Python used for and how/when to use it, and how it works; About the __future__ module (__future__ モジュールについて). Single quotation and double quotation for strings: There is no difference between 'word' and "word" in […]

Continue reading →

AtCoder Grand Contest 007 review

Link: AGC 007 top page, Editorial. I felt this contest required more inventiveness/creativity, and it was difficult. A – Shik and Stone (Problem): Compared to ARC (regular contest), it is slightly difficult even from problem A. The most elegant solution is to just check whether the number of '#' is equal to \(W+H-1\), as explained in the Editorial. However, I did not notice it, so I instead checked whether we can actually reach the goal using only right or down moves.
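A sketch of the counting check described above (the exact input/output format is an assumption from memory of the problem, so treat it as illustrative):

```python
# Read an H x W grid and decide feasibility by counting '#' cells:
# a monotone right/down path from the top-left to the bottom-right cell
# visits exactly H + W - 1 cells.
H, W = map(int, input().split())
grid = [input() for _ in range(H)]
sharp_count = sum(row.count('#') for row in grid)
print('Possible' if sharp_count == H + W - 1 else 'Impossible')
```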

B – Construct Sequences (Problem): You need to come up with a construction method that satisfies the 3 conditions. The Editorial's way is much easier than the one below […]

Continue reading →

AtCoder Regular Contest 063 review

Link: ARC 063 top page, Editorial. C – 一次元リバーシ / 1D Reversi (Problem)

D – 高橋君と見えざる手 / An Invisible Hand (Problem): I didn't notice the restriction that each \(A_i\) is different, so the implementation took time.

E – 木と整数 / Integers on a Tree (Problem): Follow the editorial.


Continue reading →