In the previous section, we learned the minimum implementation (train_mnist_1_minimum.py) of the training code for MNIST. Now, let's refactor the code.
See train_mnist_2_predictor_classifier.py.
argparse
argparse is used to make the script configurable: users can pass variables when executing the code from the command line.
The following code is added to the training script:
import argparse


def main():
    parser = argparse.ArgumentParser(description='Chainer example: MNIST')
    parser.add_argument('--initmodel', '-m', default='',
                        help='Initialize the model from given file')
    parser.add_argument('--batchsize', '-b', type=int, default=100,
                        help='Number of images in each mini-batch')
    parser.add_argument('--epoch', '-e', type=int, default=20,
                        help='Number of sweeps over the dataset to train')
    parser.add_argument('--gpu', '-g', type=int, default=-1,
                        help='GPU ID (negative value indicates CPU)')
    parser.add_argument('--out', '-o', default='result/2',
                        help='Directory to output the result')
    parser.add_argument('--resume', '-r', default='',
                        help='Resume the training from snapshot')
    parser.add_argument('--unit', '-u', type=int, default=50,
                        help='Number of units')
    args = parser.parse_args()
    ...
These variables then become configurable when executing the code from the console, and they can be accessed as args.xxx (e.g. args.batchsize, args.epoch, etc.).
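The same pattern can be tried in isolation. Below is a small standalone sketch (not part of the tutorial script); passing a list to parse_args() simulates command-line input:

import argparse

# Minimal demo of the argparse pattern: every add_argument call becomes
# an attribute on the namespace returned by parse_args().
parser = argparse.ArgumentParser(description='argparse demo')
parser.add_argument('--batchsize', '-b', type=int, default=100)
parser.add_argument('--epoch', '-e', type=int, default=20)
args = parser.parse_args(['--batchsize', '32'])  # simulates "-b 32" on the command line

print(args.batchsize)  # 32, the overridden value
print(args.epoch)      # 20, the default value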
For example, to set the GPU device number to 0:
$ python train_mnist_2_predictor_classifier.py -g 0
or
$ python train_mnist_2_predictor_classifier.py --gpu 0
or even with "=", which works the same:
$ python train_mnist_2_predictor_classifier.py --gpu=0
You can also see what options are available using the --help option, or simply -h.
xxx:~/workspace/pycharm/chainer-hands-on-tutorial/src/mnist$ python train_mnist_2_predictor_classifier.py -h
usage: train_mnist_2_predictor_classifier.py [-h] [--initmodel INITMODEL]
                                             [--batchsize BATCHSIZE]
                                             [--epoch EPOCH] [--gpu GPU]
                                             [--out OUT] [--resume RESUME]
                                             [--unit UNIT]

Chainer example: MNIST

optional arguments:
  -h, --help            show this help message and exit
  --initmodel INITMODEL, -m INITMODEL
                        Initialize the model from given file
  --batchsize BATCHSIZE, -b BATCHSIZE
                        Number of images in each mini-batch
  --epoch EPOCH, -e EPOCH
                        Number of sweeps over the dataset to train
  --gpu GPU, -g GPU     GPU ID (negative value indicates CPU)
  --out OUT, -o OUT     Directory to output the result
  --resume RESUME, -r RESUME
                        Resume the training from snapshot
  --unit UNIT, -u UNIT  Number of units
[hands on] Try running the training for 10 epochs.
It can be done by
$ python train_mnist_2_predictor_classifier.py -e 10
and you don't need to modify the Python source code, thanks to argparse.
[hands on] If you have a GPU, use it for training with a model unit size of 1000.
$ python train_mnist_2_predictor_classifier.py -g 0 -u 1000
save/resume training
Saving and loading the model or the optimizer can be done using serializers. The code below saves the training result; the directory to save to can be configured with the -o option.
parser.add_argument('--out', '-o', default='result/2',
                    help='Directory to output the result')

...

# Save the model and the optimizer
print('save the model')
serializers.save_npz('{}/classifier.model'.format(args.out), classifier_model)
serializers.save_npz('{}/mlp.model'.format(args.out), model)
print('save the optimizer')
serializers.save_npz('{}/mlp.state'.format(args.out), optimizer)
If you want to resume training based on a previous training result, load the model and the optimizer before starting training. The optimizer also owns internal parameters and thus needs to be loaded to resume training; for example, Adam holds the "first moment" m and the "second moment" v described in the Adam paper.
parser.add_argument('--initmodel', '-m', default='',
                    help='Initialize the model from given file')
parser.add_argument('--resume', '-r', default='',
                    help='Resume the training from snapshot')

...

# Init/Resume
if args.initmodel:
    print('Load model from', args.initmodel)
    serializers.load_npz(args.initmodel, classifier_model)
if args.resume:
    print('Load optimizer state from', args.resume)
    serializers.load_npz(args.resume, optimizer)
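As a side note, serializers.save_npz writes an ordinary NumPy .npz archive, so you can peek at the saved optimizer state directly. The sketch below assumes the default output directory result/2 used above; the exact key names depend on the model and optimizer, but for Adam you should find per-parameter entries for m and v.

import numpy as np

# Inspect the optimizer state saved above (a standard .npz archive).
# The path assumes the default --out directory 'result/2'.
with np.load('result/2/mlp.state') as state:
    for name in state.files:
        print(name, state[name].shape)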
[hands on] Check that resuming works after the first training run, by running:
xxx:~/workspace/pycharm/chainer-hands-on-tutorial/src/mnist$ python train_mnist_2_predictor_classifier.py -m result/2/classifier.model -r result/2/mlp.state
GPU: -1
# unit: 50
# Minibatch-size: 100
# epoch: 20
Load model from result/2/classifier.model
Load optimizer state from result/2/mlp.state
epoch 1
graph generated
train mean loss=0.037441188701771655, accuracy=0.9890166732668877, throughput=57888.5195400998 images/sec
test mean loss=0.1542429528321469, accuracy=0.974500007033348
You can see that the pre-trained model is used and the accuracy is already high (about 98%) from the beginning.
Note that this code is not executed if no configuration is specified: the model and optimizer are not loaded, and the default initial values are used.
Classifier
The built-in Link L.Classifier is used instead of the custom class SoftmaxClassifier from train_mnist_1_minimum.py.
classifier_model = L.Classifier(model)
I implemented SoftmaxClassifier to let you understand the loss calculation (using softmax for the classification task). However, most classification tasks use this same loss, and it is already supported by the built-in Link L.Classifier. You can consider using L.Classifier when coding a classification task.
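For reference, the behavior of L.Classifier can be roughly sketched as below. This is a simplified illustration written in the Chainer v2 init_scope style, not the actual library source; the real L.Classifier also accepts custom loss and accuracy functions.

import chainer
import chainer.functions as F
from chainer import reporter


# Simplified sketch of what L.Classifier does: wrap a predictor, compute
# softmax cross entropy loss and accuracy, and report them.
class SimpleClassifier(chainer.Chain):

    def __init__(self, predictor):
        super(SimpleClassifier, self).__init__()
        with self.init_scope():
            self.predictor = predictor

    def __call__(self, x, t):
        y = self.predictor(x)                      # forward pass of the wrapped model
        self.loss = F.softmax_cross_entropy(y, t)  # same loss as SoftmaxClassifier
        self.accuracy = F.accuracy(y, t)
        reporter.report({'loss': self.loss, 'accuracy': self.accuracy}, self)
        return self.loss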
[hands on] Read the source code of L.Classifier, and compare it with SoftmaxClassifier in train_mnist_1_minimum.py.