Refactoring MNIST training


In the previous section, we learned a minimum implementation of the training code for MNIST. Now, let's refactor the code.



argparse is used to make the script configurable: the user can pass variables when executing the code from the command line.

The code below is added to the training code.
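The original listing is not reproduced here, but a minimal sketch of such an argparse block, with option names and defaults assumed to follow the common Chainer MNIST example, could look like this:

```python
import argparse

# Hypothetical options for the MNIST training script; the names and
# defaults are assumptions modeled on the common Chainer MNIST example.
parser = argparse.ArgumentParser(description='Chainer example: MNIST')
parser.add_argument('--batchsize', '-b', type=int, default=100,
                    help='Number of images in each mini-batch')
parser.add_argument('--epoch', '-e', type=int, default=20,
                    help='Number of sweeps over the dataset to train')
parser.add_argument('--gpu', '-g', type=int, default=-1,
                    help='GPU ID (negative value indicates CPU)')
parser.add_argument('--out', '-o', default='result',
                    help='Directory to output the result')
parser.add_argument('--resume', '-r', default='',
                    help='Resume the training from snapshot')
parser.add_argument('--unit', '-u', type=int, default=50,
                    help='Number of units')

# In the real script this would be parser.parse_args(), which reads
# sys.argv; an explicit (empty) list is used here only for demonstration.
args = parser.parse_args([])
print(args.batchsize, args.epoch, args.gpu)  # -> 100 20 -1
```

Each `add_argument` call registers both a long form (`--batchsize`) and a short form (`-b`), plus a default value that is used when the option is omitted.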

These variables can then be configured when executing the code from the console, and accessed in the script through the parsed arguments object (e.g. args.batchsize, args.epoch, etc.).

For example, you can set the GPU device number to 0 by passing the gpu option on the command line.


Adding "=" between the option name and its value works the same way.
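As a sketch (assuming the GPU option is registered as `-g`/`--gpu`), all of the following spellings parse to the same value:

```python
import argparse

# All three spellings below are parsed identically by argparse
# (the option name --gpu/-g is an assumption).
parser = argparse.ArgumentParser()
parser.add_argument('--gpu', '-g', type=int, default=-1)

print(parser.parse_args(['-g', '0']).gpu)     # -> 0
print(parser.parse_args(['--gpu', '0']).gpu)  # -> 0
print(parser.parse_args(['--gpu=0']).gpu)     # -> 0
```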


You can also see which options are available using the --help option, or simply -h.
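The help text is generated automatically by argparse from the registered options; a small sketch (option names assumed):

```python
import argparse

# argparse generates the --help / -h text from the registered options.
parser = argparse.ArgumentParser(description='Chainer example: MNIST')
parser.add_argument('--batchsize', '-b', type=int, default=100,
                    help='Number of images in each mini-batch')
parser.add_argument('--epoch', '-e', type=int, default=20,
                    help='Number of sweeps over the dataset to train')
print(parser.format_help())  # the same text a user sees with --help or -h
```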




[hands on] Try running the training with 10 epochs.

It can be done by passing the epoch option from the command line, and you don't need to modify the Python source code thanks to the argparse configuration.
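Assuming the epoch option is registered as `-e`/`--epoch`, passing `-e 10` on the command line has this effect:

```python
import argparse

# Simulates passing "-e 10" on the command line
# (the option name --epoch/-e is an assumption).
parser = argparse.ArgumentParser()
parser.add_argument('--epoch', '-e', type=int, default=20)
args = parser.parse_args(['-e', '10'])
print(args.epoch)  # -> 10
```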


[hands on] If you have a GPU, use it for training with a model unit size of 1000.


Save/resume training

Saving and loading the model or the optimizer can be done with serializers. The code below saves the training result; the directory for the results can be configured with the -o option.


If you want to resume training from a previous training result, load the model and the optimizer before starting the training.

The optimizer also owns internal parameters, so it needs to be loaded as well to resume training. For example, Adam holds the "first moment" m and "second moment" v of the gradient, as explained in the section about Adam.


[hands on] After a first training run, check that resuming works by running the script again with the resume option.

You can confirm that the pre-trained model is used, since the accuracy is already high (around 98%) from the beginning.


Note that this loading code is not executed when no resume configuration is specified: in that case the model and optimizer are not loaded, and training starts from the default initial values.


Built-in Link L.Classifier is used instead of the custom class SoftmaxClassifier

I implemented SoftmaxClassifier to let you understand the loss calculation (using softmax for a classification task). However, most classification tasks use this same loss calculation, and it is already supported as the built-in Link L.Classifier.

You can consider using L.Classifier when coding a classification task.

[hands on]

Read the source code of L.Classifier, and compare it with the SoftmaxClassifier implemented earlier.


Next: Design patterns for defining model
