Design patterns for defining a model

[Update 2017.06.11] Added Chainer v2 code

Machine learning consists of a training phase and a predict/inference phase, and what the model needs to calculate differs between them:

  • Training phase: calculate loss (between output and target)
  • Predict/Inference phase: calculate output

I often see the following two patterns for handling this difference.

Predictor – Classifier framework

See ( and are also implemented in Predictor – Classifier framework)

This framework uses two Chain classes, “Predictor” and “Classifier”.

  • Training phase: the Predictor’s output is fed into the Classifier to calculate the loss.
  • Predict/Inference phase: only the Predictor’s output is used.
  • Predictor

Predictor simply calculates output based on input.

import chainer
import chainer.functions as F
import chainer.links as L

# Network definition Chainer v2
# 1. `init_scope()` is used to initialize links for IDE friendly design.
# 2. input size of Linear layer can be omitted
class MLP(chainer.Chain):

    def __init__(self, n_units, n_out):
        super(MLP, self).__init__()
        with self.init_scope():
            # input size of each layer will be inferred when omitted
            self.l1 = L.Linear(n_units)  # n_in -> n_units
            self.l2 = L.Linear(n_units)  # n_units -> n_units
            self.l3 = L.Linear(n_out)    # n_units -> n_out

    def __call__(self, x):
        h1 = F.relu(self.l1(x))
        h2 = F.relu(self.l2(h1))
        return self.l3(h2)

# Instantiate the predictor: args.unit hidden units, 10 outputs
model = mlp.MLP(args.unit, 10)

  • Classifier

Classifier “wraps” the predictor: it takes the predictor’s output y and calculates the loss between y and the actual target t.

classifier_model = L.Classifier(model)
optimizer.update(classifier_model, x, t)

This call invokes classifier_model(x, t) internally, calculates the loss, and updates the model’s parameters by backpropagation.

Refer to the source code of Classifier for details.
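To make the wrapping idea concrete, here is a minimal framework-free sketch. It is not Chainer's actual Classifier implementation; `SimpleClassifier`, `double` and `squared_error` are hypothetical names used only for illustration, and plain Python floats stand in for arrays.

```python
class SimpleClassifier:
    """Framework-free sketch of the wrapper idea; not Chainer's actual code."""

    def __init__(self, predictor, lossfun):
        self.predictor = predictor  # callable: x -> y
        self.lossfun = lossfun      # callable: (y, t) -> scalar loss
        self.y = None
        self.loss = None

    def __call__(self, x, t):
        # Training phase: run the predictor, then compare its output with t.
        self.y = self.predictor(x)
        self.loss = self.lossfun(self.y, t)
        return self.loss


def double(x):
    """Hypothetical stand-in for a predictor network."""
    return 2.0 * x


def squared_error(y, t):
    """Hypothetical stand-in for a loss function."""
    return (y - t) ** 2


classifier = SimpleClassifier(double, squared_error)
loss = classifier(3.0, 5.0)    # training phase: loss between y and t
y = classifier.predictor(3.0)  # inference phase: predictor output only
```

The point of the design is that the predictor never needs to know about the target t, so the same wrapper can be reused with any predictor whose output suits the loss function.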

Train flag framework

[Update] Chainer v2 introduced the global flag chainer.config.train, so this framework may no longer be the recommended approach.


Both the loss calculation for the training phase and the prediction code for the inference phase are implemented within one model, and the behavior is switched by a “train flag” (or “test flag”/“predict flag”).

# Network definition
class MLP(chainer.Chain):

    def __init__(self, n_units, n_out):
        super(MLP, self).__init__()
        with self.init_scope():
            self.l1 = L.Linear(None, n_units)  # n_in -> n_units
            self.l2 = L.Linear(None, n_units)  # n_units -> n_units
            self.l3 = L.Linear(None, n_out)  # n_units -> n_out

        # Define train flag
        self.train = True

    def __call__(self, x, t=None):
        h1 = F.relu(self.l1(x))
        h2 = F.relu(self.l2(h1))
        y = self.l3(h2)
        if self.train:
            # return loss in training phase
            self.loss = F.softmax_cross_entropy(y, t)
            self.accuracy = F.accuracy(y, t)
            return self.loss
        else:
            # return y in predict/inference phase
            return y

By default, self.train = True, so the model calculates the loss and the optimizer can update its internal parameters.

To predict a value, set the train flag to False:

model.train = False
y = model(x)
# model.train = True  # if necessary
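Resetting the flag by hand is easy to forget. One possible way to avoid that is a small context manager that restores the previous value on exit. This is only a sketch: `predict_mode` and `DummyModel` are hypothetical names, not part of Chainer, and `DummyModel` merely mimics the train-flag behavior of the MLP above.

```python
from contextlib import contextmanager


@contextmanager
def predict_mode(model):
    """Temporarily set model.train to False, restoring the old value on exit."""
    previous = model.train
    model.train = False
    try:
        yield model
    finally:
        model.train = previous


class DummyModel:
    """Hypothetical stand-in for a model that switches on a train flag."""

    def __init__(self):
        self.train = True

    def __call__(self, x, t=None):
        if self.train:
            return abs(x - t)  # stands in for the loss
        return x               # stands in for the prediction


model = DummyModel()
with predict_mode(model) as m:
    y = m(4)  # inference: no target needed
# Outside the block, model.train is back to its previous value.
```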


The Predictor – Classifier framework has the advantage that the Classifier module is independent of the predictor and therefore reusable. However, it is difficult to apply when the loss calculation is complicated.

In the train flag framework, the training loss calculation and the prediction calculation can be independent. You can implement any loss calculation, even one that is very different from the prediction calculation.

As a rule of thumb, use the Predictor – Classifier framework if the loss function is typical, and the train flag framework otherwise.

Next: Writing organized, reusable, clean training code using Trainer module
