This repository was archived by the owner on Nov 17, 2023. It is now read-only.
loss for training and evaluation in estimator could be different #16879
Closed
Description
In the current estimator implementation, `fit_batch` and `evaluate_batch` use the same loss function. The code snippet from `fit_batch` is shown below:
```python
with autograd.record():
    pred = [self.net(x) for x in data]
    loss = [self.loss(y_hat, y) for y_hat, y in zip(pred, label)]
```
The code snippet from `evaluate_batch` is shown below:
```python
data, label = self._get_data_and_label(val_batch, self.context, batch_axis)
pred = [self.net(x) for x in data]
loss = [self.loss(y_hat, y) for y_hat, y in zip(pred, label)]
```
Both training and evaluation use the same loss function, `self.loss`, to compute the batch loss. In many use cases this assumption does not hold. For example, when training an LSTM language model, a user may use a joint regularization loss during training but standard cross-entropy during evaluation.
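A minimal sketch of that motivating example is shown below. The class name, the `alpha` parameter, and the extra `states` argument are illustrative assumptions, not part of any existing API: the idea is simply a training loss that wraps `SoftmaxCrossEntropyLoss` and adds an L2 activation-regularization term, while evaluation uses the plain cross-entropy loss.

```python
# Hypothetical training-only loss: cross-entropy plus an L2 penalty on the
# LSTM hidden states. Evaluation would use plain SoftmaxCrossEntropyLoss.
from mxnet.gluon.loss import Loss, SoftmaxCrossEntropyLoss

class JointRegularizationLoss(Loss):
    """Cross-entropy plus activation regularization on the hidden states."""

    def __init__(self, alpha=2.0, weight=None, batch_axis=0, **kwargs):
        super(JointRegularizationLoss, self).__init__(weight, batch_axis, **kwargs)
        self._alpha = alpha
        self._ce = SoftmaxCrossEntropyLoss()

    def hybrid_forward(self, F, pred, label, states):
        ce = self._ce(pred, label)                    # per-sample cross-entropy
        ar = self._alpha * F.mean(F.square(states))   # activation regularization
        # broadcast the scalar penalty onto the per-sample losses
        return F.broadcast_add(ce, ar)

# Two different losses for the two phases:
train_loss = JointRegularizationLoss(alpha=2.0)
eval_loss = SoftmaxCrossEntropyLoss()
```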
When writing a customized estimator, it is cumbersome to define a new loss when evaluation does not share the same loss as training. It would be good if the estimator API included two losses, `self.train_loss` and `self.evaluate_loss`, to handle the two cases separately.
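For illustration, here is a hedged sketch of one possible shape for this change, mirroring the two snippets quoted above. The `TwoLossEstimator` class and its `train_loss`/`evaluate_loss` attributes are hypothetical, not the current Estimator signature:

```python
# Illustrative-only fragment; train_loss and evaluate_loss are the
# hypothetical attributes proposed above, not part of the current API.
import mxnet as mx
from mxnet import autograd

class TwoLossEstimator(object):
    def __init__(self, net, train_loss, evaluate_loss, context=None):
        self.net = net
        self.train_loss = train_loss          # e.g. a joint regularization loss
        self.evaluate_loss = evaluate_loss    # e.g. plain cross-entropy
        self.context = context or [mx.cpu()]

    def fit_batch(self, data, label):
        # mirrors the current fit_batch, but with the training-specific loss
        with autograd.record():
            pred = [self.net(x) for x in data]
            loss = [self.train_loss(y_hat, y) for y_hat, y in zip(pred, label)]
        for l in loss:
            l.backward()
        return loss

    def evaluate_batch(self, data, label):
        # mirrors the current evaluate_batch, but with the evaluation loss
        pred = [self.net(x) for x in data]
        return [self.evaluate_loss(y_hat, y) for y_hat, y in zip(pred, label)]
```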