
Commit 00bb6c5

dirkgr and epwalsh authored
Be sure to close the TensorBoard writer (#4731)
* Be sure to close the tensorboard writer
* Changelog
* unindent

Co-authored-by: Evan Pete Walsh <[email protected]>
1 parent 3f23938 commit 00bb6c5

File tree

2 files changed: +10 -4 lines changed


CHANGELOG.md

+1
@@ -81,6 +81,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Fixed a bug in distributed training where the vocab would be saved from every worker, when it should have been saved by only the local master process.
 - Fixed a bug in the calculation of rouge metrics during distributed training where the total sequence count was not being aggregated across GPUs.
 - Fixed `allennlp.nn.util.add_sentence_boundary_token_ids()` to use `device` parameter of input tensor.
+- Be sure to close the TensorBoard writer even when training doesn't finish.
 - Fixed the docstring for `PyTorchSeq2VecWrapper`.

 ## [v1.1.0](https://github.com/allenai/allennlp/releases/tag/v1.1.0) - 2020-09-08

allennlp/training/trainer.py

+9 -4
@@ -965,6 +965,13 @@ def train(self) -> Dict[str, Any]:
         """
         Trains the supplied model with the supplied parameters.
         """
+        try:
+            return self._try_train()
+        finally:
+            # make sure pending events are flushed to disk and files are closed properly
+            self._tensorboard.close()
+
+    def _try_train(self) -> Dict[str, Any]:
         try:
             epoch_counter = self._restore_checkpoint()
         except RuntimeError:
@@ -1068,7 +1075,8 @@ def train(self) -> Dict[str, Any]:
 
             if self._serialization_dir and self._master:
                 common_util.dump_metrics(
-                    os.path.join(self._serialization_dir, f"metrics_epoch_{epoch}.json"), metrics
+                    os.path.join(self._serialization_dir, f"metrics_epoch_{epoch}.json"),
+                    metrics,
                 )
 
             # The Scheduler API is agnostic to whether your schedule requires a validation metric -
@@ -1106,9 +1114,6 @@ def train(self) -> Dict[str, Any]:
         for callback in self._end_callbacks:
             callback(self, metrics=metrics, epoch=epoch, is_master=self._master)
 
-        # make sure pending events are flushed to disk and files are closed properly
-        self._tensorboard.close()
-
         # Load the best model state before returning
         best_model_state = self._checkpointer.best_model_state()
         if best_model_state:
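
For context, the fix moves the body of `train()` into a new `_try_train()` method and wraps the call in `try`/`finally`, so the TensorBoard writer is flushed and closed whether training returns normally, raises, or is interrupted. Below is a minimal sketch of the same pattern, assuming `torch.utils.tensorboard.SummaryWriter` is used directly rather than AllenNLP's tensorboard wrapper; the `train`/`_try_train` functions, the loop body, and `log_dir` are illustrative stand-ins, not the library's code.

```python
from typing import Dict

from torch.utils.tensorboard import SummaryWriter


def train(num_epochs: int = 3, log_dir: str = "runs/demo") -> Dict[str, float]:
    # Hypothetical stand-in for Trainer.train(): the writer is created up front,
    # and the finally block guarantees it is closed even if _try_train() raises
    # or the process is interrupted.
    writer = SummaryWriter(log_dir=log_dir)
    try:
        return _try_train(writer, num_epochs)
    finally:
        # make sure pending events are flushed to disk and files are closed properly
        writer.close()


def _try_train(writer: SummaryWriter, num_epochs: int) -> Dict[str, float]:
    metrics: Dict[str, float] = {}
    for epoch in range(num_epochs):
        loss = 1.0 / (epoch + 1)  # placeholder for a real training step
        writer.add_scalar("loss", loss, epoch)
        metrics["loss"] = loss
    return metrics
```

Before this change, `close()` was only reached at the very end of `train()`, so an exception partway through training could leave buffered TensorBoard events unwritten; the `finally` clause closes the writer on every exit path.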
