This repository was archived by the owner on Dec 16, 2022. It is now read-only.

Commit e5adfd7

Ignore 2 root node types in PTB parsing reader (#2675)
Some variants of the PTB dataset use `TOP` as the root node, rather than `VROOT`. This lets the dataset reader handle both.
1 parent 53a46ab commit e5adfd7

1 file changed, with 1 addition and 1 deletion.

allennlp/data/dataset_readers/penn_tree_bank.py

@@ -69,7 +69,7 @@ def _read(self, file_path):
             self._strip_functional_tags(parse)
             # This is un-needed and clutters the label space.
             # All the trees also contain a root S node.
-            if parse.label() == "VROOT":
+            if parse.label() == "VROOT" or parse.label() == "TOP":
                 parse = parse[0]
             pos_tags = [x[1] for x in parse.pos()] if self._use_pos_tags else None
             yield self.text_to_instance(parse.leaves(), pos_tags, parse)
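
The effect of the changed condition can be sketched in isolation. This is a minimal illustration and not the reader's actual code: the `Tree` class is a hypothetical stand-in that mimics only the two features of the NLTK tree used here (`label()` and child indexing), so the sketch runs without NLTK installed. `ROOT_LABELS` and `strip_root` are likewise names invented for this example.

```python
from typing import List, Union


class Tree(list):
    """Hypothetical stand-in for an NLTK tree: a labelled list of children."""

    def __init__(self, label: str, children: List[Union["Tree", str]]) -> None:
        super().__init__(children)
        self._label = label

    def label(self) -> str:
        return self._label


# Root labels that wrap the sentence node in different PTB variants.
ROOT_LABELS = ("VROOT", "TOP")


def strip_root(parse: Tree) -> Tree:
    """Return the first child if the tree is wrapped in a known root node."""
    if parse.label() in ROOT_LABELS:
        return parse[0]
    return parse


sentence = Tree("S", [Tree("NP", ["Dogs"]), Tree("VP", ["bark"])])

# Both wrapper labels are removed, exposing the S node underneath.
for root in ROOT_LABELS:
    assert strip_root(Tree(root, [sentence])).label() == "S"

# A tree without a wrapper root is returned unchanged.
assert strip_root(sentence) is sentence
```

A membership test against a tuple of labels behaves the same as the chained `or` comparison in the diff, and would make a further root label a one-item change.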
