This repository was archived by the owner on Dec 16, 2022. It is now read-only.
AllenNLP's data piece could be more pythonic #1633
Closed
Description
allennlp's data API has become pretty stable and effective. However, it's still a bit unwieldy to manipulate individual Fields and Instances. There are a few things we could do to make them easier to use for a newcomer:
1. Manipulating Fields in Instances. Instance could inherit from MutableMapping (so could MetadataField, in the same vein).
# Currently:
specific_field = instance.fields["field_name"]
# Ideal:
specific_field = instance["field_name"]
assert "field_name" in instance
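As a rough illustration of the first proposal, here is a minimal sketch of an Instance that delegates the mapping protocol to its existing fields dict. The Field and Instance classes below are simplified stand-ins written for this example, not AllenNLP's actual classes; only the idea of inheriting from MutableMapping comes from the proposal above.

```python
from collections.abc import MutableMapping
from typing import Dict, Iterator


class Field:
    """Hypothetical stand-in for allennlp's Field, for illustration only."""


class Instance(MutableMapping):
    """Sketch: an Instance that forwards mapping operations to `self.fields`."""

    def __init__(self, fields: Dict[str, Field]) -> None:
        self.fields = dict(fields)

    def __getitem__(self, name: str) -> Field:
        return self.fields[name]

    def __setitem__(self, name: str, field: Field) -> None:
        self.fields[name] = field

    def __delitem__(self, name: str) -> None:
        del self.fields[name]

    def __iter__(self) -> Iterator[str]:
        # Iterating an Instance yields field names, like a dict.
        return iter(self.fields)

    def __len__(self) -> int:
        return len(self.fields)


label = Field()
instance = Instance({"field_name": label})

specific_field = instance["field_name"]   # instead of instance.fields["field_name"]
has_field = "field_name" in instance      # __contains__ comes from the Mapping mixin
```

Defining those five methods is enough: MutableMapping supplies `__contains__`, `keys()`, `items()`, `get()`, `update()` and the rest as mixin methods, and the existing `instance.fields` access keeps working.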
2. Iterating over SequenceFields and ListFields using Field.__iter__:
# Before
fields = instance.fields
tokens = [t.text for t in fields['tokens'].tokens]
assert tokens == ["Mali", "government", "officials", "say", "the", "woman", "'s",
"confession", "was", "forced", "."]
# After
token_field = instance["tokens"]
assert token_field == ["Mali", "government", "officials", "say", "the", "woman", "'s",
"confession", "was", "forced", "."]
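A minimal sketch of the iteration idea, using simplified stand-ins (Token and TextField here are hypothetical classes written for this example, not the library's own). Note that `__iter__` by itself only makes the field iterable; comparing the field directly against a list of strings, as in the "After" snippet, would additionally need an `__eq__` that compares token text.

```python
from typing import Iterator, List, NamedTuple


class Token(NamedTuple):
    """Hypothetical stand-in for a token; only carries its text."""
    text: str


class TextField:
    """Sketch of a sequence field that exposes its tokens through iteration."""

    def __init__(self, tokens: List[Token]) -> None:
        self.tokens = tokens

    def __iter__(self) -> Iterator[Token]:
        return iter(self.tokens)

    def __len__(self) -> int:
        return len(self.tokens)


token_field = TextField([Token(w) for w in ["Mali", "government", "officials"]])

# Iterate the field directly instead of reaching into `.tokens`.
words = [t.text for t in token_field]
```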
3. Representing IndexFields and SpanFields as their values for equality (Field.__eq__):
# Before
assert instance.fields["span_start"].sequence_index == 102
assert (instance.fields["span"].span_start, instance.fields["span"].span_end) == (102, 109)
# After
assert instance["span_start"] == 102
assert instance["span"] == (102, 109)
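One way the value-based equality could look, again as a sketch with simplified stand-in classes rather than AllenNLP's real IndexField and SpanField. The attribute names `sequence_index`, `span_start`, and `span_end` are taken from the "Before" snippet; everything else is an assumption made for this example.

```python
class IndexField:
    """Sketch: compares equal to another IndexField or to a plain int."""

    def __init__(self, sequence_index: int) -> None:
        self.sequence_index = sequence_index

    def __eq__(self, other: object) -> bool:
        if isinstance(other, IndexField):
            return self.sequence_index == other.sequence_index
        if isinstance(other, int):
            return self.sequence_index == other
        return NotImplemented

    def __hash__(self) -> int:
        # Defining __eq__ disables the default hash, so restore a consistent one.
        return hash(self.sequence_index)


class SpanField:
    """Sketch: compares equal to another SpanField or to a (start, end) tuple."""

    def __init__(self, span_start: int, span_end: int) -> None:
        self.span_start = span_start
        self.span_end = span_end

    def __eq__(self, other: object) -> bool:
        if isinstance(other, SpanField):
            other = (other.span_start, other.span_end)
        if isinstance(other, tuple):
            return (self.span_start, self.span_end) == other
        return NotImplemented

    def __hash__(self) -> int:
        return hash((self.span_start, self.span_end))


span_start = IndexField(102)
span = SpanField(102, 109)
```

Returning `NotImplemented` for unrelated types lets Python fall back to its default comparison, so `span_start == "102"` is simply `False` instead of raising.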