This repository was archived by the owner on Dec 16, 2022. It is now read-only.

AllenNLP's data piece could be more pythonic #1633

Closed
@DeNeutoy


AllenNLP's data API has become fairly stable and effective. However, manipulating individual Fields and Instances is still unwieldy. A few changes would make them easier for a newcomer to use:

1. Manipulating Fields in Instances. Instance could inherit from MutableMapping (as could MetadataField, in the same vein).

# Currently:
specific_field = instance.fields["field_name"]

# Ideal:
specific_field = instance["field_name"]
"field_name" in instance  # evaluates to True
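
To make the first proposal concrete, here is a minimal sketch of how an Instance might delegate the MutableMapping protocol to its existing `fields` dict. The `Field` and `Instance` classes below are simplified stand-ins written for illustration, not AllenNLP's actual classes, and a real change would need to keep the rest of Instance's behaviour (indexing, tensor conversion) intact.

```python
from collections.abc import MutableMapping
from typing import Dict, Iterator


class Field:
    """Hypothetical stand-in for AllenNLP's Field base class."""


class Instance(MutableMapping):
    """Hypothetical Instance that exposes its fields through the mapping protocol."""

    def __init__(self, fields: Dict[str, Field]) -> None:
        # Keep the existing `fields` attribute so current call sites still work.
        self.fields = dict(fields)

    def __getitem__(self, key: str) -> Field:
        return self.fields[key]

    def __setitem__(self, key: str, value: Field) -> None:
        self.fields[key] = value

    def __delitem__(self, key: str) -> None:
        del self.fields[key]

    def __iter__(self) -> Iterator[str]:
        return iter(self.fields)

    def __len__(self) -> int:
        return len(self.fields)


label = Field()
instance = Instance({"field_name": label})

assert instance["field_name"] is label       # new style
assert instance.fields["field_name"] is label  # old style still works
assert "field_name" in instance              # __contains__ comes from the Mapping mixin
```

Implementing only these five methods is enough: `collections.abc.MutableMapping` supplies `__contains__`, `keys`, `items`, `values`, `get`, `update`, and the rest as mixin methods.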

2. Iterating over SequenceFields and ListFields using Field.__iter__:

# Before
fields = instance.fields
tokens = [t.text for t in fields['tokens'].tokens]
assert tokens == ["Mali", "government", "officials", "say", "the", "woman", "'s",
                      "confession", "was", "forced", "."]

# After
token_field = instance["tokens"]
assert token_field == ["Mali", "government", "officials", "say", "the", "woman", "'s",
                       "confession", "was", "forced", "."]
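
A sketch of what the second proposal could look like on a token-holding field. `Token` and `TextField` here are simplified, hypothetical stand-ins rather than AllenNLP's own classes. Note that the "After" assertion above compares the field to a list of strings, so it relies on an `__eq__` as well as `__iter__`; the equality rule used below (compare token texts) is an assumption about the intended semantics.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class Token:
    """Hypothetical minimal token; the real Token class carries more attributes."""
    text: str


class TextField:
    """Hypothetical sequence field that can be iterated and compared directly."""

    def __init__(self, tokens: List[Token]) -> None:
        self.tokens = tokens

    def __iter__(self) -> Iterator[Token]:
        return iter(self.tokens)

    def __len__(self) -> int:
        return len(self.tokens)

    def __eq__(self, other: object) -> bool:
        if isinstance(other, TextField):
            return self.tokens == other.tokens
        if isinstance(other, list):
            # Assumed semantics: a field equals a list of its token texts.
            return [token.text for token in self.tokens] == other
        return NotImplemented

    # Defining __eq__ without __hash__ makes instances unhashable,
    # which a real implementation would have to take into account.


token_field = TextField([Token(word) for word in ["Mali", "government", "officials"]])

assert [t.text for t in token_field] == ["Mali", "government", "officials"]
assert token_field == ["Mali", "government", "officials"]
```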

3. Representing IndexFields and SpanFields as their values for equality (Field.__eq__):

# Before
assert instance.fields["span_start"].sequence_index == 102
assert (instance.fields["span"].span_start, instance.fields["span"].span_end) == (102, 109)

# After
assert instance["span_start"] == 102
assert instance["span"] == (102, 109)
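
The third proposal can be sketched in the same way. The classes below are hypothetical, stripped-down versions of IndexField and SpanField: they keep only the attributes named in this issue (`sequence_index`, `span_start`, `span_end`) and omit everything else the real fields require.

```python
class IndexField:
    """Hypothetical IndexField that compares equal to its integer index."""

    def __init__(self, sequence_index: int) -> None:
        self.sequence_index = sequence_index

    def __eq__(self, other: object) -> bool:
        if isinstance(other, IndexField):
            return self.sequence_index == other.sequence_index
        if isinstance(other, int):
            return self.sequence_index == other
        return NotImplemented


class SpanField:
    """Hypothetical SpanField that compares equal to a (start, end) tuple."""

    def __init__(self, span_start: int, span_end: int) -> None:
        self.span_start = span_start
        self.span_end = span_end

    def __eq__(self, other: object) -> bool:
        if isinstance(other, SpanField):
            return (self.span_start, self.span_end) == (other.span_start, other.span_end)
        if isinstance(other, tuple):
            return (self.span_start, self.span_end) == other
        return NotImplemented


assert IndexField(102) == 102
assert SpanField(102, 109) == (102, 109)
```

Returning `NotImplemented` for unrecognised types lets Python fall back to the other operand's comparison, so `102 == IndexField(102)` also holds.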
