Skip to content

Make TimeZoneInfo more strict #5700

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 27 commits into from
Oct 18, 2024
Merged

Make TimeZoneInfo more strict #5700

merged 27 commits into from
Oct 18, 2024

Conversation

sffc
Copy link
Member

@sffc sffc commented Oct 17, 2024

Instead of TimeZoneInfo having optional fields, it now makes the fields required via the type system, which prevents callers from passing in a time zone with missing fields to a formatter that needs more of the fields.

@sffc sffc marked this pull request as ready for review October 17, 2024 02:12
@sffc sffc removed request for a team and zbraniecki October 17, 2024 02:12
Manishearth
Manishearth previously approved these changes Oct 17, 2024
Copy link
Member

@Manishearth Manishearth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit worried about the UX of this but timezones are gnarly anyway.

Copy link
Member

@robertbastian robertbastian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reviewed the timezone crate so far

@@ -36,6 +34,8 @@ pub enum ParseError {
InvalidOffsetError,
/// There was an invalid IANA identifier provided.
InvalidIanaIdentifier,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we ever want to raise InvalidIanaIdentifier, it might already be dead

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 125 to 133
let offset = UtcOffset::try_from_utc_offset_record(offset)?;
let iso = DateTime::<Iso>::try_new_iso_datetime(
date.year,
date.month,
date.day,
time.hour,
time.minute,
time.second,
)?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: UtcOffset can construct itself from OffsetRecord, but DateTime cannot construct itself from DateRecord. Seems inconsistent.

Also, maybe create a Date and a Time instead of a DateTime, as that's what we need.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's because DateTime is in a different crate and we can't export from-ixdtf-record functions across crate boundaries because ixdtf is less stable than icu_calendar.

Comment on lines 87 to 105
IxdtfParseRecord {
offset: Some(offset),
tz: None,
date: Some(date),
time: Some(time),
..
} => Self::try_from_utc_offset_record(offset),
} => (offset, date, time),
// [-0800]
IxdtfParseRecord {
offset: None,
tz: Some(time_zone_annotation),
date,
time,
tz:
Some(TimeZoneAnnotation {
tz: TimeZoneRecord::Offset(offset),
..
}),
date: Some(date),
time: Some(time),
..
} => Self::try_from_time_zone_record(&time_zone_annotation.tz, None, date, time),
} => (offset, date, time),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these cases can be collapsed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no? they cover different cases which are in comments

},
);
let IxdtfParseRecord {
offset: None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

phew, not sure I want to make too much information an error. Imagine if you want to start sending more details across the network to change time zone format, if this errors you have to upgrade client and server at exactly the same time.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think being strict about the expected fields is important. But, we can continue to iterate on exactly the shape of these functions and the strictness. In the mean time, I added more comments.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imagine if you're calling an API that only guarantees that it sends the offset, but doesn't guarantee that it only sends the offset.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps that use case should be another constructor / parse function that sets the time zone to unk if it is not present. It seems bad to throw away the info, though.

Copy link
Member

@robertbastian robertbastian Oct 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's currently no constructor here that does that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it should be an error. The constructor you choose asserts the shape of the string.

Can we please not make this block the PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No this is important. Look at serde json, it does not complain when an object contains a key it doesn't know, it only complains when it's missing a key.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok can I make it try_strict_from_str and then try_from_str is the loose one, as it is now

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

Copy link
Member Author

@sffc sffc Oct 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are 4 functions which I think cover all of the use cases. This is how I have named them for now:

  • try_offset_only_iso_from_str (strict, returns UtcOffset)
  • try_location_only_iso_from_str (strict, returns TimeZoneInfo<AtTime>)
  • try_loose_iso_from_str (loose, returns TimeZoneInfo<AtTime>)
  • try_iso_from_str (strict, returns TimeZoneInfo<Full>)

There is only one "loose" function because it is only possible for a loose function to return TimeZoneInfo<AtTime>, not TimeZoneInfo<Full>.

I do not see a use case for a "loose" function returning UtcOffset because if you want things loose, just get a TimeZoneInfo<AtTime> which is more general and fully featured. The whole point, as we discussed today, of parsing to a UtcOffset is to assert that you do not have a location.

}

impl TimeZoneInfo<models::Full> {
fn try_full_from_ixdtf_record(ixdtf_record: &IxdtfParseRecord) -> Result<Self, ParseError> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: keep constructors for the same type together

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now all the TimeZoneInfo constructors are together and the CustomZonedDateTime constructors are together.

@@ -90,6 +90,7 @@ extern crate alloc;
mod error;
mod ids;
pub mod provider;
pub mod scaffold;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#[doc(hidden)]?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it is public, not just ICU4X-internal, and I really do not like #[doc(hidden)]. I'm happy to slap unstable warnings on the module and everything inside of it, similar to the provider modules.

pub use time_zone::TimeZoneInfo;
pub use time_zone::TimeZoneModel;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when does a user need access to this trait? I don't think we need it reexported.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It shows up in a public API signature and I do not like to not export things that show up publicly. I'm happy to export it from the scaffold module..

Comment on lines 70 to 71
/// This can be set in order to get correct historical time zone names.
/// If it's not set, the most recent data for the time zone will be used.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not anymore

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

}

/// Sets a local time on this time zone.
pub const fn with_local_time(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub const fn with_local_time(
pub const fn at_time(

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


impl TimeZoneInfo<models::Full> {
/// Creates a new [`TimeZoneInfo`] for the UTC time zone
/// with a reference point at the UNIX Epoch.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mmh kinda don't want to document this. The reference point for UTC is completely irrelevant, because its metazone doesn't change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll delete this function since it has only 1 call site.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's still useful to have this as a public constructor

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No actually I think this is a terrible function. For example, someone might make a TimeZoneInfo::utc_full as the only non-string constructor of TimeZoneInfo<Full> and then try to overwrite fields on it.

This function clearly is not that useful because there is only 1 call site and it is a weird call site in a match statement. There are no call sites in normal code.

robertbastian
robertbastian previously approved these changes Oct 18, 2024
@sffc
Copy link
Member Author

sffc commented Oct 18, 2024

Ok I think this is ready now. Only thing I didn't do 100% from your feedback is the full deduplication of code in ixdtf.rs, but it is better than it was before.

@sffc sffc requested a review from robertbastian October 18, 2024 02:21
@sffc sffc merged commit 5ad31bf into unicode-org:main Oct 18, 2024
28 checks passed
@sffc sffc deleted the strict-tzinfo-2 branch October 18, 2024 03:26
@robertbastian
Copy link
Member

#5533

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants