Skip to content

Decide how to deal with parameterised calendars in AnyCalendar #6397

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
robertbastian opened this issue Apr 1, 2025 · 24 comments
Closed

Decide how to deal with parameterised calendars in AnyCalendar #6397

robertbastian opened this issue Apr 1, 2025 · 24 comments
Labels
2.0-breaking Changes that are breaking API changes

Comments

@robertbastian
Copy link
Member

robertbastian commented Apr 1, 2025

Currently we have

  • AnyCalendar::Ethiopian, AnyDateInner::Ethiopian, AnyCalendarKind::Ethiopian, AnyCalendarKind::EthiopianAmeteAlem
  • similar for HijriTabular

i.e. we collapse parametric calendars in AnyCalendar and AnyDateInner, but not in AnyCalendarKind. However, when we match AnyCalendar against AnyDateInner, we don't verify that the calendar parameters match. Should we? Or should we have separate AnyCalendar variants for different instantiations of a calendar? Or just separate AnyCalendarDateInner variants?

@robertbastian robertbastian added 2.0-breaking Changes that are breaking API changes discuss-priority Discuss at the next ICU4X meeting labels Apr 1, 2025
@robertbastian robertbastian changed the title Decide how to deal with parameterised calendars in `AnyCalendar Decide how to deal with parameterised calendars in AnyCalendar Apr 1, 2025
@Manishearth
Copy link
Member

We've discussed this before, Ethiopian calendars being mixable was intentional, but low importance. Hijri calendars should not be mixed and we should error in that case, probably via extra variants

@mdfarid01

This comment has been minimized.

@Manishearth

This comment has been minimized.

@mdfarid01

This comment has been minimized.

@mdfarid01

This comment has been minimized.

@Manishearth

This comment has been minimized.

@mdfarid01

This comment has been minimized.

@Manishearth

This comment has been minimized.

@mdfarid01

This comment has been minimized.

@Manishearth

This comment has been minimized.

@mdfarid01

This comment has been minimized.

@Manishearth
Copy link
Member

  • @Manishearth I like all the types to be clickable in docs
  • @sffc My position coming into this was that all accessible types are exported, with the lowest one being opaque. But it seems OK if the type is accessible by a trait path.

@sffc
Copy link
Member

sffc commented Apr 4, 2025

Why is this issue closed?

What we have now is inconsistent and I've never liked it. It is weird that we collapse some things but not others. Can someone explain why we split Dangi and Chinese into separate structs but not Ethiopian and Ethioaa? We are in some weird middle ground. I'm fine if we can explain this, but I thought we were moving in the direction of having one struct per calendar, with Ethiopian being an exception to the rule, but now we seem to be moving in the opposite direction.

@sffc sffc reopened this Apr 4, 2025
@Manishearth
Copy link
Member

That's largely an implementation detail IMO. Collapsing the calendar types where possible reduces API surface rather cleanly; I have always disliked having so much boilerplate for islamic.

The main place this is noticeably exposed is AnyCalendar's enum, which is rarely directly accessed. (There is an argument to be made for making it opaque, I'm not sure I wish to make that argument, but one could).

I'm not opposed to merging Chinese and Dangi into the same type. It would greatly simplify the code, we have an additional layer of internal traits and boilerplate there.

@sffc
Copy link
Member

sffc commented Apr 4, 2025

I think I want to toss this issue into 2.0-stretch, if someone wants to take the time to clean up the AnyCalendar type split.

@sffc sffc added this to the ICU4X 2.0 Stretch ⟨P2⟩ milestone Apr 4, 2025
@sffc
Copy link
Member

sffc commented Apr 8, 2025

Conclusion from brief discussion with @Manishearth @robertbastian @sffc:

  • Yes, seems like the path we want to follow
  • Probably merge Japanese with JapaneseExtended and Chinese with Dangi
  • Could be implemented in short term by having an enum (with private variants) that delegates and then clean this up later

@Manishearth Manishearth added discuss Discuss at a future ICU4X-SC meeting and removed discuss-priority Discuss at the next ICU4X meeting labels Apr 10, 2025
@Manishearth
Copy link
Member

I was going through the current code and I remembered why we had separate types for a lot of these.

One thing is formatting: we should not have the same type for two calendars that cannot share formatters. So Chinese and Dangi should continue to have separate types, unless they use a type parameter. By this metric HijriTabular can still be one type.

The other concern is date arithmetic: we wish to eventually support subtracting dates from each other, and that is supposed to be in the same calendar. By merging the HijriTabular types, we are forcing ourselves to make that functionality fallible. Are we okay with that?

There's a ladder of strictness we can follow:

  • Level 1: Any calendars can share types. There may be ways to make this work with formatter type safety
  • Level 2: Calendas which format similarly can have the same type (e.g. hijri calendars). Formatting continues to have type safety, but calendar arithmetic needs to have runtime checks.
  • Level 3: Calendars which format similarly and have date correspondences, but may disagree on eras, can have the same type (ethiopic, japanese)

Historically we were on level 3, and we kept Japanese to be two types to have better data loading. With HijriTabular we are moving to Level 2. Chinese/Dangi take us to Level 1, which we should not do. Making it generic is acceptable.

Level 2 is fine as long as the eventual arithmetic APIs cover this. They're private right now so it's tweakable.

@sffc
Copy link
Member

sffc commented Apr 11, 2025

I think merging the types is not a problem in DateTimeFormatter, because we persist a calendar object, so we don't care what type it is. We just use it however it was constructed, which is fine. Most dates get converted, and .format_same_calendar works so long as the calendar objects return the correct .kind().

But it might get problematic in FixedCalendarDateTimeFormatter since there is no calendar object being persisted. What is the status quo in Ethiopian for using FixedCalendarDateTimeFormatter with both era styles? How do we differentiate? Do we just rely on the input date to tell us?

@sffc
Copy link
Member

sffc commented Apr 11, 2025

Can we create newtypes for every calendar and impl Deref so we don't have to copy around all the methods?

@Manishearth
Copy link
Member

I think merging the types is not a problem in DateTimeFormatter, because we persist a calendar object, so we don't care what type it is. We just use it however it was constructed, which is fine. Most dates get converted, and .format_same_calendar works so long as the calendar objects return the correct .kind().

But it might get problematic in FixedCalendarDateTimeFormatter since there is no calendar object being persisted. What is the status quo in Ethiopian for using FixedCalendarDateTimeFormatter with both era styles? How do we differentiate? Do we just rely on the input date to tell us?

The status quo is we have a single set of formatting data, and both ethiopic calendar types can be used interchangeably with the formatter, and will produce correct results. The formatter will see Amete Alem X or Amete Mihret Y and format those accordingly.

Level 2 and 3 are not problematic in formatters, and if anything they make it easier to share formatters and reduce data loading if dealing with Fixed. You don't actually need to have different Fixed instances for different calendars.

Can we create newtypes for every calendar and impl Deref so we don't have to copy around all the methods?

This wouldn't do anything, that doesn't carry trait impls over. ArithmeticDate is, essentially, this type already.

Deref also isn't supposed to be used like that (one of the reasons is precisely that it doesn't forward traits, but if you use it this way you rapidly find yourself wanting that behavior). Deref coercions only apply in specific situations, which work great for pointery types but not well for the general goal of having a coercing conversion.

@sffc
Copy link
Member

sffc commented Apr 11, 2025

I observe that we're turning the needs of icu_datetime into constraints on icu_calendar. Are we okay with that?

If we are, then (2) seems like a principled approach.

@Manishearth
Copy link
Member

I observe that we're turning the needs of icu_datetime into constraints on icu_calendar. Are we okay with that?

That's part of the question. Historically, when we were on Level 3, we were constrained by a non-mandatory need of icu_calendar: that dates could be mixed for arithmetic in a type safe way. This preempted having to think about icu_datetime.

With HijriTabular, we are now on level 2, where we have accepted that the not-yet-existent date APIs will need to be fallible, and thus we're constrained only by icu_datetime. I'm ... still fine with that, though I'm sure it's ideal.

I think in the absence of these arithmetic APIs, icu_datetime's constraints are useful fodder here for giving an idea of calendar similarity.

So yeah, 2 seems fine.

@sffc
Copy link
Member

sffc commented Apr 11, 2025

oh I see what you mean about the date arithmetic APIs

You want to prevent code like

let d1 = HijriTabular::new_type_ii_thursday(...)
let d2 = HijriTabular::new_type_ii_friday(...)
let difference = d1.until(d2)

@Manishearth Manishearth removed the discuss Discuss at a future ICU4X-SC meeting label Apr 15, 2025
@Manishearth
Copy link
Member

Level 2: Calendars which format similarly can have the same type (e.g. hijri calendars). Formatting continues to have type safety, but calendar arithmetic needs to have runtime checks.

  • @sffc originally wanted to go more in the direction of Level 4
  • @robertbastian: I think until won't be a problem. If we notice the era doesn't match we can return the difference in dates. If we notice that the two dates have different epochs, then we can return the difference in dates.
  • @Manishearth No since the API lets you tweak how the diff works
  • @robertbastian AnyCalendar?
  • @sffc will return an error
  • @robertbastian:
  • @sffc: similar to julian/gregorian
  • @robertbastian: we can have a type parameter. Any returns an error, Gregorian is infallible.
  • (more discussion)
  • @sffc I think era subset/superset is an exception to the rule.

Proposal:

  • Calendars which format similarly can have the same type (e.g. hijri calendars). This will include patterns/etc.
  • Calendars with an era subset/superset relationship may be mergeable, but not by default (only if it makes sense from a data perspective, and if e.g. things like patterns/etc are the same).
  • In the future, date arithmetic APIs should use an associated error type. AnyCalendar and HijriTabular can return a real error, everyone else returns Infallible.

(This is status quo, but proposes a future route for arithmetic)

Agreed: @Manishearth @sffc @robertbastian

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2.0-breaking Changes that are breaking API changes
Projects
None yet
Development

No branches or pull requests

4 participants