Turns the standard AST `JNumber` into a safe constructor #30

mdedetrich · 2017-07-02T19:06:53Z

@sirthias @Ichoran @ktoso @fommil This PR disallows JNumber to be called directly via the constructor as a String (there is a fromString method which returns an Option[JNumber] with the regex check).

The regex check has also been removed from all constructors which are guaranteed to be a number, and since JNumber is a final class, all of the boilerplate methods to make it treated as a case class has been defined (unapply/canEqual/productArity/productElement ).

Note that this was only done to the standard AST and not the unsafe one (not sure if it should also be done for unsafe). Note that when combined with #17 (comment) its possible to do some include a bit flag field within JNumber which can track if it gets created as a long/int/double/float so we can make efficient .to converters for different number types.

Let me know if this is looking good, and I can merge/release to M4. I think the only contentious point would be whether these changes make sense in the unsafe AST

codecov-io · 2017-07-02T19:20:34Z

Codecov Report

Merging #30 into master will increase coverage by 0.12%.
The diff coverage is 43.9%.

@@            Coverage Diff             @@
##           master      #30      +/-   ##
==========================================
+ Coverage   44.56%   44.68%   +0.12%     
==========================================
  Files           5        5              
  Lines         662      687      +25     
  Branches      128      133       +5     
==========================================
+ Hits          295      307      +12     
- Misses        367      380      +13

Impacted Files	Coverage Δ
js/src/main/scala/scalajson/ast/JValue.scala	`0% <0%> (ø)`	⬆️
...s/src/main/scala/scalajson/ast/unsafe/JValue.scala	`0% <0%> (ø)`	⬆️
shared/src/main/scala/scalajson/ast/package.scala	`57.77% <100%> (ø)`	⬆️
...m/src/main/scala/scalajson/ast/unsafe/JValue.scala	`59.5% <75%> (-0.24%)`	⬇️
jvm/src/main/scala/scalajson/ast/JValue.scala	`73.39% <77.77%> (+0.39%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0c444bf...d75e458. Read the comment docs.

sirthias · 2017-07-02T20:05:50Z

js/src/main/scala/scalajson/ast/JValue.scala

-    throw new NumberFormatException(value)
-  }
+final class JNumber(private[ast] val underlying: String) extends JValue {
+  @inline val value: String = underlying


What's the purpose of having underlying as well as this value?

To access the raw string value (you can't access value because it's private and for large numbers you may want to access the String value)

if only for that reason then a def would make more sense perhaps (not to grow the memory footprint of the object)

Yeah that's why I used@inline, or does this not work?

If it's a val the value has to be stored somewhere. @inline means the content of the method (in this case accessing the field) should be inlined.

It's also generally not a good idea to use @inline on a public API method, since it makes it harder to change the implementation while maintaining backwards compatibility.

Yep, as Greg said. @inline is bad for bin compat, does not change field semantics, may only do so for the method, and here we just want the method.

To access the raw string value (you can't access value because it's private and for large numbers you may want to access the String value).

I don't see the benefit of adding a second value member.
Why not simply call the constructor parameter value and make it a public field?

sirthias · 2017-07-02T20:08:37Z

Why do you prefer the "manual" case class over the actual case class, which is now possible (since 2.11.11 and 2.12.2)?

sirthias · 2017-07-02T20:11:58Z

js/src/main/scala/scalajson/ast/JValue.scala

+    else
+      None
+
+  def unapply(arg: JNumber): Option[String] = Some(arg.underlying)


This could be improved via something like this, i.e. it could be done in a way that avoids the allocation here.
Not saying that this should be done, just pointing out the opportunity.

Will have a look at this tomorrow

From an end-user standpoint I'm not a fan of having a JNumber.unapply that provides a string. Maybe we should just remove the unapply altogether? What is a user supposed to do when getting this string? Try to parse it as a double? as a BigDecimal? It would be nice if the type system could help out more here.

The unapply for JNumber is there for these reasons

Consistency in the AST (everything is a case class, containing a single value with an unapply)

The string can be directly converted to the number you want (i.e. toDouble, toInt) since its always going to be a valid number. This may be improved with Remove the .to[X] converters #17 (comment)

mdedetrich · 2017-07-02T21:20:13Z

@sirthias The reason why I used a manual class is because of Scale 2.10.x support. Also there isn't any real difference for the user and we don't have to rely on compiler flags (safer route)

gmethvin · 2017-07-03T04:06:47Z

js/src/main/scala/scalajson/ast/JValue.scala

-  if (!value.matches(jNumberRegex)) {
-    throw new NumberFormatException(value)
-  }
+final class JNumber(private[ast] val underlying: String) extends JValue {


Shouldn't this constructor be private?

That said, it would be preferable to provide different implementations of JNumber for each of the cases.

+1

private should be the default.
Making it public should require respective argumentative backup.

It is private, just for the package ast. This is done to save performance as it allows both the standard ast and unsafe to access the private constructor to save on performance in some cases (bypassing the regex check when not needed). Its still private to anything that is outside of the scalaJSON package.

If we end up removing the unsafe AST or it gets changed, I will definitely change it back to private.

currently the constructor is public

I think (barring the manual case class discussion) this would be the preferred solution:

final class JNumber private[ast](val value: String) extends JValue { ...

Agreed. JNumber private[ast] (...) is the way to go.

Ah I misunderstood you guys, fixing it now

gmethvin · 2017-07-03T04:20:09Z

js/src/main/scala/scalajson/ast/JValue.scala

+    else
+      None
+
+  def unapply(arg: JNumber): Option[String] = Some(arg.underlying)


From an end-user standpoint I'm not a fan of having a JNumber.unapply that provides a string. Maybe we should just remove the unapply altogether? What is a user supposed to do when getting this string? Try to parse it as a double? as a BigDecimal? It would be nice if the type system could help out more here.

gmethvin · 2017-07-03T04:32:53Z

js/src/main/scala/scalajson/ast/JValue.scala

-    throw new NumberFormatException(value)
-  }
+final class JNumber(private[ast] val underlying: String) extends JValue {
+  @inline val value: String = underlying


If it's a val the value has to be stored somewhere. @inline means the content of the method (in this case accessing the field) should be inlined.

It's also generally not a good idea to use @inline on a public API method, since it makes it harder to change the implementation while maintaining backwards compatibility.

sirthias · 2017-07-03T07:45:33Z

@sirthias The reason why I used a manual class is because of Scale 2.10.x support. Also there isn't any real difference for the user and we don't have to rely on compiler flags (safer route)

Ok, so it comes down to either supporting Scala 2.10 or going for the standard case class.
Let's understand a bit the pros/cons of either choice:

Dropping Scala 2.10 support would be bad because all projects that are still on Scala 2.10 could not depend on ScalaJSON. In order to understand the impact of this it'd be important to understand which projects this are. Do you know of any?
Going for the "manual" case class has a number of (IMHO) important drawbacks:

a) Asymmetry with all the other model classes. It seems weird that everything is a case class except for JNumber.
b) Forgoing internal optimizations. AFAIK the compiler treats (final) case classes specially in certain cases (e.g. in pattern matches) and applies a number of internal optimizations, e.g. by removing superfluous allocations (see also the discussion around unapply). The "manual" case class will not be able to benefit from these.
c) Risks with regard to meta support. Certain tools like shapeless generics or other macros might require an ADT to be modelled from actual cases classes to work as expected. I am quite sure that going down the manual route will hurt us here in some way or another at some point in the future!
d) Less future proof. Coming developments (e.g. around TASTY) might add even more significance to case classes in unforeseen ways. The asymmetry in the ScalaJSON ADT is not a good thing here either.
e) We have quite a few more lines of code in ScalaJSON, which makes the code harder to read and maintain. For example, we have the risk that the compiler changes the default desugaring of case classes in subtle ways (between versions) that we won't (immediately) see and follow.
f) People reading the code will always and automatically look at the code and try to spot the difference to what they think the default desugaring of case classes is, which is hard, because many people will not know the details to well. Overall it's much more confusing that a standard case class.

So, overall I think we need to have very good reasons to not go for the standard case class.
In my eyes, it comes down to fully understanding how important Scala 2.10 support really is.
My guess is that it is of no significance whatsoever, but I might be wrong.

mdedetrich · 2017-07-03T10:01:00Z

@sirthias

Dropping Scala 2.10 support would be bad because all projects that are still on Scala 2.10 could not depend on ScalaJSON. In order to understand the impact of this it'd be important to understand which projects this are. Do you know of any?

http4s, apparently they are going to be stuck on 2.10 for a while. Also circe (due to http4s and cats)

Asymmetry with all the other model classes. It seems weird that everything is a case class except for JNumber.

Afaik, the asymmetry is only in the source and not how end users end up using the JNumber class (the only difference at this point is the new keyword, which end-users don't even use because they can't construct the JNumber directly with a string anyways)

Forgoing internal optimizations. AFAIK the compiler treats (final) case classes specially in certain cases (e.g. in pattern matches) and applies a number of internal optimizations, e.g. by removing superfluous allocations (see also the discussion around unapply). The "manual" case class will not be able to benefit from these.

This is probably the most convincing argument, but I am not completely aware of these optimizations. Also I think you can simulate the changes for unapply in a manual class

Risks with regard to meta support. Certain tools like shapeless generics or other macros might require an ADT to be modelled from actual cases classes to work as expected. I am quite sure that going down the manual route will hurt us here in some way or another at some point in the future!

Less future proof. Coming developments (e.g. around TASTY) might add even more significance to case classes in unforeseen ways. The asymmetry in the ScalaJSON ADT is not a good thing here either.

This is true

People reading the code will always and automatically look at the code and try to spot the difference to what they think the default desugaring of case classes is, which is hard, because many people will not know the details to well. Overall it's much more confusing that a standard case class.

This is a documentation issue

So, overall I think we need to have very good reasons to not go for the standard case class.
In my eyes, it comes down to fully understanding how important Scala 2.10 support really is.
My guess is that it is of no significance whatsoever, but I might be wrong.

Unfortunately its very significant (http4s/circe), but there are alternatives

Provide the current implementation just for scala 2.10.x. Means more code to maintain (although not a real issue with a library that is meant to rarely change) and use the case class for 2.11.x/2.12.x
Just use a case class for the entire codebase and present the Scala 2.10 users with a big warning label that they can access the constructor themselves

Also with these compiler flags that are needed for Scala 2.11.x, I assume its ScalaJSON that only needs to use these flags for compilation (i.e. end users don't need to add these special flags?)

sirthias · 2017-07-03T10:21:04Z

Unfortunately its very significant (http4s/circe), but there are alternatives

Provide the current implementation just for scala 2.10.x. Means more code to maintain (although >not a real issue with a library that is meant to rarely change) and use the case class for 2.11.x/2.12.x

Just use a case class for the entire codebase and present the Scala 2.10 users with a big > warning label that they can access the constructor themselves

To be honest, even though not great, I'd prefer the first of your listed alternatives over the manual case class way. Paying forever into the future with the drawbacks just for Scala 2.10 support now doesn't appear like a good tradeoff to me.

So how about splitting out a scala210 branch with a manual case class solution for JNumber and a proper case class for JNumber for everyone else?

Also with these compiler flags that are needed for Scala 2.11.x, I assume its ScalaJSON that only needs to use these flags for compilation (i.e. end users don't need to add these special flags?)

Yes, that is also my understanding. This flag only affects the compiler frontend.

sirthias · 2017-07-03T10:24:22Z

js/src/main/scala/scalajson/ast/JValue.scala

  }
+
+  def fromString(value: String): Option[JNumber] =


With Scala 2.10 out of the way we should be able to simply name this method apply as well, like all the others...

I am thinking of leaving in the fromString method for all versions and aliasing it to apply for Scala 2.10.x and Scala 2.11.x, that way for a codebase that needs to targets all Scala versions you can use fromString

mdedetrich · 2017-07-03T11:07:38Z

@sirthias Sounds like a separate branch for 2.10.x support is the way, will work on it tonight

sirthias · 2017-07-04T07:02:19Z

js/src/main/scala/scalajson/ast/JValue.scala

@@ -95,7 +95,7 @@ object JNumber {
  *
  * @author Matthew de Detrich
  */
-final class JNumber(private[ast] val underlying: String) extends JValue {
+final class JNumber private[ast] (val underlying: String) extends JValue {
  @inline def value: String = underlying


IMO, with the constructor parameter being a public field, this construct is now completely superfluous.
Or is there some benefit that I am not seeing?

Scala 2.11.x + uses a proper case class Now using sbt-cross-project Moved private[ast] to the whole constructor Removed underlying (not needed with move of private[ast])

mdedetrich · 2017-07-04T11:49:36Z

@sirthias @ktoso @Ichoran New PR is up. Lot of stuff has changed, have removed underlying (its now pointless) and there is a build that is specific only for Scala 2.10.

@sirthias I have provided an apply for JNumber("1"): Option[JNumber] but it only works for JVM Scala 2.11.x and Scala 2.12.x (doesn't work for Scala.js, nor does it work for Scala 2.10.x). For this reason, there is a JNumber.fromString method that works on all platforms and all Scala versions

sirthias · 2017-07-04T11:56:16Z

js/src/main/scala-2.10/scalajson.ast/JValue.scala

+  */
+// Due to a restriction in Scala 2.10, we can't use a proper case class along with
+// a private constructor, so we have to simulate a case class with a normal class
+final class JNumber private[ast] (val value: String) extends JValue {


What about a copy method?

Good pickup, I will add a copy method

sirthias · 2017-07-04T11:59:01Z

jvm/src/main/scala-2.10/scalajson.ast/JValue.scala

+  * @author Matthew de Detrich
+  */
+// Due to a restriction in Scala 2.10, we can't use a proper case class along with
+// a private constructor, so we have to simulate a case class with a normal class


As a side note and for the record: This is not correct. Even in Scala 2.10 case classes can have a private constructor. The issue is that we cannot override/replace the default apply method generated by the compiler even when the constructor itself is marked private, which kind of defeats the purpose of marking the constructor private in the first place.

Worded it wrongly, will update the documentation now.

sirthias · 2017-07-04T12:08:33Z

shared/src/main/scala-2.10/scalajson.ast/package.scala

+    * @author Rex Kerr
+    * @see https://github.com/Ichoran/kse/blob/master/src/main/scala/jsonal/Jast.scala#L683-L917
+    */
+  private[ast] def numericStringEquals(a: String, b: String): Boolean = {


This is crazy!
What's the purpose of this special numericStringEquals?

I think what is missing from the whole effort is a clear statement as to what semantic level the scalajson AST is supposed to target.
The decision of having JNumber wrap a String leads one to believe that the AST is targeting the lowest, closest-to-the-source, level, which can be a completely valid approach.
But then I'd expect that JNumber("0") != JNumber("-0")!

This here appears to try to turn JNumber into a hybrid: Represent the lowest semantic level but try to also uphold some kind of higher-level guarantees. IMHO this is not only confusing but also a setup for weird and hard to understand effects down the road.

We should make a clear decision as to what we want the AST to represent and then model exactly that, not more and not less, with a consistent and no-magic approach.

numericStringEquals is designed to test for actual number equality however there is clearly a bug in it (@Ichoran are you able to help here)?

This is something that people asked for, i.e. they expect that JNumber does proper number equality. So yes its true that this is a weird hybrid, but the reasoning for using String in the first place is because no unbounded real number exists in Scala (else we would be using that which would implement its own proper number equality)

People ask for (and will ask for) many things.
The hard part of this effort is to know which requests to fulfill and which to deny.
Fulfilling all requests is a sure path to disaster.

IMHO there are two choices:

Make JNumber represent a parsed number, i.e. lift the level of abstraction from the character level to some higher number-representing data type.

Make JNumber a character representation of a number, i.e. without actually raising the level of abstraction. Rather: Merely group a bunch of characters and mark them as representing a number.

From reading the source one gets the impression that scalajson is doing the latter and not the former. The former is possible but a bit more involved. Circe goes down this route and is a good example of how it can be done. Unfortunately this requires the introduction of another number type (BiggerDecimal in circe's case) since there is no readily available one that can properly represent all JSON numbers.

My impression was that scalajson chose to go down the "stay low-level" path exactly to circumvent these kinds of issues. The consequence would have to be that JNumber("0") != JNumber("-0"), which is not only expected from the type setup, it is also consistent with the semantic approach.

I really see no value in going for a strange semantic mix. How would that be useful?
When would I want to rely on character-level equality when I really want some kind of higher-level equality?

The question of whether it's even feasible to properly do equality on arbitrary JSON numbers is a second issue. Is there a proper testsuite for the currently implemented "magic" equality implementation?

I will wait for @Ichoran to chime in, I think there was a reason behind it. To be frank, I have zero issues in removing numericStringEquals is it means less work for me to maintain.

mdedetrich · 2017-07-05T11:08:13Z

@dwijnand @eed3si9n @ktoso @sirthias @Ichoran @gmethvin I am going to release 1.0.0-M4 now and merge this into master.

@sirthias I have created an issue at #36 for your concern regarding numericStringEquals if you want to talk about it there.

Also the name based extractor optimizations no longer apply, named based extractors only came into Scala 2.11.x, so its irrelevant in the scala210 branch. In the normal branch we use standard case classes which should automatically generate named based extractors by default

Turns the standard AST JNumber into a safe constructor, #28 and #22

9e32346

mdedetrich changed the title ~~Turns the standard AST JNumber into a safe constructor, #28 and #22~~ Turns the standard AST JNumber into a safe constructor Jul 2, 2017

sirthias reviewed Jul 2, 2017

View reviewed changes

gmethvin reviewed Jul 3, 2017

View reviewed changes

Make the val a def to save on memory

bc14b53

sirthias reviewed Jul 3, 2017

View reviewed changes

Moved private constructor

0dbd2f0

sirthias reviewed Jul 4, 2017

View reviewed changes

Scala 2.10.x now has its own specific implementation due to restrictions

32b180f

Scala 2.11.x + uses a proper case class Now using sbt-cross-project Moved private[ast] to the whole constructor Removed underlying (not needed with move of private[ast])

sirthias reviewed Jul 4, 2017

View reviewed changes

Updated comments

d4efef2

sirthias reviewed Jul 4, 2017

View reviewed changes

mdedetrich mentioned this pull request Jul 4, 2017

From js any #7

Closed

mdedetrich added 2 commits July 5, 2017 10:11

Use a precompiled regex #32

178a34d

Override the .copy constructor to throw an exception #33

1baab54

mdedetrich force-pushed the safeJNumberConstructor branch from 07bc5e9 to 1baab54 Compare July 5, 2017 08:45

apply(value: String) method does work on Scala.js, incorrect flags #28

cc1643b

mdedetrich mentioned this pull request Jul 5, 2017

Remove numericStringEquals? #36

Open

Fixed up name based extractors

d75e458

mdedetrich merged commit ba3f4ef into master Jul 5, 2017

mdedetrich deleted the safeJNumberConstructor branch July 5, 2017 11:55

mdedetrich mentioned this pull request Jul 5, 2017

Release 1.0.0 #13

Closed

mdedetrich added a commit that referenced this pull request Jul 5, 2017

Updated due to #30

244d50d

Turns the standard AST JNumber into a safe constructor #30

Turns the standard AST JNumber into a safe constructor #30

Uh oh!

Conversation

mdedetrich commented Jul 2, 2017

Uh oh!

codecov-io commented Jul 2, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mdedetrich Jul 2, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ktoso Jul 2, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mdedetrich Jul 3, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

sirthias commented Jul 2, 2017

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mdedetrich commented Jul 2, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mdedetrich Jul 3, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

sirthias commented Jul 3, 2017

Uh oh!

mdedetrich commented Jul 3, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

sirthias commented Jul 3, 2017

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mdedetrich commented Jul 3, 2017

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Turns the standard AST `JNumber` into a safe constructor #30

Turns the standard AST `JNumber` into a safe constructor #30

codecov-io commented Jul 2, 2017 •

edited

Loading

mdedetrich Jul 2, 2017 •

edited

Loading

ktoso Jul 2, 2017 •

edited

Loading

mdedetrich Jul 3, 2017 •

edited

Loading

mdedetrich commented Jul 2, 2017 •

edited

Loading

mdedetrich Jul 3, 2017 •

edited

Loading

mdedetrich commented Jul 3, 2017 •

edited

Loading

mdedetrich commented Jul 5, 2017 •

edited

Loading