-
Notifications
You must be signed in to change notification settings - Fork 10
Added JNumber conversions #38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There is now a constructedFlag which tracks how a JNumber is created
Codecov Report
@@ Coverage Diff @@
## master #38 +/- ##
==========================================
- Coverage 44.68% 35.43% -9.26%
==========================================
Files 5 5
Lines 687 906 +219
Branches 133 194 +61
==========================================
+ Hits 307 321 +14
- Misses 380 585 +205
Continue to review full report at Codecov.
|
Some(value.toFloat) | ||
else { | ||
val asFloat = value.toFloat | ||
if (BigDecimal(value) == BigDecimal(asFloat.toDouble)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Scala's toFloat
is more relaxed on this.
> eval "1.1111111111111111111111".toFloat
[info] ans: Float = 1.1111112
- If I am a function that was given just
JValue
, how would I know what content your value contains? - Does it make sense to be more strict than String's
toFloat
here? - Suppose yes, would it make sense to return
Try
or throw exception so at least we'd know why you've bailed out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH, I wasn't entirely sure how floats worked in Scala, but after reading this I think I need to do some more research If I have a better way of detecting losss of precision
To answer your questions
- You can always read the raw the string value in a
JNumber
. The point of these converters is to correctly convert the string to a number (correctly here means without losing any data). As we can tell, this is quite hard ;) - Yes, else these converters are pointless. If its not possible then I will remove toFloat (nothing is stopping you from doing a
.toFloat
on the actual string value) - Well it currently returns
Option[Float]
. ReturningNone
implies that you have lost some data (unless I am missing something here?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spray used BigDecimal
internally, so toFloat
works like this:
eval BigDecimal("1.1111111111111111111111").toFloat
[info] ans: Float = 1.1111112
If we care about correctness so much, my vote would be get rid of toFloat
altogether. See also Circe - https://github.com/circe/circe/blob/master/modules/core/shared/src/main/scala/io/circe/JsonNumber.scala
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, ill have a look to get rid of it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Getting rid of Float
for correctness is no more sensible than getting rid of Double
for correctness. Also, I am not convinced that possible loss of precision of trailing digits is a valid reason for returning None
for a floating point value. For example, 1.0/7
is printed as 0.14285714285714285
but the conventional rounding, which also parses to the same number, is 0.14285714285714286
.
0.142857142857142857142857142857142857142857142857142857...
0.142857142857142849212692681248881854116916656494140625
^ stops matching decimal expansion
So neither having the exact decimal representation of the double value nor having the best rounding of that value really represents what's going on. Instead, if you overflow (get positive or negative infinity) that's wrong; and if you underflow (the number is not zero but you return zero) that is also wrong. Otherwise, the parse should just return the best representation of the value, and everyone should keep in mind (just like always) Float
and Double
might not represent things exactly.
The reason to get rid of toFloat
is if you want to care about exactness less because you already can't win that battle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Getting rid of
Float
for correctness is no more sensible than getting rid ofDouble
for correctness.
Both def toFloat: Option[Float]
and def toDouble: Option[Double]
that adds shape checking (BigDecimal(value) == BigDecimal(asFloat.toDouble)
) are ridiculous. Because IEEE floats internally represent decimals as a binary fraction, it can't express 0.1
.
If we are letting them survive, trusting that the user knows what they are doing, I think the Scala idiomatic way is to behave like StringOps and def toFloat: Float
and def toDouble: Double
return whatever value.toFloat
/ value.toDouble
returns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@eed3si9n - BigDecimal(value) == BigDecimal(asDouble)
will be true for 0.1, but the float versions will be nonsensical. (Note that the Scala BigDecimal
wraps the Java one to be a little less weird about 0.1: the default assumption is that you meant, by your Double
, its canonical string representation, not the crazily-long binary fraction expansion.)
def this(value: Double) = this(value.toString) | ||
|
||
final class JNumber private[ast] (val value: String)( | ||
private[ast] val constructedFlag: Int) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could constructedFlag
be a smell that we are doing poor man's object orientation?
If we want lossless (and performant?) round trip from Double -> JNumber -> Double, and Json -> AST -> Json, would it make sense to split this class into private[ast] JInt
, JLong
, JFloat
, JDouble
, JBigInt
, JBigDecimial
, and JStringNumber
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I didn't realise this at the time, @sirthias what do you think of this? If we are going to do performant number conversions may as well represent the cases as various classes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's some pretty wide fanout. In principle if encoding and decoding in memory is a likely scenario, this is the way to do it, but I'm skeptical that this is an important use case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's #41 that splits JNumber
up.
# Conflicts: # js/src/main/scala/scalajson/ast/JValue.scala # js/src/main/scala/scalajson/ast/unsafe/JValue.scala # jvm/src/main/scala/scalajson/ast/JValue.scala # jvm/src/main/scala/scalajson/ast/unsafe/JValue.scala
This is an implementation which adds various conversions from
JNumber
to other number types (see #37). Unlike the previous implementation, these conversionstoDouble
will returnNone
if the value inside theJNumber
is too big and you will lose data)To make the various number converter calls as performant as possible, when you construct a
JNumber
we store a bitFlag (calledconstructedFlag
) which tracks how aJNumber
is constructed. This means that if you construct aJNumber
as a double with its argument (i.e.JNumber(3446437343d)
), we first do a check against this bitflag (which is a single if statement) and immediately return aSome(value.toDouble)
(since we know its a double), otherwise we do a more expensive check. This also means that theJNumber
has an extra int per instance ofJNumber
in terms of memory (so we are trading a very small amount of extra memory for a more performant number conversion in typical use cases).The checks were inspired from argonaut (i.e. https://github.com/argonaut-io/argonaut/blob/master/argonaut/shared/src/main/scala/argonaut/JsonNumber.scala) afaik, these are known to be correct
One note, due to how
ast.unsafe.JNumber
was constructed (theString
constructer by default is public), we now added the a second parameter to the constructor list which is not hidden (i.e. itsfinal case class JNumber(value: String, constructedFlag: Int = 0)
) which means theunapply
method has been changed. Although this change is breaking, it means theunsafe
AST does give you control over specifying theconstructedFlag
.To be done
Here is the current results for the benchmarks
@ktoso @travisbrown @Ichoran @eed3si9n @sirthias Thoughts?