Skip to content

Added JNumber conversions #38

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 6 commits into from
Closed

Added JNumber conversions #38

wants to merge 6 commits into from

Conversation

mdedetrich
Copy link
Owner

This is an implementation which adds various conversions from JNumber to other number types (see #37). Unlike the previous implementation, these conversions

  • Aren't done by a typeclass but rather by direct methods
  • Are correct in that they won't lose precision (i.e. a toDouble will return None if the value inside the JNumber is too big and you will lose data)

To make the various number converter calls as performant as possible, when you construct a JNumber we store a bitFlag (called constructedFlag) which tracks how a JNumber is constructed. This means that if you construct a JNumber as a double with its argument (i.e. JNumber(3446437343d)), we first do a check against this bitflag (which is a single if statement) and immediately return a Some(value.toDouble) (since we know its a double), otherwise we do a more expensive check. This also means that the JNumber has an extra int per instance of JNumber in terms of memory (so we are trading a very small amount of extra memory for a more performant number conversion in typical use cases).

The checks were inspired from argonaut (i.e. https://github.com/argonaut-io/argonaut/blob/master/argonaut/shared/src/main/scala/argonaut/JsonNumber.scala) afaik, these are known to be correct

One note, due to how ast.unsafe.JNumber was constructed (the String constructer by default is public), we now added the a second parameter to the constructor list which is not hidden (i.e. its final case class JNumber(value: String, constructedFlag: Int = 0)) which means the unapply method has been changed. Although this change is breaking, it means the unsafe AST does give you control over specifying the constructedFlag.

To be done

  • Proper tests
  • More benchmarks for other number types

Here is the current results for the benchmarks

> benchmarkJVM/testOnly scalajson.ast.JNumberConversionBenchmark
[info] ::Benchmark intBitFlagCheckSuccess::
[info] cores: 8
[info] hostname: Matthews-MacBook-Pro.local
[info] name: Java HotSpot(TM) 64-Bit Server VM
[info] osArch: x86_64
[info] osName: Mac OS X
[info] vendor: Oracle Corporation
[info] version: 25.111-b14
[info] Parameters(seed -> 300000): 0.004583
[info] Parameters(seed -> 600000): 0.004508
[info] Parameters(seed -> 900000): 0.00365
[info] Parameters(seed -> 1200000): 0.003651
[info] Parameters(seed -> 1500000): 0.003779
[info]
[info] ::Benchmark intManualCheckSuccess::
[info] cores: 8
[info] hostname: Matthews-MacBook-Pro.local
[info] name: Java HotSpot(TM) 64-Bit Server VM
[info] osArch: x86_64
[info] osName: Mac OS X
[info] vendor: Oracle Corporation
[info] version: 25.111-b14
[info] Parameters(seed -> 300000): 0.008185
[info] Parameters(seed -> 600000): 0.007592
[info] Parameters(seed -> 900000): 0.007627
[info] Parameters(seed -> 1200000): 0.007783
[info] Parameters(seed -> 1500000): 0.007503
[info]
[info] ::Benchmark floatBitFlagCheckSuccess::
[info] cores: 8
[info] hostname: Matthews-MacBook-Pro.local
[info] name: Java HotSpot(TM) 64-Bit Server VM
[info] osArch: x86_64
[info] osName: Mac OS X
[info] vendor: Oracle Corporation
[info] version: 25.111-b14
[info] Parameters(seed -> 300000): 0.004514
[info] Parameters(seed -> 600000): 0.004442
[info] Parameters(seed -> 900000): 0.00447
[info] Parameters(seed -> 1200000): 0.010328
[info] Parameters(seed -> 1500000): 0.010266
[info] Parameters(seed -> 300000): 0.004446
[info] Parameters(seed -> 600000): 0.004409
[info] Parameters(seed -> 900000): 0.004563
[info] Parameters(seed -> 1200000): 0.010455
[info] Parameters(seed -> 1500000): 0.010432
[info] Parameters(seed -> 300000): 0.004532
[info] Parameters(seed -> 600000): 0.004559
[info] Parameters(seed -> 900000): 0.004381
[info] Parameters(seed -> 1200000): 0.010438
[info] Parameters(seed -> 1500000): 0.01039
[info] Parameters(seed -> 300000): 0.004513
[info] Parameters(seed -> 600000): 0.004333
[info] Parameters(seed -> 900000): 0.004222
[info] Parameters(seed -> 1200000): 0.010413
[info] Parameters(seed -> 1500000): 0.010087
[info] Parameters(seed -> 300000): 0.004332
[info] Parameters(seed -> 600000): 0.004259
[info] Parameters(seed -> 900000): 0.004153
[info] Parameters(seed -> 1200000): 0.010197
[info] Parameters(seed -> 1500000): 0.010366
[info]
[info] ::Benchmark floatManualCheckSuccess::
[info] cores: 8
[info] hostname: Matthews-MacBook-Pro.local
[info] name: Java HotSpot(TM) 64-Bit Server VM
[info] osArch: x86_64
[info] osName: Mac OS X
[info] vendor: Oracle Corporation
[info] version: 25.111-b14
[info] Parameters(seed -> 300000): 0.016463
[info] Parameters(seed -> 600000): 0.016133
[info] Parameters(seed -> 900000): 0.015968
[info] Parameters(seed -> 1200000): 0.02228
[info] Parameters(seed -> 1500000): 0.022384
[info] Parameters(seed -> 300000): 0.016113
[info] Parameters(seed -> 600000): 0.016185
[info] Parameters(seed -> 900000): 0.0167
[info] Parameters(seed -> 1200000): 0.023496
[info] Parameters(seed -> 1500000): 0.022714
[info] Parameters(seed -> 300000): 0.016185
[info] Parameters(seed -> 600000): 0.016222
[info] Parameters(seed -> 900000): 0.016077
[info] Parameters(seed -> 1200000): 0.022478
[info] Parameters(seed -> 1500000): 0.022686
[info] Parameters(seed -> 300000): 0.016302
[info] Parameters(seed -> 600000): 0.016113
[info] Parameters(seed -> 900000): 0.015989
[info] Parameters(seed -> 1200000): 0.022411
[info] Parameters(seed -> 1500000): 0.022288
[info] Parameters(seed -> 300000): 0.016191
[info] Parameters(seed -> 600000): 0.016106
[info] Parameters(seed -> 900000): 0.015932
[info] Parameters(seed -> 1200000): 0.022691
[info] Parameters(seed -> 1500000): 0.022319
[info]
[info] Passed: Total 0, Failed 0, Errors 0, Passed 0

@ktoso @travisbrown @Ichoran @eed3si9n @sirthias Thoughts?

There is now a constructedFlag which tracks how a JNumber is created
@mdedetrich mdedetrich changed the title Added JNumber conversions Added JNumber conversions Jul 11, 2017
@codecov-io
Copy link

codecov-io commented Jul 11, 2017

Codecov Report

Merging #38 into master will decrease coverage by 9.25%.
The diff coverage is 17.58%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #38      +/-   ##
==========================================
- Coverage   44.68%   35.43%   -9.26%     
==========================================
  Files           5        5              
  Lines         687      906     +219     
  Branches      133      194      +61     
==========================================
+ Hits          307      321      +14     
- Misses        380      585     +205
Impacted Files Coverage Δ
...s/src/main/scala/scalajson/ast/unsafe/JValue.scala 0% <0%> (ø) ⬆️
js/src/main/scala/scalajson/ast/JValue.scala 0% <0%> (ø) ⬆️
shared/src/main/scala/scalajson/ast/package.scala 59.91% <100%> (+2.13%) ⬆️
...m/src/main/scala/scalajson/ast/unsafe/JValue.scala 42.79% <22.97%> (-16.72%) ⬇️
jvm/src/main/scala/scalajson/ast/JValue.scala 47.36% <25.67%> (-26.03%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fd5637b...171f30d. Read the comment docs.

Some(value.toFloat)
else {
val asFloat = value.toFloat
if (BigDecimal(value) == BigDecimal(asFloat.toDouble))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Scala's toFloat is more relaxed on this.

> eval "1.1111111111111111111111".toFloat
[info] ans: Float = 1.1111112
  1. If I am a function that was given just JValue, how would I know what content your value contains?
  2. Does it make sense to be more strict than String's toFloat here?
  3. Suppose yes, would it make sense to return Try or throw exception so at least we'd know why you've bailed out?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH, I wasn't entirely sure how floats worked in Scala, but after reading this I think I need to do some more research If I have a better way of detecting losss of precision

To answer your questions

  1. You can always read the raw the string value in a JNumber. The point of these converters is to correctly convert the string to a number (correctly here means without losing any data). As we can tell, this is quite hard ;)
  2. Yes, else these converters are pointless. If its not possible then I will remove toFloat (nothing is stopping you from doing a .toFloat on the actual string value)
  3. Well it currently returns Option[Float]. Returning None implies that you have lost some data (unless I am missing something here?)

Copy link
Contributor

@eed3si9n eed3si9n Jul 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spray used BigDecimal internally, so toFloat works like this:

eval BigDecimal("1.1111111111111111111111").toFloat
[info] ans: Float = 1.1111112

If we care about correctness so much, my vote would be get rid of toFloat altogether. See also Circe - https://github.com/circe/circe/blob/master/modules/core/shared/src/main/scala/io/circe/JsonNumber.scala

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, ill have a look to get rid of it

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting rid of Float for correctness is no more sensible than getting rid of Double for correctness. Also, I am not convinced that possible loss of precision of trailing digits is a valid reason for returning None for a floating point value. For example, 1.0/7 is printed as 0.14285714285714285 but the conventional rounding, which also parses to the same number, is 0.14285714285714286.

0.142857142857142857142857142857142857142857142857142857...
0.142857142857142849212692681248881854116916656494140625
                  ^ stops matching decimal expansion

So neither having the exact decimal representation of the double value nor having the best rounding of that value really represents what's going on. Instead, if you overflow (get positive or negative infinity) that's wrong; and if you underflow (the number is not zero but you return zero) that is also wrong. Otherwise, the parse should just return the best representation of the value, and everyone should keep in mind (just like always) Float and Double might not represent things exactly.

The reason to get rid of toFloat is if you want to care about exactness less because you already can't win that battle.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting rid of Float for correctness is no more sensible than getting rid of Double for correctness.

Both def toFloat: Option[Float] and def toDouble: Option[Double] that adds shape checking (BigDecimal(value) == BigDecimal(asFloat.toDouble)) are ridiculous. Because IEEE floats internally represent decimals as a binary fraction, it can't express 0.1.

If we are letting them survive, trusting that the user knows what they are doing, I think the Scala idiomatic way is to behave like StringOps and def toFloat: Float and def toDouble: Double return whatever value.toFloat / value.toDouble returns.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eed3si9n - BigDecimal(value) == BigDecimal(asDouble) will be true for 0.1, but the float versions will be nonsensical. (Note that the Scala BigDecimal wraps the Java one to be a little less weird about 0.1: the default assumption is that you meant, by your Double, its canonical string representation, not the crazily-long binary fraction expansion.)

def this(value: Double) = this(value.toString)

final class JNumber private[ast] (val value: String)(
private[ast] val constructedFlag: Int)
Copy link
Contributor

@eed3si9n eed3si9n Jul 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could constructedFlag be a smell that we are doing poor man's object orientation?
If we want lossless (and performant?) round trip from Double -> JNumber -> Double, and Json -> AST -> Json, would it make sense to split this class into private[ast] JInt, JLong, JFloat, JDouble, JBigInt, JBigDecimial, and JStringNumber?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I didn't realise this at the time, @sirthias what do you think of this? If we are going to do performant number conversions may as well represent the cases as various classes

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's some pretty wide fanout. In principle if encoding and decoding in memory is a likely scenario, this is the way to do it, but I'm skeptical that this is an important use case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's #41 that splits JNumber up.

@eed3si9n eed3si9n mentioned this pull request Jul 24, 2017
# Conflicts:
#	js/src/main/scala/scalajson/ast/JValue.scala
#	js/src/main/scala/scalajson/ast/unsafe/JValue.scala
#	jvm/src/main/scala/scalajson/ast/JValue.scala
#	jvm/src/main/scala/scalajson/ast/unsafe/JValue.scala
@mdedetrich mdedetrich closed this Dec 11, 2017
@mdedetrich mdedetrich deleted the jNumberConversions branch December 11, 2017 22:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants