
Function.applyN #156


Merged
merged 1 commit into from
Apr 13, 2018

Conversation

matthewleon
Contributor

addresses #155

I haven't thought of a great intuitive example for this, as the places I'm using it are a bit abstruse. A lot of the obvious examples often seem better suited to using Semigroup or Monoid. Any ideas?

@garyb
Member

garyb commented Jan 28, 2018

I was going to mention on your other issue about that - it does seem to me that this is very Monoidy - if used for a monoidal type it's just the power function but less efficient.

@matthewleon
Contributor Author

> if used for a monoidal type it's just the power function but less efficient.

But there are definitely cases where this makes sense, like wanting stack safety or just not wanting to accumulate a large thunk. Just have to think of a simple example to attach here.

@matthewleon
Contributor Author

My brain might be a bit rattled now, but I think particularly for the Endo monoid, power should always be less efficient than this, no?

@garyb
Member

garyb commented Jan 28, 2018

power will always result in fewer appends than this, it's logarithmic, but you're right about stack safety.

@natefaubion

There's no reason you can't make a stack safe Endo http://try.purescript.org/?backend=core&gist=8b235e18090ec1a003bdf2d6f417668d

@matthewleon
Contributor Author

> power will always result in fewer appends than this, it's logarithmic, but you're right about stack safety.

A bit confused by this statement, as applyN is not actually resulting in any building of thunks by composition. The way I see it, power will build a thunk in, yes, O(log(n)), but then there will still be n iterations of the function that it is composing, whereas applyN just goes ahead and iterates the function n times, without any thunk-building. Perhaps I should try to benchmark a demo that iterates something like (_ + 1) a very large number of times to see if my thinking is straight here.
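
To make the two strategies concrete, here is a minimal sketch. `powerCompose` is a hypothetical stand-in for `power` specialised to function composition (it is not the library's code), and the import assumes `applyN` ends up exported from `Data.Function` as this PR proposes.

```purescript
module Sketch.PowerVsApplyN where

import Prelude
-- Assumption: applyN is exported from Data.Function as proposed in this PR.
import Data.Function (applyN)

-- Hypothetical stand-in for `power` on functions under composition.
-- Repeated squaring builds the composed function with O(log n)
-- compositions, but running the result still calls `f` n times.
powerCompose :: forall a. (a -> a) -> Int -> (a -> a)
powerCompose f = go
  where
  go n
    | n <= 0 = \x -> x
    | n == 1 = f
    | n `mod` 2 == 0 = let g = go (n / 2) in g <<< g
    | otherwise = let g = go (n / 2) in g <<< g <<< f

-- Builds the composed function first, then runs it.
viaPower :: Int
viaPower = powerCompose (_ + 1) 10000 0

-- Iterates directly, without building a composed function.
viaApplyN :: Int
viaApplyN = applyN (_ + 1) 10000 0
```

Both values come out the same; the difference under discussion is only in how the work is carried out.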

@matthewleon
Contributor Author

> There's no reason you can't make a stack safe Endo http://try.purescript.org/?backend=core&gist=8b235e18090ec1a003bdf2d6f417668d

While this is stack safe, and also pretty cool, power itself remains non-stack safe, no?

That said, I feel like this distracts a bit from my main point, which is the one I'm making in the previous comment. Unless I'm misunderstanding how power will work in these contexts... Which is entirely possible :)

@matthewleon
Contributor Author

> There's no reason you can't make a stack safe Endo http://try.purescript.org/?backend=core&gist=8b235e18090ec1a003bdf2d6f417668d

This also is substituting materializing an Array for creating a thunk. When combined with power, you'll be creating a bunch of arrays and then throwing them at the GC.

That said, I do still think it's cool, and probably has its place, no?

@natefaubion

Yes on both counts, but since power is logarithmic, I wonder what the overhead actually is for realistic sizes. It's not allocating an array for every item, nor is it eating stack for every item. Array obviously has an upper bound where it stops being efficient.

-- |
-- | If n is less than or equal to 0, the function is not applied.
applyN :: forall a. (a -> a) -> Int -> a -> a
applyN f n = go n
Contributor

I think eta-reducing this (applyN f = go) would be nice, so that there are fewer variables of type Int in scope, which means there's less to keep in your head while reading this. We then wouldn't need to use a primed identifier in the definition of go, I think.

Contributor

Also I think it might not be entirely clear what n is in the comments: how about this?

The function applyN f n applies the function f to its argument n times.

I think examples using f = (_+1) are probably best, seeing as there are few other functions and data types in scope here for us to use. Examples using that f would hopefully be easy to understand, too.
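
A couple of examples along those lines, as a sketch. It assumes `applyN` is exported from `Data.Function` with the signature proposed in this PR; the names `countUp` and `untouched` are made up for illustration.

```purescript
module Sketch.ApplyNExamples where

import Prelude
-- Assumption: applyN is exported from Data.Function as proposed in this PR.
import Data.Function (applyN)

-- Applying (_ + 1) ten times to 0 counts up to 10.
countUp :: Int
countUp = applyN (_ + 1) 10 0

-- With a count of zero or less, the function is not applied at all,
-- so the argument 5 comes back unchanged.
untouched :: Int
untouched = applyN (_ + 1) (-3) 5
```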

@@ -106,6 +105,9 @@ infixl 4 greaterThanOrEq as >=
-- | Compares two values by mapping them to a type with an `Ord` instance.
comparing :: forall a b. Ord b => (a -> b) -> (a -> a -> Ordering)
comparing f = compare `on` f
  where
Contributor

I think it might be a little nicer to just inline on here, instead of defining it and using it immediately.

@hdgarrood
Contributor

hdgarrood commented Jan 28, 2018

log2 (i.e. log to base 2) grows so slowly that I think we can pretty much consider power to be stack-safe. You'd need an astronomically huge n for it to be a problem. Assuming that each recursive call consumes a constant amount of stack m (please let me know if this assumption seems unsafe), and letting the total stack size be M, we're okay as long as m * log2 n < M. Equivalently, n * 2^m < 2^M, or n < 2^(M - m). I think empirically 10^4 is a sensible estimate for M, and M - m is generally going to be very close to M. So we're okay for n < 2^(10^4) (ish). Of course it's impossible for n to get anywhere near that size, because it's an Int.

So we need m * log2 n < M, or equivalently log2 n < M/m, or n < 2^(M/m), i.e. n must be less than 2 raised to M/m, which is the number of recursions we can make before causing a stack overflow. Empirically (waves hands) that number is often around 10^4, but 2^(10^4) is significantly larger than what will fit in an Int.

@hdgarrood
Contributor

Wait no, this is wrong. 2^(m * log2 n) is not equal to n * 2^m. Let me reconsider this.
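
For reference, the bound can be worked through without the mistaken step, under the same assumption that each recursive call uses a constant amount of stack $m$ out of a total $M$. PureScript's `Int` is a 32-bit signed integer, so `top :: Int` is $2^{31} - 1$:

```latex
\[
m \log_2 n < M
\iff \log_2 n < \frac{M}{m}
\iff n < 2^{M/m},
\qquad
n \le 2^{31} - 1 \implies \log_2 n < 31 .
\]
```

So a halving-based recursion over an `Int` goes roughly 31 levels deep at most, far below the rough estimate of $10^4$ recursions mentioned above.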

@@ -77,6 +81,20 @@ applyFlipped x f = f x
-- | ```
infixl 1 applyFlipped as #

-- | Applies the function `f` to its argument `n` times.
Contributor

The thing about this is that, while it might be clear what f and n are to someone looking at the source, it's not really as clear if you're looking at the docs on Pursuit. You kind of have to be familiar with the convention that n is for naturals and f is for functions, which is often safe to assume, but probably best not to in the Prelude. This is why I'd like to say something like "the function applyN f n applies the function f ..." so that it's clear what f and n are. It doesn't have to be that verbatim, but I would like f and n to be introduced a little more explicitly.

applyN :: forall a. (a -> a) -> Int -> a -> a
applyN f = go
  where
  go n' acc
Contributor

Could just use n here instead of n' now that it won't shadow?

Contributor Author

Will do.
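
Putting the review suggestions together (eta-reduced `applyN f = go`, an unprimed `n`, and a doc comment that introduces `f` and `n`), the definition might look like the sketch below. The guard bodies are an assumption, since the diff excerpts above show only the first lines of the function.

```purescript
module Sketch.ApplyN where

import Prelude

-- | `applyN f n` applies the function `f` to its argument `n` times.
-- |
-- | If `n` is less than or equal to 0, the function is not applied.
applyN :: forall a. (a -> a) -> Int -> a -> a
applyN f = go
  where
  -- `acc` carries the value produced so far. The recursive call is in
  -- tail position, so no pending work accumulates between steps.
  -- (Assumed body: not shown in the diff excerpts.)
  go n acc
    | n <= 0 = acc
    | otherwise = go (n - 1) (f acc)
```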

@matthewleon
Contributor Author

@hdgarrood I think I've addressed your review comments in this latest amendment.

@matthewleon
Contributor Author

I've created a branch of purescript-monoid that benchmarks a comparison between applyN and power with Endo, using the function (_ + 1) iterated 10000 times. Results:

applyN
mean   = 63.70 μs
stddev = 38.25 μs
min    = 55.92 μs
max    = 3.09 ms
power Endo
mean   = 98.29 μs
stddev = 26.90 μs
min    = 89.86 μs
max    = 812.34 μs

I'd say that's a difference in efficiency that makes this function worthwhile.

This thread has gone in a few different directions, so I'd like to summarize, roughly in order of importance, why I think this function is worthwhile.

  1. In the first place, applyN is much more beginner-friendly than power and Endo.
  2. Writing it this way, rather than using power and Endo, gives us a nice boost in efficiency, as reflected in the mini benchmark, as we don't materialize any intermediate data structure (thunks or otherwise).
  3. We use constant, rather than logarithmic, stack space (I'll avoid the term "stack-safe" here, as it's a bit subjective).

@matthewleon
Contributor Author

Benchmark branch, for the curious: https://github.com/matthewleon/purescript-monoid/tree/bench-endo

@hdgarrood
Contributor

hdgarrood commented Jan 28, 2018

What happens if you use n = top :: Int in your benchmark? I'd like to think both implementations would work.

@matthewleon
Contributor Author

> What happens if you use n = top :: Int in your benchmark? I'd like to think both implementations would work.

Ran them a single time each:

applyN
mean   = 7.08 s
stddev = NaN s
min    = 7.08 s
max    = 7.08 s
power Endo
mean   = 23.01 s
stddev = NaN s
min    = 23.01 s
max    = 23.01 s

Do keep in mind, though, that power might be getting invoked in the context of another recursive function...

All that being said, I don't think the stack issues here are really the main attraction. The constant stack space is a nice "perk," but the speed difference and ergonomics are, I think, more important.
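
Read as ratios of the reported means, the two sets of figures above work out as follows. The second set comes from a single run each, so it is only indicative:

```latex
\[
\frac{98.29\ \mu\text{s}}{63.70\ \mu\text{s}} \approx 1.54
\qquad\text{and}\qquad
\frac{23.01\ \text{s}}{7.08\ \text{s}} = 3.25 .
\]
```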

@hdgarrood
Contributor

> All that being said, I don't think the stack issues here are really the main attraction. The constant stack space is a nice "perk," but the speed difference and ergonomics are, I think, more important.

Yep, agreed.

garyb merged commit 3cbf8d3 into purescript:master Apr 13, 2018
garyb mentioned this pull request Apr 13, 2018
@NickHu

NickHu commented Jun 28, 2020

Sorry to drag up an old thread, but can I ask why the iteration is not terminated early if you find a fixed point at $< n$ iterations? a -> a is the type of pure endofunctions right? I'm not sure what the purescript compiler will compile this down to, but I'm imagining that this can be made more efficient if the iteration function is very expensive to compute but expected to converge quickly - please let me know if this is not the case.

@hdgarrood
Contributor

I would say this is because adding an Eq constraint would make it impossible to use with types which aren’t Eq, and also because checking whether we have reached a fixed point each time we recurse could make this much more expensive than it is currently when used with functions which don’t have any fixed points.

@NickHu

NickHu commented Jun 29, 2020

Ah, I had forgotten about the Eq constraint; would there be scope to add such a function that I mentioned, as applyN' or something? (Probably a better name exists)

If so, I can open a PR
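
For illustration, a variant along those lines could be sketched as below. The name `applyNUntilFixed` is hypothetical and the function is not part of Prelude; it only shows where the `Eq` constraint and the per-step comparison mentioned above come in.

```purescript
module Sketch.ApplyNUntilFixed where

import Prelude

-- Hypothetical variant (not part of Prelude): like applyN, but stops as
-- soon as applying `f` no longer changes the value.
applyNUntilFixed :: forall a. Eq a => (a -> a) -> Int -> a -> a
applyNUntilFixed f = go
  where
  go n acc
    | n <= 0 = acc
    | otherwise =
        let
          next = f acc
        in
          -- One extra equality check per step: this is the added cost
          -- for functions that never reach a fixed point.
          if next == acc then acc else go (n - 1) next
```

For example, `applyNUntilFixed (\x -> min 5 (x + 1)) 100 0` stops once the value reaches 5 instead of running all 100 steps.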

@hdgarrood
Contributor

I don't think this function is generally useful enough to earn a spot in Prelude, sorry.
