Description
This is a proposal for a way to handle erroneous execution in Pyret that emphasizes making "throwing" an error lightweight, while still handling errors in a typable way without catch. It was languishing in a Slack post, and I figure it's better to have it in a more central place. I don't expect to move on this any time soon.
A New Control-Flow Construct
Pyret gets a new keyword, failwith, which is part of a function declaration:
fun string-to-number(s :: String) -> Number
  failwith fail :: Error:
  ch = string-substring(s, 0, 1)
  when is-not-numeric(ch):
    fail(invalid-input("Not a numeric string", ch))
  end
end
failwith binds a continuation at the point of the function start for reporting errors. In the example above, this is the identifier fail, which accepts arguments of type Error, a built-in datatype in Pyret. Calling fail for the first time within the dynamic extent of the call to string-to-number jumps back to the start of string-to-number with the error value. Calling a particular binding of fail more than once, or outside the extent of string-to-number, is a bug and causes an uncatchable error. That is, fail is an escape continuation.
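The one-shot, extent-limited behaviour described above can be sketched outside Pyret. The following is a minimal Python simulation, not an implementation of the proposal: call_with_fail, _Escape, and InvalidContinuation are hypothetical names, and a Python exception stands in for the jump back to the function start.

```python
class _Escape(Exception):
    """Carries an error value back to the call that bound fail."""
    def __init__(self, token, value):
        super().__init__(value)
        self.token = token
        self.value = value


class InvalidContinuation(BaseException):
    """Stands in for the proposal's uncatchable error (an assumption)."""


def call_with_fail(body):
    """Run body(fail); report ("ok", result) or ("failed", error value)."""
    token = object()      # identifies this particular binding of fail
    live = True           # True only until fail is used or the call ends

    def fail(value):
        nonlocal live
        if not live:
            raise InvalidContinuation("fail called twice or outside its extent")
        live = False
        raise _Escape(token, value)

    try:
        return ("ok", body(fail))
    except _Escape as esc:
        if esc.token is not token:
            raise         # belongs to some other binding of fail
        return ("failed", esc.value)
    finally:
        live = False      # the dynamic extent is over


def string_to_number(s, fail):
    # Rough analogue of the Pyret example; checks every character.
    for ch in s:
        if not ch.isdigit():
            fail(("invalid-input", "Not a numeric string", ch))
    return int(s)
```

A fail that leaks out of its call (for example, stored in a list and called later) hits the InvalidContinuation branch, which is the "mark as no longer valid" step the proposal asks of an implementation.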
Strawman alternative: Pyret could have a new keyword that is a local control-flow construct. For example, we could call it local-throw, and it could jump to the boundary of the enclosing function. However, Pyret has so many functions defined for things like for loops and local helpers that a useful meaning of "the nearest enclosing function" is tricky to pin down (a similar problem arises with this in JavaScript). We know enough about continuations to bind them as identifiers and allow helper functions, etc., to trigger the failure.
Handling Errors
With no other changes, this would merely act as a new binding position for the existing raise function. A caller of string-to-number has no special way to catch the failure, so this call would act just as it does in current Pyret, terminating the whole computation if it fails:
check:
  string-to-number("1a2") raises "invalid-input"
end
Catching Errors
To handle errors, Pyret gets a new syntactic form for calling functions, which evaluates to an Either of the successful answer or the error value. To use it, append ? to the function name:
check:
  string-to-number?("1a2")
    is left(invalid-input("Not a numeric string", "a"))
  string-to-number?("123") is right(123)
end
Note that a ?-call works naturally on functions that aren't defined with a failwith clause. A function without a failwith clause behaves the same in a ? expression as a function with a failwith clause that never signals an error.
Local Handling Only
This new ? calling form only catches errors raised directly by the failure handler of the function being called, not any other failures. This means that if the author of a function didn't consider catching an error or raising it explicitly, the caller cannot choose to notice and handle those errors. For example, if we write a function that calls string-to-number but doesn't handle invalid input errors, the function always errors, no matter how it is called:
fun add-strings(s1, s2):
  string-to-number(s1) + string-to-number(s2)
end
check:
  add-strings("1ab", "123") raises "invalid-input"
  add-strings?("1ab", "123") raises "invalid-input"
end
This local handling rule is a design decision that deliberately weakens the extent of error handling:
It encourages catching errors close to where they happen, since they can only be handled immediately at the call that may error.
It correctly ascribes blame to the particular function invocation that caused the error, as opposed to an exception handler that could have received an exception from one of many locations in the program.
It makes type-checking simpler, since a call site only needs to know the declared failwith type in order to type-check a ? expression to the correct Either type.
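To make the local handling rule concrete, here is a rough Python simulation under stated assumptions: failable, q, and Failure are hypothetical names, tagged tuples stand in for Either, and an owner token marks which call's fail was used. It ignores the one-shot validity check and only models which failures a ? call may catch.

```python
class Failure(Exception):
    """Raised by fail; owner identifies the call whose fail was used."""
    def __init__(self, owner, value):
        super().__init__(value)
        self.owner = owner
        self.value = value


def failable(fn):
    """Model a function declared with failwith: fn receives fail first."""
    def plain(*args, _owner=None):
        owner = _owner if _owner is not None else object()

        def fail(value):
            raise Failure(owner, value)

        return fn(fail, *args)

    plain.failable = True
    return plain


def q(fn, *args):
    """Model fn?(args): catch only failures signalled by this call's own fail."""
    if not getattr(fn, "failable", False):
        # No failwith clause: nothing to catch, nested failures pass through.
        return ("right", fn(*args))
    owner = object()
    try:
        return ("right", fn(*args, _owner=owner))
    except Failure as failure:
        if failure.owner is not owner:
            raise
        return ("left", failure.value)


@failable
def string_to_number(fail, s):
    for ch in s:
        if not ch.isdigit():
            fail(("invalid-input", ch))
    return int(s)


def add_strings(s1, s2):
    # Plain calls: a failure here is never catchable by add_strings' caller.
    return string_to_number(s1) + string_to_number(s2)
```

In this sketch q(add_strings, "1ab", "123") still raises, mirroring the add-strings? check above: the failure belongs to the inner string-to-number call, which nobody called with ?.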
Type-checking
This adds a new component to function types, which is the failwith position:
string-to-number :: (String -> Number ? Error)
Subtyping the failwith position is covariant, just like the return position.
A function that doesn't declare a failwith position has Bot as its failwith position:
number-to-string :: (Number -> String ? Bot)
Function annotations written without a ? position are sugar for Top in that position:
fun map<a,b>(f :: (a -> b), l :: List<a>) -> List<b>:
  ...
end
is sugar for
fun map<a,b>(f :: (a -> b ? Top), l :: List<a>) -> List<b>:
  ...
end
(Note that this may be one level more subtle: it might be that positive failwith positions get Bot and negative failwith positions get Top, to handle nested contravariant positions in annotations. This would need discussion on details and soundness.)
This version of map accepts a function with any error declaration, due to subtyping, and, with the default implementation of map, would catch no errors and throw whatever underlying errors the callback throws.
Some Examples
Deferring errors in map
Note that we could also write map as:
fun map<a, b, c>(f :: (a -> b ? c), l :: List<a>) -> List<b>
  failwith fail :: c:
  # f is captured from the enclosing map, so the helper only takes the list
  fun map-help(remaining):
    cases(List<a>) remaining:
      | empty => empty
      | link(first, rest) =>
        cases(Either<b, c>) f?(first):
          | left(c) => fail(c) # note that this calls the _outer_ fail
          | right(ans) =>
            link(ans, map-help(rest))
        end
    end
  end
  map-help(l)
end
Then map could be called with ? and return the (first) failure. Since the Either nonsense really pollutes the function, we could go one step further. The unwrapping and forwarding logic could be moved into a helper:
fun defer-errors1<A, B, Err>(f :: (A -> B ? Err), fail :: (Err -> Bot)):
  lam(x):
    cases(Either<B, Err>) f?(x):
      | left(err) => fail(err)
      | right(ans) => ans
    end
  end
end
fun map<a, b, c>(f :: (a -> b ? c), l :: List<a>) -> List<b>
  failwith fail :: c:
  shadow f = defer-errors1(f, fail)
  cases(List<a>) l:
    | empty => empty
    | link(first, rest) => link(f(first), map(f, rest))
  end
end
Helpers like defer-errorsN can be built in for this kind of situation. This lets a function choose naturally whether to handle errors from a callback, return the same catchable errors as the callback, or let a callback error simply terminate the whole program (e.g. because it's a bug). map could also choose not to terminate on the first error, and instead have its fail take a List<c>, reporting all the errors that occurred.
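As an illustration of that last variant, here is a small Python sketch of a map that keeps going after a failure and reports every error at the end. It is only one possible reading of the idea: map_collecting and try_parse are hypothetical names, and tagged tuples again stand in for Either, with try_f playing the role of f?.

```python
def map_collecting(try_f, items):
    """try_f models f?: it returns ("left", err) or ("right", answer)."""
    answers, errors = [], []
    for item in items:
        tag, value = try_f(item)
        if tag == "left":
            errors.append(value)    # remember the error and keep going
        else:
            answers.append(value)
    # Models map?: fail(errors) if anything failed, else the mapped list.
    return ("left", errors) if errors else ("right", answers)


def try_parse(s):
    """Stand-in for string-to-number? on a single string."""
    if s.isdigit():
        return ("right", int(s))
    return ("left", "invalid-input: " + s)
```

Whether callers would want the partial answers as well as the errors is left open here, as it is in the proposal.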
This proposal doesn't come with a clear sense of what the "right" version of map is, though I suspect the one above is pretty good. The strength of the proposal is that it lets us experiment with different patterns, and come to a set of best practices for situations like map and filter. The weakness is that it's not clear the default is the right thing.
Files, Script vs Program
A quick script to do some work:
include file
text = read-file("my-data.txt")
lines = string-split-all(text, "\n")
line-count = lines.length()
sum = reduce(_ + _, map(string-to-number, lines))
A more principled program fragment:
import file as F
# A data structure to decide what to render in a UI file picker
# for numeric data files
data FileSummary:
  | file-error(f :: F.FileError)
  | invalid-format(s :: String)
  | summarized(line-count :: Number, sum :: Number)
end
...
fun get-file-summary(filename):
  maybe-text = F.read-file?(filename)
  cases(Either<String, F.FileError>) maybe-text:
    | left(err) => file-error(err)
    | right(text) =>
      lines = string-split-all(text, "\n")
      line-count = lines.length()
      as-numbers = map?(string-to-number, lines)
      cases(Either<List<Number>, Error>) as-numbers:
        | left(err) => invalid-format(tostring(err))
        | right(numbers) =>
          summarized(line-count, reduce(_ + _, numbers))
      end
  end
end
...
Implementation and Performance
Escape continuations don't cause any stack-copying, so in both a more traditional implementation and in the JavaScript compiler, we simply need to leave enough information on the stack to "jump" it back to the right place, and to mark fail continuations as no longer valid (in case they end up stored on the heap and called later).
In an assembly implementation, I'd change the calling convention to have success and failure continuations as stack/code pointers explicitly in the call. A ? call would point to code that allocates a right with the success and a left with the failure, and the return of the function would jump to the right place. A normal call would simply have a failure continuation that aborts the computation as normal.
Allocating the functions to do this in JavaScript is probably a non-starter, so we probably want fail to raise a special kind of exception that is caught by ? call sites and passes through other handlers. Then the catch of that special handler can allocate the left before using the result.
There are clear opportunities for inlining and optimizing ? calls that appear directly in Either case matches in the future.