(Used to be a discussion) Local Error Reporting #1778

@jpolitz

This is a proposal for a way to handle erroneous execution in Pyret that emphasizes making "throwing" an error lightweight, while still handling them in a typable way without catch. It was languishing in a Slack post, and I figure it's better to have it in a more central place. I don't expect to move on this any time soon.

A New Control-Flow Construct

Pyret gets a new keyword, failwith, which is part of a function declaration:

fun string-to-number(s :: String) -> Number
  failwith fail :: Error:

  ch = str-substring(s, 0, 1)
  when is-not-numeric(ch):
    fail(invalid-input("Not a numeric string", ch))
  end
end

failwith binds a continuation, at the point where the function starts, for reporting errors. In the example above, the continuation is bound to the identifier fail, which here accepts arguments of type Error, a built-in datatype in Pyret. The first call to fail within the dynamic extent of a call to string-to-number abandons the rest of that call and jumps back to its starting point, carrying the error value. Calling a particular binding of fail more than once, or outside the extent of the string-to-number call that bound it, is a bug and causes an uncatchable error. That is, fail is an escape continuation.
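One way to picture these semantics is a small executable model. The sketch below is in Python rather than Pyret, since failwith does not exist yet. It assumes the jump can be modelled with an exception and that fail is a one-shot object. The names Escape, DeadContinuation, FailContinuation and call_with_fail are made up for illustration, and DeadContinuation is an ordinary Python exception standing in for the proposal's uncatchable error.

```python
class Escape(BaseException):
    """Unwinds the stack to the call boundary that bound `fail`."""

    def __init__(self, owner, value):
        super().__init__(value)
        self.owner = owner   # the continuation that was invoked
        self.value = value   # the error value being reported


class DeadContinuation(Exception):
    """Stand-in (assumption) for the proposal's uncatchable error."""


class FailContinuation:
    def __init__(self):
        self.live = True

    def __call__(self, value):
        if not self.live:
            raise DeadContinuation("fail used twice or outside its extent")
        self.live = False          # one-shot: a second call is a bug
        raise Escape(self, value)


def call_with_fail(body, *args):
    """Run body(fail, *args) and report whether it returned or escaped."""
    fail = FailContinuation()
    try:
        return ("returned", body(fail, *args))
    except Escape as esc:
        if esc.owner is not fail:  # someone else's continuation: keep unwinding
            raise
        return ("failed", esc.value)
    finally:
        fail.live = False          # the extent of this call is over


def string_to_number(fail, s):
    # Hypothetical stand-in for the Pyret example: reject any non-digit.
    for ch in s:
        if not ch.isdigit():
            fail(("invalid-input", "Not a numeric string", ch))
    return int(s)
```

In this model, call_with_fail(string_to_number, "123") gives ("returned", 123), while "1a2" gives ("failed", ...) with the error value passed to fail. A fail object that is stored and called after its call has finished raises DeadContinuation instead of jumping anywhere.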

Strawman alternative: Pyret could have a new keyword that is a local control-flow construct. For example, we could call it local-throw, and it could jump to the boundary of the enclosing function. However, Pyret has so many functions defined for things like for and local helpers that a useful meaning of "the nearest enclosing function" is tricky to pin down (a similar problem arises with this in JavaScript). We know enough about continuations to bind them as identifiers and allow helper functions, etc, to trigger the failure.

Handling Errors

With no other changes, this would merely act as a new binding position for the existing raise function. A caller of string-to-number has no special way to catch the failure, so this call would act just as it does in current Pyret, terminating the whole computation if it fails:

check:
  string-to-number("1a2") raises "invalid-input"
end

Catching Errors

To handle errors, Pyret gets a new syntactic form for calling functions, which will evaluate to an Either of the successful answer or the error value. To use it, append ? to the function name:

check:
  string-to-number?("1a2")
    is left(invalid-input("Not a numeric string", "a"))
  string-to-number?("123") is right(123)
end

Note that a ?-call works naturally on functions that aren't defined with a failwith clause. A function without a failwith clause behaves the same in a ? expression as a function with a failwith clause that never signals an error.

Local Handling Only

This new ? calling form only catches errors raised directly by the failure handler of the function being called, not any other failures. This means that if the author of a function didn't consider catching an error or raising it explicitly, the caller cannot choose to notice and handle those errors. For example, if we write a function that calls string-to-number but doesn't handle invalid input errors, the function always errors, no matter how it is called:

fun add-strings(s1, s2):
  string-to-number(s1) + string-to-number(s2)
end
check:
  add-strings("1ab", "123") raises "invalid-input"
  add-strings?("1ab", "123") raises "invalid-input"
end

This local handling rule is a design decision that deliberately weakens the extent of error handling:

It encourages catching errors close to where they happen, since they can only be handled immediately at the call that may error.

It correctly ascribes blame to the particular function invocation that caused the error, as opposed to an exception handler that could have received an exception from one of many locations in the program.

It makes type-checking simpler, since a call site only needs to know the declared failwith type in order to type-check a ? expression to the correct Either type.
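The add-strings behaviour can be checked against a small model. This is a Python sketch, not Pyret, and it makes several assumptions: a function with a failwith clause is modelled as a Python function that takes fail as its first parameter, plain_call and question_call are invented names for the two calling forms, and Aborted stands in for a failure that terminates the whole computation. One-shot checking of fail is left out to keep the sketch short.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Left:
    value: Any   # the error, as in the proposal's left(...)


@dataclass(frozen=True)
class Right:
    value: Any   # the successful answer


class Failure(BaseException):
    def __init__(self, owner, value):
        super().__init__(value)
        self.owner = owner
        self.value = value


class Aborted(Exception):
    """Stand-in (assumption) for a failure that ends the whole computation."""


def plain_call(f, *args):
    """Model of f(args): a failure reported by f aborts the computation."""
    def fail(value):
        raise Aborted(value)
    return f(fail, *args)


def question_call(f, *args):
    """Model of f?(args): only f's own fail produces a Left."""
    def fail(value):
        raise Failure(fail, value)
    try:
        return Right(f(fail, *args))
    except Failure as failure:
        if failure.owner is not fail:   # raised for some other call site
            raise
        return Left(failure.value)


def string_to_number(fail, s):
    for ch in s:
        if not ch.isdigit():
            fail(("invalid-input", ch))
    return int(s)


def add_strings(fail, s1, s2):
    # Plain calls inside: add_strings never uses its own fail.
    return plain_call(string_to_number, s1) + plain_call(string_to_number, s2)
```

Here question_call(string_to_number, "1a2") produces a Left, but question_call(add_strings, "1ab", "123") still raises Aborted: the ? on the outer call has nothing to catch, because the failure was reported through the inner call's fail.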

Type-checking

This adds a new component to function types, which is the failwith position:

string-to-number :: (String -> Number ? Error)

Subtyping the failwith position is covariant, just like the return position.

A function that doesn't declare a failwith position has Bot as its failwith position:

number-to-string :: (Number -> String ? Bot)

Function annotations written without a ? position are sugar for Top in that position:

fun map<a,b>(f :: (a -> b), l :: List<a>) -> List<b>:
   ...
end

is sugar for

fun map<a,b>(f :: (a -> b ? Top), l :: List<a>) -> List<b>:
   ...
end

(Note that this may be one level more subtle – it might be that positive failwith positions get Bot and negative failwith positions get Top, to handle nested contravariant positions in annotations. This would need discussion on details and soundness.)

This version of map accepts a function with any error declaration, due to subtyping, and with the default implementation of map, would catch no errors and throw whatever underlying errors the callback throws.

Some Examples

Deferring errors in map

Note that we could also write map as:

fun map<a, b, c>(f :: (a -> b ? c), l :: List<a>) -> List<b>
  failwith fail :: c:
  fun map-help(lst):
    cases(List<a>) lst:
      | empty => empty
      | link(first, rest) =>
        cases(Either<c, b>) f?(first):
          | left(err) => fail(err)  # note that this calls the _outer_ fail
          | right(ans) =>
            link(ans, map-help(rest))
        end
    end
  end
  map-help(l)
end

Then map could be called with ? and return the (first) failure. Since the Either nonsense really pollutes the function, we could go one step further and move the unwrapping and forwarding logic into a helper:

fun defer-errors1<A, B, Err>(f :: (A -> B ? Err), fail :: (Err -> Bot)):
  lam(x):
    cases(Either<Err, B>) f?(x):
      | left(err) => fail(err)
      | right(ans) => ans
    end
  end
end
fun map<a, b, c>(f :: (a -> b ? c), l :: List<a>) -> List<b>
  failwith fail :: c:
  shadow f = defer-errors1(f, fail)
  cases(List<a>) l:
    | empty => empty
    | link(first, rest) => link(f(first), map(f, rest))
  end
end

Helpers like defer-errorsN can be built in for this kind of situation. This lets a function choose naturally whether it wants to handle errors from a callback, return the same catchable errors as the callback, or let a callback error simply terminate the whole program (e.g. because it's a bug). map could also choose not to terminate on the first error, and instead have its fail take a List<c>, reporting all the errors that occurred.
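The two choices for map can be compared in a short model. This is again a Python sketch under assumptions: functions with a failwith clause take fail as their first parameter, question_call is an invented stand-in for a ? call that returns a tagged tuple instead of an Either, and map_failing and map_collecting are hypothetical names. The model iterates rather than recursing, which the Pyret versions above do not.

```python
class Failure(BaseException):
    def __init__(self, owner, value):
        super().__init__(value)
        self.owner, self.value = owner, value


def question_call(f, *args):
    """Model of f?(args): returns ("left", err) or ("right", answer)."""
    def fail(value):
        raise Failure(fail, value)
    try:
        return ("right", f(fail, *args))
    except Failure as failure:
        if failure.owner is not fail:
            raise
        return ("left", failure.value)


def defer_errors1(f, outer_fail):
    """Wrap f so that its failures are forwarded to outer_fail."""
    def wrapped(x):            # models a lam with no failwith clause
        tag, value = question_call(f, x)
        if tag == "left":
            outer_fail(value)  # escapes to whoever bound outer_fail
        return value
    return wrapped


def map_failing(fail, f, items):
    """Stops at the first failing element and reports that error."""
    g = defer_errors1(f, fail)
    return [g(x) for x in items]


def map_collecting(fail, f, items):
    """Variant whose fail takes a list: reports every error at once."""
    results = [question_call(f, x) for x in items]
    errors = [value for tag, value in results if tag == "left"]
    if errors:
        fail(errors)
    return [value for _, value in results]


def parse_digits(fail, s):
    # Hypothetical callback with a failwith clause.
    if not s.isdigit():
        fail("not numeric: " + s)
    return int(s)
```

With the input ["1", "x", "y"], a ? call of map_failing yields a left holding only the error for "x", while map_collecting yields a left holding the errors for both "x" and "y". Which of these is the better default is exactly the open question in this section.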

This proposal doesn't come with a clear sense of what the "right" version of map is, though I suspect the one above is pretty good. The strength of the proposal is that it lets us experiment with different patterns, and come to a set of best practices for situations like map and filter. The weakness is that it's not clear the default is the right thing.

Files, Script vs Program

A quick script to do some work:

include file
text = read-file("my-data.txt")
lines = string-split(text, "\n")
line-count = lines.length()
sum = reduce(_ + _, map(string-to-number, lines))

A more principled program fragment:

import file as F
# A data structure to decide what to render in a UI file picker
# for numeric data files
data FileSummary:
  | file-error(f :: F.FileError)
  | invalid-format(s :: String)
  | summarized(line-count :: Number, sum :: Number)
end
...
fun get-file-summary(filename):
  maybe-text = F.read-file?(filename)
  cases(Either<F.FileError, String>) maybe-text:
    | left(err) => file-error(err)
    | right(text) =>
      lines = string-split(text, "\n")
      line-count = lines.length()
      as-numbers = map?(string-to-number, lines)
      cases(Either<Error, List<Number>>) as-numbers:
        | left(err) => invalid-format(tostring(err))
        | right(numbers) =>
          summarized(line-count, reduce(_ + _, numbers))
      end
  end
end
...

Implementation and Performance

Escape continuations don't cause any stack-copying, so in both a more traditional implementation and in the JavaScript compiler, we simply need to leave enough information on the stack to "jump" it back to the right place, and to mark fail continuations as no longer valid (in case they end up stored on the heap and called later).

In an assembly implementation, I'd change the calling convention to have success and failure continuations as stack/code pointers explicitly in the call. A ? call would point to code that allocates a right with the success and a left with the failure, and the return of the function would jump to the right place. A normal call would simply have a failure continuation that aborts the computation as normal.
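The calling convention can be imitated in a high-level language by passing both continuations as ordinary arguments. The following Python sketch is only an analogy for the assembly design: on_success, on_failure, plain_call, question_call and Aborted are invented names, and the callee has to invoke a continuation in tail position because Python has no way to jump to a code pointer.

```python
class Aborted(Exception):
    """Stand-in (assumption) for aborting the whole computation."""


def string_to_number(s, on_success, on_failure):
    # The callee receives both continuations explicitly and finishes by
    # invoking exactly one of them.
    for ch in s:
        if not ch.isdigit():
            return on_failure(("invalid-input", ch))
    return on_success(int(s))


def abort(err):
    raise Aborted(err)


def plain_call(f, *args):
    """Normal call: success returns the value, failure aborts."""
    return f(*args, lambda answer: answer, abort)


def question_call(f, *args):
    """? call: the continuations allocate the right or the left."""
    return f(*args,
             lambda answer: ("right", answer),
             lambda err: ("left", err))
```

In this model the only difference between the two calling forms is which pair of continuations the caller supplies: question_call(string_to_number, "1a2") returns a left, and plain_call on the same input raises Aborted. The callee is compiled once and does not need to know which form was used.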

Allocating the functions to do this in JavaScript is probably a non-starter, so we probably want fail to raise a special kind of exception that is caught by ? call sites and passes through other handlers. Then the catch of that special handler can allocate the left before using the result.

In the future, there are clear opportunities for inlining and optimizing ? calls that appear directly in Either case matches.
