|
| 1 | +# About |
| 2 | + |
| 3 | +Many languages have a special value such as `null` or `None` to indicate a non-existent value. |
| 4 | +Because Julia is designed to handle large volumes of (often messy) data, it has multiple forms of [nothingness][nothingness]. |
| 5 | + |
| 6 | +The overall aim is to flag missing or suspect values as they are encountered, then continue without raising an exception. |
| 7 | + |
| 8 | +## `nothing` |
| 9 | + |
| 10 | +If a value really does not exist, it is represented by [`nothing`][nothing]. This is probably closest to what C (`NULL`) or Python (`None`) might do. |
| 11 | + |
| 12 | +```julia-repl |
| 13 | +julia> n = nothing |
| 14 | +
|
| 15 | +julia> isnothing(n) |
| 16 | +true |
| 17 | +
|
| 18 | +julia> typeof(n) |
| 19 | +Nothing |
| 20 | +``` |
| 21 | + |
| 22 | +So `nothing` is a singleton value of type `Nothing`, and we can test for it with [`isnothing()`][isnothing]. |
| 23 | + |
| 24 | +One common use of `nothing` is as a return (non-)value for functions which are used only for their side effects (printing, network configuration, or whatever). |
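As a minimal sketch of a function called only for its side effect, consider the example below. The name `log_message` is hypothetical and used purely for illustration.

```julia
# Hypothetical helper, called only for its side effect (printing).
function log_message(msg)
    println("LOG: ", msg)
    return nothing   # make the "no useful value" explicit
end

result = log_message("starting up")   # prints the message
isnothing(result)                     # true
```

Ending with an explicit `return nothing` avoids accidentally returning the value of whatever expression happens to come last.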
| 25 | + |
| 26 | +## `missing` |
| 27 | + |
| 28 | +For situations where a value exists in theory but we don't know what it is, [`missing`][missing] is used. For example, when counting vehicles traveling on a road, human observers might need a break or automatic sensors might fail, but the traffic continues to flow. |
| 29 | + |
| 30 | +Thus `missing` is a placeholder, warning humans that they need to make a decision about how to handle this gap in the data. |
| 31 | + |
| 32 | +```julia-repl |
| 33 | +julia> mv = [1, 2, missing] |
| 34 | +3-element Vector{Union{Missing, Int64}}: |
| 35 | + 1 |
| 36 | + 2 |
| 37 | + missing |
| 38 | +
|
| 39 | +julia> typeof(mv) |
| 40 | +Vector{Union{Missing, Int64}} (alias for Array{Union{Missing, Int64}, 1}) |
| 41 | +
|
| 42 | +julia> ismissing.(mv) # broadcast function, displays as 1 for true, 0 for false |
| 43 | +3-element BitVector: |
| 44 | + 0 |
| 45 | + 0 |
| 46 | + 1 |
| 47 | +``` |
| 48 | + |
| 49 | +Few other languages have this feature built in, but close analogues are `NA` in R and `NULL` in SQL. |
| 50 | + |
| 51 | +The vector type in the example above is set to `Union{Missing, Int64}`, which has some similarity to types such as `Option` or `Maybe` in other languages. |
| 52 | + |
| 53 | +Expressions usually return `missing` by default if `missing` values are present. |
| 54 | +If you want these values to be ignored, use the [`skipmissing()`][skipmissing] function to make this explicit: |
| 55 | + |
| 56 | +```julia-repl |
| 57 | +julia> mv = [1, 2, missing] |
| 58 | +3-element Vector{Union{Missing, Int64}}: |
| 59 | + 1 |
| 60 | + 2 |
| 61 | + missing |
| 62 | +
|
| 63 | +julia> sum(mv) # missing in, missing out |
| 64 | +missing |
| 65 | +
|
| 66 | +julia> skipmissing(mv) |
| 67 | +skipmissing(Union{Missing, Int64}[1, 2, missing]) |
| 68 | +
|
| 69 | +julia> collect(skipmissing(mv)) |
| 70 | +2-element Vector{Int64}: |
| 71 | + 1 |
| 72 | + 2 |
| 73 | +
|
| 74 | +julia> sum(skipmissing(mv)) # functions like sum() can work with iterators |
| 75 | +3 |
| 76 | +``` |
| 77 | + |
| 78 | +Because `skipmissing` creates an iterator, wrap it in [`collect()`][collect] if you need a vector. |
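The "missing in, missing out" behavior is not limited to `sum()`. The short sketch below shows that single arithmetic operations and comparisons propagate `missing` too, which is why `ismissing()` (or `isequal()`) is the safer way to test for it.

```julia
x = missing

a = x + 1              # missing: arithmetic propagates
b = (x == 1)           # missing: so do comparisons
c = (x == missing)     # also missing, not true
d = ismissing(x)       # true
e = isequal(x, missing) # true
```

Because `x == missing` is itself `missing`, using it directly as an `if` condition raises an error instead of silently choosing a branch.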
| 79 | + |
| 80 | +Sometimes it is helpful to replace `missing` values with some default. |
| 81 | +The [`@coalesce()`][coalesce] macro does this by returning the first non-missing argument (or `missing` if every argument is missing). |
| 82 | + |
| 83 | +```julia-repl |
| 84 | +julia> str = ["I", "exist", missing] |
| 85 | +3-element Vector{Union{Missing, String}}: |
| 86 | + "I" |
| 87 | + "exist" |
| 88 | + missing |
| 89 | +
|
| 90 | +julia> [@coalesce(s, "-") for s in str] |
| 91 | +3-element Vector{String}: |
| 92 | + "I" |
| 93 | + "exist" |
| 94 | + "-" |
| 95 | +``` |
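Base also provides a function form, `coalesce()`, which is not covered in the text above. As a rough sketch, it can be broadcast over a collection, which is sometimes more compact than a comprehension. Unlike the macro, it evaluates all of its arguments before choosing one.

```julia
str = ["I", "exist", missing]

# Broadcast the function form over the vector, with "-" as the default.
filled = coalesce.(str, "-")          # ["I", "exist", "-"]

# Several fallbacks can be listed; the first non-missing one is returned.
first_found = coalesce(missing, missing, 42)   # 42
none_found = coalesce(missing, missing)        # missing
```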
| 96 | + |
| 97 | +## `NaN` |
| 98 | + |
| 99 | +Short for "Not a Number", [NaN][NaN] flags a computation problem in situations where a number was expected. |
| 100 | + |
| 101 | +```julia-repl |
| 102 | +julia> v = [0, 1, -1] |
| 103 | +3-element Vector{Int64}: |
| 104 | + 0 |
| 105 | + 1 |
| 106 | + -1 |
| 107 | +
|
| 108 | +julia> v / 0 |
| 109 | +3-element Vector{Float64}: |
| 110 | + NaN |
| 111 | + Inf |
| 112 | + -Inf |
| 113 | + |
| 114 | +julia> sum(v / 0) |
| 115 | +NaN |
| 116 | +``` |
| 117 | + |
| 118 | +Almost any arithmetic on data that includes a `NaN` will give a `NaN` result. |
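A short sketch of how a single `NaN` behaves may help here. Note in particular that `NaN` is not equal to itself under `==`, so `isnan()` is the usual test.

```julia
x = NaN

p = x + 1            # NaN
q = x * 0            # NaN
r = (x == NaN)       # false: NaN compares unequal to everything, itself included
s = isnan(x)         # true
t = isequal(x, NaN)  # true
```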
| 119 | + |
| 120 | +There is currently no special function to remove `NaN` values, but the standard [`filter()`][filter] function can do this quite simply. |
| 121 | +Only values for which some given condition is `true` will be copied to the result array: |
| 122 | + |
| 123 | +```julia-repl |
| 124 | +julia> filter(!isnan, [1, 2, NaN]) |
| 125 | +2-element Vector{Float64}: |
| 126 | + 1.0 |
| 127 | + 2.0 |
| 128 | +``` |
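Real data can contain both `missing` and `NaN` values. One possible way to drop both, sketched here with a made-up `readings` vector, is to combine `skipmissing()` with an `isnan()` check in a comprehension.

```julia
# Hypothetical sensor readings with both kinds of gap.
readings = [1.5, NaN, missing, 2.5]

# skipmissing() drops the missing entry; the condition drops the NaN.
clean = [x for x in skipmissing(readings) if !isnan(x)]   # [1.5, 2.5]

total = sum(clean)   # 4.0
```

The order matters: calling `isnan(missing)` returns `missing`, which cannot be used as a filter condition, so the `missing` values are skipped first.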
| 129 | + |
| 130 | +## Undefined data |
| 131 | + |
| 132 | +This is something you are likely to use less frequently. |
| 133 | +Exceptions include: |
| 134 | +- There are ways to create composite types (such as [`struct`][struct]) with uninitialized fields. |
| 135 | +Fields of plain bits types (such as `Int64`) then hold arbitrary leftover data, while fields of other types are left as undefined references. |
| 136 | +- In some cases, initializing a vector as `undef` can be a flexible and performant approach. |
| 137 | +This has been used, for example, in community solutions to the [Circular Buffer][circbuff] exercise (an advanced technique, but this is graded as a hard exercise). |
| 138 | + |
| 139 | +Attempting to read an element or field that was never initialized (see [`undef`][undef]) is an error, typically flagged with an [`UndefRefError`][UndefRefError]. |
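The following is a minimal sketch of the `undef` vector case, not taken from any particular community solution. It uses `isassigned()` from Base to check a slot before reading it; the variable names are illustrative only.

```julia
# Allocate room for 3 strings without initializing the elements.
buffer = Vector{String}(undef, 3)

before = isassigned(buffer, 1)   # false: nothing stored there yet
buffer[1] = "first"
after = isassigned(buffer, 1)    # true

# Reading a slot that was never assigned throws an UndefRefError.
err = try
    buffer[2]
catch e
    e
end
err isa UndefRefError            # true
```

This error only arises for element types stored as references, such as `String`. A `Vector{Int64}(undef, 3)` contains arbitrary leftover values instead, and reading them raises no error.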
| 140 | + |
| 141 | + |
| 142 | +[nothingness]: https://docs.julialang.org/en/v1/manual/faq/#Nothingness-and-missing-values |
| 143 | +[nothing]: https://docs.julialang.org/en/v1/base/base/#Core.Nothing |
| 144 | +[isnothing]: https://docs.julialang.org/en/v1/base/base/#Base.isnothing |
| 145 | +[missing]: https://docs.julialang.org/en/v1/base/base/#Base.missing |
| 146 | +[skipmissing]: https://docs.julialang.org/en/v1/base/base/#Base.skipmissing |
| 147 | +[collect]: https://docs.julialang.org/en/v1/base/collections/#Base.collect-Tuple{Any} |
| 148 | +[coalesce]: https://docs.julialang.org/en/v1/base/base/#Base.@coalesce |
| 149 | +[NaN]: https://en.wikipedia.org/wiki/NaN |
| 150 | +[filter]: https://docs.julialang.org/en/v1/base/collections/#Base.filter |
| 151 | +[struct]: https://docs.julialang.org/en/v1/base/base/#struct |
| 152 | +[undef]: https://docs.julialang.org/en/v1/base/arrays/#Core.undef |
| 153 | +[UndefRefError]: https://docs.julialang.org/en/v1/base/base/#Core.UndefRefError |
| 154 | +[circbuff]: https://exercism.org/tracks/julia/exercises/circular-buffer |