Commit 64a71b8

Tidied up the language on Fault Tolerance page (#7232)
* wip - Update fault-tolerance.md
* Update fault-tolerance.md
* Update fault-tolerance.md typo
* Update fault-tolerance.md trailing spaces

Co-authored-by: Aaron Stannard
1 parent 2dfbecf commit 64a71b8

1 file changed: +26 -51 lines changed

docs/articles/actors/fault-tolerance.md

Lines changed: 26 additions & 51 deletions
@@ -4,25 +4,16 @@ title: Fault tolerance
 ---
 # Fault Tolerance

-As explained in [Actor Systems](xref:actor-systems) each actor is the supervisor of its
-children, and as such each actor defines fault handling supervisor strategy.
-This strategy cannot be changed afterwards as it is an integral part of the
-actor system's structure.
+As explained in [Actor Systems](xref:actor-systems), each actor is the supervisor of its
+children, and as such each actor defines a fault handling supervisor strategy.
+This strategy cannot be changed after a child actor is created.

 ## Fault Handling in Practice

-First, let us look at a sample that illustrates one way to handle data store errors,
-which is a typical source of failure in real world applications. Of course it depends
-on the actual application what is possible to do when the data store is unavailable,
-but in this sample we use a best effort re-connect approach.
+Let's set up an example strategy which will handle data store errors in a child actor. In this sample we use a best effort re-connect approach.

 ## Creating a Supervisor Strategy

-The following sections explain the fault handling mechanism and alternatives
-in more depth.
-
-For the sake of demonstration let us consider the following strategy:
-
 ```csharp
 protected override SupervisorStrategy SupervisorStrategy()
 {
@@ -46,9 +37,7 @@ protected override SupervisorStrategy SupervisorStrategy()
 }
 ```

-I have chosen a few well-known exception types in order to demonstrate the application of the fault handling directives described in [Supervision and Monitoring](xref:supervision). First off, it is a one-for-one strategy, meaning that each child is treated separately (an all-for-one strategy works very similarly, the only difference is that any decision is applied to all children of the supervisor, not only the failing one). There are limits set on the restart frequency, namely maximum 10 restarts per minute; each of these settings could be left out, which means that the respective limit does not apply, leaving the possibility to specify an absolute upper limit on the restarts or to make the restarts work infinitely. The child actor is stopped if the limit is exceeded.
-
-This is the piece which maps child failure types to their corresponding directives.
+We will handle a few exception types to demonstrate some fault handling directives described in [Supervision and Monitoring](xref:supervision). This strategy is "one-for-one", meaning that each child is treated separately. The alternative is an "all-for-one" strategy, where a decision is applied to _all_ children of the supervisor, not only the failing one. We have chosen to set a limit of maximum 10 restarts per minute; The child actor is stopped if the limit is exceeded. We could have chosen to leave this argument out, which would have created a strategy where the child actor would restart indefinitely.

 > [!NOTE]
 > If the strategy is declared inside the supervising actor (as opposed to
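
For comparison with the one-for-one strategy described in the paragraph added above, here is a minimal sketch of an all-for-one variant. It assumes `AllForOneStrategy` accepts the same retry limit, time window, and decider arguments as `OneForOneStrategy`. The `AllForOneSupervisor` class and its `BuildStrategy` helper are hypothetical names used only for illustration, and the exception-to-directive mapping covers only the cases this page names (`NullReferenceException` restarts, `ArgumentException` stops, anything else escalates).

```csharp
using System;
using Akka.Actor;

// Hypothetical supervisor for illustration only; it is not part of the changed file.
public class AllForOneSupervisor : UntypedActor
{
    // Built in a static helper so the decider can be inspected without starting an actor.
    public static SupervisorStrategy BuildStrategy()
    {
        // At most 10 restarts per minute; each decision applies to ALL children.
        return new AllForOneStrategy(10, TimeSpan.FromMinutes(1), ex =>
        {
            switch (ex)
            {
                case NullReferenceException _:
                    return Directive.Restart;
                case ArgumentException _:
                    return Directive.Stop;
                default:
                    return Directive.Escalate;
            }
        });
    }

    protected override SupervisorStrategy SupervisorStrategy()
    {
        return BuildStrategy();
    }

    protected override void OnReceive(object message)
    {
        // Create a child from the received Props and reply with its reference.
        if (message is Props props)
        {
            Sender.Tell(Context.ActorOf(props));
        }
    }
}
```

Because every decision applies to all children, a `Stop` caused by one failing child would stop its siblings as well, so a strategy like this only fits children that cannot work independently of each other.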
@@ -65,10 +54,7 @@ exceptions are handled by default:
 * `ActorKilledException` will stop the failing child actor; and
 * Any other type of `Exception` will restart the failing child actor.

-If the exception escalate all the way up to the root guardian it will handle it
-in the same way as the default strategy defined above.
-
-You can combine your own strategy with the default strategy:
+You can combine your own strategy with the default strategy like this:

 ```csharp
 protected override SupervisorStrategy SupervisorStrategy()
@@ -90,7 +76,7 @@ protected override SupervisorStrategy SupervisorStrategy()

 ### Stopping Supervisor Strategy

-Closer to the Erlang way is the strategy to just stop children when they fail
+An alternative which is closer to the Erlang way is to stop children when they fail
 and then take corrective action in the supervisor when DeathWatch signals the
 loss of the child. This strategy is also provided pre-packaged as
 `SupervisorStrategy.StoppingStrategy` with an accompanying
@@ -99,17 +85,13 @@ loss of the child. This strategy is also provided pre-packaged as

 ### Logging of Actor Failures

-By default the `SupervisorStrategy` logs failures unless they are escalated.
-Escalated failures are supposed to be handled, and potentially logged, at a level
-higher in the hierarchy.
-
-You can mute the default logging of a `SupervisorStrategy` by setting
+The default strategy logs failures unless they are escalated. You can mute the default logging of a `SupervisorStrategy` by setting
 `loggingEnabled` to `false` when instantiating it. Customized logging
 can be done inside the `Decider`. Note that the reference to the currently
 failed child is available as the `Sender` when the `SupervisorStrategy` is
 declared inside the supervising actor.

-You may also customize the logging in your own ``SupervisorStrategy`` implementation
+You can also customize the logging in your own ``SupervisorStrategy`` implementation
 by overriding the `logFailure` method.

 ## Supervision of Top-Level Actors
@@ -121,8 +103,7 @@ strategy.

 ## Test Application

-The following section shows the effects of the different directives in practice,
-wherefor a test setup is needed. First off, we need a suitable supervisor:
+Consider this custom `SupervisorStrategy`:

 ```csharp
 public class Supervisor : UntypedActor
@@ -159,7 +140,7 @@ public class Supervisor : UntypedActor
 }
 ```

-This supervisor will be used to create a child, with which we can experiment:
+This supervisor will be used to create a child actor:

 ```csharp
 public class Child : UntypedActor
@@ -184,9 +165,9 @@ public class Child : UntypedActor
 }
 ```

-The test is easier by using the utilities described in [Akka-Testkit](xref:testing-actor-systems).
+We'll use the utilities in [Akka-Testkit](xref:testing-actor-systems) to help us describe and test the expected behavior.

-Let us create actors:
+First, we'll create actors:

 ```csharp
 var supervisor = system.ActorOf<Supervisor>("supervisor");
@@ -195,8 +176,7 @@ supervisor.Tell(Props.Create<Child>());
 var child = ExpectMsg<IActorRef>(); // retrieve answer from TestKit’s TestActor
 ```

-The first test shall demonstrate the `Resume` directive, so we try it out by
-setting some non-initial state in the actor and have it fail:
+Our first test will demonstrate `Directive.Resume`, so we set some non-initial state in the child actor and cause it to fail:

 ```csharp
 child.Tell(42); // set state to 42
@@ -208,30 +188,27 @@ child.Tell("get");
 ExpectMsg(42);
 ```

-As you can see the value 42 survives the fault handling directive because we're using the `Resume` directive, which does not cause the actor to restart. Now, if we
-change the failure to a more serious `NullReferenceException`, that will no
-longer be the case:
+As you can see the value 42 survives the fault handling directive because we're using the `Resume` directive, which does not cause the actor to restart.
+
+If we change the failure to a more serious `NullReferenceException`, which we defined above to result in a `Restart` directive, that will no longer be the case:

 ```csharp
 child.Tell(new NullReferenceException());
 child.Tell("get");
 ExpectMsg(0);
 ```

-This is because the actor has restarted and the original `Child` actor instance that was processing messages will be destroyed and replaced by a brand-new instance defined using the original `Props` passed to its parent.
+This is because the actor has restarted and the original `Child` actor instance that was processing messages will be destroyed and replaced by a brand-new instance defined using the same `Props`.

-And finally in case of the fatal `IllegalArgumentException` the child will be
-terminated by the supervisor:
+And finally in case of the fatal `ArgumentException`, our strategy will return a stop directive, and the child will be terminated by the supervisor:

 ```csharp
 Watch(child); // have testActor watch "child"
 child.Tell(new ArgumentException()); // break it
 ExpectMsg<Terminated>().ActorRef.Should().Be(child);
 ```

-Up to now the supervisor was completely unaffected by the child's failure,
-because the directives set did handle it. In case of an `Exception`, this is not
-true anymore and the supervisor escalates the failure.
+Up to now the supervisor was completely unaffected by the child's failure, because the directives in our strategy handled the exception. However, if we cause an `Exception`, none of our handlers are invoked and the supervisor escalates the failure.

 ```csharp
 supervisor.Tell(Props.Create<Child>()); // create new child
@@ -247,14 +224,12 @@ message.ExistenceConfirmed.Should().BeTrue();
 ```

 The supervisor itself is supervised by the top-level actor provided by the
-`ActorSystem`, which has the default policy to restart in case of all
-`Exception` cases (with the notable exceptions of
-`ActorInitializationException` and `ActorKilledException`). Since the
-default directive in case of a restart is to kill all children, we expected our poor
-child not to survive this failure.
+`ActorSystem`. This has the default policy to restart as a result of all
+`Exception`s except `ActorInitializationException` and `ActorKilledException`. Since the
+default directive in case of a restart is to kill all children, our poor
+child did not survive this failure.

-In case this is not desired (which depends on the use case), we need to use a
-different supervisor which overrides this behavior.
+If we don't want our children to be restarted we can override `PreRestart` in the Supervisor:

 ```csharp
 public class Supervisor2 : UntypedActor
@@ -296,7 +271,7 @@ public class Supervisor2 : UntypedActor
 ```

 With this parent, the child survives the escalated restart, as demonstrated in
-the last test:
+this last test:

 ```csharp
 var supervisor2 = system.ActorOf<Supervisor2>("supervisor2");
