失败更像是分布式系统的一个特性。因此Akka用一个容忍失败的模型,在你的业务逻辑与失败处理逻辑(supervision逻辑)中间你能有一个清晰的边界。只需要一点点工作,这很赞。这就是我们要讨论的主题。

ACTOR SUPERVISION

想象一个方法调用了你栈顶的方法但却出了一个异常。那么在栈下的方法能做什么呢?

  1. 抓住异常并按顺序处理恢复

  2. 抓住异常,也许记个日志并保持安静。

  3. 下层的方法也可以选择无视这个异常(或者抓住并重扔出来)

想象下一直扔到main方法仍然没有处理这个异常。这种情况下,程序肯定会输出一个异常给console后退出。

图片描述

你可以把同样的情况套用在线程上。如果一个子线程抛了异常而再假设run*call方法没有处理它,那么这个异常就会期望放在父线程或主线程中解决,无论哪种情况,如果主线程没有处理他,系统就会退出。

让我们再看看 - 如果被context.actorof创建出来的子Actor因为一个异常失败了。父actor(指supervisor)可以处理子actor的任何失败。如果父actor做了,他可以处理并恢复(Restart/Resume)。另外,把异常传递(Escalate)给父actor。 还有一种做法,可以直接stop掉子actor - 这就是那个子actor结局了。 为什么我说父actor(那个supervisor)?这是因为akka的监护方式为家长监护 - 这意味着只有创建了actor的人才能监护他们。

就这么多了!我们已经覆盖到所有监护指令(Directoives)了。

策略

我忘了说一点: 你已经知道一个Akka Actor可以创建子actor并且子actor也可以随意创建他们自己的子actor。

现在,想下以下两个场景:

1.OneForOneStrategy

你的actor创建了很多子actor并且每一个子actor都连接了不同的数据源。假设你运行的是一个将英语翻译成多种语言的应用。

图片描述

假设,一个子actor失败了然而你可以接受在最终结果里跳过这个结果,你想怎么做?关掉这个服务?当然不,你可能想要重启/关闭这个有问题的子actor。是吧?现在这个策略在Akka的监护策略中叫OneForOneStrategy策略 - 如果一个actor挂了,只单独处理这个actor。

基于你的业务异常,你可能需要对不同的异常有不同的反应(停止,重启,升级,恢复)。要配置你自己的策略,你只需要override你Actor类中的supervisorStrategy方法。

声明OneForOneStrategy的例子

import akka.actor.Actor  
import akka.actor.ActorLogging  
import akka.actor.OneForOneStrategy  
import akka.actor.SupervisorStrategy.Stop


class TeacherActorOneForOne extends Actor with ActorLogging {

    ...
    ...
  override val supervisorStrategy=OneForOneStrategy() {

    case _: MinorRecoverableException     => Restart
    case _: Exception                   => Stop

  }
  ...
  ...  

2.AllForOneStrategy策略

假设你在做一个外部排序 (这是个又证明了我没啥创造力的例子!),你的每个块都被一个不同的actor处理。突然,一个Actor失败了并抛了一个异常。这样再往下处理就没什么意思了因为最终结果肯定是错的。所以,逻辑就是停止stop所有的actor。
图片描述

我为什么说stop而不是重启?因为在这个例子里重启也没用,每个actor的mailbox在重启时并不会被清理。所以,如果我们重启了,另外的chunk仍然会被处理。这不是我们想要的。重建actor并用新的mailbox在这里是个更合适的策略。

OneForOneStrategy一样,只需要用AllForOneStrategy的实现覆写supervisorStrategy

下面是例子

import akka.actor.{Actor, ActorLogging}  
import akka.actor.AllForOneStrategy  
import akka.actor.SupervisorStrategy.Escalate  
import akka.actor.SupervisorStrategy.Stop


class TeacherActorAllForOne extends Actor with ActorLogging {

  ...

  override val supervisorStrategy = AllForOneStrategy() {

    case _: MajorUnRecoverableException => Stop
    case _: Exception => Escalate

  }
  ...
  ...

指令 DIRECTIVES

AllForOneStrategyOneForOneStrategy的构造方法都接受一个叫DeciderPartialFunction[Throwable,Directive]方法,他把ThrowableDirective指令做了一个映射:

case _: MajorUnRecoverableException => Stop  

这就简单的四个指令 - Stop,Resume,Escalate和Restart

Stop

在异常发生时子actor会停止,任何发给停止的actor的消息都会被转到deadLetter队列。

Resume

子actor会忽略抛出异常的消息并且继续处理队列中的其他消息。

Restart

子actor会停止并且一个新的actor会初始化。继续处理mailbox中其他的消息。世界对这个是无感知的因为同样的ActorRef指向了新的Actor。

Escalate

supervisor复制了失败并让他的supervisor处理这个异常。

缺省策略

如果我们的actor没指定任何策略但是创建了子actor。他们会怎样处理?Actor会有一个缺省的策略:

override val supervisorStrategy=OneForOneStrategy() {

    case _: ActorInitializationException=> Stop
    case _: ActorKilledException        => Stop
    case _: DeathPactException             => Stop
    case _: Exception                   => Restart

}

所以,缺省策略处理了四个case:

1. ACTORINITIALIZATIONEXCEPTION => STOP

当actor不能初始化,他会抛出一个ActorInitializationException。actor会被停止。让我们在preStart调用中模拟下这个:

package me.rerun.akkanotes.supervision

import akka.actor.{ActorSystem, Props}  
import me.rerun.akkanotes.protocols.TeacherProtocol.QuoteRequest  
import akka.actor.Actor  
import akka.actor.ActorLogging

object ActorInitializationExceptionApp extends App{

  val actorSystem=ActorSystem("ActorInitializationException")
  val actor=actorSystem.actorOf(Props[ActorInitializationExceptionActor], "initializationExceptionActor")
  actor!"someMessageThatWillGoToDeadLetter"
}

class ActorInitializationExceptionActor extends Actor with ActorLogging{  
  override def preStart={
    throw new Exception("Some random exception")
  }
  def receive={
    case _=>
  }
}
Ru

运行ActorInitializationExceptionApp会产生一个ActorInitializationException 异常然后所有的消息都会进deadLetterActor的消息队列:

Log

[ERROR] [11/10/2014 16:08:46.569] [ActorInitializationException-akka.actor.default-dispatcher-2] [akka://ActorInitializationException/user/initializationExceptionActor] Some random exception
akka.actor.ActorInitializationException: exception during creation  
    at akka.actor.ActorInitializationException$.apply(Actor.scala:164)
...
...
Caused by: java.lang.Exception: Some random exception  
    at me.rerun.akkanotes.supervision.ActorInitializationExceptionActor.preStart(ActorInitializationExceptionApp.scala:17)
...
...

[INFO] [11/10/2014 16:08:46.581] [ActorInitializationException-akka.actor.default-dispatcher-4] [akka://ActorInitializationException/user/initializationExceptionActor] Message [java.lang.String] from Actor[akka://ActorInitializationException/deadLetters] to Actor[akka://ActorInitializationException/user/initializationExceptionActor#-1290470495] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.

2. ACTORKILLEDEXCEPTION => STOP

当Actor被kill消息关闭后,他会抛出一个ActorKilledException。如果抛这个异常,缺省策略会让子actor停止。看起来停止一个被kill掉的actor没什么意义。但想想这个:

  1. ActorKilledException 会被传递给supervisor。 那么之前我们在DeathWatch里面提到的Actor里的生命周期watchdeathwatchers。 直到Actor被停掉前watcher不会知道任何事情。

  2. 给Actor发送kill只会让那个特定的监管actor知道。用stop处理会暂停那个actor的mailbox,暂停了子actor的mailbox,停止了子actor,发送了Terminated给所有子actor的watcher,发送给所有类一个Terminated,然后actor的watcher都会迅速失败最终让Actor自己停止、

package me.rerun.akkanotes.supervision

import akka.actor.{ActorSystem, Props}  
import me.rerun.akkanotes.protocols.TeacherProtocol.QuoteRequest  
import akka.actor.Actor  
import akka.actor.ActorLogging  
import akka.actor.Kill

object ActorKilledExceptionApp extends App{

  val actorSystem=ActorSystem("ActorKilledExceptionSystem")
  val actor=actorSystem.actorOf(Props[ActorKilledExceptionActor])
  actor!"something"
  actor!Kill
  actor!"something else that falls into dead letter queue"
}

class ActorKilledExceptionActor extends Actor with ActorLogging{  
  def receive={
    case message:String=> log.info (message)
  }
}

Log

日志说只要ActorKilledException 进来,supervisor就会停掉actor并且消息会进入deadLetter队列

INFO  m.r.a.s.ActorKilledExceptionActor - something

ERROR akka.actor.OneForOneStrategy - Kill  
akka.actor.ActorKilledException: Kill

INFO  akka.actor.RepointableActorRef - Message [java.lang.String] from Actor[akka://ActorKilledExceptionSystem/deadLetters] to Actor[akka://ActorKilledExceptionSystem/user/$a#-1569063462] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.

3. DEATHPACTEXCEPTION => STOP

在[DeathWatch]()文中,你可以看到当一个Actor观察一个子Actor时,他期望在他的receive中处理Terminated消息。如果没有呢?你会得到一个DeathPactException

图片描述

代码演示了supervisorwatch子actor但没有从子actor处理Terminated消息。

package me.rerun.akkanotes.supervision

import akka.actor.{ActorSystem, Props}  
import me.rerun.akkanotes.protocols.TeacherProtocol.QuoteRequest  
import akka.actor.Actor  
import akka.actor.ActorLogging  
import akka.actor.Kill  
import akka.actor.PoisonPill  
import akka.actor.Terminated

object DeathPactExceptionApp extends App{

  val actorSystem=ActorSystem("DeathPactExceptionSystem")
  val actor=actorSystem.actorOf(Props[DeathPactExceptionParentActor])
  actor!"create_child" //Throws DeathPactException
  Thread.sleep(2000) //Wait until Stopped
  actor!"someMessage" //Message goes to DeadLetters

}

class DeathPactExceptionParentActor extends Actor with ActorLogging{

  def receive={
    case "create_child"=> {
      log.info ("creating child")
      val child=context.actorOf(Props[DeathPactExceptionChildActor])
      context.watch(child) //Watches but doesnt handle terminated message. Throwing DeathPactException here.
      child!"stop"
    }
    case "someMessage" => log.info ("some message")
    //Doesnt handle terminated message
    //case Terminated(_) =>
  }
}

class DeathPactExceptionChildActor extends Actor with ActorLogging{  
  def receive={
    case "stop"=> {
      log.info ("Actor going to stop and announce that it's terminated")
      self!PoisonPill
    }
  }
}

Log

日志告诉我们DeathPactException 进来了,supervisor停止了actor然后消息都进入了deadLetter的队列

INFO  m.r.a.s.DeathPactExceptionParentActor - creating child

INFO  m.r.a.s.DeathPactExceptionChildActor - Actor going to stop and announce that it's terminated

ERROR akka.actor.OneForOneStrategy - Monitored actor [Actor[akka://DeathPactExceptionSystem/user/$a/$a#-695506341]] terminated  
akka.actor.DeathPactException: Monitored actor [Actor[akka://DeathPactExceptionSystem/user/$a/$a#-695506341]] terminated

INFO  akka.actor.RepointableActorRef - Message [java.lang.String] from Actor[akka://DeathPactExceptionSystem/deadLetters] to Actor[akka://DeathPactExceptionSystem/user/$a#-1452955980] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.

4. EXCEPTION => RESTART

对于其他的异常,缺省的指令重启Actor。看下这个应用。只是要证明下Actor重启了,OtherExceptionParentActor让child抛出一个异常并立刻发送一条消息。消息在子actor重启的时候到达了mailbox,并被处理了。真不错!
图片描述

package me.rerun.akkanotes.supervision

import akka.actor.Actor  
import akka.actor.ActorLogging  
import akka.actor.ActorSystem  
import akka.actor.OneForOneStrategy  
import akka.actor.Props  
import akka.actor.SupervisorStrategy.Stop

object OtherExceptionApp extends App{

  val actorSystem=ActorSystem("OtherExceptionSystem")
  val actor=actorSystem.actorOf(Props[OtherExceptionParentActor])
  actor!"create_child"

}

class OtherExceptionParentActor extends Actor with ActorLogging{

  def receive={
    case "create_child"=> {
      log.info ("creating child")
      val child=context.actorOf(Props[OtherExceptionChildActor])

      child!"throwSomeException"
      child!"someMessage"
    }
  }
}

class OtherExceptionChildActor extends akka.actor.Actor with ActorLogging{

  override def preStart={
    log.info ("Starting Child Actor")
  }

  def receive={
    case "throwSomeException"=> {
      throw new Exception ("I'm getting thrown for no reason") 
    }
    case "someMessage" => log.info ("Restarted and printing some Message")
  }

  override def postStop={
    log.info ("Stopping Child Actor")
  }

}

Log

1.异常抛出了,我们能在trace中看到

  1. 子类重启了 - stop和start被调用了(我们稍后能看到preRestart和postRestart

  2. 消息在重启开始前被发送给子actor。

INFO  m.r.a.s.OtherExceptionParentActor - creating child

INFO  m.r.a.s.OtherExceptionChildActor - Starting Child Actor

ERROR akka.actor.OneForOneStrategy - I'm getting thrown for no reason

java.lang.Exception: I'm getting thrown for no reason  
    at me.rerun.akkanotes.supervision.OtherExceptionChildActor$$anonfun$receive$2.applyOrElse(OtherExceptionApp.scala:39) ~[classes/:na]
    at akka.actor.Actor$class.aroundReceive(Actor.scala:465) ~[akka-actor_2.11-2.3.4.jar:na]
...
...    

INFO  m.r.a.s.OtherExceptionChildActor - Stopping Child Actor

INFO  m.r.a.s.OtherExceptionChildActor - Starting Child Actor

INFO  m.r.a.s.OtherExceptionChildActor - Restarted and printing some Message

ESCALATE AND RESUME

我们在defaultStrategy中看到stop和restart的例子。现在让我们快速看下Escalate

Resume忽略了异常并处理mailbox中的下条消息。这就像是抓住了异常但什么事也没做。

Escalating更像是异常是致命的而supervisor不能处理它。所以,他要向他的supervisor求救。让我们看个例子。

假设有三个Actor - EscalateExceptionTopLevelActor, EscalateExceptionParentActor 和 EscalateExceptionChildActor。 如果一个子actor抛出一个日常并且父级别actor不能处理它,他可以Escalate这个异常到顶级actor。顶级actor也可以选择对哪些指令做出响应。在我们的例子里,我们只是做了stopstop会立即停掉child(这里是EscalateExceptionParentActor)。我们知道,当一个actor执行stop时,他的所有子类都会在actor自己停掉前先停止。
图片描述

package me.rerun.akkanotes.supervision

import akka.actor.Actor  
import akka.actor.ActorLogging  
import akka.actor.ActorSystem  
import akka.actor.OneForOneStrategy  
import akka.actor.Props  
import akka.actor.SupervisorStrategy.Escalate  
import akka.actor.SupervisorStrategy.Stop  
import akka.actor.actorRef2Scala

object EscalateExceptionApp extends App {

  val actorSystem = ActorSystem("EscalateExceptionSystem")
  val actor = actorSystem.actorOf(Props[EscalateExceptionTopLevelActor], "topLevelActor")
  actor ! "create_parent"
}

class EscalateExceptionTopLevelActor extends Actor with ActorLogging {

  override val supervisorStrategy = OneForOneStrategy() {
    case _: Exception => {
      log.info("The exception from the Child is now handled by the Top level Actor. Stopping Parent Actor and its children.")
      Stop //Stop will stop the Actor that threw this Exception and all its children
    }
  }

  def receive = {
    case "create_parent" => {
      log.info("creating parent")
      val parent = context.actorOf(Props[EscalateExceptionParentActor], "parentActor")
      parent ! "create_child" //Sending message to next level
    }
  }
}

class EscalateExceptionParentActor extends Actor with ActorLogging {

  override def preStart={
    log.info ("Parent Actor started")
  }

  override val supervisorStrategy = OneForOneStrategy() {
    case _: Exception => {
      log.info("The exception is ducked by the Parent Actor. Escalating to TopLevel Actor")
      Escalate
    }
  }

  def receive = {
    case "create_child" => {
      log.info("creating child")
      val child = context.actorOf(Props[EscalateExceptionChildActor], "childActor")
      child ! "throwSomeException"
    }
  }

  override def postStop = {
    log.info("Stopping parent Actor")
  }
}

class EscalateExceptionChildActor extends akka.actor.Actor with ActorLogging {

  override def preStart={
    log.info ("Child Actor started")
  }

  def receive = {
    case "throwSomeException" => {
      throw new Exception("I'm getting thrown for no reason.")
    }
  }
  override def postStop = {
    log.info("Stopping child Actor")
  }
}

Log

可以在log中看到,

  1. 子actor抛了异常。

  2. supervisor(EscalateExceptionParentActor)升级了(escalate)异常抛给了他的supervisor(EscalateExceptionTopLevelActor

  3. EscalateExceptionTopLevelActor 的指令是关闭actor。在顺序上,子actor先停止。

  4. 父actor之后再停止(在watcher被通知后)

INFO  m.r.a.s.EscalateExceptionTopLevelActor - creating parent

INFO  m.r.a.s.EscalateExceptionParentActor - Parent Actor started

INFO  m.r.a.s.EscalateExceptionParentActor - creating child

INFO  m.r.a.s.EscalateExceptionChildActor - Child Actor started

INFO  m.r.a.s.EscalateExceptionParentActor - The exception is ducked by the Parent Actor. Escalating to TopLevel Actor

INFO  m.r.a.s.EscalateExceptionTopLevelActor - The exception from the Child is now handled by the Top level Actor. Stopping Parent Actor and its children.

ERROR akka.actor.OneForOneStrategy - I'm getting thrown for no reason.  
java.lang.Exception: I'm getting thrown for no reason.  
    at me.rerun.akkanotes.supervision.EscalateExceptionChildActor$$anonfun$receive$3.applyOrElse(EscalateExceptionApp.scala:71) ~[classes/:na]
    ...
    ...

INFO  m.r.a.s.EscalateExceptionChildActor - Stopping child Actor

INFO  m.r.a.s.EscalateExceptionParentActor - Stopping parent Actor

请记住无论哪个指令发出只会使用在被escalated的子类上。 例如,一个restart指令从顶层发出,只有父类会被重启并且在构造函数中/preStart中的都会被执行。如果一个父actor的子类在构造函数中呗创建,他们就会被创建。然而,在消息中创建的child的父actor仍然会在Terminated状态。

TRIVIA

实际上,你可以控制是否preStart被调用。我们可以在下节看到。如果你好奇,可以看下Actor中的*postRestart方法

def postRestart(reason: Throwable): Unit = {  
  preStart()
}

代码

跟往常一样,代码在github


文章来自微信平台「麦芽面包」(微信扫描二维码关注)。未经允许,禁止转载。

图片描述


祝坤荣
1k 声望1.5k 粉丝

科幻影迷,书虫,硬核玩家,译者


引用和评论

0 条评论