How Android ANR Works

戈壁老王

Introduction to ANR

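ANR stands for Application Not Responding. While Android is running, AMS (ActivityManagerService) and WMS (WindowManagerService) monitor how long applications take to respond. If an application's main thread (the UI thread) fails to finish handling an input event, or fails to complete certain operations, within the timeout, an ANR is reported.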

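ANRs are triggered in the following categories:

  • InputDispatching Timeout: an input event (key or touch) gets no response within 5 seconds. An ANR dialog pops up and lets the user choose between waiting for the app to respond and closing it (that is, killing the app's process). Input-timeout ANRs fall into two subtypes:

    • Message handling timeout: as the name suggests, the ANR occurs because handling a message took too long. The log shows “Input dispatching timed out (Waiting because the focused window has not finished processing the input events that were previously delivered to it.)”
    • Unable to gain focus: usually a new window is slow to be created, or an old window is slow to exit, so no window can gain focus and an ANR occurs. A typical log line is “Reason: Waiting because no window has focus but there is a focused application that may eventually add a window when it finishes starting up.”
  • Broadcast Timeout: a BroadcastReceiver fails to finish within the allotted time (10 seconds for foreground broadcasts, 60 seconds for background broadcasts), and a broadcast timeout is reported. No dialog is shown for this type. It occurs mostly in the statusbar and settings apps.
  • Service Timeout: a Service fails to finish within the allotted time (20 seconds for foreground services, 200 seconds for background services), and a service timeout is reported. No dialog is shown for this type either. It occasionally shows up in Bluetooth and Wi-Fi, but is rarely seen.
  • ContentProvider Timeout: a ContentProvider's publish does not complete within 10 seconds, and this type of ANR is reported. It occurs mostly in android.process.media.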

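An ANR is produced only when the following conditions are met:

  1. Only a response timeout on the application process's main thread (UI thread) produces an ANR.
  2. An ANR is triggered only when the timeout is reached. The timeout differs depending on the context in which the ANR arises.
  3. Only input events or specific operations can trigger an ANR. Input events are device input such as keys and touches; specific operations are the lifecycle functions of BroadcastReceiver and Service. The cause of an ANR also differs with its context.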

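The main way to prevent ANRs is to avoid time-consuming work on the main thread and move it to a worker thread instead. Time-consuming operations include:

  • Database operations. Prefer asynchronous handling for database work.
  • Initializing too much data or too many views
  • Frequently creating threads or other large objects
  • Loading oversized data or images
  • Sorting or looping over large data sets
  • Too many broadcasts, or misuse of broadcasts
  • Passing and sharing large objects
  • Network operations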
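
As a rough illustration of moving slow work off the main thread, here is a minimal plain-Java sketch. It is not Android code: the single-thread executor named main-sim only stands in for the UI thread, and OffloadSketch, slowQuery() and loadOffMainThread() are hypothetical names invented for this example. In a real app the result would typically be posted back through a Handler on the main Looper or a similar mechanism.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical example class; "main-sim" plays the role of the UI thread.
class OffloadSketch {
    // Simulated slow operation, e.g. a database query or a network call.
    static int slowQuery() {
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return 42;
    }

    // Runs slowQuery() on a worker thread and delivers the result on the
    // simulated main thread. The returned string records which thread did what.
    static String loadOffMainThread() {
        ExecutorService mainThread =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "main-sim"));
        ExecutorService worker =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "worker"));
        try {
            CompletableFuture<String> done = CompletableFuture
                    // The blocking part never touches the simulated main thread.
                    .supplyAsync(() -> Thread.currentThread().getName() + ":" + slowQuery(), worker)
                    // Only the cheap delivery step runs on the simulated main thread.
                    .thenApplyAsync(s -> s + " delivered-on:" + Thread.currentThread().getName(),
                            mainThread);
            return done.join();
        } finally {
            worker.shutdown();
            mainThread.shutdown();
        }
    }
}
```

The point of the split is that the simulated main thread stays free while slowQuery() sleeps; it is only occupied for the short delivery step.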

InputDispatching Timeout

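In the Android input system, InputDispatcher is responsible for dispatching input events to the UI main thread. After receiving an input event, the UI main thread uses InputConsumer to handle it. Once the event has passed through a series of InputStages, finishInputEvent() is called to tell InputDispatcher that the event has been handled. InputDispatcher handles the message returned by the UI main thread in handleReceiveCallback() and finally removes the dispatchEntry from the wait queue.

The InputDispatching Timeout ANR arises during input event dispatch. While dispatching, InputDispatcher checks the state of the previous input event; if that event has not finished dispatching within the time limit, an ANR is triggered. The default limit for InputDispatching Timeout is 5 s, and it is defined in two places.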

frameworks/native/services/inputflinger/InputDispatcher.cpp
constexpr nsecs_t DEFAULT_INPUT_DISPATCHING_TIMEOUT = 5000 * 1000000LL; // 5 sec

frameworks/base/services/core/java/com/android/server/wm/WindowManagerService.java
static final long DEFAULT_INPUT_DISPATCHING_TIMEOUT_NANOS = 5000 * 1000000L;
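
A quick sanity check of the arithmetic behind both constants, as a plain-Java sketch (TimeoutUnits and its members are illustrative names, not framework code). It also suggests why the long suffix matters in the Java constant: without it the multiplication is done in 32-bit int arithmetic and wraps around.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper class used only to check the unit conversion.
class TimeoutUnits {
    // Same expression as the WindowManagerService constant: 5000 ms in nanoseconds.
    static final long TIMEOUT_NANOS = 5000 * 1000000L;

    // The same product without the L suffix is evaluated as int and overflows.
    static final long WRAPPED = 5000 * 1000000;

    // Converts the timeout back to whole seconds.
    static long timeoutSeconds() {
        return TimeUnit.NANOSECONDS.toSeconds(TIMEOUT_NANOS);
    }
}
```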

To dispatch events, the threadLoop of InputDispatcherThread reads input events in a loop and calls dispatchOnce() to dispatch them; the actual work is done in dispatchOnceInnerLocked().

frameworks/native/services/inputflinger/InputDispatcher.cpp

void InputDispatcher::dispatchOnceInnerLocked(nsecs_t* nextWakeupTime) {
    nsecs_t currentTime = now(); // record the dispatch time of the event
    ......
    // When no event is being dispatched, fetch one
    if (! mPendingEvent) {
        if (mInboundQueue.isEmpty()) {
            // The inbound queue is empty: handle the app switch key and synthesize a key repeat if the state calls for it
            ......
        } else {
            // Take one event from the inbound queue
            mPendingEvent = mInboundQueue.dequeueAtHead();
            traceInboundQueueLengthLocked();
        }
          ......
        // Reset the ANR timeout state
        resetANRTimeoutsLocked();
    }
    ......
    // Dispatch according to the event type
    switch (mPendingEvent->type) {
    case EventEntry::TYPE_CONFIGURATION_CHANGED: {
        // configuration change
        ......
    case EventEntry::TYPE_DEVICE_RESET: {
        // device reset
        ......
    case EventEntry::TYPE_KEY: {
        // key input
        ......
        done = dispatchKeyLocked(currentTime, typedEntry, &dropReason, nextWakeupTime);
        break;
    }

    case EventEntry::TYPE_MOTION: {
        // touchscreen input
        ......
        done = dispatchMotionLocked(currentTime, typedEntry,
                &dropReason, nextWakeupTime);
        break;
    }
    ......
    if (done) {
        // Decide from dropReason whether to drop the event
        if (dropReason != DROP_REASON_NOT_DROPPED) {
            dropInboundEventLocked(mPendingEvent, dropReason);
        }
        mLastDropReason = dropReason;
        
        releasePendingEventLocked(); // release the event being processed; this also resets the ANR timeout
        *nextWakeupTime = LONG_LONG_MIN;  // force next poll to wake up immediately
    }
}

Next, look at how a specific input event is dispatched. Taking key input as the example, dispatchKeyLocked() performs the dispatch.

frameworks/native/services/inputflinger/InputDispatcher.cpp

bool InputDispatcher::dispatchKeyLocked(nsecs_t currentTime, KeyEntry* entry,
        DropReason* dropReason, nsecs_t* nextWakeupTime) {
    // New events need preprocessing
    if (! entry->dispatchInProgress) {
        ......
    }

    // Handle try-again events
    if (entry->interceptKeyResult == KeyEntry::INTERCEPT_KEY_RESULT_TRY_AGAIN_LATER) {
        ......
    }

    // If the flag is POLICY_FLAG_PASS_TO_USER, register the key interception
    if (entry->interceptKeyResult == KeyEntry::INTERCEPT_KEY_RESULT_UNKNOWN) {
        ......
    }
    ......
    // Find the currently focused window; an ANR may be triggered here
    Vector<InputTarget> inputTargets;
    int32_t injectionResult = findFocusedWindowTargetsLocked(currentTime,
            entry, inputTargets, nextWakeupTime);
    if (injectionResult == INPUT_EVENT_INJECTION_PENDING) {
        return false;
    }
    ......
    // Dispatch the key
    dispatchEventLocked(currentTime, entry, inputTargets);
    return true;
}

Our main interest is how the ANR is triggered, so look at the implementation of findFocusedWindowTargetsLocked().

frameworks/native/services/inputflinger/InputDispatcher.cpp

int32_t InputDispatcher::findFocusedWindowTargetsLocked(nsecs_t currentTime,
        const EventEntry* entry, Vector<InputTarget>& inputTargets, nsecs_t* nextWakeupTime) {
    int32_t injectionResult;
    std::string reason;

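    // Drop events when there is neither a focused window nor a focused application
    if (mFocusedWindowHandle == NULL) {
        if (mFocusedApplicationHandle != NULL) {
            // There is a focused application but no focused window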
            injectionResult = handleTargetsNotReadyLocked(currentTime, entry,
                    mFocusedApplicationHandle, NULL, nextWakeupTime,
                    "Waiting because no window has focus but there is a "
                    "focused application that may eventually add a window "
                    "when it finishes starting up.");
            goto Unresponsive;
        }

        ALOGI("Dropping event because there is no focused window or focused application.");
        injectionResult = INPUT_EVENT_INJECTION_FAILED;
        goto Failed;
    }

    // Permission check
    if (! checkInjectionPermission(mFocusedWindowHandle, entry->injectionState)) {
        injectionResult = INPUT_EVENT_INJECTION_PERMISSION_DENIED;
        goto Failed;
    }

    // Check whether the window is ready
    reason = checkWindowReadyForMoreInputLocked(currentTime,
            mFocusedWindowHandle, entry, "focused");
    if (!reason.empty()) {
        injectionResult = handleTargetsNotReadyLocked(currentTime, entry,
                mFocusedApplicationHandle, mFocusedWindowHandle, nextWakeupTime, reason.c_str());
        goto Unresponsive;
    }

    // The window is ready
    injectionResult = INPUT_EVENT_INJECTION_SUCCEEDED;
    addWindowTargetLocked(mFocusedWindowHandle,
            InputTarget::FLAG_FOREGROUND | InputTarget::FLAG_DISPATCH_AS_IS, BitSet32(0),
            inputTargets);
    ......
    return injectionResult;
}

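The code above uses checkWindowReadyForMoreInputLocked() to check whether the window is ready. It returns the state of the window connection as a string, which is empty when the window is healthy. handleTargetsNotReadyLocked() then handles the case where the window is not ready.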

frameworks/native/services/inputflinger/InputDispatcher.cpp

int32_t InputDispatcher::handleTargetsNotReadyLocked(nsecs_t currentTime,
        const EventEntry* entry,
        const sp<InputApplicationHandle>& applicationHandle,
        const sp<InputWindowHandle>& windowHandle,
        nsecs_t* nextWakeupTime, const char* reason) {
    if (applicationHandle == NULL && windowHandle == NULL) {
        if (mInputTargetWaitCause != INPUT_TARGET_WAIT_CAUSE_SYSTEM_NOT_READY) {
            // Waiting for the system to become ready; entered once when there is neither a window nor an application
            ......
        }
    } else {
        if (mInputTargetWaitCause != INPUT_TARGET_WAIT_CAUSE_APPLICATION_NOT_READY) {
            // Waiting for the application to become ready; entered once per window
            ......
            // Set the timeout; the default is 5 s in every case
            nsecs_t timeout;
            ......
            mInputTargetWaitCause = INPUT_TARGET_WAIT_CAUSE_APPLICATION_NOT_READY;
            mInputTargetWaitStartTime = currentTime; // time the current event was first dispatched
            mInputTargetWaitTimeoutTime = currentTime + timeout; // deadline
            mInputTargetWaitTimeoutExpired = false;
            mInputTargetWaitApplicationHandle.clear(); // clear the application being waited on

            // Set the application being waited on
            if (windowHandle != NULL) {
                mInputTargetWaitApplicationHandle = windowHandle->inputApplicationHandle;
            }
            if (mInputTargetWaitApplicationHandle == NULL && applicationHandle != NULL) {
                mInputTargetWaitApplicationHandle = applicationHandle;
            }
        }
    }

    if (mInputTargetWaitTimeoutExpired) {
        return INPUT_EVENT_INJECTION_TIMED_OUT;
    }

    if (currentTime >= mInputTargetWaitTimeoutTime) {
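        // Event dispatch timed out; trigger the ANR
        onANRLocked(currentTime, applicationHandle, windowHandle,
                entry->eventTime, mInputTargetWaitStartTime, reason);

        // After the ANR is triggered, wake up the next poll immediately
        *nextWakeupTime = LONG_LONG_MIN;
        return INPUT_EVENT_INJECTION_PENDING;
    } else {
        // Force the poll loop to wake up when the deadline is reached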
        if (mInputTargetWaitTimeoutTime < *nextWakeupTime) {
            *nextWakeupTime = mInputTargetWaitTimeoutTime;
        }
        return INPUT_EVENT_INJECTION_PENDING;
    }
}

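The code above shows how an ANR is triggered during normal input event dispatch: the test is whether dispatch can complete within a limited time. During normal dispatch the ANR timeout is reset in two places:

  • When a new event is fetched for dispatch.
  • When an event finishes dispatching.

void InputDispatcher::resetANRTimeoutsLocked() {
    // Resetting the ANR timeout resets the wait state and clears the application being waited on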
    mInputTargetWaitCause = INPUT_TARGET_WAIT_CAUSE_NONE;
    mInputTargetWaitApplicationHandle.clear();
}
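
The wait-and-reset logic can be modelled in a few lines. The following is a simplified plain-Java sketch under stated assumptions, not the framework implementation: InputWaitSketch, Result and the method names are invented for illustration, timestamps are passed in by the caller, and the mInputTargetWaitTimeoutExpired path is left out. Unlike the real function, which returns INPUT_EVENT_INJECTION_PENDING in both branches and calls onANRLocked(), the sketch returns a distinct ANR value so the outcome is easy to observe.

```java
// Hypothetical model of the "target not ready" wait state in InputDispatcher.
class InputWaitSketch {
    static final long TIMEOUT_NANOS = 5000 * 1000000L; // 5 s, as in the constants above

    enum Result { PENDING, ANR }

    private boolean waiting = false;   // stands in for mInputTargetWaitCause != NONE
    private long waitTimeoutTime;      // stands in for mInputTargetWaitTimeoutTime

    // Called each time dispatch finds that the target window is not ready.
    Result onTargetNotReady(long currentTimeNanos) {
        if (!waiting) {
            // First failure for this wait: start the clock exactly once.
            waiting = true;
            waitTimeoutTime = currentTimeNanos + TIMEOUT_NANOS;
        }
        return currentTimeNanos >= waitTimeoutTime ? Result.ANR : Result.PENDING;
    }

    // Plays the role of resetANRTimeoutsLocked(): forget the current wait.
    void reset() {
        waiting = false;
    }
}
```

Because the deadline is fixed on the first failed attempt, retrying the same event does not extend it; only a reset starts a fresh 5 s window.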

At runtime, resetANRTimeoutsLocked() may be executed in four main scenarios:

  • Unfreezing the screen, and system boot/shutdown (thawInputDispatchingLw, setEventDispatchingLw)
  • A change of the app focused by WMS (WMS.setFocusedApp, WMS.removeAppToken)
  • Setting the input filter (IMS.setInputFilter)
  • Dispatching an event again (dispatchOnceInnerLocked)

After an ANR is triggered, onANRLocked() is called to capture the related information. The rough call flow is:

InputDispatcher::onANRLocked --> InputDispatcher::doNotifyANRLockedInterruptible
    --> InputManagerService.notifyANR --> InputMonitor.notifyANR
    --> ActivityManagerService.inputDispatchingTimedOut --> AppErrors.appNotResponding

Finally, AppErrors.appNotResponding() prints the log, dumps the stacks, and prints CPU information.

Broadcast Timeout

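When Android's broadcast mechanism delivers a broadcast, a slow receiver can delay delivery to the receivers after it. Android therefore puts a time limit on receiver handling, and exceeding it triggers an ANR. Note that broadcast timeouts occur only for ordered (serial) broadcasts. Parallel broadcasts have no delivery dependency between receivers, so they never time out. There are two limits for different broadcasts: 10 s for foreground broadcasts and 60 s for background broadcasts.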

frameworks/base/services/core/java/com/android/server/am/ActivityManagerService.java
    
    // How long we allow a receiver to run before giving up on it.
    static final int BROADCAST_FG_TIMEOUT = 10*1000;
    static final int BROADCAST_BG_TIMEOUT = 60*1000;
    ......
    public ActivityManagerService(Context systemContext) {
        ......
        mHandlerThread = new ServiceThread(TAG,
                THREAD_PRIORITY_FOREGROUND, false /*allowIo*/);
        mHandlerThread.start();
        mHandler = new MainHandler(mHandlerThread.getLooper());
        ......
        // Create the broadcast queues, foreground and background
        mFgBroadcastQueue = new BroadcastQueue(this, mHandler,
                "foreground", BROADCAST_FG_TIMEOUT, false);
        mBgBroadcastQueue = new BroadcastQueue(this, mHandler,
                "background", BROADCAST_BG_TIMEOUT, true);
        mBroadcastQueues[0] = mFgBroadcastQueue;
        mBroadcastQueues[1] = mBgBroadcastQueue;
        ......
    }

The timeout is set when a broadcast queue is created. Next, look directly at how a broadcast is processed.

frameworks/base/services/core/java/com/android/server/am/BroadcastQueue.java 

    final void processNextBroadcastLocked(boolean fromMsg, boolean skipOomAdj) {
        BroadcastRecord r;
        ......
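        // Handle parallel broadcasts
        while (mParallelBroadcasts.size() > 0) {
            ......
        }

        // A broadcast is pending; check whether its process is still alive
        if (mPendingBroadcast != null) {
            ......
        }

        boolean looped = false;
        // Handle ordered (serial) broadcasts
        do {
            ......
            r = mOrderedBroadcasts.get(0);
            boolean forceReceive = false;

            // If handling of the broadcast has timed out, forcibly finish it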
            int numReceivers = (r.receivers != null) ? r.receivers.size() : 0;
            if (mService.mProcessesReady && r.dispatchTime > 0) {
                long now = SystemClock.uptimeMillis();
                if ((numReceivers > 0) &&
                        (now > r.dispatchTime + (2*mTimeoutPeriod*numReceivers))) {
                    broadcastTimeoutLocked(false); // forcibly finish this broadcast
                    forceReceive = true;
                    r.state = BroadcastRecord.IDLE;
                }
            }
            ......
            if (r.receivers == null || r.nextReceiver >= numReceivers
                    || r.resultAbort || forceReceive) {
                if (r.resultTo != null) {
                    try {
                        // The broadcast has been handled; send the final result
                        performReceiveLocked(r.callerApp, r.resultTo,
                            new Intent(r.intent), r.resultCode,
                            r.resultData, r.resultExtras, false, false, r.userId);
                        r.resultTo = null;
                    ......
                }

                // Cancel the BROADCAST_TIMEOUT_MSG message
                cancelBroadcastTimeoutLocked();
                ....
            }
        } while (r == null);
            
        // Get the next receiver
        int recIdx = r.nextReceiver++;

        r.receiverTime = SystemClock.uptimeMillis();
        if (recIdx == 0) {
            // Start tracking when the first receiver starts
            r.dispatchTime = r.receiverTime;
            r.dispatchClockTime = System.currentTimeMillis();
            ......
        }
        if (! mPendingBroadcastTimeoutMessage) {
            // Set the broadcast deadline and send BROADCAST_TIMEOUT_MSG
            long timeoutTime = r.receiverTime + mTimeoutPeriod;
            setBroadcastTimeoutLocked(timeoutTime);
        }
            
        final BroadcastOptions brOptions = r.options;
        final Object nextReceiver = r.receivers.get(recIdx);

        // Handle dynamically registered receivers
        if (nextReceiver instanceof BroadcastFilter) {
            ......
        }

        // Handle statically registered receivers
        ResolveInfo info =
            (ResolveInfo)nextReceiver;
        ComponentName component = new ComponentName(
                info.activityInfo.applicationInfo.packageName,
                info.activityInfo.name);
        ....
        // Get the process for the receiver
        String targetProcess = info.activityInfo.processName;
        ProcessRecord app = mService.getProcessRecordLocked(targetProcess,
                info.activityInfo.applicationInfo.uid, false);
        ......
        // If the process exists, handle the broadcast directly
        if (app != null && app.thread != null && !app.killed) {
            try {
                app.addPackage(info.activityInfo.packageName,
                        info.activityInfo.applicationInfo.versionCode, mService.mProcessStats);
                processCurBroadcastLocked(r, app, skipOomAdj);
                return;
            } catch (RemoteException e) {
            } catch (RuntimeException e) {
                logBroadcastReceiverDiscardLocked(r);
                finishReceiverLocked(r, r.resultCode, r.resultData,
                        r.resultExtras, r.resultAbort, false);
                scheduleBroadcastsLocked();
                r.state = BroadcastRecord.IDLE;
                return;
            }
        }
            
        // If the process does not exist, create it
        if ((r.curApp=mService.startProcessLocked(targetProcess,
                info.activityInfo.applicationInfo, true,
                r.intent.getFlags() | Intent.FLAG_FROM_BACKGROUND,
                "broadcast", r.curComponent,
                (r.intent.getFlags()&Intent.FLAG_RECEIVER_BOOT_UPGRADE) != 0, false, false))
                        == null) {
            logBroadcastReceiverDiscardLocked(r);
            finishReceiverLocked(r, r.resultCode, r.resultData,
                    r.resultExtras, r.resultAbort, false);
            scheduleBroadcastsLocked();
            r.state = BroadcastRecord.IDLE;
            return;
        }

        mPendingBroadcast = r;
        mPendingBroadcastRecvIndex = recIdx;
    }

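The code above is a simplified view of broadcast processing. Two places relate to triggering the ANR:

  • Setting the broadcast timeout message: setBroadcastTimeoutLocked()
  • Cancelling the broadcast timeout message: cancelBroadcastTimeoutLocked()

When processing of a broadcast begins, a delayed message is scheduled according to the timeout. If that message is not cancelled within the time limit, an ANR is triggered.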

frameworks/base/services/core/java/com/android/server/am/BroadcastQueue.java

    final void setBroadcastTimeoutLocked(long timeoutTime) {
        if (! mPendingBroadcastTimeoutMessage) {
            // Send the delayed message BROADCAST_TIMEOUT_MSG
            Message msg = mHandler.obtainMessage(BROADCAST_TIMEOUT_MSG, this);
            mHandler.sendMessageAtTime(msg, timeoutTime);
            mPendingBroadcastTimeoutMessage = true;
        }
    }

frameworks/base/services/core/java/com/android/server/am/BroadcastQueue.java

    private final class BroadcastHandler extends Handler {
        public BroadcastHandler(Looper looper) {
            super(looper, null, true);
        }

        @Override
        public void handleMessage(Message msg) {
            switch (msg.what) {
                ....
                case BROADCAST_TIMEOUT_MSG: {
                    synchronized (mService) {
                        broadcastTimeoutLocked(true);
                    }
                } break;
            }
        }
    }
    ......
    final void broadcastTimeoutLocked(boolean fromMsg) {
        ......
        long now = SystemClock.uptimeMillis();
        BroadcastRecord r = mOrderedBroadcasts.get(0);
        if (fromMsg) {
            ......
            long timeoutTime = r.receiverTime + mTimeoutPeriod;
            if (timeoutTime > now) {
                // The deadline has not been reached yet; re-arm the timeout message
                setBroadcastTimeoutLocked(timeoutTime);
                return;
            }
        }

        BroadcastRecord br = mOrderedBroadcasts.get(0);
        if (br.state == BroadcastRecord.WAITING_SERVICES) {
            // The broadcast has been handled and is waiting for services to execute. Continue with the next broadcast
            br.curComponent = null;
            br.state = BroadcastRecord.IDLE;
            processNextBroadcast(false);
            return;
        }
        ......
        // Get the app process
        if (curReceiver != null && curReceiver instanceof BroadcastFilter) {
            BroadcastFilter bf = (BroadcastFilter)curReceiver;
            if (bf.receiverList.pid != 0
                    && bf.receiverList.pid != ActivityManagerService.MY_PID) {
                synchronized (mService.mPidsSelfLocked) {
                    app = mService.mPidsSelfLocked.get(
                            bf.receiverList.pid);
                }
            }
        } else {
            app = r.curApp;
        }
        ......
        // Continue with the next broadcast
        finishReceiverLocked(r, r.resultCode, r.resultData,
                r.resultExtras, r.resultAbort, false);
        scheduleBroadcastsLocked();

        if (!debugging && anrMessage != null) {
            // Trigger the ANR
            mHandler.post(new AppNotResponding(app, anrMessage));
        }
    }

The timeout message is cancelled when broadcast processing completes.

    final void cancelBroadcastTimeoutLocked() {
        if (mPendingBroadcastTimeoutMessage) {
            mHandler.removeMessages(BROADCAST_TIMEOUT_MSG, this);
            mPendingBroadcastTimeoutMessage = false;
        }
    }
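
Two deadlines appear in the listings above: one per receiver (receiverTime + mTimeoutPeriod) and one for the whole ordered broadcast (dispatchTime + 2 * mTimeoutPeriod * numReceivers). The following plain-Java sketch restates them as pure functions. BroadcastDeadlineSketch and its method names are hypothetical; only the formulas and the two timeout constants come from the code shown. The re-arm case seems to arise because the timeout message is only sent when none is pending, so it may fire on a deadline computed for an earlier receiver.

```java
// Hypothetical helper that restates the deadline arithmetic from BroadcastQueue.
class BroadcastDeadlineSketch {
    static final long BROADCAST_FG_TIMEOUT = 10 * 1000; // ms
    static final long BROADCAST_BG_TIMEOUT = 60 * 1000; // ms

    // Deadline for the receiver that is currently running.
    static long receiverDeadline(long receiverTime, long timeoutPeriod) {
        return receiverTime + timeoutPeriod;
    }

    // Deadline after which the whole ordered broadcast is forcibly finished.
    static long broadcastDeadline(long dispatchTime, long timeoutPeriod, int numReceivers) {
        return dispatchTime + 2 * timeoutPeriod * numReceivers;
    }

    // Decision taken when BROADCAST_TIMEOUT_MSG fires:
    // true means "too early, re-arm the message", false means "report the timeout".
    static boolean shouldRearm(long now, long receiverTime, long timeoutPeriod) {
        return receiverDeadline(receiverTime, timeoutPeriod) > now;
    }
}
```

For example, with the foreground timeout, a message armed at t = 0 for the first receiver fires at t = 10000. If the second receiver only started at t = 3000, its deadline is 13000, so the message is re-armed rather than reported.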

Service Timeout

Service Timeout occurs while a Service is starting: if startup cannot complete within the time limit, an ANR is triggered. Foreground and background services are given different timeouts.

frameworks/base/services/core/java/com/android/server/am/ActiveServices.java

    // How long we wait for a service to finish executing.
    static final int SERVICE_TIMEOUT = 20*1000;

    // How long we wait for a service to finish executing.
    static final int SERVICE_BACKGROUND_TIMEOUT = SERVICE_TIMEOUT * 10;    

During service startup, a delayed message is scheduled according to the time limit; it is what triggers the startup timeout.

frameworks/base/services/core/java/com/android/server/am/ActiveServices.java

    private final void bumpServiceExecutingLocked(ServiceRecord r, boolean fg, String why) {
        ......
                scheduleServiceTimeoutLocked(r.app);
        ......
    }
    ......
    private final void realStartServiceLocked(ServiceRecord r,
            ProcessRecord app, boolean execInFg) throws RemoteException {
        ......
        r.app = app;
        r.restartTime = r.lastActivity = SystemClock.uptimeMillis();

        final boolean newService = app.services.add(r);
        bumpServiceExecutingLocked(r, execInFg, "create"); // send the timeout message
        mAm.updateLruProcessLocked(app, false, null); // update the LRU list
        updateServiceForegroundLocked(r.app, /* oomAdj= */ false);
        mAm.updateOomAdjLocked(); // update the OOM adj

        boolean created = false;
        try {
            ......
            mAm.notifyPackageUse(r.serviceInfo.packageName,
                                 PackageManager.NOTIFY_PACKAGE_USE_SERVICE);
            app.forceProcessStateUpTo(ActivityManager.PROCESS_STATE_SERVICE);
            // Eventually runs the service's onCreate() method
            app.thread.scheduleCreateService(r, r.serviceInfo,
                    mAm.compatibilityInfoForPackageLocked(r.serviceInfo.applicationInfo),
                    app.repProcState);
            r.postNotification();
            created = true;
        ......
    }
    ......
    void scheduleServiceTimeoutLocked(ProcessRecord proc) {
        if (proc.executingServices.size() == 0 || proc.thread == null) {
            return;
        }
        // Schedule the delayed message that triggers the service startup timeout
        Message msg = mAm.mHandler.obtainMessage(
                ActivityManagerService.SERVICE_TIMEOUT_MSG);
        msg.obj = proc;
        mAm.mHandler.sendMessageDelayed(msg,
                proc.execServicesFg ? SERVICE_TIMEOUT : SERVICE_BACKGROUND_TIMEOUT);
    }

When service startup times out, the SERVICE_TIMEOUT_MSG message is delivered to the AMS handler.

frameworks/base/services/core/java/com/android/server/am/ActivityManagerService.java

    final class MainHandler extends Handler {
        public MainHandler(Looper looper) {
            super(looper, null, true);
        }

        @Override
        public void handleMessage(Message msg) {
            switch (msg.what) {
            ......
            case SERVICE_TIMEOUT_MSG: {
                mServices.serviceTimeout((ProcessRecord)msg.obj);
            } break;
            ......
            }
        }
    };

frameworks/base/services/core/java/com/android/server/am/ActiveServices.java

    void serviceTimeout(ProcessRecord proc) {
        String anrMessage = null;

        synchronized(mAm) {
            if (proc.executingServices.size() == 0 || proc.thread == null) {
                return;
            }
            final long now = SystemClock.uptimeMillis();
            final long maxTime =  now -
                    (proc.execServicesFg ? SERVICE_TIMEOUT : SERVICE_BACKGROUND_TIMEOUT);
            ServiceRecord timeout = null;
            long nextTime = 0;
            for (int i=proc.executingServices.size()-1; i>=0; i--) {
                // Look for a service that has timed out
                ......
            }
            if (timeout != null && mAm.mLruProcesses.contains(proc)) {
                // A service timed out; build the timeout information
                ......
                mAm.mHandler.removeCallbacks(mLastAnrDumpClearer);
                mAm.mHandler.postDelayed(mLastAnrDumpClearer, LAST_ANR_LIFETIME_DURATION_MSECS);
                anrMessage = "executing service " + timeout.shortName;
            } else {
                // No service has timed out; re-schedule the timeout message
                Message msg = mAm.mHandler.obtainMessage(
                        ActivityManagerService.SERVICE_TIMEOUT_MSG);
                msg.obj = proc;
                mAm.mHandler.sendMessageAtTime(msg, proc.execServicesFg
                        ? (nextTime+SERVICE_TIMEOUT) : (nextTime + SERVICE_BACKGROUND_TIMEOUT));
            }
        }

        if (anrMessage != null) {
            // Trigger the ANR
            mAm.mAppErrors.appNotResponding(proc, null, null, false, anrMessage);
        }
    }

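The above describes how a Service Timeout is produced. To avoid it, the timeout message must be removed within the time limit. The removal happens once the service has finished starting; the code that actually starts the service is shown below.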

frameworks/base/core/java/android/app/ActivityThread.java
    
    private void handleCreateService(CreateServiceData data) {
        unscheduleGcIdler();

        LoadedApk packageInfo = getPackageInfoNoCheck(
                data.info.applicationInfo, data.compatInfo);
        Service service = null;
        try {
            // Create the service
            java.lang.ClassLoader cl = packageInfo.getClassLoader();
            service = packageInfo.getAppFactory()
                    .instantiateService(cl, data.info.name, data.intent);
        ......
        try {
            // Create the ContextImpl object
            ContextImpl context = ContextImpl.createAppContext(this, packageInfo);
            context.setOuterContext(service);

            // Create the Application object
            Application app = packageInfo.makeApplication(false, mInstrumentation);
            service.attach(context, this, data.info.name, data.token, app,
                    ActivityManager.getService());
            service.onCreate(); // call the service's onCreate() method
            mServices.put(data.token, service);
            try {
                // Service startup is complete
                ActivityManager.getService().serviceDoneExecuting(
                        data.token, SERVICE_DONE_EXECUTING_ANON, 0, 0);
            ......
    }

frameworks/base/services/core/java/com/android/server/am/ActiveServices.java

    private void serviceDoneExecutingLocked(ServiceRecord r, boolean inDestroying,
            boolean finishing) {
        r.executeNesting--;
        if (r.executeNesting <= 0) {
            if (r.app != null) {
                r.app.execServicesFg = false;
                r.app.executingServices.remove(r);
                if (r.app.executingServices.size() == 0) {
                    // No service is executing in this process any more; remove the service timeout message
                    mAm.mHandler.removeMessages(ActivityManagerService.SERVICE_TIMEOUT_MSG, r.app);
                ......
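
The arm-on-start, disarm-on-done pattern can be tried outside Android. The sketch below is a simplified plain-Java model under stated assumptions, not the ActiveServices implementation: ServiceWatchdogSketch and its methods are hypothetical names, a ScheduledExecutorService stands in for the AMS handler, and only one timer is kept per instance, whereas the real code sends a message per call and re-checks in serviceTimeout().

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical single-use model of the service timeout for one process.
class ServiceWatchdogSketch {
    private final ScheduledExecutorService handler =
            Executors.newSingleThreadScheduledExecutor();
    private final AtomicBoolean anrReported = new AtomicBoolean(false);
    private final long timeoutMillis;
    private int executingServices = 0;
    private ScheduledFuture<?> pendingTimeout;

    ServiceWatchdogSketch(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Plays the role of bumpServiceExecutingLocked(): count the operation and arm the timer.
    synchronized void serviceStartExecuting() {
        executingServices++;
        if (pendingTimeout == null) {
            pendingTimeout = handler.schedule(
                    () -> anrReported.set(true), timeoutMillis, TimeUnit.MILLISECONDS);
        }
    }

    // Plays the role of serviceDoneExecutingLocked(): disarm once nothing is executing.
    synchronized void serviceDoneExecuting() {
        executingServices--;
        if (executingServices == 0 && pendingTimeout != null) {
            pendingTimeout.cancel(false);
            pendingTimeout = null;
        }
    }

    // Waits three timeout periods, shuts the timer thread down, and reports
    // whether the timeout fired. The instance must not be reused afterwards.
    boolean anrReportedAfterWaiting() {
        try {
            Thread.sleep(timeoutMillis * 3);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        handler.shutdownNow();
        return anrReported.get();
    }
}
```

A service that calls serviceDoneExecuting() before the timeout leaves no pending timer, so nothing is reported; one that never reports completion lets the timer fire.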

ContentProvider Timeout

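ContentProvider Timeout occurs during application startup. If publishing the app's Providers takes longer than the time limit while the app is starting, an ANR is triggered. After the application process is created, attachApplicationLocked() is called to initialize it.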

frameworks/base/services/core/java/com/android/server/am/ActivityManagerService.java

    // How long we wait for an attached process to publish its content providers
    // before we decide it must be hung.
    static final int CONTENT_PROVIDER_PUBLISH_TIMEOUT = 10*1000;
    ......
    private final boolean attachApplicationLocked(IApplicationThread thread,
            int pid, int callingUid, long startSeq) {
        ......
        // If the app has Providers, schedule a delayed message to handle the Provider timeout
        if (providers != null && checkAppInLaunchingProvidersLocked(app)) {
            Message msg = mHandler.obtainMessage(CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG);
            msg.obj = app;
            mHandler.sendMessageDelayed(msg, CONTENT_PROVIDER_PUBLISH_TIMEOUT);
        }
        ......
    }

If the Providers are not published within the time limit, the CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG message is delivered and the Handler processes it.

frameworks/base/services/core/java/com/android/server/am/ActivityManagerService.java
    
    final class MainHandler extends Handler {
        public MainHandler(Looper looper) {
            super(looper, null, true);
        }

        @Override
        public void handleMessage(Message msg) {
            switch (msg.what) {
            ......
            case CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG: {
                // Handle the Provider timeout message
                ProcessRecord app = (ProcessRecord)msg.obj;
                synchronized (ActivityManagerService.this) {
                    processContentProviderPublishTimedOutLocked(app);
                }
            } break;
            ......
        }
    }
    ......
    boolean removeProcessLocked(ProcessRecord app,
            boolean callerWillRestart, boolean allowRestart, String reason) {
        final String name = app.processName;
        final int uid = app.uid;
        ......
        // Remove the corresponding entry from mProcessNames
        removeProcessNameLocked(name, uid);
        ......
        boolean needRestart = false;
        if ((app.pid > 0 && app.pid != MY_PID) || (app.pid == 0 && app.pendingStart)) {
            int pid = app.pid;
            if (pid > 0) {
                // Handle related state before killing the process
                ......
            }
            // Decide whether the process needs to be restarted
            boolean willRestart = false;
            if (app.persistent && !app.isolated) {
                if (!callerWillRestart) {
                    willRestart = true;
                } else {
                    needRestart = true;
                }
            }
            app.kill(reason, true); // kill the process
            handleAppDiedLocked(app, willRestart, allowRestart); // reclaim resources
            if (willRestart) {
                removeLruProcessLocked(app);
                addAppLocked(app.info, null, false, null /* ABI override */);
            }
        } else {
            mRemovedProcesses.add(app);
        }

        return needRestart;
    }
    ......
    private final void processContentProviderPublishTimedOutLocked(ProcessRecord app) {
        // Clean up the Providers
        cleanupAppInLaunchingProvidersLocked(app, true);
        // Clean up the application process
        removeProcessLocked(app, false, true, "timeout publishing content providers");
    }
    ......
    boolean cleanupAppInLaunchingProvidersLocked(ProcessRecord app, boolean alwaysBad) {
        boolean restart = false;
        for (int i = mLaunchingProviders.size() - 1; i >= 0; i--) {
            ContentProviderRecord cpr = mLaunchingProviders.get(i);
            if (cpr.launchingApp == app) {
                if (!alwaysBad && !app.bad && cpr.hasConnectionOrHandle()) {
                    restart = true;
                } else {
                    // Remove the dying Provider
                    removeDyingProviderLocked(app, cpr, true);
                }
            }
        }
        return restart;
    }

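As shown, when a ContentProvider Timeout occurs, appNotResponding() is not called; the offending process is simply killed and its related state cleaned up. The Provider timeout message is removed when publishing succeeds, as the following code shows.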

frameworks/base/services/core/java/com/android/server/am/ActivityManagerService.java

    public final void publishContentProviders(IApplicationThread caller,
            List<ContentProviderHolder> providers) {
        ......
        synchronized (this) {
            final ProcessRecord r = getRecordForAppLocked(caller);
            ......
            final int N = providers.size();
            for (int i = 0; i < N; i++) {
                ContentProviderHolder src = providers.get(i);
                ......
                ContentProviderRecord dst = r.pubProviders.get(src.info.name);
                if (dst != null) {
                    ComponentName comp = new ComponentName(dst.info.packageName, dst.info.name);
                    mProviderMap.putProviderByClass(comp, dst);
                    ......
                    if (wasInLaunchingProviders) {
                        // Remove the timeout message
                        mHandler.removeMessages(CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG, r);
                    }
                    ......
                }
            }

            Binder.restoreCallingIdentity(origId);
        }
    }