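This article records some less frequently discussed details of Binder IPC, along with common cases of Binder-related exceptions such as DeadSystemException and DeadObjectException.
Binder background
A Binder IPC call (process A -> process B) can be roughly understood as three steps:

- In the calling process (caller), the binder framework serializes the IPC arguments into a contiguous block of binary data and places it in a Parcel.
- The kernel driver copies the data in the caller's Parcel into the binder buffer of the target process (callee).
- The target process deserializes the binary data in its binder buffer back into argument objects and executes the corresponding IPC call.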
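Each step of the flow above can hit errors that cannot be avoided, for example:
- In step 2, while copying data to the target process, the driver finds that the data to be written exceeds BINDER_VM_SIZE.
- In step 2, while copying data to the target process, the target process suddenly crashes (force close) or is killed by the system.
- Several other processes are also sending IPCs to the (current) target process, and the target responds slowly (complex business logic or jank). Its binder buffer fills up with these pending requests, so IPCs issued later fail.
- Likewise, while the caller is sending an IPC to the target, other processes in the background are sending IPCs to the caller, leaving the caller's binder buffer short of space. The target process then fails when it tries to return the IPC result to the caller.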
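However, the Binder driver reports only a few error codes to the Binder framework above it:
BR_DEAD_REPLY
The target process is dead and the binder driver cannot handle this IPC. The binder framework throws DeadObjectException to the Java layer.
BR_FAILED_REPLY
The actual cause is usually one of three:
- The target process's binder buffer does not have enough space for the data this IPC needs to transfer, so the IPC fails.
- The target process dies while the IPC data is being copied into its binder buffer, so the IPC fails.
- The target process returns a very large amount of data, or the calling process is serving IPCs from other processes at the same time, so the caller's binder buffer does not have enough free space for the reply, and the IPC fails.
Because the driver does not return a detailed error code, the framework can only go by the amount of data sent to the target process in this call: if it exceeds 200KB it throws TransactionTooLargeException, and in all other cases it throws DeadObjectException.
BR_FROZEN_REPLY
AOSP and OEM vendors each have their own policies for freezing app processes. The goal is to reduce the CPU and memory used by background apps, which lowers power consumption and extends standby time.
If a synchronous Binder IPC is sent to a frozen process, the binder driver rejects the call outright. When the framework receives this error, it throws TransactionTooLargeException or DeadObjectException, just as it does for BR_FAILED_REPLY.
Related code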
```cpp
// frameworks/base/core/jni/android_util_Binder.cpp
static jboolean android_os_BinderProxy_transact(JNIEnv* env, jobject obj,
        jint code, jobject dataObj, jobject replyObj, jint flags) {
    Parcel* data = parcelForJavaObject(env, dataObj);
    if (data == NULL) {
        return JNI_FALSE;
    }
    Parcel* reply = parcelForJavaObject(env, replyObj);
    if (reply == NULL && replyObj != NULL) {
        return JNI_FALSE;
    }
    IBinder* target = getBPNativeData(env, obj)->mObject.get();
    if (target == NULL) {
        jniThrowException(env, "java/lang/IllegalStateException", "Binder has been finalized!");
        return JNI_FALSE;
    }
    // The actual interaction with the binder driver goes through IPCThreadState::transact()
    status_t err = target->transact(code, *data, reply, flags);
    if (err == NO_ERROR) {
        return JNI_TRUE;
    }
    if (err == UNKNOWN_TRANSACTION) {
        return JNI_FALSE;
    }
    signalExceptionForError(env, obj, err, true /*canThrowRemoteException*/, data->dataSize());
    return JNI_FALSE;
}
```
IPCThreadState::waitForResponse() waits for the reply to this IPC and translates the BR_* return codes from the driver into status_t error values:
```cpp
// frameworks/native/libs/binder/IPCThreadState.cpp
status_t IPCThreadState::waitForResponse(Parcel *reply, status_t *acquireResult) {
    uint32_t cmd;
    int32_t err;
    while (1) {
        if ((err=talkWithDriver()) < NO_ERROR) break;
        err = mIn.errorCheck();
        if (err < NO_ERROR) break;
        if (mIn.dataAvail() == 0) continue;
        cmd = (uint32_t)mIn.readInt32();
        switch (cmd) {
        // ...
        case BR_DEAD_REPLY:
            err = DEAD_OBJECT;
            goto finish;
        case BR_FAILED_REPLY:
            err = FAILED_TRANSACTION;
            goto finish;
        case BR_FROZEN_REPLY:
            ALOGW("Transaction failed because process frozen.");
            err = FAILED_TRANSACTION;
            goto finish;
        // ...
        }
    }
finish:
    // ...
    return err;
}
```
```cpp
// frameworks/base/core/jni/android_util_Binder.cpp
void signalExceptionForError(JNIEnv* env, jobject obj, status_t err,
        bool canThrowRemoteException, int parcelSize) {
    switch (err) {
        // ...
        case DEAD_OBJECT:
            // DeadObjectException is a checked exception, only throw from certain methods.
            jniThrowException(env, canThrowRemoteException
                    ? "android/os/DeadObjectException"
                    : "java/lang/RuntimeException", NULL);
            break;
        case FAILED_TRANSACTION: {
            ALOGE("!!! FAILED BINDER TRANSACTION !!! (parcel size = %d)", parcelSize);
            const char* exceptionToThrow;
            std::string msg;
            // TransactionTooLargeException is a checked exception, only throw from certain methods.
            // TODO(b/28321379): Transaction size is the most common cause for FAILED_TRANSACTION
            // but it is not the only one. The Binder driver can return BR_FAILED_REPLY
            // for other reasons also, such as if the transaction is malformed or
            // refers to an FD that has been closed. We should change the driver
            // to enable us to distinguish these cases in the future.
            if (canThrowRemoteException && parcelSize > 200*1024) {
                // bona fide large payload
                exceptionToThrow = "android/os/TransactionTooLargeException";
                msg = base::StringPrintf("data parcel size %d bytes", parcelSize);
            } else {
                // Heuristic: a payload smaller than this threshold "shouldn't" be too
                // big, so it's probably some other, more subtle problem. In practice
                // it seems to always mean that the remote process died while the binder
                // transaction was already in flight.
                exceptionToThrow = (canThrowRemoteException)
                        ? "android/os/DeadObjectException"
                        : "java/lang/RuntimeException";
                msg = "Transaction failed on small parcel; remote process probably died, but "
                      "this could also be caused by running out of binder buffer space";
            }
            jniThrowException(env, exceptionToThrow, msg.c_str());
        } break;
        // ...
    }
}
```
Java Binder exception class hierarchy
```
android.os.RemoteException
|_ TransactionTooLargeException
|_ DeadObjectException
   |_ DeadSystemException
```
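DeadSystemException means that an IPC to system_server hit a DeadObjectException. As the background section above shows, however, a DeadObjectException does not necessarily mean that the target process, system_server, has actually died.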
Binder exception case studies
TransactionTooLargeException
Case 1: a synchronous binder call transfers too much data
```
AndroidRuntime: FATAL EXCEPTION: main
AndroidRuntime: Process: com.mmi.player, PID: 27332
AndroidRuntime: java.lang.RuntimeException: android.os.TransactionTooLargeException: data parcel size 1084868 bytes
AndroidRuntime: at android.app.ContextImpl.sendBroadcast(ContextImpl.java:1240)
AndroidRuntime: at android.content.ContextWrapper.sendBroadcast(ContextWrapper.java:515)
AndroidRuntime: at com.mmi.player.newplayer.PlayCenterUtil.j(SourceFile:61)
AndroidRuntime: at com.mmi.player.newplayer.manager.module.MusicWidgetManager.h(SourceFile:9)
AndroidRuntime: at com.mmi.player.newplayer.manager.module.MusicWidgetManager.b(SourceFile:1)
AndroidRuntime: at com.mmi.player.newplayer.manager.module.l.run(SourceFile:1)
AndroidRuntime: at android.os.Handler.handleCallback(Handler.java:958)
AndroidRuntime: at android.os.Handler.dispatchMessage(Handler.java:99)
AndroidRuntime: at android.os.Looper.loopOnce(Looper.java:222)
AndroidRuntime: at android.os.Looper.loop(Looper.java:314)
AndroidRuntime: at android.app.ActivityThread.main(ActivityThread.java:8790)
AndroidRuntime: at java.lang.reflect.Method.invoke(Native Method)
AndroidRuntime: at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:561)
AndroidRuntime: at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1013)
AndroidRuntime: Caused by: android.os.TransactionTooLargeException: data parcel size 1084868 bytes
AndroidRuntime: at android.os.BinderProxy.transactNative(Native Method)
AndroidRuntime: at android.os.BinderProxy.transact(BinderProxy.java:639)
AndroidRuntime: at android.app.IActivityManager$Stub$Proxy.broadcastIntentWithFeature(IActivityManager.java:6281)
AndroidRuntime: at android.app.ContextImpl.sendBroadcast(ContextImpl.java:1235)
AndroidRuntime: ... 13 more
```
The player app sent 1084868 bytes (about 1059.4KB) to the target process, which exceeds BINDER_VM_SIZE.
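The comparison can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the AOSP definition of BINDER_VM_SIZE as 1MB minus two pages and a 4KB page size; the constant and function names are illustrative and are not framework identifiers.

```cpp
#include <cstdint>

// Assumption: 4 KB pages. With a different page size the result changes.
constexpr std::int64_t kPageSize = 4096;

// Same shape as the AOSP definition: (1 * 1024 * 1024) - page_size * 2.
constexpr std::int64_t kBinderVmSize = 1 * 1024 * 1024 - kPageSize * 2;

// Parcel size reported in the stack trace of this case.
constexpr std::int64_t kParcelBytes = 1084868;

// True when a payload cannot fit even into a completely empty binder buffer.
constexpr bool exceedsBinderBuffer(std::int64_t bytes) {
    return bytes > kBinderVmSize;
}

static_assert(kBinderVmSize == 1040384, "1016 KB with 4 KB pages");
static_assert(exceedsBinderBuffer(kParcelBytes), "the case 1 payload is too large");
```

Under these assumptions the buffer is 1016KB, so a 1059.4KB parcel would fail no matter how idle the target process is.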
Case 2: an asynchronous binder call transfers too much data
```
AndroidRuntime: FATAL EXCEPTION: main
AndroidRuntime: Process: com.ss.android.ugc.aweme, PID: 30553
AndroidRuntime: java.lang.RuntimeException: android.os.TransactionTooLargeException: data parcel size 546784 bytes
AndroidRuntime: at android.app.ActivityClient.activityStopped(ActivityClient.java:86)
AndroidRuntime: at android.app.servertransaction.PendingTransactionActions$StopInfo.run(PendingTransactionActions.java:143)
AndroidRuntime: at android.os.Handler.handleCallback(Handler.java:938)
AndroidRuntime: at android.os.Handler.dispatchMessage(Handler.java:99)
AndroidRuntime: at android.os.Looper.loopOnce(Looper.java:210)
AndroidRuntime: at android.os.Looper.loop(Looper.java:299)
AndroidRuntime: at android.app.ActivityThread.main(ActivityThread.java:8269)
AndroidRuntime: at java.lang.reflect.Method.invoke(Native Method)
AndroidRuntime: at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:576)
AndroidRuntime: at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1073)
AndroidRuntime: Caused by: android.os.TransactionTooLargeException: data parcel size 546784 bytes
AndroidRuntime: at android.os.BinderProxy.transactNative(Native Method)
AndroidRuntime: at android.os.BinderProxy.transact(BinderProxy.java:624)
AndroidRuntime: at android.app.IActivityClientController$Stub$Proxy.activityStopped(IActivityClientController.java:1297)
AndroidRuntime: at android.app.ActivityClient.activityStopped(ActivityClient.java:83)
AndroidRuntime: ... 9 more
```
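As we know, Activity provides the onSaveInstanceState/onRestoreInstanceState callbacks for saving and restoring page state. The saved data is sent to system_server, so that even if the app process dies, the page can be restored to its previous state when the user returns to it (provided the developer has implemented this).
ActivityClient.activityStopped is the call that passes the app's page state to the system_server process, but it is a oneway binder call, which can transfer at most 508KB of data.
In this example, one of Douyin's Activity pages must have saved 546784 bytes, about 534KB, in its onSaveInstanceState callback, which clearly exceeds the limit.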
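The 508KB figure is half of the 1016KB binder buffer, since the driver sets aside half of the buffer for oneway transactions. A minimal sketch of that arithmetic, assuming 4KB pages; the names are illustrative only.

```cpp
#include <cstdint>

// Assumption: 1 MB minus two 4 KB pages, i.e. 1016 KB.
constexpr std::int64_t kBinderVmSize = 1016 * 1024;

// Space available to oneway (asynchronous) transactions: half of the buffer.
constexpr std::int64_t kAsyncBudget = kBinderVmSize / 2;

// True when a single oneway payload fits into an otherwise empty async budget.
constexpr bool fitsOnewayBudget(std::int64_t bytes) {
    return bytes <= kAsyncBudget;
}

static_assert(kAsyncBudget == 508 * 1024, "the 508 KB limit quoted in the text");
static_assert(!fitsOnewayBudget(546784), "the saved state in this case does not fit");
```

This budget is shared by all pending oneway transactions sent to the process, so the usable space at any moment is often smaller than 508KB.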
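Tips
Android's Activity lifecycle callbacks are fairly complex. Avoid relying on onSaveInstanceState to store data; where persistence is really needed, writing to a file is more dependable.
DeadObjectException
Case 1: the peer process died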
```
04:37:47.638 2313 2453 I ActivityManager: Killing 26784:com.mmi.guardprovider/u0a159 (adj 250): MemoryReclaimService(service)
04:37:47.668 1658 1658 I Zygote : Process 26784 exited due to signal 9 (Killed)
04:37:47.682 31103 31103 E AndroidRuntime: FATAL EXCEPTION: main
04:37:47.682 31103 31103 E AndroidRuntime: Process: com.mmi.securitycenter.remote, PID: 31103
04:37:47.682 31103 31103 E AndroidRuntime: java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
04:37:47.682 31103 31103 E AndroidRuntime: at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:571)
04:37:47.682 31103 31103 E AndroidRuntime: at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1013)
04:37:47.682 31103 31103 E AndroidRuntime: Caused by: java.lang.reflect.InvocationTargetException
04:37:47.682 31103 31103 E AndroidRuntime: at java.lang.reflect.Method.invoke(Native Method)
04:37:47.682 31103 31103 E AndroidRuntime: at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:561)
04:37:47.682 31103 31103 E AndroidRuntime: ... 1 more
04:37:47.682 31103 31103 E AndroidRuntime: Caused by: android.os.DeadObjectException
04:37:47.682 31103 31103 E AndroidRuntime: at android.os.BinderProxy.transactNative(Native Method)
04:37:47.682 31103 31103 E AndroidRuntime: at android.os.BinderProxy.transact(BinderProxy.java:621)
04:37:47.682 31103 31103 E AndroidRuntime: at com.mmi.guardprovider.aidl.IAntiVirusServer$Stub$a.q0(Unknown Source:21)
04:37:47.682 31103 31103 E AndroidRuntime: at re.l.i(Unknown Source:81)
04:37:47.682 31103 31103 E AndroidRuntime: at re.l.a(Unknown Source:0)
04:37:47.682 31103 31103 E AndroidRuntime: at re.k.a(Unknown Source:4)
04:37:47.682 31103 31103 E AndroidRuntime: at f9.a$d.onServiceConnected(Unknown Source:38)
04:37:47.682 31103 31103 E AndroidRuntime: at android.app.LoadedApk$ServiceDispatcher.doConnected(LoadedApk.java:2253)
04:37:47.682 31103 31103 E AndroidRuntime: at android.app.LoadedApk$ServiceDispatcher$RunConnection.run(LoadedApk.java:2286)
04:37:47.682 31103 31103 E AndroidRuntime: at android.os.Handler.handleCallback(Handler.java:958)
04:37:47.682 31103 31103 E AndroidRuntime: at android.os.Handler.dispatchMessage(Handler.java:99)
04:37:47.682 31103 31103 E AndroidRuntime: at android.os.Looper.loopOnce(Looper.java:224)
04:37:47.682 31103 31103 E AndroidRuntime: at android.os.Looper.loop(Looper.java:318)
04:37:47.682 31103 31103 E AndroidRuntime: at android.app.ActivityThread.main(ActivityThread.java:8669)
04:37:47.682 31103 31103 E AndroidRuntime: ... 3 more
```
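The log shows that the target process 26784 had already been killed by the system before the IAntiVirusServer$Stub$a.q0 call was made.
Case 2: the target process's binder buffer is exhausted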
```
11-08 19:38:41.848 10095 9413 10265 E AndroidRuntime: FATAL EXCEPTION: Thread-16
11-08 19:38:41.848 10095 9413 10265 E AndroidRuntime: Process: com.wwm.mtbf, PID: 9413
11-08 19:38:41.848 10095 9413 10265 E AndroidRuntime: java.lang.RuntimeException: android.os.DeadObjectException: Transaction failed on small parcel; remote process probably died, but this could also be caused by running out of binder buffer space
11-08 19:38:41.848 10095 9413 10265 E AndroidRuntime: at com.wwm.mtbf.SlowBinderCallDemo$2.run(SlowBinderCallDemo.java:58)
11-08 19:38:41.848 10095 9413 10265 E AndroidRuntime: at java.lang.Thread.run(Thread.java:1012)
11-08 19:38:41.848 10095 9413 10265 E AndroidRuntime: Caused by: android.os.DeadObjectException: Transaction failed on small parcel; remote process probably died, but this could also be caused by running out of binder buffer space
11-08 19:38:41.848 10095 9413 10265 E AndroidRuntime: at android.os.BinderProxy.transactNative(Native Method)
11-08 19:38:41.848 10095 9413 10265 E AndroidRuntime: at android.os.BinderProxy.transact(BinderProxy.java:684)
11-08 19:38:41.848 10095 9413 10265 E AndroidRuntime: at com.wwm.mtbf.IMyAidlInterface$Stub$Proxy.mockSlowResonse(IMyAidlInterface.java:124)
11-08 19:38:41.848 10095 9413 10265 E AndroidRuntime: at com.wwm.mtbf.SlowBinderCallDemo$2.run(SlowBinderCallDemo.java:56)
11-08 19:38:41.848 10095 9413 10265 E AndroidRuntime: ... 1 more
```
The binder driver log at that moment:
```
[188157.002191] binder: undelivered TRANSACTION_COMPLETE
[188157.002213] binder: undelivered transaction 15278855, process died.
# The IPC issued by thread 10265 of process 9413 needs a 95296-byte allocation
[188166.932812] binder_alloc: 9905: binder_alloc_buf size 95296 failed, no address space
# The target process 9905 currently has 10 binder calls in progress; only 87424 bytes of buffer space remain
[188166.932819] binder_alloc: allocated: 952960 (num: 10 largest: 95296), free: 87424 (num: 1 largest: 87424)
[188166.932824] binder: 9413:10265 transaction failed 29201/-28, size 95296-0 line 3381
```
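The numbers in this driver log are consistent with each other, which a little arithmetic confirms. The sketch below only restates values printed in the log; the helper name canAllocate is hypothetical, and the rule it encodes (a request needs a single free chunk at least as large as itself) is a simplified reading of the allocator.

```cpp
#include <cstdint>

constexpr std::int64_t kRequest = 95296;    // "binder_alloc_buf size 95296 failed"
constexpr std::int64_t kAllocated = 952960; // "allocated: 952960 (num: 10 ...)"
constexpr std::int64_t kFree = 87424;       // "free: 87424 (num: 1 largest: 87424)"

// Simplified rule: the request must fit into the largest free chunk.
constexpr bool canAllocate(std::int64_t largestFreeChunk, std::int64_t request) {
    return request <= largestFreeChunk;
}

static_assert(kAllocated == 10 * kRequest, "ten in-flight calls of 95296 bytes each");
static_assert(kAllocated + kFree == 1016 * 1024, "the whole 1016 KB buffer is accounted for");
static_assert(!canAllocate(kFree, kRequest), "the 11th call cannot be placed");
```

Ten slow calls of about 93KB each are enough to fill the target's buffer, so the eleventh fails even though the target process is alive.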
Case 3: the calling process's binder buffer is exhausted
```
02-22 17:42:56.314 1000 2495 6267 W system_server: Large reply transaction of 1000776 bytes, interface descriptor , code 1
02-22 17:42:56.316 1000 1720 1720 W BpBinder: Large or Failed outgoing transaction of 4 bytes, interface descriptor , code 1
02-22 17:42:56.316 1000 1720 1720 E JavaBinder: !!! FAILED BINDER TRANSACTION !!! (parcel size = 4)
02-22 17:42:56.316 1000 1720 1720 D AndroidRuntime: Shutting down VM
02-22 17:42:56.317 1000 1720 1720 E AndroidRuntime: FATAL EXCEPTION: main
02-22 17:42:56.317 1000 1720 1720 E AndroidRuntime: Process: com.android.systemui, PID: 1720
02-22 17:42:56.317 1000 1720 1720 E AndroidRuntime: android.os.BadParcelableException: Failure retrieving array; only received 16 of 22
02-22 17:42:56.317 1000 1720 1720 E AndroidRuntime: at android.content.pm.BaseParceledListSlice.<init>(BaseParceledListSlice.java:104)
02-22 17:42:56.317 1000 1720 1720 E AndroidRuntime: at android.content.pm.ParceledListSlice.<init>(ParceledListSlice.java:42)
02-22 17:42:56.317 1000 1720 1720 E AndroidRuntime: at android.content.pm.ParceledListSlice.<init>(Unknown Source:0)
02-22 17:42:56.317 1000 1720 1720 E AndroidRuntime: at android.content.pm.ParceledListSlice$1.createFromParcel(ParceledListSlice.java:80)
02-22 17:42:56.317 1000 1720 1720 E AndroidRuntime: at android.content.pm.ParceledListSlice$1.createFromParcel(ParceledListSlice.java:78)
02-22 17:42:56.317 1000 1720 1720 E AndroidRuntime: at android.os.Parcel.readTypedObject(Parcel.java:4025)
02-22 17:42:56.317 1000 1720 1720 E AndroidRuntime: at android.app.INotificationManager$Stub$Proxy.getActiveNotificationsFromListener(INotificationManager.java:4451)
02-22 17:42:56.317 1000 1720 1720 E AndroidRuntime: at android.service.notification.NotificationListenerService.getActiveNotifications(NotificationListenerService.java:1067)
02-22 17:42:56.317 1000 1720 1720 E AndroidRuntime: at android.service.notification.NotificationListenerService.getActiveNotifications(NotificationListenerService.java:991)
02-22 17:42:56.317 1000 1720 1720 E AndroidRuntime: at com.android.systemui.statusbar.phone.NotificationListenerWithPlugins.getActiveNotifications(go/retraceme e705dac4e8523a576a9f2e57230c33a5a6b8c24d494a7f72a88e51a609ebdefc:1)
02-22 17:42:56.317 1000 1720 1720 E AndroidRuntime: at com.android.systemui.statusbar.NotificationListener.onListenerConnected(go/retraceme e705dac4e8523a576a9f2e57230c33a5a6b8c24d494a7f72a88e51a609ebdefc:14)
02-22 17:42:56.317 1000 1720 1720 E AndroidRuntime: at com.android.systemui.statusbar.notification.NotificationListener.onListenerConnected(go/retraceme e705dac4e8523a576a9f2e57230c33a5a6b8c24d494a7f72a88e51a609ebdefc:1)
02-22 17:42:56.317 1000 1720 1720 E AndroidRuntime: at android.service.notification.NotificationListenerService$MyHandler.handleMessage(NotificationListenerService.java:2416)
02-22 17:42:56.317 1000 1720 1720 E AndroidRuntime: at android.os.Handler.dispatchMessage(Handler.java:106)
02-22 17:42:56.317 1000 1720 1720 E AndroidRuntime: at android.os.Looper.loopOnce(Looper.java:224)
02-22 17:42:56.317 1000 1720 1720 E AndroidRuntime: at android.os.Looper.loop(Looper.java:318)
02-22 17:42:56.317 1000 1720 1720 E AndroidRuntime: at android.app.ActivityThread.main(ActivityThread.java:8762)
02-22 17:42:56.317 1000 1720 1720 E AndroidRuntime: at java.lang.reflect.Method.invoke(Native Method)
02-22 17:42:56.317 1000 1720 1720 E AndroidRuntime: at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:561)
02-22 17:42:56.317 1000 1720 1720 E AndroidRuntime: at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1013)
02-22 17:42:56.317 1000 1720 1720 E AndroidRuntime: Caused by: android.os.DeadObjectException: Transaction failed on small parcel; remote process probably died, but this could also be caused by running out of binder buffer space
02-22 17:42:56.317 1000 1720 1720 E AndroidRuntime: at android.os.BinderProxy.transactNative(Native Method)
02-22 17:42:56.317 1000 1720 1720 E AndroidRuntime: at android.os.BinderProxy.transact(BinderProxy.java:639)
02-22 17:42:56.317 1000 1720 1720 E AndroidRuntime: at android.content.pm.BaseParceledListSlice.<init>(BaseParceledListSlice.java:95)
02-22 17:42:56.317 1000 1720 1720 E AndroidRuntime: ... 19 more
```
Compare this with the binder driver log from the same moment:
```
[ 7508.578077][ T6267] binder_alloc: 1720: binder_alloc_buf size 1047296 failed, no address space
# systemui's binder buffer currently has 70528 bytes in use and 969856 bytes free (largest chunk 969848); together that is 1016KB
[ 7508.578093][ T6267] binder_alloc: allocated: 70528 (num: 1 largest: 70528), free: 969856 (num: 2 largest: 969848)
[ 7508.578105][ T6267] binder: 2495:6267 transaction failed 29201/-28, size 1000776-46520 line 3325
[ 7508.578154][ T6267] binder: send failed reply for transaction 7111893 to 1720:1720
[ 7509.171866][ T3979] binder_alloc: 1720: binder_alloc_buf, no vma
[ 7509.179961][ T6267] binder: 2495:6267 transaction failed 29189/-22, size 132-0 line 3133
```
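systemui requested the notification list from system_server, but the reply that system_server sent back came to about 1022KB, and the systemui process's binder buffer simply has no room for that much data.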
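The 1022KB figure can be reconstructed from the driver log. The sketch below assumes that the "size 1000776-46520" field is the data size followed by the offsets size, and that each part is rounded up to pointer size before allocation; both are readings of the kernel log format rather than something the log itself states. The sum does match the failed allocation size of 1047296 bytes.

```cpp
#include <cstdint>

// Round n up to a multiple of align (align must be a power of two).
constexpr std::int64_t alignUp(std::int64_t n, std::int64_t align) {
    return (n + align - 1) & ~(align - 1);
}

// Assumed composition of the buffer request: aligned data plus aligned offsets.
constexpr std::int64_t requestedBytes(std::int64_t dataSize, std::int64_t offsetsSize) {
    return alignUp(dataSize, 8) + alignUp(offsetsSize, 8);
}

constexpr std::int64_t kRequested = requestedBytes(1000776, 46520);
constexpr std::int64_t kBufferSize = 70528 + 969856; // allocated + free from the log

static_assert(kRequested == 1047296, "matches binder_alloc_buf size in the log");
static_assert(kBufferSize == 1016 * 1024, "systemui's whole binder buffer");
static_assert(kRequested > kBufferSize, "the reply cannot fit even into an empty buffer");
```

In other words, this reply would have failed even if systemui's buffer had been completely empty, because the request is larger than the whole 1016KB mapping.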
DeadSystemException
Case 1: a backlog of asynchronous binder calls
```
09-03 20:42:26.204 1284 1318 E IPCThreadState: Process seems to be sending too many oneway calls.
09-03 20:42:27.243 25810 25810 E IPCThreadState: Binder transaction failure: 22351626/29201/-28
09-03 20:42:27.243 25810 25810 W BpBinder: Large or Failed outgoing transaction of 80 bytes, interface descriptor , code 51
09-03 20:42:27.243 25810 25810 E JavaBinder: !!! FAILED BINDER TRANSACTION !!! (parcel size = 80)
09-03 20:42:27.244 1284 1318 E IPCThreadState: Binder transaction failure: 22351631/29201/-28
# Transaction code 3 of INetdEventListener, i.e. onConnectEvent
09-03 20:42:27.244 1284 1318 W BpBinder: Large or Failed outgoing transaction of 140 bytes, interface descriptor android.net.metrics.INetdEventListener, code 3
09-03 20:42:27.244 25810 25810 D AndroidRuntime: Shutting down VM
09-03 20:42:27.244 25810 25810 E AndroidRuntime: FATAL EXCEPTION: main
09-03 20:42:27.244 25810 25810 E AndroidRuntime: Process: com.chinamworld.main, PID: 25810
09-03 20:42:27.244 25810 25810 E AndroidRuntime: DeadSystemException: The system died; earlier logs will point to the root cause
```
The binder log at that moment:
```
# 1633 is netd
[ 1805.078745][ T1672] binder_alloc: 2420: pid 1633 spamming oneway? 1678 buffers allocated for a total size of 416208
[ 1805.078851][ T1672] binder_alloc: 2420: pid 1633 spamming oneway? 1679 buffers allocated for a total size of 416456
[ 1805.079049][ T1672] binder_alloc: 2420: pid 1633 spamming oneway? 1680 buffers allocated for a total size of 416704
[ 1805.079282][ T1672] binder_alloc: 2420: pid 1633 spamming oneway? 1681 buffers allocated for a total size of 416952
[ 1805.079355][ T1672] binder_alloc: 2420: pid 1633 spamming oneway? 1682 buffers allocated for a total size of 417200
```
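The logs show that the system_server process received a flood of INetdEventListener.onConnectEvent requests that it could not process fast enough, which left its binder buffer short of space.
The cause was that app processes opened a large number of network connections during this period. Our networking colleagues had discussed this with Google earlier; Google regards it as malicious behavior by a specific app and recommends identifying which apps are responsible and pushing them to fix it.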
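A rough reading of the "spamming oneway?" lines shows how much of the receiver's oneway space netd alone was holding. The sketch assumes the 1016KB buffer with half of it available to oneway calls (4KB pages); the function names are illustrative.

```cpp
#include <cstdint>

// Assumption: half of a 1016 KB buffer is available to oneway transactions.
constexpr std::int64_t kAsyncBudget = (1016 * 1024) / 2;

// Bytes added by one more queued call, from two consecutive log lines.
constexpr std::int64_t bytesPerCall(std::int64_t totalBefore, std::int64_t totalAfter) {
    return totalAfter - totalBefore;
}

// Share of the oneway budget held by one sender, in whole percent (rounded down).
constexpr std::int64_t percentOfAsyncBudget(std::int64_t bytes) {
    return bytes * 100 / kAsyncBudget;
}

static_assert(bytesPerCall(416208, 416456) == 248, "each queued call holds 248 bytes");
static_assert(percentOfAsyncBudget(416208) == 80, "netd holds about 80% of the budget");
```

Each pending onConnectEvent is tiny, but roughly 1680 of them together occupy about 80% of the oneway space, which is why unrelated 80-byte and 140-byte transactions start failing.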
Asynchronous binder calls
Unlike synchronous binder calls, all asynchronous binder calls to the same interface (for example ActivityManagerService) are processed serially.
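The effect of serial processing can be illustrated with a toy model. This is a simplified sketch, not driver code: the field names echo the kernel's binder_node (has_async_transaction, async_todo), while NodeModel and the three functions are hypothetical.

```cpp
// Toy model of the oneway bookkeeping for a single binder object.
struct NodeModel {
    bool hasAsyncTransaction = false; // a oneway call is currently being handled
    int asyncTodo = 0;                // oneway calls parked behind it
};

// A new oneway call arrives. Returns true if it is dispatched right away.
constexpr bool submitOneway(NodeModel& node) {
    if (node.hasAsyncTransaction) {
        ++node.asyncTodo; // parked; its buffer stays allocated in the meantime
        return false;
    }
    node.hasAsyncTransaction = true;
    return true;
}

// The receiver finished the in-flight call and released its buffer.
// Returns true if a parked call was dispatched next.
constexpr bool finishOneway(NodeModel& node) {
    if (node.asyncTodo > 0) {
        --node.asyncTodo;
        return true;
    }
    node.hasAsyncTransaction = false;
    return false;
}

// Number of calls still parked after `calls` arrive and `finished` complete.
constexpr int parkedAfter(int calls, int finished) {
    NodeModel node{};
    for (int i = 0; i < calls; ++i) submitOneway(node);
    for (int i = 0; i < finished; ++i) finishOneway(node);
    return node.asyncTodo;
}

static_assert(parkedAfter(5, 0) == 4, "only the first call is dispatched immediately");
```

Because every parked call keeps its buffer allocated, a sender that produces oneway calls faster than the receiver drains them steadily eats into the receiver's oneway space, even when the receiver has idle binder threads.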
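Google engineers sometimes think quite differently from us. Here, for example, we would argue that if a malicious app issues a flood of requests, the system itself should limit how much network resource that app can use, rather than leave the fix to negotiation with the app's developers. If a malicious app can easily exhaust system_server's binder buffer and directly disrupt binder communication between system_server and every other process, the whole system becomes unstable.
Further reading: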