1. Symptom

The VPA app crashes frequently. The logs show that the com.iflytek.autofly.avatar process is repeatedly killed.

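2. Locating the problem

2.1 Search for AndroidRuntime

First check whether the process died from an uncaught exception. The log shows that the process that crashed is com.gxa.service.btcall (a NullPointerException in BtCallService.onCreate), not the com.iflytek.autofly.avatar app we care about, so this lead ends here.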

E AndroidRuntime: FATAL EXCEPTION: main
E AndroidRuntime: Process: com.gxa.service.btcall, PID: 16399
E AndroidRuntime: java.lang.RuntimeException: Unable to create service com.gxa.service.btcall.BtCallService: java.lang.NullPointerException: Attempt to invoke virtual method 'boolean ts.car.bluetooth.sdk.phone.BluetoothPhoneBookManager.registerCallback(ts.car.bluetooth.sdk.phone.BluetoothPhoneBookManager$BluetoothPhoneBookCallBack)' on a null object reference
E AndroidRuntime: at android.app.ActivityThread.handleCreateService(ActivityThread.java:3582)
E AndroidRuntime: at android.app.ActivityThread.access$1300(ActivityThread.java:200)
E AndroidRuntime: at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1672)
E AndroidRuntime: at android.os.Handler.dispatchMessage(Handler.java:106)
E AndroidRuntime: at android.os.Looper.loop(Looper.java:193)
E AndroidRuntime: at android.app.ActivityThread.main(ActivityThread.java:6718)
E AndroidRuntime: at java.lang.reflect.Method.invoke(Native Method)
E AndroidRuntime: at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:493)
E AndroidRuntime: at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:858)
E AndroidRuntime: Caused by: java.lang.NullPointerException: Attempt to invoke virtual method 'boolean ts.car.bluetooth.sdk.phone.BluetoothPhoneBookManager.registerCallback(ts.car.bluetooth.sdk.phone.BluetoothPhoneBookManager$BluetoothPhoneBookCallBack)' on a null object reference
E AndroidRuntime: at com.gxa.service.btcall.model.adapter.BtPhoneAdapter.init(BtPhoneAdapter.java:282)
E AndroidRuntime: at com.gxa.service.btcall.model.adapter.BtPhoneAdapter.<init>(BtPhoneAdapter.java:249)
E AndroidRuntime: at com.gxa.service.btcall.BtCallService.onCreate(BtCallService.java:101)
E AndroidRuntime: at android.app.ActivityThread.handleCreateService(ActivityThread.java:3570)
E AndroidRuntime: ... 8 more

2.2 Search directly for com.iflytek.autofly.avatar

Line 13098: 07-28 18:10:06.089   782 17518 I am_kill : [0,28090,com.iflytek.autofly.avatar,200,stop com.iflytek.autofly.avatar]

The log confirms that the app we care about was indeed killed. Following this lead, search for "782", the PID of the process that logged the kill, to see why it killed the VPA process.
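For reference, the payload of an am_kill event follows the layout [user, pid, process name, oom_adj, reason] in AOSP, so the line above reads as: user 0, PID 28090, process com.iflytek.autofly.avatar, adj 200, reason "stop com.iflytek.autofly.avatar". A minimal sketch of pulling those fields out of a logcat line is shown below; the AmKillEntry class and its regex are illustrative helpers, not part of the code base analyzed in this post.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative holder for one am_kill event (hypothetical helper, not from the analyzed code). */
final class AmKillEntry {
    // Payload layout of am_kill: [user, pid, process name, oom_adj, reason]
    private static final Pattern AM_KILL = Pattern.compile(
            "am_kill\\s*:\\s*\\[(\\d+),(\\d+),([^,]+),(-?\\d+),(.*)\\]");

    final int user;
    final int pid;
    final String process;
    final int oomAdj;
    final String reason;

    private AmKillEntry(int user, int pid, String process, int oomAdj, String reason) {
        this.user = user;
        this.pid = pid;
        this.process = process;
        this.oomAdj = oomAdj;
        this.reason = reason;
    }

    /** Returns the parsed entry, or null if the line is not an am_kill record. */
    static AmKillEntry parse(String line) {
        Matcher m = AM_KILL.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new AmKillEntry(
                Integer.parseInt(m.group(1)),
                Integer.parseInt(m.group(2)),
                m.group(3),
                Integer.parseInt(m.group(4)),
                m.group(5));
    }

    public static void main(String[] args) {
        AmKillEntry e = parse("07-28 18:10:06.089   782 17518 I am_kill : "
                + "[0,28090,com.iflytek.autofly.avatar,200,stop com.iflytek.autofly.avatar]");
        System.out.println(e.process + " pid=" + e.pid + " adj=" + e.oomAdj + " reason=" + e.reason);
    }
}
```

A reason that starts with "stop" points to a force-stop request rather than a low-memory kill, which is why the next step is to find out who issued that request.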

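2.3 Search for 782

The pile of log lines for this PID gives no clue about why the app was killed; it was "simply" killed. Go back to the log lines that mention com.iflytek.autofly.avatar.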

// This line is the key: com.iflytek.autofly.avatar was force-stopped by PID 3186!
782 15478 I ActivityManager: Force stopping com.iflytek.autofly.avatar appid=1000 user=0: from pid 3186
// The process was at oom_adj 200 when it was stopped, which indicates memory pressure was not yet high, so a kill caused by low memory can be ruled out for now
782 15478 I ActivityManager: Killing 2211:com.iflytek.autofly.avatar/1000 (adj 200): stop com.iflytek.autofly.avatar
782 15478 I am_kill : [0,2211,com.iflytek.autofly.avatar,200,stop com.iflytek.autofly.avatar]

2.4 Search for "3186" to see what this process did

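Searching for 3186 turns up a lot of useful information. The lines below show that this process checks storage on the data partition: 51687837696 bytes are currently available, the threshold being checked is 419430400 bytes (400M), and because the available space is above the threshold it posts MSG_DATA_MEMORY_MORE_400M.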

3186  3245 I procmonitor: @DeviceStorageMonitor@: MSG_CHECK_DATA_MEMORY
3186 3245 I procmonitor: @DeviceStorageMonitor@: requestCheckDataAvailableMemorySize
3186 3245 I procmonitor: @DeviceStorageMonitor@: checkDataAvailableMemorySize is 51687837696 check memorySize is 419430400
3186 3245 I procmonitor: @DeviceStorageMonitor@: MSG_DATA_MEMORY_MORE_400M
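The two numbers in the third log line are byte counts. As a quick sanity check (a sketch; the StorageFigures class is hypothetical and only does the arithmetic), 419430400 is exactly 400 MiB, which matches the name of the MEMORY_SIZE_FOR_400M constant used later in checkMemory(), and 51687837696 bytes is roughly 48 GiB of free space:

```java
import java.util.Locale;

/** Hypothetical helper that only converts the byte counts seen in the log. */
final class StorageFigures {
    static final long MIB = 1024L * 1024L;
    static final long GIB = 1024L * MIB;

    /** Converts a size in MiB to bytes. */
    static long mib(long n) {
        return n * MIB;
    }

    /** Formats a byte count as GiB with two decimals. */
    static String human(long bytes) {
        return String.format(Locale.ROOT, "%.2f GiB", (double) bytes / GIB);
    }

    public static void main(String[] args) {
        long available = 51687837696L; // "checkDataAvailableMemorySize is ..." in the log
        long threshold = 419430400L;   // "check memorySize is ..." in the log

        System.out.println("threshold is 400 MiB: " + (threshold == mib(400)));
        System.out.println("available: " + human(available));
        System.out.println("available > threshold: " + (available > threshold));
    }
}
```

With this much free space the storage monitor has no reason to act, so it is not what killed the VPA app.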

PID 3186 also appears right around the force-stop:

782 15478 I ActivityManager: Force stopping com.iflytek.autofly.avatar appid=1000 user=0: from pid 3186
3186 3218 D procmonitor: @ActivityProcManager@: onProcessDied pid=2211 uid=1000
// The next line shows that com.iflytek.autofly.avatar was killed
3186 3246 D procmonitor: @ActivityProcManager@: runLimitLogic kill packagename = com.iflytek.autofly.avatar
3186 3246 D procmonitor: @ConfigManager@: getWhiteLists called() isInitFalg=true

The truth is about to surface. Next, find out which service writes logs with the procmonitor tag.

jieou@gxatek-fw-no:/work/jieou/gxa_code/lagvm_p/LINUX/android/vendor/gxatek/proprietary$ grep -nr "procmonitor"
CarProcManager/ProcManagementService/service/src/main/java/com/gxa/car/procmanagement/utils/LogUtils.java:27: private static final String TAG = "procmonitor";

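Grepping the source shows that the log comes from ProcManagementService, so the next step is to analyze what this service actually does.

3. ProcManagementService source analysis

3.1 Service architecture

[Figure: ProcManagerService.png]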

public class ProcManagementService extends Service {
    @Override
    public void onCreate() {
        LogUtils.logd(TAG, "onCreate()");
        mProcManagementImpl = new ProcManagementImpl(this.getApplicationContext());
        super.onCreate();
    }
}

The real work is done in ProcManagementImpl.

public ProcManagementImpl(Context context) {
    mContext = context;
    // 1. One-tap cleanup manager
    mCleanProcManager = new CleanProcManager(mContext);
    // 2. Storage monitor
    mDeviceStorageMonitor = new DeviceStorageMonitor(mContext);
    mDeviceStorageMonitor.init();
    // 3. Activity (UI) manager
    mActivityProcManager = new ActivityProcManager(mContext.getApplicationContext());
    mAsyncHandlerThread = new HandlerThread(THREAD_NAME);
    mAsyncHandlerThread.start();
    mAsyncHandler = new Handler(mAsyncHandlerThread.getLooper()) {
        @Override
        public void handleMessage(Message msg) {
            asyncHandleMessage(msg);
        }
    };
    // 4. Post the init message to the handler, which parses the config file
    mAsyncHandler.sendEmptyMessageDelayed(ID_INIT, DELAY_TIME_500);
    // 5. Register the broadcasts for cleaning app memory and cleaning background app memory
    registerDebugReceiver();
}

As the constructor shows, ProcManagementImpl does five main things; this article mainly analyzes steps 1, 2, 3 and 5.

3.2 CleanProcManager

CleanProcManager mainly exposes an interface for the Settings app, which can use it to kill all non-whitelisted background apps with one tap.

// CleanProcManager.java
// Kill apps that are not on the whitelist
public void killBackgroundProcessByWhiteList(List<String> whiteList) {
    LogUtils.logd(TAG, "killBackgroundProcessByWhiteList");
    MemoryInfo beforeInfo = new MemoryInfo();
    try {
        mAms.getMemoryInfo(beforeInfo);
    } catch (RemoteException ex) {
        LogUtils.loge(TAG, "getMemoryInfo exception=" + ex.toString());
    }

    List<RecentTaskInfo> runningAppList = filterAppInfoByWhiteList(getRecentTask(), whiteList);
    if (runningAppList != null) {
        for (RecentTaskInfo info : runningAppList) {
            if (DEBUG) {
                LogUtils.logd(TAG, "runningAppList:" + info.realActivity.getPackageName());
            }
            //KillProcess
            String packageName = info.realActivity.getPackageName();
            if (packageName.contains("launcher")) {
                continue;
            }
            LogUtils.logd(TAG, "killProcess:" + info.realActivity.getPackageName());
            // mAm.killBackgroundProcesses(info.realActivity.getPackageName());
            try {
                mAms.removeTask(info.persistentId);
            } catch (RemoteException ex) {
                LogUtils.loge(TAG, "killBackgroundProcessByWhiteList ex=" + ex.toString());
            }

        }
    }

    MemoryInfo afterInfo = new MemoryInfo();

    try {
        mAms.getMemoryInfo(afterInfo);
    } catch (RemoteException ex) {
        LogUtils.loge(TAG, "getMemoryInfo exception=" + ex.toString());
    }

    long mem = afterInfo.availMem - beforeInfo.availMem;
    String freeMem = Formatter.formatFileSize(mContext, mem);
    LogUtils.logd(TAG, "FreeMem=" + freeMem);
}

Clearing app caches follows the same whitelist pattern:

// CleanProcManager.java
// Clear application caches
public void deleteCacheByWhiteList(List<String> whiteList) {
    LogUtils.logd(TAG, "deleteCacheByWhiteList");
    if (mAppDataObserver == null) {
        mAppDataObserver = new AppDataObserver();
    }

    List<ApplicationInfo> installApps = getInstallApplications();
    if (DEBUG) {
        for (int i = 0; i < installApps.size(); i++) {
            LogUtils.logd(TAG, "getInstallApplications->" + installApps.get(i).packageName);
        }
    }

    List<ApplicationInfo> runningAppList = filterAppInfoByWhiteList(installApps, whiteList);

    if (runningAppList != null) {
        for (ApplicationInfo info : runningAppList) {
            if (DEBUG) {
                LogUtils.logd(TAG, "runningAppList->" + info.processName);
            }
            if (info.packageName.contains("launcher")) {
                continue;
            }

            LogUtils.logd(TAG, "deleteApplicationCacheFiles->" + info.processName);
            mPm.deleteApplicationCacheFiles(info.processName, mAppDataObserver);
        }
    }

}

3.3 DeviceStorageMonitor

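DeviceStorageMonitor mainly monitors storage on the data partition: it compares the available space against the 400M, 200M and 100M thresholds.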

private void checkMemory() {
    requestCheckDataAvailableMemorySize();

    if (checkDataAvailableMemorySize(MEMORY_SIZE_FOR_400M)) {
        mStorageMonitorHandler.sendEmptyMessage(MSG_DATA_MEMORY_MORE_400M);
    } else {
        if (checkDataAvailableMemorySize(MEMORY_SIZE_FOR_200M)) {
            mStorageMonitorHandler.sendEmptyMessage(MSG_DATA_MEMORY_MORE_200M);
        } else {
            if (checkDataAvailableMemorySize(MEMORY_SIZE_FOR_100M)) {
                mStorageMonitorHandler.sendEmptyMessage(MSG_DATA_MEMORY_MORE_100M);
            } else {
                mStorageMonitorHandler.sendEmptyMessage(MSG_DATA_MEMORY_LESS_100M);
            }
        }
    }
}
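The nested if/else above amounts to a cascade over three thresholds. The sketch below restates it as a pure function so the boundaries are easy to see. It assumes that checkDataAvailableMemorySize(x) returns true when the available space is at least x (the post does not show that method), and StorageLevels and classify are hypothetical names.

```java
/** Hypothetical restatement of the threshold cascade in checkMemory(). */
final class StorageLevels {
    enum Level { MORE_400M, MORE_200M, MORE_100M, LESS_100M }

    static final long MIB = 1024L * 1024L;

    /**
     * Maps the available space on the data partition to one of the four levels.
     * Assumption: a threshold counts as satisfied when availableBytes >= threshold.
     */
    static Level classify(long availableBytes) {
        if (availableBytes >= 400 * MIB) {
            return Level.MORE_400M;
        }
        if (availableBytes >= 200 * MIB) {
            return Level.MORE_200M;
        }
        if (availableBytes >= 100 * MIB) {
            return Level.MORE_100M;
        }
        return Level.LESS_100M;
    }

    public static void main(String[] args) {
        // The value from the log excerpt falls into the top bucket.
        System.out.println(classify(51687837696L));
        System.out.println(classify(300 * MIB));
        System.out.println(classify(150 * MIB));
        System.out.println(classify(50 * MIB));
    }
}
```

Read this way, MSG_DATA_MEMORY_MORE_200M covers the range from 200M up to 400M, and MSG_DATA_MEMORY_MORE_100M covers 100M up to 200M.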

The handler then acts on each of these messages:

switch (msg.what) {
    case MSG_CHECK_DATA_MEMORY:
        LogUtils.logi(TAG, "MSG_CHECK_DATA_MEMORY");
        checkMemory();
        break;
    case MSG_DATA_MEMORY_MORE_400M:
        // More than 400M still available: nothing to do
        LogUtils.logi(TAG, "MSG_DATA_MEMORY_MORE_400M");
        mLowStorageSpace = false;
        break;
    case MSG_DATA_MEMORY_MORE_200M:
        // Below 400M (but still above 200M): show a toast
        LogUtils.logi(TAG, "MSG_DATA_MEMORY_MORE_200M");
        mLowStorageSpace = false;
        CarToast.makeText(mContext, R.string.system_basic_recovery_and_remove_app_text,
                TOAST_SHOW_LONG_TIME).show();
        break;
    case MSG_DATA_MEMORY_MORE_100M:
        // Below 200M (but still above 100M): show a warning dialog
        LogUtils.logi(TAG, "MSG_DATA_MEMORY_MORE_100M");
        mLowStorageSpace = false;
        if (has3rdApp()) {
            showStorageWarningView();
        } else {
            showMasterCleanDialog();
        }
        // Clean up log files
        clearLogFiles();
        break;
    case MSG_DATA_MEMORY_LESS_100M:
        LogUtils.logi(TAG, "MSG_DATA_MEMORY_LESS_100M");
        // Set the low-storage flag
        mLowStorageSpace = true;
        // Show the cleanup prompt dialog
        showMasterCleanDialog();
        // Delete the temp directory
        deletePlaceHolderFile();
        // Clean up log files
        clearLogFiles();
        break;
}

3.4 ActivityProcManager

This section focuses on the ActivityProcManager class, because it is this manager that causes the problem.

public ActivityProcManager(Context context) {
    mContext = context;
    mMonitorHandlerThread = new HandlerThread(TAG);
    mMonitorHandlerThread.start();
    mHandler = new Handler(mMonitorHandlerThread.getLooper()) {
        @Override
        public void handleMessage(Message msg) {
            // Logic that decides whether to kill a process
            myHandleMessage(msg);
        }
    };
    // Process observer
    mProcessObserver = new ProcessObserver();
    mAm = ActivityManager.getService();
    // monitor activity change.
    try {
        mAm.registerProcessObserver(mProcessObserver);
    } catch (RemoteException ex) {
        LogUtils.loge(TAG, "cannot register activity monitoring" + Log.getStackTraceString(ex));
        throw new RuntimeException(ex);
    }
}

First, look at the input signal: the process observer.

private class ProcessObserver extends IProcessObserver.Stub {
    @Override
    public void onForegroundActivitiesChanged(int pid, int uid, boolean foregroundActivities) {
        if (DEBUG) {
            LogUtils.logd(TAG, "onForegroundActivitiesChanged,pid=" + pid + " uid=" + uid
                    + " foregroundActivities=" + foregroundActivities);
        }
        String packageName = getPackageNameByPidUid(pid, uid);
        if (foregroundActivities) {
            synchronized (LOCK) {
                // The process now has a foreground activity: record it as the current one
                mCurrentActivityInfo = new LastActivityInfo(pid, uid, packageName);
            }
            return;
        }

        // Packages on the ignore list (the app returned to home or systemui) are not recorded
        if (checkIsIgnorePackage(packageName)) {
            return;
        }
        // Same PID as the last recorded UI: nothing to check
        if (mLastActivityInfo != null && mLastActivityInfo.mPid == pid) {
            return;
        }
        synchronized (LOCK) {
            // Record the PID and package name of the previous UI
            mLastActivityInfo = new LastActivityInfo(pid, uid, packageName);
        }
        if (DEBUG) {
            LogUtils.logd(TAG, "mLastActivityInfo = " + mLastActivityInfo.toString());
        }
        // Trigger the activity check
        notifyForegroundActivitiesChanged();
    }

    @Override
    public void onProcessDied(int pid, int uid) {
        ......
    }
}

notifyForegroundActivitiesChanged() throttles how often the check runs:

private void notifyForegroundActivitiesChanged() {
    if (mHandler == null) {
        LogUtils.loge(TAG, "notifyForegroundActivitiesChanged handler is null!");
        return;
    }
    // check interval time
    // Run the UI check (ID_CHECK_RUN_LIMIT) at most once every 10 seconds
    long curTime = SystemClock.elapsedRealtime();
    if ((curTime - mLastRunLimitLogicTime) >= MINIMUM_RUNNING_INTERVAL) {
        mHandler.removeMessages(ID_CHECK_RUN_LIMIT);
        mHandler.sendEmptyMessage(ID_CHECK_RUN_LIMIT);
    } else {
        mHandler.removeMessages(ID_CHECK_RUN_LIMIT);
        long delay = MINIMUM_RUNNING_INTERVAL - (curTime - mLastRunLimitLogicTime);
        mHandler.sendEmptyMessageDelayed(ID_CHECK_RUN_LIMIT, delay);
    }
}
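The throttling in notifyForegroundActivitiesChanged() boils down to one computation: how long to wait before the next check may run. A minimal sketch under stated assumptions: RunLimitThrottle is a hypothetical class, the 10-second interval comes from the comment in the original code rather than from the actual value of MINIMUM_RUNNING_INTERVAL, and timestamps are passed in so the logic can run without Android's SystemClock.

```java
/** Hypothetical, Android-free model of the interval check before ID_CHECK_RUN_LIMIT is posted. */
final class RunLimitThrottle {
    private final long minIntervalMs;
    private long lastRunMs;

    RunLimitThrottle(long minIntervalMs, long lastRunMs) {
        this.minIntervalMs = minIntervalMs;
        this.lastRunMs = lastRunMs;
    }

    /** Delay in milliseconds before the next check may run; 0 means "run now". */
    long delayFor(long nowMs) {
        long elapsed = nowMs - lastRunMs;
        return elapsed >= minIntervalMs ? 0L : minIntervalMs - elapsed;
    }

    /** Records that the check has just run (runLimitLogic updates its timestamp at the end). */
    void markRun(long nowMs) {
        lastRunMs = nowMs;
    }

    public static void main(String[] args) {
        RunLimitThrottle throttle = new RunLimitThrottle(10_000L, 0L); // assumed 10 s interval
        System.out.println(throttle.delayFor(4_000L));  // 4 s after the last run: wait 6 s
        System.out.println(throttle.delayFor(12_000L)); // 12 s after the last run: run now
        throttle.markRun(12_000L);
        System.out.println(throttle.delayFor(13_000L)); // 1 s after the new run: wait 9 s
    }
}
```

Because the pending message is removed before a new one is posted, a burst of foreground changes collapses into a single check.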

The handler forwards ID_CHECK_RUN_LIMIT to runLimitLogic():

private void myHandleMessage(Message msg) {
    int id = msg.what;
    if (id == ID_CHECK_RUN_LIMIT) {
        runLimitLogic();
    }
}

runLimitLogic() walks the recent tasks and force-stops every package that none of the exemptions cover:

private void runLimitLogic() {
    if (!mFeaturesEnable) {
        return;
    }
    // Get the current recent-task list
    List<RecentTaskInfo> infos = getRecentTask();
    if (infos == null) {
        return;
    }
    for (RecentTaskInfo info : infos) {
        String packageName = info.realActivity.getPackageName();
        // Skip the package that owns the current foreground activity
        if (checkIsCurrentActiveTask(packageName)) {
            continue;
        }
        // Skip whitelisted apps
        if (checkIsWhiteList(packageName)) {
            continue;
        }
        // Skip audio apps that are playing in the background
        if (checkIsAudioInBackground(packageName)) {
            continue;
        }
        // Skip the last recorded UI; only apps two or more UI levels back are candidates
        if (checkIsLastRecord(packageName)) {
            continue;
        }

        try {
            // None of the checks above matched: force-stop the package
            Method method = Class.forName(ACTIVITY_MANAGER_NAME)
                    .getMethod(FORCE_STOP_PACKAGE, String.class);
            method.invoke((ActivityManager) mContext.getSystemService(Context.ACTIVITY_SERVICE), packageName);
        } catch (Exception ex) {
            LogUtils.logd(TAG, "forceStopPackage ex=" + Log.getStackTraceString(ex));
        }

        if (DEBUG) {
            LogUtils.logd(TAG, "runLimitLogic kill packagename = " + packageName);
        }
    }
    // update record time
    synchronized (LOCK) {
        mLastRunLimitLogicTime = SystemClock.elapsedRealtime();
    }
}

So the cause is found: com.iflytek.autofly.avatar is not on the whitelist, and once the UI had moved two or more levels away from it, ProcManagementService force-stopped it.
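The chain of exemptions can be replayed outside Android to see why the VPA app was selected. The sketch below is a simplified model, not the service's code: packagesToStop is a hypothetical function, the four checks are reduced to set lookups and string comparisons, and com.example.nav and com.example.music are made-up package names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Simplified, hypothetical model of the exemption chain in runLimitLogic(). */
final class RunLimitSketch {
    /** Returns the packages that pass none of the four exemptions and would be force-stopped. */
    static List<String> packagesToStop(List<String> recentTasks, String foreground,
            Set<String> whiteList, Set<String> playingAudio, String lastRecord) {
        List<String> result = new ArrayList<>();
        for (String pkg : recentTasks) {
            if (pkg.equals(foreground)) {
                continue; // current foreground app
            }
            if (whiteList.contains(pkg)) {
                continue; // whitelisted
            }
            if (playingAudio.contains(pkg)) {
                continue; // playing audio in the background
            }
            if (pkg.equals(lastRecord)) {
                continue; // the UI that was left most recently
            }
            result.add(pkg);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> recent = Arrays.asList("com.gxa.app.launcher",
                "com.iflytek.autofly.avatar", "com.example.music", "com.example.nav");
        Set<String> whiteList = new HashSet<>(Arrays.asList("com.gxa.app.launcher"));
        Set<String> audio = Collections.emptySet();

        // Without a whitelist entry the VPA app is the only candidate left.
        System.out.println(packagesToStop(recent, "com.example.nav", whiteList, audio,
                "com.example.music"));

        // With the entry added, nothing is stopped.
        whiteList.add("com.iflytek.autofly.avatar");
        System.out.println(packagesToStop(recent, "com.example.nav", whiteList, audio,
                "com.example.music"));
    }
}
```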

3.5 Fix

<?xml version="1.0" encoding="utf-8"?>
<!-- Add the app to the whitelist in procmanager/proc_management_cfg.xml -->
<proc_configs>
    <white_lists>
        <item>com.gxa.app.launcher</item>
        <item>com.gxa.appservice.procmanagement</item>
        <item>com.gxa.systemui</item>
        <item>com.gxa.app.settings</item>
        <item>com.android.systemui</item>
    </white_lists>
    <black_lists>
        <item>demo.test.package</item>
    </black_lists>
</proc_configs>
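The listing above does not show the entry for the VPA app itself; presumably the fix adds an item such as <item>com.iflytek.autofly.avatar</item> under white_lists. The sketch below assumes exactly that and shows one way such a file could be read with the JDK's DOM parser. WhiteListConfig and SAMPLE are hypothetical; how ConfigManager actually parses the file is not shown in this post.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical reader for the white_lists section of a proc_management_cfg.xml-style file. */
final class WhiteListConfig {
    // Assumed shape of the config after the fix; the avatar item is an assumption.
    static final String SAMPLE =
            "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n"
            + "<proc_configs>\n"
            + "  <white_lists>\n"
            + "    <item>com.gxa.app.launcher</item>\n"
            + "    <item>com.iflytek.autofly.avatar</item>\n"
            + "  </white_lists>\n"
            + "  <black_lists>\n"
            + "    <item>demo.test.package</item>\n"
            + "  </black_lists>\n"
            + "</proc_configs>\n";

    /** Collects every item that sits under a white_lists element. */
    static Set<String> parseWhiteList(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Set<String> result = new HashSet<>();
            NodeList lists = doc.getElementsByTagName("white_lists");
            for (int i = 0; i < lists.getLength(); i++) {
                NodeList items = ((Element) lists.item(i)).getElementsByTagName("item");
                for (int j = 0; j < items.getLength(); j++) {
                    result.add(items.item(j).getTextContent().trim());
                }
            }
            return result;
        } catch (Exception ex) {
            throw new IllegalArgumentException("invalid config", ex);
        }
    }

    public static void main(String[] args) {
        Set<String> whiteList = parseWhiteList(SAMPLE);
        System.out.println(whiteList.contains("com.iflytek.autofly.avatar"));
        System.out.println(whiteList.contains("demo.test.package"));
    }
}
```

Once the package name is in the whitelist, a check like checkIsWhiteList(packageName) would skip it and the force-stop is never reached.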

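4. Retrospective

On top of the stock AMS memory management based on oom_adj, the head-unit system adds its own process management mechanism so that, with limited resources, higher-priority app processes get a better chance of staying alive.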