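Background
After porting to the Spreadtrum (Unisoc) T606 platform, we found that dragging an icon on the desktop hangs. As the figure below shows, a prompt reading "pasting to Launcher" pops up, its spinner keeps turning, and the desktop stops responding, although tapping the cancel button (X) restores it. The OpenHarmony version is 3.2.2 release.
At first I assumed the cause was unported gfx graphics acceleration. A closer look at the log captured during the hang, however, shows several errors that appear to be related to this pasting operation: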
01-01 09:01:42.311 1467 1466 E C01800/SA_CLIENT: GetSystemAbilityWrapper sa 6001 didn't start. Returning nullptr
01-01 09:01:42.312 1466 1466 E C04400/Device_Profile: DistributedDeviceProfileClient::GetDeviceProfileService get service failed
01-01 09:01:42.312 1466 1466 E C01c02/PasteboardService: [dev_profile.cpp] GetEnabledStatus# GetDeviceProfile failed, .
01-01 09:01:42.312 1466 1466 D C01c02/PasteboardService: [distributed_module_config.cpp] GetEnabledStatus# localEnabledStatus = false.
01-01 09:01:42.312 1466 1466 E C01c02/PasteboardService: [pasteboard_service.cpp] HasDistributedData# clipPlugin null.
01-01 09:01:42.312 1466 1466 I C01c02/PasteboardService: [pasteboard_service_stub.cpp] OnHasPasteData# end.
01-01 09:01:42.312 1813 1813 D C01c01/PasteboardClient: [pasteboard_service_proxy.cpp] HasPasteData# end.
[...]
01-01 09:01:43.442 1466 1466 E C04400/Device_Profile: DistributedDeviceProfileClient::GetDeviceProfileService get service failed
01-01 09:01:43.442 1466 1466 E C01c02/PasteboardService: [dev_profile.cpp] GetEnabledStatus# GetDeviceProfile failed, .
01-01 09:01:43.442 1466 1466 D C01c02/PasteboardService: [distributed_module_config.cpp] GetEnabledStatus# localEnabledStatus = false.
01-01 09:01:43.442 1466 1466 E C01c02/PasteboardService: [pasteboard_service.cpp] SetDistributedData# clipPlugin null.
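This mainly involves OpenHarmony's system ability management, whose code lives under //foundation/systemabilitymgr. Let's take a brief look at it first.
SA management
First, here is how the startup subsystem documentation describes samgr:
5. samgr is the service registration center for all SAs. Each SA must register with samgr when it starts and is assigned an ID, through which applications can access the SA.
6. foundation is a special SA service process that provides the user program management framework and basic services. This process is responsible for application lifecycle management.
The systemabilitymgr subsystem of the standard system has two parts (components): samgr and safwk.
samgr service
The samgr component is a core OpenHarmony component that provides functions such as starting, registering, and querying system services.
The samgr service framework is shown in the figure below:
The key interfaces of the samgr service are listed in the following table: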
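Interface | Description |
SystemAbilityManager::AddSystemAbility() | Receives the registration message sent by the SA framework and caches it |
SystemAbilityManager::CheckSystemAbility() | Receives the lookup message sent by the SA framework, finds the proxy object of the service by SA ID, and returns it to the SA framework |
SystemAbilityManager::LoadSystemAbility() | Loads an SA dynamically on demand when it is not started at boot; "ondemand" must be set to true in the configuration |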
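The register-then-look-up relationship in the table can be sketched with a toy registry. This is only an illustration under simplifying assumptions: ToySamgr and FakeRemoteObject are hypothetical names, there is no IPC or locking, and the real SystemAbilityManager does considerably more.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for an IPC remote object; not the real IRemoteObject.
struct FakeRemoteObject {
    std::string name;
};

// Toy registry that mirrors only the register/lookup idea of samgr.
class ToySamgr {
public:
    // Cache a service under its SA id, as AddSystemAbility() is described to do.
    void AddSystemAbility(int saId, std::shared_ptr<FakeRemoteObject> obj)
    {
        abilityMap_[saId] = std::move(obj);
    }

    // Return the cached object, or nullptr if the SA never registered.
    std::shared_ptr<FakeRemoteObject> CheckSystemAbility(int saId) const
    {
        auto it = abilityMap_.find(saId);
        return it == abilityMap_.end() ? nullptr : it->second;
    }

private:
    std::map<int, std::shared_ptr<FakeRemoteObject>> abilityMap_;
};
```

In this sketch a lookup for an id that was never passed to AddSystemAbility() simply yields nullptr, which is the situation the "sa 6001 didn't start" log line above points at.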
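SA framework
The safwk component defines how a SystemAbility is implemented in OpenHarmony and provides the implementation of interfaces such as start and register.
The framework diagram is as follows:
In terms of layering, safwk sits slightly higher: it sends messages through the samgr_proxy proxy to interact with the samgr service.
The SA framework mainly consists of an executable, sa_main, and an interface library, system_ability_fwk: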
./services/safwk/BUILD.gn:
ohos_executable("sa_main") {
install_enable = true
sources = [ "src/main.cpp" ]
./interfaces/innerkits/safwk/BUILD.gn:
ohos_shared_library("system_ability_fwk") {
sources = [
"../../../services/safwk/src/local_ability_manager.cpp",
"../../../services/safwk/src/local_ability_manager_stub.cpp",
"../../../services/safwk/src/system_ability.cpp",
]
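How an SA is implemented:
A SystemAbility is generally implemented as XXX.cfg + profile.json + libXXX.z.so; the init process executes the corresponding XXX.cfg file to launch the SystemAbility process.
The run-on-create field of profile.json is described as follows:
run-on-create: true means the SystemAbility is registered with the samgr component as soon as the process starts; false means it is started on demand, that is, when another module accesses it. This field is mandatory.
The sample code also notes that IXXX is the external IPC interface, XXXProxy is the client-side communication code, and XXXStub is the server-side communication code. At this point we have a basic understanding of SA management.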
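To make the IXXX / XXXProxy / XXXStub split concrete, here is a minimal sketch. All names (IAdder, AdderStub, AdderService, AdderProxy, ToyParcel) are hypothetical, and a direct function call stands in for real IPC, so this shows only the division of roles, not the OpenHarmony API.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical flat buffer standing in for a MessageParcel.
using ToyParcel = std::vector<int32_t>;

// IXXX: the interface both sides agree on.
class IAdder {
public:
    virtual ~IAdder() = default;
    virtual int32_t Add(int32_t a, int32_t b) = 0;
};

// XXXStub: server side, unpacks the request and calls the implementation.
class AdderStub : public IAdder {
public:
    enum : uint32_t { CODE_ADD = 1 };

    // Returns 0 on success, -1 for an unknown code or a malformed request.
    int OnRemoteRequest(uint32_t code, const ToyParcel& data, ToyParcel& reply)
    {
        if (code != CODE_ADD || data.size() != 2) {
            return -1;
        }
        reply.push_back(Add(data[0], data[1]));
        return 0;
    }
};

// The concrete service living in the server process.
class AdderService : public AdderStub {
public:
    int32_t Add(int32_t a, int32_t b) override { return a + b; }
};

// XXXProxy: client side, packs the arguments and "sends" them.
class AdderProxy : public IAdder {
public:
    explicit AdderProxy(AdderStub& remote) : remote_(remote) {}

    int32_t Add(int32_t a, int32_t b) override
    {
        ToyParcel data{a, b};
        ToyParcel reply;
        if (remote_.OnRemoteRequest(AdderStub::CODE_ADD, data, reply) != 0 || reply.empty()) {
            return 0; // toy error handling only
        }
        return reply[0];
    }

private:
    AdderStub& remote_; // a direct call replaces real IPC in this sketch
};
```

A caller only sees IAdder: constructing AdderProxy over an AdderService and calling Add(2, 3) goes through the pack, dispatch, and unpack steps and returns 5.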
Freeze analysis
Client-side analysis
With that basic understanding of SA management, let's start from the PasteboardService error:
01-01 09:01:42.312 1466 1466 E C01c02/PasteboardService: [dev_profile.cpp] GetEnabledStatus# GetDeviceProfile failed, .
The pasteboard service is an SA. Its code is under //foundation/distributeddatamgr/pasteboard/, and its configuration is profile/3701.xml:
<info>
<process>pasteboard_service</process>
<systemability>
<name>3701</name>
<libpath>libpasteboard_service.z.so</libpath>
<run-on-create>true</run-on-create>
<distributed>true</distributed>
<dump-level>1</dump-level>
</systemability>
</info>
As shown, run-on-create is true, so this SA is registered at startup.
The failing code is where the pasteboard handles paste data, for example SetPasteData -> SetDistributedData:
bool PasteboardService::SetDistributedData(int32_t user, PasteData &data)
{
std::vector<uint8_t> rawData;
auto clipPlugin = GetClipPlugin(); //TJ:here
if (clipPlugin == nullptr) {
PASTEBOARD_HILOGE(PASTEBOARD_MODULE_SERVICE, "clipPlugin null.");
return false;
}
In the log, clipPlugin is null every time:
01-01 09:01:42.312 1466 1466 E C01c02/PasteboardService: [pasteboard_service.cpp] HasDistributedData# clipPlugin null.
01-01 09:01:43.442 1466 1466 E C01c02/PasteboardService: [pasteboard_service.cpp] SetDistributedData# clipPlugin null.
01-01 09:01:44.634 1466 2053 E C01c02/PasteboardService: [pasteboard_service.cpp] GetDistributedData# clipPlugin null.
So how is clipPlugin obtained?
std::shared_ptr<ClipPlugin> PasteboardService::GetClipPlugin()
{
auto isOn = DistributedModuleConfig::IsOn(); //TJ: here
std::lock_guard<decltype(mutex)> lockGuard(mutex);
if (!isOn || clipPlugin_ != nullptr) {
return clipPlugin_;
}
//foundation/distributeddatamgr/pasteboard/services/core/src/distributed_module_config.cpp:
bool DistributedModuleConfig::IsOn()
{
status_ = GetEnabledStatus();
return status_;
}
bool DistributedModuleConfig::GetEnabledStatus()
{
PASTEBOARD_HILOGD(PASTEBOARD_MODULE_SERVICE, "GetEnabledStatus start.");
std::string localEnabledStatus = "false";
DevProfile::GetInstance().GetEnabledStatus("", localEnabledStatus);//TJ:here
//foundation/distributeddatamgr/pasteboard/services/core/src/dev_profile.cpp:
void DevProfile::GetEnabledStatus(const std::string &deviceId, std::string &enabledStatus)
{
PASTEBOARD_HILOGD(PASTEBOARD_MODULE_SERVICE, "GetEnabledStatus start.");
ServiceCharacteristicProfile profile;
int32_t ret = DistributedDeviceProfileClient::GetInstance().GetDeviceProfile(deviceId, SERVICE_ID, profile); //TJ: here
if (ret != HANDLE_OK) {
PASTEBOARD_HILOGE(PASTEBOARD_MODULE_SERVICE, "GetDeviceProfile failed, %{public}.5s.", deviceId.c_str());
return;
}
//foundation/deviceprofile/device_info_manager/interfaces/innerkits/core/src/distributed_device_profile_client.cpp:
int32_t DistributedDeviceProfileClient::GetDeviceProfile(const std::string& udid, const std::string& serviceId,
ServiceCharacteristicProfile& profile)
{
auto dps = GetDeviceProfileService();
sptr<IDistributedDeviceProfile> DistributedDeviceProfileClient::GetDeviceProfileService()
{
std::lock_guard<std::mutex> lock(serviceLock_);
if (dpProxy_ != nullptr) {
return dpProxy_;
}
auto samgrProxy = SystemAbilityManagerClient::GetInstance().GetSystemAbilityManager();
if (samgrProxy == nullptr) {
HILOGE("get samgr failed");
return nullptr;
}
auto object = samgrProxy->GetSystemAbility(DISTRIBUTED_DEVICE_PROFILE_SA_ID); //TJ: here
if (object == nullptr) {
HILOGE("get service failed");
return nullptr;
}
//foundation/systemabilitymgr/samgr/frameworks/native/source/system_ability_manager_proxy.cpp:
sptr<IRemoteObject> SystemAbilityManagerProxy::GetSystemAbility(int32_t systemAbilityId)
{
return GetSystemAbilityWrapper(systemAbilityId);
}
sptr<IRemoteObject> SystemAbilityManagerProxy::GetSystemAbilityWrapper(int32_t systemAbilityId, const string& deviceId)
{
[...]
bool isExist = false;
int32_t timeout = RETRY_TIME_OUT_NUMBER;
HILOGD("GetSystemAbilityWrapper:Waiting for sa %{public}d, ", systemAbilityId);
do {
sptr<IRemoteObject> svc;
if (deviceId.empty()) {
svc = CheckSystemAbility(systemAbilityId, isExist); //TJ: here
if (!isExist) {
HILOGW("%{public}s:sa %{public}d is not exist", __func__, systemAbilityId);
usleep(SLEEP_ONE_MILLI_SECOND_TIME * SLEEP_INTERVAL_TIME);
continue;
}
} else {
svc = CheckSystemAbility(systemAbilityId, deviceId);
}
[...]
} while (timeout--);
HILOGE("GetSystemAbilityWrapper sa %{public}d didn't start. Returning nullptr", systemAbilityId);
return nullptr;
}
deviceId is empty by default, so the deviceId.empty() branch is taken:
sptr<IRemoteObject> SystemAbilityManagerProxy::CheckSystemAbility(int32_t systemAbilityId, bool& isExist)
{
...
int32_t err = remote->SendRequest(CHECK_SYSTEM_ABILITY_IMMEDIATELY_TRANSACTION, data, reply, option);
if (err != ERR_NONE) {
return nullptr;
}
sptr<IRemoteObject> irsp(reply.ReadRemoteObject());
ret = reply.ReadBool(isExist);
if (!ret) {
HILOGW("CheckSystemAbility Read isExist failed!");
return nullptr;
}
pasteboard_service uses the SystemAbilityManagerProxy proxy to send a message (SendRequest) to the remote end, asking whether the SA DISTRIBUTED_DEVICE_PROFILE_SA_ID has started properly.
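The polling shape of GetSystemAbilityWrapper can be sketched as follows. WaitForService is a hypothetical helper, and the retry count and interval are placeholders rather than the real RETRY_TIME_OUT_NUMBER and sleep constants, whose values are not shown in this article.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Poll `check` until it reports a live service or the retry budget is used up.
// Mirrors the do { ... } while (timeout--) shape: with retries == N the check
// runs at most N + 1 times.
// Returns true if the service appeared, false otherwise.
inline bool WaitForService(const std::function<bool()>& check, int retries,
                           std::chrono::milliseconds interval)
{
    do {
        if (check()) {
            return true;
        }
        // The caller's thread is blocked here between attempts.
        std::this_thread::sleep_for(interval);
    } while (retries--);
    return false;
}
```

With retries set to 2 and a check that never succeeds, the check runs three times before the function gives up. Because the wait is synchronous, every lookup of an SA that never registers holds the calling thread for the whole retry budget.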
Before analyzing how the remote end handles this, let's look at this SA (6001):
//foundation/systemabilitymgr/samgr/interfaces/innerkits/samgr_proxy/include/system_ability_definition.h:271: DISTRIBUTED_DEVICE_PROFILE_SA_ID = 6001,
Its code is under //foundation/deviceprofile/, and its configuration is sa_profile/6001.xml:
<info>
<process>distributedsched</process>
<systemability>
<name>6001</name>
<libpath>libdistributed_device_profile.z.so</libpath>
<run-on-create>true</run-on-create>
<distributed>false</distributed>
<dump-level>1</dump-level>
</systemability>
</info>
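So this SA is also meant to be registered with the system at boot (run-on-create is true).
Remote-side analysis
Having walked through SA management above, we can easily find the remote entry point that receives the client request: //foundation/systemabilitymgr/samgr/services/samgr/native/source/system_ability_manager_stub.cpp: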
int32_t SystemAbilityManagerStub::OnRemoteRequest(uint32_t code,
MessageParcel& data, MessageParcel& reply, MessageOption &option)
{
HILOGI("SystemAbilityManagerStub::OnReceived, code = %{public}u, callerPid = %{public}d, flags= %{public}d",
code, IPCSkeleton::GetCallingPid(), option.GetFlags());
if (!EnforceInterceToken(data)) {
HILOGE("SystemAbilityManagerStub::OnReceived, code = %{public}u, check interfaceToken failed", code);
return ERR_PERMISSION_DENIED;
}
auto itFunc = memberFuncMap_.find(code); //TJ:here
if (itFunc != memberFuncMap_.end()) {
auto memberFunc = itFunc->second;
if (memberFunc != nullptr) {
return (this->*memberFunc)(data, reply);
}
}
Look up CHECK_SYSTEM_ABILITY_IMMEDIATELY_TRANSACTION in memberFuncMap_:
memberFuncMap_[CHECK_SYSTEM_ABILITY_IMMEDIATELY_TRANSACTION] =
&SystemAbilityManagerStub::CheckSystemAbilityImmeInner;
int32_t SystemAbilityManagerStub::CheckSystemAbilityImmeInner(MessageParcel& data, MessageParcel& reply)
{
[...]
ret = reply.WriteRemoteObject(CheckSystemAbility(systemAbilityId, isExist));
if (!ret) {
return ERR_FLATTEN_OBJECT;
}
[...]
}
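The memberFuncMap_ lookup above is a dispatch table keyed by transaction code, with pointers to member functions as values. A minimal sketch of that technique, using a hypothetical ToyStub with two made-up codes and trivial handlers:

```cpp
#include <cstdint>
#include <map>

class ToyStub {
public:
    enum : uint32_t { CODE_CHECK = 1, CODE_ADD = 2 };

    ToyStub()
    {
        // Register one handler per transaction code.
        memberFuncMap_[CODE_CHECK] = &ToyStub::CheckInner;
        memberFuncMap_[CODE_ADD] = &ToyStub::AddInner;
    }

    // Look up the handler for `code` and invoke it; -1 means "unknown code".
    int32_t OnRemoteRequest(uint32_t code, int32_t arg)
    {
        auto it = memberFuncMap_.find(code);
        if (it != memberFuncMap_.end() && it->second != nullptr) {
            return (this->*(it->second))(arg);
        }
        return -1;
    }

private:
    using Handler = int32_t (ToyStub::*)(int32_t);

    // Placeholder handlers; real ones would read and write parcels.
    int32_t CheckInner(int32_t arg) { return arg * 10; }
    int32_t AddInner(int32_t arg) { return arg + 1; }

    std::map<uint32_t, Handler> memberFuncMap_;
};
```

Adding a new transaction then only needs one more map entry and one more handler, which is presumably why the stub is organized this way.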
//foundation/systemabilitymgr/samgr/services/samgr/native/source/system_ability_manager.cpp:
sptr<IRemoteObject> SystemAbilityManager::CheckSystemAbility(int32_t systemAbilityId, bool& isExist)
{
if (!CheckInputSysAbilityId(systemAbilityId)) {
return nullptr;
}
sptr<IRemoteObject> abilityProxy = CheckSystemAbility(systemAbilityId); //TJ: here
if (abilityProxy == nullptr) {
lock_guard<recursive_mutex> autoLock(onDemandLock_);
auto iter = startingAbilityMap_.find(systemAbilityId);
[...]
}
startingAbilityMap_ is presumably meant for on-demand SAs. Since SA 6001 is configured to start at boot, we only need to look at CheckSystemAbility:
sptr<IRemoteObject> SystemAbilityManager::CheckSystemAbility(int32_t systemAbilityId)
{
HILOGD("%{public}s called, systemAbilityId = %{public}d", __func__, systemAbilityId);
[...]
shared_lock<shared_mutex> readLock(abilityMapLock_);
auto iter = abilityMap_.find(systemAbilityId);
if (iter != abilityMap_.end()) {
HILOGI("found service : %{public}d.", systemAbilityId);
return iter->second.remoteObj;
}
HILOGW("NOT found service : %{public}d", systemAbilityId);
return nullptr;
}
It checks whether abilityMap_ contains this SA. The related log shows that it does not:
01-01 09:01:40.179 454 454 W C01800/SAMGR: NOT found service : 6001
And abilityMap_ only contains an SA after AddSystemAbility has been called for it:
int32_t SystemAbilityManager::AddSystemAbility(int32_t systemAbilityId, const sptr<IRemoteObject>& ability,
const SAExtraProp& extraProp)
{
[...]
abilityMap_[systemAbilityId] = std::move(saInfo);
HILOGI("insert %{public}d. size : %{public}zu", systemAbilityId, abilityMap_.size());
[...]
So SA 6001 was never added to the system. Let's trace how this SA is supposed to be added.
Startup analysis of SA 6001
The definition of SA 6001:
./foundation/systemabilitymgr/samgr/interfaces/innerkits/samgr_proxy/include/system_ability_definition.h:365: { DISTRIBUTED_DEVICE_PROFILE_SA_ID, "DistributedDeviceProfile" },
Following this log domain (Device_Profile) through the startup log, we find that the service fails during initialization:
01-01 13:05:05.653 4291 4308 E C04400/Device_Profile: DeviceProfileStorageManager::Init get local udid failed
01-01 13:05:05.653 4291 4308 E C04400/Device_Profile: DistributedDeviceProfileService::Init DeviceProfileStorageManager init failed
01-01 13:05:05.653 4291 4308 E C04400/Device_Profile: DistributedDeviceProfileService::OnStart init failed
The SA startup entry is on the framework side, in //foundation/systemabilitymgr/safwk/services/safwk/src/system_ability.cpp:
void SystemAbility::Start()
{
HILOGD(TAG, "starting system ability...");
if (isRunning_) {
return;
}
HILOGD(TAG, "[PerformanceTest] SAFWK OnStart systemAbilityId:%{public}d", saId_);
int64_t begin = GetTickCount();
HITRACE_METER_NAME(HITRACE_TAG_SAMGR, ToString(saId_) + "_OnStart");
OnStart(); //TJ: here
// The details should be implemented by subclass
void SystemAbility::OnStart()
{
}
//foundation/deviceprofile/device_info_manager/services/core/src/distributed_device_profile_service.cpp:
void DistributedDeviceProfileService::OnStart()
{
HILOGI("called");
if (!Init()) {
HILOGE("init failed"); //TJ: here
return;
}
if (!Publish(this)) {
HILOGE("publish SA failed");
return;
}
}
bool DistributedDeviceProfileService::Init()
{
if (!DpDeviceManager::GetInstance().Init()) {
HILOGE("DeviceManager init failed");
return false;
}
if (!DeviceProfileStorageManager::GetInstance().Init()) {
HILOGE("DeviceProfileStorageManager init failed"); //TJ: here
return false;
}
//foundation/deviceprofile/device_info_manager/services/core/src/dbstorage/device_profile_storage_manager.cpp:
bool DeviceProfileStorageManager::Init()
{
if (!inited_) {
if (!SyncCoordinator::GetInstance().Init()) {
HILOGE("SyncCoordinator init failed");
return false;
}
DpDeviceManager::GetInstance().GetLocalDeviceUdid(localUdid_);
if (localUdid_.empty()) {
HILOGE("get local udid failed"); //TJ: here
return false;
}
//foundation/deviceprofile/device_info_manager/services/core/src/devicemanager/dp_device_manager.cpp:
void DpDeviceManager::GetLocalDeviceUdid(std::string& udid)
{
char localDeviceId[DEVICE_ID_SIZE] = {0};
GetDevUdid(localDeviceId, DEVICE_ID_SIZE);
udid = localDeviceId;
}
//base/startup/init/interfaces/innerkits/syspara/parameter.c:
int GetDevUdid(char *udid, int size)
{
return GetDevUdid_(udid, size);
}
//base/startup/init/interfaces/innerkits/syspara/param_comm.c:
INIT_LOCAL_API int GetDevUdid_(char *udid, int size)
{
if (size < UDID_LEN || udid == NULL) {
return EC_FAILURE;
}
uint32_t len = (uint32_t)size;
int ret = SystemGetParameter("const.product.udid", udid, &len);
BEGET_CHECK(ret != 0, return ret);
const char *manufacture = GetManufacture_();
const char *model = GetProductModel_();
const char *sn = GetSerial_();
if (manufacture == NULL || model == NULL || sn == NULL) {
return -1;
}
[...]
}
If the system parameter const.product.udid cannot be read, all three properties manufacture, model, and sn must be available.
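The two paths through GetDevUdid_ can be sketched as below. ToyParams, GetParam, and ResolveUdidSource are hypothetical names, and an in-memory map stands in for SystemGetParameter(). What the real function does with the three values is elided in the excerpt above, so the plain concatenation here is only a placeholder.

```cpp
#include <map>
#include <string>

// Hypothetical in-memory parameter store; the real code queries the param service.
using ToyParams = std::map<std::string, std::string>;

// Copy a non-empty parameter into `value`; returns false if it is missing or empty.
inline bool GetParam(const ToyParams& params, const std::string& key, std::string& value)
{
    auto it = params.find(key);
    if (it == params.end() || it->second.empty()) {
        return false;
    }
    value = it->second;
    return true;
}

// Path 1: const.product.udid is set, so use it directly.
// Path 2: fall back to manufacturer + model + serial; all three must exist.
inline bool ResolveUdidSource(const ToyParams& params, std::string& out)
{
    if (GetParam(params, "const.product.udid", out)) {
        return true;
    }
    std::string manufacture;
    std::string model;
    std::string sn;
    if (!GetParam(params, "const.product.manufacturer", manufacture) ||
        !GetParam(params, "const.product.model", model) ||
        !GetParam(params, "ohos.boot.sn", sn)) {
        return false; // one missing property is enough to fail
    }
    out = manufacture + model + sn; // placeholder for the elided real derivation
    return true;
}
```

With manufacturer and model present but no ohos.boot.sn, ResolveUdidSource returns false, which corresponds to the "get local udid failed" case in the log.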
INIT_LOCAL_API const char *GetProductModel_(void)
{
static const char *productModel = NULL;
return GetProperty("const.product.model", &productModel);
}
INIT_LOCAL_API const char *GetManufacture_(void)
{
static const char *productManufacture = NULL;
return GetProperty("const.product.manufacturer", &productManufacture);
}
manufacture and model are both fine. Let's look at how sn is obtained:
INIT_LOCAL_API const char *GetSerial_(void)
{
[...]
int ret = SystemGetParameter("ohos.boot.sn", ohosSerial, &len);
BEGET_CHECK(ret == 0, return NULL);
return ohosSerial;
}
The configuration of ohos.boot.sn is rather well hidden, in //base/startup/init/services/param/manager/param_server.c:
INIT_LOCAL_API int LoadParamFromCmdLine(void)
{
static const cmdLineInfo cmdLines[] = {
{OHOS_BOOT"hardware", CommonDealFun
},
[...]
},
{OHOS_BOOT"sn", SnDealFun //TJ: here
base/startup/init/services/param/include/param_utils.h:#define OHOS_BOOT "ohos.boot."
SnDealFun:
static int SnDealFun(const char *name, const char *value, int res)
{
const char *snFileList [] = {
"/sys/block/mmcblk0/device/cid",
"/proc/bootdevice/cid"
};
int ret = CheckParamName(name, 0);
PARAM_CHECK(ret == 0, return ret, "Invalid name %s", name);
if (value != NULL && res == 0 && value[0] != '/') {
[...]
}
if (value != NULL && value[0] == '/') {
[...]
}
for (size_t i = 0; i < ARRAY_LENGTH(snFileList); i++) {
ret = ReadSnFromFile(name, snFileList[i]);
if (ret == 0) {
break;
}
}
return ret;
}
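In other words, if the command line passed up by the bootloader does not carry the serial number, it is read from the files in snFileList. Clearly sn here is taken from the eMMC CID, but my storage device is UFS, so that path does not apply. The /proc/bootdevice/cid path does not exist either. My port is based on an Android version, and the Android bootloader naturally does not pass ohos.boot.sn. The simplest effective fix is to copy the value of the corresponding Android parameter, androidboot.serialno.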
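As a rough illustration of that workaround, the sketch below prefers ohos.boot.sn and otherwise reuses androidboot.serialno from the same command line. FindCmdlineValue and ResolveSerial are hypothetical helpers, and where exactly the copy is applied on a real device (bootloader or init) is left open here.

```cpp
#include <sstream>
#include <string>

// Find "key=value" in a space-separated kernel command line.
// Returns the value, or an empty string if the key is absent.
inline std::string FindCmdlineValue(const std::string& cmdline, const std::string& key)
{
    std::istringstream tokens(cmdline);
    std::string token;
    const std::string prefix = key + "=";
    while (tokens >> token) {
        if (token.compare(0, prefix.size(), prefix) == 0) {
            return token.substr(prefix.size());
        }
    }
    return "";
}

// Prefer ohos.boot.sn; otherwise fall back to the Android serial number.
inline std::string ResolveSerial(const std::string& cmdline)
{
    std::string sn = FindCmdlineValue(cmdline, "ohos.boot.sn");
    if (sn.empty()) {
        sn = FindCmdlineValue(cmdline, "androidboot.serialno");
    }
    return sn;
}
```

On a command line that carries only androidboot.serialno, ResolveSerial returns that value; if neither key is present it returns an empty string and the caller still has to decide what to do.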
OK, let's return to where we started, namely the pasteboard service's GetClipPlugin():
std::shared_ptr<ClipPlugin> PasteboardService::GetClipPlugin()
{
auto isOn = DistributedModuleConfig::IsOn();
std::lock_guard<decltype(mutex)> lockGuard(mutex);
if (!isOn || clipPlugin_ != nullptr) {
return clipPlugin_; //TJ: here
}
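!isOn is true, so clipPlugin_ is returned directly. We already know from the log that clipPlugin_ is always null. In other words, when sn is not configured, clipPlugin_ is never properly obtained.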
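The early return can be reproduced in a few lines. ToyPasteboard and ToyClipPlugin are hypothetical, the distributed switch is a constructor flag instead of a device-profile query, and creating the plugin on the enabled path is an assumption, since the excerpt above stops before that part of the function.

```cpp
#include <memory>
#include <mutex>

// Hypothetical placeholder for the real ClipPlugin.
struct ToyClipPlugin {};

class ToyPasteboard {
public:
    explicit ToyPasteboard(bool distributedOn) : distributedOn_(distributedOn) {}

    std::shared_ptr<ToyClipPlugin> GetClipPlugin()
    {
        std::lock_guard<std::mutex> lock(mutex_);
        // Same condition as in the excerpt: when the switch is off,
        // whatever is cached (initially nullptr) is returned as-is.
        if (!distributedOn_ || clipPlugin_ != nullptr) {
            return clipPlugin_;
        }
        clipPlugin_ = std::make_shared<ToyClipPlugin>(); // assumed creation step
        return clipPlugin_;
    }

    // Fails whenever no plugin is available, like the "clipPlugin null." log.
    bool SetDistributedData()
    {
        return GetClipPlugin() != nullptr;
    }

private:
    bool distributedOn_;
    std::mutex mutex_;
    std::shared_ptr<ToyClipPlugin> clipPlugin_;
};
```

With the switch off, every call returns nullptr and SetDistributedData() reports failure, matching the repeated errors in the log.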
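Reflections and summary
Looking back on this problem, I think the serial number is really a mass-production parameter, and letting a production-related factor affect desktop interaction logic seems unreasonable.
Ultimately this is still a software design problem. First, the system should provide a sensible default for a key parameter such as the serial number, so that leaving it unconfigured does not cause unexpected behavior. Second, when a key parameter is missing, the software should degrade gracefully instead of freezing the screen. In addition, the documentation could state more clearly the requirements and configuration needed for the software to run properly.
If these basics are in place, I believe pitfalls of this kind can be largely avoided, which in turn improves the overall reliability of the system.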
[References]
• https://docs.openharmony.cn/pages/v3.2/zh-cn/device-dev/subsystems/subsys-boot-overview.md/
• https://gitee.com/openharmony/systemabilitymgr_safwk
• https://gitee.com/openharmony/systemabilitymgr_samgr
• https://gitee.com/openharmony/distributeddatamgr_pasteboard
• https://gitee.com/openharmony/deviceprofile_device_info_manager