How to Fix an Ambari Custom Service That Still Shows as Stopped After Starting Successfully

Published: 2021-12-06 09:26:34 | Views: 373 | Author: 柒染 | Category: Big Data

This article analyzes why an Ambari custom service can keep showing as stopped even though it started successfully, and describes a fix that worked in practice.

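1. Overview

If you run into this situation, first check the log output in /var/log/ambari-agent/ambari-agent.log.

After a service is installed, its status() method is executed roughly every 60 s. If status() raises an error while it runs, the Ambari page shows the service as stopped. If status() completes without an error, the Ambari page shows the service as running.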
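
The rule above can be illustrated with a small standalone model. This is only a sketch of the behavior described here, not Ambari's actual agent code; `report_state`, `healthy_status`, and `broken_status` are hypothetical names invented for the illustration.

```python
def report_state(status_fn):
    """Run one status check and map its outcome to the state the UI would show."""
    try:
        status_fn()
    except Exception:
        # Any error raised inside the check, not only "process is down",
        # is treated as a stopped service.
        return "STOPPED"
    return "STARTED"


def healthy_status():
    """A status check that finds nothing wrong and simply returns."""
    pass


def broken_status():
    """A status check with a bug: it reads a config type that is not there."""
    config = {"configurations": {}}
    return config["configurations"]["graphexp-server"]  # raises KeyError


print(report_state(healthy_status))  # STARTED
print(report_state(broken_status))   # STOPPED
```

The second case is the important one: a bug inside status() looks exactly like a process that is really down, so a service can be running and still be displayed as stopped.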

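In status(), we typically use check_process_status() from the resource_management module that ships with Ambari to determine the state of the service.

check_process_status() decides whether the service is running by checking the process ID stored in a pid file. A pid file normally contains a single process ID, such as 12168.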
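
To make the idea of a pid-file check concrete, here is a minimal standalone sketch. It is an approximation written for this article, not the source of check_process_status(); the names `check_pid_file` and `ProcessNotRunning` are hypothetical, and the code assumes a Unix-like system, where signal 0 only probes for the existence of a process.

```python
import errno
import os


class ProcessNotRunning(Exception):
    """Hypothetical stand-in for the error a failed status check raises."""


def check_pid_file(pid_file):
    """Raise ProcessNotRunning unless pid_file names a live process."""
    if not pid_file or not os.path.isfile(pid_file):
        raise ProcessNotRunning("pid file not found: %s" % pid_file)

    with open(pid_file) as f:
        content = f.read().strip()
    try:
        pid = int(content)
    except ValueError:
        raise ProcessNotRunning("pid file does not hold a number: %r" % content)

    try:
        # Signal 0 delivers nothing; the call only checks that the pid exists.
        os.kill(pid, 0)
    except OSError as err:
        if err.errno == errno.ESRCH:
            raise ProcessNotRunning("no process with pid %d" % pid)
        # Other errors such as EPERM mean the process exists but belongs
        # to another user, so it still counts as running.
```

A missing pid file, an empty or garbled pid file, and a stale pid all lead to the same result: the check raises, and the service is reported as stopped.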

2. Example and Analysis

2.1 The Error

Take the custom service JanusGraph as an example. Its status() method was written like this:

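    from resource_management import *

    # Excerpt: status() is a method of the service script class
    # (the class definition is omitted here).
    def status(self, env):
        import graphexp_params
        env.set_params(graphexp_params)
        check_process_status(graphexp_params.graphexp_nginx_pid_file)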

An excerpt from graphexp_params.py:

    import os

    from resource_management import *

    config = Script.get_config()

    # Directory that holds the graphexp nginx pid file
    graphexp_pid_dir = config['configurations']['graphexp-server']['graphexp_pid_dir']
    # Path of the graphexp nginx pid file
    graphexp_nginx_pid_file = os.path.join(graphexp_pid_dir, 'graphexp_nginx.pid')

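The code above dynamically reads the graphexp_pid_dir setting configured on the Ambari page and joins it into a pid file path. This pid file contains only the process ID of the graphexp component.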

This failed. According to the output in /var/log/ambari-agent/ambari-agent.log, the error was raised in status_params.py while reading the parameter value defined in graphexp-server.xml.

2.2 Troubleshooting

Printing config['configurations'] inside status() showed only the following entries:

ams-hbase-env,infra-solr-env,hbase-env,ams-env,elastic-env,janusgraph-env,ams-grafana-env,hadoop-env,zookeeper-env,cluster-env

The graphexp-server entry is not among them.

Printing the same thing inside start() shows many more entries; all of the configurations XML files have been loaded:

ranger-hdfs-audit,ssl-client,infra-solr-log4j,ranger-hdfs-policymgr-ssl,ams-hbase-site,elastic-config,ranger-hbase-audit,hdfs-logsearch-conf,ams-grafana-env,ranger-hdfs-security,ams-ssl-client,infra-solr-env,ranger-hdfs-plugin-properties,hbase-policy,ams-logsearch-conf,ams-hbase-security-site,hdfs-site,ams-env,ams-site,ams-hbase-policy,janusgraph-env,hadoop-metrics2.properties,hadoop-policy,hdfs-log4j,hbase-site,infra-logsearch-conf,ranger-hbase-plugin-properties,ams-grafana-ini,graphexp-server,ams-ssl-server,infra-solr-xml,ams-log4j,ams-hbase-env,core-site,infra-solr-security-json,gremlin-server,janusgraph-hbase-solr,infra-solr-client-log4j,hbase-logsearch-conf,hadoop-env,zookeeper-log4j,hbase-log4j,postgresql,ssl-server,hbase-env,zoo.cfg,elastic-env,ranger-hbase-policymgr-ssl,zookeeper-logsearch-conf,cluster-env,zookeeper-env,ams-hbase-log4j,ranger-hbase-security

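My guess, therefore, is that status() can only see the configuration defined in xxx-env.xml files. Custom services on Ambari 2.7 do not have this problem, however; it only appeared on Ambari 2.6.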
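
One way to confirm this kind of mismatch is to compare the config types a script needs with the ones it actually receives. The sketch below is standalone and does not call Ambari; `missing_config_types` is a hypothetical helper, and `status_config` is a hand-built dictionary that mimics the list printed from status() above, with the property values left out.

```python
def missing_config_types(config, required_types):
    """Return the required config types absent from config['configurations']."""
    available = config.get("configurations", {})
    return [name for name in required_types if name not in available]


# Config types that were visible inside status() on Ambari 2.6 (values omitted).
status_config = {
    "configurations": {
        name: {}
        for name in [
            "ams-hbase-env", "infra-solr-env", "hbase-env", "ams-env",
            "elastic-env", "janusgraph-env", "ams-grafana-env", "hadoop-env",
            "zookeeper-env", "cluster-env",
        ]
    }
}

print(missing_config_types(status_config, ["graphexp-server", "janusgraph-env"]))
# ['graphexp-server']
```

Running the same comparison against the dictionary seen in start() would return an empty list, since graphexp-server is present there.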

2.3 Solution

Create a new graphexp-env.xml file and add the graphexp_pid_dir setting to it. Then change how graphexp_pid_dir is read in graphexp_params.py:

    # Directory that holds the graphexp nginx pid file, now read from graphexp-env
    graphexp_pid_dir = config['configurations']['graphexp-env']['graphexp_pid_dir']
    # Path of the graphexp nginx pid file
    graphexp_nginx_pid_file = os.path.join(graphexp_pid_dir, 'graphexp_nginx.pid')

Inside status(), read the setting from graphexp-env.xml. On Ambari 2.6, only the contents of xxx-env.xml files could be loaded by status().
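
To keep this constraint from being broken again later, the lookup can be wrapped in a small guard. The following is a sketch under stated assumptions: `get_status_property` is a hypothetical helper, not part of resource_management, and the directory `/var/run/graphexp` is a made-up value used only to make the example runnable.

```python
import os


def get_status_property(config, config_type, key):
    """Read a property for status(), insisting that it lives in a *-env type."""
    if not config_type.endswith("-env"):
        raise ValueError(
            "%s is not a *-env config type; status() may not see it" % config_type
        )
    try:
        return config["configurations"][config_type][key]
    except KeyError:
        raise KeyError("%s/%s is missing from the status() config" % (config_type, key))


# Example config as status() might receive it; the directory is a made-up value.
config = {"configurations": {"graphexp-env": {"graphexp_pid_dir": "/var/run/graphexp"}}}

pid_dir = get_status_property(config, "graphexp-env", "graphexp_pid_dir")
graphexp_nginx_pid_file = os.path.join(pid_dir, "graphexp_nginx.pid")
print(graphexp_nginx_pid_file)
```

With a guard like this, an attempt to read graphexp-server from status() fails with an explicit message instead of a bare KeyError.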

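3. Tips for Debugging status()

Because status() is called by polling, and it is not yet clear where its log output ends up (it is not written to ambari-agent.log), you can use a command such as Execute("echo {0} >> /tmp/test.log".format(status_params.gtm_standby_pid_file)) to write out the parameter values you need. The position of such Execute statements also shows how far the code got before failing, which makes it easier to locate the line that raises the error.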
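
The same idea can be expressed in plain Python, which also captures the traceback and therefore the exact failing line. This is a minimal sketch rather than an Ambari facility; `debug_log` and `traced` are hypothetical helpers, and /tmp/test.log is simply the scratch file used in the Execute example above.

```python
import time
import traceback


def debug_log(message, path="/tmp/test.log"):
    """Append one timestamped line to a scratch log file."""
    with open(path, "a") as f:
        f.write("%s %s\n" % (time.strftime("%Y-%m-%d %H:%M:%S"), message))


def traced(fn, path="/tmp/test.log"):
    """Run fn(); on failure, write the traceback to the log and re-raise."""
    try:
        return fn()
    except Exception:
        debug_log(traceback.format_exc(), path)
        raise
```

Inside status(), a call such as debug_log("pid file: %s" % some_pid_file) records a value, and wrapping the body of status() in traced(...) records where it failed. The exception is re-raised, so the service state reported to Ambari is unchanged.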
