This article introduces fs2cs, the tool for converting a Fair Scheduler configuration into a Capacity Scheduler configuration.
o Looks at several nodes at once
o Fine-grained locking
o Multiple allocation threads
o 5-10x higher throughput
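After a long and careful analysis, we decided to make Capacity Scheduler the default scheduler. We put together a document, attached to YARN-9698, that compares the features of Capacity Scheduler and Fair Scheduler.
Note that although we tested the tool with a variety of Fair Scheduler and YARN site configurations, it is a new addition to Apache Hadoop. We strongly recommend manually inspecting and reviewing the generated output files.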
yarn fs2cs -y /path/to/yarn-site.xml [-f /path/to/fair-scheduler.xml] {-o /output/path/ | -p} [-t] [-s] [-d]
yarn fs2cs --yarnsiteconfig /path/to/yarn-site.xml [--fsconfig /path/to/fair-scheduler.xml] {--output-directory /output/path/ | --print} [--no-terminal-rule-check] [--skip-verification] [--dry-run]
yarn fs2cs --yarnsiteconfig /home/hadoop/yarn-site.xml --fsconfig /home/hadoop/fair-scheduler.xml --output-directory /tmp
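When the conversion is scripted, it can help to assemble the command line in one place. The following is a minimal sketch, not part of the tool: the function name `build_fs2cs_command` and its parameters are made up for illustration, and only the long options shown in the usage lines above are used.

```python
import shlex


def build_fs2cs_command(yarn_site, output_dir=None, fs_config=None,
                        no_terminal_rule_check=False,
                        skip_verification=False, dry_run=False):
    """Return the fs2cs argument list (hypothetical helper, not part of Hadoop).

    If output_dir is None, --print is used instead of --output-directory,
    mirroring the {--output-directory | --print} choice in the usage line.
    """
    cmd = ["yarn", "fs2cs", "--yarnsiteconfig", yarn_site]
    if fs_config is not None:
        cmd += ["--fsconfig", fs_config]
    if output_dir is not None:
        cmd += ["--output-directory", output_dir]
    else:
        cmd.append("--print")
    if no_terminal_rule_check:
        cmd.append("--no-terminal-rule-check")
    if skip_verification:
        cmd.append("--skip-verification")
    if dry_run:
        cmd.append("--dry-run")
    return cmd


if __name__ == "__main__":
    cmd = build_fs2cs_command(
        "/home/hadoop/yarn-site.xml",
        output_dir="/tmp",
        fs_config="/home/hadoop/fair-scheduler.xml",
    )
    # Show the command; pass cmd to subprocess.run(cmd, check=True) to run it.
    print(shlex.join(cmd))
```

Returning a list rather than a string keeps the arguments safe to hand to `subprocess.run` without shell quoting issues.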
<allocations>
  <queue name="root">
    <weight>1.0</weight>
    <schedulingPolicy>drf</schedulingPolicy>
    <queue name="default">
      <weight>1.0</weight>
      <schedulingPolicy>drf</schedulingPolicy>
    </queue>
    <queue name="users" type="parent">
      <maxChildResources>memory-mb=8192, vcores=1</maxChildResources>
      <weight>1.0</weight>
      <schedulingPolicy>drf</schedulingPolicy>
    </queue>
  </queue>
  <queuePlacementPolicy>
    <rule name="specified" create="true"/>
    <rule name="nestedUserQueue" create="true">
      <rule name="default" create="true" queue="users"/>
    </rule>
    <rule name="default"/>
  </queuePlacementPolicy>
</allocations>
yarn.scheduler.fair.allow-undeclared-pools = true
yarn.scheduler.fair.user-as-default-queue = true
yarn.scheduler.fair.preemption = false
yarn.scheduler.fair.preemption.cluster-utilization-threshold = 0.8
yarn.scheduler.fair.sizebasedweight = false
yarn.scheduler.fair.assignmultiple = true
yarn.scheduler.fair.dynamicmaxassign = true
yarn.scheduler.fair.maxassign = -1
yarn.scheduler.fair.continuous-scheduling-enabled = false
yarn.scheduler.fair.locality-delay-node-ms = 2000
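Before running the converter, it can be useful to see which Fair Scheduler properties a yarn-site.xml actually sets. The sketch below assumes the standard Hadoop `<configuration>/<property>/<name>/<value>` layout; the function name `fair_scheduler_properties` and the inline `SAMPLE` document are illustrative only.

```python
import xml.etree.ElementTree as ET

# Illustrative yarn-site.xml fragment using three of the properties listed above.
SAMPLE = """<configuration>
  <property>
    <name>yarn.scheduler.fair.preemption</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.scheduler.fair.assignmultiple</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>rm.example.com</value>
  </property>
</configuration>"""


def fair_scheduler_properties(xml_text):
    """Return {name: value} for every yarn.scheduler.fair.* property."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name", default="").strip()
        value = prop.findtext("value", default="").strip()
        if name.startswith("yarn.scheduler.fair."):
            props[name] = value
    return props


if __name__ == "__main__":
    for name, value in sorted(fair_scheduler_properties(SAMPLE).items()):
        print(f"{name} = {value}")
```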
~$ yarn fs2cs -y /home/examples/yarn-site.xml -f /home/examples/fair-scheduler.xml -o /tmp
2020-05-05 14:22:41,384 INFO [main] converter.FSConfigToCSConfigConverter (FSConfigToCSConfigConverter.java:prepareOutputFiles(138)) - Output directory for yarn-site.xml and capacity-scheduler.xml is: /tmp
2020-05-05 14:22:41,388 INFO [main] converter.FSConfigToCSConfigConverter (FSConfigToCSConfigConverter.java:loadConversionRules(177)) - Conversion rules file is not defined, using default conversion config!
[...] output trimmed for brevity
2020-05-05 14:22:42,572 ERROR [main] converter.FSConfigToCSConfigConverterMain (MarkerIgnoringBase.java:error(159)) - Error while starting FS configuration conversion!
[...] output trimmed for brevity
Caused by: org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationConfigurationException: Rules after rule 2 in queue placement policy can never be reached
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueuePlacementPolicy.updateRuleSet(QueuePlacementPolicy.java:115)
[...]
~$ yarn fs2cs -y /home/examples/yarn-site.xml -f /home/examples/fair-scheduler.xml -o /tmp --no-terminal-rule-check
2020-05-05 14:41:39,189 INFO [main] capacity.CapacityScheduler (CapacityScheduler.java:initScheduler(384)) - Initialized CapacityScheduler with calculator=class org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator, minimumAllocation=<<memory:1024, vCores:1>>, maximumAllocation=<<memory:8192, vCores:4>>, asynchronousScheduling=false, asyncScheduleInterval=5ms,multiNodePlacementEnabled=false
2020-05-05 14:41:39,190 INFO [main] converter.ConvertedConfigValidator (ConvertedConfigValidator.java:validateConvertedConfig(72)) - Capacity scheduler was successfully started
This time, the conversion succeeded!
2020-05-05 14:41:38,908 WARN [main] converter.FSConfigToCSConfigRuleHandler (ConversionOptions.java:handleWarning(48)) - Setting <userMaxAppsDefault> is not supported, ignoring conversion
2020-05-05 14:41:38,945 WARN [main] converter.FSConfigToCSConfigRuleHandler (ConversionOptions.java:handleWarning(48)) - Setting <maxChildResources> is not supported, ignoring conversion
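Because unsupported settings are only reported as WARN lines, they are easy to miss in a long log. As a rough sketch, the names can be collected from captured converter output with a regular expression. The pattern is based solely on the two warning lines shown above, so other message wordings would not be matched; `unsupported_settings` is a made-up helper name.

```python
import re

# Matches lines such as:
#   ... WARN ... - Setting <maxChildResources> is not supported, ignoring conversion
UNSUPPORTED = re.compile(r"WARN .*Setting <(?P<setting>[^>]+)> is not supported")

SAMPLE_LOG = """\
2020-05-05 14:41:38,908 WARN [main] converter.FSConfigToCSConfigRuleHandler (ConversionOptions.java:handleWarning(48)) - Setting <userMaxAppsDefault> is not supported, ignoring conversion
2020-05-05 14:41:38,945 WARN [main] converter.FSConfigToCSConfigRuleHandler (ConversionOptions.java:handleWarning(48)) - Setting <maxChildResources> is not supported, ignoring conversion
2020-05-05 14:41:39,190 INFO [main] converter.ConvertedConfigValidator (ConvertedConfigValidator.java:validateConvertedConfig(72)) - Capacity scheduler was successfully started
"""


def unsupported_settings(log_text):
    """Return the setting names reported as unsupported, in log order."""
    return [m.group("setting") for m in UNSUPPORTED.finditer(log_text)]


if __name__ == "__main__":
    for setting in unsupported_settings(SAMPLE_LOG):
        print(f"review manually: <{setting}>")
```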
yarn.scheduler.capacity.resource-calculator = org.apache.hadoop.yarn.util.resource.DominantResourceCalculator
yarn.scheduler.capacity.per-node-heartbeat.multiple-assignments-enabled = true
yarn.resourcemanager.scheduler.class = org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler

yarn.scheduler.capacity.root.users.maximum-capacity = 100
yarn.scheduler.capacity.root.default.capacity = 50.000
yarn.scheduler.capacity.root.default.ordering-policy = fair
yarn.scheduler.capacity.root.users.capacity = 50.000
yarn.scheduler.capacity.root.default.maximum-capacity = 100
yarn.scheduler.capacity.root.queues = default,users
yarn.scheduler.capacity.root.maximum-capacity = 100
yarn.scheduler.capacity.maximum-am-resource-percent = 0.5
yarn.scheduler.fair.preemption - true
yarn.scheduler.fair.sizebasedweight - true
yarn.scheduler.fair.continuous-scheduling-enabled - true
yarn.scheduler.capacity.resource-calculator = org.apache.hadoop.yarn.util.resource.DominantResourceCalculator
yarn.scheduler.capacity.schedule-asynchronously.scheduling-interval-ms = 5
yarn.scheduler.capacity.schedule-asynchronously.enable = true
yarn.resourcemanager.monitor.capacity.preemption.monitoring_interval = 10000
yarn.resourcemanager.monitor.capacity.preemption.max_wait_before_kill = 15000
yarn.scheduler.capacity.per-node-heartbeat.multiple-assignments-enabled = true
yarn.resourcemanager.scheduler.class = org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
yarn.resourcemanager.scheduler.monitor.enable = true

yarn.scheduler.capacity.root.default.ordering-policy.fair.enable-size-based-weight = true
yarn.scheduler.capacity.root.users.ordering-policy.fair.enable-size-based-weight = true
yarn.scheduler.capacity.root.users.capacity = 50.000
yarn.scheduler.capacity.root.queues = default,users
yarn.scheduler.capacity.root.users.maximum-capacity = 100
yarn.scheduler.capacity.root.ordering-policy.fair.enable-size-based-weight = true
[...] rest is omitted because it's the same as before
root.a = 3
root.b = 1

yarn.scheduler.capacity.root.a.capacity = 75.000
yarn.scheduler.capacity.root.a.maximum-capacity = 100.000
yarn.scheduler.capacity.root.b.capacity = 25.000
yarn.scheduler.capacity.root.b.maximum-capacity = 100.000
root = 1
root.users = 20
root.default = 10
root.users.alice = 3
root.users.bob = 1

yarn.scheduler.capacity.root.capacity = 100.000
yarn.scheduler.capacity.root.maximum-capacity = 100.000
yarn.scheduler.capacity.root.users.capacity = 66.667
yarn.scheduler.capacity.root.users.maximum-capacity = 100.000
yarn.scheduler.capacity.root.default.capacity = 33.333
yarn.scheduler.capacity.root.default.maximum-capacity = 100.000
yarn.scheduler.capacity.root.users.alice.capacity = 75.000
yarn.scheduler.capacity.root.users.alice.maximum-capacity = 100.000
yarn.scheduler.capacity.root.users.bob.capacity = 25.000
yarn.scheduler.capacity.root.users.bob.maximum-capacity = 100.000
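The listings above pair Fair Scheduler weights with the capacities produced for them. The arithmetic appears to be that each queue receives its weight divided by the sum of the weights of its siblings, expressed as a percentage. The following sketch reproduces the numbers in both examples under that assumption. It illustrates the calculation only, is not the converter's actual code, and ignores edge cases such as zero weights or rounding adjustments so that siblings sum to exactly 100.

```python
from collections import defaultdict


def weights_to_capacities(weights):
    """Map {queue path: weight} to {queue path: capacity in percent}.

    Each queue's capacity is its share of the total weight among queues
    with the same parent. Assumes all weights are positive.
    """
    siblings = defaultdict(dict)
    for queue, weight in weights.items():
        parent = queue.rpartition(".")[0]  # "" for the root queue
        siblings[parent][queue] = weight

    capacities = {}
    for children in siblings.values():
        total = sum(children.values())
        for queue, weight in children.items():
            capacities[queue] = round(100.0 * weight / total, 3)
    return capacities


if __name__ == "__main__":
    weights = {
        "root": 1,
        "root.users": 20,
        "root.default": 10,
        "root.users.alice": 3,
        "root.users.bob": 1,
    }
    for queue, capacity in weights_to_capacities(weights).items():
        print(f"yarn.scheduler.capacity.{queue}.capacity = {capacity:.3f}")
```

Note that only the ratio between siblings matters: weights of 20 and 10 give the same split as 2 and 1.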
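• Maximum number of applications per user
• <userMaxAppsDefault> - default maximum number of applications per user
• <minResources> - minimum resources of a queue
• <maxResources> - maximum resources of a queue
• <maxChildResources> - maximum resources of a dynamically created queue
• Queue-level DRF ordering policy: in Capacity Scheduler, DRF must be global. In Fair Scheduler, the regular "Fair" policy can be used under a DRF parent.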
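1) Handle percentage vectors as resources in Capacity Scheduler (YARN-9936). Users will be able to define not just a single capacity but separate values for different resources.
2) Handle maxRunningApps per user and userMaxAppsDefault (YARN-9930). Capacity Scheduler already has a "maximum applications per user" setting, but it cannot be configured directly and is cumbersome because it is a combination of three settings. We also have to be careful not to break existing behavior: when the maximum is exceeded, the existing logic in Capacity Scheduler rejects the application submission, whereas Fair Scheduler always accepts the application and schedules it later.
3) Handle minResources, maxResources and maxChildResources. These depend heavily on YARN-9936. In Fair Scheduler, users can express these settings in several ways (a single percentage, two separate percentages, or absolute resources). Supporting similar settings in Capacity Scheduler requires YARN-9936.
4) Make the mapping rules behave like the existing implementation in Fair Scheduler. The "Placement rules" section explains how mapping rules are evaluated. We may need a new, pluggable approach so that we do not introduce regressions into an existing code base that is already very complex.
5) Improvements around DRF and other scheduling policies (YARN-9892). Currently there is a single global resource calculator, defined by the property yarn.scheduler.capacity.resource-calculator. Fair Scheduler is more fine-grained here.
6) General fine-tuning of the overall conversion process. Capacity Scheduler has properties such as "user-limit-factor" and "minimum-user-limit-percent". We do not use these settings for now, but they have proven useful in some configurations.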