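In a recent project we deployed an Akka (2.6.8) cluster on Kubernetes. We ran into a problem: if an unreachable node was never restarted manually, other nodes could no longer join the cluster.
Specifically, the pod of one non-seed node restarted and was redeployed elsewhere, while the cluster kept trying to connect to the previous node (its old IP), which caused the failure.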
Let's first look at the concept of Gossip Convergence, as described in the Akka documentation:
Gossip convergence cannot occur while any nodes are unreachable. The nodes need to become reachable again, or moved to the down and removed states (see the Cluster Membership Lifecycle section). This only blocks the leader from performing its cluster membership management and does not influence the application running on top of the cluster. For example this means that during a network partition it is not possible to add more nodes to the cluster. The nodes can join, but they will not be moved to the up state until the partition has healed or the unreachable nodes have been downed.
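In other words: gossip convergence cannot be reached while any node is unreachable. The node has to become reachable again, or be moved to the down and removed states. This only blocks the leader from performing cluster membership management; it does not affect the application running on top of the cluster. For example, during a network partition new nodes can join, but they are not moved to the up state until the partition has healed or the unreachable nodes have been downed.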
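To make this rule concrete, here is a minimal Python sketch of a toy model. It is not Akka code: the `Member` class, the `leader_promote_joining` function, and the IP addresses are hypothetical names invented for illustration. The model assumes that an unreachable member stops blocking the leader once it is marked Down.

```python
from dataclasses import dataclass


@dataclass
class Member:
    address: str
    status: str = "Joining"   # e.g. Joining, Up, Down
    reachable: bool = True


def leader_promote_joining(members):
    """Move Joining members to Up, unless an unreachable member blocks it.

    In this toy model, unreachable members that are already Down no longer
    block the leader. Returns the addresses that were promoted.
    """
    blocked = any(not m.reachable and m.status != "Down" for m in members)
    if blocked:
        return []
    promoted = []
    for m in members:
        if m.status == "Joining":
            m.status = "Up"
            promoted.append(m.address)
    return promoted


cluster = [
    Member("10.0.0.1", "Up"),
    Member("10.0.0.2", "Up", reachable=False),  # old pod IP that is gone
    Member("10.0.0.9", "Joining"),              # the rescheduled pod
]

first_attempt = leader_promote_joining(cluster)   # blocked by 10.0.0.2
cluster[1].status = "Down"                        # downed manually or by SBR
second_attempt = leader_promote_joining(cluster)  # the new node can go Up
print(first_attempt, second_attempt)
```

The first attempt promotes nothing because the stale address is still unreachable and not downed, which matches the symptom described at the start of this article. Once that member is downed, the joining node is promoted.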
If a node is unreachable then gossip convergence is not possible and therefore most leader actions are impossible (for instance, allowing a node to become a part of the cluster). To be able to move forward, the node must become reachable again or the node must be explicitly “downed”. This is required because the state of an unreachable node is unknown and the cluster cannot know if the node has crashed or is only temporarily unreachable because of network issues or GC pauses. See the section about User Actions below for ways a node can be downed.
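We leave the first approach for you to explore on your own and take the second one here, using the Split Brain Resolver (SBR). SBR offers five strategies: static-quorum, keep-majority, keep-oldest, down-all, and lease-majority. Our configuration is: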
akka.coordinated-shutdown.exit-jvm = on
akka.coordinated-shutdown.exit-code = 0
akka.cluster.downing-provider-class = "akka.cluster.sbr.SplitBrainResolverProvider"
akka.cluster.split-brain-resolver.down-all-when-unstable = off
akka.cluster.split-brain-resolver.stable-after = 20s
akka.cluster.split-brain-resolver.active-strategy = keep-majority
akka.cluster.split-brain-resolver.keep-majority.role = "admin"
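| Setting | Description |
| --- | --- |
| akka.coordinated-shutdown.exit-jvm | Whether to exit the JVM when the node has been removed from the cluster and coordinated shutdown completes; on or off |
| akka.coordinated-shutdown.exit-code | The exit status code used when the JVM exits |
| akka.cluster.downing-provider-class | Set to akka.cluster.sbr.SplitBrainResolverProvider to enable SBR |
| akka.cluster.split-brain-resolver.down-all-when-unstable | Whether SBR downs all nodes when the cluster stays unstable for too long; accepts on, off, or a duration such as 15s |
| akka.cluster.split-brain-resolver.stable-after | How long membership and reachability must stay unchanged (for example, a node remaining unreachable) before SBR starts downing nodes |
| akka.cluster.split-brain-resolver.active-strategy | The strategy to use; keep-majority here |
| akka.cluster.split-brain-resolver.keep-majority.role | If set, the majority is counted only among members that have this role |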
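The keep-majority decision with a role filter can be sketched roughly as follows. This is a simplified Python illustration, not the Akka implementation: `Node`, `keep_majority`, and the addresses are hypothetical, tie-breaking on an exact split is left out, and the sketch assumes the cluster state has already been stable for the stable-after period (20s in the configuration above).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    address: str
    roles: frozenset
    reachable: bool  # as seen from the side making the decision


def keep_majority(nodes, role=None):
    """Return the addresses that this side of the partition should down.

    Only nodes carrying `role` are counted (all nodes when role is None).
    If the reachable side holds a strict majority of the counted nodes, the
    unreachable nodes are downed. Otherwise this side downs itself, meaning
    the reachable nodes. Tie-breaking is intentionally not modelled.
    """
    counted = [n for n in nodes if role is None or role in n.roles]
    reachable_votes = sum(1 for n in counted if n.reachable)
    unreachable_votes = len(counted) - reachable_votes
    if reachable_votes > unreachable_votes:
        return sorted(n.address for n in nodes if not n.reachable)
    return sorted(n.address for n in nodes if n.reachable)


nodes = [
    Node("10.0.0.1", frozenset({"admin"}), True),
    Node("10.0.0.3", frozenset({"admin"}), True),
    Node("10.0.0.2", frozenset({"admin"}), False),  # stale pod IP
    Node("10.0.0.4", frozenset({"worker"}), True),
]

to_down = keep_majority(nodes, role="admin")
print(to_down)
```

Here two of the three admin nodes are reachable, so the side that can see them keeps running and downs the stale address. With exit-jvm = on, a node that finds itself in the minority and gets downed would also exit its JVM, which on Kubernetes lets the pod be restarted.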