This article walks through part of the Hadoop 2.4 NameNode source code, focusing on how NameNode HA elects its active node.
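In Hadoop NameNode (NN) HA, the election between the active and standby nodes is implemented by ActiveStandbyElector. The source code carries a comment explaining the class.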
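My English is limited, but roughly translated: the class uses ZooKeeper to elect the active node. Each candidate tries to create an ephemeral znode in ZooKeeper. The NN whose create call succeeds becomes active, and the other NNs become standbys.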
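Let's look at what ActiveStandbyElector does in more detail. Its main job is the election, which works by creating an ephemeral znode. If the create succeeds, the node is considered to hold the LOCK and can become active. If the create fails, the node is a standby, and a standby has to keep watching the state of the LOCK znode. Whenever an event fires on that znode, it tries the election again. That is the basic flow.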
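The flow above can be sketched without ZooKeeper at all. The following is a minimal in-memory model, not the real ZooKeeper API: `EphemeralLockSketch`, `Candidate`, `joinElection` and `sessionLost` are hypothetical names made up for illustration, and the "watch" is just a list of standbys that get to retry when the lock holder's session goes away.

```java
import java.util.ArrayList;
import java.util.List;

/** In-memory stand-in for the ephemeral lock znode (an assumption, not ZooKeeper itself). */
final class EphemeralLockSketch {
    /** Hypothetical candidate: one NameNode taking part in the election. */
    static final class Candidate {
        final String name;
        String state = "INIT";
        Candidate(String name) { this.name = name; }
    }

    private Candidate lockOwner;                              // null means the lock node does not exist
    private final List<Candidate> watchers = new ArrayList<>();

    /** Try to create the lock node: the creator becomes active, everyone else watches it. */
    void joinElection(Candidate c) {
        if (lockOwner == null) {
            lockOwner = c;
            c.state = "ACTIVE";
        } else {
            c.state = "STANDBY";
            if (!watchers.contains(c)) {
                watchers.add(c);
            }
        }
    }

    /** Losing the owner's session removes the ephemeral node and lets the watchers retry. */
    void sessionLost(Candidate c) {
        if (lockOwner != c) {
            watchers.remove(c);                               // a standby simply stops watching
            return;
        }
        lockOwner = null;
        c.state = "INIT";
        List<Candidate> toNotify = new ArrayList<>(watchers);
        watchers.clear();
        for (Candidate w : toNotify) {
            joinElection(w);                                  // the first watcher to retry wins
        }
    }

    public static void main(String[] args) {
        EphemeralLockSketch lock = new EphemeralLockSketch();
        Candidate nn1 = new Candidate("nn1");
        Candidate nn2 = new Candidate("nn2");
        lock.joinElection(nn1);
        lock.joinElection(nn2);
        System.out.println(nn1.name + "=" + nn1.state + " " + nn2.name + "=" + nn2.state);
        lock.sessionLost(nn1);
        System.out.println(nn1.name + "=" + nn1.state + " " + nn2.name + "=" + nn2.state);
    }
}
```

In the real class the retry is triggered by a ZooKeeper watch event rather than a direct method call, but the ownership rule is the same: whoever creates the node holds the lock.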
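Next, let's look at the main methods and flow of ActiveStandbyElector. As anyone familiar with ZooKeeper knows, a ZooKeeper client has to implement the Watcher interface, which is where it puts its own handling logic for the various events.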
ActiveStandbyElector uses an inner class to implement the Watcher interface. Its process method calls processWatchEvent to do the actual handling.
Here is the logic of processWatchEvent:
// Handle ZooKeeper events
synchronized void processWatchEvent(ZooKeeper zk, WatchedEvent event) {
  Event.EventType eventType = event.getType();
  if (isStaleClient(zk)) return;
  LOG.debug("Watcher event type: " + eventType + " with state:"
      + event.getState() + " for path:" + event.getPath()
      + " connectionState: " + zkConnectionState
      + " for " + this);

  if (eventType == Event.EventType.None) {
    // Session-level events, such as connecting or losing the connection.
    // the connection state has changed
    switch (event.getState()) {
    case SyncConnected:
      LOG.info("Session connected.");
      // if the listener was asked to move to safe state then it needs to
      // be undone
      ConnectionState prevConnectionState = zkConnectionState;
      zkConnectionState = ConnectionState.CONNECTED;
      if (prevConnectionState == ConnectionState.DISCONNECTED &&
          wantToBeInElection) {
        monitorActiveStatus(); // monitor the lock node
      }
      break;
    case Disconnected:
      LOG.info("Session disconnected. Entering neutral mode...");
      // ask the app to move to safe state because zookeeper connection
      // is not active and we dont know our state
      zkConnectionState = ConnectionState.DISCONNECTED;
      enterNeutralMode();
      break;
    case Expired:
      // the connection got terminated because of session timeout
      // call listener to reconnect
      LOG.info("Session expired. Entering neutral mode and rejoining...");
      enterNeutralMode();
      reJoinElection(0); // take part in the election again
      break;
    case SaslAuthenticated:
      LOG.info("Successfully authenticated to ZooKeeper using SASL.");
      break;
    default:
      fatalError("Unexpected Zookeeper watch event state: "
          + event.getState());
      break;
    }
    return;
  }

  // a watch on lock path in zookeeper has fired. so something has changed on
  // the lock. ideally we should check that the path is the same as the lock
  // path but trusting zookeeper for now
  // Node-level events
  String path = event.getPath();
  if (path != null) {
    switch (eventType) {
    case NodeDeleted:
      if (state == State.ACTIVE) {
        enterNeutralMode(); // this method currently has no implementation
      }
      joinElectionInternal(); // start the election
      break;
    case NodeDataChanged:
      monitorActiveStatus(); // keep monitoring the node and trying to become active
      break;
    default:
      LOG.debug("Unexpected node event: " + eventType + " for path: " + path);
      monitorActiveStatus();
    }
    return;
  }

  // some unexpected error has occurred
  fatalError("Unexpected watch error from Zookeeper");
}
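To make the branching easier to see at a glance, here is a small sketch that reduces the method to a pure function from an event to the names of the follow-up calls. It is a simplification under stated assumptions: the enums and the `dispatch` method are hypothetical, the SaslAuthenticated case and the stale-client check are left out, and the returned strings merely name the methods from the listing rather than invoking anything.

```java
/** Simplified decision table for the watch-event handling (illustration only). */
final class WatchEventDispatchSketch {
    enum EventType { NONE, NODE_DELETED, NODE_DATA_CHANGED, OTHER }
    enum SessionState { SYNC_CONNECTED, DISCONNECTED, EXPIRED, UNKNOWN }

    /**
     * Returns the follow-up actions for one event as a comma-separated string.
     *
     * @param session            only consulted for session-level events (type NONE)
     * @param rejoiningAfterDrop true if the previous state was DISCONNECTED and the
     *                           node still wants to be in the election
     * @param isActive           true if this node currently believes it is active
     */
    static String dispatch(EventType type, SessionState session,
                           boolean rejoiningAfterDrop, boolean isActive) {
        if (type == EventType.NONE) {
            // Session-level event: the connection state changed.
            switch (session) {
                case SYNC_CONNECTED:
                    return rejoiningAfterDrop ? "monitorActiveStatus" : "none";
                case DISCONNECTED:
                    return "enterNeutralMode";
                case EXPIRED:
                    return "enterNeutralMode,reJoinElection";
                default:
                    return "fatalError";
            }
        }
        // Node-level event: something changed on the lock node.
        switch (type) {
            case NODE_DELETED:
                return isActive ? "enterNeutralMode,joinElectionInternal"
                                : "joinElectionInternal";
            case NODE_DATA_CHANGED:
                return "monitorActiveStatus";
            default:
                return "monitorActiveStatus";
        }
    }

    public static void main(String[] args) {
        System.out.println(dispatch(EventType.NONE, SessionState.EXPIRED, false, false));
        System.out.println(dispatch(EventType.NODE_DELETED, SessionState.UNKNOWN, false, true));
    }
}
```

The point of the table form is that only two situations restart the election: an expired session and a deleted lock node. Everything else either re-arms the monitor or moves to neutral mode.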
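joinElectionInternal is the core election method. The election is carried out by creating the znode at zkLockFilePath, and the create call uses ZooKeeper's asynchronous callback API.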
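The class definition shows that ActiveStandbyElector itself implements two ZooKeeper callback interfaces, StatCallback and StringCallback.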
StatCallback requires a single method, processResult(int rc, String path, Object ctx, Stat stat).
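ActiveStandbyElector implements the two callback methods in almost the same way, so only one is reproduced here. Interested readers can look up the other in the source. Here is the implementation, with comments.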
public synchronized void processResult(int rc, String path, Object ctx,
    String name) {
  if (isStaleClient(ctx)) return;
  LOG.debug("CreateNode result: " + rc + " for path: " + path
      + " connectionState: " + zkConnectionState
      + " for " + this);

  Code code = Code.get(rc); // map the raw return code to a KeeperException.Code value
  if (isSuccess(code)) { // success: the zkLockFilePath node was created
    // we successfully created the znode. we are the leader. start monitoring
    if (becomeActive()) { // switch the NN on this node to active
      monitorActiveStatus(); // keep monitoring the node state
    } else {
      reJoinElectionAfterFailureToBecomeActive(); // failed, so try the election again
    }
    return;
  }

  if (isNodeExists(code)) { // the node exists, so there is already an active; just wait
    if (createRetryCount == 0) {
      // znode exists and we did not retry the operation. so a different
      // instance has created it. become standby and monitor lock.
      becomeStandby();
    }
    // if we had retried then the znode could have been created by our first
    // attempt to the server (that we lost) and this node exists response is
    // for the second attempt. verify this case via ephemeral node owner. this
    // will happen on the callback for monitoring the lock.
    monitorActiveStatus(); // but the effort to become active must not stop
    return;
  }

  String errorMessage = "Received create error from Zookeeper. code:"
      + code.toString() + " for path " + path;
  LOG.debug(errorMessage);

  if (shouldRetry(code)) {
    if (createRetryCount < maxRetryNum) {
      LOG.debug("Retrying createNode createRetryCount: " + createRetryCount);
      ++createRetryCount;
      createLockNodeAsync();
      return;
    }
    errorMessage = errorMessage
        + ". Not retrying further znode create connection errors.";
  } else if (isSessionExpired(code)) {
    // This isn't fatal - the client Watcher will re-join the election
    LOG.warn("Lock acquisition failed because session was lost");
    return;
  }

  fatalError(errorMessage);
}
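The retry handling in this callback is the subtle part, so here is a minimal sketch of just that decision logic. It assumes a scripted list of server replies instead of a real asynchronous ZooKeeper call: `CreateRetrySketch`, its `Code` enum, and the outcome strings are all hypothetical, connection loss stands in for every retryable error, and the callback is invoked synchronously for simplicity.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

/** Simplified model of the create-callback retry logic (illustration only). */
final class CreateRetrySketch {
    enum Code { OK, NODE_EXISTS, CONNECTION_LOSS, SESSION_EXPIRED, OTHER }

    private final Deque<Code> scriptedReplies;  // hypothetical: canned replies from the "server"
    private final int maxRetryNum;
    private int createRetryCount = 0;
    String outcome = "PENDING";

    CreateRetrySketch(int maxRetryNum, Code... replies) {
        this.maxRetryNum = maxRetryNum;
        this.scriptedReplies = new ArrayDeque<>(Arrays.asList(replies));
    }

    /** Stand-in for the asynchronous create: immediately feeds the next reply to the callback. */
    void createLockNode() {
        processResult(scriptedReplies.isEmpty() ? Code.OTHER : scriptedReplies.poll());
    }

    void processResult(Code code) {
        switch (code) {
            case OK:
                outcome = "ACTIVE";                 // we created the node, so we hold the lock
                return;
            case NODE_EXISTS:
                // Without a retry, someone else created the node. After a retry the node
                // might be ours from the lost first attempt, so ownership must be verified.
                outcome = (createRetryCount == 0) ? "STANDBY" : "VERIFY_OWNER";
                return;
            case CONNECTION_LOSS:
                if (createRetryCount < maxRetryNum) {
                    createRetryCount++;
                    createLockNode();               // retry the create
                    return;
                }
                outcome = "FATAL";                  // retries exhausted
                return;
            case SESSION_EXPIRED:
                outcome = "WAIT_FOR_REJOIN";        // not fatal: the watcher rejoins later
                return;
            default:
                outcome = "FATAL";
        }
    }

    public static void main(String[] args) {
        CreateRetrySketch s = new CreateRetrySketch(3, Code.CONNECTION_LOSS, Code.NODE_EXISTS);
        s.createLockNode();
        System.out.println(s.outcome);
    }
}
```

The case worth noticing is NODE_EXISTS after a retry. The node may have been created by the caller's own first attempt whose reply was lost, which is why the quoted code skips becomeStandby there and leaves the decision to the monitoring callback.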
State changes such as becomeStandby and becomeActive are carried out by ZKFailoverController.
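This split of responsibilities, where the elector decides who wins and a controller performs the transition, can be sketched as a plain callback interface. The names below (`ElectionCallback`, `RecordingController`, `onLockResult`) are hypothetical and much reduced compared with the real Hadoop classes. The controller here only records what it was asked to do.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of the elector-to-controller callback split (illustration only). */
final class ElectorCallbackSketch {
    /** Hypothetical, reduced callback contract between elector and controller. */
    interface ElectionCallback {
        boolean becomeActive();   // returns false if the transition failed
        void becomeStandby();
    }

    /** Hypothetical controller that only records the transitions it is asked to make. */
    static final class RecordingController implements ElectionCallback {
        final List<String> transitions = new ArrayList<>();
        private final boolean activationSucceeds;

        RecordingController(boolean activationSucceeds) {
            this.activationSucceeds = activationSucceeds;
        }

        @Override
        public boolean becomeActive() {
            transitions.add(activationSucceeds ? "active" : "active-failed");
            return activationSucceeds;
        }

        @Override
        public void becomeStandby() {
            transitions.add("standby");
        }
    }

    /** Elector side: delegates the state change, then reports its own next step. */
    static String onLockResult(boolean lockCreated, ElectionCallback callback) {
        if (lockCreated) {
            // A failed activation sends the elector back into the election.
            return callback.becomeActive() ? "monitor" : "rejoin";
        }
        callback.becomeStandby();
        return "monitor";
    }

    public static void main(String[] args) {
        RecordingController controller = new RecordingController(true);
        System.out.println(onLockResult(true, controller) + " " + controller.transitions);
    }
}
```

Keeping the elector unaware of how a NameNode actually changes state is what lets the same election code be reused by any service that supplies its own callbacks.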