However, the AppmasterRmTemplate that Spring Hadoop uses has no such logic, so we modify it as follows, adding this method and its implementation to AppmasterRmTemplate:
```java
@Override
public AllocateResponse allocate(final AllocateRequest request, final String host,
        final Integer rpcPort, final String trackUrl) {
    return execute(new YarnRpcCallback<AllocateResponse, ApplicationMasterProtocol>() {

        @Override
        public AllocateResponse doInYarn(ApplicationMasterProtocol proxy)
                throws YarnException, IOException {
            return doAllocate(proxy, request, host, rpcPort, trackUrl);
        }

        private AllocateResponse doAllocate(ApplicationMasterProtocol proxy, AllocateRequest request,
                final String host, final Integer rpcPort, final String trackUrl)
                throws IOException, YarnException {
            AllocateResponse allocateResponse = null;
            try {
                allocateResponse = proxy.allocate(request);
            } catch (ApplicationMasterNotRegisteredException e) {
                log.warn("ApplicationMaster is out of sync with ResourceManager,"
                        + " hence resyncing.");
                // Re-register the AM with the RM, then retry the allocate call.
                log.info("Re-register am with RM.");
                registerApplicationMaster(host, rpcPort, trackUrl);
                allocateResponse = doAllocate(proxy, request, host, rpcPort, trackUrl);
            }
            return allocateResponse;
        }
    });
}
```
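The resync-and-retry pattern above can be sketched in isolation. This is a minimal, self-contained illustration; `FlakyMaster`, `NotRegisteredException`, and the other types here are hypothetical stand-ins for the YARN protocol classes, not the real Spring Hadoop API:

```java
// Sketch of the resync-and-retry pattern: if the RM reports that the AM is
// not registered (e.g. after an RM failover), re-register and retry.
public class ResyncRetrySketch {

    // Stand-in for ApplicationMasterNotRegisteredException.
    static class NotRegisteredException extends RuntimeException {}

    // Stand-in master that rejects calls until register() has been invoked,
    // simulating a freshly failed-over ResourceManager.
    static class FlakyMaster {
        private boolean registered = false;

        void register() { registered = true; }

        String allocate() {
            if (!registered) {
                throw new NotRegisteredException();
            }
            return "allocated";
        }
    }

    // Mirrors doAllocate(): on "not registered", re-register and retry.
    static String allocateWithResync(FlakyMaster master) {
        try {
            return master.allocate();
        } catch (NotRegisteredException e) {
            master.register();        // corresponds to registerApplicationMaster(...)
            return master.allocate(); // retry the original request
        }
    }

    public static void main(String[] args) {
        FlakyMaster master = new FlakyMaster();
        System.out.println(allocateWithResync(master)); // prints "allocated"
    }
}
```

The retry here is bounded to a single re-registration attempt; the actual patch retries recursively, which assumes the second allocate call succeeds once the AM has re-registered.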
The caller, DefaultContainerAllocator, is modified accordingly:
```java
AppmasterService appmasterClientService = YarnContextUtils.getAppmasterClientService(getBeanFactory());
AppmasterTrackService appmasterTrackService = YarnContextUtils.getAppmasterTrackService(getBeanFactory());
String host = appmasterClientService == null ? "" : appmasterClientService.getHost();
int port = appmasterClientService == null ? 0 : appmasterClientService.getPort();
String trackUrl = appmasterTrackService == null ? null : appmasterTrackService.getTrackUrl();
log.info("Host: " + host + ", port: " + port + ", trackUrl: " + trackUrl);
AllocateResponse allocate = getRmTemplate().allocate(request, host, port, trackUrl);
```
With these changes in place, repackage spring-yarn-core.jar and replace the corresponding jar in Spring XD to enable YARN HA support.
Configuration file
```yaml
# Hadoop properties
spring:
  hadoop:
    fsUri: hdfs://xxx
    resourceManagerHost: xxx
#    resourceManagerHost: yarn-cluster
    resourceManagerPort: 8032
#    rmAddress: yarn-cluster
#    resourceManagerSchedulerAddress: ${spring.hadoop.resourceManagerHost}:8030
#    jobHistoryAddress: xxx
## For phd30 only (values for version 3.0.1.0, also change resourceManagerPort above to 8050)
#    config:
#      mapreduce.application.framework.path: '/phd/apps/3.0.1.0-1/mapreduce/mapreduce.tar.gz#mr-framework'
#      mapreduce.application.classpath: '$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/phd/3.0.1.0-1/hadoop/lib/hadoop-lzo-0.6.0.3.0.1.0-1.jar:/etc/hadoop/conf/secure'
## For hdp22 only (values for version 2.2.8.0, also change resourceManagerPort above to 8050)
    config:
      mapreduce.application.framework.path: ${spring.yarn.config.mapreduce.application.framework.path}
      mapreduce.application.classpath: ${spring.yarn.config.mapreduce.application.classpath}
      net.topology.script.file.name: /etc/hadoop/conf/topology_script.py
      dfs.namenode.rpc-address: xxx.xxx.xxx.xxx:8020
      dfs.nameservices: xxx
      dfs.ha.namenodes.xxx: nn1,nn2
      dfs.namenode.rpc-address.xxx.nn1: xxx.xxx.xxx.xxx:8020
      dfs.namenode.rpc-address.xxx.nn2: xxx.xxx.xxx.xxx:8020
      dfs.client.failover.proxy.provider.xxx: org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
      yarn.resourcemanager.ha.enabled: true
      yarn.resourcemanager.ha.rm-ids: rm1,rm2
      yarn.resourcemanager.cluster-id: yarn-cluster
      yarn.resourcemanager.address.rm1: xxx.xxx.xxx.xxx
      yarn.resourcemanager.scheduler.address.rm1: xxx.xxx.xxx.xxx:8030
      yarn.resourcemanager.admin.address.rm1: xxx.xxx.xxx.xxx:8033
      yarn.resourcemanager.webapp.address.rm1: xxx.xxx.xxx.xxx:8088
      yarn.resourcemanager.resource-tracker.address.rm1: xxx.xxx.xxx.xxx:8031
      yarn.resourcemanager.address.rm2: xxx.xxx.xxx.xxx
      yarn.resourcemanager.scheduler.address.rm2: xxx.xxx.xxx.xxx:8030
      yarn.resourcemanager.admin.address.rm2: xxx.xxx.xxx.xxx:8033
      yarn.resourcemanager.webapp.address.rm2: xxx.xxx.xxx.xxx:8088
      yarn.resourcemanager.resource-tracker.address.rm2: xxx.xxx.xxx.xxx:8031
      yarn.resourcemanager.zk-address: xxx.xxx.xxx.xxx:2181,xxx.xxx.xxx.xxx:2181,xxx.xxx.xxx.xxx:2181
      yarn.resourcemanager.recovery.enabled: true
```

(The original text wrote the resource-tracker keys as `yarn.resource.resource-tracker.address.rm1/rm2`; the standard Hadoop property name is `yarn.resourcemanager.resource-tracker.address`, corrected above.)
The YARN high-availability settings must be added under spring -> hadoop -> config, as shown above.
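With `yarn.resourcemanager.ha.enabled: true`, clients do not target a single RM address; instead Hadoop's failover proxy provider tries each ID listed in `yarn.resourcemanager.ha.rm-ids` until one responds. A minimal, self-contained sketch of that selection logic follows; `pickActiveRm` and the boolean "active" map are hypothetical simplifications, not the real `ConfiguredRMFailoverProxyProvider` API:

```java
import java.util.List;
import java.util.Map;

public class RmFailoverSketch {

    // Try each rm-id from yarn.resourcemanager.ha.rm-ids in order and
    // return the first one that answers, approximating client-side failover.
    static String pickActiveRm(List<String> rmIds, Map<String, Boolean> active) {
        for (String id : rmIds) {
            if (active.getOrDefault(id, false)) {
                return id; // first responsive ResourceManager wins
            }
        }
        throw new IllegalStateException("no active ResourceManager");
    }

    public static void main(String[] args) {
        // Simulate rm1 being down after a failover, with rm2 now active.
        String chosen = pickActiveRm(List.of("rm1", "rm2"),
                Map.of("rm1", false, "rm2", true));
        System.out.println(chosen); // prints "rm2"
    }
}
```

In the real provider, the per-ID addresses are resolved from the `yarn.resourcemanager.*.address.rm1/rm2` properties configured above, and retries are governed by the client's retry policy rather than a single pass.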