Part 1. Environment and preparation
Hardware: four virtual machines: master1 (192.168.110.20), master2 (192.168.110.21), slave1 (192.168.110.22) and slave2 (192.168.110.23).
OS: CentOS 6.5 (Red Hat family).
Hadoop version: hadoop-2.0.0-alpha (the latest release at the time of writing); the package is hadoop-2.0.0-alpha.tar.gz.
Official download address: http://apache.etoak.com/hadoop/common/hadoop-2.0.0-alpha/
JDK version: jdk-6u6-linux-i586.bin (JDK 1.6 is the minimum requirement).
Installing the virtual machines and Linux itself is not covered here; there are plenty of guides on Google.
Create the required directories: mkdir /usr/hadoop (Hadoop installation directory) and mkdir /usr/java (JDK installation directory).

Part 2. Install the JDK (identical on every node)
1. Upload the downloaded jdk-6u6-linux-i586.bin to /usr/java over SSH.
2. Enter the JDK installation directory with cd /usr/java and run chmod +x jdk-6u6-linux-i586.bin.
3. Run ./jdk-6u6-linux-i586.bin (press Enter all the way through and answer yes to every yes/no prompt; it prints "done" when the installation succeeds).
4. Configure the environment variables: cd /etc, then vi profile and append the following at the end of the file (adjust JAVA_HOME to whatever directory the installer actually created, for example jdk1.6.0_06 for the 6u6 package):
export JAVA_HOME=/usr/java/jdk1.6.0_27
export CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar
export PATH=$JAVA_HOME/bin:$PATH
5. Run chmod +x profile to make it executable.
6. Run source profile so the settings take effect immediately.
7. Run java -version to verify the installation.
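A quick sanity check before moving on (just a sketch; the exact version string depends on which JDK update you actually installed):
source /etc/profile
echo $JAVA_HOME    # should print the /usr/java/jdk1.6.0_xx directory set above
which java         # should resolve to $JAVA_HOME/bin/java
java -version      # should report a 1.6.x JDK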
Part 3. Change the hostnames (same configuration on every node)
1. Connect to the main node, 192.168.110.20, and edit the network file: cd /etc/sysconfig, then vi network and change the line to HOSTNAME=master1 (a sketch of the resulting file is shown at the end of this section).
2. Edit the hosts file: cd /etc, then vi hosts and append the following lines at the end:
192.168.110.20 master1
192.168.110.21 master2
192.168.110.22 slave1
192.168.110.23 slave2
3. Run hostname master1.
& N+ ?3 |; P: C" R7 i- z4,执行exit后重新连接可看到主机名以修改OK四,配置SSH无密码登陆3 d6 E& f# ~9 {) K1 x: r5 t6 i
Part 4. Configure passwordless SSH login
1. How passwordless SSH works, in brief: a key pair (one public key and one private key) is generated on the master, and the public key is copied to every slave. When the master then connects to a slave over SSH, the slave generates a random number, encrypts it with the master's public key and sends it to the master. The master decrypts it with its private key and returns the result; once the slave confirms the decrypted value is correct, it allows the master to log in without a password.
2. The concrete steps:
(1) Run ssh-keygen -t rsa and press Enter through all the prompts, then inspect the newly generated key pair: cd .ssh followed by ll.
(2) Append id_rsa.pub to the authorized keys: cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
(3) Fix the permissions: chmod 600 ~/.ssh/authorized_keys
(4) Make sure cat /etc/ssh/sshd_config shows the following lines:
RSAAuthentication yes
PubkeyAuthentication yes
AuthorizedKeysFile .ssh/authorized_keys
If you have to change them, restart the SSH service afterwards so the changes take effect: service sshd restart
(5) Copy the public key to every slave machine: scp ~/.ssh/id_rsa.pub 192.168.110.22:~/ then type yes and finally enter the slave machine's password.
(6) On the slave machine, create the .ssh directory: mkdir ~/.ssh followed by chmod 700 ~/.ssh (skip this if the directory already exists).
(7) Append the key to the authorized file: cat ~/id_rsa.pub >> ~/.ssh/authorized_keys then chmod 600 ~/.ssh/authorized_keys
(8) Repeat step (4) on the slave.
(9) Verify: on the master, run ssh 192.168.110.22; if the prompt changes from master1 to slave1 the setup works. Finally delete the copied key file: rm -r id_rsa.pub
3. Repeat the steps above for master1, master2, slave1 and slave2, so that every master can log in to every slave without a password (a condensed script sketch of this round-trip is shown right below).
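The per-pair steps above can be condensed into a small script. The following is only a sketch, not part of the original write-up: run it as root on each master, adjust the node list to the machines that master still has to reach, and expect to type each remote password a couple of times until the key is in place.
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa   # step (1): generate a key pair if none exists yet
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys                    # steps (2)-(3): authorize the key locally
chmod 600 ~/.ssh/authorized_keys
for node in 192.168.110.21 192.168.110.22 192.168.110.23; do       # list as seen from master1; adjust on master2
    scp ~/.ssh/id_rsa.pub $node:~/                                 # step (5): copy the public key over
    ssh $node 'mkdir -p ~/.ssh && chmod 700 ~/.ssh && cat ~/id_rsa.pub >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys && rm -f ~/id_rsa.pub'   # steps (6)-(9)
done
Afterwards, ssh 192.168.110.22 hostname should print slave1 without asking for a password.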
Part 5. Install Hadoop (same on every node)
1. Upload hadoop-2.0.0-alpha.tar.gz to the Hadoop installation directory /usr/hadoop.
2. Unpack the archive: tar -zxvf hadoop-2.0.0-alpha.tar.gz
3. Create the tmp directory: mkdir /usr/hadoop/tmp
& b3 {5 n8 z9 \3 n: O4,配置环境变量:vi /etc/profile. Q% I6 k8 R: \- b$ W
export HADOOP_DEV_HOME=/usr/hadoop/hadoop-2.0.0-alpha- M0 L9 V" }' v7 j; t3 l
export PATH=$PATH:$HADOOP_DEV_HOME/bin, m3 j6 D9 g1 }0 u7 ~8 r P+ @
export PATH=$PATH:$HADOOP_DEV_HOME/sbin1 e3 d: H N) a
export HADOOP_MAPARED_HOME=${HADOOP_DEV_HOME}4 Q$ a1 w8 N5 ^1 h( T! l- \
export HADOOP_COMMON_HOME=${HADOOP_DEV_HOME}6 b6 S/ P* a# Y0 ~! }+ m* P8 x1 z% X
export HADOOP_HDFS_HOME=${HADOOP_DEV_HOME}) m# C* t+ s' N
export YARN_HOME=${HADOOP_DEV_HOME}7 g% I% n2 M) M4 V; g/ ]
export HADOOP_CONF_DIR=${HADOOP_DEV_HOME}/etc/hadoop
, W1 ?0 W% {6 S; ]% v. Dexport HDFS_CONF_DIR=${HADOOP_DEV_HOME}/etc/hadoop
0 \2 v9 ]% F4 o T' `- Q2 Eexport YARN_CONF_DIR=${HADOOP_DEV_HOME}/etc/hadoop( x6 Y1 p- R0 {6 m8 \1 W' i. x
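A quick way to confirm the new variables are picked up (a sketch; hadoop version only works once the tarball from step 2 has been unpacked):
source /etc/profile
echo $HADOOP_CONF_DIR   # should print /usr/hadoop/hadoop-2.0.0-alpha/etc/hadoop
hadoop version          # should report Hadoop 2.0.0-alpha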
5. Configure Hadoop itself.
The configuration files live under /usr/hadoop/hadoop-2.0.0-alpha/etc/hadoop.
(1) Create and configure hadoop-env.sh:
vi /usr/hadoop/hadoop-2.0.0-alpha/etc/hadoop/hadoop-env.sh and append export JAVA_HOME=/usr/java/jdk1.6.0_27 at the end.
(2) Configure core-site.xml. Only hadoop.tmp.dir is set; no default filesystem is configured, which is why every HDFS command later in this guide spells out the full hdfs:// URI:
<configuration>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/usr/hadoop/tmp</value>
</property>
</configuration>
(3) Create and configure the slaves file: vi slaves and add the following lines:
192.168.110.22
192.168.110.23
(4) Configure hdfs-site.xml:
<configuration>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/usr/hadoop/hdfs/name</value>
  <final>true</final>
</property>
<property>
  <name>dfs.federation.nameservice.id</name>
  <value>ns1</value>
</property>
<property>
  <name>dfs.namenode.backup.address.ns1</name>
  <value>192.168.110.23:50100</value>
</property>
<property>
  <name>dfs.namenode.backup.http-address.ns1</name>
  <value>192.168.110.23:50105</value>
</property>
<property>
  <name>dfs.federation.nameservices</name>
  <value>ns1,ns2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.ns1</name>
  <value>192.168.110.20:9000</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.ns2</name>
  <value>192.168.110.21:9000</value>
</property>
<property>
  <name>dfs.namenode.http-address.ns1</name>
  <value>192.168.110.20:23001</value>
</property>
<property>
  <name>dfs.namenode.http-address.ns2</name>
  <value>192.168.110.21:13001</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/usr/hadoop/hdfs/data</value>
  <final>true</final>
</property>
<property>
  <name>dfs.namenode.secondary.http-address.ns1</name>
  <value>192.168.110.20:23002</value>
</property>
<property>
  <name>dfs.namenode.secondary.http-address.ns2</name>
  <value>192.168.110.21:23002</value>
</property>
<property>
  <name>dfs.namenode.secondary.http-address.ns1</name>
  <value>192.168.110.20:23003</value>
</property>
<property>
  <name>dfs.namenode.secondary.http-address.ns2</name>
  <value>192.168.110.21:23003</value>
</property>
</configuration>
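A few notes on this file. It sets up HDFS Federation with two namespaces: ns1 is served by the namenode on master1 (192.168.110.20) and ns2 by the namenode on master2 (192.168.110.21), which is why each rpc-address and http-address property carries a nameservice suffix. The dfs.namenode.secondary.http-address.ns1 and .ns2 properties appear twice, with ports 23002 and 23003; when a property is defined more than once in the same file, Hadoop normally keeps the last value it reads, so the 23003 entries are the effective ones. Also note that the datanode directory key must be spelled dfs.datanode.data.dir; the variant dfs.dataname.data.dir that sometimes circulates is silently ignored, in which case block data ends up under hadoop.tmp.dir instead of /usr/hadoop/hdfs/data.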
<configuration><!-- Site specific YARN configuration properties -->
1 a ~% }5 T3 X9 g<property>
3 s" M0 w- u5 ^ <name>yarn.resourcemanager.address</name>1 i# ]* k: J5 ~3 Z& k1 E+ _
<value>192.168.110.20:18040</value>0 H7 @" \0 {9 K) j( r7 f
</property><property>5 G0 @, w, T2 e
<name>yarn.resourcemanager.scheduler.address</name>
- V1 o& r1 @6 ] <value>192.168.110.20:18030</value>% v: R6 |9 U- \8 m H1 d
</property><property>
; r6 b1 s1 o4 ]* @/ p/ n; s <name>yarn.resourcemanager.webapp.address</name>
! f2 g" D) u, _ <value>192.168.110.20:18088</value>
8 g( B, S$ G6 Y6 a- |7 j! d8 t</property><property>9 R" }' y4 r: I. f. I3 t
<name>yarn.resourcemanager.resource-tracker.address</name>% Q1 U) X) X( q9 X6 o0 N3 h. c& H
<value>192.168.110.20:18025</value>
4 E5 B( L& V% q</property><property>
q9 Z" g& `4 `; p <name>yarn.resourcemanager.admin.address</name>6 G4 n. o& [3 x: i% g K
<value>192.168.110.20:18141</value>$ b" q. u& ^4 M5 Z) R
</property><property>3 @5 A: {5 |7 y( G: U
<name>yarn.nodemanager.aux-services</name>
! T) M3 i. x# p* Y- e- c% x <value>mapreduce.shuffle</value>
" {: ?* a! [5 I3 |1 p</property>
7 ^# [: V6 J: u: f0 H, ] b</configuration>六,启动HADOOP集群,并测试WORDCOUNT( d" _& _, w0 h
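One version-specific caveat: the value mapreduce.shuffle for yarn.nodemanager.aux-services is what hadoop-2.0.0-alpha expects. Later 2.x releases renamed the service to mapreduce_shuffle (with a matching yarn.nodemanager.aux-services.mapreduce_shuffle.class property), so keep that in mind if you reuse this file on a newer release.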
Part 6. Start the Hadoop cluster and test WordCount
1. Format the namenodes. Run this on both masters: hadoop namenode -format -clusterid eric (the same cluster ID, eric here, must be used on both so that the two namenodes join the same federated cluster).
& W8 r/ V; m: L! D K0 u" u7 Y2,启动HADOOP:在master1执行start-all.sh或先执行start-dfs.sh再执行start-yarn.sh
3. Run the jps command on each node; output like the following means the cluster started successfully:
[root@master1 hadoop]# jps
1956 Bootstrap
4183 Jps
3938 ResourceManager
3845 SecondaryNameNode
3652 NameNode
[root@master2 ~]# jps
3778 Jps
1981 Bootstrap
3736 SecondaryNameNode
3633 NameNode
[root@slave1 ~]# jps
3766 Jps
3675 NodeManager
3551 DataNode
[root@slave1 ~]# jps
3675 NodeManager
3775 Jps
3551 DataNode
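If the jps output does not match, the daemon logs (by default under /usr/hadoop/hadoop-2.0.0-alpha/logs) are the first place to look. A quick functional check of the ns1 namespace could look like this (a sketch, not part of the original steps):
hadoop fs -ls hdfs://192.168.110.20:9000/              # should return cleanly (possibly listing nothing yet)
hdfs dfsadmin -fs hdfs://192.168.110.20:9000 -report   # should show both datanodes as live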
4. On master1, create the input directory: hadoop fs -mkdir hdfs://192.168.110.20:9000/input
5. Copy all the .txt files under /usr/hadoop/hadoop-2.0.0-alpha/ into that directory on the distributed filesystem:
hadoop fs -put /usr/hadoop/hadoop-2.0.0-alpha/*.txt hdfs://192.168.110.20:9000/input
6. On master1, run the wordcount example that ships with Hadoop:
cd /usr/hadoop/hadoop-2.0.0-alpha/share/hadoop/mapreduce
hadoop jar hadoop-mapreduce-examples-2.0.0-alpha.jar wordcount hdfs://192.168.110.20:9000/input hdfs://192.168.110.20:9000/output
7. On master1, check the results:
[root@master1 hadoop]# hadoop fs -ls hdfs://192.168.110.20:9000/output
Found 2 items
-rw-r--r-- 2 root supergroup 0 2012-06-29 22:59 hdfs://192.168.110.20:9000/output/_SUCCESS
-rw-r--r-- 2 root supergroup 8739 2012-06-29 22:59 hdfs://192.168.110.20:9000/output/part-r-00000
[root@master1 hadoop]# hadoop fs -ls hdfs://192.168.110.20:9000/input
Found 3 items
-rw-r--r-- 2 root supergroup 15164 2012-06-29 22:55 hdfs://192.168.110.20:9000/input/LICENSE.txt
-rw-r--r-- 2 root supergroup 101 2012-06-29 22:55 hdfs://192.168.110.20:9000/input/NOTICE.txt
-rw-r--r-- 2 root supergroup 1366 2012-06-29 22:55 hdfs://192.168.110.20:9000/input/README.txt
[root@master1 hadoop]# hadoop fs -cat hdfs://192.168.110.20:9000/output/part-r-00000
This prints the count for every word.
8. You can also open http://192.168.110.20:23001/dfshealth.jsp in a browser to check the state of the filesystem.
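When you want to shut the cluster down again, the reverse of step 2 works (a sketch, run on master1):
stop-yarn.sh
stop-dfs.sh
# or simply: stop-all.sh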
That is the whole process.