Wangwang's Kubernetes Series, Part 1: Installing Kubernetes 1.3 on Ubuntu
September 23, 2016

 

1. Environment

1.1 Software and versions

All of the software is the latest release as of this writing, and by the time of publication I had already validated this setup for commercial use at my company, so the whole environment is suitable for production.

1.2 Host planning

  • master and minion: 192.168.1.104 node01
  • minion: 192.168.1.107 node02
  • minion: 192.168.1.108 node03

For the test environment I put the name resolution into /etc/hosts (a snippet for pushing the file to the other nodes follows the listing below); in production, use DNS instead.

root@node01:~# cat /etc/hosts
127.0.0.1    localhost

# The following lines are desirable for IPv6 capable hosts
::1     localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
192.168.1.104 node01
192.168.1.107 node02
192.168.1.108 node03
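
One way to push the same hosts file to the other nodes, a quick sketch assuming the file above is already complete on node01:

root@node01:~# for h in node02 node03; do scp /etc/hosts root@$h:/etc/hosts; done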

1.3 Prerequisites

  • The master must be able to SSH to every node without a password (including the master itself; a non-root account works too). How to set that up is easy to Google, and if you cannot get past this step, Kubernetes is probably not worth attempting yet. A minimal sketch follows this list.
  • Install the bridge utilities: bridge-utils
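
A minimal sketch of the passwordless SSH setup, assuming the root account and a standard OpenSSH client (adjust the user if you go non-root):

root@node01:~# ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa                        # key pair with no passphrase
root@node01:~# for h in node01 node02 node03; do ssh-copy-id root@$h; done     # push the public key to every node, including node01 itself
root@node01:~# ssh node02 hostname                                             # should print node02 without asking for a password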

2. Installing etcd

The etcd that ships with Kubernetes is installed as a single node, while production requires a cluster, so this article goes straight to the clustered deployment.

2.1 Preparing the files:

Download, then copy to every server:

root@node01:~# mkdir /opt/bin

root@node01:~# curl -L https://github.com/coreos/etcd/releases/download/v3.0.9/etcd-v3.0.9-linux-amd64.tar.gz -o etcd-v3.0.9-linux-amd64.tar.gz

root@node01:~# tar xzvf etcd-v3.0.9-linux-amd64.tar.gz

root@node01:~# cp etcd-v3.0.9-linux-amd64/etcd etcd-v3.0.9-linux-amd64/etcdctl /opt/bin/

Check the version:

root@node01:~# /opt/bin/etcd -version
etcd Version: 3.0.9
Git SHA: 494c012
Go Version: go1.6.3
Go OS/Arch: linux/amd64

Copy to all the other nodes that need it:

root@node01:~# ssh node02 -C "mkdir /opt/bin"
root@node01:~# ssh node03 -C "mkdir /opt/bin"

root@node01:~# scp etcd-v3.0.9-linux-amd64/etcd etcd-v3.0.9-linux-amd64/etcdctl node02:/opt/bin
etcd                                                100%   19MB  19.2MB/s   00:01
etcdctl                                             100%   18MB  17.6MB/s   00:00
root@node01:~# scp etcd-v3.0.9-linux-amd64/etcd etcd-v3.0.9-linux-amd64/etcdctl node03:/opt/bin
etcd                                                100%   19MB  19.2MB/s   00:00
etcdctl                                             100%   18MB  17.6MB/s   00:01

Create the environment variables and copy the file to all nodes as well:

root@node01:~# cat /etc/profile.d/k8s.sh
export PATH=$PATH:/opt/bin
export KUBERNETES_PROVIDER=ubuntu
root@node01:~# . /etc/profile.d/k8s.sh
root@node01:~# scp /etc/profile.d/k8s.sh node02:/etc/profile.d/k8s.sh
k8s.sh                                              100%   61     0.1KB/s   00:00
root@node01:~# scp /etc/profile.d/k8s.sh node03:/etc/profile.d/k8s.sh
k8s.sh                                              100%   61     0.1KB/s   00:00

2.2 Starting the etcd cluster

2.2.1 First start:

Run on node01:

root@node01:~# nohup /opt/bin/etcd --name infra0 --data-dir /var/lib/etcd --initial-advertise-peer-urls http://192.168.1.104:2380  --listen-peer-urls http://192.168.1.104:2380   --listen-client-urls http://192.168.1.104:4001,http://127.0.0.1:4001   --advertise-client-urls http://192.168.1.104:4001   --initial-cluster-token etcd-cluster-1   --initial-cluster infra0=http://192.168.1.104:2380,infra1=http://192.168.1.107:2380,infra2=http://192.168.1.108:2380   --initial-cluster-state new &

Run on node02:

root@node02:~# nohup /opt/bin/etcd --name infra1 --data-dir /var/lib/etcd --initial-advertise-peer-urls http://192.168.1.107:2380  --listen-peer-urls http://192.168.1.107:2380   --listen-client-urls http://192.168.1.107:4001,http://127.0.0.1:4001   --advertise-client-urls http://192.168.1.107:4001   --initial-cluster-token etcd-cluster-1   --initial-cluster infra0=http://192.168.1.104:2380,infra1=http://192.168.1.107:2380,infra2=http://192.168.1.108:2380   --initial-cluster-state new &

Run on node03:

root@node03:~# nohup /opt/bin/etcd --name infra2 --data-dir /var/lib/etcd --initial-advertise-peer-urls http://192.168.1.108:2380  --listen-peer-urls http://192.168.1.108:2380   --listen-client-urls http://192.168.1.108:4001,http://127.0.0.1:4001   --advertise-client-urls http://192.168.1.108:4001   --initial-cluster-token etcd-cluster-1   --initial-cluster infra0=http://192.168.1.104:2380,infra1=http://192.168.1.107:2380,infra2=http://192.168.1.108:2380   --initial-cluster-state  new &

Take a look at the log output on node01: the cluster has started up normally, and it really is that simple:

root@node01:~# tail -f nohup.out |grep member
2016-09-21 00:02:04.856809 N | membership: updated the cluster version from 2.3 to 3.0
^C
root@node01:~# cat nohup.out |grep member
2016-09-21 00:01:15.680098 I | etcdserver: member dir = /var/lib/etcd/member
2016-09-21 00:01:15.683067 I | etcdserver: starting member 47c35f4a0d57afa4 in cluster 959b506b320ffe35
2016-09-21 00:01:15.709563 I | membership: added member 47c35f4a0d57afa4 [http://192.168.1.104:2380] to cluster 959b506b320ffe35
2016-09-21 00:01:15.709833 I | membership: added member b8d5a649fb72d51b [http://192.168.1.108:2380] to cluster 959b506b320ffe35
2016-09-21 00:01:15.709972 I | membership: added member bebba08eea6ea060 [http://192.168.1.107:2380] to cluster 959b506b320ffe35
2016-09-21 00:01:44.791273 W | etcdserver: failed to reach the peerURL(http://192.168.1.108:2380) of member b8d5a649fb72d51b (Get http://192.168.1.108:2380/version: dial tcp 192.168.1.108:2380: getsockopt: connection refused)
2016-09-21 00:01:44.791352 W | etcdserver: cannot get the version of member b8d5a649fb72d51b (Get http://192.168.1.108:2380/version: dial tcp 192.168.1.108:2380: getsockopt: connection refused)
2016-09-21 00:01:44.803034 N | membership: set the initial cluster version to 2.3
2016-09-21 00:01:48.797864 W | etcdserver: failed to reach the peerURL(http://192.168.1.108:2380) of member b8d5a649fb72d51b (Get http://192.168.1.108:2380/version: dial tcp 192.168.1.108:2380: getsockopt: connection refused)
2016-09-21 00:01:48.797935 W | etcdserver: cannot get the version of member b8d5a649fb72d51b (Get http://192.168.1.108:2380/version: dial tcp 192.168.1.108:2380: getsockopt: connection refused)
2016-09-21 00:01:52.810961 W | etcdserver: failed to reach the peerURL(http://192.168.1.108:2380) of member b8d5a649fb72d51b (Get http://192.168.1.108:2380/version: dial tcp 192.168.1.108:2380: getsockopt: connection refused)
2016-09-21 00:01:52.811001 W | etcdserver: cannot get the version of member b8d5a649fb72d51b (Get http://192.168.1.108:2380/version: dial tcp 192.168.1.108:2380: getsockopt: connection refused)
2016-09-21 00:01:56.827061 W | etcdserver: failed to reach the peerURL(http://192.168.1.108:2380) of member b8d5a649fb72d51b (Get http://192.168.1.108:2380/version: dial tcp 192.168.1.108:2380: getsockopt: connection refused)
2016-09-21 00:01:56.827111 W | etcdserver: cannot get the version of member b8d5a649fb72d51b (Get http://192.168.1.108:2380/version: dial tcp 192.168.1.108:2380: getsockopt: connection refused)
2016-09-21 00:02:00.829027 W | etcdserver: failed to reach the peerURL(http://192.168.1.108:2380) of member b8d5a649fb72d51b (Get http://192.168.1.108:2380/version: dial tcp 192.168.1.108:2380: getsockopt: connection refused)
2016-09-21 00:02:00.829069 W | etcdserver: cannot get the version of member b8d5a649fb72d51b (Get http://192.168.1.108:2380/version: dial tcp 192.168.1.108:2380: getsockopt: connection refused)
2016-09-21 00:02:04.856809 N | membership: updated the cluster version from 2.3 to 3.0

2.2.2 Subsequent starts:

On the very first start the --initial-cluster-state flag is set to new; on every later start it must be changed to existing. To start etcd at boot I simply put the command into rc.local; you could also write a proper service script for it (a systemd unit sketch follows the rc.local listing below).

root@node01:~# cat /etc/rc.local
#!/bin/sh -e
#
# rc.local
#
# This script is executed at the end of each multiuser runlevel.
# Make sure that the script will "exit 0" on success or any other
# value on error.
#
# In order to enable or disable this script just change the execution
# bits.
#
# By default this script does nothing.
nohup /opt/bin/etcd --name infra0 --data-dir /var/lib/etcd --initial-advertise-peer-urls http://192.168.1.104:2380  --listen-peer-urls http://192.168.1.104:2380   --listen-client-urls http://192.168.1.104:4001,http://127.0.0.1:4001   --advertise-client-urls http://192.168.1.104:4001   --initial-cluster-token etcd-cluster-1   --initial-cluster infra0=http://192.168.1.104:2380,infra1=http://192.168.1.107:2380,infra2=http://192.168.1.108:2380   --initial-cluster-state existing &

exit 0
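
If you would rather have a real service than rc.local, here is a minimal systemd unit sketch for node01, assuming your Ubuntu release runs systemd (16.04 or later); the unit name etcd.service is my own choice, and the other nodes only differ in --name and the IP addresses:

root@node01:~# cat /etc/systemd/system/etcd.service
[Unit]
Description=etcd key-value store
After=network.target

[Service]
ExecStart=/opt/bin/etcd --name infra0 --data-dir /var/lib/etcd \
  --initial-advertise-peer-urls http://192.168.1.104:2380 \
  --listen-peer-urls http://192.168.1.104:2380 \
  --listen-client-urls http://192.168.1.104:4001,http://127.0.0.1:4001 \
  --advertise-client-urls http://192.168.1.104:4001 \
  --initial-cluster-token etcd-cluster-1 \
  --initial-cluster infra0=http://192.168.1.104:2380,infra1=http://192.168.1.107:2380,infra2=http://192.168.1.108:2380 \
  --initial-cluster-state existing
Restart=on-failure

[Install]
WantedBy=multi-user.target

root@node01:~# systemctl daemon-reload && systemctl enable etcd && systemctl start etcd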

Check the cluster status:

root@node01:~# etcdctl member list
47c35f4a0d57afa4: name=infra0 peerURLs=http://192.168.1.104:2380 clientURLs=http://192.168.1.104:4001 isLeader=true
b8d5a649fb72d51b: name=infra2 peerURLs=http://192.168.1.108:2380 clientURLs=http://192.168.1.108:4001 isLeader=false
bebba08eea6ea060: name=infra1 peerURLs=http://192.168.1.107:2380 clientURLs=http://192.168.1.107:4001 isLeader=false

root@node01:~# etcdctl cluster-health
member 47c35f4a0d57afa4 is healthy: got healthy result from http://192.168.1.104:4001
member b8d5a649fb72d51b is healthy: got healthy result from http://192.168.1.108:4001
member bebba08eea6ea060 is healthy: got healthy result from http://192.168.1.107:4001
cluster is healthy

These two commands show the health of the cluster and tell you which node is the leader.

We can also PUT a key/value pair directly to see it in action (an etcdctl equivalent follows the curl example):

root@node01:~# curl -L -X PUT http://192.168.1.104:4001/v2/keys/test -d value="etcdiseasy"
{"action":"set","node":{"key":"/test","value":"etcdiseasy","modifiedIndex":9,"createdIndex":9}}
root@node01:~# curl -Ls http://192.168.1.104:4001/v2/keys/test
{"action":"get","node":{"key":"/test","value":"etcdiseasy","modifiedIndex":9,"createdIndex":9}}

3. Installing Kubernetes

3.1 Preparing the installation files:

Clone the base source code:

root@node01:/opt# git clone https://github.com/kubernetes/kubernetes.git

During installation the scripts will also download the latest Kubernetes release on their own, but that download is slow, so we prepare the release by hand beforehand (a download accelerator is the fastest way) and upload it to the server:

root@node01:/opt# wget https://github.com/kubernetes/kubernetes/releases/download/v1.3.5/kubernetes.tar.gz

Prepare flannel. Here I also download the latest 0.6 release (Kubernetes defaults to 0.5.5):

root@node01:/opt#  wget https://github.com/coreos/flannel/releases/download/v0.6.1/flannel-v0.6.1-linux-amd64.tar.gz

The files after downloading:

root@node01:/opt# ls
bin  flannel-v0.6.1-linux-amd64.tar.gz  kubernetes  kubernetes.tar.gz

3.2 Modifying the install scripts:

3.2.1 Edit the basic configuration script: /opt/kubernetes/cluster/ubuntu/config-default.sh

Good habit: make a backup before editing:

root@node01:/opt/kubernetes/cluster/ubuntu# cp config-default.sh config-default.sh_bak

# First change: the nodes to install on

export nodes=${nodes:-"root@192.168.1.104 root@192.168.1.107 root@192.168.1.108"}

# Node roles: the defaults suit my layout; a means master, i means minion

roles=${roles:-"ai i i"}

# Cluster IP range: the IPs used by Kubernetes services; I put it on the same subnet as the servers

export SERVICE_CLUSTER_IP_RANGE=${SERVICE_CLUSTER_IP_RANGE:-192.168.1.0/24}

# Docker startup options; you usually add your private registry address here:

DOCKER_OPTS=${DOCKER_OPTS:-""}

# DNS configuration: point this at your internal DNS server, or switch to the SkyDNS addon recommended by Kubernetes later

DNS_SERVER_IP=${DNS_SERVER_IP:-"192.168.1.253"}
DNS_DOMAIN=${DNS_DOMAIN:-"toxingwang.com"}

3.2.2 Edit the automatic download script: /opt/kubernetes/cluster/ubuntu/download-release.sh

Good habit: make a backup before editing:

root@node01:/opt/kubernetes/cluster/ubuntu# cp download-release.sh download-release.sh_bak

Modify the flannel download section:

# flannel
FLANNEL_VERSION=${FLANNEL_VERSION:-"v0.6.1"}
echo "Prepare flannel ${FLANNEL_VERSION} release ..."
grep -q "^${FLANNEL_VERSION}\$" binaries/.flannel 2>/dev/null || {
#  curl -L  https://github.com/coreos/flannel/releases/download/v${FLANNEL_VERSION}/flannel-${FLANNEL_VERSION}-linux-amd64.tar.gz -o flannel.tar.gz
  # use the tarball we downloaded by hand instead of fetching it again
  cp /opt/flannel-v0.6.1-linux-amd64.tar.gz flannel.tar.gz
  tar xf flannel.tar.gz
  cp flanneld binaries/master
  cp flanneld binaries/minion

  echo ${FLANNEL_VERSION} > binaries/.flannel
}

Modify the etcd download section: simply comment the whole block out, since our etcd cluster is already running:

# ectd
#ETCD_VERSION=${ETCD_VERSION:-"2.3.1"}
#ETCD="etcd-v${ETCD_VERSION}-linux-amd64"
#echo "Prepare etcd ${ETCD_VERSION} release ..."
#grep -q "^${ETCD_VERSION}\$" binaries/.etcd 2>/dev/null || {
#  curl -L https://github.com/coreos/etcd/releases/download/v${ETCD_VERSION}/${ETCD}.tar.gz -o etcd.tar.gz
#  tar xzf etcd.tar.gz
#  cp ${ETCD}/etcd ${ETCD}/etcdctl binaries/master
#  echo ${ETCD_VERSION} > binaries/.etcd
#}

Modify the Kubernetes download section. Note that there are two places that need this change, and the content is identical:

# k8s
echo "Prepare kubernetes ${KUBE_VERSION} release ..."
grep -q "^${KUBE_VERSION}\$" binaries/.kubernetes 2>/dev/null || {
  curl -L https://github.com/kubernetes/kubernetes/releases/download/v${KUBE_VERSION}/kubernetes.tar.gz -o kubernetes.tar.gz
  # fall back to the tarball we placed in /opt if the download did not produce a file,
  # then extract it either way
  if [ ! -f kubernetes.tar.gz ] ; then
    cp /opt/kubernetes.tar.gz kubernetes.tar.gz
  fi
  tar xzf kubernetes.tar.gz

  pushd kubernetes/server
  tar xzf kubernetes-server-linux-amd64.tar.gz
  popd
  cp kubernetes/server/kubernetes/server/bin/kube-apiserver \
    kubernetes/server/kubernetes/server/bin/kube-controller-manager \
    kubernetes/server/kubernetes/server/bin/kube-scheduler binaries/master
  cp kubernetes/server/kubernetes/server/bin/kubelet \
    kubernetes/server/kubernetes/server/bin/kube-proxy binaries/minion
  cp kubernetes/server/kubernetes/server/bin/kubectl binaries/
  echo ${KUBE_VERSION} > binaries/.kubernetes
}

3.2.3 Edit the etcd addresses and disable the automatic etcd installation: /opt/kubernetes/cluster/ubuntu/util.sh

Change the default etcd connection addresses: open the file in vim and do a bulk substitution:

# In vim command mode, run the following two substitutions:

:%s#127.0.0.1:4001#192.168.1.104:4001,http://192.168.1.107:4001,http://192.168.1.108:4001#g

:%s#${1}:4001#192.168.1.104:4001,http://192.168.1.107:4001,http://192.168.1.108:4001#g
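
If you prefer a non-interactive edit, the same two substitutions can be done with sed; a sketch, where -i.bak keeps a backup copy of util.sh:

root@node01:/opt/kubernetes/cluster/ubuntu# sed -i.bak \
  -e 's#127.0.0.1:4001#192.168.1.104:4001,http://192.168.1.107:4001,http://192.168.1.108:4001#g' \
  -e 's#${1}:4001#192.168.1.104:4001,http://192.168.1.107:4001,http://192.168.1.108:4001#g' util.sh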

Disable the etcd installation: delete the lines that start etcd (most of them are multi-line commands, so commenting them out causes problems). If you are worried about getting it wrong, you can download my already-modified copy. The deletions are:

  • Delete line 475.
  • Delete line 495 (line 496 before the earlier deletion; be especially careful not to delete the wrong line).
  • Delete line 647 (still line 647 in the original; again, don't delete the wrong one).
  • Delete line 681.
  • Delete lines 721 through 727.
  • Delete line 801 and the following lines that mention etcd.
  • Delete lines 915 through 921.

The sole purpose of this step is to stop the installer from setting up etcd itself, so look through the script carefully; cleaning up after an error during installation is painful. Don't delete too much either. This part takes some shell scripting background so that you can actually read the script. If it keeps causing errors, skip the earlier etcd cluster setup, fall back to the default single-node etcd, and test with that first.
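
Line numbers drift between Kubernetes releases, so locate the etcd-related lines yourself before deleting anything; a quick way to list them with their line numbers:

root@node01:/opt/kubernetes/cluster/ubuntu# grep -n etcd util.sh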

3.3 Running the installation:

Install script: /opt/kubernetes/cluster/kube-up.sh

Cleanup script: /opt/kubernetes/cluster/kube-down.sh

The installation can still run into all sorts of unpredictable problems, which you have to handle based on the actual error messages. Sometimes simply running it a second time is enough.

You can also run bash -x /opt/kubernetes/cluster/kube-up.sh to trace the install script as it executes.

Once the installation finishes, copy the kubectl management tool into place:

root@node01:~# cp /opt/kubernetes/cluster/ubuntu/binaries/kubectl /opt/bin/
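
A quick sanity check that kubectl can reach the API server; a sketch, assuming the master is listening on the default insecure port 8080:

root@node01:~# kubectl -s http://192.168.1.104:8080 cluster-info
root@node01:~# kubectl -s http://192.168.1.104:8080 get componentstatuses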

3.4 A few pitfalls during installation:

  • The installation hangs at the following point and gives up after a timeout:

{"Network":"172.16.0.0/16", "Backend": {"Type": "vxlan"}}

Fix:

service flanneld restart

service flanneld stop

Then run the install script again.
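
It is also worth verifying that the flannel network config actually made it into etcd; a sketch, assuming flannel's default etcd prefix:

root@node01:~# etcdctl get /coreos.com/network/config    # should return the JSON shown above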

  • You see a message like this:

(kubectl failed, will retry 1 times)
The connection to the server 192.168.1.104:8080 was refused - did you specify the right host or port?

Fix: start the Kubernetes master services by hand:

service kube-apiserver start

service kube-scheduler start

service kube-controller-manager start

  • A node stays in the not-ready state forever:

('kubectl get nodes' failed, giving up)
Waiting for 3 ready nodes. 2 ready nodes, 2 registered. Retrying.

Fix: find out which node's kubelet and kube-proxy services are not running and start them by hand (example below).
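
For example, if node03 is the one that never becomes ready (the Ubuntu scripts install kubelet and kube-proxy as regular system services on each minion):

root@node03:~# service kubelet start
root@node03:~# service kube-proxy start
root@node03:~# service kubelet status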

4. Verifying the Kubernetes cluster

  • Check node status:

root@node01:/opt/bin# kubectl get node
NAME            STATUS    AGE
192.168.1.104   Ready     8m
192.168.1.107   Ready     10m
192.168.1.108   Ready     10m

  • List the RC, pod, and svc resources in the default namespace:

root@node01:~# kubectl get all
NAME         CLUSTER-IP    EXTERNAL-IP   PORT(S)   AGE
kubernetes   192.168.1.1   <none>        443/TCP   12m
