
Ubuntu 20.04: Installing APISIX Dashboard 2.15.x from Source

Posted on 2023-04-17 |
Word count: 911

[TOC]

Preparing the Node.js and Go environments

Install Node.js

wget https://nodejs.org/download/release/v14.21.3/node-v14.21.3-linux-x64.tar.gz
tar xzvf node-v14.21.3-linux-x64.tar.gz -C /opt
mv /opt/node-v14.21.3-linux-x64/ /opt/node/

Note: do not use a Node.js version that is too new. Building with Node 18 fails; Node 14 to 16 is recommended.

Configure environment variables

tee -a /etc/profile << 'EOF'
export NODE_HOME=/opt/node
export PATH=$PATH:$NODE_HOME/bin
EOF

Apply the environment variables

source /etc/profile

Install yarn

npm install -g yarn
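
A quick sanity check that the toolchain is on PATH (run after sourcing /etc/profile; the exact version strings depend on the archives downloaded above):

node -v    # should print v14.21.3 for the tarball above
npm -v
yarn -v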

Install Go

You can find the version you need at https://mirrors.aliyun.com/golang/

wget https://mirrors.aliyun.com/golang/go1.20.3.linux-amd64.tar.gz
tar xzvf go1.20.3.linux-amd64.tar.gz -C /opt/

tee -a /etc/profile << 'EOF'
export GO_HOME=/opt/go
export PATH=$PATH:$GO_HOME/bin
EOF

# one-off Go settings (these do not belong in /etc/profile)
/opt/go/bin/go env -w GO111MODULE=on
/opt/go/bin/go env -w GOPROXY=https://goproxy.cn,direct

Apply the environment variables

source /etc/profile

Install the APISIX Dashboard

Download the source

wget https://github.com/apache/apisix-dashboard/archive/refs/tags/v2.15.1.tar.gz

Extract the source

tar xzvf v2.15.1.tar.gz

cd apisix-dashboard-2.15.1

Build

make build

The build takes roughly 7 minutes and finishes with output like the following

... earlier output omitted ...

The bundle size is significantly larger than recommended.
Consider reducing it with code splitting: https://umijs.org/docs/load-on-demand
You can also analyze the project dependencies using ANALYZE=1

Done in 433.32s.

After the build finishes, copy the output directory to /usr/local/apisix/dashboard

mkdir -p /usr/local/apisix/dashboard && cp -r ./output/* /usr/local/apisix/dashboard

Modify the configuration

If etcd is accessed over HTTP, you only need to configure the etcd endpoints.

If etcd is accessed over HTTPS, the relevant certificates are required.

Copy the root CA certificate, the etcd-server key and the etcd-server certificate into the apisix directory

mkdir -p /usr/local/apisix/ssl

# adjust the source paths to your environment
cp /opt/etcd3/ssl/{etcd-ca.pem,etcd-server-key.pem,etcd-server.pem} /usr/local/apisix/ssl/

Edit /usr/local/apisix/dashboard/conf/conf.yaml

# ... rest omitted ...
etcd:
  endpoints:                     # supports defining multiple etcd host addresses for an etcd cluster
    - 10.1.80.91:2379
    - 10.1.80.92:2379
    - 10.1.80.93:2379
  # yamllint disable rule:comments-indentation
  # etcd basic auth info
  # username: "root"             # ignore etcd username if not enable etcd auth
  # password: "123456"           # ignore etcd password if not enable etcd auth
  mtls:
    key_file: "/usr/local/apisix/ssl/etcd-server-key.pem"   # Path of your self-signed client side key
    cert_file: "/usr/local/apisix/ssl/etcd-server.pem"      # Path of your self-signed client side cert
    ca_file: "/usr/local/apisix/ssl/etcd-ca.pem"            # Path of your self-signed ca cert, the CA is used to sign callers' certificates
  # prefix: /apisix              # apisix config's prefix in etcd, /apisix by default
# ... rest omitted ...

Run the APISIX Dashboard

Run apisix-dashboard in the foreground (not recommended)

cd /usr/local/apisix/dashboard
./manager-api

Install it as a systemd service (recommended)

tee /usr/lib/systemd/system/apisix-dashboard.service << EOF
[Unit]
Description=apisix-dashboard
After=network-online.target

[Service]
WorkingDirectory=/usr/local/apisix/dashboard
ExecStart=/usr/local/apisix/dashboard/manager-api -c /usr/local/apisix/dashboard/conf/conf.yaml

[Install]
WantedBy=multi-user.target
EOF

Managing the service

# reload systemd units
systemctl daemon-reload

# enable apisix-dashboard
systemctl enable apisix-dashboard

# start apisix-dashboard
systemctl start apisix-dashboard

# stop apisix-dashboard
systemctl stop apisix-dashboard

# check apisix-dashboard status
systemctl status apisix-dashboard

Visit http://10.1.80.91:9000/

If you see the following error, your machine's IP must be whitelisted: edit conf.yaml and add the IP address or CIDR range to allow_list

{
Code: 20002,
Message: "IP address not allowed",
Data: null,
SourceSrv: ""
}

Edit /usr/local/apisix/dashboard/conf/conf.yaml and add the allowed networks

# ... rest omitted ...
allow_list:              # If we don't set any IP list, then any IP access is allowed by default.
  - 127.0.0.1            # The rules are checked in sequence until the first match is found.
  - 10.1.80.0/24
  - ::1                  # In this example, access is allowed for 127.0.0.1, the 10.1.80.0/24 network and the IPv6 loopback ::1.
                         # CIDR notation such as 192.168.1.0/24 and 2001:0db8::/32 is also supported.
# ... rest omitted ...

Ubuntu 20.04: Installing APISIX 2.15.3 from Source

Posted on 2023-04-17 |
Word count: 4,315

[TOC]

Installing OpenResty from source

Preparation

You must have perl 5.6.1+, libpcre and libssl installed on your machine. On Linux, run ldconfig to confirm they can be found on the system library path.
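
A quick way to confirm those libraries are visible before building (a hedged check; exact package names vary by distribution):

ldconfig -p | grep -E 'libpcre|libssl'
perl -v | head -n 2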

The packages installed below are taken from https://raw.githubusercontent.com/apache/apisix/master/utils/linux-install-luarocks.sh (master can be replaced with a specific version tag)

sudo apt-get install make gcc g++ build-essential curl unzip
sudo apt-get install -y libssl-dev perl zlib1g-dev libpcre3 libpcre3-dev libldap2-dev libpq-dev

Build and install OpenResty

Download and extract openresty-$VERSION.tar.gz

mkdir -p /opt/src && cd /opt/src
curl -O https://openresty.org/download/openresty-1.21.4.1.tar.gz
tar xzvf openresty-1.21.4.1.tar.gz -C /opt/src
cd /opt/src/openresty-1.21.4.1

./configure --prefix=/usr/local/openresty \
--with-poll_module \
--with-pcre-jit \
--without-http_rds_json_module \
--without-http_rds_csv_module \
--without-lua_rds_parser \
--with-stream \
--with-stream_ssl_module \
--with-stream_ssl_preread_module \
--with-http_v2_module \
--without-mail_pop3_module \
--without-mail_imap_module \
--without-mail_smtp_module \
--with-http_stub_status_module \
--with-http_realip_module \
--with-http_addition_module \
--with-http_auth_request_module \
--with-http_secure_link_module \
--with-http_random_index_module \
--with-http_gzip_static_module \
--with-http_sub_module \
--with-http_dav_module \
--with-http_flv_module \
--with-http_mp4_module \
--with-http_gunzip_module \
--with-threads \
--with-compat \
-j`nproc`
make -j`nproc` && make install

Install the OpenResty companion packages

openresty-openssl111-dev openresty-pcre-dev openresty-zlib-dev

First install a few prerequisites needed for adding the GPG public key; they can be removed afterwards.

sudo apt-get -y install --no-install-recommends wget gnupg ca-certificates

Then import OpenResty's GPG key

wget -O - https://openresty.org/package/pubkey.gpg | sudo apt-key add -

Next, add the official OpenResty APT repository

echo "deb https://openresty.org/package/ubuntu $(lsb_release -sc) main" > openresty.list
sudo cp openresty.list /etc/apt/sources.list.d/

Note that this is for x86_64 / amd64 systems.

For aarch64 / arm64 systems, use this URL instead

echo "deb https://openresty.org/package/arm64/ubuntu $(lsb_release -sc) main"

Now update the APT index

sudo apt-get update

Install the OpenResty companion packages

sudo apt-get -y install openresty-openssl111-dev openresty-pcre-dev openresty-zlib-dev

After installation, the /usr/local/openresty directory is created, with the subdirectories /usr/local/openresty/openssl111, /usr/local/openresty/pcre and /usr/local/openresty/zlib underneath it.
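
A quick check that the expected directories are there (a sanity check only; contents vary with package versions):

ls /usr/local/openresty
# expect to see openssl111, pcre and zlib among the subdirectories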

Install LuaRocks

wget https://luarocks.org/releases/luarocks-3.8.0.tar.gz
tar -xzvf luarocks-3.8.0.tar.gz
cd luarocks-3.8.0

./configure --prefix=/usr/local/openresty/luajit \
--with-lua=/usr/local/openresty/luajit/ \
--lua-suffix=jit \
--with-lua-include=/usr/local/openresty/luajit/include/luajit-2.1

make && make install

The configure output looks like this

root@s91:/opt/src/apisix-3.2.0/utils/luarocks-3.8.0# ./configure --prefix=/usr/local/openresty/luajit \
> --with-lua=/usr/local/openresty/luajit/ \
> --lua-suffix=jit \
> --with-lua-include=/usr/local/openresty/luajit/include/luajit-2.1
--lua-suffix is no longer necessary.
The suffix is automatically detected.

Configuring LuaRocks version 3.8.0...

Lua version detected: 5.1
Lua interpreter found: /usr/local/openresty/luajit/bin/luajit
lua.h found: /usr/local/openresty/luajit/include/luajit-2.1/lua.h
unzip found in PATH: /usr/bin

Done configuring.

LuaRocks will be installed at......: /usr/local/openresty/luajit
LuaRocks will install rocks at.....: /usr/local/openresty/luajit
LuaRocks configuration directory...: /usr/local/openresty/luajit/etc/luarocks
Using Lua from.....................: /usr/local/openresty/luajit
Lua include directory..............: /usr/local/openresty/luajit/include/luajit-2.1

* Type make and make install:
to install to /usr/local/openresty/luajit as usual.
* Type make bootstrap:
to install LuaRocks into /usr/local/openresty/luajit as a rock.

Set environment variables

vi /etc/profile

Append the following at the end of /etc/profile

export OPENRESTY_HOME=/usr/local/openresty
export PATH=$PATH:$OPENRESTY_HOME/bin:$OPENRESTY_HOME/luajit/bin

Apply the environment variables

source /etc/profile

Rebuild OpenResty

This rebuild mainly adds the modules APISIX requires.

To enable etcd client certificate you need to build APISIX-Base, see
https://apisix.apache.org/docs/apisix/FAQ#how-do-i-build-the-apisix-base-environment

The OpenResty build script below is the one provided by APISIX.
I changed the version to 2.15.3, i.e. version=${version:-0.0.0} became version=${version:-2.15.3}.
Because git clone fails frequently (GitHub can be unreliable; a proxy helps), I modified the script so that all required packages can be downloaded in advance (the source packages are available on Baidu Netdisk: https://pan.baidu.com/s/1X8U4_pIL86QH3wJuhrv6LQ?pwd=k8sl).

Put the downloaded archives in the same directory as the build script beforehand. The script below is my adapted version; it handles two cases: 1) some archives extract into a directory without the leading v even though the filename contains it, and 2) some extract into a directory with an uppercase name.

#!/usr/bin/env bash
set -euo pipefail
set -x

version=${version:-2.15.3}

if ([ $# -gt 0 ] && [ "$1" == "latest" ]) || [ "$version" == "latest" ]; then
ngx_multi_upstream_module_ver="master"
mod_dubbo_ver="master"
apisix_nginx_module_ver="main"
wasm_nginx_module_ver="main"
lua_var_nginx_module_ver="master"
grpc_client_nginx_module_ver="main"
amesh_ver="main"
debug_args="--with-debug"
OR_PREFIX=${OR_PREFIX:="/usr/local/openresty-debug"}
else
ngx_multi_upstream_module_ver="1.1.1"
mod_dubbo_ver="1.0.2"
apisix_nginx_module_ver="1.12.0"
wasm_nginx_module_ver="0.6.4"
lua_var_nginx_module_ver="v0.5.3"
grpc_client_nginx_module_ver="v0.4.2"
amesh_ver="main"
debug_args=${debug_args:-}
OR_PREFIX=${OR_PREFIX:="/usr/local/openresty"}
fi

prev_workdir="$PWD"
repo=$(basename "$prev_workdir")
workdir=$(mktemp -d)
cd "$workdir" || exit 1

or_ver="1.21.4.1"
wget --no-check-certificate https://openresty.org/download/openresty-${or_ver}.tar.gz
tar -zxvpf openresty-${or_ver}.tar.gz > /dev/null

if [ "$repo" == ngx_multi_upstream_module ]; then
cp -r "$prev_workdir" ./ngx_multi_upstream_module-${ngx_multi_upstream_module_ver}
else
unzip ${prev_workdir}/ngx_multi_upstream_module-${ngx_multi_upstream_module_ver}.zip
fi

if [ "$repo" == mod_dubbo ]; then
cp -r "$prev_workdir" ./mod_dubbo-${mod_dubbo_ver}
else
unzip ${prev_workdir}/mod_dubbo-${mod_dubbo_ver}.zip
fi

if [ "$repo" == apisix-nginx-module ]; then
cp -r "$prev_workdir" ./apisix-nginx-module-${apisix_nginx_module_ver}
else
unzip ${prev_workdir}/apisix-nginx-module-${apisix_nginx_module_ver}.zip
fi

if [ "$repo" == wasm-nginx-module ]; then
cp -r "$prev_workdir" ./wasm-nginx-module-${wasm_nginx_module_ver}
else
unzip ${prev_workdir}/wasm-nginx-module-${wasm_nginx_module_ver}.zip
fi

if [ "$repo" == lua-var-nginx-module ]; then
cp -r "$prev_workdir" ./lua-var-nginx-module-${lua_var_nginx_module_ver}
else
unzip ${prev_workdir}/lua-var-nginx-module-${lua_var_nginx_module_ver#*v}.zip
fi

if [ "$repo" == grpc-client-nginx-module ]; then
cp -r "$prev_workdir" ./grpc-client-nginx-module-${grpc_client_nginx_module_ver}
else
unzip ${prev_workdir}/grpc-client-nginx-module-${grpc_client_nginx_module_ver#*v}.zip
fi

if [ "$repo" == amesh ]; then
cp -r "$prev_workdir" ./amesh-${amesh_ver}
else
unzip ${prev_workdir}/Amesh-${amesh_ver}.zip
fi

cd ngx_multi_upstream_module-${ngx_multi_upstream_module_ver} || exit 1
./patch.sh ../openresty-${or_ver}
cd ..

cd apisix-nginx-module-${apisix_nginx_module_ver}/patch || exit 1
./patch.sh ../../openresty-${or_ver}
cd ../..

cd wasm-nginx-module-${wasm_nginx_module_ver} || exit 1
./install-wasmtime.sh
cd ..

cc_opt=${cc_opt:-}
ld_opt=${ld_opt:-}
luajit_xcflags=${luajit_xcflags:="-DLUAJIT_NUMMODE=2 -DLUAJIT_ENABLE_LUA52COMPAT"}
no_pool_patch=${no_pool_patch:-}
# TODO: remove old NGX_HTTP_GRPC_CLI_ENGINE_PATH once we have released a new
# version of grpc-client-nginx-module
grpc_engine_path="-DNGX_GRPC_CLI_ENGINE_PATH=$OR_PREFIX/libgrpc_engine.so -DNGX_HTTP_GRPC_CLI_ENGINE_PATH=$OR_PREFIX/libgrpc_engine.so"

cd openresty-${or_ver} || exit 1
# FIXME: remove this once 1.21.4.2 is released
rm -rf bundle/LuaJIT-2.1-20220411
lj_ver=2.1-20230119
wget "https://github.com/openresty/luajit2/archive/v$lj_ver.tar.gz" -O "LuaJIT-$lj_ver.tar.gz"
tar -xzf LuaJIT-$lj_ver.tar.gz
mv luajit2-* bundle/LuaJIT-2.1-20220411

# ${lua_var_nginx_module_ver#*v} strips the leading v

./configure --prefix="$OR_PREFIX" \
--with-cc-opt="-DAPISIX_BASE_VER=$version $grpc_engine_path $cc_opt" \
--with-ld-opt="-Wl,-rpath,$OR_PREFIX/wasmtime-c-api/lib $ld_opt" \
$debug_args \
--add-module=../mod_dubbo-${mod_dubbo_ver} \
--add-module=../ngx_multi_upstream_module-${ngx_multi_upstream_module_ver} \
--add-module=../apisix-nginx-module-${apisix_nginx_module_ver} \
--add-module=../apisix-nginx-module-${apisix_nginx_module_ver}/src/stream \
--add-module=../apisix-nginx-module-${apisix_nginx_module_ver}/src/meta \
--add-module=../wasm-nginx-module-${wasm_nginx_module_ver} \
--add-module=../lua-var-nginx-module-${lua_var_nginx_module_ver#*v} \
--add-module=../grpc-client-nginx-module-${grpc_client_nginx_module_ver#*v} \
--with-poll_module \
--with-pcre-jit \
--without-http_rds_json_module \
--without-http_rds_csv_module \
--without-lua_rds_parser \
--with-stream \
--with-stream_ssl_module \
--with-stream_ssl_preread_module \
--with-http_v2_module \
--without-mail_pop3_module \
--without-mail_imap_module \
--without-mail_smtp_module \
--with-http_stub_status_module \
--with-http_realip_module \
--with-http_addition_module \
--with-http_auth_request_module \
--with-http_secure_link_module \
--with-http_random_index_module \
--with-http_gzip_static_module \
--with-http_sub_module \
--with-http_dav_module \
--with-http_flv_module \
--with-http_mp4_module \
--with-http_gunzip_module \
--with-threads \
--with-compat \
--with-luajit-xcflags="$luajit_xcflags" \
$no_pool_patch \
-j`nproc`

make -j`nproc`
sudo make install
cd ..

cd apisix-nginx-module-${apisix_nginx_module_ver} || exit 1
sudo OPENRESTY_PREFIX="$OR_PREFIX" make install
cd ..

cd wasm-nginx-module-${wasm_nginx_module_ver} || exit 1
sudo OPENRESTY_PREFIX="$OR_PREFIX" make install
cd ..

cd grpc-client-nginx-module-${grpc_client_nginx_module_ver#*v} || exit 1
sudo sed -i s#https://go.dev/dl/#https://golang.google.cn/dl/#g install-util.sh
sudo /usr/local/go/bin/go env -w GO111MODULE=on
sudo /usr/local/go/bin/go env -w GOPROXY=https://goproxy.cn,direct
sudo OPENRESTY_PREFIX="$OR_PREFIX" make install
cd ..

cd Amesh-${amesh_ver} || exit 1
sudo OPENRESTY_PREFIX="$OR_PREFIX" sh -c 'PATH="${PATH}:/usr/local/go/bin" make install'
cd ..

# package etcdctl
ETCD_ARCH="amd64"
ETCD_VERSION=${ETCD_VERSION:-'3.5.4'}
ARCH=${ARCH:-$(uname -m | tr '[:upper:]' '[:lower:]')}

if [[ $ARCH == "arm64" ]] || [[ $ARCH == "aarch64" ]]; then
ETCD_ARCH="arm64"
fi

wget -q https://github.com/etcd-io/etcd/releases/download/v${ETCD_VERSION}/etcd-v${ETCD_VERSION}-linux-${ETCD_ARCH}.tar.gz
tar xf etcd-v${ETCD_VERSION}-linux-${ETCD_ARCH}.tar.gz
# ship etcdctl under the same bin dir of openresty so we can package it easily
sudo cp etcd-v${ETCD_VERSION}-linux-${ETCD_ARCH}/etcdctl "$OR_PREFIX"/bin/
rm -rf etcd-v${ETCD_VERSION}-linux-${ETCD_ARCH}

Set environment variables

vi /etc/profile

Add the following

cat >> /etc/profile << 'EOF'
export OPENRESTY_HOME=/usr/local/openresty
export PATH=$PATH:$OPENRESTY_HOME/bin
EOF

Then run

source /etc/profile

Install APISIX

Go to the APISIX source directory.

Set the APISIX version and create a directory

APISIX_VERSION=2.15.3
mkdir apisix-${APISIX_VERSION}

Download the source tarball

curl -O https://downloads.apache.org/apisix/${APISIX_VERSION}/apache-apisix-${APISIX_VERSION}-src.tgz

Extract the source tarball

mkdir -p /opt/src/apisix-${APISIX_VERSION} && tar xzvf apache-apisix-${APISIX_VERSION}-src.tgz -C /opt/src/apisix-${APISIX_VERSION}

Change the LuaRocks server list by adding the following configuration

mkdir -p ~/.luarocks
cat >> ~/.luarocks/config-5.1.lua << EOF
rocks_servers = {
    {
        "https://luarocks.cn",
        "https://luarocks.org",
        "https://raw.githubusercontent.com/rocks-moonscript-org/moonrocks-mirror/master/",
        "https://luafr.org/luarocks/",
        "http://luarocks.logiceditor.com/rocks"
    }
}
variables = {
    OPENSSL_INCDIR = "/usr/local/openresty/openssl111/include",
    OPENSSL_LIBDIR = "/usr/local/openresty/openssl111/lib"
}
EOF

Build

# Switch to the apisix-${APISIX_VERSION} directory
cd apisix-${APISIX_VERSION}
# Create dependencies
make deps

Or specify the rocks server explicitly

make deps ENV_LUAROCKS_SERVER=https://luarocks.cn

Because of network issues, the build may need to be retried a few times.

Finally, install

# Install apisix command
make install

sudo tee -a /etc/security/limits.conf << EOF
#
* soft nofile 65536
* hard nofile 65536
* soft nproc 65536
* hard nproc 65536
EOF
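
This raises the soft and hard limits on open files and processes for all users; it takes effect on the next login session. A quick check after logging in again:

ulimit -n   # expect 65536
ulimit -u   # expect 65536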

Configure the LuaRocks OpenSSL variables

OPENSSL_PREFIX=/usr/local/openresty/openssl111
luarocks config variables.OPENSSL_LIBDIR ${OPENSSL_PREFIX}/lib
luarocks config variables.OPENSSL_INCDIR ${OPENSSL_PREFIX}/include

Copy the apisix and deps directories from the source build directory to /usr/local/apisix

cp -r /opt/src/apisix-${APISIX_VERSION}/apisix/ /opt/src/apisix-${APISIX_VERSION}/deps/ /usr/local/apisix/

Test APISIX

/usr/bin/apisix test
/usr/bin/apisix init

Connecting APISIX to etcd

Without certificates

Start the etcd service on each of the three nodes

/opt/etcd3/etcd \
--name s1 \
--data-dir /tmp/etcd-data \
--listen-client-urls http://0.0.0.0:2379 \
--advertise-client-urls http://0.0.0.0:2379 \
--listen-peer-urls http://0.0.0.0:2380 \
--initial-advertise-peer-urls http://0.0.0.0:2380 \
--initial-cluster s1=http://0.0.0.0:2380 \
--initial-cluster-token tkn \
--initial-cluster-state new \
--log-level info \
--logger zap \
--log-outputs stderr

/opt/etcd3/etcd \
--name s2 \
--data-dir /tmp/etcd-data \
--listen-client-urls http://0.0.0.0:2379 \
--advertise-client-urls http://0.0.0.0:2379 \
--listen-peer-urls http://0.0.0.0:2380 \
--initial-advertise-peer-urls http://0.0.0.0:2380 \
--initial-cluster s1=http://0.0.0.0:2380 \
--initial-cluster-token tkn \
--initial-cluster-state new \
--log-level info \
--logger zap \
--log-outputs stderr
/opt/etcd3/etcd \
--name s3 \
--data-dir /tmp/etcd-data \
--listen-client-urls http://0.0.0.0:2379 \
--advertise-client-urls http://0.0.0.0:2379 \
--listen-peer-urls http://0.0.0.0:2380 \
--initial-advertise-peer-urls http://0.0.0.0:2380 \
--initial-cluster s1=http://0.0.0.0:2380 \
--initial-cluster-token tkn \
--initial-cluster-state new \
--log-level info \
--logger zap \
--log-outputs stderr

If you use plain HTTP, just configure host with the etcd addresses

# ... other configuration omitted ...

etcd:
  host:
    - "http://10.1.80.91:2379"
    - "http://10.1.80.92:2379"
    - "http://10.1.80.93:2379"

# ... other configuration omitted ...

With certificates

If you use HTTPS, the etcd certificates must be configured.

Create a directory for the etcd certificates and copy them over. In my setup etcd and APISIX share the same host; adjust to your environment.

sudo mkdir -p /usr/local/apisix/ssl
sudo cp /opt/etcd3/ssl/{etcd-ca.pem,etcd-server-key.pem,etcd-server.pem} /usr/local/apisix/ssl
sudo chmod a+r /usr/local/apisix/ssl/etcd-server-key.pem

For why the key file needs to be readable by everyone (chmod a+r /usr/local/apisix/ssl/etcd-server-key.pem), see Common errors below.

Modify the APISIX configuration file. Back up config.yaml first, then start from the config-default.yaml template

cp /usr/local/apisix/conf/config.yaml /usr/local/apisix/conf/config.yaml.bak
cp /usr/local/apisix/conf/config-default.yaml /usr/local/apisix/conf/config.yaml
vi /usr/local/apisix/conf/config.yaml

The changes are as follows

# ... rest omitted ...
ssl:
  enable: true
  listen:                        # APISIX listening port in https.
    - port: 9443
      enable_http2: true
    # - ip: 127.0.0.3            # Specific IP, If not set, the default value is `0.0.0.0`.
    #   port: 9445
    #   enable_http2: true
  ssl_trusted_certificate: /usr/local/apisix/ssl/etcd-ca.pem  # Specifies a file path with trusted CA certificates in the PEM format

# ... rest omitted ...
etcd:
  host:                          # it's possible to define multiple etcd hosts addresses of the same etcd cluster.
    - "https://10.1.80.91:2379"  # if your etcd cluster enables TLS, please use the https scheme,
    - "https://10.1.80.92:2379"  # e.g. https://127.0.0.1:2379.
    - "https://10.1.80.93:2379"
  prefix: /apisix                # configuration prefix in etcd
  use_grpc: false                # enable the experimental configuration sync via gRPC
  timeout: 30                    # 30 seconds. Use a much higher timeout (like an hour) if `use_grpc` is true.
  #resync_delay: 5               # when sync failed and a rest is needed, resync after the configured seconds plus 50% random jitter
  #health_check_timeout: 10      # etcd retry the unhealthy nodes after the configured seconds
  startup_retry: 2               # the number of retry to etcd during the startup, default to 2
  #user: root                    # root username for etcd
  #password: 5tHkHhYkjr6cQY      # root password for etcd
  tls:
    # To enable etcd client certificate you need to build APISIX-Base, see
    # https://apisix.apache.org/docs/apisix/FAQ#how-do-i-build-the-apisix-base-environment
    cert: /usr/local/apisix/ssl/etcd-server.pem      # path of certificate used by the etcd client
    key: /usr/local/apisix/ssl/etcd-server-key.pem   # path of key used by the etcd client
    verify: true                 # whether to verify the etcd endpoint certificate when setting up a TLS connection to etcd
# ... rest omitted ...

Initialize and start APISIX

Start apisix

/usr/bin/apisix test
/usr/bin/apisix init
/usr/bin/apisix start

If you started APISIX this way, remember to stop that apisix process before switching to the systemd service.
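
Once APISIX is running, a quick smoke test against the Admin API confirms it can talk to etcd. This is only a sketch: it assumes the default Admin API port 9080 and the default admin key shipped in config-default.yaml; substitute your own values if you changed them.

curl -i http://127.0.0.1:9080/apisix/admin/routes \
  -H 'X-API-KEY: edd1c9f034335f136f87ad84b625c8f1'
# an HTTP 200 response with a JSON body means APISIX can read its configuration from etcd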

Add a systemd service for APISIX

sudo tee -a /usr/lib/systemd/system/apisix.service << EOF  
# apisix systemd service
# https://github.com/api7/apisix-build-tools/blob/master/usr/lib/systemd/system/apisix.service
[Unit]
Description=apisix
#Conflicts=apisix.service
After=network-online.target
Wants=network-online.target

[Service]
Type=forking
Restart=on-failure
WorkingDirectory=/usr/local/apisix
ExecStart=/usr/bin/apisix start
ExecStop=/usr/bin/apisix stop
ExecReload=/usr/bin/apisix reload
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

Enable the APISIX service

systemctl daemon-reload
systemctl enable apisix

Common errors

Could not find header file for OPENSSL

Installing https://luarocks.org/luasec-0.9-1.src.rock

Error: Failed installing dependency: https://luarocks.org/luasec-0.9-1.src.rock - Could not find header file for OPENSSL
No file openssl/ssl.h in /usr/local/openresty/openssl/include
You may have to install OPENSSL in your system and/or pass OPENSSL_DIR or OPENSSL_INCDIR to the luarocks command.
Example: luarocks install luasec OPENSSL_DIR=/usr/local
make: *** [Makefile:152: deps] Error 1
root@exp:/tmp/src/apisix-2.15# ll /usr/local/openresty/op^C
root@exp:/tmp/src/apisix-2.15# apt install libssl-dev
Reading package lists... Done
Building dependency tree
Reading state information... Done
libssl-dev is already the newest version (1.1.1f-1ubuntu2.16).
0 upgraded, 0 newly installed, 0 to remove and 9 not upgraded.

You need to install openresty-openssl111-dev

sudo apt install -y openresty-openssl111-dev

gzip module requires the zlib library

If ./configure for Nginx reports this error, running sudo apt install zlib1g-dev fixes it

sudo apt install zlib1g-dev

ngx_postgres addon was unable to detect version of the libpq library

If ./configure for Nginx reports this error, running sudo apt-get install libpq-dev fixes it

sudo apt-get install libpq-dev

If make deps for APISIX fails with the following error, libldap2-dev is not installed

Error: Failed installing dependency: https://luarocks.cn/lualdap-1.2.6-1.src.rock - Could not find header file for LDAP
No file ldap.h in /usr/local/include
No file ldap.h in /usr/include
No file ldap.h in /include
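
The fix is simply to install the missing LDAP development headers (already listed in the prerequisites above) and rerun make deps:

sudo apt-get install -y libldap2-dev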

Missing APISIX files

If you hit the following error, copy the apisix directory from the source tree (apisix-$VERSION/apisix) into the install directory /usr/local/apisix, e.g. cp -r /opt/src/apisix-3.2.0/apisix/ /usr/local/apisix/

root@s91:/usr/local/apisix# apisix init
/usr/local/openresty//luajit/bin/luajit /usr/local/apisix/apisix/cli/apisix.lua init
/usr/local/openresty//luajit/bin/luajit: cannot open /usr/local/apisix/apisix/cli/apisix.lua: No such file or directory
root@s91:/usr/local/apisix# apisix test
/usr/local/openresty//luajit/bin/luajit /usr/local/apisix/apisix/cli/apisix.lua test
/usr/local/openresty//luajit/bin/luajit: cannot open /usr/local/apisix/apisix/cli/apisix.lua: No such file or directory

Missing hostname resolution

If you hit the following error, add a record such as 127.0.0.1 s92.idc3.meeleet.com to /etc/hosts

sudo: unable to resolve host s92.idc3.meeleet.com: Temporary failure in name resolution

Slow downloads and frequent failures from the default LuaRocks server

If make deps fails as below, configure a Chinese LuaRocks mirror with higher priority. The default luarocks.org server is slow; using https://luarocks.cn raises the success rate considerably, though occasional failures may still require a few retries.

Installing https://luarocks.org/lua-resty-radixtree-2.8.2-0.src.rock
Missing dependencies for lua-resty-radixtree 2.8.2-0:
lua-resty-expr 1.3.0 (not installed)

lua-resty-radixtree 2.8.2-0 depends on lua-resty-ipmatcher (0.6.1-0 installed)
lua-resty-radixtree 2.8.2-0 depends on lua-resty-expr 1.3.0 (not installed)
Installing https://luarocks.org/lua-resty-expr-1.3.0-0.rockspec
Cloning into 'lua-resty-expr'...
fatal: unable to access 'https://github.com/api7/lua-resty-expr/': GnuTLS recv error (-110): The TLS connection was non-properly terminated.

Error: Failed installing dependency: https://luarocks.org/lua-resty-radixtree-2.8.2-0.src.rock - Failed installing dependency: https://luarocks.org/lua-resty-expr-1.3.0-0.rockspec - Failed cloning git repository.
make: *** [Makefile:152: deps] Error 1

etcd-server-key.pem: Permission denied

2023/04/17 04:28:11 [warn] 110028#110028: *44161 [lua] v3.lua:716: request_chunk(): https://10.1.80.93:2379: /usr/local/apisix/ssl/etcd-server-key.pem: Permission denied. Retrying, context: ngx.timer
2023/04/17 04:28:11 [warn] 110028#110028: *44161 [lua] v3.lua:716: request_chunk(): https://10.1.80.91:2379: /usr/local/apisix/ssl/etcd-server-key.pem: Permission denied. Retrying, context: ngx.timer
2023/04/17 04:28:11 [warn] 110028#110028: *44161 [lua] v3.lua:716: request_chunk(): https://10.1.80.92:2379: /usr/local/apisix/ssl/etcd-server-key.pem: Permission denied. Retrying, context: ngx.timer

Grant all users read permission on the certificate key file

sudo chmod a+r /usr/local/apisix/ssl/etcd-server-key.pem

FAQ

Do I need to install Lua?

No. OpenResty runs on its bundled LuaJIT.
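
If you want to confirm which interpreter is in use, the bundled LuaJIT can be queried directly (the path assumes the default OpenResty prefix used in this post):

/usr/local/openresty/luajit/bin/luajit -v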

How do I uninstall APISIX?

# To uninstall the APISIX runtime, run:
make uninstall
make undeps

References

OpenResty® Linux Packages

How to Use the OpenResty Web Framework for Nginx on Ubuntu 16.04

APISIX installation dependencies

Basic LuaRocks usage & OpenResty offline integration notes

Installing Apache APISIX on Ubuntu 20.04

help request: please check the version of OpenResty and lua, OpenResty 1.19 + LuaJIT or OpenResty before 1.19 + Lua 5.1 is required for Apache APISIX

Configuring APISIX with an etcd cluster that requires certificates

Deploying a highly-available 3-node etcd cluster (one certificate set shared by all 3 nodes) [the etcd cluster used by APISIX]

Posted on 2023-04-17 |
Word count: 3,268

[toc]

Environment

IP            Hostname
10.1.80.91    s1
10.1.80.92    s2
10.1.80.93    s3

Install cfssl

cfssl version    Binary directory    Certificate directory
1.6.2            /usr/bin            /tmp/certs

Download:
https://github.com/cloudflare/cfssl/releases

curl -L https://github.com/cloudflare/cfssl/releases/download/v1.6.2/cfssl_1.6.2_linux_amd64 -o /tmp/cfssl
chmod +x /tmp/cfssl
mv /tmp/cfssl /usr/bin/cfssl

curl -L https://github.com/cloudflare/cfssl/releases/download/v1.6.2/cfssljson_1.6.2_linux_amd64 -o /tmp/cfssljson
chmod +x /tmp/cfssljson
mv /tmp/cfssljson /usr/bin/cfssljson

Print the default certificate configuration

cfssl print-defaults csr
cfssl print-defaults config

Generate the CA root certificate

ca-config

Note that the server profile in etcd-ca-config.json must include "client auth". When APISIX connects to the cluster it performs a health check between nodes by default, and at that point it does not present a separate client certificate and key the way etcdctl does, so the nodes use the server certificate and key for that traffic; the server certificate therefore acts as both server and client certificate.
Also, when APISIX talks to the etcd cluster, etcd (because of grpc-gateway) validates the server certificate as if it were a client certificate.

The profiles section defines what the CA is allowed to sign. Three presets are defined, corresponding to the Server, Peer and Client certificates.

signing means the certificate can be used for signing; key encipherment allows key encryption; server auth authenticates a server; client auth authenticates a client.

mkdir -p /tmp/certs

cat > /tmp/certs/etcd-ca-config.json <<EOF
{
    "signing": {
        "default": {
            "expiry": "876000h"
        },
        "profiles": {
            "server": {
                "expiry": "876000h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "server auth",
                    "client auth"
                ]
            },
            "client": {
                "expiry": "876000h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "client auth"
                ]
            },
            "peer": {
                "expiry": "876000h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "server auth",
                    "client auth"
                ]
            }
        }
    }
}
EOF

ca-csr

List all etcd node IPs (and their corresponding domain names, if any) in hosts

cat > /tmp/certs/etcd-ca-csr.json <<EOF
{
    "CN": "etcd-ca",
    "hosts": [
        "127.0.0.1",
        "localhost",
        "10.1.80.91",
        "10.1.80.92",
        "10.1.80.93"
    ],
    "key": {
        "algo": "ecdsa",
        "size": 256
    },
    "names": [
        {
            "L": "Hangzhou",
            "ST": "Zhejiang",
            "C": "CN"
        }
    ]
}
EOF

Generate the CA certificate and its private key

cd /tmp/certs
cfssl gencert -initca etcd-ca-csr.json | cfssljson -bare etcd-ca -

This produces the following files

  • etcd-ca.csr
  • etcd-ca-key.pem
  • etcd-ca.pem

etcd-ca-key.pem is the CA private key; keep it safe.
etcd-ca.csr is the certificate signing request and can be deleted.
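
To double-check the CA before issuing any leaf certificates, openssl can print its subject and validity (the 876000h expiry configured above should show up as roughly 100 years):

openssl x509 -in /tmp/certs/etcd-ca.pem -noout -subject -dates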

Generate the Server and Peer certificates

Server and Peer configuration

The Server certificate is used for server-client communication and the Peer certificate for node-to-node communication; here they share the same CSR configuration.

mkdir -p /tmp/certs

cat > /tmp/certs/etcd-csr.json <<EOF
{
    "CN": "etcd",
    "hosts": [
        "127.0.0.1",
        "localhost",
        "10.1.80.91",
        "10.1.80.92",
        "10.1.80.93"
    ],
    "key": {
        "algo": "ecdsa",
        "size": 256
    },
    "names": [
        {
            "L": "Hangzhou",
            "ST": "Zhejiang",
            "C": "CN"
        }
    ]
}
EOF

Generate the Server and Peer certificates

cfssl gencert -ca=etcd-ca.pem -ca-key=etcd-ca-key.pem -config=etcd-ca-config.json -profile=server etcd-csr.json | cfssljson -bare etcd-server

cfssl gencert -ca=etcd-ca.pem -ca-key=etcd-ca-key.pem -config=etcd-ca-config.json -profile=peer etcd-csr.json | cfssljson -bare etcd-peer

This produces the following files

  • etcd-server.csr
  • etcd-server-key.pem
  • etcd-server.pem
  • etcd-peer.csr
  • etcd-peer-key.pem
  • etcd-peer.pem
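
Given the earlier note about APISIX health checks and grpc-gateway, it is worth confirming that the server certificate carries both server and client authentication in its Extended Key Usage. A quick check with openssl:

openssl x509 -in /tmp/certs/etcd-server.pem -noout -text | grep -A 1 "Extended Key Usage"
# expect: TLS Web Server Authentication, TLS Web Client Authentication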

Generate the Client certificate

Client configuration

A client certificate does not need a hosts field; only the CN needs to be set.

mkdir -p /tmp/certs

cat > /tmp/certs/etcd-client-csr.json <<EOF
{
    "CN": "etcd-client",
    "key": {
        "algo": "ecdsa",
        "size": 256
    }
}
EOF

Generate the Client certificate

cfssl gencert -ca=etcd-ca.pem -ca-key=etcd-ca-key.pem -config=etcd-ca-config.json -profile=client etcd-client-csr.json  | cfssljson -bare etcd-client

This produces the following files

  • etcd-client.csr
  • etcd-client-key.pem
  • etcd-client.pem

Create the 3-node cluster

Install etcd

Install etcd on s91, s92 and s93

ETCD_VER=v3.5.4
ETCD_INSTALL_PATH=/opt/etcd3

# choose either URL
# GOOGLE_URL=https://storage.googleapis.com/etcd
GITHUB_URL=https://github.com/etcd-io/etcd/releases/download
DOWNLOAD_URL=${GITHUB_URL}

mkdir -p ${ETCD_INSTALL_PATH}
curl -L ${DOWNLOAD_URL}/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz -o ${ETCD_INSTALL_PATH}/etcd-${ETCD_VER}-linux-amd64.tar.gz
tar xzvf ${ETCD_INSTALL_PATH}/etcd-${ETCD_VER}-linux-amd64.tar.gz -C /opt/etcd3 --strip-components=1
rm -f ${ETCD_INSTALL_PATH}/etcd-${ETCD_VER}-linux-amd64.tar.gz

${ETCD_INSTALL_PATH}/etcd --version
${ETCD_INSTALL_PATH}/etcdctl version

Create the certificate directory

Create a certificate directory on every node

mkdir -p /opt/etcd3/ssl

Copy the certificates to s91, s92 and s93

For convenience, all the certificate files are copied

scp -r /tmp/certs/etcd-*.pem root@10.1.80.91:/opt/etcd3/ssl
scp -r /tmp/certs/etcd-*.pem root@10.1.80.92:/opt/etcd3/ssl
scp -r /tmp/certs/etcd-*.pem root@10.1.80.93:/opt/etcd3/ssl

Running etcd in the foreground (not recommended)

Template for running etcd

The annotated template below is for reference only (the inline comments break the line continuations if pasted directly); use the per-node commands further down for copy-paste. Replace "etcd-cluster-tkn" in --initial-cluster-token with a value of your choosing.

./etcd -name etcd1 \
--data-dir /tmp/etcd/s1 \
--auto-tls \
--client-cert-auth \                                  # require client certificate verification
--cert-file=/opt/etcd3/ssl/etcd-server.pem \          # server certificate
--key-file=/opt/etcd3/ssl/etcd-server-key.pem \       # server private key
--trusted-ca-file=/opt/etcd3/ssl/etcd-ca.pem \        # CA root certificate for client connections
--peer-auto-tls \
--peer-cert-file=/opt/etcd3/ssl/etcd-peer.pem \       # peer (node-to-node) certificate
--peer-key-file=/opt/etcd3/ssl/etcd-peer-key.pem \    # peer private key
--peer-client-cert-auth \                             # require peer certificate verification
--peer-trusted-ca-file=/opt/etcd3/ssl/etcd-ca.pem \   # CA root certificate for peer connections
--advertise-client-urls https://10.1.80.91:2379 \     # all URLs switch to the https scheme
--listen-client-urls https://10.1.80.91:2379 \
--listen-peer-urls https://10.1.80.91:2380 \
--initial-advertise-peer-urls https://10.1.80.91:2380 \
--initial-cluster-token etcd-cluster-tkn \
--initial-cluster "s1=https://10.1.80.91:2380,s2=https://10.1.80.92:2380,s3=https://10.1.80.93:2380" \
--initial-cluster-state new

Run etcd on s91, s92 and s93 respectively

  • Node 1

    # make sure etcd process has write access to this directory
    # remove this directory if the cluster is new; keep if restarting etcd
    # rm -rf /tmp/etcd

    ./etcd -name s1 \
    --data-dir /tmp/etcd-data \
    --auto-tls \
    --client-cert-auth \
    --cert-file=/opt/etcd3/ssl/etcd-server.pem \
    --key-file=/opt/etcd3/ssl/etcd-server-key.pem \
    --trusted-ca-file=/opt/etcd3/ssl/etcd-ca.pem \
    --peer-auto-tls \
    --peer-cert-file=/opt/etcd3/ssl/etcd-peer.pem \
    --peer-key-file=/opt/etcd3/ssl/etcd-peer-key.pem \
    --peer-client-cert-auth \
    --peer-trusted-ca-file=/opt/etcd3/ssl/etcd-ca.pem \
    --advertise-client-urls https://10.1.80.91:2379 \
    --listen-client-urls https://10.1.80.91:2379 \
    --listen-peer-urls https://10.1.80.91:2380 \
    --initial-advertise-peer-urls https://10.1.80.91:2380 \
    --initial-cluster-token etcd-cluster-tkn \
    --initial-cluster "s1=https://10.1.80.91:2380,s2=https://10.1.80.92:2380,s3=https://10.1.80.93:2380" \
    --initial-cluster-state new
  • Node 2

    # make sure etcd process has write access to this directory
    # remove this directory if the cluster is new; keep if restarting etcd
    # rm -rf /tmp/etcd

    ./etcd -name s2 \
    --data-dir /tmp/etcd-data \
    --auto-tls \
    --client-cert-auth \
    --cert-file=/opt/etcd3/ssl/etcd-server.pem \
    --key-file=/opt/etcd3/ssl/etcd-server-key.pem \
    --trusted-ca-file=/opt/etcd3/ssl/etcd-ca.pem \
    --peer-auto-tls \
    --peer-cert-file=/opt/etcd3/ssl/etcd-peer.pem \
    --peer-key-file=/opt/etcd3/ssl/etcd-peer-key.pem \
    --peer-client-cert-auth \
    --peer-trusted-ca-file=/opt/etcd3/ssl/etcd-ca.pem \
    --advertise-client-urls https://10.1.80.92:2379 \
    --listen-client-urls https://10.1.80.92:2379 \
    --listen-peer-urls https://10.1.80.92:2380 \
    --initial-advertise-peer-urls https://10.1.80.92:2380 \
    --initial-cluster-token etcd-cluster-tkn \
    --initial-cluster "s1=https://10.1.80.91:2380,s2=https://10.1.80.92:2380,s3=https://10.1.80.93:2380" \
    --initial-cluster-state new
  • Node 3

# make sure etcd process has write access to this directory
# remove this directory if the cluster is new; keep if restarting etcd
# rm -rf /tmp/etcd

./etcd -name s3 \
--data-dir /tmp/etcd-data \
--auto-tls \
--client-cert-auth \
--cert-file=/opt/etcd3/ssl/etcd-server.pem \
--key-file=/opt/etcd3/ssl/etcd-server-key.pem \
--trusted-ca-file=/opt/etcd3/ssl/etcd-ca.pem \
--peer-auto-tls \
--peer-cert-file=/opt/etcd3/ssl/etcd-peer.pem \
--peer-key-file=/opt/etcd3/ssl/etcd-peer-key.pem \
--peer-client-cert-auth \
--peer-trusted-ca-file=/opt/etcd3/ssl/etcd-ca.pem \
--advertise-client-urls https://10.1.80.93:2379 \
--listen-client-urls https://10.1.80.93:2379 \
--listen-peer-urls https://10.1.80.93:2380 \
--initial-advertise-peer-urls https://10.1.80.93:2380 \
--initial-cluster-token etcd-cluster-tkn \
--initial-cluster "s1=https://10.1.80.91:2380,s2=https://10.1.80.92:2380,s3=https://10.1.80.93:2380" \
--initial-cluster-state new

Check the etcd cluster health

ETCDCTL_API=3 /opt/etcd3/etcdctl \
--endpoints 10.1.80.91:2379,10.1.80.92:2379,10.1.80.93:2379 \
--cacert /opt/etcd3/ssl/etcd-ca.pem \
--cert /opt/etcd3/ssl/etcd-server.pem \
--key /opt/etcd3/ssl/etcd-server-key.pem \
endpoint health

Running etcd under systemd (strongly recommended)

If you previously ran etcd in the foreground, remember to stop those processes first.

Copy the certificates to the /opt/etcd3/ssl directory first.

# make sure etcd process has write access to this directory
# remove this directory if the cluster is new; keep if restarting etcd
# rm -rf /tmp/etcd/s1

# to write service file for etcd

s1

cat > /tmp/s1.service <<EOF
[Unit]
Description=etcd
Documentation=https://github.com/coreos/etcd
Conflicts=etcd.service
Conflicts=etcd2.service

[Service]
Type=notify
Restart=always
RestartSec=5s
LimitNOFILE=40000
TimeoutStartSec=0

ExecStart=/opt/etcd3/etcd -name s1 \
--data-dir /tmp/etcd-data \
--auto-tls \
--client-cert-auth \
--cert-file=/opt/etcd3/ssl/etcd-server.pem \
--key-file=/opt/etcd3/ssl/etcd-server-key.pem \
--trusted-ca-file=/opt/etcd3/ssl/etcd-ca.pem \
--peer-auto-tls \
--peer-cert-file=/opt/etcd3/ssl/etcd-peer.pem \
--peer-key-file=/opt/etcd3/ssl/etcd-peer-key.pem \
--peer-client-cert-auth \
--peer-trusted-ca-file=/opt/etcd3/ssl/etcd-ca.pem \
--advertise-client-urls https://10.1.80.91:2379 \
--listen-client-urls https://10.1.80.91:2379 \
--listen-peer-urls https://10.1.80.91:2380 \
--initial-advertise-peer-urls https://10.1.80.91:2380 \
--initial-cluster-token etcd-cluster-tkn \
--initial-cluster "s1=https://10.1.80.91:2380,s2=https://10.1.80.92:2380,s3=https://10.1.80.93:2380" \
--initial-cluster-state new

[Install]
WantedBy=multi-user.target
EOF


sudo mv /tmp/s1.service /etc/systemd/system/s1.service

# to start service
sudo systemctl daemon-reload
sudo systemctl enable s1.service
sudo systemctl start s1.service

# cat s1.service
sudo systemctl cat s1.service

# to get logs from service
sudo systemctl status s1.service -l --no-pager
sudo journalctl -u s1.service -l --no-pager|less
sudo journalctl -f -u s1.service

# to stop service
sudo systemctl stop s1.service
sudo systemctl disable s1.service

s2

cat > /tmp/s2.service <<EOF
[Unit]
Description=etcd
Documentation=https://github.com/coreos/etcd
Conflicts=etcd.service
Conflicts=etcd2.service

[Service]
Type=notify
Restart=always
RestartSec=5s
LimitNOFILE=40000
TimeoutStartSec=0

ExecStart=/opt/etcd3/etcd -name s2 \
--data-dir /tmp/etcd-data \
--auto-tls \
--client-cert-auth \
--cert-file=/opt/etcd3/ssl/etcd-server.pem \
--key-file=/opt/etcd3/ssl/etcd-server-key.pem \
--trusted-ca-file=/opt/etcd3/ssl/etcd-ca.pem \
--peer-auto-tls \
--peer-cert-file=/opt/etcd3/ssl/etcd-peer.pem \
--peer-key-file=/opt/etcd3/ssl/etcd-peer-key.pem \
--peer-client-cert-auth \
--peer-trusted-ca-file=/opt/etcd3/ssl/etcd-ca.pem \
--advertise-client-urls https://10.1.80.92:2379 \
--listen-client-urls https://10.1.80.92:2379 \
--listen-peer-urls https://10.1.80.92:2380 \
--initial-advertise-peer-urls https://10.1.80.92:2380 \
--initial-cluster-token etcd-cluster-tkn \
--initial-cluster "s1=https://10.1.80.91:2380,s2=https://10.1.80.92:2380,s3=https://10.1.80.93:2380" \
--initial-cluster-state new

[Install]
WantedBy=multi-user.target
EOF

sudo mv /tmp/s2.service /etc/systemd/system/s2.service

# to start service
sudo systemctl daemon-reload
sudo systemctl enable s2.service
sudo systemctl start s2.service

# cat s2.service
sudo systemctl cat s2.service

# to get logs from service
sudo systemctl status s2.service -l --no-pager
sudo journalctl -u s2.service -l --no-pager|less
sudo journalctl -f -u s2.service

# to stop service
sudo systemctl stop s2.service
sudo systemctl disable s2.service

s3

cat > /tmp/s3.service <<EOF
[Unit]
Description=etcd
Documentation=https://github.com/coreos/etcd
Conflicts=etcd.service
Conflicts=etcd2.service

[Service]
Type=notify
Restart=always
RestartSec=5s
LimitNOFILE=40000
TimeoutStartSec=0

ExecStart=/opt/etcd3/etcd -name s3 \
--data-dir /tmp/etcd-data \
--auto-tls \
--client-cert-auth \
--cert-file=/opt/etcd3/ssl/etcd-server.pem \
--key-file=/opt/etcd3/ssl/etcd-server-key.pem \
--trusted-ca-file=/opt/etcd3/ssl/etcd-ca.pem \
--peer-auto-tls \
--peer-cert-file=/opt/etcd3/ssl/etcd-peer.pem \
--peer-key-file=/opt/etcd3/ssl/etcd-peer-key.pem \
--peer-client-cert-auth \
--peer-trusted-ca-file=/opt/etcd3/ssl/etcd-ca.pem \
--advertise-client-urls https://10.1.80.93:2379 \
--listen-client-urls https://10.1.80.93:2379 \
--listen-peer-urls https://10.1.80.93:2380 \
--initial-advertise-peer-urls https://10.1.80.93:2380 \
--initial-cluster-token etcd-cluster-tkn \
--initial-cluster "s1=https://10.1.80.91:2380,s2=https://10.1.80.92:2380,s3=https://10.1.80.93:2380" \
--initial-cluster-state new

[Install]
WantedBy=multi-user.target
EOF

sudo mv /tmp/s3.service /etc/systemd/system/s3.service

# to start service
sudo systemctl daemon-reload
sudo systemctl enable s3.service
sudo systemctl start s3.service

# cat s3.service
sudo systemctl cat s3.service

# to get logs from service
sudo systemctl status s3.service -l --no-pager
sudo journalctl -u s3.service -l --no-pager|less
sudo journalctl -f -u s3.service

# to stop service
sudo systemctl stop s3.service
sudo systemctl disable s3.service

Check status:

Check with etcdctl

ETCDCTL_API=3 /opt/etcd3/etcdctl \
--cacert /opt/etcd3/ssl/etcd-ca.pem \
--cert /opt/etcd3/ssl/etcd-server.pem \
--key /opt/etcd3/ssl/etcd-server-key.pem \
--endpoints 10.1.80.91:2379,10.1.80.92:2379,10.1.80.93:2379 \
endpoint health

Check with curl

# check health with curl
curl --cacert /opt/etcd3/ssl/etcd-ca.pem --cert /opt/etcd3/ssl/etcd-server.pem --key /opt/etcd3/ssl/etcd-server-key.pem https://10.1.80.91:2379/health

Note: switch to the appropriate user first.

If the output looks like the following, the cluster is healthy

10.1.80.93:2379 is healthy: successfully committed proposal: took = 6.506639ms
10.1.80.91:2379 is healthy: successfully committed proposal: took = 6.427877ms
10.1.80.92:2379 is healthy: successfully committed proposal: took = 7.161322ms
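
Besides endpoint health, etcdctl can also show which member is the leader and the database size, which is useful when verifying a fresh cluster (same certificate flags as above):

ETCDCTL_API=3 /opt/etcd3/etcdctl \
--cacert /opt/etcd3/ssl/etcd-ca.pem \
--cert /opt/etcd3/ssl/etcd-server.pem \
--key /opt/etcd3/ssl/etcd-server-key.pem \
--endpoints 10.1.80.91:2379,10.1.80.92:2379,10.1.80.93:2379 \
endpoint status -w table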

References

etcd Labs

Changing the cfssl CA certificate validity period

Configuring APISIX with an etcd cluster that requires certificates

Configuring certificates for an etcd cluster

Setting up an etcd cluster with TLS certificate access

Deploying a highly-available 3-node etcd cluster (one certificate set shared by all 3 nodes)

Posted on 2022-10-10 |
Word count: 2,580

Environment

IP            Hostname
10.1.80.91    s1
10.1.80.92    s2
10.1.80.93    s3

Install cfssl

cfssl version    Binary directory    Certificate directory
1.6.2            /usr/bin            /tmp/certs

Download:
https://github.com/cloudflare/cfssl/releases

curl -L https://github.com/cloudflare/cfssl/releases/download/v1.6.2/cfssl_1.6.2_linux_amd64 -o /tmp/cfssl
chmod +x /tmp/cfssl
mv /tmp/cfssl /usr/bin/cfssl

curl -L https://github.com/cloudflare/cfssl/releases/download/v1.6.2/cfssljson_1.6.2_linux_amd64 -o /tmp/cfssljson
chmod +x /tmp/cfssljson
mv /tmp/cfssljson /usr/bin/cfssljson

Generate a self-signed CA certificate

The configuration can be generated via http://play.etcd.io/install

mkdir -p /tmp/certs

cat > /tmp/certs/root-ca-csr.json <<EOF
{
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "O": "Meeleet",
            "OU": "Meeleet Security",
            "L": "Hangzhou",
            "ST": "Zhejiang",
            "C": "CN"
        }
    ],
    "CN": "root-ca"
}
EOF
cfssl gencert --initca=true /tmp/certs/root-ca-csr.json | cfssljson --bare /tmp/certs/root-ca

# verify
openssl x509 -in /tmp/certs/root-ca.pem -text -noout


# cert-generation configuration
cat > /tmp/certs/gencert.json <<EOF
{
    "signing": {
        "default": {
            "usages": [
                "signing",
                "key encipherment",
                "server auth",
                "client auth"
            ],
            "expiry": "876000h"
        }
    }
}
EOF

Result

# CSR configuration
/tmp/certs/root-ca-csr.json

# CSR
/tmp/certs/root-ca.csr

# self-signed root CA public key
/tmp/certs/root-ca.pem

# self-signed root CA private key
/tmp/certs/root-ca-key.pem

# cert-generation configuration for other TLS assets
/tmp/certs/gencert.json

Generate locally-issued certificates with private keys

Certificate operations

Since all 3 nodes share one certificate set, only a single set needs to be generated.

mkdir -p /tmp/certs

cat > /tmp/certs/etcd-ca-csr.json <<EOF
{
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "O": "Meeleet",
            "OU": "Meeleet Security",
            "L": "Hangzhou",
            "ST": "Zhejiang",
            "C": "CN"
        }
    ],
    "CN": "etcd",
    "hosts": [
        "127.0.0.1",
        "localhost",
        "10.1.80.91",
        "10.1.80.92",
        "10.1.80.93"
    ]
}
EOF
cfssl gencert \
--ca /tmp/certs/root-ca.pem \
--ca-key /tmp/certs/root-ca-key.pem \
--config /tmp/certs/gencert.json \
/tmp/certs/etcd-ca-csr.json | cfssljson --bare /tmp/certs/etcd

# verify
openssl x509 -in /tmp/certs/etcd.pem -text -noout

Result

-rw-r--r--  1 root root  323 Sep 30 05:28 etcd-ca-csr.json
-rw-r--r-- 1 root root 1098 Sep 30 05:28 etcd.csr
-rw------- 1 root root 1679 Sep 30 05:28 etcd-key.pem
-rw-r--r-- 1 root root 1493 Sep 30 05:28 etcd.pem
-rw-r--r-- 1 root root 205 Sep 30 05:25 gencert.json
-rw-r--r-- 1 root root 1017 Sep 30 05:25 root-ca.csr
-rw-r--r-- 1 root root 221 Sep 30 05:25 root-ca-csr.json
-rw------- 1 root root 1679 Sep 30 05:25 root-ca-key.pem
-rw-r--r-- 1 root root 1346 Sep 30 05:25 root-ca.pem

Copy the certificates to s91, s92 and s93; for convenience, all the files are copied

scp -r /tmp/certs/ root@10.1.80.91:/root/
scp -r /tmp/certs/ root@10.1.80.92:/root/
scp -r /tmp/certs/ root@10.1.80.93:/root/

Change the destination user to the system user you actually use.

Install etcd

Install etcd on s91, s92 and s93

ETCD_VER=v3.4.21
ETCD_INSTALL_PATH=/opt/etcd3

# choose either URL
# GOOGLE_URL=https://storage.googleapis.com/etcd
GITHUB_URL=https://github.com/etcd-io/etcd/releases/download
DOWNLOAD_URL=${GITHUB_URL}

mkdir -p ${ETCD_INSTALL_PATH}
curl -L ${DOWNLOAD_URL}/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz -o ${ETCD_INSTALL_PATH}/etcd-${ETCD_VER}-linux-amd64.tar.gz
tar xzvf ${ETCD_INSTALL_PATH}/etcd-${ETCD_VER}-linux-amd64.tar.gz -C /opt/etcd3 --strip-components=1
rm -f ${ETCD_INSTALL_PATH}/etcd-${ETCD_VER}-linux-amd64.tar.gz

${ETCD_INSTALL_PATH}/etcd --version
${ETCD_INSTALL_PATH}/etcdctl version

Running etcd in the foreground (not recommended; for functional testing only)

Run etcd on s91, s92 and s93 respectively

# make sure etcd process has write access to this directory
# remove this directory if the cluster is new; keep if restarting etcd
# rm -rf /tmp/etcd


/opt/etcd3/etcd --name s1 \
--data-dir /tmp/etcd/s1 \
--listen-client-urls https://10.1.80.91:2379 \
--advertise-client-urls https://10.1.80.91:2379 \
--listen-peer-urls https://10.1.80.91:2380 \
--initial-advertise-peer-urls https://10.1.80.91:2380 \
--initial-cluster s1=https://10.1.80.91:2380,s2=https://10.1.80.92:2380,s3=https://10.1.80.93:2380 \
--initial-cluster-token tkn \
--initial-cluster-state new \
--client-cert-auth \
--trusted-ca-file ${HOME}/certs/root-ca.pem \
--cert-file ${HOME}/certs/etcd.pem \
--key-file ${HOME}/certs/etcd-key.pem \
--peer-client-cert-auth \
--peer-trusted-ca-file ${HOME}/certs/root-ca.pem \
--peer-cert-file ${HOME}/certs/etcd.pem \
--peer-key-file ${HOME}/certs/etcd-key.pem
/opt/etcd3/etcd --name s2 \
--data-dir /tmp/etcd/s2 \
--listen-client-urls https://10.1.80.92:2379 \
--advertise-client-urls https://10.1.80.92:2379 \
--listen-peer-urls https://10.1.80.92:2380 \
--initial-advertise-peer-urls https://10.1.80.92:2380 \
--initial-cluster s1=https://10.1.80.91:2380,s2=https://10.1.80.92:2380,s3=https://10.1.80.93:2380 \
--initial-cluster-token tkn \
--initial-cluster-state new \
--client-cert-auth \
--trusted-ca-file ${HOME}/certs/root-ca.pem \
--cert-file ${HOME}/certs/etcd.pem \
--key-file ${HOME}/certs/etcd-key.pem \
--peer-client-cert-auth \
--peer-trusted-ca-file ${HOME}/certs/root-ca.pem \
--peer-cert-file ${HOME}/certs/etcd.pem \
--peer-key-file ${HOME}/certs/etcd-key.pem
/opt/etcd3/etcd --name s3 \
--data-dir /tmp/etcd/s3 \
--listen-client-urls https://10.1.80.93:2379 \
--advertise-client-urls https://10.1.80.93:2379 \
--listen-peer-urls https://10.1.80.93:2380 \
--initial-advertise-peer-urls https://10.1.80.93:2380 \
--initial-cluster s1=https://10.1.80.91:2380,s2=https://10.1.80.92:2380,s3=https://10.1.80.93:2380 \
--initial-cluster-token tkn \
--initial-cluster-state new \
--client-cert-auth \
--trusted-ca-file ${HOME}/certs/root-ca.pem \
--cert-file ${HOME}/certs/etcd.pem \
--key-file ${HOME}/certs/etcd-key.pem \
--peer-client-cert-auth \
--peer-trusted-ca-file ${HOME}/certs/root-ca.pem \
--peer-cert-file ${HOME}/certs/etcd.pem \
--peer-key-file ${HOME}/certs/etcd-key.pem

--initial-cluster-token tkn: replace "tkn" with a value of your choosing

Check the etcd cluster health

ETCDCTL_API=3 /opt/etcd3/etcdctl \
--endpoints 10.1.80.91:2379,10.1.80.92:2379,10.1.80.93:2379 \
--cacert ${HOME}/certs/root-ca.pem \
--cert ${HOME}/certs/etcd.pem \
--key ${HOME}/certs/etcd-key.pem \
endpoint health

Running etcd under systemd (recommended)

s1

# after transferring certs to remote machines
mkdir -p ${HOME}/certs
cp /tmp/certs/* ${HOME}/certs


# make sure etcd process has write access to this directory
# remove this directory if the cluster is new; keep if restarting etcd
# rm -rf /tmp/etcd/s1



# to write service file for etcd
cat > /tmp/s1.service <<EOF
[Unit]
Description=etcd
Documentation=https://github.com/coreos/etcd
Conflicts=etcd.service
Conflicts=etcd2.service

[Service]
Type=notify
Restart=always
RestartSec=5s
LimitNOFILE=40000
TimeoutStartSec=0

ExecStart=/opt/etcd3/etcd --name s1 \
--data-dir /tmp/etcd/s1 \
--listen-client-urls https://10.1.80.91:2379 \
--advertise-client-urls https://10.1.80.91:2379 \
--listen-peer-urls https://10.1.80.91:2380 \
--initial-advertise-peer-urls https://10.1.80.91:2380 \
--initial-cluster s1=https://10.1.80.91:2380,s2=https://10.1.80.92:2380,s3=https://10.1.80.93:2380 \
--initial-cluster-token tkn \
--initial-cluster-state new \
--client-cert-auth \
--trusted-ca-file ${HOME}/certs/root-ca.pem \
--cert-file ${HOME}/certs/etcd.pem \
--key-file ${HOME}/certs/etcd-key.pem \
--peer-client-cert-auth \
--peer-trusted-ca-file ${HOME}/certs/root-ca.pem \
--peer-cert-file ${HOME}/certs/etcd.pem \
--peer-key-file ${HOME}/certs/etcd-key.pem

[Install]
WantedBy=multi-user.target
EOF
sudo mv /tmp/s1.service /etc/systemd/system/s1.service



# to start service
sudo systemctl daemon-reload
sudo systemctl enable s1.service
sudo systemctl start s1.service

# cat s1.service
sudo systemctl cat s1.service

# to get logs from service
sudo systemctl status s1.service -l --no-pager
sudo journalctl -u s1.service -l --no-pager|less
sudo journalctl -f -u s1.service

# to stop service
sudo systemctl stop s1.service
sudo systemctl disable s1.service

s2

# after transferring certs to remote machines
mkdir -p ${HOME}/certs
cp /tmp/certs/* ${HOME}/certs


# make sure etcd process has write access to this directory
# remove this directory if the cluster is new; keep if restarting etcd
# rm -rf /tmp/etcd/s2



# to write service file for etcd
cat > /tmp/s2.service <<EOF
[Unit]
Description=etcd
Documentation=https://github.com/coreos/etcd
Conflicts=etcd.service
Conflicts=etcd2.service

[Service]
Type=notify
Restart=always
RestartSec=5s
LimitNOFILE=40000
TimeoutStartSec=0

ExecStart=/opt/etcd3/etcd --name s2 \
--data-dir /tmp/etcd/s2 \
--listen-client-urls https://10.1.80.92:2379 \
--advertise-client-urls https://10.1.80.92:2379 \
--listen-peer-urls https://10.1.80.92:2380 \
--initial-advertise-peer-urls https://10.1.80.92:2380 \
--initial-cluster s1=https://10.1.80.91:2380,s2=https://10.1.80.92:2380,s3=https://10.1.80.93:2380 \
--initial-cluster-token tkn \
--initial-cluster-state new \
--client-cert-auth \
--trusted-ca-file ${HOME}/certs/root-ca.pem \
--cert-file ${HOME}/certs/etcd.pem \
--key-file ${HOME}/certs/etcd-key.pem \
--peer-client-cert-auth \
--peer-trusted-ca-file ${HOME}/certs/root-ca.pem \
--peer-cert-file ${HOME}/certs/etcd.pem \
--peer-key-file ${HOME}/certs/etcd-key.pem

[Install]
WantedBy=multi-user.target
EOF
sudo mv /tmp/s2.service /etc/systemd/system/s2.service



# to start service
sudo systemctl daemon-reload
sudo systemctl enable s2.service
sudo systemctl start s2.service

# cat s2.service
sudo systemctl cat s2.service

# to get logs from service
sudo systemctl status s2.service -l --no-pager
sudo journalctl -u s2.service -l --no-pager|less
sudo journalctl -f -u s2.service

# to stop service
sudo systemctl stop s2.service
sudo systemctl disable s2.service

s3

# after transferring certs to remote machines
mkdir -p ${HOME}/certs
cp /tmp/certs/* ${HOME}/certs


# make sure etcd process has write access to this directory
# remove this directory if the cluster is new; keep if restarting etcd
# rm -rf /tmp/etcd/s3



# to write service file for etcd
cat > /tmp/s3.service <<EOF
[Unit]
Description=etcd
Documentation=https://github.com/coreos/etcd
Conflicts=etcd.service
Conflicts=etcd2.service

[Service]
Type=notify
Restart=always
RestartSec=5s
LimitNOFILE=40000
TimeoutStartSec=0

ExecStart=/opt/etcd3/etcd --name s3 \
--data-dir /tmp/etcd/s3 \
--listen-client-urls https://10.1.80.93:2379 \
--advertise-client-urls https://10.1.80.93:2379 \
--listen-peer-urls https://10.1.80.93:2380 \
--initial-advertise-peer-urls https://10.1.80.93:2380 \
--initial-cluster s1=https://10.1.80.91:2380,s2=https://10.1.80.92:2380,s3=https://10.1.80.93:2380 \
--initial-cluster-token tkn \
--initial-cluster-state new \
--client-cert-auth \
--trusted-ca-file ${HOME}/certs/root-ca.pem \
--cert-file ${HOME}/certs/etcd.pem \
--key-file ${HOME}/certs/etcd-key.pem \
--peer-client-cert-auth \
--peer-trusted-ca-file ${HOME}/certs/root-ca.pem \
--peer-cert-file ${HOME}/certs/etcd.pem \
--peer-key-file ${HOME}/certs/etcd-key.pem

[Install]
WantedBy=multi-user.target
EOF
sudo mv /tmp/s3.service /etc/systemd/system/s3.service



# to start service
sudo systemctl daemon-reload
sudo systemctl enable s3.service
sudo systemctl start s3.service

# cat s3.service
sudo systemctl cat s3.service

# to get logs from service
sudo systemctl status s3.service -l --no-pager
sudo journalctl -u s3.service -l --no-pager|less
sudo journalctl -f -u s3.service

# to stop service
sudo systemctl stop s3.service
sudo systemctl disable s3.service

Check status:

ETCDCTL_API=3 /opt/etcd3/etcdctl \
--endpoints 10.1.80.91:2379,10.1.80.92:2379,10.1.80.93:2379 \
--cacert ${HOME}/certs/root-ca.pem \
--cert ${HOME}/certs/etcd.pem \
--key ${HOME}/certs/etcd-key.pem \
endpoint health

Note: switch to the appropriate user first.

If the output looks like the following, the cluster is healthy

10.1.80.92:2379 is healthy: successfully committed proposal: took = 9.184858ms
10.1.80.93:2379 is healthy: successfully committed proposal: took = 8.681243ms
10.1.80.91:2379 is healthy: successfully committed proposal: took = 11.855811ms

References

etcd Labs

Changing the cfssl CA certificate validity period

Configuring BIND as a private-network DNS server on Ubuntu 20.04

Posted on 2022-10-10 |
Word count: 3,394

Preparation and notes

Suppose a company has several data centers: data center 1 uses idc1.meeleet.com internally, data center 2 uses idc2.meeleet.com, and data center 3 uses idc3.meeleet.com. This post takes data center 3, which needs its own private DNS server for idc3.meeleet.com, as the example (adjust to your actual situation).

  • DNS servers

Private IP     Name   Primary DNS server?
10.1.80.220    ns1    yes (primary)
10.1.80.221    ns2    no (secondary)

  • DNS clients

Private IP     Name
10.1.80.91     s91
10.1.80.92     s92

Step 1: Install BIND on the DNS servers

Install BIND on both the primary and the secondary DNS server

On ns1 and ns2

sudo apt install bind9 bind9utils bind9-doc

Modify the startup options

sudo vi /etc/default/named

Append -4 to the end of the OPTIONS parameter so BIND uses IPv4 only

#
# run resolvconf?
RESOLVCONF=no

# startup options for the server
OPTIONS="-u bind -4"

Restart BIND

sudo systemctl restart bind9

Step 2: Configure the primary DNS server

On ns1

Configure the options file

Open and edit named.conf.options

sudo vi /etc/bind/named.conf.options
acl "trusted" {
10.1.80.220; # ns1
10.1.80.221; # ns2
10.1.80.0/24; # trusted DNS clients network
};

options {
directory "/var/cache/bind";

recursion yes; # enables recursive queries
allow-recursion { trusted; }; # allows recursive queries from "trusted" clients
allow-query { trusted; }; # allows queries from "trusted" clients
listen-on { 10.1.80.220; }; # ns1 private IP address - listen on private network only
allow-transfer { none; }; # disable zone transfers by default

forwarders {
223.5.5.5;
223.6.6.6;
114.114.114.114;
};

dnssec-validation auto;

listen-on-v6 { any; };
};

Note the forwarders block: it lists three IPs. 223.5.5.5 and 223.6.6.6 are Alibaba Cloud public DNS, and 114.114.114.114 is China Telecom's public DNS. Forwarders let machines that cannot reach the Internet directly still resolve external names.

Once done, save and close named.conf.options. The configuration above means only your own servers (the "trusted" ones) may query this DNS server for external domains.

Configure named.conf.local

On ns1

Open named.conf.local with vi

sudo vi /etc/bind/named.conf.local

Define the forward and reverse zones and point them at their zone files


zone "idc3.meeleet.com" {
type primary;
file "/etc/bind/zones/db.idc3.meeleet.com"; # zone file path
allow-transfer { 10.1.80.221; }; # ns2 private IP address - secondary
};

zone "80.1.10.in-addr.arpa" {
type primary;
file "/etc/bind/zones/db.10.1.80"; # 10.1.80.0/24 subnet
allow-transfer { 10.1.80.221; }; # ns2 private IP address - secondary
};

创建正向zone文件

on ns1

创建正向区域文件存放目录

1
sudo mkdir -p /etc/bind/zones

复制db.local并修改

1
sudo cp /etc/bind/db.local /etc/bind/zones/db.idc3.meeleet.com
1
2
3
4
5
6
7
8
9
10
11
12
13
14
;
; BIND data file for local loopback interface
;
$TTL 604800
@ IN SOA localhost. root.localhost. (
2 ; Serial
604800 ; Refresh
86400 ; Retry
2419200 ; Expire
604800 ) ; Negative Cache TTL
;
@ IN NS localhost. ; delete this line
@ IN A 127.0.0.1 ; delete this line
@ IN AAAA ::1 ; delete this line

将localhost替换为idc3.meeleet.com,删除含delete this line标记的行

文件最后添加上DNS服务器NS记录

1
2
3
4
5
6

...

; name servers - NS records
IN NS ns1.idc3.meeleet.com.
IN NS ns2.idc3.meeleet.com.

给这个域内的主机添加A记录

1
2
3
4
5
6
7
8
9
...

; name servers - A records
ns1.idc3.meeleet.com. IN A 10.1.80.220
ns2.idc3.meeleet.com. IN A 10.1.80.221

; 10.1.80.0/24 - A records
s91.idc3.meeleet.com. IN A 10.1.80.91
s92.idc3.meeleet.com. IN A 10.1.80.92

每次编辑区域文件后,都需要在重启named进程之前增加Serial值。在这里将其增加为3。注意:增加Serial值非常重要!最终zone配置文件内容如下

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27

; BIND data file for local loopback interface
;
$TTL 604800
@ IN SOA idc3.meeleet.com. root.idc3.meeleet.com. (
3 ; Serial
604800 ; Refresh
86400 ; Retry
2419200 ; Expire
604800 ) ; Negative Cache TTL
;

; name servers - NS records
IN NS ns1.idc3.meeleet.com.
IN NS ns2.idc3.meeleet.com.





; name servers - A records
ns1.idc3.meeleet.com. IN A 10.1.80.220
ns2.idc3.meeleet.com. IN A 10.1.80.221

; 10.1.80.0/24 - A records
s91.idc3.meeleet.com. IN A 10.1.80.91
s92.idc3.meeleet.com. IN A 10.1.80.92

保存/etc/bind/zones/db.idc3.meeleet.com

创建反向zone文件

1
sudo cp /etc/bind/db.127 /etc/bind/zones/db.10.1.80
1
2
3
4
5
6
7
8
9
10
$TTL    604800
@ IN SOA localhost. root.localhost. (
1 ; Serial
604800 ; Refresh
86400 ; Retry
2419200 ; Expire
604800 ) ; Negative Cache TTL
;
@ IN NS localhost. ; delete this line
1.0.0 IN PTR localhost. ; delete this line

删除含delete this line标记的行

最终

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
$TTL    604800
@ IN SOA ns1.idc3.meeleet.com. root.idc3.meeleet.com. (
3 ; Serial
604800 ; Refresh
86400 ; Retry
2419200 ; Expire
604800 ) ; Negative Cache TTL
;


; name servers - NS records
IN NS ns1.idc3.meeleet.com.
IN NS ns2.idc3.meeleet.com.

; PTR Records
220 IN PTR ns1.idc3.meeleet.com. ; 10.1.80.220
221 IN PTR ns2.idc3.meeleet.com. ; 10.1.80.221
91 IN PTR s91.idc3.meeleet.com. ; 10.1.80.91
92 IN PTR s92.idc3.meeleet.com. ; 10.1.80.92

检查配置文件是否正确

1
2
3
4
sudo named-checkzone idc3.meeleet.com /etc/bind/zones/db.idc3.meeleet.com

zone idc3.meeleet.com/IN: loaded serial 2
OK
1
2
3
sudo named-checkzone 80.1.10.in-addr.arpa /etc/bind/zones/db.10.1.80
zone 80.1.10.in-addr.arpa/IN: loaded serial 2
OK

检查没问题,可以重启bind9

1
sudo systemctl restart bind9

如果有使用ufw,则需要让ufw允许bind9(如果有启用ufw的话)

1
sudo ufw allow Bind9

步骤3 配置备DNS服务器

多数情况下搞一套主备DNS服务器还是很有必要的,这样当主DNS服务器不可用时,备DNS服务器也可以接收DNS查询请求并作出响应。幸运的是,配置备DNS服务器比配置主DNS服务器简单多了。

on ns2

编辑named.conf.options文件

1
sudo vi /etc/bind/named.conf.options

在文件的顶部,添加包含所有受信任服务器的私有IP地址的ACL。(也支持添加单独的ip,这里我为了方便,直接用网段来表示,这样就不用一个个IP写出来了)

1
2
3
4
5
6
7
8
9
acl "trusted" {
10.1.80.220; # ns1
10.1.80.221; # ns2
10.1.80.0/24; # trusted DNS clients network
};

options {

. . .

directory指令下方,添加如下行

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
. . .

recursion yes; # enables recursive queries
allow-recursion { trusted; }; # allows recursive queries from "trusted" clients
allow-query { trusted; }; # allows queries from "trusted" clients
listen-on { 10.1.80.221; }; # ns2 private IP address
allow-transfer { none; }; # disable zone transfers by default

forwarders {
223.5.5.5;
223.6.6.6;
114.114.114.114;
};

. . .

保存并关闭named.conf.options文件。这个文件应该与ns1的named.conf.options文件相同,只是它应该被配置为监听ns2的私有IP地址。

最终named.conf.options文件内容如下

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
acl "trusted" {
10.1.80.220; # ns1
10.1.80.221; # ns2
10.1.80.0/24; # trusted DNS clients network
};
options {
directory "/var/cache/bind";

recursion yes; # enables recursive queries
allow-recursion { trusted; }; # allows recursive queries from "trusted" clients
allow-query { trusted; }; # allows queries from "trusted" clients
listen-on { 10.1.80.221; }; # ns2 private IP address
allow-transfer { none; }; # disable zone transfers by default

forwarders {
223.5.5.5;
223.6.6.6;
114.114.114.114;
};

dnssec-validation auto;

listen-on-v6 { any; };
};

现在编辑named.conf.local文件

1
sudo vi /etc/bind/named.conf.local

在备DNS服务器上定义与主DNS服务器对应的辅助区域。注意,区域类型是slave,file只写文件名、不包含路径,并且有一个masters指令,该指令应设置为主DNS服务器的私有IP地址。如果您在主DNS服务器中定义了多个反向区域,请确保将它们全部添加到这里:

1
2
3
4
5
6
7
8
9
10
11
12
13


zone "idc3.meeleet.com" {
type slave;
file "db.idc3.meeleet.com";
masters { 10.1.80.220; }; # ns1 private IP
};

zone "80.1.10.in-addr.arpa" {
type slave;
file "db.10.1.80";
masters { 10.1.80.220; }; # ns1 private IP
};

注意:考虑到它们的负面含义,DigitalOcean倾向于尽可能避免使用"master(主人)"和"slave(奴隶)"等术语。在BIND的较新版本中,可以使用primaries代替masters,并将备服务器的类型定义为secondary而不是slave。然而,从默认的Ubuntu 20.04存储库中安装的BIND版本(如步骤1中所述)无法识别这些新选项,这意味着除非升级,否则您只能继续使用这些包容性较差的旧术语。

保存并关闭named.conf.local文件
运行以下命令检查配置文件有效性

1
sudo named-checkconf

如果命令执行返回没有任何错误,那么就可以重启bind

1
sudo systemctl restart bind9

然后通过修改防火墙规则使得允许DNS客户端的连接到此DNS服务器

1
sudo ufw allow Bind9

到现在为止,DNS主备服务器就搭建好了

从ns1传输到ns2的区域文件放在 ns2的/var/cache/bind/目录

1
2
ls /var/cache/bind/
db.10.1.80 db.idc3.meeleet.com managed-keys.bind managed-keys.bind.jnl
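
也可以在ns2上通过bind的日志确认区域传送是否完成(补充示例,Ubuntu 20.04上bind9对应的systemd单元名一般为named,请按实际情况调整):

sudo journalctl -u named --no-pager | grep -i "transfer of"

正常情况下可以看到idc3.meeleet.com与80.1.10.in-addr.arpa两个区域的"Transfer completed"记录。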

步骤4 配置DNS客户端

在受信任ACL中的所有服务器都可以查询您的DNS服务器之前,必须将每个服务器配置为使用ns1和ns2作为DNS服务器。

假设您的客户端服务器运行Ubuntu,您需要找到与您的专用网络相关联的网卡。可以使用ip address命令查询私有子网。在每台客户机上运行以下命令,将示例中的子网替换为您自己的子网。

1
2
3
4
5
ip address show to 10.1.80.0/24

2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
inet 10.1.80.91/24 brd 10.1.80.255 scope global enp0s3
valid_lft forever preferred_lft forever

本例中私网网卡名为enp0s3。本节中的示例将把enp0s3作为私有网络(10.1.80.0/24网段)网卡,不过您应该根据您的实际情况更改这些示例,以反映您自己的服务器的私有网卡。

修改/etc/netplan/00-installer-config.yaml给s91,s92的相应网卡设置ns1和ns2 DNS服务器的地址,以及search域

1
vi /etc/netplan/00-installer-config.yaml

s91

1
2
3
4
5
6
7
8
9
10
11
12
network:
  ethernets:
    enp0s3:
      addresses:
        - 10.1.80.91/24
      gateway4: 10.1.80.254
      nameservers:
        addresses: [10.1.80.220,10.1.80.221]
        search: [idc3.meeleet.com]
    enp0s8:
      dhcp4: true
  version: 2

s92

1
2
3
4
5
6
7
8
9
10
11
12
13
14
network:
  ethernets:
    enp0s3:
      addresses:
        - 10.1.80.92/24
      gateway4: 10.1.80.254
      nameservers:
        addresses:
          - 10.1.80.220
          - 10.1.80.221
        search: [idc3.meeleet.com]
    enp0s8:
      dhcp4: true
  version: 2

DNS服务器列表(network->ethernets->enp0s3->nameservers->addresses)两种写法都可以(s91用了行内列表,s92用了多行列表)。search的作用是:当访问的域名(如s91)不能被DNS直接解析时,resolver会将该域名加上search指定的后缀,即s91.idc3.meeleet.com,重新请求DNS,直到被正确解析或试完search指定的列表为止。

保存并关闭

使用netplan try尝试新配置,如果有问题,netplan会在倒计时后自动回滚配置。

现在,检查系统的DNS解析器,以确定您的DNS配置是否已应用:

1
sudo systemd-resolve --status

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
Global
LLMNR setting: no
MulticastDNS setting: no
DNSOverTLS setting: no
DNSSEC setting: no
DNSSEC supported: no
DNSSEC NTA: 10.in-addr.arpa
16.172.in-addr.arpa
168.192.in-addr.arpa
17.172.in-addr.arpa
18.172.in-addr.arpa
19.172.in-addr.arpa
20.172.in-addr.arpa
21.172.in-addr.arpa
22.172.in-addr.arpa
23.172.in-addr.arpa
24.172.in-addr.arpa
25.172.in-addr.arpa
26.172.in-addr.arpa
27.172.in-addr.arpa
28.172.in-addr.arpa
29.172.in-addr.arpa
30.172.in-addr.arpa
31.172.in-addr.arpa
corp
d.f.ip6.arpa
home
internal
intranet
lan
local
private
test

Link 3 (enp0s8)
Current Scopes: DNS
DefaultRoute setting: yes
LLMNR setting: yes
MulticastDNS setting: no
DNSOverTLS setting: no
DNSSEC setting: no
DNSSEC supported: no
Current DNS Server: 223.5.5.5
DNS Servers: 223.5.5.5
223.6.6.6

Link 2 (enp0s3)
Current Scopes: DNS
DefaultRoute setting: yes
LLMNR setting: yes
MulticastDNS setting: no
DNSOverTLS setting: no
DNSSEC setting: no
DNSSEC supported: no
Current DNS Server: 223.5.5.5
DNS Servers: 223.5.5.5
223.6.6.6
10.1.80.220
10.1.80.221
DNS Domain: idc3.meeleet.com

步骤5 测试客户端
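
下面给出一组可用于验证的查询示例(补充示例,主机名与IP沿用上文;若客户端未安装nslookup,可先安装dnsutils包)。在s91或s92上分别做正向、反向解析测试:

# 正向解析(短主机名会由search自动补全为 *.idc3.meeleet.com)
nslookup s92
nslookup ns1.idc3.meeleet.com

# 反向解析
nslookup 10.1.80.91
nslookup 10.1.80.220

若返回的主机名与IP同区域文件中的配置一致,说明DNS服务器工作正常。也可以指定备DNS服务器再测一次,例如 nslookup s92 10.1.80.221。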

步骤6 维护DNS记录

现在你已拥有了工作在内网的DNS服务器,你需要维护DNS记录以便支持你的服务器环境。

添加新主机DNS记录

任何时候你要添加新主机到你的服务器环境(在同一个数据中心),添加一个新主机的DNS记录,需要以下步骤:

主域名服务器

  • 正向区域文件:为新主机添加A记录,并增加Serial值

  • 反向区域文件:为新主机增加PTR记录,并增加Serial值

  • 添加新的主机私网IP地址到受信任的ACL列表(named.conf.options)

测试配置文件:

1
2
3
sudo named-checkconf
sudo named-checkzone idc3.meeleet.com /etc/bind/zones/db.idc3.meeleet.com
sudo named-checkzone 80.1.10.in-addr.arpa /etc/bind/zones/db.10.1.80

辅域名服务器

  • 添加新的主机私网IP地址到受信任的ACL列表(named.conf.options)

检查配置文件语法是否正确

1
sudo named-checkconf

然后重载bind:

1
sudo systemctl reload bind9

删除主机DNS记录

如果你要删除某个主机的DNS记录,只需要删除之前添加的该主机的DNS记录信息(就是与之前添加主机DNS记录的步骤相反)

参考

How To Configure BIND as a Private Network DNS Server on Ubuntu 20.04

如何在CentOS-7上将BIND配置为专用网络DNS服务器

ISC BIND9 - 最详细、最认真的从零开始的 BIND 9 - DNS服务搭建及其原理讲解

Bind9安装设置指南

LVS高可用实战之DR模式(CentOS 7)

发表于 2022-10-09 |
字数统计: 3,134

环境

VirtualBox虚拟机3台,使用的网络有桥接网卡、网络地址转换(NAT)

桥接网卡网段是10.1.80.0/24,网络地址转换(NAT)是为了能访问internet以便安装软件

节点 IP VIP所在网卡
ds1 10.1.80.91 enp0s3
ds2 10.1.80.92 enp0s3
rs1 10.1.80.93 lo:0
rs2 10.1.80.95 lo:0

VIP地址为10.1.80.90

RS的RIP可以使用私网或公网IP, 文中DS与RS用的是同一网段。实际场景,一般情况下RS的RIP与VIP是不同网段的。

LVS常用名词

  • Load Balancer:负载均衡器,运行LVS负责负载均衡的服务器
  • DS:Director Server,指的是前端负载均衡器节点,也就是运行LVS的服务器;
  • RS:Real Server,后端真实的工作服务器;
  • VIP:Virtual Server IP,向外部直接面向用户请求,作为用户请求的目标的IP地址,一般也是DS的外部IP地址;
  • DIP:Director Server IP,主要用于和内部主机通讯的IP地址,一般也是DS的内部IP地址;
  • RIP:Real Server IP,后端服务器的IP地址;
  • CIP:Client IP,访问客户端的IP地址;

Director Server

ds1

安装ipvsadm

1
sudo yum install -y ipvsadm

安装keepalived

1
sudo yum install -y keepalived

备份keepalived.conf配置文件

1
sudo cp /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf.bak

1
sudo vi /etc/keepalived/keepalived.conf
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50

global_defs {
router_id LVS_DEVEL
}

vrrp_instance VI_1 {
state MASTER
interface enp0s3
virtual_router_id 51
priority 101
advert_int 1

# 如果两节点的上联交换机禁用了组播,则采用 vrrp 单播通告的方式
# unicast_src_ip 10.1.80.91
# unicast_peer {
# 10.1.80.92
# }

authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
10.1.80.90
}
}

virtual_server 10.1.80.90 80 {
delay_loop 6
lb_algo wrr
lb_kind DR
persistence_timeout 5
protocol TCP

real_server 10.1.80.93 80 {
weight 100
TCP_CHECK {
connect_timeout 3
}
}


real_server 10.1.80.95 80 {
weight 100
TCP_CHECK {
connect_timeout 3
}
}

}

ds1作为MASTER优先级比ds2高,ds1设置priority 101,ds2设置priority 100

重启keepalived

1
sudo systemctl restart keepalived

防火墙开放80端口

1
sudo firewall-cmd --zone=public --add-port=80/tcp --permanent

防火墙允许vrrp的组播

1
2
3
sudo firewall-cmd --direct --permanent --add-rule ipv4 filter INPUT 0 --in-interface enp0s3 --destination 224.0.0.18 --protocol vrrp -j ACCEPT

sudo firewall-cmd --direct --permanent --add-rule ipv4 filter OUTPUT 0 --out-interface enp0s3 --destination 224.0.0.18 --protocol vrrp -j ACCEPT

防火墙重新加载

1
sudo firewall-cmd --reload

ds2

安装ipvsadm

1
sudo yum install -y ipvsadm

安装keepalived

1
sudo yum install -y keepalived

备份keepalived.conf配置文件

1
sudo cp /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf.bak

1
sudo vi /etc/keepalived/keepalived.conf
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50

global_defs {
router_id LVS_DEVEL
}

vrrp_instance VI_1 {
state BACKUP
interface enp0s3
virtual_router_id 51
priority 100
advert_int 1

# 如果两节点的上联交换机禁用了组播,则采用 vrrp 单播通告的方式
# unicast_src_ip 10.1.80.92
# unicast_peer {
# 10.1.80.91
# }

authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
10.1.80.90
}
}

virtual_server 10.1.80.90 80 {
delay_loop 6
lb_algo wrr
lb_kind DR
persistence_timeout 5
protocol TCP

real_server 10.1.80.93 80 {
weight 100
TCP_CHECK {
connect_timeout 3
}
}


real_server 10.1.80.95 80 {
weight 100
TCP_CHECK {
connect_timeout 3
}
}

}

重启keepalived

1
sudo systemctl restart keepalived

防火墙开放80端口

1
sudo firewall-cmd --zone=public --add-port=80/tcp --permanent

防火墙允许vrrp的组播

1
2
3
sudo firewall-cmd --direct --permanent --add-rule ipv4 filter INPUT 0 --in-interface enp0s3 --destination 224.0.0.18 --protocol vrrp -j ACCEPT

sudo firewall-cmd --direct --permanent --add-rule ipv4 filter OUTPUT 0 --out-interface enp0s3 --destination 224.0.0.18 --protocol vrrp -j ACCEPT

防火墙重新加载

1
sudo firewall-cmd --reload

Real Server

rs1

1
sudo mkdir -p /opt/scripts
1
sudo vi /opt/scripts/lvs_rs.sh

脚本如下

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
#!/bin/bash 
VIP=10.1.80.90
case "$1" in
start)
ifconfig lo:0 $VIP netmask 255.255.255.255 broadcast $VIP
/sbin/route add -host $VIP dev lo:0
echo "1" >/proc/sys/net/ipv4/conf/lo/arp_ignore
echo "2" >/proc/sys/net/ipv4/conf/lo/arp_announce
echo "1" >/proc/sys/net/ipv4/conf/all/arp_ignore
echo "2" >/proc/sys/net/ipv4/conf/all/arp_announce
sysctl -p >/dev/null 2>&1
echo "RealServer Start OK"
;;
stop)
ifconfig lo:0 down
route del $VIP >/dev/null 2>&1
echo "0" >/proc/sys/net/ipv4/conf/lo/arp_ignore
echo "0" >/proc/sys/net/ipv4/conf/lo/arp_announce
echo "0" >/proc/sys/net/ipv4/conf/all/arp_ignore
echo "0" >/proc/sys/net/ipv4/conf/all/arp_announce
echo "RealServer Stoped"
;;
*)
echo "Usage: $0 {start|stop}"
exit 1
esac
exit 0

1
2
sudo chmod +x /opt/scripts/lvs_rs.sh 
sudo /opt/scripts/lvs_rs.sh start

安装tengine

1
2
3
4
5
6
7
sudo mkdir -p /tmp/software
cd /tmp/software
sudo curl -O https://tengine.taobao.org/download/tengine-2.3.3.tar.gz
sudo tar -xzvf tengine-2.3.3.tar.gz
cd tengine-2.3.3
sudo yum install -y gcc make pcre-devel openssl-devel
sudo ./configure --prefix=/usr/local/tengine && sudo make && sudo make install

修改index.html

1
sudo vi /usr/local/tengine/html/index.html

1
2
3
其余省略...
<h1>Welcome to tengine! I'm rs1</h1>
其余省略...

启动tengine

1
sudo /usr/local/tengine/sbin/nginx

防火墙设置开放80端口

1
2
firewall-cmd --zone=public --add-port=80/tcp --permanent
firewall-cmd --reload
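
启动tengine并放行80端口后,可以先在rs1本机做个快速验证(补充示例):

curl -s http://127.0.0.1 | grep rs1

预期输出包含 Welcome to tengine! I'm rs1。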

rs2

1
sudo mkdir -p /opt/scripts
1
sudo vi /opt/scripts/lvs_rs.sh

脚本如下

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
#!/bin/bash 
VIP=10.1.80.90
case "$1" in
start)
ifconfig lo:0 $VIP netmask 255.255.255.255 broadcast $VIP
/sbin/route add -host $VIP dev lo:0
echo "1" >/proc/sys/net/ipv4/conf/lo/arp_ignore
echo "2" >/proc/sys/net/ipv4/conf/lo/arp_announce
echo "1" >/proc/sys/net/ipv4/conf/all/arp_ignore
echo "2" >/proc/sys/net/ipv4/conf/all/arp_announce
sysctl -p >/dev/null 2>&1
echo "RealServer Start OK"
;;
stop)
ifconfig lo:0 down
route del $VIP >/dev/null 2>&1
echo "0" >/proc/sys/net/ipv4/conf/lo/arp_ignore
echo "0" >/proc/sys/net/ipv4/conf/lo/arp_announce
echo "0" >/proc/sys/net/ipv4/conf/all/arp_ignore
echo "0" >/proc/sys/net/ipv4/conf/all/arp_announce
echo "RealServer Stoped"
;;
*)
echo "Usage: $0 {start|stop}"
exit 1
esac
exit 0

1
2
sudo chmod +x /opt/scripts/lvs_rs.sh 
sudo /opt/scripts/lvs_rs.sh start

安装tengine

1
2
3
4
5
6
7
sudo mkdir -p /tmp/software
cd /tmp/software
sudo curl -O https://tengine.taobao.org/download/tengine-2.3.3.tar.gz
sudo tar -xzvf tengine-2.3.3.tar.gz
cd tengine-2.3.3
sudo yum install -y gcc make pcre-devel openssl-devel
sudo ./configure --prefix=/usr/local/tengine && sudo make && sudo make install

修改index.html

1
sudo vi /usr/local/tengine/html/index.html

1
2
3
其余省略...
<h1>Welcome to tengine! I'm rs2</h1>
其余省略...

启动tengine

1
sudo /usr/local/tengine/sbin/nginx

防火墙设置开放80端口

1
2
firewall-cmd --zone=public --add-port=80/tcp --permanent
firewall-cmd --reload

测试keepalived vrrp报文

1
sudo tcpdump -vvv -n -i enp0s3 host 224.0.0.18

在ds1上执行(ds2,rs1,rs2都可以的)

1
2
3
4
14:43:04.752665 IP (tos 0xc0, ttl 255, id 130, offset 0, flags [none], proto VRRP (112), length 40)
10.1.80.91 > 224.0.0.18: vrrp 10.1.80.91 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 101, authtype simple, intvl 1s, length 20, addrs: 10.1.80.90 auth "1111^@^@^@^@"
14:43:05.753059 IP (tos 0xc0, ttl 255, id 131, offset 0, flags [none], proto VRRP (112), length 40)
10.1.80.91 > 224.0.0.18: vrrp 10.1.80.91 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 101, authtype simple, intvl 1s, length 20, addrs: 10.1.80.90 auth "1111^@^@^@^@"

可以看到10.1.80.91在发组播消息,即10.1.80.91在宣告自己是MASTER,vip(10.1.80.90)在ds1的enp0s3上。
然后把ds1的keepalived关了:sudo systemctl stop keepalived

观察报文变化

1
2
3
4
14:45:08.488190 IP (tos 0xc0, ttl 255, id 253, offset 0, flags [none], proto VRRP (112), length 40)
10.1.80.91 > 224.0.0.18: vrrp 10.1.80.91 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 0, authtype simple, intvl 1s, length 20, addrs: 10.1.80.90 auth "1111^@^@^@^@"
14:45:09.120876 IP (tos 0xc0, ttl 255, id 8772, offset 0, flags [none], proto VRRP (112), length 40)
10.1.80.92 > 224.0.0.18: vrrp 10.1.80.92 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20, addrs: 10.1.80.90 auth "1111^@^@^@^@"

发现已经由10.1.80.91改为10.1.80.92发组播消息,vip(10.1.80.90)飘到了ds2的enp0s3上。
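
除了抓包,也可以直接在ds1、ds2上查看VIP当前绑定在哪个节点的enp0s3上(补充示例):

ip addr show enp0s3 | grep 10.1.80.90

哪台机器能看到10.1.80.90,VIP就在哪台机器上。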

重新启动ds1上的keepalived,再次观察报文

1
2
3
4
14:46:59.177236 IP (tos 0xc0, ttl 255, id 8880, offset 0, flags [none], proto VRRP (112), length 40)
10.1.80.92 > 224.0.0.18: vrrp 10.1.80.92 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20, addrs: 10.1.80.90 auth "1111^@^@^@^@"
14:46:59.177350 IP (tos 0xc0, ttl 255, id 1, offset 0, flags [none], proto VRRP (112), length 40)
10.1.80.91 > 224.0.0.18: vrrp 10.1.80.91 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 101, authtype simple, intvl 1s, length 20, addrs: 10.1.80.90 auth "1111^@^@^@^@"

发现由10.1.80.92改为了10.1.80.91发组播消息了,vip(10.1.80.90)飘回了ds1的enp0s3上。

验证

  • 打开浏览器,访问http://10.1.80.90

页面展示 Welcome to tengine! I’m rs1

ds1上查看sudo ipvsadm -ln

1
2
3
4
5
6
7
[jaychang@ds01 ~]# sudo ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.1.80.90:80 wrr persistent 50
-> 10.1.80.93:80 Route 100 0 0
-> 10.1.80.95:80 Route 100 2 0
  • 停掉rs1的nginx后

ds1上查看/var/log/keepalived.log(如果未设置过,默认是在/var/log/messages)

1
2
3
4
Oct  8 15:50:14 ds01 Keepalived_healthcheckers[15444]: TCP connection to [10.1.80.93]:80 failed.
Oct 8 15:50:15 ds01 Keepalived_healthcheckers[15444]: TCP connection to [10.1.80.93]:80 failed.
Oct 8 15:50:15 ds01 Keepalived_healthcheckers[15444]: Check on service [10.1.80.93]:80 failed after 1 retry.
Oct 8 15:50:15 ds01 Keepalived_healthcheckers[15444]: Removing service [10.1.80.93]:80 from VS [10.1.80.90]:80

从日志内容可以看到10.1.80.93已被剔除了

ds1上查看sudo ipvsadm -ln

1
2
3
4
5
6
[jaychang@ds01 ~]$ sudo ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.1.80.90:80 wrr persistent 50
-> 10.1.80.95:80 Route 100 4 0

刷新浏览器页面显示Welcome to tengine! I’m rs2

  • 重新开启rs1的nginx

观察/var/log/keepalived.log(如果未设置过,默认是在/var/log/messages)

1
2
Oct  8 16:11:41 ds01 Keepalived_healthcheckers[15444]: TCP connection to [10.1.80.93]:80 success.
Oct 8 16:11:41 ds01 Keepalived_healthcheckers[15444]: Adding service [10.1.80.93]:80 to VS [10.1.80.90]:80

ds1上查看sudo ipvsadm -ln

1
2
3
4
5
6
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.1.80.90:80 wrr persistent 50
-> 10.1.80.93:80 Route 100 0 0
-> 10.1.80.95:80 Route 100 2 0

发现10.1.80.93已经重新被加入到服务列表中了

Q&A

  • ds1,ds2上的LVS规则如何删除?
1
sudo ipvsadm --clear
  • 如何删掉ds1,ds2上的VIP?
1
sudo ifconfig enp0s3:0 down
  • 如何查看LVS统计信息?
1
sudo ipvsadm -ln --stats
  • 如何查看LVS连接条目?
1
sudo ipvsadm -lnc
  • ds1,ds2上用tcpdump抓包观察组播消息?
1
sudo tcpdump -vvv -n -i enp0s3 host 224.0.0.18

或

1
sudo tcpdump -vvv -n -i enp0s3 vrrp

  • 指定访问来源主机,指定抓包端口号,网卡名称
    1
    sudo tcpdump -i enp0s3 tcp port 80 and host 10.1.80.62

查看详细一点信息

1
sudo tcpdump -vvv -n -i enp0s3 tcp port 80 and host 10.1.80.62
  • 用iptables如何允许组播?
1
2
sudo iptables -I INPUT -i enp0s3 -d 224.0.0.18 -p vrrp -j ACCEPT
sudo iptables -I OUTPUT -o enp0s3 -d 224.0.0.18 -p vrrp -j ACCEPT
  • firewall防火墙如何开放端口,关闭端口?

开放80端口访问

1
sudo firewall-cmd --zone=public --add-port=80/tcp --permanent

关闭80端口访问

1
sudo firewall-cmd --zone=public --remove-port=80/tcp --permanent

记得要reload或restart下,防火墙才会生效

1
sudo firewall-cmd --reload

或

1
sudo systemctl reload firewalld

或

1
sudo systemctl restart firewalld

推荐使用第一种方式重新加载防火墙

  • 查看防火墙开放的端口
1
sudo firewall-cmd --list-port
  • 脑裂的原因有哪些?
  1. 主备的virtual_router_id 不一致
  2. 交换机不允许组播(可以改成单播方式)
  3. 防火墙未允许组播

注意事项

  • 注意要使得rs1,rs2上的配置重启生效的话,可以将/opt/scripts/lvs_rs.sh作为启动脚本(推荐)
1
echo "/opt/scripts/lvs_rs.sh start" | sudo tee -a /etc/rc.d/rc.local

在centos7中,/etc/rc.d/rc.local文件的权限被降低了

1
sudo chmod +x /etc/rc.d/rc.local

  • 或使用以下方法

要使得rs1,rs2上的lo:0重启后仍生效的话,要创建一个ifcfg-lo:0文件

1
sudo cp /etc/sysconfig/network-scripts/ifcfg-lo /etc/sysconfig/network-scripts/ifcfg-lo:0
1
sudo vi /etc/sysconfig/network-scripts/ifcfg-lo:0

内容如下

1
2
3
4
5
DEVICE=lo:0
IPADDR=10.1.80.90
NETMASK=255.255.255.255
BROADCAST=10.1.80.90
ONBOOT=yes

重启网络

1
sudo service network restart

1
sudo vi /etc/sysctl.conf

添加以下配置

1
2
3
4
5
6
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.default.arp_ignore = 1
net.ipv4.conf.lo.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2
net.ipv4.conf.default.arp_announce = 2
net.ipv4.conf.lo.arp_announce = 2

刷新文件

1
sudo sysctl -p

路由设置添加开机自启动
echo "route add -host 10.1.80.90 dev lo:0" | sudo tee -a /etc/rc.local

参考

LVS三种模式的工作原理

CentOS7搭建LVS负载均衡器

LVS High Availability - The Keepalived Solution

keepalived组播故障排查

centos7 keepalived 主备通信 防火墙vrrp 协议

CentOS 7 配置 Keepalived 实现双机热备

VRRP技术原理与注意点

LVS+Keepalived+Nginx高可用实现

CentOS7安装kubernetes单master节点(v1.19.x)

发表于 2020-12-26 |
字数统计: 5,212

配置要求

对于 Kubernetes 初学者,在搭建K8S集群时,推荐在阿里云或腾讯云采购如下配置:(您也可以使用自己的虚拟机、私有云等您最容易获得的 Linux 环境)

  • 至少2台 2核4G 的服务器
  • CentOS 7.6 / 7.7 / 7.8

安装后软件版本

  • Kubernetes v1.19.x

    • calico 3.13.1

    • nginx-ingress 1.5.5

  • Docker 19.03.14

关于二进制安装

kubeadm 是 Kubernetes 官方支持的安装方式,“二进制” 不是。本文档采用 kubernetes.io 官方推荐的 kubeadm 工具安装 kubernetes 集群。

检查CentOS / hostname

1
2
3
4
5
6
7
8
9
10
11
# 在 master 节点和 worker 节点都要执行
cat /etc/redhat-release

# 此处 hostname 的输出将会是该机器在 Kubernetes 集群中的节点名字
# 不能使用 localhost 作为节点的名字
hostname

# 请使用 lscpu 命令,核对 CPU 信息
# Architecture: x86_64 本安装文档不支持 arm 架构
# CPU(s): 2 CPU 内核数量不能低于 2
lscpu

操作系统兼容性

CentOS版本 本文档是否兼容 备注
7.8 是 已验证
7.7 是 已验证
7.6 是 已验证
7.5 否 已证实会出现 kubelet 无法启动的问题
7.4 否 已证实会出现 kubelet 无法启动的问题
7.3 否 已证实会出现 kubelet 无法启动的问题
7.2 否 已证实会出现 kubelet 无法启动的问题

修改 hostname

如果您需要修改 hostname,可执行如下指令:

1
2
3
4
5
6
# 修改 hostname
hostnamectl set-hostname your-new-host-name
# 查看修改结果
hostnamectl status
# 设置 hostname 解析
echo "127.0.0.1 $(hostname)" >> /etc/hosts

检查网络

在所有节点执行命令

1
2
3
4
5
6
7
8
9
10
11
12
13
14
[root@demo-master-a-1 ~]$ ip route show
default via 172.21.0.1 dev eth0
169.254.0.0/16 dev eth0 scope link metric 1002
172.21.0.0/20 dev eth0 proto kernel scope link src 172.21.0.12

[root@demo-master-a-1 ~]$ ip address
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:16:3e:12:a4:1b brd ff:ff:ff:ff:ff:ff
inet 172.17.216.80/20 brd 172.17.223.255 scope global dynamic eth0
valid_lft 305741654sec preferred_lft 305741654sec

安装docker及kubelet

使用 root 身份在所有节点执行如下代码,以安装软件:

  • docker
  • nfs-utils
  • kubectl / kubeadm / kubelet

快速安装

请将脚本最后的 1.19.6 替换成您需要的版本号, 脚本中间的 v1.19.x 不要替换

docker hub 镜像请根据自己网络的情况任选一个

  • 第四行为腾讯云 docker hub 镜像
  • 第六行为DaoCloud docker hub 镜像
  • 第八行为华为云 docker hub 镜像
  • 第十行为阿里云 docker hub 镜像
1
2
3
4
5
6
7
8
9
10
11
# 在 master 节点和 worker 节点都要执行
# 最后一个参数 1.19.6 用于指定 kubernetes 版本,支持所有 1.19.x 版本的安装
# 腾讯云 docker hub 镜像
# export REGISTRY_MIRROR="https://mirror.ccs.tencentyun.com"
# DaoCloud 镜像
# export REGISTRY_MIRROR="http://f1361db2.m.daocloud.io"
# 华为云镜像
# export REGISTRY_MIRROR="https://05f073ad3c0010ea0f4bc00b7105ec20.mirror.swr.myhuaweicloud.com"
# 阿里云 docker hub 镜像
export REGISTRY_MIRROR=https://registry.cn-hangzhou.aliyuncs.com
curl -sSL https://kuboard.cn/install-script/v1.19.x/install_kubelet.sh | sh -s 1.19.6

手动安装

手动执行以下代码,结果与快速安装相同。请将脚本中 yum install 安装 kubelet/kubeadm/kubectl 那一行的 ${1} 替换成您需要的版本号,例如 1.19.6

docker hub 镜像请根据自己网络的情况任选一个

  • 第四行为腾讯云 docker hub 镜像
  • 第六行为DaoCloud docker hub 镜像
  • 第八行为阿里云 docker hub 镜像
1
2
3
4
5
6
7
8
# 在 master 节点和 worker 节点都要执行
# 手动安装时,请将脚本中的 ${1} 替换为 kubernetes 版本号,例如 1.19.6,支持所有 1.19.x 版本的安装
# 腾讯云 docker hub 镜像
# export REGISTRY_MIRROR="https://mirror.ccs.tencentyun.com"
# DaoCloud 镜像
# export REGISTRY_MIRROR="http://f1361db2.m.daocloud.io"
# 阿里云 docker hub 镜像
export REGISTRY_MIRROR=https://registry.cn-hangzhou.aliyuncs.com

可以指定安装docker-ce的版本,用 yum list docker-ce --showduplicates | sort -r 查看所有版本

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
#!/bin/bash

# 在 master 节点和 worker 节点都要执行

# 安装 docker
# 参考文档如下
# https://docs.docker.com/install/linux/docker-ce/centos/
# https://docs.docker.com/install/linux/linux-postinstall/

# 卸载旧版本
yum remove -y docker \
docker-client \
docker-client-latest \
docker-ce-cli \
docker-common \
docker-latest \
docker-latest-logrotate \
docker-logrotate \
docker-selinux \
docker-engine-selinux \
docker-engine

# 设置 yum repository
yum install -y yum-utils \
device-mapper-persistent-data \
lvm2
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo

# 安装并启动 docker
yum install -y docker-ce-19.03.14 docker-ce-cli-19.03.14

mkdir /etc/docker || true

cat > /etc/docker/daemon.json <<EOF
{
"registry-mirrors": ["${REGISTRY_MIRROR}"],
"exec-opts": ["native.cgroupdriver=systemd"],
"log-driver": "json-file",
"log-opts": {
"max-size": "100m"
},
"storage-driver": "overlay2",
"storage-opts": [
"overlay2.override_kernel_check=true"
]
}
EOF

mkdir -p /etc/systemd/system/docker.service.d

# Restart Docker
systemctl daemon-reload
systemctl enable docker
systemctl restart docker

# 安装 nfs-utils
# 必须先安装 nfs-utils 才能挂载 nfs 网络存储
yum install -y nfs-utils
yum install -y wget

# 关闭 防火墙
systemctl stop firewalld
systemctl disable firewalld

# 关闭 SeLinux
setenforce 0
sed -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config

# 关闭 swap
swapoff -a
yes | cp /etc/fstab /etc/fstab_bak
cat /etc/fstab_bak |grep -v swap > /etc/fstab

# 修改 /etc/sysctl.conf
# 如果有配置,则修改
sed -i "s#^net.ipv4.ip_forward.*#net.ipv4.ip_forward=1#g" /etc/sysctl.conf
sed -i "s#^net.bridge.bridge-nf-call-ip6tables.*#net.bridge.bridge-nf-call-ip6tables=1#g" /etc/sysctl.conf
sed -i "s#^net.bridge.bridge-nf-call-iptables.*#net.bridge.bridge-nf-call-iptables=1#g" /etc/sysctl.conf
sed -i "s#^net.ipv6.conf.all.disable_ipv6.*#net.ipv6.conf.all.disable_ipv6=1#g" /etc/sysctl.conf
sed -i "s#^net.ipv6.conf.default.disable_ipv6.*#net.ipv6.conf.default.disable_ipv6=1#g" /etc/sysctl.conf
sed -i "s#^net.ipv6.conf.lo.disable_ipv6.*#net.ipv6.conf.lo.disable_ipv6=1#g" /etc/sysctl.conf
sed -i "s#^net.ipv6.conf.all.forwarding.*#net.ipv6.conf.all.forwarding=1#g" /etc/sysctl.conf
# 可能没有,追加
echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.conf
echo "net.bridge.bridge-nf-call-ip6tables = 1" >> /etc/sysctl.conf
echo "net.bridge.bridge-nf-call-iptables = 1" >> /etc/sysctl.conf
echo "net.ipv6.conf.all.disable_ipv6 = 1" >> /etc/sysctl.conf
echo "net.ipv6.conf.default.disable_ipv6 = 1" >> /etc/sysctl.conf
echo "net.ipv6.conf.lo.disable_ipv6 = 1" >> /etc/sysctl.conf
echo "net.ipv6.conf.all.forwarding = 1" >> /etc/sysctl.conf
# 执行命令以应用
sysctl -p

# 配置K8S的yum源
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg
http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

# 卸载旧版本
yum remove -y kubelet kubeadm kubectl

# 安装kubelet、kubeadm、kubectl
# 将 ${1} 替换为 kubernetes 版本号,例如 1.19.6
yum install -y kubelet-${1} kubeadm-${1} kubectl-${1}

# 重启 docker,并启动 kubelet
systemctl daemon-reload
systemctl restart docker
systemctl enable kubelet && systemctl start kubelet

docker version

WARNING

如果此时执行 systemctl status kubelet 命令,将得到 kubelet 启动失败的错误提示,请忽略此错误,因为必须完成后续步骤中 kubeadm init 的操作,kubelet 才能正常启动

初始化master节点

关于初始化时用到的环境变量

  • APISERVER_NAME 不能是 master 的 hostname
  • APISERVER_NAME 必须全为小写字母、数字、小数点,不能包含减号
  • POD_SUBNET 所使用的网段不能与 master节点/worker节点 所在的网段重叠。该字段的取值为一个 CIDR 值,如果您对 CIDR 这个概念还不熟悉,请仍然执行 export POD_SUBNET=10.100.0.1/16 命令,不做修改

快速初始化

请将脚本最后的 1.19.6 替换成您需要的版本号, 脚本中间的 v1.19.x 不要替换

1
2
3
4
5
6
7
8
9
10
# 只在 master 节点执行
# 替换 x.x.x.x 为 master 节点实际 IP(请使用内网 IP)
# export 命令只在当前 shell 会话中有效,开启新的 shell 窗口后,如果要继续安装过程,请重新执行此处的 export 命令
export MASTER_IP=x.x.x.x
# 替换 apiserver.demo 为 您想要的 dnsName
export APISERVER_NAME=apiserver.demo
# Kubernetes 容器组所在的网段,该网段安装完成后,由 kubernetes 创建,事先并不存在于您的物理网络中
export POD_SUBNET=10.100.0.1/16
echo "${MASTER_IP} ${APISERVER_NAME}" >> /etc/hosts
curl -sSL https://kuboard.cn/install-script/v1.19.x/init_master.sh | sh -s 1.19.6

手动初始化

手动执行以下代码,结果与快速初始化相同。请将脚本第21行(已高亮)的 ${1} 替换成您需要的版本号,例如 1.19.6

1
2
3
4
5
6
7
8
9
# 只在 master 节点执行
# 替换 x.x.x.x 为 master 节点的内网IP
# export 命令只在当前 shell 会话中有效,开启新的 shell 窗口后,如果要继续安装过程,请重新执行此处的 export 命令
export MASTER_IP=x.x.x.x
# 替换 apiserver.demo 为 您想要的 dnsName
export APISERVER_NAME=apiserver.demo
# Kubernetes 容器组所在的网段,该网段安装完成后,由 kubernetes 创建,事先并不存在于您的物理网络中
export POD_SUBNET=10.100.0.1/16
echo "${MASTER_IP} ${APISERVER_NAME}" >> /etc/hosts
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
#!/bin/bash

# 只在 master 节点执行

# 脚本出错时终止执行
set -e

if [ ${#POD_SUBNET} -eq 0 ] || [ ${#APISERVER_NAME} -eq 0 ]; then
echo -e "\033[31;1m请确保您已经设置了环境变量 POD_SUBNET 和 APISERVER_NAME \033[0m"
echo 当前POD_SUBNET=$POD_SUBNET
echo 当前APISERVER_NAME=$APISERVER_NAME
exit 1
fi


# 查看完整配置选项 https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2
rm -f ./kubeadm-config.yaml
cat <<EOF > ./kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v${1}
imageRepository: registry.aliyuncs.com/k8sxio
controlPlaneEndpoint: "${APISERVER_NAME}:6443"
networking:
  serviceSubnet: "10.96.0.0/16"
  podSubnet: "${POD_SUBNET}"
  dnsDomain: "cluster.local"
EOF

# kubeadm init
# 根据您服务器网速的情况,您需要等候 3 - 10 分钟
kubeadm config images pull --config=kubeadm-config.yaml
kubeadm init --config=kubeadm-config.yaml --upload-certs

# 配置 kubectl
rm -rf /root/.kube/
mkdir /root/.kube/
cp -i /etc/kubernetes/admin.conf /root/.kube/config

# 安装 calico 网络插件
# 参考文档 https://docs.projectcalico.org/v3.13/getting-started/kubernetes/self-managed-onprem/onpremises
echo "安装calico-3.13.1"
rm -f calico-3.13.1.yaml
wget https://kuboard.cn/install-script/calico/calico-3.13.1.yaml
kubectl apply -f calico-3.13.1.yaml

感谢 https://github.com/zhangguanzhang/google_containers (opens new window)提供最新的 google_containers 国内镜像

初始化如果出错请看

  • 请确保您的环境符合 安装docker及kubelet 中所有勾选框的要求

  • 请确保您使用 root 用户执行初始化命令

  • 不能下载 kubernetes 的 docker 镜像

    • 安装文档中,默认使用阿里云的 docker 镜像仓库,然而,有时候,该镜像会罢工

    • 如碰到不能下载 docker 镜像的情况,请尝试手工初始化,并修改手工初始化脚本里的第22行(文档中已高亮)为:

      1
      imageRepository: gcr.azk8s.cn/google-containers
  • 检查环境变量,执行如下命令

1
echo MASTER_IP=${MASTER_IP} && echo APISERVER_NAME=${APISERVER_NAME} && echo POD_SUBNET=${POD_SUBNET}
  • 请验证如下几点:
    • 环境变量 MASTER_IP 的值应该为 master 节点的 内网IP,如果不是,请重新 export
    • APISERVER_NAME 不能是 master 的 hostname
    • APISERVER_NAME 必须全为小写字母、数字、小数点,不能包含减号
    • POD_SUBNET 所使用的网段不能与 master节点/worker节点 所在的网段重叠。该字段的取值为一个 CIDR 值,如果您对 CIDR 这个概念还不熟悉,请仍然执行 export POD_SUBNET=10.100.0.1/16 命令,不做修改
  • 重新初始化 master 节点前,请先执行 kubeadm reset -f 操作
  • ImagePullBackoff / Pending

  • 如果 kubectl get pod -n kube-system -o wide 的输出结果中出现 ImagePullBackoff 或者长时间处于 Pending 的情况,请参考 查看镜像抓取进度

  • ContainerCreating

  • 如果 kubectl get pod -n kube-system -o wide 的输出结果中某个 Pod 长期处于 ContainerCreating、PodInitializing 或 Init:0/3 的状态,可以尝试:

    • 查看该 Pod 的状态,例如:
    1
    kubectl describe pod kube-flannel-ds-amd64-8l25c -n kube-system

    如果输出结果中,最后一行显示的是 Pulling image,请耐心等待,或者参考 查看镜像抓取进度

1
Normal  Pulling    44s   kubelet, k8s-worker-02  Pulling image "quay.io/coreos/flannel:v0.12.0-amd64"
  • 将该 Pod 删除,系统会自动重建一个新的 Pod,例如:
1
kubectl delete pod kube-flannel-ds-amd64-8l25c -n kube-system

检查master初始化结果

1
2
3
4
5
6
7
# 只在 master 节点执行

# 执行如下命令,等待 3-10 分钟,直到所有的容器组处于 Running 状态
watch kubectl get pod -n kube-system -o wide

# 查看 master 节点初始化结果
kubectl get nodes -o wide

如果出错请看

  • ImagePullBackoff / Pending

    • 如果 kubectl get pod -n kube-system -o wide的输出结果中出现 ImagePullBackoff 或者长时间处于 Pending 的情况,请参考 查看镜像抓取进度
  • ContainerCreating

    • 如果kubectl get pod -n kube-system -o wide的输出结果中某个 Pod 长期处于 ContainerCreating、PodInitializing 或 Init:0/3 的状态,可以尝试:

      • 查看该 Pod 的状态,例如:

        1
        kubectl describe pod kube-flannel-ds-amd64-8l25c -n kube-system

        如果输出结果中,最后一行显示的是 Pulling image,请耐心等待,或者参考查看镜像抓取进度

        1
              Normal  Pulling    44s   kubelet, k8s-worker-02  Pulling image "quay.io/coreos/flannel:v0.12.0-amd64"
      • 将该 Pod 删除,系统会自动重建一个新的 Pod,例如:

        1
        kubectl delete pod kube-flannel-ds-amd64-8l25c -n kube-system

初始化worker节点

获得join参数

在master节点上执行

1
2
# 只在 master 节点执行
kubeadm token create --print-join-command

可获取kubeadm join 命令及参数,如下所示

1
2
# kubeadm token create 命令的输出
kubeadm join apiserver.demo:6443 --token mpfjma.4vjjg8flqihor4vt --discovery-token-ca-cert-hash sha256:6f7a8e40a810323672de5eee6f4d19aa2dbdb38411845a1bf5dd63485c43d303

有效时间

该 token 的有效时间为 2 个小时,2小时内,您可以使用此 token 初始化任意数量的 worker 节点。
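
如需自定义token的有效期,可以在创建时加上 --ttl 参数(补充示例):

# 生成有效期为24小时的join命令
kubeadm token create --ttl 24h --print-join-command
# --ttl 0 表示永不过期(仅建议在测试环境使用)
kubeadm token create --ttl 0 --print-join-command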

worker节点执行初始化

针对所有的 worker 节点执行

1
2
3
4
5
6
7
8
9
# 只在 worker 节点执行
# 替换 x.x.x.x 为 master 节点的内网 IP
export MASTER_IP=x.x.x.x
# 替换 apiserver.demo 为初始化 master 节点时所使用的 APISERVER_NAME
export APISERVER_NAME=apiserver.demo
echo "${MASTER_IP} ${APISERVER_NAME}" >> /etc/hosts

# 替换为 master 节点上 kubeadm token create 命令的输出
kubeadm join apiserver.demo:6443 --token mpfjma.4vjjg8flqihor4vt --discovery-token-ca-cert-hash sha256:6f7a8e40a810323672de5eee6f4d19aa2dbdb38411845a1bf5dd63485c43d303

检查初始化结果

在master节点上执行

1
2
# 只在 master 节点执行
kubectl get nodes -o wide

输出结果如下所示:

1
2
3
4
5
[root@demo-master-a-1 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
demo-master-a-1 Ready master 5m3s v1.19.x
demo-worker-a-1 Ready <none> 2m26s v1.19.x
demo-worker-a-2 Ready <none> 3m56s v1.19.x

常见错误原因

经常在群里提问为什么 join 不成功的情况大致有这几种:

worker 节点不能访问 apiserver

在worker节点执行以下语句可验证worker节点是否能访问 apiserver

1
curl -ik https://apiserver.demo:6443

如果不能,请在 master 节点上验证

1
curl -ik https://localhost:6443

正常输出结果如下所示:

1
2
3
4
5
6
7
8
9
10
11
12
HTTP/1.1 403 Forbidden
Cache-Control: no-cache, private
Content-Type: application/json
X-Content-Type-Options: nosniff
Date: Fri, 15 Nov 2019 04:34:40 GMT
Content-Length: 233

{
"kind": "Status",
"apiVersion": "v1",
"metadata": {
...

可能原因

  • 如果 master 节点能够访问 apiserver、而 worker 节点不能,则请检查自己的网络设置
    • /etc/hosts 是否正确设置?
    • 是否有安全组或防火墙的限制?

worker 节点默认网卡

Kubelet使用的 IP 地址与 master 节点可互通(无需 NAT 映射),且没有防火墙、安全组隔离

如果你使用 vmware 或 virtualbox 创建虚拟机用于 K8S 学习,可以尝试 NAT 模式的网络,而不是桥接模式的网络

解决方法

移除worker节点并重试

WARNING

正常情况下,您无需移除 worker 节点,如果添加到集群出错,您可以移除 worker 节点,再重新尝试添加

在准备移除的 worker 节点上执行

1
2
# 只在 worker 节点执行
kubeadm reset -f

在 master 节点 demo-master-a-1 上执行

1
2
# 只在 master 节点执行
kubectl get nodes -o wide

如果列表中没有您要移除的节点,则忽略下一个步骤

1
2
# 只在 master 节点执行
kubectl delete node demo-worker-x-x

TIP

  • 将 demo-worker-x-x 替换为要移除的 worker 节点的名字
  • worker 节点的名字可以通过在节点 demo-master-a-1 上执行 kubectl get nodes 命令获得

安装Ingress Controller

快速初始化

在master节点上执行

1
2
# 只在 master 节点执行
kubectl apply -f https://kuboard.cn/install-script/v1.19.x/nginx-ingress.yaml

nginx-ingress.yaml内容

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
# 如果打算用于生产环境,请参考 https://github.com/nginxinc/kubernetes-ingress/blob/v1.5.5/docs/installation.md 并根据您自己的情况做进一步定制

apiVersion: v1
kind: Namespace
metadata:
name: nginx-ingress

---
apiVersion: v1
kind: ServiceAccount
metadata:
name: nginx-ingress
namespace: nginx-ingress

---
apiVersion: v1
kind: Secret
metadata:
name: default-server-secret
namespace: nginx-ingress
type: Opaque
data:
tls.crt: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUN2akNDQWFZQ0NRREFPRjl0THNhWFhEQU5CZ2txaGtpRzl3MEJBUXNGQURBaE1SOHdIUVlEVlFRRERCWk8KUjBsT1dFbHVaM0psYzNORGIyNTBjbTlzYkdWeU1CNFhEVEU0TURreE1qRTRNRE16TlZvWERUSXpNRGt4TVRFNApNRE16TlZvd0lURWZNQjBHQTFVRUF3d1dUa2RKVGxoSmJtZHlaWE56UTI5dWRISnZiR3hsY2pDQ0FTSXdEUVlKCktvWklodmNOQVFFQkJRQURnZ0VQQURDQ0FRb0NnZ0VCQUwvN2hIUEtFWGRMdjNyaUM3QlBrMTNpWkt5eTlyQ08KR2xZUXYyK2EzUDF0azIrS3YwVGF5aGRCbDRrcnNUcTZzZm8vWUk1Y2Vhbkw4WGM3U1pyQkVRYm9EN2REbWs1Qgo4eDZLS2xHWU5IWlg0Rm5UZ0VPaStlM2ptTFFxRlBSY1kzVnNPazFFeUZBL0JnWlJVbkNHZUtGeERSN0tQdGhyCmtqSXVuektURXUyaDU4Tlp0S21ScUJHdDEwcTNRYzhZT3ExM2FnbmovUWRjc0ZYYTJnMjB1K1lYZDdoZ3krZksKWk4vVUkxQUQ0YzZyM1lma1ZWUmVHd1lxQVp1WXN2V0RKbW1GNWRwdEMzN011cDBPRUxVTExSakZJOTZXNXIwSAo1TmdPc25NWFJNV1hYVlpiNWRxT3R0SmRtS3FhZ25TZ1JQQVpQN2MwQjFQU2FqYzZjNGZRVXpNQ0F3RUFBVEFOCkJna3Foa2lHOXcwQkFRc0ZBQU9DQVFFQWpLb2tRdGRPcEsrTzhibWVPc3lySmdJSXJycVFVY2ZOUitjb0hZVUoKdGhrYnhITFMzR3VBTWI5dm15VExPY2xxeC9aYzJPblEwMEJCLzlTb0swcitFZ1U2UlVrRWtWcitTTFA3NTdUWgozZWI4dmdPdEduMS9ienM3bzNBaS9kclkrcUI5Q2k1S3lPc3FHTG1US2xFaUtOYkcyR1ZyTWxjS0ZYQU80YTY3Cklnc1hzYktNbTQwV1U3cG9mcGltU1ZmaXFSdkV5YmN3N0NYODF6cFErUyt1eHRYK2VBZ3V0NHh3VlI5d2IyVXYKelhuZk9HbWhWNThDd1dIQnNKa0kxNXhaa2VUWXdSN0diaEFMSkZUUkk3dkhvQXprTWIzbjAxQjQyWjNrN3RXNQpJUDFmTlpIOFUvOWxiUHNoT21FRFZkdjF5ZytVRVJxbStGSis2R0oxeFJGcGZnPT0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
tls.key: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFcEFJQkFBS0NBUUVBdi91RWM4b1JkMHUvZXVJTHNFK1RYZUprckxMMnNJNGFWaEMvYjVyYy9XMlRiNHEvClJOcktGMEdYaVN1eE9ycXgrajlnamx4NXFjdnhkenRKbXNFUkJ1Z1B0ME9hVGtIekhvb3FVWmcwZGxmZ1dkT0EKUTZMNTdlT1l0Q29VOUZ4amRXdzZUVVRJVUQ4R0JsRlNjSVo0b1hFTkhzbysyR3VTTWk2Zk1wTVM3YUhudzFtMApxWkdvRWEzWFNyZEJ6eGc2clhkcUNlUDlCMXl3VmRyYURiUzc1aGQzdUdETDU4cGszOVFqVUFQaHpxdmRoK1JWClZGNGJCaW9CbTVpeTlZTW1hWVhsMm0wTGZzeTZuUTRRdFFzdEdNVWozcGJtdlFmazJBNnljeGRFeFpkZFZsdmwKMm82MjBsMllxcHFDZEtCRThCay90elFIVTlKcU56cHpoOUJUTXdJREFRQUJBb0lCQVFDZklHbXowOHhRVmorNwpLZnZJUXQwQ0YzR2MxNld6eDhVNml4MHg4Mm15d1kxUUNlL3BzWE9LZlRxT1h1SENyUlp5TnUvZ2IvUUQ4bUFOCmxOMjRZTWl0TWRJODg5TEZoTkp3QU5OODJDeTczckM5bzVvUDlkazAvYzRIbjAzSkVYNzZ5QjgzQm9rR1FvYksKMjhMNk0rdHUzUmFqNjd6Vmc2d2szaEhrU0pXSzBwV1YrSjdrUkRWYmhDYUZhNk5nMUZNRWxhTlozVDhhUUtyQgpDUDNDeEFTdjYxWTk5TEI4KzNXWVFIK3NYaTVGM01pYVNBZ1BkQUk3WEh1dXFET1lvMU5PL0JoSGt1aVg2QnRtCnorNTZud2pZMy8yUytSRmNBc3JMTnIwMDJZZi9oY0IraVlDNzVWYmcydVd6WTY3TWdOTGQ5VW9RU3BDRkYrVm4KM0cyUnhybnhBb0dCQU40U3M0ZVlPU2huMVpQQjdhTUZsY0k2RHR2S2ErTGZTTXFyY2pOZjJlSEpZNnhubmxKdgpGenpGL2RiVWVTbWxSekR0WkdlcXZXaHFISy9iTjIyeWJhOU1WMDlRQ0JFTk5jNmtWajJTVHpUWkJVbEx4QzYrCk93Z0wyZHhKendWelU0VC84ajdHalRUN05BZVpFS2FvRHFyRG5BYWkyaW5oZU1JVWZHRXFGKzJyQW9HQkFOMVAKK0tZL0lsS3RWRzRKSklQNzBjUis3RmpyeXJpY05iWCtQVzUvOXFHaWxnY2grZ3l4b25BWlBpd2NpeDN3QVpGdwpaZC96ZFB2aTBkWEppc1BSZjRMazg5b2pCUmpiRmRmc2l5UmJYbyt3TFU4NUhRU2NGMnN5aUFPaTVBRHdVU0FkCm45YWFweUNweEFkREtERHdObit3ZFhtaTZ0OHRpSFRkK3RoVDhkaVpBb0dCQUt6Wis1bG9OOTBtYlF4VVh5YUwKMjFSUm9tMGJjcndsTmVCaWNFSmlzaEhYa2xpSVVxZ3hSZklNM2hhUVRUcklKZENFaHFsV01aV0xPb2I2NTNyZgo3aFlMSXM1ZUtka3o0aFRVdnpldm9TMHVXcm9CV2xOVHlGanIrSWhKZnZUc0hpOGdsU3FkbXgySkJhZUFVWUNXCndNdlQ4NmNLclNyNkQrZG8wS05FZzFsL0FvR0FlMkFVdHVFbFNqLzBmRzgrV3hHc1RFV1JqclRNUzRSUjhRWXQKeXdjdFA4aDZxTGxKUTRCWGxQU05rMXZLTmtOUkxIb2pZT2pCQTViYjhibXNVU1BlV09NNENoaFJ4QnlHbmR2eAphYkJDRkFwY0IvbEg4d1R0alVZYlN5T294ZGt5OEp0ek90ajJhS0FiZHd6NlArWDZDODhjZmxYVFo5MWpYL3RMCjF3TmRKS2tDZ1lCbyt0UzB5TzJ2SWFmK2UwSkN5TGhzVDQ5cTN3Zis2QWVqWGx2WDJ1VnRYejN5QTZnbXo5aCsKcDNlK2JMRUxwb3B0WFhNdUFRR0xhUkcrYlNNcjR5dERYbE5ZSndUeThXczNKY3dlSTdqZVp2b0ZpbmNvVlVIMwphdmxoTUVCRGYxSjltSDB5cDBwWUNaS2ROdHNvZEZtQktzVEtQMjJhTmtsVVhCS3gyZzR6cFE9PQotLS0tLUVORCBSU0EgUFJJVkFURSBLRVktLS0tLQo=

---
kind: ConfigMap
apiVersion: v1
metadata:
name: nginx-config
namespace: nginx-ingress
data:
server-names-hash-bucket-size: "1024"


---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: nginx-ingress
rules:
- apiGroups:
- ""
resources:
- services
- endpoints
verbs:
- get
- list
- watch
- apiGroups:
- ""
resources:
- secrets
verbs:
- get
- list
- watch
- apiGroups:
- ""
resources:
- configmaps
verbs:
- get
- list
- watch
- update
- create
- apiGroups:
- ""
resources:
- pods
verbs:
- list
- apiGroups:
- ""
resources:
- events
verbs:
- create
- patch
- apiGroups:
- extensions
resources:
- ingresses
verbs:
- list
- watch
- get
- apiGroups:
- "extensions"
resources:
- ingresses/status
verbs:
- update
- apiGroups:
- k8s.nginx.org
resources:
- virtualservers
- virtualserverroutes
verbs:
- list
- watch
- get

---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: nginx-ingress
subjects:
- kind: ServiceAccount
name: nginx-ingress
namespace: nginx-ingress
roleRef:
kind: ClusterRole
name: nginx-ingress
apiGroup: rbac.authorization.k8s.io

---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: nginx-ingress
namespace: nginx-ingress
annotations:
prometheus.io/scrape: "true"
prometheus.io/port: "9113"
spec:
selector:
matchLabels:
app: nginx-ingress
template:
metadata:
labels:
app: nginx-ingress
spec:
serviceAccountName: nginx-ingress
containers:
- image: nginx/nginx-ingress:1.5.5
name: nginx-ingress
ports:
- name: http
containerPort: 80
hostPort: 80
- name: https
containerPort: 443
hostPort: 443
- name: prometheus
containerPort: 9113
env:
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
args:
- -nginx-configmaps=$(POD_NAMESPACE)/nginx-config
- -default-server-tls-secret=$(POD_NAMESPACE)/default-server-secret
#- -v=3 # Enables extensive logging. Useful for troubleshooting.
#- -report-ingress-status
#- -external-service=nginx-ingress
#- -enable-leader-election
- -enable-prometheus-metrics
#- -enable-custom-resources

卸载Ingress Controller

在master节点上执行

1
2
#只在 master 节点执行
kubectl delete -f https://kuboard.cn/install-script/v1.19.x/nginx-ingress.yaml

配置域名解析

将域名 *.demo.yourdomain.com 解析到 demo-worker-a-2 的 IP 地址 z.z.z.z (也可以是 demo-worker-a-1 的地址 y.y.y.y)

验证配置

在浏览器访问 a.demo.yourdomain.com,将得到 404 NotFound 错误页面

提示

许多初学者在安装 Ingress Controller 时会碰到问题,请不要灰心,可暂时跳过 安装 Ingress Controller 这个部分,等您学完 www.kuboard.cn 上 Kubernetes 入门 以及 通过互联网访问您的应用程序 这两部分内容后,再来回顾 Ingress Controller 的安装。

也可以参考 Install Nginx Ingress

问题及注意

重启Kubernetes集群后遇到的问题

Kubernetes集群的设计目标是setup-and-run-forever,然而许多学习者使用自己笔记本上的虚拟机安装K8S集群用于学习,这就必然会出现反复重启集群所在虚拟机的情况。本文针对重启后会出现的一些令人困惑的问题做了解释。

  • Worker节点不能启动
    Master 节点的 IP 地址变化,导致 worker 节点不能启动。请重装集群,并确保所有节点都有固定内网 IP 地址。

  • 许多Pod一直Crash或不能正常访问

    1
    kubectl get pods --all-namespaces

重启后会发现许多 Pod 不在 Running 状态,此时,请使用如下命令删除这些状态不正常的 Pod。通常,您的 Pod 如果是使用 Deployment、StatefulSet 等控制器创建的,kubernetes 将创建新的 Pod 作为替代,重新启动的 Pod 通常能够正常工作。

1
kubectl delete pod <pod-name> -n <pod-namespace>

安装过程中的问题

1
2
3
4
5
6
7
8
sysctl -p
net.ipv4.ip_forward = 1
sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-ip6tables: No such file or directory
sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-iptables: No such file or directory
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
net.ipv6.conf.all.forwarding = 1

解决办法

1
modprobe br_netfilter

查看版本

1
kubectl version

QA

怎么查看可安装的docker版本

1
yum list docker-ce --showduplicates|sort -r

参考

https://kuboard.cn/install/install-k8s.html#%E6%A3%80%E6%9F%A5-centos-hostname
https://kubernetes.io/zh/docs/setup/production-environment/tools/kubeadm/troubleshooting-kubeadm/

Kafka安装及入门

发表于 2020-03-15 |
字数统计: 494

下载kafka,cmak

下载好后,分别解压到/opt目录下

这里如果是单机kafka,可以直接用kafka自带的zookeeper,在$KAFKA_HOME目录下启动,方式为

1
./bin/zookeeper-server-start.sh -daemon config/zookeeper.properties

配置kafka

1
2
3
broker.id=0
log.dirs=/tmp/kafka-logs
zookeeper.connect=localhost:2181

大多数情况下,可能会改动的估计就是上述3个配置,由于kafka需要做持久化消息,故这里有一个log.dirs的配置项

启动kafka

1
$ ./bin/kafka-server-start.sh -daemon config/server.properties

不加-daemon则是前台启动,若是在docker中启动,就不用加-daemon

cmak(kafka manager)

依赖的组件说明

cmak 3.x需要依赖jdk 11+ 且zookeeper需要3.5.x版本,我直接用了es 7.x的自带jdk13

所以修改了cmak启动脚本get_java_cmd函数:

1
2
3
4
5
6
7
8
9
10
11
12
# Detect if we should use JAVA_HOME or just try PATH.
get_java_cmd() {
# High-priority override for Jlink images
#if [[ -n "$bundled_jvm" ]]; then
# echo "$bundled_jvm/bin/java"
#elif [[ -n "$JAVA_HOME" ]] && [[ -x "$JAVA_HOME/bin/java" ]]; then
# echo "$JAVA_HOME/bin/java"
#else
# echo "java"
#fi
echo "/opt/elasticsearch-7.6.1/jdk/bin/java"
}

修改配置
修改conf/application.conf

1
2
3
4
kafka-manager.zkhosts="localhost:2181"
kafka-manager.zkhosts=${?ZK_HOSTS}
cmak.zkhosts="localhost:2181"
cmak.zkhosts=${?ZK_HOSTS}

当然也可以不修改application.conf文件,而是提供环境变量ZK_HOSTS

启动cmak

1
./bin/cmak > cmak.out 2>&1 &
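
cmak基于Play框架,默认监听9000端口,启动后用浏览器访问 http://<主机IP>:9000 即可。如果9000端口被占用,可以通过http.port参数换一个端口(补充示例):

./bin/cmak -Dhttp.port=9001 > cmak.out 2>&1 &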

添加集群

点添加集群(Add Cluster)

填写集群名称与zookeeper地址

添加完成查看集群信息,包括zookeepers连接地址,kafka版本,topic数量,broker数量

查看broker信息


查看、操作topic

之前已经创建过first-topic,topics列表就可以看到了

给topic增加分区


创建topic

PS:更多cmak的功能有待实践

常见错误

1
This application is already running (Or delete /opt/kafka-cmak-3.0.0.4/RUNNING_PID file).

删掉RUNNING_PID重新启动

Kafka入门

创建Topic

1
2
root@ubuntu:/opt/kafka_2.12-2.4.0# ./bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic first-topic
Created topic first-topic.
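
topic创建好之后,可以用自带的控制台生产者/消费者做个简单的收发验证(补充示例,假设broker监听在localhost:9092):

# 生产消息(输入几行文本后Ctrl+C退出)
./bin/kafka-console-producer.sh --broker-list localhost:9092 --topic first-topic

# 消费消息
./bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic first-topic --from-beginning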

参考资料

https://segmentfault.com/a/1190000012730949
https://blog.wolfogre.com/posts/kafka-manager-download/

ElasticSearch7.x集群安装部署

发表于 2020-03-07 |
字数统计: 2,367

创建用户

1
2
groupadd es && useradd es -g es -M -s /bin/bash
chown -R es:es /opt/$ES_HOME

系统参数修改

  • vi /etc/sysctl.conf添加如下配置
1
vm.max_map_count=262144

使得重启之后也生效,即永久生效

执行如下命令使得立即生效

1
sysctl -p
  • 修改/etc/security/limits.conf添加以下配置

    1
    2
    es hard nofile 65535
    es soft nofile 65535
  • /etc/systemd/user.conf和/etc/systemd/system.conf分别添加以下配置

    1
    DefaultLimitNOFILE=65535

systemd启动elasticsearch,需要设置这两个文件,否则启动会报如下错误
[1] bootstrap checks failed
[1]: max file descriptors [4096] for elasticsearch process is too low, increase to at least [65535]

解压

1
tar xzvf elasticsearch-7.6.1-linux-x86_64.tar.gz -C /opt/

配置ES

1主2从的配置

  • node-1
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
cluster.name: my-es

node.name: node-1
node.master: true
node.data: true

network.host: 0.0.0.0
network.publish_host: 192.168.56.131

discovery.seed_hosts: [ "192.168.56.132","192.168.56.133"]
cluster.initial_master_nodes: ["node-1"]

# 允许跨域,elastic header需要
http.cors.enabled: true
http.cors.allow-origin: "*"
  • node-2
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
cluster.name: my-es

node.name: node-2
node.master: false
node.data: true

network.host: 0.0.0.0
network.publish_host: 192.168.56.132

discovery.seed_hosts: ["192.168.56.131","192.168.56.133"]

cluster.initial_master_nodes: ["node-1"]

# 允许跨域,elastic header需要
http.cors.enabled: true
http.cors.allow-origin: "*"
  • node-3
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
cluster.name: my-es

node.name: node-3
node.master: false
node.data: true

network.host: 0.0.0.0
network.publish_host: 192.168.56.133

discovery.seed_hosts: ["192.168.56.131", "192.168.56.132"]

cluster.initial_master_nodes: ["node-1"]

# 允许跨域,elastic header需要
http.cors.enabled: true
http.cors.allow-origin: "*"

生产环境建议 network.host 指定具体的IP

检查集群状态

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
curl -XGET http://192.168.56.131:9200/_cluster/health?pretty

{
cluster_name: "my-es",
status: "green",
timed_out: false,
number_of_nodes: 3,
number_of_data_nodes: 3,
active_primary_shards: 0,
active_shards: 0,
relocating_shards: 0,
initializing_shards: 0,
unassigned_shards: 0,
delayed_unassigned_shards: 0,
number_of_pending_tasks: 0,
number_of_in_flight_fetch: 0,
task_max_waiting_in_queue_millis: 0,
active_shards_percent_as_number: 100
}
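
也可以用_cat API快速查看节点列表(补充示例):

curl -XGET http://192.168.56.131:9200/_cat/nodes?v

输出中master列带*号的即当前主节点。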

开机启动ElasticSearch

方案一 在/etc/init.d目录下

创建名为elasticsearch的文件,内容如下

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
#!/bin/sh
#description: elasticsearch
### BEGIN INIT INFO
# Provides: elasticsearch
# Required-Start: $all
# Required-Stop:
# Default-Start: 2 3 4 5
# Default-Stop:
# Short-Description: start elasticsearch
### END INIT INFO


case "$1" in
start)
cd /opt/elasticsearch-7.6.1
su es<<!
./bin/elasticsearch -d
!
echo "elasticsearch startup"
;;
stop)
es_pid=`ps aux|grep elasticsearch | grep -v 'grep elasticsearch' | awk '{print $2}'`
kill -9 $es_pid
echo "elasticsearch stopped"
;;
restart)
es_pid=`ps aux|grep elasticsearch | grep -v 'grep elasticsearch' | awk '{print $2}'`
kill -9 $es_pid
echo "elasticsearch stopped"
cd /opt/elasticsearch-7.6.1
su es<<!
./bin/elasticsearch -d
!
echo "elasticsearch startup"
;;
*)
echo "start|stop|restart"
;;
esac

exit $?
1
$ chmod a+x /etc/init.d/elasticsearch

方案二 在/lib/systemd/system目录下

创建名为elasticsearch.service的文件,内容如下

1
2
3
4
5
6
7
8
9
10
11
12
13
[Unit]
Description=elasticsearch
After=network.target

[Service]
Type=simple
ExecStart=/bin/bash /opt/elasticsearch-7.6.1/bin/elasticsearch
Restart=always
User=es
Group=es
WorkingDirectory=/opt/elasticsearch-7.6.1
[Install]
WantedBy=multi-user.target

重载配置文件

1
$ sudo systemctl daemon-reload

启用服务

1
$ sudo systemctl enable elasticsearch

启动服务

1
$ sudo systemctl start elasticsearch

如果启动失败,想看下日志

1
2
3
4
5
6
7
8
# 查看状态
$ sudo systemctl status elasticsearch

# 查看日志
$ sudo journalctl -u elasticsearch

# 实时输出最新日志
$ sudo journalctl --follow -u elasticsearch

通过systemctl 启动服务报如下错误

1
2
3
4
5
6
7
Mar 15 10:08:51 ubuntu bash[1308]: ERROR: [1] bootstrap checks failed
Mar 15 10:08:51 ubuntu bash[1308]: [1]: max file descriptors [4096] for elasticsearch process is too low, increase to at least [65535]
Mar 15 10:08:51 ubuntu bash[1308]: ERROR: Elasticsearch did not exit normally - check the logs at /opt/elasticsearch-7.6.1/logs/my-es.log
Mar 15 10:08:51 ubuntu bash[1308]: [2020-03-15T02:08:51,352][INFO ][o.e.n.Node ] [node-1] stopping ...
Mar 15 10:08:51 ubuntu bash[1308]: [2020-03-15T02:08:51,462][INFO ][o.e.n.Node ] [node-1] stopped
Mar 15 10:08:51 ubuntu bash[1308]: [2020-03-15T02:08:51,463][INFO ][o.e.n.Node ] [node-1] closing ...
Mar 15 10:08:51 ubuntu bash[1308]: [2020-03-15T02:08:51,535][INFO ][o.e.n.Node ] [node-1] closed

划重点:通过systemd自建elasticsearch系统服务的形式来启动es,则需要修改/etc/systemd/user.conf、/etc/systemd/system.conf这两个文件,添加以下配置

1
DefaultLimitNOFILE=65536

常见错误

1
2
3
4
5
6
7
8
9
10
11
12
13
org.elasticsearch.bootstrap.StartupException: java.lang.RuntimeException: can not run elasticsearch as root
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:174) ~[elasticsearch-7.6.1.jar:7.6.1]
at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:161) ~[elasticsearch-7.6.1.jar:7.6.1]
at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86) ~[elasticsearch-7.6.1.jar:7.6.1]
at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:125) ~[elasticsearch-cli-7.6.1.jar:7.6.1]
at org.elasticsearch.cli.Command.main(Command.java:90) ~[elasticsearch-cli-7.6.1.jar:7.6.1]
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:126) ~[elasticsearch-7.6.1.jar:7.6.1]
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:92) ~[elasticsearch-7.6.1.jar:7.6.1]
Caused by: java.lang.RuntimeException: can not run elasticsearch as root
at org.elasticsearch.bootstrap.Bootstrap.initializeNatives(Bootstrap.java:105) ~[elasticsearch-7.6.1.jar:7.6.1]
at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:172) ~[elasticsearch-7.6.1.jar:7.6.1]
at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:349) ~[elasticsearch-7.6.1.jar:7.6.1]
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:170) ~[elasticsearch-7.6.1.jar:7.6.1]

这个是最常见的了,不能用root用户来启动elasticsearch

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
Exception in thread "main" org.elasticsearch.bootstrap.BootstrapException: java.nio.file.AccessDeniedException: /opt/elasticsearch-7.6.1/config/elasticsearch.keystore
Likely root cause: java.nio.file.AccessDeniedException: /opt/elasticsearch-7.6.1/config/elasticsearch.keystore
at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:90)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116)
at java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:219)
at java.base/java.nio.file.Files.newByteChannel(Files.java:374)
at java.base/java.nio.file.Files.newByteChannel(Files.java:425)
at org.apache.lucene.store.SimpleFSDirectory.openInput(SimpleFSDirectory.java:77)
at org.elasticsearch.common.settings.KeyStoreWrapper.load(KeyStoreWrapper.java:219)
at org.elasticsearch.bootstrap.Bootstrap.loadSecureSettings(Bootstrap.java:234)
at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:305)
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:170)
at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:161)
at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86)
at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:125)
at org.elasticsearch.cli.Command.main(Command.java:90)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:126)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:92)

处理方法 chown -R es:es $ES_HOME/config

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
2020-03-07 14:50:20,774 main ERROR Unable to invoke factory method in class org.apache.logging.log4j.core.appender.RollingFileAppender for element RollingFile: java.lang.IllegalStateException: No factory method found for class org.apache.logging.log4j.core.appender.RollingFileAppender java.lang.IllegalStateException: No factory method found for class org.apache.logging.log4j.core.appender.RollingFileAppender
at org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.findFactoryMethod(PluginBuilder.java:235)
at org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.build(PluginBuilder.java:135)
at org.apache.logging.log4j.core.config.AbstractConfiguration.createPluginObject(AbstractConfiguration.java:959)
at org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:899)
at org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:891)
at org.apache.logging.log4j.core.config.AbstractConfiguration.doConfigure(AbstractConfiguration.java:514)
at org.apache.logging.log4j.core.config.AbstractConfiguration.initialize(AbstractConfiguration.java:238)
at org.apache.logging.log4j.core.config.AbstractConfiguration.start(AbstractConfiguration.java:250)
at org.apache.logging.log4j.core.LoggerContext.setConfiguration(LoggerContext.java:547)
at org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:263)
at org.elasticsearch.common.logging.LogConfigurator.configure(LogConfigurator.java:234)
at org.elasticsearch.common.logging.LogConfigurator.configure(LogConfigurator.java:127)
at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:310)
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:170)
at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:161)
at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86)
at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:125)
at org.elasticsearch.cli.Command.main(Command.java:90)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:126)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:92)

处理方法 chown -R es:es $ES_HOME/logs

1
2
3
4
5
6
7
8
9
10
11
12
13
14
[2020-03-07T14:11:03,119][WARN ][o.e.c.c.Coordinator      ] [node-1] failed to validate incoming join request from node [{node-2}{OTFjHR3wRuyQBMbb9PdGoQ}{GH_dXyYZQIuyjnnBTsELRw}{192.168.56.132}{192.168.56.132:9300}{dil}{ml.machine_memory=3665874944, ml.max_open_jobs=20, xpack.installed=true}]
org.elasticsearch.transport.RemoteTransportException: [node-2][192.168.56.132:9300][internal:cluster/coordination/join/validate]
Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: join validation on cluster state with a different cluster uuid of6I6UasSd209Y_5-YdZ2A than local cluster uuid 4jifFlHcR9K7wQflerzo3g, rejecting
at org.elasticsearch.cluster.coordination.JoinHelper.lambda$new$4(JoinHelper.java:148) ~[elasticsearch-7.6.1.jar:7.6.1]
at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$1.doRun(SecurityServerTransportInterceptor.java:257) ~[?:?]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-7.6.1.jar:7.6.1]
at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler.messageReceived(SecurityServerTransportInterceptor.java:315) ~[?:?]
at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:63) ~[elasticsearch-7.6.1.jar:7.6.1]
at org.elasticsearch.transport.InboundHandler$RequestHandler.doRun(InboundHandler.java:264) ~[elasticsearch-7.6.1.jar:7.6.1]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:692) ~[elasticsearch-7.6.1.jar:7.6.1]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-7.6.1.jar:7.6.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:830) [?:?]

If cluster.name is identical on every node but the error above still appears, the nodes have already bootstrapped separate clusters with different cluster UUIDs (typically because each node was first started on its own). Delete the data directory on the nodes that should rejoin, then restart them.
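A minimal sketch of that recovery, assuming a tar.gz install with the data directory under /opt/elasticsearch/data (a hypothetical path; wiping it discards any data already indexed on that node):

# on the node that fails to join (node-2 in the log above)
systemctl stop elasticsearch          # or stop the es process however it is managed
rm -rf /opt/elasticsearch/data/*      # hypothetical data path; removes the locally bootstrapped cluster state
systemctl start elasticsearch

# run against each node and confirm they now report the same cluster_uuid
curl -s http://192.168.56.132:9200/ | grep cluster_uuid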

Appendix

Elasticsearch has no official desktop client, but elasticsearch-head is fairly convenient for quickly querying data.
Here is a systemd service unit for elasticsearch-head.

[Unit]
Description=elasticsearch head
After=network.target

[Service]
Type=simple
Environment=PATH=/opt/node-v12.16.1/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
ExecStart=/opt/node-v12.16.1/bin/npm run start
Restart=always
User=es
Group=es
WorkingDirectory=/opt/elasticsearch-head

[Install]
WantedBy=multi-user.target

Install the Node.js environment first, then run npm install in the /opt/elasticsearch-head directory to install the required modules, for example:
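(A sketch, assuming the head code is cloned from the mobz/elasticsearch-head repository on GitHub and Node.js is unpacked under /opt/node-v12.16.1 as referenced in the unit above.)

export PATH=/opt/node-v12.16.1/bin:$PATH
git clone https://github.com/mobz/elasticsearch-head.git /opt/elasticsearch-head
cd /opt/elasticsearch-head && npm install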

Run the following commands and the service will also start automatically on boot:

$ systemctl daemon-reload
$ systemctl enable elasticsearch-head
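
To start it immediately and check that it is serving (elasticsearch-head listens on port 9100 by default):

$ systemctl start elasticsearch-head
$ curl -s http://localhost:9100/ | head -n 5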

References

https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-network.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/targz.html
https://www.elastic.co/guide/en/elasticsearch/reference/7.6/docker.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/settings.html
https://www.linuxtechi.com/set-ulimit-file-descriptors-limit-linux-servers/
https://askubuntu.com/questions/1102512/set-ulimit-for-non-root-user
https://superuser.com/questions/1200539/cannot-increase-open-file-limit-past-4096-ubuntu/1200818#1200818
http://www.ruanyifeng.com/blog/2018/03/systemd-timer.html

How to Build a Docker Image for a Spring Boot Application

Posted on 2019-12-02 |
Word count: 1,990


Background

Spring Boot is everywhere these days, and so are microservices; more and more Spring Boot applications are run in Docker containers. To run in a container, the application first has to be packaged as a Docker image.

Build methods

There are currently three main approaches:

  • 1. Build with the dockerfile-maven-plugin
  • 2. Build with the Google Jib plugin
  • 3. Build manually

Let's go through them one by one, starting with the first.

Building with the dockerfile-maven-plugin

Project: https://github.com/spotify/dockerfile-maven

Create a Dockerfile in the project root

Several Dockerfiles are given below; pick the one that matches your situation.

  1. Minimal version

    FROM openjdk:8-jdk-alpine
    MAINTAINER jaychang <jaychang1987@gmail.com>
    VOLUME /tmp
    ARG FILE
    ARG APP_NAME
    ADD ${FILE} /app.jar
    ENV JAVA_OPTS="-XX:MaxRAMFraction=2"
    # Configure aliyun alpine software source and timezone
    RUN sed -i "s#http://dl-cdn.alpinelinux.org#https://mirrors.aliyun.com#g" /etc/apk/repositories \
    && apk update \
    && apk --no-cache add tzdata curl tini \
    && cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime \
    && echo "Asia/Shanghai" > /etc/timezone \
    && rm -rf /var/cache/apk/*
    # To reduce Tomcat startup time we added a system property pointing to "/dev/urandom" as a source of entropy.
    ENTRYPOINT ["/sbin/tini", "--"]
    CMD ["sh","-c","java $JAVA_OPTS -Djava.security.egd=file:/dev/./urandom -jar app.jar"]
  2. With -D parameters (Pinpoint agent)

FROM openjdk:8-jdk-alpine
MAINTAINER jaychang <jaychang1987@gmail.com>
VOLUME /tmp
ARG FILE
ARG APP_NAME
ADD ${FILE} /app.jar
# PINPOINT_VERSION defaults to 1.6.2; override it at runtime with docker run -e, docker service -e, or environment in docker-compose.yml
ENV PINPOINT_VERSION=1.6.2
ENV JAVA_OPTS="-XX:MaxRAMFraction=2"
ENV PINPOINT_OPTS="-Dpinpoint.agentId=${APP_NAME} -Dpinpoint.applicationName=$APP_NAME"
# Configure aliyun alpine software source and timezone
RUN sed -i "s/http:\/\/dl-cdn.alpinelinux.org/https:\/\/mirrors.aliyun.com/g" /etc/apk/repositories \
&& apk update \
&& apk --no-cache add tzdata curl tini \
&& cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime \
&& echo "Asia/Shanghai" > /etc/timezone \
&& rm -rf /var/cache/apk/*
# To reduce Tomcat startup time we added a system property pointing to "/dev/urandom" as a source of entropy.
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["sh","-c","java $JAVA_OPTS -javaagent:/pinpoint_data/pinpoint-agent-${PINPOINT_VERSION}/pinpoint-bootstrap-$PINPOINT_VERSION.jar $PINPOINT_OPTS -Djava.security.egd=file:/dev/./urandom -jar app.jar"]
  3. With fonts

    Typical scenarios: Excel export and CAPTCHA image generation.

FROM openjdk:8-jdk-alpine
MAINTAINER jaychang <jaychang1987@gmail.com>
VOLUME /tmp
ARG FILE
ARG APP_NAME
ADD ${FILE} /app.jar
ENV JAVA_OPTS="-XX:MaxRAMFraction=2"
# Configure aliyun alpine software source and timezone
RUN sed -i "s/http:\/\/dl-cdn.alpinelinux.org/https:\/\/mirrors.aliyun.com/g" /etc/apk/repositories \
&& apk update \
&& apk --no-cache add tzdata curl tini \
&& apk --no-cache add font-adobe-100dpi ttf-dejavu fontconfig \
&& cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime \
&& echo "Asia/Shanghai" > /etc/timezone \
&& rm -rf /var/cache/apk/*
# To reduce Tomcat startup time we added a system property pointing to "/dev/urandom" as a source of entropy.
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["sh","-c","java $JAVA_OPTS -Djava.security.egd=file:/dev/./urandom -jar app.jar"]

Configure the Maven plugins in pom.xml

<build>
<plugins>
<plugin>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-maven-plugin</artifactId>
<version>2.1.6.RELEASE</version>
<executions>
<execution>
<goals>
<goal>repackage</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>com.spotify</groupId>
<artifactId>dockerfile-maven-plugin</artifactId>
<version>1.4.12</version>
<executions>
<execution>
<id>build-image</id>
<phase>package</phase>
<goals>
<goal>build</goal>
</goals>
</execution>
<execution>
<id>tag-image-version</id>
<phase>package</phase>
<goals>
<goal>tag</goal>
<goal>push</goal>
</goals>
<configuration>
<tag>${project.version}</tag>
</configuration>
</execution>
<execution>
<id>tag-image-latest</id>
<phase>package</phase>
<goals>
<goal>tag</goal>
<goal>push</goal>
</goals>
<configuration>
<tag>latest</tag>
</configuration>
</execution>
</executions>
<configuration>
<tag>${project.version}</tag>
<repository>${docker.image.prefix}/${project.artifactId}</repository>
<buildArgs>
<FILE>target/${project.build.finalName}.jar</FILE>
<APP_NAME>${project.artifactId}</APP_NAME>
</buildArgs>
</configuration>
</plugin>
</plugins>
</build>

docker.image.prefix can be defined up front in the properties section. To push to different registries for different environments, use profiles to define a different docker.image.prefix per environment, for example:
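(A sketch; the prod profile name and the registry address are assumptions, not part of the original project.)

# build and push using the registry configured in a "prod" Maven profile (hypothetical profile name)
mvn clean package -P prod -Dmaven.test.skip=true

# or override the prefix directly on the command line (hypothetical registry address)
mvn clean package -Ddocker.image.prefix=harbor.example.com/library -Dmaven.test.skip=true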

Build the image

Prerequisite: the Docker daemon must already be installed and running.

mvn clean package -Dmaven.test.skip=true -U
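
Once the build finishes, the image can be listed and smoke-tested locally (a sketch; the image name follows the ${docker.image.prefix}/${project.artifactId} pattern from the plugin configuration and is only an example):

docker images | grep demo-service     # demo-service is a placeholder artifactId
docker run -d --name demo-service -p 8080:8080 harbor.example.com/library/demo-service:latest
docker logs -f demo-service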

Building with the Google Jib plugin

Project: https://github.com/GoogleContainerTools/jib

The advantage of Google Jib is that it requires neither a pre-installed Docker nor a Dockerfile. Jib also layers the image: dependency jars form one layer, resources another, and the project's own classes a third. If only the project code changes, rebuilds are very fast and the pushed image delta is small, which saves a lot of network traffic.
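The layering is easy to see after a build by inspecting the image history (a sketch; the image name is an example, and docker history requires the image to be present in the local Docker daemon, e.g. after a pull or a jib:dockerBuild):

docker history harbor.example.com/library/demo-service:latest   # hypothetical image name
# dependency jars, resources and classes show up as separate layers,
# so a code-only change rebuilds and pushes just the small classes layer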

Configure the Maven plugins in pom.xml

<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.8.1</version>
<configuration>
<source>${java.version}</source>
<target>${java.version}</target>
<encoding>${project.build.sourceEncoding}</encoding>
<annotationProcessorPaths>
<path>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
<version>${lombok.version}</version>
</path>
<path>
<groupId>org.mapstruct</groupId>
<artifactId>mapstruct-processor</artifactId>
<version>${mapstruct.version}</version>
</path>
<!-- other annotation processors -->
</annotationProcessorPaths>
<compilerArgs>
<compilerArg>-Amapstruct.defaultComponentModel=default</compilerArg>
<compilerArg>-Amapstruct.unmappedTargetPolicy=WARN</compilerArg>
</compilerArgs>
</configuration>
</plugin>
<plugin>
<groupId>com.google.cloud.tools</groupId>
<artifactId>jib-maven-plugin</artifactId>
<version>1.7.0</version>
<configuration>
<from>
<image>harbor.chaomeifan.com/library/openjdk:8-jdk-alpine-v1</image>
</from>
<allowInsecureRegistries>true</allowInsecureRegistries>
<to>
<image>${docker.image.prefix}/${project.artifactId}</image>
<tags>
<tag>${project.version}</tag>
<tag>latest</tag>
</tags>
</to>
<container>
<creationTime>USE_CURRENT_TIMESTAMP</creationTime>
<environment>
<PINPOINT_VERSION>1.8.4</PINPOINT_VERSION>
<PINPOINT_OPTS>-Dpinpoint.agentId=${project.artifactId} -Dpinpoint.applicationName=${project.artifactId}</PINPOINT_OPTS>
<NACOS_OPTS>-Dregistry.nacos.namespace=bbae7e8e-29cb-48e3-8c36-c8a9759975a9</NACOS_OPTS>
</environment>
<user>zcckj:zcckj</user>
<entrypoint>
<arg>/sbin/tini</arg>
<arg>--</arg>
<arg>/bin/sh</arg>
<arg>-c</arg>
<arg>java ${JAVA_OPTS} -javaagent:/pinpoint_data/pinpoint-agent-${PINPOINT_VERSION}/pinpoint-bootstrap-$PINPOINT_VERSION.jar $PINPOINT_OPTS $NACOS_OPTS -Djava.security.egd=file:/dev/./urandom -cp /app/resources/:/app/classes/:/app/libs/* com.zcckj.jingtian.biz.server.JingtianBizApplication</arg>
</entrypoint>
</container>
</configuration>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>build</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
  • lombok.version is 1.18.8

  • mapstruct.version is 1.3.0.Final

If your project does not use Pinpoint or Nacos, the configuration can be trimmed down to the following:

<build>
<plugins>
<plugin>compiler plugin omitted (same as above)...</plugin>
<plugin>
<groupId>com.google.cloud.tools</groupId>
<artifactId>jib-maven-plugin</artifactId>
<version>1.7.0</version>
<configuration>
<from>
<image>harbor.chaomeifan.com/library/openjdk:8-jdk-alpine-v1</image>
</from>
<allowInsecureRegistries>true</allowInsecureRegistries>
<to>
<image>${docker.image.prefix}/${project.artifactId}</image>
<tags>
<tag>${project.version}</tag>
<tag>latest</tag>
</tags>
</to>
<container>
<creationTime>USE_CURRENT_TIMESTAMP</creationTime>
<user>zcckj:zcckj</user>
<entrypoint>
<arg>/sbin/tini</arg>
<arg>--</arg>
<arg>/bin/sh</arg>
<arg>-c</arg>
<arg>java -cp /app/resources/:/app/classes/:/app/libs/* com.zcckj.jingtian.biz.server.JingtianBizApplication</arg>
</entrypoint>
</container>
</configuration>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>build</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>

Here a custom base image harbor.chaomeifan.com/library/openjdk:8-jdk-alpine-v1 is used; its Dockerfile is shown below.

FROM openjdk:8-jdk-alpine
MAINTAINER zhangjie <jaychang1987@gmail.com>

ARG user=zcckj
ARG group=zcckj
ARG uid=1739
ARG gid=1739

ENV JAVA_OPTS="-Xmx1024M -Xms1024M -Xmn192M -XX:MaxMetaspaceSize=256M -XX:MetaspaceSize=256M -XX:+UseParallelGC -XX:+UseAdaptiveSizePolicy -XX:MaxGCPauseMillis=100 -XX:ErrorFile=/tmp/hs_err_pid%p.log -Xloggc:/tmp/gc.log -XX:HeapDumpPath=/tmp -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+HeapDumpOnOutOfMemoryError"

# Configure aliyun alpine software source and timezone
RUN set -x \
&& sed -i "s/http:\/\/dl-cdn.alpinelinux.org/https:\/\/mirrors.aliyun.com/g" /etc/apk/repositories \
&& apk update \
&& apk --no-cache add bash bash-completion bash-doc \
&& apk --no-cache add tzdata curl tini \
&& apk --no-cache add font-adobe-100dpi ttf-dejavu fontconfig \
&& cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime \
&& echo "Asia/Shanghai" > /etc/timezone \
&& rm -rf /var/cache/apk/* \
&& addgroup --gid ${gid} ${group} \
&& adduser --uid ${uid} -G ${group} ${user} -s /bin/bash -D

Because the base image starts the process as a non-root user, any host directory mapped into the container must have its owner and group changed to the uid/gid defined in the Dockerfile: chown -R 1739:1739 /path/directory
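For example (a sketch; the host path and image name are assumptions):

# hypothetical host path and image name
mkdir -p /data/demo-service/logs
chown -R 1739:1739 /data/demo-service/logs
docker run -d -v /data/demo-service/logs:/logs harbor.example.com/library/demo-service:latest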

If that feels like too much hassle, the base image can simply run as root:

FROM openjdk:8-jdk-alpine
MAINTAINER zhangjie <jaychang1987@gmail.com>

ENV JAVA_OPTS="-Xmx1024M -Xms1024M -Xmn192M -XX:MaxMetaspaceSize=256M -XX:MetaspaceSize=256M -XX:+UseParallelGC -XX:+UseAdaptiveSizePolicy -XX:MaxGCPauseMillis=100 -XX:ErrorFile=/tmp/hs_err_pid%p.log -Xloggc:/tmp/gc.log -XX:HeapDumpPath=/tmp -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+HeapDumpOnOutOfMemoryError"

# Configure aliyun alpine software source and timezone
RUN set -x \
&& sed -i "s/http:\/\/dl-cdn.alpinelinux.org/https:\/\/mirrors.aliyun.com/g" /etc/apk/repositories \
&& apk update \
&& apk --no-cache add bash bash-completion bash-doc \
&& apk --no-cache add tzdata curl tini \
&& apk --no-cache add font-adobe-100dpi ttf-dejavu fontconfig \
&& cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime \
&& echo "Asia/Shanghai" > /etc/timezone \
&& rm -rf /var/cache/apk/*

That way there is no need to set <user>zcckj:zcckj</user> in the Jib plugin configuration.

Build the image

mvn clean compile jib:build -Dmaven.test.skip=true -DsendCredentialsOverHttp=true -U
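
jib:build pushes the image straight to the remote registry. For a purely local test, the plugin also offers jib:dockerBuild, which builds into the local Docker daemon without pushing (this one does require Docker to be installed):

mvn clean compile jib:dockerBuild -Dmaven.test.skip=true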

Manual build
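
A minimal sketch of the manual approach, assuming the minimal Dockerfile from the first section is in the project root (the jar name and image name are examples only):

# package the jar, then build and push the image by hand
mvn clean package -Dmaven.test.skip=true
docker build --build-arg FILE=target/demo-service.jar --build-arg APP_NAME=demo-service \
  -t harbor.example.com/library/demo-service:latest .      # hypothetical names
docker push harbor.example.com/library/demo-service:latest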
