I recently ran into a server where docker.service would not start after a reboot. This post records how I fixed it.
Symptoms
1. Running docker ps fails with the following error in the terminal:
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
2. Running systemctl status docker shows the following details:
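Solution
1. Check whether the docker.service and docker.socket files are intact
To my surprise, both files had vanished from this server. I copied them over from another server with a working docker installation. Docker had been installed there with its default configuration, so both files are the unmodified stock versions.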
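The check in this step can be scripted. The following is a minimal sketch, assuming the unit files live in /lib/systemd/system as they do in this post; the function name check_docker_units is made up for illustration and is not part of docker or systemd.

```shell
# Report which of the two docker unit files exist in a systemd unit directory.
# The directory defaults to /lib/systemd/system (the path used in this post);
# pass another directory as the first argument if your distribution differs.
check_docker_units() {
    unit_dir="${1:-/lib/systemd/system}"
    missing=0
    for unit in docker.service docker.socket; do
        if [ -f "$unit_dir/$unit" ]; then
            echo "ok: $unit_dir/$unit"
        else
            echo "missing: $unit_dir/$unit"
            missing=1
        fi
    done
    # Non-zero exit status when at least one file is missing.
    return "$missing"
}
```

Calling check_docker_units with no arguments inspects the default directory. On the broken server described here it would be expected to report both files as missing.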
1.1. Edit docker.service
Run vim /lib/systemd/system/docker.service and add the following content:
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target docker.socket firewalld.service containerd.service time-set.target
Wants=network-online.target containerd.service
Requires=docker.socket

[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
ExecReload=/bin/kill -s HUP $MAINPID
TimeoutStartSec=0
RestartSec=2
Restart=always

# Note that StartLimit* options were moved from "Service" to "Unit" in systemd 229.
# Both the old, and new location are accepted by systemd 229 and up, so using the old location
# to make them work for either version of systemd.
StartLimitBurst=3

# Note that StartLimitInterval was renamed to StartLimitIntervalSec in systemd 230.
# Both the old, and new name are accepted by systemd 230 and up, so using the old name to make
# this option work for either version of systemd.
StartLimitInterval=60s

# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity

# Comment TasksMax if your systemd version does not support it.
# Only systemd 226 and above support this option.
TasksMax=infinity

# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes

# kill only the docker process, not all processes in the cgroup
KillMode=process
OOMScoreAdjust=-500

[Install]
WantedBy=multi-user.target
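The unit file names other units in its Wants= and Requires= lines, and each of those has to exist on the machine too. As a rough sketch, the helper below lists them so they can be checked one by one. The name unit_dependencies is hypothetical, and the parsing assumes simple single-line directives like the ones in this file (no line continuations, no drop-in overrides).

```shell
# Print the units that a unit file pulls in through Wants= and Requires=,
# one per line, sorted and de-duplicated.
# Assumption: each directive sits on a single line, as in the file above.
unit_dependencies() {
    unit_file="$1"
    grep -E '^(Wants|Requires)=' "$unit_file" \
        | cut -d= -f2- \
        | tr ' ' '\n' \
        | grep -v '^$' \
        | sort -u
}
```

For the docker.service shown here, unit_dependencies /lib/systemd/system/docker.service would list containerd.service, docker.socket and network-online.target. Requires= is the stronger of the two directives: if docker.socket cannot be started, docker.service is not started either.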
1.2. Edit docker.socket
Run vim /lib/systemd/system/docker.socket and add the following content:
[Unit]
Description=Docker Socket for the API

[Socket]
# If /var/run is not implemented as a symlink to /run, you may need to
# specify ListenStream=/var/run/docker.sock instead.
ListenStream=/run/docker.sock
SocketMode=0660
SocketUser=root
SocketGroup=docker

[Install]
WantedBy=sockets.target
2. Restart docker
systemctl daemon-reload
systemctl restart docker
# Verify whether the problem is fixed
docker ps
# Same error as before
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
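When this error comes back, it helps to know whether the socket file itself is missing or whether it exists and nothing is answering behind it. The snippet below is only a sketch of that check. The function name docker_socket_present is invented for illustration, and the default path is taken from the error message.

```shell
# Check whether a path exists and is a Unix domain socket.
# Defaults to /var/run/docker.sock, the path named in the error message.
docker_socket_present() {
    sock="${1:-/var/run/docker.sock}"
    if [ -S "$sock" ]; then
        echo "socket present: $sock"
    else
        echo "socket missing: $sock"
        return 1
    fi
}
```

If the socket is present but docker ps still fails, the daemon behind it is the likelier culprit. Running systemctl is-active docker.socket docker.service prints the state of each unit and helps narrow that down.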
After running the restart commands, the error was still there. Running journalctl -u docker.service -f to follow the docker service logs produced the following output:
3. Check containerd
Running which containerd showed that containerd was not installed on this server. Install it with the following commands:
## Install containerd
apt-get install containerd
## Check again
which containerd
## Output
/usr/bin/containerd
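Both binaries that docker.service relies on can be checked in one pass instead of running which for each of them. This is a small sketch using the POSIX command -v builtin. The function name missing_commands is hypothetical.

```shell
# Print every command name from the arguments that cannot be found on PATH.
# Returns non-zero if at least one of them is missing.
missing_commands() {
    status=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "not found: $cmd"
            status=1
        fi
    done
    return "$status"
}
```

For this setup the call would be missing_commands dockerd containerd. It prints nothing and returns 0 once both are installed.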
4. Repeat step 2 and restart docker again
After docker restarts, run docker ps. Docker now works normally:
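5. One more small issue
While using docker afterwards, I noticed that Tab completion for the docker command no longer worked. I found the following fix online: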
## 1. Download the completion script:
curl -o /etc/bash_completion.d/docker https://raw.githubusercontent.com/docker/cli/master/contrib/completion/bash/docker
## 2. Reload the bash configuration:
source /etc/bash_completion.d/docker