Troubleshooting¶
This page provides techniques for troubleshooting common Canonical Kubernetes issues.
Kubectl error: dial tcp 127.0.0.1:6443: connect: connection refused¶
Problem¶
The kubeconfig file generated by the k8s kubectl CLI cannot be used to access the cluster from an external machine. The following error appears when running kubectl with the invalid kubeconfig:
...
E0412 08:36:06.404499 517166 memcache.go:265] couldn't get current server API group list: Get "https://127.0.0.1:6443/api?timeout=32s": dial tcp 127.0.0.1:6443: connect: connection refused
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
Explanation¶
A common way to view a cluster's kubeconfig file is the kubectl config view command. The k8s kubectl command invokes an integrated kubectl client, so k8s kubectl config view outputs a seemingly valid kubeconfig file. However, that kubeconfig is only valid on cluster nodes, where the control plane services are available on localhost endpoints.
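You can confirm this on a control plane node by inspecting the server address in the embedded client's output (illustrative output; exact values depend on your cluster):
sudo k8s kubectl config view | grep server
#     server: https://127.0.0.1:6443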
Solution¶
Use k8s config instead of k8s kubectl config to generate a kubeconfig file that is valid for use on external machines.
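For example (a minimal sketch; the output path and how you copy the file to the external machine are up to you):
# On a control plane node, generate a kubeconfig intended for external access:
sudo k8s config > cluster.kubeconfig
# Copy cluster.kubeconfig to the external machine, then use it with kubectl:
kubectl --kubeconfig ./cluster.kubeconfig get nodes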
Kubelet error: failed to initialize top level QOS containers¶
Problem¶
This error occurs when the kubepods cgroup does not get the cpuset controller enabled for the kubelet. The kubelet relies on this cgroup feature, and the kernel may not be set up appropriately to provide it.
E0125 00:20:56.003890 2172 kubelet.go:1466] "Failed to start ContainerManager" err="failed to initialise top level QOS containers: root container [kubepods] doesn't exist"
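On a cgroup v2 system you can check whether the cpuset controller is available and enabled for child cgroups (a diagnostic sketch, assuming the unified hierarchy is mounted at /sys/fs/cgroup):
# Controllers the kernel provides at the root of the hierarchy:
cat /sys/fs/cgroup/cgroup.controllers
# Controllers enabled for child cgroups; if cpuset is missing here, the
# kubepods cpuset cgroup cannot be set up:
cat /sys/fs/cgroup/cgroup.subtree_control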
Explanation¶
An excellent deep dive into the issue exists at kubernetes/kubernetes #122955. Commenter @haircommander summarizes it as follows:
basically: we’ve figured out that this issue happens because libcontainer doesn’t initialise the cpuset cgroup for the kubepods slice when the kubelet initially calls into it to do so. This happens because there isn’t a cpuset defined on the top level of the cgroup. however, we fail to validate all of the cgroup controllers we need are present. It’s possible this is a limitation in the dbus API: how do you ask systemd to create a cgroup that is effectively empty?
if we delegate: we are telling systemd to leave our cgroups alone, and not remove the “unneeded” cpuset cgroup.
Solution¶
This is in the process of being fixed upstream via kubernetes/kubernetes #125923.
In the meantime, the best solution is to create a Delegate=yes configuration in systemd:
# Create a systemd drop-in directory for the kubelet service
mkdir -p /etc/systemd/system/snap.k8s.kubelet.service.d
# Write a drop-in that tells systemd to delegate cgroup management to the service
cat > /etc/systemd/system/snap.k8s.kubelet.service.d/delegate.conf <<EOF
[Service]
Delegate=yes
EOF
# Reboot so the kubelet starts with the delegated cgroup configuration
reboot
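After the reboot, you can verify that systemd applied the drop-in (the unit name matches the snap-packaged kubelet service used above):
systemctl show snap.k8s.kubelet.service -p Delegate
# Expected output:
# Delegate=yes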
The path required for the containerd socket already exists¶
Problem¶
Canonical Kubernetes tries to create the containerd socket to manage containers, but it fails because the socket file already exists, which indicates another installation of containerd on the system.
Explanation¶
In classic confinement mode, Canonical Kubernetes uses the default containerd paths. This means that a Canonical Kubernetes installation will conflict with any existing system configuration where containerd is already installed, for example if you have Docker or another Kubernetes distribution that uses containerd.
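To see what is already occupying the default containerd paths before installing, you can check for an existing socket and running containerd services (a diagnostic sketch; the default socket path is assumed to be /run/containerd/containerd.sock):
# Check whether a containerd socket already exists at the default path:
ls -l /run/containerd/containerd.sock
# List active services related to containerd (Docker ships its own containerd):
systemctl list-units --type=service | grep -i containerd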
Solution¶
We recommend running Canonical Kubernetes in an isolated environment. For this purpose, you can create an LXD container for your installation. See Install Canonical Kubernetes in LXD for instructions.
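As a rough sketch, an isolated installation might start like this; the container name and the k8s LXD profile (created as described in the linked guide) are assumptions:
# Launch an Ubuntu container using a profile prepared for Canonical Kubernetes:
lxc launch ubuntu:24.04 k8s-node --profile default --profile k8s
# Open a shell in the container and continue the installation there:
lxc exec k8s-node -- bash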