Troubleshooting
This document provides solutions to common issues you may encounter when using Overlock.
Common Issues
Docker Desktop on Linux: daemon not found
Symptoms:
- overlock env create fails immediately with a Docker connection error
- docker ps works fine in the same shell
- The active Docker context is desktop-linux
Cause:
Overlock uses the Docker Go SDK, which does not read Docker CLI contexts. When Docker Desktop is installed on Linux, the daemon socket lives under ~/.docker/desktop/ rather than /var/run/docker.sock, so the SDK cannot find it without help.
Solution:
Export DOCKER_HOST to point at the Docker Desktop socket:
export DOCKER_HOST=unix://$HOME/.docker/desktop/docker.raw.sock
Add the line to your shell profile (~/.bashrc, ~/.zshrc) to make it permanent.
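If the same shell profile is shared between machines, the export can be guarded so that it only takes effect where the Docker Desktop socket actually exists. The following is a minimal sketch, not part of Overlock: the helper name first_docker_socket is hypothetical, and the snippet assumes you want any DOCKER_HOST that is already set to win.

```shell
# Print a unix:// URL for the first argument that is an existing socket.
# Returns non-zero when none of the candidate paths is a socket.
first_docker_socket() {
  for candidate in "$@"; do
    if [ -S "$candidate" ]; then
      printf 'unix://%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Respect an existing DOCKER_HOST; otherwise use the Docker Desktop socket if present.
if [ -z "${DOCKER_HOST:-}" ]; then
  if docker_sock="$(first_docker_socket "$HOME/.docker/desktop/docker.raw.sock")"; then
    export DOCKER_HOST="$docker_sock"
  fi
fi
```

On hosts without Docker Desktop the snippet does nothing, so the SDK keeps its default socket.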
Environment Creation Fails
Symptoms:
- Command fails with cluster creation errors
- Timeout during environment setup
- Docker-related errors
Solutions:
- Ensure Docker is running:
docker ps
If this fails, start the Docker daemon.
- Check Kubernetes engine installation:
  - For KinD: kind version
  - For K3s: k3s --version
  - For K3d: k3d version
- Verify system resources:
  - Check available memory: free -h
  - Check available disk space: df -h
  - Ensure at least 4GB RAM and 10GB disk space are available
- Check for port conflicts:
# Check if required ports are in use
sudo lsof -i :6443 # Kubernetes API
sudo lsof -i :5000 # Local registry
- Clean up existing environments:
overlock environment list
overlock environment delete <old-env-name>
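The resource check above can be scripted as a quick preflight. This is a rough sketch under stated assumptions: check_minimum is a hypothetical helper, the 4GB and 10GB thresholds are read as available memory and free space on the root filesystem, and Docker's data directory may live on a different filesystem on your host.

```shell
# Compare an available amount against a required minimum (both in KiB).
# Prints one status line and returns non-zero when below the minimum.
check_minimum() {
  label=$1
  available_kib=$2
  required_kib=$3
  if [ "$available_kib" -ge "$required_kib" ]; then
    echo "ok: $label"
    return 0
  fi
  echo "low: $label ($available_kib KiB available, $required_kib KiB required)"
  return 1
}

required_mem_kib=$((4 * 1024 * 1024))    # 4 GB expressed in KiB
required_disk_kib=$((10 * 1024 * 1024))  # 10 GB expressed in KiB

# MemAvailable is Linux-specific; the memory check is skipped elsewhere.
if [ -r /proc/meminfo ]; then
  mem_kib=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
  [ -n "$mem_kib" ] && check_minimum "memory" "$mem_kib" "$required_mem_kib" || true
fi

# df -Pk prints POSIX-format output in 1024-byte blocks; column 4 is available space.
disk_kib=$(df -Pk / | awk 'NR == 2 {print $4}')
check_minimum "disk space" "$disk_kib" "$required_disk_kib" || true
```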
Package Installation Fails
Symptoms:
- Configuration, provider, or function fails to install
- Timeout errors
- Authentication errors
Solutions:
- Check internet connectivity:
curl -I https://xpkg.upbound.io
- Verify package URL is correct:
  - Check for typos in package name
  - Verify version exists
  - Try accessing URL in browser
- Use debug mode for details:
overlock --debug provider install <url>
- Check authentication for private registries:
  - Verify registry credentials
  - Ensure you're logged into the registry
- Verify Crossplane is ready:
kubectl get pods -n crossplane-system
All pods should be in Running state.
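To spot pods that are not in the Running state without reading the whole table, the kubectl output can be filtered. A minimal sketch: pods_not_running is a hypothetical helper, and it assumes the default column layout of kubectl get pods --no-headers, where the third field is STATUS.

```shell
# Read `kubectl get pods --no-headers` output on stdin and print the name and
# status of every pod whose STATUS column (field 3) is not "Running".
# Returns non-zero if at least one such pod was found.
pods_not_running() {
  awk 'NF && $3 != "Running" { print $1, $3; bad = 1 } END { exit bad }'
}

# Usage (requires kubectl and access to the cluster):
#   kubectl get pods -n crossplane-system --no-headers | pods_not_running
```

An empty result with a zero exit status means every listed pod reported Running.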
Provider Not Working
Symptoms:
- Provider installed but resources not working
- Authentication errors in provider logs
- Resources stuck in non-ready state
Solutions:
- Verify provider is installed and healthy:
overlock provider list
kubectl get providers
- Check provider logs:
kubectl logs -n crossplane-system deployment/<provider-name>
- Verify provider configuration:
  - For GCP: Check service account key configuration
  - For AWS: Verify AWS credentials
  - For Azure: Check Azure credentials
- Check Crossplane version compatibility:
  - Some providers require specific Crossplane versions
  - Check provider documentation for compatibility matrix
- Verify ProviderConfig exists:
kubectl get providerconfigs
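When many providers are installed, it can help to list only those that are not both installed and healthy. The sketch below assumes the usual kubectl get providers column order (NAME, INSTALLED, HEALTHY, PACKAGE, AGE); unhealthy_providers is a hypothetical helper name.

```shell
# Read `kubectl get providers --no-headers` output on stdin and print the name
# of every provider whose INSTALLED (field 2) or HEALTHY (field 3) column is
# not "True". Returns non-zero if at least one such provider was found.
unhealthy_providers() {
  awk 'NF && ($2 != "True" || $3 != "True") { print $1; bad = 1 } END { exit bad }'
}

# Usage (requires kubectl and access to the cluster):
#   kubectl get providers --no-headers | unhealthy_providers
```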
Freezing During Environment Creation
Symptom
The process freezes for a few minutes during the "Joining worker nodes" step when creating multiple environments with the Overlock CLI, then fails with the following error:
ERROR: failed to create cluster: failed to join node with kubeadm: command "docker exec --privileged dest-worker kubeadm join --config /kind/kubeadm.conf --skip-phases=preflight --v=6" failed with error: exit status 1
Cause
When the Overlock CLI creates an environment, it also installs resources, which likely increases the number of inotify instances that Kubernetes and its components hold. Combined with instances still held by earlier Overlock environments, this can exceed the default per-user limit. When that happens, kubelet.service on the newly created worker node fails to start with the error: Failed to allocate directory watch: Too many open files.
Steps to Resolve
- Run the following command as root to raise the fs.inotify.max_user_instances setting on your host:
sudo sysctl -w fs.inotify.max_user_instances=512
- Retry the environment creation command:
overlock env create <name>
Explanation
Why did the error occur?
The error indicates that kubelet.service failed to start because the system reached its per-user limit on inotify instances.
How did adjusting fs.inotify.max_user_instances solve the error?
Increasing the fs.inotify.max_user_instances setting allows more inotify instances to be allocated per user, resolving the resource limitation that caused the kubelet.service to fail.
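To see how close the host is to the limit, the open inotify instances can be counted from /proc. This is a sketch that assumes Linux and GNU find (for -lname); count_inotify_instances is a hypothetical helper. Without sudo it only sees processes you own, which matches the per-user nature of the limit.

```shell
# Count open inotify instances by looking for file descriptors that point at
# anon_inode:inotify. Unreadable /proc entries are skipped silently.
count_inotify_instances() {
  find /proc/[0-9]*/fd -lname 'anon_inode:inotify' 2>/dev/null | wc -l | tr -d ' '
}

limit=$(cat /proc/sys/fs/inotify/max_user_instances 2>/dev/null)
echo "inotify instances in use (visible): $(count_inotify_instances)"
echo "fs.inotify.max_user_instances:      ${limit:-unknown}"
```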
Node Create Hangs on "Waiting for node to appear"
Symptom
overlock env node create (k3s-docker engine) prints the node container start log, then loops forever on:
DEBUG Waiting for node with label overlock.io/node=<name> to appear...
docker ps -a shows the agent container as exited (exit=1), and docker logs <agent-container> includes errors such as:
inotify_init: too many open files
error initializing watcher: too many open files
Failed to start cAdvisor: inotify_init: too many open files
Cause
The Linux kernel enforces fs.inotify.max_user_instances per UID. The K3s server container already consumes a large share of that budget; when the agent container starts, its kubelet, cAdvisor, and dynamic plugin watchers all call inotify_init() and the kernel returns EMFILE. The agent process exits, the Kubernetes node is never registered, and the wait loop never resolves.
The default on many distributions is 128, which is too low for running a K3s server plus one or more agent containers on the same host.
Steps to Resolve
- Raise the per-user inotify limits (also raise max_user_watches while you're there; it has the same root cause for kubelet's directory watches):
sudo sysctl -w fs.inotify.max_user_instances=8192
sudo sysctl -w fs.inotify.max_user_watches=524288
- Make the change persistent across reboots:
sudo tee /etc/sysctl.d/99-overlock.conf <<'EOF'
fs.inotify.max_user_instances = 8192
fs.inotify.max_user_watches = 524288
EOF
sudo sysctl --system
- Remove the failed agent container so the next attempt starts clean, then retry:
docker rm -f k3s-docker-<environment>-<node>
overlock env node create <node> --environment <environment> --engine k3s-docker
Verification
Check the new limits are applied:
cat /proc/sys/fs/inotify/max_user_instances
cat /proc/sys/fs/inotify/max_user_watches
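The two values can also be compared against the targets used in this section in one pass. A small sketch: limit_at_least is a hypothetical helper, and a missing /proc entry is treated as 0 so that it is reported as too low.

```shell
# Return success when the current value ($1) is at least the wanted value ($2).
limit_at_least() {
  [ "$1" -ge "$2" ]
}

for entry in max_user_instances:8192 max_user_watches:524288; do
  name=${entry%%:*}
  wanted=${entry##*:}
  current=$(cat "/proc/sys/fs/inotify/$name" 2>/dev/null || echo 0)
  if limit_at_least "$current" "$wanted"; then
    echo "ok: fs.inotify.$name = $current"
  else
    echo "too low: fs.inotify.$name = $current (wanted >= $wanted)"
  fi
done
```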
Firewall Configuration for Remote Nodes
When using the k3s-docker engine with remote nodes over SSH, the server host's firewall must allow K3s traffic. If you use firewalld:
Open required ports
sudo firewall-cmd --zone=public --add-port=6443/tcp --permanent # K3s API server
sudo firewall-cmd --zone=public --add-port=6444/tcp --permanent # K3s supervisor
sudo firewall-cmd --zone=public --add-port=8472/udp --permanent # Flannel VXLAN overlay
sudo firewall-cmd --zone=public --add-port=10250/tcp --permanent # Kubelet
Trust K3s interfaces
sudo firewall-cmd --zone=trusted --add-interface=cni0 --permanent
sudo firewall-cmd --zone=trusted --add-interface=flannel.1 --permanent
Apply changes
sudo firewall-cmd --reload
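After the reload, you can check that the four ports are open in the public zone. The sketch below parses the space-separated output of firewall-cmd --list-ports; missing_k3s_ports is a hypothetical helper, and it does not recognise ports that are covered only by a range such as 6000-7000/tcp.

```shell
# Given the output of `firewall-cmd --zone=public --list-ports`, print each
# required K3s port that is missing from it, one per line.
missing_k3s_ports() {
  open_ports=" $1 "
  for port in 6443/tcp 6444/tcp 8472/udp 10250/tcp; do
    case "$open_ports" in
      *" $port "*) ;;          # port is listed, nothing to report
      *) echo "$port" ;;
    esac
  done
}

# Usage (requires firewalld):
#   missing_k3s_ports "$(sudo firewall-cmd --zone=public --list-ports)"
```

No output means all four ports are listed.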
Getting Help
Command Help
Use the --help flag to get detailed information about any command:
# General help
overlock --help
# Command-specific help
overlock environment --help
overlock configuration --help
overlock provider --help
Debug Mode
Enable debug mode to see detailed output:
overlock --debug <command>
This will show:
- API calls being made
- Detailed error messages
- Internal operation logs
- Kubernetes resource operations
Community Support
If you're still experiencing issues:
- Check existing issues: Search GitHub Issues
- Join Discord: Get help from the community on Discord
- Create an issue: Report bugs or request features on GitHub
Providing Debug Information
When reporting issues, include:
- Overlock version:
overlock --version
- Debug output:
overlock --debug <failing-command> 2>&1 | tee debug.log
- System information:
  - Operating system and version
  - Docker version: docker version
  - Kubernetes engine and version
  - Available resources (memory, disk)
- Kubernetes state:
kubectl get pods -A
kubectl get providers
kubectl get configurations
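Most of the items listed above can be gathered into a single report to attach to an issue. This is a convenience sketch, not an Overlock feature: report_section and collect_report are hypothetical names, and tools that are not installed are noted in the report instead of stopping it.

```shell
# Print a section header, then the command's output (stdout and stderr).
# Commands that are not installed are noted instead of aborting the report.
report_section() {
  title=$1
  shift
  echo "== $title"
  if command -v "$1" >/dev/null 2>&1; then
    "$@" 2>&1 || echo "(command exited with status $?)"
  else
    echo "(not available: $1)"
  fi
  echo
}

# Run each diagnostic command from the list above and print one report.
collect_report() {
  report_section "Overlock version" overlock --version
  report_section "Operating system" uname -a
  report_section "Docker version" docker version
  report_section "Memory" free -h
  report_section "Disk" df -h
  report_section "Pods" kubectl get pods -A
  report_section "Providers" kubectl get providers
  report_section "Configurations" kubectl get configurations
}

# Usage:
#   collect_report > overlock-report.txt
```

Review the report before sharing it, since command output can include hostnames and registry URLs.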
Additional Resources
- Configuration Guide - Detailed configuration options
- Command Reference - Complete command documentation
- Examples - Common usage patterns and workflows