Troubleshooting

This document provides solutions to common issues you may encounter when using Overlock.

Common Issues
Firewall Configuration for Remote Nodes
Getting Help
Debug Mode

Common Issues

Docker Desktop on Linux: daemon not found

Symptoms:

overlock env create fails immediately with a Docker connection error
docker ps works fine in the same shell
The active Docker context is desktop-linux

Cause: Overlock uses the Docker Go SDK, which does not read Docker CLI contexts. When Docker Desktop is installed on Linux, the daemon socket lives under ~/.docker/desktop/ rather than /var/run/docker.sock, so the SDK cannot find it without help.

Solution: Export DOCKER_HOST to point at the Docker Desktop socket:

```
export DOCKER_HOST=unix://$HOME/.docker/desktop/docker.raw.sock
```

Add the line to your shell profile (~/.bashrc, ~/.zshrc) to make it permanent.
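If the socket path differs on your machine, the Docker CLI context already records the endpoint that docker ps uses. The sketch below reads it back so the value does not have to be typed by hand. It assumes the docker CLI is on your PATH; docker_host_from_context is a hypothetical helper name, not part of Overlock.

```shell
# Print the daemon endpoint stored in the active Docker CLI context.
# With no context name, `docker context inspect` inspects the current one.
docker_host_from_context() {
    host=$(docker context inspect --format '{{ .Endpoints.docker.Host }}') || return 1
    case "$host" in
        # Accept only endpoint schemes the Docker client understands.
        unix://*|tcp://*|ssh://*|npipe://*) echo "$host" ;;
        *) echo "unexpected endpoint: '$host'" >&2; return 1 ;;
    esac
}
```

Usage: run `export DOCKER_HOST="$(docker_host_from_context)"` in the shell where you call overlock.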

Environment Creation Fails

Symptoms:

Command fails with cluster creation errors
Timeout during environment setup
Docker-related errors

Solutions:

Ensure Docker is running:
```
docker ps
```
If this fails, start the Docker daemon.
Check Kubernetes engine installation:
- For KinD: kind version
- For K3s: k3s --version
- For K3d: k3d version
Verify system resources:
- Check available memory: free -h
- Check available disk space: df -h
- Ensure at least 4 GB of RAM and 10 GB of disk space are available

Check for port conflicts:

```
# Check if required ports are in use
sudo lsof -i :6443  # Kubernetes API
sudo lsof -i :5000  # Local registry
```

Clean up existing environments:

```
overlock environment list
overlock environment delete <old-env-name>
```
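The memory and disk thresholds listed above (4 GB RAM, 10 GB disk) can be checked in one step instead of reading free -h and df -h by eye. This is a minimal sketch for a Linux host; the function names are illustrative, and which filesystem to check depends on where your Docker data lives.

```shell
# Thresholds from the list above, expressed in kB.
MIN_MEM_KB=$((4 * 1024 * 1024))
MIN_DISK_KB=$((10 * 1024 * 1024))

# Available memory in kB, read from a meminfo-style file.
mem_available_kb() {
    awk '/^MemAvailable:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Free space in kB on the filesystem that holds the given path.
disk_available_kb() {
    df -Pk "${1:-/}" | awk 'NR == 2 { print $4 }'
}

# Succeed if VALUE >= MIN; print a one-line verdict either way.
check_min() {
    label=$1 value=$2 min=$3
    if [ "$value" -ge "$min" ]; then
        echo "OK   $label: $value kB available"
    else
        echo "LOW  $label: $value kB available, need $min kB"
        return 1
    fi
}
```

Usage: `check_min memory "$(mem_available_kb)" "$MIN_MEM_KB"` and `check_min disk "$(disk_available_kb /)" "$MIN_DISK_KB"`. Replace `/` with the path of your Docker data directory if it sits on a separate filesystem.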

Package Installation Fails

Symptoms:

Configuration, provider, or function fails to install
Timeout errors
Authentication errors

Solutions:

Check internet connectivity:
```
curl -I https://xpkg.upbound.io
```
Verify package URL is correct:
- Check for typos in package name
- Verify version exists
- Try accessing URL in browser
Use debug mode for details:
```
overlock --debug provider install <url>
```
Check authentication for private registries:
- Verify registry credentials
- Ensure you're logged into the registry
Verify Crossplane is ready:
```
kubectl get pods -n crossplane-system
```
All pods should be in the Running state.
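When the namespace holds many pods, it can help to list only the ones that are not Running. The filter below is a small sketch that reads the default `kubectl get pods --no-headers` column layout (NAME, READY, STATUS, ...); not_running_pods is an illustrative name.

```shell
# Read `kubectl get pods --no-headers` output on stdin and print every pod
# whose STATUS column (third field) is not "Running".
not_running_pods() {
    awk '$3 != "Running" { print $1 " (" $3 ")" }'
}
```

Usage: `kubectl get pods -n crossplane-system --no-headers | not_running_pods`. Empty output means every pod reports Running.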

Provider Not Working

Symptoms:

Provider installed but resources not working
Authentication errors in provider logs
Resources stuck in non-ready state

Solutions:

Verify provider is installed and healthy:

```
overlock provider list
kubectl get providers
```

Check provider logs:

```
kubectl logs -n crossplane-system deployment/<provider-name>
```

Verify provider configuration:
- For GCP: Check service account key configuration
- For AWS: Verify AWS credentials
- For Azure: Check Azure credentials
Check Crossplane version compatibility:
- Some providers require specific Crossplane versions
- Check provider documentation for compatibility matrix
Verify ProviderConfig exists:
```
kubectl get providerconfigs
```

Freezing During Environment Creation

Symptom

When you create multiple environments with the Overlock CLI, the process freezes for a few minutes during the "Joining worker nodes" step and then fails with the following error:

```
ERROR: failed to create cluster: failed to join node with kubeadm: command "docker exec --privileged dest-worker kubeadm join --config /kind/kubeadm.conf --skip-phases=preflight --v=6" failed with error: exit status 1
```

Cause

When the Overlock CLI creates an environment, it also installs resources into it, which likely increases the number of inotify instances that Kubernetes and its components hold open. Combined with the instances still held by earlier Overlock environments, the total can exceed the default per-user limit. The kubelet.service on the newly created worker node then fails to start with the error: Failed to allocate directory watch: Too many open files.

Steps to Resolve

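Run the following command to raise the fs.inotify.max_user_instances setting on your host. Changing a kernel parameter requires root, and the new value lasts only until the next reboot:
```
sudo sysctl -w fs.inotify.max_user_instances=512
```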
Retry the environment creation command:
```
overlock env create <name>
```

Explanation

Why did the error occur?

The kubelet.service failed to start because the host had reached the per-user limit on inotify instances, which kubelet needs in order to watch the file system.

How did adjusting fs.inotify.max_user_instances solve the error?

Raising fs.inotify.max_user_instances lets each user allocate more inotify instances, which removes the limit that kept kubelet.service from starting.
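To confirm that inotify instances are the bottleneck before changing the limit, compare current usage with the configured maximum. The sketch below assumes a Linux host with /proc mounted; the helper names are illustrative. The count covers every process whose file descriptors the caller may read, whereas the kernel limit applies per user, so treat the number as a rough indicator rather than an exact per-user figure.

```shell
# Count open inotify instances under a /proc-style tree.
# Each instance shows up as an fd symlink that points at "anon_inode:inotify".
inotify_instances_in_use() {
    ls -l "${1:-/proc}"/[0-9]*/fd 2>/dev/null |
        awk '/anon_inode:inotify/ { n++ } END { print n + 0 }'
}

# Read the per-user instance limit.
inotify_instance_limit() {
    cat "${1:-/proc/sys/fs/inotify/max_user_instances}"
}
```

Usage: `echo "$(inotify_instances_in_use) in use, limit $(inotify_instance_limit)"`. Run it with sudo to include processes owned by other users, such as containers started by the Docker daemon.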

Node Create Hangs on "Waiting for node to appear"

Symptom

overlock env node create (k3s-docker engine) prints the node container start log, then loops forever on:

```
DEBUG   Waiting for node with label overlock.io/node=<name> to appear...
```

docker ps -a shows the agent container as exited (exit=1), and docker logs <agent-container> includes errors such as:

```
inotify_init: too many open files
error initializing watcher: too many open files
Failed to start cAdvisor: inotify_init: too many open files
```
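Agent logs can be long, so a quick filter helps to tell this failure apart from other reasons an agent container might exit. This is a minimal sketch that matches only the messages quoted above; has_inotify_error is an illustrative name.

```shell
# Succeed if the log text on stdin contains one of the inotify
# exhaustion messages listed above.
has_inotify_error() {
    grep -q -e 'inotify_init: too many open files' \
            -e 'error initializing watcher: too many open files'
}
```

Usage: `docker logs <agent-container> 2>&1 | has_inotify_error && echo "inotify limit reached"`.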

Cause

The Linux kernel enforces fs.inotify.max_user_instances per UID. The K3s server container already consumes a large share of that budget; when the agent container starts, its kubelet, cAdvisor, and dynamic plugin watchers all call inotify_init() and the kernel returns EMFILE. The agent process exits, the Kubernetes node is never registered, and the wait loop never resolves.

The default on many distributions is 128, which is too low for running a K3s server plus one or more agent containers on the same host.

Steps to Resolve

Raise the per-user inotify limits. Raise max_user_watches at the same time, because kubelet's directory watches run into that limit for the same reason:
```
sudo sysctl -w fs.inotify.max_user_instances=8192
sudo sysctl -w fs.inotify.max_user_watches=524288
```

Make the change persistent across reboots:

```
sudo tee /etc/sysctl.d/99-overlock.conf <<'EOF'
fs.inotify.max_user_instances = 8192
fs.inotify.max_user_watches   = 524288
EOF
sudo sysctl --system
```

Remove the failed agent container so the next attempt starts clean, then retry:

```
docker rm -f k3s-docker-<environment>-<node>
overlock env node create <node> --environment <environment> --engine k3s-docker
```

Verification

Check that the new limits are applied:

```
cat /proc/sys/fs/inotify/max_user_instances
cat /proc/sys/fs/inotify/max_user_watches
```

Firewall Configuration for Remote Nodes

When using the k3s-docker engine with remote nodes over SSH, the server host's firewall must allow K3s traffic. If the host uses firewalld:

Open required ports

```
sudo firewall-cmd --zone=public --add-port=6443/tcp --permanent   # K3s API server
sudo firewall-cmd --zone=public --add-port=6444/tcp --permanent   # K3s supervisor
sudo firewall-cmd --zone=public --add-port=8472/udp --permanent   # Flannel VXLAN overlay
sudo firewall-cmd --zone=public --add-port=10250/tcp --permanent  # Kubelet
```

Trust K3s interfaces

```
sudo firewall-cmd --zone=trusted --add-interface=cni0 --permanent
sudo firewall-cmd --zone=trusted --add-interface=flannel.1 --permanent
```

Apply changes

```
sudo firewall-cmd --reload
```
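After the reload, it is worth confirming that all four ports ended up in the public zone. The sketch below compares the space-separated output of `firewall-cmd --list-ports` with the ports listed above. missing_ports is an illustrative helper, and it matches exact entries only, so a port opened as part of a range (for example 6000-7000/tcp) is reported as missing.

```shell
# Ports this guide opens in the public zone.
REQUIRED_PORTS="6443/tcp 6444/tcp 8472/udp 10250/tcp"

# Print each required port that is absent from the arguments,
# which are expected to be entries such as "6443/tcp".
missing_ports() {
    open_ports=" $* "
    for port in $REQUIRED_PORTS; do
        case "$open_ports" in
            *" $port "*) ;;          # already open
            *) echo "$port" ;;       # not found in the zone
        esac
    done
}
```

Usage: `missing_ports $(sudo firewall-cmd --zone=public --list-ports)`. No output means every required port is open.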

Getting Help

Command Help

Use the --help flag to get detailed information about any command:

```
# General help
overlock --help

# Command-specific help
overlock environment --help
overlock configuration --help
overlock provider --help
```

Debug Mode

Enable debug mode to see detailed output:

```
overlock --debug <command>
```

This will show:

API calls being made
Detailed error messages
Internal operation logs
Kubernetes resource operations

Community Support

If you're still experiencing issues:

Check existing issues: Search GitHub Issues
Join Discord: Get help from the community on Discord
Create an issue: Report bugs or request features on GitHub

Providing Debug Information

When reporting issues, include:

Overlock version:
```
overlock --version
```

Debug output:

```
overlock --debug <failing-command> 2>&1 | tee debug.log
```

System information:
- Operating system and version
- Docker version: docker version
- Kubernetes engine and version
- Available resources (memory, disk)

Kubernetes state:

```
kubectl get pods -A
kubectl get providers
kubectl get configurations
```
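The items above can be gathered into a single file to attach to an issue. This is a rough sketch rather than an official Overlock tool: add_section is a hypothetical helper, and the report file name in the usage note is arbitrary. A missing command is recorded in the report instead of aborting the run.

```shell
# Append a titled section to REPORT containing the output of a command.
# Usage: add_section REPORT TITLE COMMAND [ARGS...]
add_section() {
    report=$1 title=$2
    shift 2
    {
        echo "== $title =="
        if command -v "$1" >/dev/null 2>&1; then
            # Capture stderr too, and note a non-zero exit instead of stopping.
            "$@" 2>&1 || echo "(exit status $?)"
        else
            echo "(command not found: $1)"
        fi
        echo
    } >> "$report"
}
```

Usage: call it once per item, for example `add_section overlock-report.txt "Overlock version" overlock --version`, `add_section overlock-report.txt "Docker version" docker version`, and `add_section overlock-report.txt "Pods" kubectl get pods -A`. Review the file for credentials or other sensitive values before sharing it.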

Additional Resources

Configuration Guide - Detailed configuration options
Command Reference - Complete command documentation
Examples - Common usage patterns and workflows
