pandafy@dev-logs:~$

Docker OpenWISP: Overcoming Management Tunnel Challenges

When OpenWISP is deployed over public infrastructure to manage devices geographically distributed over the internet, a management tunnel is required to perform essential network operations and monitoring checks.

The Docker OpenWISP project is an ongoing effort to support deploying OpenWISP using Docker. It splits the application into separate services, such as the database, dashboard, API, VPN, and background workers, all defined in a docker-compose.yml file.

The devices use the VPN service (openvpn) to establish the management tunnel, and the background workers service (celery) is used to perform network operations on the devices. Therefore, it is necessary for the celery service to reach the devices through the management tunnel.

NOTE: Docker OpenWISP creates two services for background workers, celery and celery_monitoring, which perform network and monitoring operations respectively. This devlog only mentions the celery service for brevity.

Based on our previous experience deploying OpenWISP on a multi-VM setup, I initially believed this would require custom network routes to forward traffic from the celery service to the openvpn service. My perspective changed when I explored the different networking modes in Docker.

I was particularly interested in the container network mode which allows one service to attach to another service’s networking stack. This approach enabled the celery service to share the networking stack of the openvpn service, providing background workers with access to the management tunnel without the need for custom networking routes.

celery:
  image: openwisp/openwisp-dashboard:latest
+ network_mode: "service:openvpn"
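
As a rough sketch of what this mode implies in a full Compose file (only the keys relevant to networking are shown; the comments describe general Compose behavior, not anything specific to Docker OpenWISP's actual file):

```yaml
services:
  openvpn:
    image: openwisp/openwisp-openvpn:latest
    # This service owns the network namespace. Interfaces and routes
    # created inside it are the ones celery will see.

  celery:
    image: openwisp/openwisp-dashboard:latest
    # Join openvpn's network namespace instead of getting a separate one.
    network_mode: "service:openvpn"
    # Note: Compose rejects a service that sets both network_mode and
    # networks, so celery must not declare a networks key of its own.
```

Since the two services share one stack, celery has no network identity of its own; it sees the same interfaces and routing table as openvpn.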

This led to a bug where the celery containers spawned without any network routes. I pinpointed the cause: the celery service was starting before the openvpn service. So I updated the celery service again to make it depend on the openvpn service.

celery:
  image: openwisp/openwisp-dashboard:latest
+ network_mode: "service:openvpn"
+ depends_on:
+   - openvpn

While it seemed like an easy fix in theory, it didn’t solve the issue. The openwisp/openwisp-openvpn image used by the openvpn service is configured to wait for the dashboard service to become available so it can download the VPN configuration generated by OpenWISP.

As a result, there is a delay between Docker recognizing the openvpn service as started and the OpenVPN process actually starting within the container. A plain depends_on only waits for the container to start, so the celery service still comes up with empty routes, missing the routes set by OpenVPN.

Thus, the openvpn service needed a health check that verifies the OpenVPN process has started in the container, and the celery service had to wait for that check to pass before starting.

openvpn:
  image: openwisp/openwisp-openvpn:latest
+ healthcheck:
+   test: ["CMD", "pgrep", "-f", "openvpn"]
+   interval: 30s
+   timeout: 10s
+   retries: 30
+   start_period: 90s
celery:
  image: openwisp/openwisp-dashboard:latest
+ network_mode: "service:openvpn"
+ depends_on:
+   openvpn:
+     condition: service_healthy

After implementing the health check, the celery containers could reliably use the management tunnel to perform network operations.
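
The note earlier leaves celery_monitoring out for brevity. Assuming it needs the tunnel in the same way and runs from the same image as celery (neither is shown in this devlog), its entry would presumably mirror the celery one:

```yaml
services:
  celery_monitoring:
    # Assumption: same image as the celery service.
    image: openwisp/openwisp-dashboard:latest
    # Share openvpn's networking stack, as celery does.
    network_mode: "service:openvpn"
    depends_on:
      openvpn:
        # Start only after the openvpn health check passes.
        condition: service_healthy
```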