---
slug: opnsense-virtualization-highly-available
title: Build a Highly Available OPNsense Cluster on Proxmox VE
description: A proof of concept showing how to virtualize OPNsense on Proxmox VE, configure high availability with CARP and pfSync and handle a single WAN IP.
date: 2025-09-29
draft: true
tags:
- opnsense
- proxmox
- high-availability
categories:
- homelab
---
## Intro

I recently hit my first real problem: my physical **OPNsense** box crashed because of a _kernel panic_. I detailed what happened in [this article]({{< ref "post/10-opnsense-crash-disk-panic" >}}).

That failure made me rethink my setup. A single firewall is a single point of failure, so to improve resilience I decided to take a new approach: **virtualize OPNsense**.

Of course, just running one VM wouldn't be enough. To get real redundancy, I need two OPNsense instances in **High Availability**, one active and the other standing by.

Before rolling this out on my network, I wanted to validate the idea in my homelab. In this post, I'll walk through the proof of concept: deploying two OPNsense VMs inside a **Proxmox VE** cluster and configuring them to provide a highly available firewall.

---
## Current Infrastructure

At the edge of my setup, my ISP modem, a _Freebox_ in bridge mode, connects directly to the `igc0` interface of my OPNsense box, serving as the **WAN**. On `igc1`, the **LAN** is linked to my main switch through a trunk port, with VLAN 1 as the native VLAN for my management network.

The switch also connects my three Proxmox nodes, each on a trunk port with the same native VLAN. Every node has two NICs: one for general networking and the other dedicated to the Ceph storage network, which runs through a separate 2.5 Gbps switch.

Since the OPNsense crash, I've simplified the architecture by removing the LACP link, which wasn't adding real value:

![Network diagram of my setup with the ISP modem in bridge mode and OPNsense](img/opnsense-current-home-network.png)

Until recently, Proxmox networking on my cluster was very basic: each node was configured individually with no real shared logic. That changed after I discovered Proxmox SDN, which let me centralize VLAN definitions across the whole cluster. I described that step in [this article]({{< ref "post/11-proxmox-cluster-networking-sdn" >}}).

---
## Proof of Concept

Time for the lab. Here are the main steps:

1. Add a few VLANs to my homelab
2. Create a fake ISP router
3. Build two OPNsense VMs
4. Configure high availability
5. Test the failover

![Target architecture of the OPNsense HA proof of concept](img/opnsense-poc-architecture.png)
### Add VLANs to My Homelab

For this experiment, I create three new VLANs:

- **VLAN 101**: _POC WAN_
- **VLAN 102**: _POC LAN_
- **VLAN 103**: _POC pfSync_

In the Proxmox UI, I go to `Datacenter` > `SDN` > `VNets` and click `Create`:

![VNet creation in the Proxmox SDN interface](img/proxmox-sdn-create-vnet-poc-vlan.png)

Once the three VLANs are created, I apply the configuration.
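
For reference, the same VNets could also be created from a node's shell with `pvesh`. This is only a minimal sketch, assuming an existing VLAN-capable SDN zone; the zone name `homelab` and the VNet names below are placeholders, not the values used here:

```bash
# Create one VNet per VLAN in an existing VLAN zone (zone/VNet names are placeholders)
pvesh create /cluster/sdn/vnets --vnet pocwan  --zone homelab --tag 101
pvesh create /cluster/sdn/vnets --vnet poclan  --zone homelab --tag 102
pvesh create /cluster/sdn/vnets --vnet pocsync --zone homelab --tag 103

# Apply the pending SDN configuration to the whole cluster
pvesh set /cluster/sdn
```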

I also add these three VLANs to my UniFi Controller. Only the VLAN ID and name are needed here; the controller propagates them through the trunks connected to my Proxmox VE nodes.

### Create a “Fake ISP Box” VM

To simulate my current ISP modem, I built a VM named `fake-freebox`. This VM routes traffic between the _POC WAN_ and _Lab_ networks and runs a DHCP server that hands out a single lease, just like my real Freebox in bridge mode.

The VM has two NICs, which I configure with Netplan:

- `eth0` (_POC WAN_, VLAN 101): static IP address `10.101.0.254/24`
- `enp6s19` (Lab, VLAN 66): IP address obtained over DHCP from my current OPNsense router, upstream
```yaml
network:
  version: 2
  ethernets:
    eth0:
      addresses:
        - 10.101.0.254/24
    enp6s19:
      dhcp4: true
```

I then enable IP forwarding so the VM can route traffic:

```bash
echo "net.ipv4.ip_forward=1" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
```

Then I set up masquerading (NAT) so that packets leaving through the Lab network aren't dropped by my current OPNsense:

```bash
sudo iptables -t nat -A POSTROUTING -o enp6s19 -j MASQUERADE
sudo apt install iptables-persistent -y
sudo netfilter-persistent save
```

I install `dnsmasq` as a lightweight DHCP server:

```bash
sudo apt install dnsmasq -y
```

In `/etc/dnsmasq.conf`, I configure a single lease (`10.101.0.150`) and point DNS to my current OPNsense, on the _Lab_ VLAN:

```
interface=eth0
bind-interfaces
dhcp-range=10.101.0.150,10.101.0.150,255.255.255.0,12h
dhcp-option=3,10.101.0.254  # default gateway = this VM
dhcp-option=6,192.168.66.1  # DNS server
```

I restart the `dnsmasq` service to apply the configuration:

```bash
sudo systemctl restart dnsmasq
```
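
To double-check the setup from the `fake-freebox` itself, a couple of quick commands help. A small sketch, assuming a Debian/Ubuntu guest where dnsmasq keeps its leases in `/var/lib/misc/dnsmasq.leases`:

```bash
# Confirm dnsmasq is listening for DHCP requests on UDP port 67
sudo ss -ulpn | grep ':67 '

# Inspect the single lease once a client has requested one
cat /var/lib/misc/dnsmasq.leases
```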

The `fake-freebox` VM is now ready to serve DHCP on VLAN 101, with only one lease available.

### Build the OPNsense VMs

I start by downloading the OPNsense ISO and uploading it to one of my Proxmox nodes:

![Uploading the OPNsense ISO to a Proxmox node](img/proxmox-download-opnsense-iso.png)

#### VM Creation

I create the first VM, `poc-opnsense-1`, with the following settings (a CLI equivalent is sketched below the screenshot):

- OS type: Linux (even though OPNsense is FreeBSD-based)
- Machine type: `q35`
- BIOS: `OVMF (UEFI)`, EFI storage on my Ceph pool
- Disk: 20 GiB on Ceph
- CPU/RAM: 2 vCPU, 2 GiB of RAM
- Network interfaces:
  1. VLAN 101 (_POC WAN_)
  2. VLAN 102 (_POC LAN_)
  3. VLAN 103 (_POC pfSync_)

![Summary of the poc-opnsense-1 VM creation in Proxmox](img/proxmox-opnsense-vm-creation.png)

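For reference, roughly the same VM could be created from a node's shell with `qm`. This is only a sketch under assumptions: the VM ID, the Ceph storage name `ceph-pool`, the bridge names matching the SDN VNets, and the ISO file name are all placeholders:

```bash
qm create 9001 \
  --name poc-opnsense-1 \
  --ostype l26 \
  --machine q35 \
  --bios ovmf --efidisk0 ceph-pool:1 \
  --scsi0 ceph-pool:20 \
  --cores 2 --memory 2048 \
  --net0 virtio,bridge=pocwan \
  --net1 virtio,bridge=poclan \
  --net2 virtio,bridge=pocsync \
  --cdrom local:iso/OPNsense-dvd-amd64.iso
```
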
ℹ️ Before booting it, I clone this VM to prepare the second one: `poc-opnsense-2`

On first boot, I hit an “access denied” error. To fix it, I enter the BIOS, go to **Device Manager > Secure Boot Configuration**, untick _Attempt Secure Boot_ and reboot:

![Disabling Secure Boot in the OVMF Device Manager](img/proxmox-opnsense-vm-bios-secureboot.png)

#### OPNsense Installation

The VM boots on the ISO; I touch nothing until I reach the login screen:

![OPNsense live boot login screen](img/proxmox-opnsense-vm-installer-login.png)

I log in with `installer` / `opnsense` and launch the installer. I select the 20 GB QEMU disk as the destination and start the installation:

![Disk selection in the OPNsense installer](img/proxmox-opnsense-vm-installer-disk.png)

Once it finishes, I remove the ISO from the drive and reboot the machine.

#### OPNsense Basic Configuration

After the reboot, I log in with `root` / `opnsense` and land on the CLI menu:

![OPNsense CLI menu after first boot](img/proxmox-opnsense-vm-cli-menu.png)

Using option 1, I reassign the interfaces:

![Interface assignment from the OPNsense CLI](img/proxmox-opnsense-vm-interfaces.png)

The WAN interface correctly pulls `10.101.0.150/24` from the `fake-freebox`. I set the LAN to `10.102.0.2/24` and add a DHCP pool from `10.102.0.10` to `10.102.0.99`:

![OPNsense CLI menu showing the configured WAN and LAN interfaces](img/proxmox-opnsense-vm-cli-menu-configured.png)

✅ The first VM is ready. I repeat the process for the second OPNsense, `poc-opnsense-2`, which gets the IP `10.102.0.3`.

### Configure OPNsense High Availability

With both OPNsense VMs operational, it's time to move on to configuration from the WebGUI. To access it, I connected a Windows VM to the _POC LAN_ VLAN and browsed to the OPNsense IP on port 443:

![OPNsense WebGUI login page reached from the POC LAN](img/proxmox-opnsense-webgui-login.png)

#### Add the pfSync Interface

The third NIC (`vtnet2`) is assigned to the _pfSync_ interface. This dedicated network lets the two firewalls synchronize their states over the _POC pfSync_ VLAN:

![Assigning vtnet2 to the pfSync interface](img/proxmox-opnsense-webgui-pfsync-interface.png)

I enable the interface on each instance and give each a static IP:

- **poc-opnsense-1**: `10.103.0.2/24`
- **poc-opnsense-2**: `10.103.0.3/24`

Then I add a firewall rule on each node to allow all traffic from this network on the _pfSync_ interface:

![Firewall rule allowing all traffic on the pfSync interface](img/proxmox-opnsense-webgui-pfsync-rule.png)

#### Configure High Availability

Next, head to `System` > `High Availability` > `Settings`.

- On the master (`poc-opnsense-1`), I configure both the `General Settings` and the `Synchronization Settings`.
- On the backup (`poc-opnsense-2`), only the `General Settings` are needed (we don't want it overwriting the master's config).

![High Availability settings on the master node](img/proxmox-opnsense-webgui-ha-settings.png)

Once applied, I check the synchronization on the `Status` tab:

![High Availability status page showing a successful sync](img/proxmox-opnsense-webgui-ha-status.png)

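Beyond the `Status` page, state synchronization can also be spot-checked later from a shell on both firewalls, since OPNsense is built on `pf`. A minimal sketch:

```bash
# Count firewall states on each node; once pfSync is working,
# master and backup should report similar numbers
pfctl -s states | wc -l

# Or look at a few entries directly
pfctl -s states | head
```
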
#### Create a Virtual IP

To provide a shared gateway to the clients, I create a **CARP** (Common Address Redundancy Protocol) virtual IP (VIP) on the LAN interface. The IP is held by the active node and fails over automatically.

Menu: `Interfaces` > `Virtual IPs` > `Settings`:

![CARP virtual IP creation on the LAN interface](img/proxmox-opnsense-webgui-vip-lan.png)

I then replicate the config from `System > High Availability > Status` using the `Synchronize and reconfigure all` button.

On `Interfaces > Virtual IPs > Status`, the master shows the VIP as `MASTER` and the backup as `BACKUP`.

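The same roles are visible from a shell on each node. A small sketch, assuming the LAN interface is `vtnet1` as in this setup:

```bash
# Reports "carp: MASTER ..." on the active node and "carp: BACKUP ..." on the standby
ifconfig vtnet1 | grep carp
```
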
#### Reconfigure DHCP

DHCP needs to be adapted for HA. Since **Dnsmasq** does not support lease synchronization, each instance has to answer independently.

On the master:

- `Services` > `Dnsmasq DNS & DHCP` > `General`: tick `Disable HA sync`
- `DHCP ranges`: also tick `Disable HA sync`
- `DHCP options`: add the option `router [3]` with the value `10.102.0.1` (the LAN VIP)
- `DHCP options`: clone the rule for `dns-server [6]` pointing to the same VIP.

![Dnsmasq DHCP options advertising the LAN VIP](img/proxmox-opnsense-webgui-dnsmasq-options.png)

On the backup:

- `Services` > `Dnsmasq DNS & DHCP` > `General`: tick `Disable HA sync`
- Set `DHCP reply delay` to `5` seconds (giving the master priority to answer)
- `DHCP ranges`: define a different, smaller pool (`10.102.0.200` -> `10.102.0.220`).

This way, only the DHCP **options** are synchronized, while the lease ranges stay separate.

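To confirm which instance actually answered a client's DHCP request, the lease can be inspected on a test machine. A sketch assuming a Linux client using `dhclient` (the lease file path varies by distribution):

```bash
# The dhcp-server-identifier option shows which OPNsense handed out the lease
grep dhcp-server-identifier /var/lib/dhcp/dhclient.leases
```
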
#### WAN Interface

My ISP modem hands out only one IP over DHCP, and I don't want my two VMs competing for it. To handle that:

1. In Proxmox, I copy the MAC address of `net0` (WAN) from `poc-opnsense-1` and apply it to `poc-opnsense-2`, so the DHCP lease can be shared (see the `qm set` sketch after the script below).

⚠️ If both VMs bring the same MAC up at the same time, it causes ARP conflicts and can break the network. Only the MASTER should keep its WAN enabled.

2. A CARP event hook makes it possible to run scripts. I deployed this [Gist script](https://gist.github.com/spali/2da4f23e488219504b2ada12ac59a7dc#file-10-wancarp) in `/usr/local/etc/rc.syshook.d/carp/10-wan` on both nodes. It enables the WAN only on the MASTER.

```php
#!/usr/local/bin/php
<?php

require_once("config.inc");
require_once("interfaces.inc");
require_once("util.inc");
require_once("system.inc");

$subsystem = !empty($argv[1]) ? $argv[1] : '';
$type = !empty($argv[2]) ? $argv[2] : '';

if ($type != 'MASTER' && $type != 'BACKUP') {
    log_error("Carp '$type' event unknown from source '{$subsystem}'");
    exit(1);
}

if (!strstr($subsystem, '@')) {
    log_error("Carp '$type' event triggered from wrong source '{$subsystem}'");
    exit(1);
}

$ifkey = 'wan';

if ($type === "MASTER") {
    log_error("enable interface '$ifkey' due CARP event '$type'");
    $config['interfaces'][$ifkey]['enable'] = '1';
    write_config("enable interface '$ifkey' due CARP event '$type'", false);
    interface_configure(false, $ifkey, false, false);
} else {
    log_error("disable interface '$ifkey' due CARP event '$type'");
    unset($config['interfaces'][$ifkey]['enable']);
    write_config("disable interface '$ifkey' due CARP event '$type'", false);
    interface_configure(false, $ifkey, false, false);
}
```
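
As mentioned in step 1, copying the WAN MAC can also be done from a node's shell with `qm`. A sketch under assumptions: the VM IDs, bridge name and MAC address below are placeholders:

```bash
# Read the WAN MAC of poc-opnsense-1
qm config 9001 | grep ^net0

# Apply the same MAC to poc-opnsense-2's WAN NIC
qm set 9002 --net0 virtio=BC:24:11:AA:BB:CC,bridge=pocwan
```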

### Test the Failover

Time for testing!

OPNsense provides a _CARP Maintenance Mode_. With the master active, only that node had its WAN up. Entering maintenance mode flips the roles: the master becomes backup, its WAN is disabled and the backup's WAN is enabled:

![Virtual IP status after entering CARP maintenance mode](img/proxmox-opnsense-webgui-carp-maintenance.png)

While pinging outside the network, I saw zero packet loss during the switchover.
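
For the test itself, a timestamped ping from a Linux client on the _POC LAN_ is enough to spot any gap in `icmp_seq` during the switchover. A minimal sketch (the target address is just an example):

```bash
# -D prints a timestamp per reply, -i 0.2 sends five probes per second
ping -D -i 0.2 9.9.9.9
```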

Next, I simulated a crash by powering off the master. The backup took over seamlessly, with only one lost packet, and thanks to state synchronization even my SSH session stayed open. 🎉

## Conclusion

This proof of concept shows that running **OPNsense in high availability on Proxmox VE** is possible, even with a single WAN IP. The required building blocks:

- VLAN segmentation
- A dedicated pfSync network
- A shared virtual IP (CARP)
- A script to manage the WAN interface

The result lives up to expectations: seamless failover, synchronized states, and active connections that survive a crash. The trickiest part remains handling the WAN lease, but the CARP hook solves that problem.

🚀 Next step: plan the migration of my production network to this virtual HA cluster with minimal downtime. Stay tuned!