Compare commits: feature-re... → preview (106 commits)
@@ -1,137 +0,0 @@
---
slug:
title: Template
description:
date:
draft: true
tags:
- opnsense
- high-availability
- proxmox
categories:
---

## Intro

In my previous [post]({{< ref "post/12-opnsense-virtualization-highly-available" >}}), I set up a PoC to validate that a cluster of two **OPNsense** VMs in **Proxmox VE** could make the firewall highly available.

This time, I will cover the creation of my future OPNsense cluster from scratch, plan the cutover, and finally migrate away from my current physical box.

---

## Build the Foundation

For the real thing, I'll have to connect the WAN, coming from my ISP box, to my main switch. For that, I have to add a VLAN to transport this flow to my Proxmox nodes.

### UniFi

The first thing I do is configure my layer 2 network, which is managed by UniFi. There I need to create two VLANs:
- *WAN* (20): transports the WAN between my ISP box and my Proxmox nodes.
- *pfSync* (44): communication between my OPNsense nodes.

In the UniFi controller, in `Settings` > `Networks`, I add a `New Virtual Network`. I name it `WAN` and give it the VLAN ID 20:

I do the same thing again for the `pfSync` VLAN with the VLAN ID 44.

I will plug my ISP box into port 15 of my switch, which is disabled for now. I set it as active, set the native VLAN to the newly created `WAN (20)` and disable trunking:

Once this setting is applied, I make sure that only the ports where my Proxmox nodes are connected propagate these VLANs on their trunks.

We are done with the UniFi configuration.

### Proxmox SDN

Now that the VLANs can reach my nodes, I want to handle them in the Proxmox SDN.

In `Datacenter` > `SDN` > `VNets`, I create a new VNet, name it `vlan20` to follow my own naming convention, give it the *WAN* alias and use the tag (ID) 20:

I also create `vlan44` for the *pfSync* VLAN, then I apply this configuration and we are done with the SDN.

---

## Create the VMs

Now that the VLAN configuration is done, I can start building the virtual machines on Proxmox.

The first VM is named `cerbere-head1` (didn't I tell you? My current firewall is named `cerbere`, it makes even more sense now!). Here are the settings:
- OS type: Linux
- Machine type: `q35`
- BIOS: `OVMF (UEFI)`
- Disk: 20 GiB on Ceph storage
- CPU/RAM: 2 vCPU, 4 GiB RAM
- NICs:
  1. `vmbr0` (*Mgmt*)
  2. `vlan20` (*WAN*)
  3. `vlan13` (*User*)
  4. `vlan37` (*IoT*)
  5. `vlan44` (*pfSync*)
  6. `vlan55` (*DMZ*)
  7. `vlan66` (*Lab*)
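As a side note, the same VM could be sketched from the CLI with `qm`. The flags below are standard `qm create` options, but the VM ID `101` and the storage name `ceph-pool` are placeholders I made up for illustration, so review the command before running it on a node:

```bash
# Hypothetical qm(1) sketch - VM ID and storage name are placeholders,
# not values from this post.
VMID=101
STORAGE=ceph-pool   # assumed Ceph RBD storage name
qm_create="qm create $VMID --name cerbere-head1 --ostype l26 \
--machine q35 --bios ovmf --cores 2 --memory 4096 \
--scsi0 ${STORAGE}:20 \
--net0 virtio,bridge=vmbr0 --net1 virtio,bridge=vlan20 \
--net2 virtio,bridge=vlan13 --net3 virtio,bridge=vlan37 \
--net4 virtio,bridge=vlan44 --net5 virtio,bridge=vlan55 \
--net6 virtio,bridge=vlan66"
echo "$qm_create"   # inspect first, then run it on a node (e.g. with eval)
```
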

ℹ️ Now I clone that VM to create `cerbere-head2`, then I proceed with the OPNsense installation. I won't go into much detail about installing OPNsense; I already documented it in the previous [post]({{< ref "post/12-opnsense-virtualization-highly-available" >}}).

After installing both OPNsense instances, I give each of them its IP in the *Mgmt* network:
- `cerbere-head1`: `192.168.88.2/24`
- `cerbere-head2`: `192.168.88.3/24`

While these routers are not yet managing the networks, I give them my current OPNsense router as gateway (`192.168.88.1`) to be able to reach them from my PC in another VLAN.

---

## Configure OPNsense

## TODO

HA in Proxmox
Make sure the VMs start at Proxmox boot
Check average power consumption (watts)
Check average temperature

## Switch

Back up the OPNsense box
Disable DHCP on the OPNsense box
Change the OPNsense box IPs

Remove the gateway on the VMs
Configure DHCP on both instances
Enable DHCP on the VMs
Change the VIPs on the VMs
Replicate the configuration on the VMs
Unplug the OPNsense box WAN
Plug the WAN into port 15

## Verify

Ping the VIPs
Check the interfaces
Local tests (ssh, ping)

Basics (DHCP, DNS, internet)
Firewall
All sites
mDNS (Chromecast)
VPN
TV

Check all the devices

DNS blocklist

Check load (RAM, CPU)
Failover

Test a full Proxmox shutdown

## Clean Up

Shut down the OPNsense box
Check watts
Check temperature

## Rollback
content/post/14-proxmox-cluster-upgrade-8-to-9-ceph.fr.md (new file)
@@ -0,0 +1,424 @@
---
slug: proxmox-cluster-upgrade-8-to-9-ceph
title: Mise à niveau de mon cluster Proxmox VE HA 3 nœuds de 8 vers 9 basé sur Ceph
description: Mise à niveau pas à pas de mon cluster Proxmox VE 3 nœuds en haute disponibilité, de 8 vers 9, basé sur Ceph, sans aucune interruption.
date: 2025-11-04
draft: false
tags:
- proxmox
- high-availability
- ceph
categories:
- homelab
---

## Intro

Mon **cluster Proxmox VE** a presque un an maintenant, et je n’ai pas tenu les nœuds complètement à jour. Il est temps de m’en occuper et de le passer en Proxmox VE **9**.

Je recherche principalement les nouvelles règles d’affinité HA, mais voici les changements utiles apportés par cette version :
- Debian 13 "Trixie".
- Snapshots pour le stockage LVM partagé thick-provisioned.
- Fonctionnalité SDN fabrics.
- Interface mobile améliorée.
- Règles d’affinité dans le cluster HA.

Le cluster est composé de 3 nœuds, hautement disponible, avec une configuration hyper‑convergée, utilisant Ceph pour le stockage distribué.

Dans cet article, je décris les étapes de mise à niveau de mon cluster Proxmox VE, de la version 8 vers 9, tout en gardant les ressources actives. [Documentation officielle](https://pve.proxmox.com/wiki/Upgrade_from_8_to_9).

---

## Prérequis

Avant de se lancer dans la mise à niveau, passons en revue les prérequis :

1. Tous les nœuds mis à jour vers la dernière version Proxmox VE `8.4`.
2. Cluster Ceph mis à niveau vers Squid (`19.2`).
3. Proxmox Backup Server mis à jour vers la version 4.
4. Accès fiable aux nœuds.
5. Cluster en bonne santé.
6. Sauvegarde de toutes les VM et CT.
7. Au moins 5 Go libres sur `/`.

Remarques sur mon environnement :

- Les nœuds PVE sont en `8.3.2`, donc une mise à jour mineure vers 8.4 est d’abord requise.
- Ceph tourne sous Reef (`18.2.4`) et sera mis à niveau vers Squid après PVE 8.4.
- Je n’utilise pas PBS dans mon homelab, donc je peux sauter cette étape.
- J’ai plus de 10 Go disponibles sur `/` sur mes nœuds, c’est suffisant.
- Je n’ai qu’un accès console SSH : si un nœud ne répond plus, je pourrais avoir besoin d’un accès physique.
- Une VM a un passthrough CPU (APU). Le passthrough empêche la migration à chaud, donc je supprime ce mapping avant la mise à niveau.
- Mettre les OSD Ceph en `noout` pendant la mise à niveau pour éviter le rebalancing automatique :

```bash
ceph osd set noout
```
### Mettre à Jour Proxmox VE vers 8.4.14

Le plan est simple, pour tous les nœuds, un par un :

1. Activer le mode maintenance

```bash
ha-manager crm-command node-maintenance enable $(hostname)
```

2. Mettre à jour le nœud

```bash
apt-get update
apt-get dist-upgrade -y
```

À la fin de la mise à jour, un bootloader amovible est détecté et on me propose de reconfigurer GRUB pour le maintenir à jour, ce que j’exécute :

```plaintext
Removable bootloader found at '/boot/efi/EFI/BOOT/BOOTX64.efi', but GRUB packages not set up to update it!
Run the following command:

echo 'grub-efi-amd64 grub2/force_efi_extra_removable boolean true' | debconf-set-selections -v -u

Then reinstall GRUB with 'apt install --reinstall grub-efi-amd64'
```

3. Redémarrer la machine

```bash
reboot
```

4. Désactiver le mode maintenance

```bash
ha-manager crm-command node-maintenance disable $(hostname)
```

Entre chaque nœud, j’attends que le statut Ceph soit clean, sans alertes.
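Cette attente peut se scripter ; voici une esquisse minimale (la commande `ceph health` est réelle, la fonction et son nom sont de moi ; la commande de contrôle est un paramètre pour rester testable hors cluster) :

```bash
# Attend que `ceph health` rapporte HEALTH_OK avant de passer au nœud suivant.
wait_ceph_healthy() {
  local check_cmd="${1:-ceph health}"
  until $check_cmd | grep -q 'HEALTH_OK'; do
    echo "Ceph pas encore sain, nouvelle vérification dans 10 s..."
    sleep 10
  done
  echo "HEALTH_OK"
}
```

Sur un nœud, un simple `wait_ceph_healthy` (sans argument) utilise la vraie commande `ceph health`.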
✅ À la fin, le cluster Proxmox VE est mis à jour vers `8.4.14`.

### Mettre à Niveau Ceph de Reef vers Squid

Je peux maintenant passer à la mise à niveau de Ceph ; la documentation Proxmox pour cette procédure est [ici](https://pve.proxmox.com/wiki/Ceph_Reef_to_Squid).

Mettre à jour les sources de paquets Ceph sur chaque nœud :

```bash
sed -i 's/reef/squid/' /etc/apt/sources.list.d/ceph.list
```

Mettre à niveau les paquets Ceph :

```bash
apt update
apt full-upgrade -y
```

Après la mise à niveau sur le premier nœud, la version Ceph affiche maintenant `19.2.3`. Je peux voir mes OSD apparaître comme obsolètes, et les moniteurs nécessitent soit une mise à niveau soit un redémarrage :

Je poursuis et mets à niveau les paquets sur les 2 autres nœuds.

J’ai un moniteur sur chaque nœud, donc je dois redémarrer chaque moniteur, un nœud à la fois :

```bash
systemctl restart ceph-mon.target
```

Je vérifie le statut Ceph entre chaque redémarrage :

```bash
ceph status
```

Une fois tous les moniteurs redémarrés, ils rapportent la dernière version, visible avec `ceph mon dump` :
- Avant : `min_mon_release 18 (reef)`
- Après : `min_mon_release 19 (squid)`

Je peux maintenant redémarrer les OSD, toujours un nœud à la fois. Dans ma configuration, j’ai un OSD par nœud :

```bash
systemctl restart ceph-osd.target
```

Je surveille le statut Ceph via la WebGUI Proxmox. Après le redémarrage, elle affiche quelques couleurs fancy ; j’attends juste que les PG redeviennent verts, cela prend moins d’une minute :

Un avertissement apparaît : `HEALTH_WARN: all OSDs are running squid or later but require_osd_release < squid`

Maintenant que tous mes OSD tournent sous Squid, je peux fixer la version minimum requise à celle‑ci :

```bash
ceph osd require-osd-release squid
```

ℹ️ Je n’utilise pas actuellement CephFS, donc je n’ai pas à me soucier du daemon MDS (MetaData Server).

✅ Le cluster Ceph a été mis à niveau avec succès vers Squid (`19.2.3`).
---

## Vérifications

Les prérequis pour mettre à niveau le cluster vers Proxmox VE 9 sont maintenant remplis. Suis‑je prêt à mettre à niveau ? Pas encore.

Un petit programme de checklist nommé **`pve8to9`** est inclus dans les derniers paquets Proxmox VE 8.4. Il fournit des indices et des alertes sur les problèmes potentiels avant, pendant et après la mise à niveau. Pratique, non ?

Lancer l’outil une première fois me donne des indications sur ce que je dois faire. Le script vérifie un certain nombre de paramètres, regroupés par thème. Par exemple, voici la section sur les Virtual Guests :

```plaintext
= VIRTUAL GUEST CHECKS =

INFO: Checking for running guests..
WARN: 1 running guest(s) detected - consider migrating or stopping them.
INFO: Checking if LXCFS is running with FUSE3 library, if already upgraded..
SKIP: not yet upgraded, no need to check the FUSE library version LXCFS uses
INFO: Checking for VirtIO devices that would change their MTU...
PASS: All guest config descriptions fit in the new limit of 8 KiB
INFO: Checking container configs for deprecated lxc.cgroup entries
PASS: No legacy 'lxc.cgroup' keys found.
INFO: Checking VM configurations for outdated machine versions
PASS: All VM machine versions are recent enough
```

À la fin, on obtient le résumé. L’objectif est de corriger autant de `FAILURES` et de `WARNINGS` que possible :

```plaintext
= SUMMARY =

TOTAL: 57
PASSED: 43
SKIPPED: 7
WARNINGS: 2
FAILURES: 2
```
Passons en revue les problèmes qu’il a trouvés :

```plaintext
FAIL: 1 custom role(s) use the to-be-dropped 'VM.Monitor' privilege and need to be adapted after the upgrade
```

Il y a quelque temps, pour utiliser Terraform avec mon cluster Proxmox, j'ai créé un rôle dédié. C'était détaillé dans cet [article]({{< ref "post/3-terraform-create-vm-proxmox" >}}).

Ce rôle utilise le privilège `VM.Monitor`, qui a été supprimé dans Proxmox VE 9. De nouveaux privilèges, sous `VM.GuestAgent.*`, existent à la place. Je supprime donc celui-ci et j'ajouterai les nouveaux une fois le cluster mis à niveau.

```plaintext
FAIL: systemd-boot meta-package installed. This will cause problems on upgrades of other boot-related packages. Remove 'systemd-boot' See https://pve.proxmox.com/wiki/Upgrade_from_8_to_9#sd-boot-warning for more information.
```

Proxmox VE n’utilise `systemd-boot` pour le démarrage que dans certaines configurations gérées par `proxmox-boot-tool`. Le méta-paquet `systemd-boot` doit être supprimé : il était automatiquement installé sur les systèmes passés de PVE 8.1 à 8.4, car il contenait `bootctl` dans Bookworm.

Si le script de checklist `pve8to9` le suggère, vous pouvez supprimer le méta-paquet `systemd-boot` sans risque, sauf si vous l'avez installé manuellement et que vous utilisez `systemd-boot` comme bootloader :

```bash
apt remove systemd-boot -y
```

```plaintext
WARN: 1 running guest(s) detected - consider migrating or stopping them.
```

Dans une configuration HA, avant de mettre à jour un nœud, je le mets en mode maintenance. Cela déplace automatiquement les ressources ailleurs. Quand ce mode est désactivé, la machine revient à son emplacement précédent.

```plaintext
WARN: The matching CPU microcode package 'amd64-microcode' could not be found! Consider installing it to receive the latest security and bug fixes for your CPU.
Ensure you enable the 'non-free-firmware' component in the apt sources and run:
apt install amd64-microcode
```

Il est recommandé d’installer le microcode processeur : ces mises à jour peuvent corriger des bogues matériels, améliorer les performances et renforcer la sécurité du processeur.

J’ajoute le composant `non-free-firmware` aux sources actuelles :

```bash
sed -i '/^deb /{/non-free-firmware/!s/$/ non-free-firmware/}' /etc/apt/sources.list
```

Puis j’installe le paquet `amd64-microcode` :

```bash
apt update
apt install amd64-microcode -y
```

Après ces petits ajustements, suis‑je prêt ? Vérifions en relançant le script `pve8to9`.

⚠️ N’oubliez pas de lancer `pve8to9` sur tous les nœuds pour vous assurer que tout est OK.
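Pour ne pas oublier un nœud, une petite boucle SSH peut suffire ; esquisse avec des noms d’hôtes fictifs (`pve1` à `pve3`, qui ne sont pas ceux de mon cluster), l’option `--full` de `pve8to9` étant celle de la documentation :

```bash
# Lance la checklist pve8to9 sur chaque nœud (noms d'hôtes fictifs).
for node in pve1 pve2 pve3; do
  echo "=== $node ==="
  # Sur un vrai cluster, décommenter :
  # ssh "root@$node" pve8to9 --full
done
```
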
---

## Mise à Niveau

🚀 Maintenant, tout est prêt pour le grand saut ! Comme pour la mise à jour mineure, je procéderai nœud par nœud, en gardant mes VM et CT actives.

### Mettre le Nœud en Mode Maintenance

D’abord, je passe le nœud en mode maintenance. Cela déplacera la charge existante sur les autres nœuds :

```bash
ha-manager crm-command node-maintenance enable $(hostname)
```

Après avoir exécuté la commande, j’attends environ une minute pour laisser le temps aux ressources de migrer.
### Changer les Dépôts Sources vers Trixie

Depuis Debian Trixie, le format `deb822` est disponible et recommandé pour les sources. Il est structuré autour d’un format clé/valeur, ce qui offre une meilleure lisibilité et plus de sûreté.

#### Sources Debian

```bash
cat > /etc/apt/sources.list.d/debian.sources << EOF
Types: deb deb-src
URIs: http://deb.debian.org/debian/
Suites: trixie trixie-updates
Components: main contrib non-free-firmware
Signed-By: /usr/share/keyrings/debian-archive-keyring.gpg

Types: deb deb-src
URIs: http://security.debian.org/debian-security/
Suites: trixie-security
Components: main contrib non-free-firmware
Signed-By: /usr/share/keyrings/debian-archive-keyring.gpg
EOF
```

#### Sources Proxmox (sans subscription)

```bash
cat > /etc/apt/sources.list.d/proxmox.sources << EOF
Types: deb
URIs: http://download.proxmox.com/debian/pve
Suites: trixie
Components: pve-no-subscription
Signed-By: /usr/share/keyrings/proxmox-archive-keyring.gpg
EOF
```

#### Sources Ceph Squid (sans subscription)

```bash
cat > /etc/apt/sources.list.d/ceph.sources << EOF
Types: deb
URIs: http://download.proxmox.com/debian/ceph-squid
Suites: trixie
Components: no-subscription
Signed-By: /usr/share/keyrings/proxmox-archive-keyring.gpg
EOF
```

#### Supprimer les Anciennes Listes Bookworm

Les listes pour Debian Bookworm, à l’ancien format, doivent être supprimées :

```bash
rm -f /etc/apt/sources.list{,.d/*.list}
```
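Petite démonstration, sur un répertoire d’exemple et non sur `/etc/apt`, que ce glob ne supprime que les listes à l’ancien format et laisse les fichiers `.sources` intacts :

```bash
# Démo sans risque : reproduit l'arborescence apt dans un répertoire temporaire.
APT_DIR=$(mktemp -d)
mkdir -p "$APT_DIR/sources.list.d"
touch "$APT_DIR/sources.list" \
      "$APT_DIR/sources.list.d/pve-install-repo.list" \
      "$APT_DIR/sources.list.d/proxmox.sources"
# Équivalent de `rm -f /etc/apt/sources.list{,.d/*.list}` sur le répertoire de démo :
rm -f "$APT_DIR"/sources.list "$APT_DIR"/sources.list.d/*.list
ls "$APT_DIR/sources.list.d"   # seuls les fichiers .sources restent
```
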

### Mettre à Jour les Dépôts `apt` Configurés

Rafraîchir les dépôts :

```bash
apt update
```

```plaintext
Get:1 http://security.debian.org/debian-security trixie-security InRelease [43.4 kB]
Get:2 http://deb.debian.org/debian trixie InRelease [140 kB]
Get:3 http://download.proxmox.com/debian/ceph-squid trixie InRelease [2,736 B]
Get:4 http://download.proxmox.com/debian/pve trixie InRelease [2,771 B]
Get:5 http://deb.debian.org/debian trixie-updates InRelease [47.3 kB]
Get:6 http://security.debian.org/debian-security trixie-security/main Sources [91.1 kB]
Get:7 http://security.debian.org/debian-security trixie-security/non-free-firmware Sources [696 B]
Get:8 http://security.debian.org/debian-security trixie-security/main amd64 Packages [69.0 kB]
Get:9 http://security.debian.org/debian-security trixie-security/main Translation-en [45.1 kB]
Get:10 http://security.debian.org/debian-security trixie-security/non-free-firmware amd64 Packages [544 B]
Get:11 http://security.debian.org/debian-security trixie-security/non-free-firmware Translation-en [352 B]
Get:12 http://download.proxmox.com/debian/ceph-squid trixie/no-subscription amd64 Packages [33.2 kB]
Get:13 http://deb.debian.org/debian trixie/main Sources [10.5 MB]
Get:14 http://download.proxmox.com/debian/pve trixie/pve-no-subscription amd64 Packages [241 kB]
Get:15 http://deb.debian.org/debian trixie/non-free-firmware Sources [6,536 B]
Get:16 http://deb.debian.org/debian trixie/contrib Sources [52.3 kB]
Get:17 http://deb.debian.org/debian trixie/main amd64 Packages [9,669 kB]
Get:18 http://deb.debian.org/debian trixie/main Translation-en [6,484 kB]
Get:19 http://deb.debian.org/debian trixie/contrib amd64 Packages [53.8 kB]
Get:20 http://deb.debian.org/debian trixie/contrib Translation-en [49.6 kB]
Get:21 http://deb.debian.org/debian trixie/non-free-firmware amd64 Packages [6,868 B]
Get:22 http://deb.debian.org/debian trixie/non-free-firmware Translation-en [4,704 B]
Get:23 http://deb.debian.org/debian trixie-updates/main Sources [2,788 B]
Get:24 http://deb.debian.org/debian trixie-updates/main amd64 Packages [5,412 B]
Get:25 http://deb.debian.org/debian trixie-updates/main Translation-en [4,096 B]
Fetched 27.6 MB in 3s (8,912 kB/s)
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
666 packages can be upgraded. Run 'apt list --upgradable' to see them.
```

😈 666 paquets, je suis condamné !
### Mise à Niveau vers Debian Trixie et Proxmox VE 9

Lancer la mise à niveau :

```bash
apt-get dist-upgrade -y
```

Pendant le processus, vous serez invité à approuver des changements de fichiers de configuration et certains redémarrages de services. Il se peut aussi que la sortie de certains changements s’affiche dans un pager ; vous pouvez simplement en sortir en appuyant sur `q` :
- `/etc/issue` : Proxmox VE régénérera automatiquement ce fichier au démarrage -> `No`
- `/etc/lvm/lvm.conf` : les changements pertinents pour Proxmox VE seront mis à jour -> `Yes`
- `/etc/ssh/sshd_config` : selon votre configuration -> `Inspect`
- `/etc/default/grub` : seulement si vous l’avez modifié manuellement -> `Inspect`
- `/etc/chrony/chrony.conf` : si vous n’avez pas fait de modifications supplémentaires -> `Yes`

La mise à niveau a pris environ 5 minutes, cela dépend du matériel.

À la fin de la mise à niveau, redémarrez la machine :

```bash
reboot
```
### Sortir du Mode Maintenance

Enfin, quand le nœud (espérons‑le) est revenu, vous pouvez désactiver le mode maintenance. La charge qui était localisée sur cette machine reviendra :

```bash
ha-manager crm-command node-maintenance disable $(hostname)
```

### Validation Après Mise à Niveau

- Vérifier la communication du cluster :

```bash
pvecm status
```

- Vérifier les points de montage des stockages

- Vérifier la santé du cluster Ceph :

```bash
ceph status
```

- Confirmer les opérations des VM, les sauvegardes et les groupes HA

Les groupes HA ont été retirés au profit des règles d’affinité HA ; ils sont automatiquement migrés en règles HA.

- Désactiver le dépôt PVE Enterprise

Si vous n’utilisez pas le dépôt `pve-enterprise`, vous pouvez le désactiver :

```bash
sed -i 's/^/#/' /etc/apt/sources.list.d/pve-enterprise.sources
```

🔁 Ce nœud est maintenant mis à niveau vers Proxmox VE 9. Procédez de même sur les autres nœuds.
## Actions Postérieures

Une fois tout le cluster mis à niveau, place aux actions postérieures :

- Supprimer le flag `noout` du cluster Ceph :

```bash
ceph osd unset noout
```

- Recréer les mappings PCI passthrough

Pour la VM dont j’ai retiré le mapping hôte au début de la procédure, je peux maintenant recréer le mapping.

- Ajouter les privilèges pour le rôle Terraform

Pendant la phase de vérification, il m’a été conseillé de supprimer le privilège `VM.Monitor` de mon rôle personnalisé pour Terraform. Maintenant que de nouveaux privilèges ont été ajoutés avec Proxmox VE 9, je peux les attribuer à ce rôle :
- VM.GuestAgent.Audit
- VM.GuestAgent.FileRead
- VM.GuestAgent.FileWrite
- VM.GuestAgent.FileSystemMgmt
- VM.GuestAgent.Unrestricted
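Côté CLI, cela peut se faire avec `pveum role modify` (commande réelle) ; le nom de rôle `TerraformProv` et les anciens privilèges cités dans le commentaire sont des exemples de ma part, pas ceux de mon cluster :

```bash
# Nom de rôle hypothétique ; remplacez-le par celui de votre cluster.
ROLE=TerraformProv
NEW_PRIVS="VM.GuestAgent.Audit,VM.GuestAgent.FileRead,VM.GuestAgent.FileWrite,VM.GuestAgent.FileSystemMgmt,VM.GuestAgent.Unrestricted"
# Sur le cluster, redonner au rôle sa liste COMPLÈTE de privilèges
# (les anciens plus les nouveaux), par exemple :
# pveum role modify "$ROLE" -privs "VM.Allocate,VM.Audit,${NEW_PRIVS}"
echo "$NEW_PRIVS"
```
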

## Conclusion

🎉 Mon cluster Proxmox VE est maintenant en version 9 !

Le processus de mise à niveau s’est déroulé assez tranquillement, sans aucune interruption pour mes ressources.

J’ai maintenant accès aux règles d’affinité HA, dont j’avais besoin pour mon cluster OPNsense.

Comme vous avez pu le constater, je ne maintiens pas mes nœuds à jour très souvent. Je pourrais automatiser cela la prochaine fois, pour les garder à jour sans effort.
content/post/14-proxmox-cluster-upgrade-8-to-9-ceph.md (new file)
@@ -0,0 +1,425 @@
---
slug: proxmox-cluster-upgrade-8-to-9-ceph
title: Upgrading my 3-node Proxmox VE HA Cluster from 8 to 9 based on Ceph
description: Step-by-step upgrade of my 3-node Proxmox VE highly available cluster from 8 to 9, based on Ceph distributed storage, without any downtime.
date: 2025-11-04
draft: false
tags:
- proxmox
- high-availability
- ceph
categories:
- homelab
---

## Intro

My **Proxmox VE** cluster is almost one year old now, and I haven’t kept the nodes fully up to date. Time to address this and bump it to Proxmox VE **9**.

I'm mainly after the new HA affinity rules, but here are the useful changes that this version brings:
- Debian 13 "Trixie".
- Snapshots for thick-provisioned LVM shared storage.
- SDN fabrics feature.
- Improved mobile UI.
- Affinity rules in the HA cluster.

The cluster is a three‑node, highly available, hyper‑converged setup using Ceph for distributed storage.

In this article, I'll walk through the upgrade steps for my Proxmox VE cluster, from 8 to 9, while keeping the resources up and running. [Official docs](https://pve.proxmox.com/wiki/Upgrade_from_8_to_9).

---

## Prerequisites

Before jumping into the upgrade, let's review the prerequisites:

1. All nodes upgraded to the latest Proxmox VE `8.4`.
2. Ceph cluster upgraded to Squid (`19.2`).
3. Proxmox Backup Server upgraded to version 4.
4. Reliable access to the nodes.
5. Healthy cluster.
6. Backup of all VMs and CTs.
7. At least 5 GB free on `/`.

Notes about my environment:

- PVE nodes are on `8.3.2`, so a minor upgrade to 8.4 is required first.
- Ceph is on Reef (`18.2.4`) and will be upgraded to Squid after PVE 8.4.
- I don’t use PBS in my homelab, so I can skip that step.
- I have more than 10 GB available on `/` on my nodes, which is plenty.
- I only have SSH console access; if a node becomes unresponsive, I may need physical access.
- One VM has a CPU passthrough (APU). Passthrough prevents live migration, so I remove that mapping prior to the upgrade.
- Set Ceph OSDs to `noout` during the upgrade to avoid automatic rebalancing:

```bash
ceph osd set noout
```
|
||||||
|
|
||||||
|
### Update Proxmox VE to 8.4.14
|
||||||
|
|
||||||
|
The plan is simple, for all nodes, one at a time:
|
||||||
|
|
||||||
|
1. Enable the maintenance mode
|
||||||
|
```bash
|
||||||
|
ha-manager crm-command node-maintenance enable $(hostname)
|
||||||
|
```
|
||||||
|
|
||||||
|
2. Update the node
|
||||||
|
```bash
|
||||||
|
apt-get update
|
||||||
|
apt-get dist-upgrade -y
|
||||||
|
```
|
||||||
|
|
||||||
|
At the end of the update, apt warns about a removable bootloader and suggests two commands, which I run:

```plaintext
Removable bootloader found at '/boot/efi/EFI/BOOT/BOOTX64.efi', but GRUB packages not set up to update it!
Run the following command:

echo 'grub-efi-amd64 grub2/force_efi_extra_removable boolean true' | debconf-set-selections -v -u

Then reinstall GRUB with 'apt install --reinstall grub-efi-amd64'
```

3. Restart the machine:

```bash
reboot
```

4. Disable maintenance mode:

```bash
ha-manager crm-command node-maintenance disable $(hostname)
```

Between nodes, I wait for the Ceph status to be clean, without warnings.

✅ At the end, the Proxmox VE cluster is updated to `8.4.14`.

### Upgrade Ceph from Reef to Squid

I can now move on to the Ceph upgrade; the Proxmox documentation for that procedure is [here](https://pve.proxmox.com/wiki/Ceph_Reef_to_Squid).

Update the Ceph package sources on every node:

```bash
sed -i 's/reef/squid/' /etc/apt/sources.list.d/ceph.list
```

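The substitution simply rewrites the release name inside the repository line; here is what it does to a sample line (the exact contents of your `ceph.list` may differ):

```bash
# Demo of the sed substitution on an illustrative repository line:
# 'reef' becomes 'squid' inside the URL path.
echo 'deb http://download.proxmox.com/debian/ceph-reef bookworm no-subscription' \
  | sed 's/reef/squid/'
```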
Upgrade the Ceph packages:

```bash
apt update
apt full-upgrade -y
```

After the upgrade on the first node, the Ceph version shows `19.2.3`. My OSDs appear as outdated, and the monitors need either an upgrade or a restart:

I carry on and upgrade the packages on the two other nodes.

I have a monitor on each node, so I restart each monitor, one node at a time:

```bash
systemctl restart ceph-mon.target
```

I verify the Ceph status between each restart:

```bash
ceph status
```

Once all monitors are restarted, they report the latest version in `ceph mon dump`:
- Before: `min_mon_release 18 (reef)`
- After: `min_mon_release 19 (squid)`

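To script this check, the relevant line can be pulled straight from the `ceph mon dump` output; a sketch using sample text in place of the live command:

```bash
# Extract the min_mon_release line from `ceph mon dump` output;
# sample text stands in for the real command here.
printf '%s\n' 'epoch 5' 'min_mon_release 19 (squid)' \
  | grep '^min_mon_release'
```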
Now I can restart the OSDs, still one node at a time. In my setup, I have one OSD per node:

```bash
systemctl restart ceph-osd.target
```

I monitor the Ceph status in the Proxmox web GUI. After the restart, it shows some fancy colors; I just wait for the PGs to be back to green, which takes less than a minute:

A warning shows up: `HEALTH_WARN: all OSDs are running squid or later but require_osd_release < squid`

Now that all my OSDs are running Squid, I can set the minimum required version:

```bash
ceph osd require-osd-release squid
```

ℹ️ I'm not using CephFS, so I don't have to care about the MDS (MetaData Server) daemons.

✅ The Ceph cluster has been successfully upgraded to Squid (`19.2.3`).

---

## Checks

The prerequisites to upgrade the cluster to Proxmox VE 9 are now complete. Am I ready to upgrade? Not yet.

A small checklist program named **`pve8to9`** is included in the latest Proxmox VE 8.4 packages. It provides hints and warnings about potential issues before, during, and after the upgrade process. Pretty handy, isn't it?

Running the tool for the first time gives me some insight into what I need to do. The script checks a number of parameters, grouped by theme. For example, this is the Virtual Guest section:

```plaintext
= VIRTUAL GUEST CHECKS =

INFO: Checking for running guests..
WARN: 1 running guest(s) detected - consider migrating or stopping them.
INFO: Checking if LXCFS is running with FUSE3 library, if already upgraded..
SKIP: not yet upgraded, no need to check the FUSE library version LXCFS uses
INFO: Checking for VirtIO devices that would change their MTU...
PASS: All guest config descriptions fit in the new limit of 8 KiB
INFO: Checking container configs for deprecated lxc.cgroup entries
PASS: No legacy 'lxc.cgroup' keys found.
INFO: Checking VM configurations for outdated machine versions
PASS: All VM machine versions are recent enough
```

At the end comes the summary. The goal is to address as many `FAILURES` and `WARNINGS` as possible:

```plaintext
= SUMMARY =

TOTAL: 57
PASSED: 43
SKIPPED: 7
WARNINGS: 2
FAILURES: 2
```

Let's review the problems it found:

```plaintext
FAIL: 1 custom role(s) use the to-be-dropped 'VM.Monitor' privilege and need to be adapted after the upgrade
```

Some time ago, in order to use Terraform with my Proxmox cluster, I created a dedicated role. This was detailed in that [post]({{< ref "post/3-terraform-create-vm-proxmox" >}}).

This role uses the `VM.Monitor` privilege, which is removed in Proxmox VE 9 and replaced by new privileges under `VM.GuestAgent.*`. So I remove `VM.Monitor` now, and I'll add the new ones once the cluster has been upgraded.

```plaintext
FAIL: systemd-boot meta-package installed. This will cause problems on upgrades of other boot-related packages. Remove 'systemd-boot' See https://pve.proxmox.com/wiki/Upgrade_from_8_to_9#sd-boot-warning for more information.
```

Proxmox VE only uses `systemd-boot` for booting in some configurations, which are managed by `proxmox-boot-tool`; the `systemd-boot` meta-package itself should be removed. The package was automatically shipped on systems installed from PVE 8.1 to PVE 8.4, because it contained `bootctl` in Bookworm.

As the `pve8to9` checklist suggests, the `systemd-boot` meta-package is safe to remove, unless you manually installed it and are using `systemd-boot` as your bootloader:

```bash
apt remove systemd-boot -y
```

```plaintext
WARN: 1 running guest(s) detected - consider migrating or stopping them.
```

In an HA setup, before updating a node, I put it in maintenance mode. This automatically moves the workload elsewhere. When the mode is disabled, the workload moves back to its previous location.

```plaintext
WARN: The matching CPU microcode package 'amd64-microcode' could not be found! Consider installing it to receive the latest security and bug fixes for your CPU.
Ensure you enable the 'non-free-firmware' component in the apt sources and run:
apt install amd64-microcode
```

It is recommended to install processor microcode updates, which can fix hardware bugs, improve performance, and enhance security features of the CPU.

I add the `non-free-firmware` component to the existing sources:

```bash
sed -i '/^deb /{/non-free-firmware/!s/$/ non-free-firmware/}' /etc/apt/sources.list
```

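The `sed` address logic is worth unpacking: it only touches `deb ` lines that don't already carry the component. A quick demo on two illustrative lines:

```bash
# For every 'deb ' line that does NOT already mention non-free-firmware,
# append the component at the end of the line; other lines are untouched.
printf '%s\n' \
  'deb http://deb.debian.org/debian bookworm main contrib' \
  'deb http://deb.debian.org/debian bookworm main non-free-firmware' \
  | sed '/^deb /{/non-free-firmware/!s/$/ non-free-firmware/}'
```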
Then I install the `amd64-microcode` package:

```bash
apt update
apt install amd64-microcode -y
```

After these small adjustments, am I ready yet? Let's find out by rerunning the `pve8to9` script.

⚠️ Don't forget to run `pve8to9` on all nodes to make sure everything is good.

---

## Upgrade

🚀 Now everything is ready for the big move! As with the minor update, I'll proceed one node at a time, keeping my VMs and CTs up and running.

### Set Maintenance Mode

First, I put the node into maintenance mode. This moves the existing workload to other nodes:

```bash
ha-manager crm-command node-maintenance enable $(hostname)
```

After issuing the command, I wait about a minute to give the resources time to migrate.

### Change Source Repositories to Trixie

As of Debian Trixie, the `deb822` format is the recommended format for apt sources. It is structured as key/value stanzas, which offers better readability and finer-grained security settings, such as per-repository signing keys.

#### Debian Sources

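For comparison, here is the same repository expressed in the old one-line format and in deb822 (an illustrative fragment, not my actual sources):

```plaintext
# Old one-line format (/etc/apt/sources.list):
deb http://deb.debian.org/debian bookworm main contrib

# deb822 format (/etc/apt/sources.list.d/debian.sources):
Types: deb
URIs: http://deb.debian.org/debian
Suites: bookworm
Components: main contrib
```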
```bash
cat > /etc/apt/sources.list.d/debian.sources << EOF
Types: deb deb-src
URIs: http://deb.debian.org/debian/
Suites: trixie trixie-updates
Components: main contrib non-free-firmware
Signed-By: /usr/share/keyrings/debian-archive-keyring.gpg

Types: deb deb-src
URIs: http://security.debian.org/debian-security/
Suites: trixie-security
Components: main contrib non-free-firmware
Signed-By: /usr/share/keyrings/debian-archive-keyring.gpg
EOF
```

#### Proxmox Sources (without subscription)

```bash
cat > /etc/apt/sources.list.d/proxmox.sources << EOF
Types: deb
URIs: http://download.proxmox.com/debian/pve
Suites: trixie
Components: pve-no-subscription
Signed-By: /usr/share/keyrings/proxmox-archive-keyring.gpg
EOF
```

#### Ceph Squid Sources (without subscription)

```bash
cat > /etc/apt/sources.list.d/ceph.sources << EOF
Types: deb
URIs: http://download.proxmox.com/debian/ceph-squid
Suites: trixie
Components: no-subscription
Signed-By: /usr/share/keyrings/proxmox-archive-keyring.gpg
EOF
```

#### Remove Old Bookworm Source Lists

The source lists for Debian Bookworm in the old one-line format must be removed:

```bash
rm -f /etc/apt/sources.list{,.d/*.list}
```

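The brace expansion makes one pattern cover both the main list and the drop-in directory; echoing it through bash shows the two resulting paths (a neutral path is used here so no globbing kicks in):

```bash
# Brace expansion (a bash feature) turns the single argument into two paths.
bash -c 'echo /nonexistent/sources.list{,.d/*.list}'
```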
### Update the Configured `apt` Repositories

Refresh the repositories:

```bash
apt update
```

```plaintext
Get:1 http://security.debian.org/debian-security trixie-security InRelease [43.4 kB]
Get:2 http://deb.debian.org/debian trixie InRelease [140 kB]
Get:3 http://download.proxmox.com/debian/ceph-squid trixie InRelease [2,736 B]
Get:4 http://download.proxmox.com/debian/pve trixie InRelease [2,771 B]
Get:5 http://deb.debian.org/debian trixie-updates InRelease [47.3 kB]
Get:6 http://security.debian.org/debian-security trixie-security/main Sources [91.1 kB]
Get:7 http://security.debian.org/debian-security trixie-security/non-free-firmware Sources [696 B]
Get:8 http://security.debian.org/debian-security trixie-security/main amd64 Packages [69.0 kB]
Get:9 http://security.debian.org/debian-security trixie-security/main Translation-en [45.1 kB]
Get:10 http://security.debian.org/debian-security trixie-security/non-free-firmware amd64 Packages [544 B]
Get:11 http://security.debian.org/debian-security trixie-security/non-free-firmware Translation-en [352 B]
Get:12 http://download.proxmox.com/debian/ceph-squid trixie/no-subscription amd64 Packages [33.2 kB]
Get:13 http://deb.debian.org/debian trixie/main Sources [10.5 MB]
Get:14 http://download.proxmox.com/debian/pve trixie/pve-no-subscription amd64 Packages [241 kB]
Get:15 http://deb.debian.org/debian trixie/non-free-firmware Sources [6,536 B]
Get:16 http://deb.debian.org/debian trixie/contrib Sources [52.3 kB]
Get:17 http://deb.debian.org/debian trixie/main amd64 Packages [9,669 kB]
Get:18 http://deb.debian.org/debian trixie/main Translation-en [6,484 kB]
Get:19 http://deb.debian.org/debian trixie/contrib amd64 Packages [53.8 kB]
Get:20 http://deb.debian.org/debian trixie/contrib Translation-en [49.6 kB]
Get:21 http://deb.debian.org/debian trixie/non-free-firmware amd64 Packages [6,868 B]
Get:22 http://deb.debian.org/debian trixie/non-free-firmware Translation-en [4,704 B]
Get:23 http://deb.debian.org/debian trixie-updates/main Sources [2,788 B]
Get:24 http://deb.debian.org/debian trixie-updates/main amd64 Packages [5,412 B]
Get:25 http://deb.debian.org/debian trixie-updates/main Translation-en [4,096 B]
Fetched 27.6 MB in 3s (8,912 kB/s)
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
666 packages can be upgraded. Run 'apt list --upgradable' to see them.
```

😈 666 packages, I'm doomed!

### Upgrade to Debian Trixie and Proxmox VE 9

Launch the upgrade:

```bash
apt-get dist-upgrade -y
```

During the process, you will be prompted to approve changes to configuration files and some service restarts. You may also be shown diffs of the changes; press `q` to exit those views. My choices:
- `/etc/issue`: Proxmox VE auto-generates this file on boot -> `No`
- `/etc/lvm/lvm.conf`: Changes relevant for Proxmox VE will be updated -> `Yes`
- `/etc/ssh/sshd_config`: Depending on your setup -> `Inspect`
- `/etc/default/grub`: Only if you changed it manually -> `Inspect`
- `/etc/chrony/chrony.conf`: If you did not make extra changes yourself -> `Yes`

The upgrade took about 5 minutes, but this depends on the hardware.

At the end of the upgrade, restart the machine:

```bash
reboot
```

### Remove Maintenance Mode

Finally, when the node (hopefully) comes back, you can disable maintenance mode. The workload that was located on that machine comes back:

```bash
ha-manager crm-command node-maintenance disable $(hostname)
```

### Post-Upgrade Validation

- Check cluster communication:

```bash
pvecm status
```

- Verify storage mount points.

- Check Ceph cluster health:

```bash
ceph status
```

- Confirm VM operations, backups, and HA rules.

HA groups have been removed in favor of HA affinity rules; existing HA groups are automatically migrated to HA rules.

- Disable the PVE Enterprise repository.

If you don't use the `pve-enterprise` repo, you can disable it:

```bash
sed -i 's/^/#/' /etc/apt/sources.list.d/pve-enterprise.sources
```

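Prefixing every line with `#` is enough, since apt treats `#` lines in a `.sources` file as comments. A demo on a sample stanza:

```bash
# Comment out every line of a deb822 stanza, disabling the repository.
printf '%s\n' \
  'Types: deb' \
  'URIs: https://enterprise.proxmox.com/debian/pve' \
  | sed 's/^/#/'
```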
🔁 This node is now upgraded to Proxmox VE 9. Proceed to the other nodes.

## Post Actions

Once the whole cluster has been upgraded, I proceed to the post actions:

- Remove the Ceph cluster `noout` flag:

```bash
ceph osd unset noout
```

- Recreate the PCI passthrough mapping.

For the VM whose host mapping I removed at the beginning of the procedure, I can now recreate the mapping.

- Add privileges to the Terraform role.

During the check phase, I was advised to remove the `VM.Monitor` privilege from my custom Terraform role. Now that the new privileges exist in Proxmox VE 9, I can assign them to that role:
- VM.GuestAgent.Audit
- VM.GuestAgent.FileRead
- VM.GuestAgent.FileWrite
- VM.GuestAgent.FileSystemMgmt
- VM.GuestAgent.Unrestricted

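If you prefer the CLI over the web GUI, the same change can be scripted with `pveum`, Proxmox's user-management tool. A minimal sketch, where the role name `TerraformRole` and the final privilege list are assumptions for illustration:

```bash
# The PVE 9 privileges that replace VM.Monitor (named in the pve8to9 output):
privs="VM.GuestAgent.Audit,VM.GuestAgent.FileRead,VM.GuestAgent.FileWrite,VM.GuestAgent.FileSystemMgmt,VM.GuestAgent.Unrestricted"
echo "$privs"
# On the cluster, they could then be granted to the custom role
# (role name is hypothetical; -privs sets the full list, so existing
# privileges of the role must be included as well):
# pveum role modify TerraformRole -privs "$privs"
```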
## Conclusion

🎉 My Proxmox VE cluster is now on version 9!

The upgrade process was pretty smooth, without any downtime for my resources.

I now have access to HA affinity rules, which I needed for my OPNsense cluster.

As you may have noticed, I don't keep my nodes up to date very often. I might automate this next time, to keep them updated without any effort.

---
slug: migration-opnsense-proxmox-highly-available
title: Migrating to My Highly Available OPNsense Cluster in Proxmox VE
description: A detailed walkthrough of migrating my physical OPNsense box to a highly available VM cluster in Proxmox VE.
date: 2025-11-20
draft: false
tags:
- opnsense
- high-availability
- proxmox
categories:
- homelab
---
## Intro

This is the final step of my **OPNsense** virtualization adventure.

A few months ago, my [physical OPNsense box crashed]({{< ref "post/10-opnsense-crash-disk-panic" >}}) because of a hardware failure. It plunged my house into darkness, literally. No network, no lights.

💡 To avoid finding myself in that situation again, I came up with a plan to virtualize my OPNsense firewall in my **Proxmox VE** cluster. Last time, I built a [proof of concept]({{< ref "post/12-opnsense-virtualization-highly-available" >}}) to validate the solution: create a cluster of two **OPNsense** VMs in Proxmox and make the firewall highly available.

This time, I'll cover building my future OPNsense cluster from scratch, planning the cutover, and finally migrating away from my current physical box. Let's go!

---

## The VLAN Configuration

For my plan, I need to bring the WAN, coming from my ISP box, onto my main switch. For that, I create a dedicated VLAN to carry this traffic to my Proxmox nodes.

### UniFi

First, I configure my layer 2 network, which is managed by UniFi. There, I need to create two VLANs:

- _WAN_ (20): carries the WAN between my ISP box and my Proxmox nodes.
- _pfSync_ (44): communication between my OPNsense nodes.

In the UniFi controller, under `Settings` > `Networks`, I add a `New Virtual Network`. I name it `WAN` and give it VLAN ID 20:

I do the same for the `pfSync` VLAN with VLAN ID 44.

I plan to plug my ISP box into port 15 of my switch, which is disabled for now. I enable it, set the native VLAN to the new `WAN (20)`, and disable trunking:

Once this setting is applied, I make sure that only the ports where my Proxmox nodes are connected propagate these VLANs on their trunks.

The UniFi configuration is done.

### Proxmox SDN

Now that the VLANs can reach my nodes, I want to manage them in the Proxmox SDN. I set up the SDN in [this article]({{< ref "post/11-proxmox-cluster-networking-sdn" >}}).

In `Datacenter` > `SDN` > `VNets`, I create a new VNet. I call it `vlan20` to follow my own naming convention, give it the alias _WAN_, and use tag (VLAN ID) 20:

I also create `vlan44` for the _pfSync_ VLAN, then apply the configuration, and we are done with the SDN.

---

## Creating the VMs

Now that the VLAN configuration is done, I can start building the virtual machines on Proxmox.

The first VM is called `cerbere-head1` (did I not tell you? My current firewall is named `cerbere`; it makes even more sense now!). Here are the settings:

- **OS type**: Linux (even though OPNsense is FreeBSD-based)
- **Machine type**: `q35`
- **BIOS**: `OVMF (UEFI)`
- **Disk**: 20 GB on distributed Ceph storage
- **RAM**: 4 GB, ballooning disabled
- **CPU**: 2 vCPUs
- **NICs**, firewall disabled:
  1. `vmbr0` (_Mgmt_)
  2. `vlan20` (_WAN_)
  3. `vlan13` _(User)_
  4. `vlan37` _(IoT)_
  5. `vlan44` _(pfSync)_
  6. `vlan55` _(DMZ)_
  7. `vlan66` _(Lab)_

ℹ️ I now clone this VM to create `cerbere-head2`, then proceed with the OPNsense installation. I don't want to go too deep into the installation details; I already documented them in the [proof of concept]({{< ref "post/12-opnsense-virtualization-highly-available" >}}).

After installing both OPNsense instances, I assign each its IP on the _Mgmt_ network:
- `cerbere-head1`: `192.168.88.2/24`
- `cerbere-head2`: `192.168.88.3/24`

As long as these routers don't manage the networks yet, I give them my current OPNsense router (`192.168.88.1`) as their gateway, so I can reach them from my laptop in another VLAN.

---

## OPNsense Configuration

Initially, I considered restoring my existing OPNsense configuration and adapting it to the new installation.

Then I decided to start from scratch, to document and share the procedure. As that part grew too long, I chose to write a dedicated article.

📖 You can find the details of the full OPNsense configuration in this [article]({{< ref "post/13-opnsense-full-configuration" >}}), covering HA, DNS, DHCP, VPN, and reverse proxy.

---

## Highly Available Proxmox VMs

Resources (VMs or LXC containers) in Proxmox VE can be marked as highly available. Let's see how to configure them.

### Prerequisites for Proxmox HA

First, your Proxmox cluster must support it. There are a few requirements:

- At least 3 nodes to have quorum
- Shared storage for your resources
- Synchronized clocks
- A reliable network

A fencing mechanism must be enabled. Fencing is the process of isolating a failed cluster node to make sure it no longer accesses shared resources. This avoids split-brain situations and lets Proxmox HA safely restart the affected VMs on healthy nodes. By default, it uses the Linux software watchdog, _softdog_, which is sufficient for me.

In Proxmox VE 8, it was possible to create HA groups based on resources, locations, and so on. In Proxmox VE 9, these have been replaced by HA affinity rules. This is the main reason behind the upgrade of my Proxmox VE cluster, which I detailed in this [post]({{< ref "post/14-proxmox-cluster-upgrade-8-to-9-ceph" >}}).

### Configure HA for the VMs

The Proxmox cluster is able to provide HA for resources, but you have to define the rules.

In `Datacenter` > `HA`, you can see the status and manage resources. In the `Resources` panel, I click `Add`. I pick the resource to configure for HA from the list, here `cerbere-head1` with ID 122. Then, in the dialog, I can set the maximum number of restarts and relocations; I leave `Failback` enabled and the requested state at `started`:

The Proxmox cluster will now make sure this VM is started. I do the same for the other OPNsense VM, `cerbere-head2`.

### HA Affinity Rules

Great, but I don't want them running on the same node. This is where the new HA affinity rules of Proxmox VE 9 come in. Proxmox lets you create node affinity and resource affinity rules. I don't care which node they run on, but I don't want them together, so I need a resource affinity rule.

In `Datacenter` > `HA` > `Affinity Rules`, I add a new HA resource affinity rule. I select both VMs and choose the `Keep Separate` option:

✅ My OPNsense VMs are now fully ready!

---

## Migration

🚀 Time to make it real!

I won't lie, I'm pretty excited. I've been working toward this moment for days.

### The Migration Plan

My physical OPNsense box is directly connected to my ISP box. I want to replace it with the VM cluster. (To avoid writing the word OPNsense on every line, I'll simply call the old instance "the box" and the new one "the VM".)

Here is the plan:

1. Back up the box's configuration.
2. Disable the DHCP server on the box.
3. Change the box's IP addresses.
4. Change the VIPs on the VM.
5. Disable the gateway on the VM.
6. Configure DHCP on both VMs.
7. Enable the mDNS repeater on the VM.
8. Replicate the services to the VM.
9. Move the Ethernet cable.

### Rollback Strategy

None. 😎

Just kidding. The rollback consists of restoring the box's configuration, shutting down the OPNsense VMs, and plugging the Ethernet cable back into the box.

### Verification Plan

To validate the migration, I draw up a checklist:

1. WAN DHCP lease on the VM.
2. Ping from my PC to the User VLAN VIP.
3. Ping between VLANs.
4. SSH to my machines.
5. Renew a DHCP lease.
6. Check `ipconfig`.
7. Test access to some websites.
8. Check the firewall logs.
9. Check my web services.
10. Check that my internal services are not reachable from outside.
11. Test the VPN.
12. Check all IoT devices.
13. Check the Home Assistant features.
14. Check that the TV works.
15. Test the Chromecast.
16. Print something.
17. Check the DNS blocklist.
18. Speedtest.
19. Switchover.
20. Failover.
21. Disaster recovery.
22. Champagne!

Will it work? We'll see!

### Migration Steps

1. **Back up the box's configuration.**

On my physical OPNsense instance, in `System` > `Configuration` > `Backups`, I click the `Download configuration` button, which gives me the precious XML file. The one that saved me [last time]({{< ref "post/10-opnsense-crash-disk-panic" >}}).

2. **Disable the DHCP server on the box.**

In `Services` > `ISC DHCPv4`, for all my interfaces, I disable the DHCP server. I only provide DHCPv4 on my network.

3. **Change the box's IP addresses.**

In `Interfaces`, for all my interfaces, I change the firewall's IP from `.1` to `.253`. I want to reuse the same IP address as the VIP, while keeping this instance reachable if needed.

As soon as I click `Apply`, I lose communication, which is expected.

4. **Change the VIPs on the VM.**

On my master VM, in `Interfaces` > `Virtual IPs` > `Settings`, I change the VIP address of each interface to `.1`.

5. **Disable the gateway on the VM.**

In `System` > `Gateways` > `Configuration`, I disable `LAN_GW`, which is no longer needed.

6. **Configure DHCP on both VMs.**

On both VMs, in `Services` > `Dnsmasq DNS & DHCP`, I enable the service on my 5 interfaces.

7. **Enable the mDNS repeater on the VM.**

In `Services` > `mDNS Repeater`, I enable the service and also enable `CARP Failover`.

The service does not start. I'll look at this problem later.

8. **Replicate the services to the VM.**

In `System` > `High Availability` > `Status`, I click the `Synchronize and reconfigure all` button.

9. **Move the Ethernet cable.**

Physically, in my rack, I unplug the Ethernet cable from the WAN port (`igc0`) of my physical OPNsense box and plug it into port 15 of my UniFi switch.

---

## Verification

😮‍💨 I take a deep breath and start the verification phase.

### Checklist

- ✅ WAN DHCP lease on the VM.
- ✅ Ping from my PC to the User VLAN VIP.
- ⚠️ Ping between VLANs.

  Pings work, but I observe some loss, around 10%.

- ✅ SSH to my machines.
- ✅ DHCP lease renewal.
- ✅ Check `ipconfig`.
- ❌ Test access to some websites. → ✅

  A few sites work, but everything is incredibly slow... It must be DNS. I try to resolve a random domain: it works. But I can't resolve `google.com`. I restart the Unbound DNS service, and everything works now. It's always DNS...

- ⚠️ Check the firewall logs.

  A few flows are blocked, nothing critical.

- ✅ Check my web services.
- ✅ Check that my internal services are not reachable from outside.
- ✅ Test the VPN.
- ✅ Check all IoT devices.
- ✅ Check the Home Assistant features.
- ✅ Check that the TV works.
- ❌ Test the Chromecast.

  This is related to the mDNS service failing to start. I can start it if I untick the `CARP Failover` option. The Chromecast is visible now. → ⚠️

- ✅ Print something.
- ✅ Check the DNS blocklist.
- ✅ Speedtest.

  I observe about a 15% drop in bandwidth (from 940 Mbps down to 825 Mbps).

- ❌ Switchover.

  The switchover barely works; many packets are lost during the transition. The resulting service is not great: no more internet access, and my web services are unreachable.

- ⌛ Failover.
- ⌛ Disaster recovery.

  To be tested later.

📝 Bon, les résultats sont plutôt bons, pas parfaits, mais satisfaisants !
|
||||||
|
### Résolution des Problèmes
|
||||||
|
|
||||||
|
Je me concentre sur la résolution des problèmes restants rencontrés lors des tests.
|
||||||
|
|
||||||
|
1. **DNS**
|
||||||
|
|
||||||
|
Lors de la bascule, la connexion internet ne fonctionne pas. Pas de DNS, c'est toujours le DNS.
|
||||||
|
|
||||||
|
C'est parce que le nœud de secours n'a pas de passerelle lorsqu'il est en mode passif. L'absence de passerelle empêche le DNS de résoudre. Après la bascule, il conserve des domaines non résolus dans son cache. Ce problème conduit aussi à un autre souci : quand il est passif, je ne peux pas mettre à jour le système.
|
||||||
|
|
||||||
|
**Solution** : Définir une passerelle pointant vers l'autre nœud, avec un numéro de priorité plus élevé que la passerelle WAN (un numéro plus élevé signifie une priorité plus basse). Ainsi, cette passerelle n'est pas active tant que le nœud est maître.
|
||||||
|
|
||||||
|
2. **Reverse Proxy**
|
||||||
|
|
||||||
|
Lors de la bascule, tous les services web que j'héberge (reverse proxy/proxy couche 4) renvoient cette erreur : `SSL_ERROR_INTERNAL_ERROR_ALERT`. Après vérification des services synchronisés via XMLRPC Sync, Caddy et mDNS repeater n'étaient pas sélectionnés. C'est parce que ces services ont été installés après la configuration initiale du HA.
|
||||||
|
|
||||||
|
**Solution** : Ajouter Caddy à XMLRPC Sync.
|
||||||
|
|
||||||
|
3. **Pertes de paquets**
|
||||||
|
|
||||||
|
J'observe environ 10 % de pertes de paquets pour les pings depuis n'importe quel VLAN vers le VLAN _Mgmt_. Je n'ai pas ce problème pour les autres VLANs.
|
||||||
|
|
||||||
|
Le VLAN _Mgmt_ est le VLAN natif dans mon réseau, cela pourrait être la raison de ce problème. C'est le seul réseau non défini dans le SDN Proxmox. Je ne veux pas avoir à tagger ce VLAN.
|
||||||
|
|
||||||
|
**Solution** : Désactiver le pare-feu Proxmox de cette interface pour la VM. En réalité, je les ai tous désactivés et j'ai mis à jour la documentation ci-dessus. Je ne sais pas exactement pourquoi cela causait ce type de problème, mais la désactivation a résolu mon souci (j'ai pu reproduire le comportement en réactivant le pare-feu).
|
||||||
|
|
||||||
|
4. **Script CARP**
|
||||||
|
|
||||||
|
Lors de la bascule, le script d'événement CARP est déclenché autant de fois qu'il y a d'interfaces. J'ai 5 IPs virtuelles, le script reconfigure mon interface WAN 5 fois.
|
||||||
|
|
||||||
|
**Solution** : Retravailler le script pour récupérer l'état de l'interface WAN et ne reconfigurer l'interface que lorsque c'est nécessaire :
|
||||||
|
```php
|
||||||
|
#!/usr/local/bin/php
|
||||||
|
<?php
|
||||||
|
/**
|
||||||
|
* OPNsense CARP event script
|
||||||
|
* - Enables/disables the WAN interface only when needed
|
||||||
|
* - Avoids reapplying config when CARP triggers multiple times
|
||||||
|
*/
|
||||||
|
|
||||||
|
require_once("config.inc");
|
||||||
|
require_once("interfaces.inc");
|
||||||
|
require_once("util.inc");
|
||||||
|
require_once("system.inc");
|
||||||
|
|
||||||
|
// Read CARP event arguments
|
||||||
|
$subsystem = !empty($argv[1]) ? $argv[1] : '';
|
||||||
|
$type = !empty($argv[2]) ? $argv[2] : '';
|
||||||
|
|
||||||
|
// Accept only MASTER/BACKUP events
|
||||||
|
if (!in_array($type, ['MASTER', 'BACKUP'])) {
|
||||||
|
// Ignore CARP INIT, DEMOTED, etc.
|
||||||
|
exit(0);
|
||||||
|
}
|
||||||
|
|
||||||
|
// Validate subsystem name format, expected pattern: <ifname>@<vhid>
|
||||||
|
if (!preg_match('/^[a-z0-9_]+@\S+$/i', $subsystem)) {
|
||||||
|
log_error("Malformed subsystem argument: '{$subsystem}'.");
|
||||||
|
exit(0);
|
||||||
|
}
|
||||||
|
|
||||||
|
// Interface key to manage
|
||||||
|
$ifkey = 'wan';
|
||||||
|
// Determine whether WAN interface is currently enabled
|
||||||
|
$ifkey_enabled = !empty($config['interfaces'][$ifkey]['enable']) ? true : false;
|
||||||
|
|
||||||
|
// MASTER event
|
||||||
|
if ($type === "MASTER") {
|
||||||
|
// Enable WAN only if it's currently disabled
|
||||||
|
if (!$ifkey_enabled) {
|
||||||
|
log_msg("CARP event: switching to '$type', enabling interface '$ifkey'.", LOG_WARNING);
|
||||||
|
$config['interfaces'][$ifkey]['enable'] = '1';
|
||||||
|
write_config("enable interface '$ifkey' due to CARP event '$type'", false);
|
||||||
|
interface_configure(false, $ifkey, false, false);
|
||||||
|
} else {
|
||||||
|
log_msg("CARP event: already '$type' for interface '$ifkey', nothing to do.");
|
||||||
|
}
|
||||||
|
|
||||||
|
// BACKUP event
|
||||||
|
} else {
|
||||||
|
// Disable WAN only if it's currently enabled
|
||||||
|
if ($ifkey_enabled) {
|
||||||
|
log_msg("CARP event: switching to '$type', disabling interface '$ifkey'.", LOG_WARNING);
|
||||||
|
unset($config['interfaces'][$ifkey]['enable']);
|
||||||
|
write_config("disable interface '$ifkey' due to CARP event '$type'", false);
|
||||||
|
interface_configure(false, $ifkey, false, false);
|
||||||
|
} else {
|
||||||
|
log_msg("CARP event: already '$type' for interface '$ifkey', nothing to do.");
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
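Pour référence, OPNsense exécute les scripts d'événement CARP placés dans son répertoire syshook ; l'installation du script pourrait ressembler à ceci (le nom du fichier cible est un choix personnel) :

```shell
# Déployer le hook CARP ; OPNsense exécute les scripts de ce répertoire
# à chaque événement CARP.
cp carp_wan.php /usr/local/etc/rc.syshook.d/carp/10-wan
chmod +x /usr/local/etc/rc.syshook.d/carp/10-wan
```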
|
||||||
|
|
||||||
|
5. **mDNS Repeater**
|
||||||
|
|
||||||
|
Le répéteur mDNS ne veut pas démarrer quand je sélectionne l'option `CARP Failover`.
|
||||||
|
|
||||||
|
**Solution** : La machine nécessite un redémarrage pour démarrer ce service compatible CARP.
|
||||||
|
|
||||||
|
6. **Adresse IPv6**
|
||||||
|
|
||||||
|
Mon nœud `cerbere-head1` crie dans le fichier de logs tandis que l'autre ne le fait pas. Voici les messages affichés chaque seconde quand il est maître :
|
||||||
|
```plaintext
|
||||||
|
Warning rtsold <interface_up> vtnet1 is disabled.
|
||||||
|
```
|
||||||
|
|
||||||
|
Un autre message que j'obtiens plusieurs fois après un switchback :
|
||||||
|
```plaintext
|
||||||
|
Error dhcp6c transmit failed: Can't assign requested address
|
||||||
|
```
|
||||||
|
|
||||||
|
Ceci est lié à IPv6. J'observe que mon nœud principal n'a pas d'adresse IPv6 globale, seulement une link-local. De plus, il n'a pas de passerelle IPv6. Mon nœud secondaire, en revanche, a à la fois l'adresse globale et la passerelle.
|
||||||
|
|
||||||
|
Je ne suis pas expert IPv6, après quelques heures de recherche, j'abandonne IPv6. Si quelqu'un peut m'aider, ce serait vraiment apprécié !
|
||||||
|
|
||||||
|
**Contournement** : Supprimer DHCPv6 pour mon interface WAN.
|
||||||
|
|
||||||
|
### Confirmation
|
||||||
|
|
||||||
|
Maintenant que tout est corrigé, je peux évaluer les performances du failover.
|
||||||
|
|
||||||
|
1. **Basculement**
|
||||||
|
|
||||||
|
En entrant manuellement en mode maintenance CARP depuis l'interface WebGUI, aucune perte de paquets n'est observée. Impressionnant.
|
||||||
|
|
||||||
|
2. **Failover**
|
||||||
|
|
||||||
|
Pour simuler un failover, je tue la VM OPNsense active. Ici j'observe une seule perte de paquet. Génial.
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
3. **Reprise après sinistre**
|
||||||
|
|
||||||
|
Une reprise après sinistre est ce qui se produirait après un arrêt complet d'un cluster Proxmox, suite à une coupure de courant par exemple. Je n'ai pas eu le temps (ni le courage) de m'en occuper, je préfère mieux me préparer pour éviter les dommages collatéraux. Mais il est certain que ce genre de scénario doit être évalué.
|
||||||
|
|
||||||
|
### Avantages Supplémentaires
|
||||||
|
|
||||||
|
Outre le fait que cette nouvelle configuration est plus résiliente, j'ai constaté quelques autres avantages.
|
||||||
|
|
||||||
|
Mon rack est minuscule et l'espace est restreint. L'ensemble chauffe beaucoup, dépassant les 40 °C au sommet du rack en été. Réduire le nombre de machines allumées a permis de faire baisser la température. J'ai gagné 1,5 °C après avoir éteint l'ancien boîtier OPNsense, c'est super !
|
||||||
|
|
||||||
|
La consommation électrique est également un point important, mon petit datacenter consommait en moyenne 85 W. Là encore, j'ai constaté une légère baisse, d'environ 8 W. Sachant que le système fonctionne 24/7, ce n'est pas négligeable.
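À la louche, ces 8 W économisés représentent une économie annuelle non négligeable :

```shell
# Économie annuelle d'énergie pour ~8 W de réduction moyenne.
watts=8
kwh_par_an=$(( watts * 24 * 365 / 1000 ))
echo "${kwh_par_an} kWh économisés par an"   # affiche : 70 kWh économisés par an
```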
|
||||||
|
|
||||||
|
Enfin, j'ai également retiré le boîtier lui-même et le câble d'alimentation. Les places sont très limitées, ce qui est un autre point positif.
|
||||||
|
|
||||||
|
---
|
||||||
|
## Conclusion
|
||||||
|
|
||||||
|
🎉 J'ai réussi les gars ! Je suis très fier du résultat, et fier de moi.
|
||||||
|
|
||||||
|
De mon [premier crash de ma box OPNsense]({{< ref "post/10-opnsense-crash-disk-panic" >}}), à la recherche d'une solution, en passant par la [proof of concept]({{< ref "post/12-opnsense-virtualization-highly-available" >}}) de haute disponibilité, jusqu'à cette migration, ce fut un projet assez long, mais extrêmement intéressant.
|
||||||
|
|
||||||
|
🎯 Se fixer des objectifs, c'est bien, mais les atteindre, c'est encore mieux.
|
||||||
|
|
||||||
|
Je vais maintenant mettre OPNsense de côté un petit moment pour me recentrer sur mon apprentissage de Kubernetes !
|
||||||
|
|
||||||
|
Comme toujours, si vous avez des questions, des remarques ou une solution à mon problème d'IPv6, je serai ravi d'échanger avec vous.
|
||||||
420
content/post/15-migration-opnsense-proxmox-highly-available.md
Normal file
@@ -0,0 +1,420 @@
|
|||||||
|
---
|
||||||
|
slug: migration-opnsense-proxmox-highly-available
|
||||||
|
title: Migration to my OPNsense HA Cluster in Proxmox VE
|
||||||
|
description: The detailed steps of the migration from my OPNsense physical box to a highly available cluster of VMs in Proxmox VE.
|
||||||
|
date: 2025-11-20
|
||||||
|
draft: false
|
||||||
|
tags:
|
||||||
|
- opnsense
|
||||||
|
- high-availability
|
||||||
|
- proxmox
|
||||||
|
categories:
|
||||||
|
- homelab
|
||||||
|
---
|
||||||
|
## Intro
|
||||||
|
|
||||||
|
This is the final stage of my **OPNsense** virtualization journey.
|
||||||
|
|
||||||
|
A few months ago, my physical [OPNsense box crashed]({{< ref "post/10-opnsense-crash-disk-panic" >}}) because of a hardware failure. This left my home in the dark, literally. No network, no lights.
|
||||||
|
|
||||||
|
💡 To avoid being in that situation again, I imagined a plan to virtualize my OPNsense firewall into my **Proxmox VE** cluster. Last time, I set up a [proof of concept]({{< ref "post/12-opnsense-virtualization-highly-available" >}}) to validate this solution: creating a cluster of two **OPNsense** VMs in Proxmox and making the firewall highly available.
|
||||||
|
|
||||||
|
This time, I will cover the creation of my future OPNsense cluster from scratch, plan the cutover, and finally migrate from my current physical box. Let's go!
|
||||||
|
|
||||||
|
---
|
||||||
|
## The VLAN Configuration
|
||||||
|
|
||||||
|
For my plan, I have to connect the WAN, coming from my ISP box, to my main switch. For that, I create a dedicated VLAN to transport this traffic to my Proxmox nodes.
|
||||||
|
|
||||||
|
### UniFi
|
||||||
|
|
||||||
|
First, I configure my layer 2 network which is managed by UniFi. There I need to create two VLANs:
|
||||||
|
- *WAN* (20): transports the WAN traffic between my ISP box and my Proxmox nodes.
|
||||||
|
- *pfSync* (44): communication between my OPNsense nodes.
|
||||||
|
|
||||||
|
In the UniFi controller, in `Settings` > `Networks`, I add a `New Virtual Network`. I name it `WAN` and give it the VLAN ID 20:
|
||||||
|

|
||||||
|
|
||||||
|
I do the same thing again for the `pfSync` VLAN with the VLAN ID 44.
|
||||||
|
|
||||||
|
I plan to plug my ISP box into port 15 of my switch, which is disabled for now. I set it as active, set the native VLAN to the newly created `WAN (20)` and disable trunking:
|
||||||
|

|
||||||
|
|
||||||
|
Once this setting is applied, I make sure that only the ports where my Proxmox nodes are connected propagate these VLANs on their trunks.
|
||||||
|
|
||||||
|
I'm done with UniFi configuration.
|
||||||
|
|
||||||
|
### Proxmox SDN
|
||||||
|
|
||||||
|
Now that the VLAN can reach my nodes, I want to handle it in the Proxmox SDN. I've configured the SDN in [that article]({{< ref "post/11-proxmox-cluster-networking-sdn" >}}).
|
||||||
|
|
||||||
|
In `Datacenter` > `SDN` > `VNets`, I create a new VNet, call it `vlan20` to follow my own naming convention, give it the *WAN* alias and use the tag (VLAN ID) 20:
|
||||||
|

|
||||||
|
|
||||||
|
I also create the `vlan44` for the *pfSync* VLAN, then I apply this configuration and we are done with the SDN.
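Under the hood, the applied VNets end up in `/etc/pve/sdn/vnets.cfg`, roughly like this (the zone name here is an assumption; it depends on the VLAN zone created during the initial SDN setup):

```plaintext
vnet: vlan20
        zone homelab
        alias WAN
        tag 20

vnet: vlan44
        zone homelab
        alias pfSync
        tag 44
```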
|
||||||
|
|
||||||
|
---
|
||||||
|
## Create the VMs
|
||||||
|
|
||||||
|
Now that the VLAN configuration is done, I can start building the virtual machines on Proxmox.
|
||||||
|
|
||||||
|
The first VM is named `cerbere-head1` (I didn't tell you? My current firewall is named `cerbere`, it makes even more sense now!). Here are the settings:
|
||||||
|
- **OS type**: Linux (even though OPNsense is based on FreeBSD)
|
||||||
|
- **Machine type**: `q35`
|
||||||
|
- **BIOS**: `OVMF (UEFI)`
|
||||||
|
- **Disk**: 20 GB on Ceph distributed storage
|
||||||
|
- **RAM**: 4 GB RAM, ballooning disabled
|
||||||
|
- **CPU**: 2 vCPU
|
||||||
|
- **NICs**, firewall disabled:
|
||||||
|
1. `vmbr0` (*Mgmt*)
|
||||||
|
2. `vlan20` (*WAN*)
|
||||||
|
3. `vlan13` *(User)*
|
||||||
|
4. `vlan37` *(IoT)*
|
||||||
|
5. `vlan44` *(pfSync)*
|
||||||
|
6. `vlan55` *(DMZ)*
|
||||||
|
7. `vlan66` *(Lab)*
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
ℹ️ Now I clone that VM to create `cerbere-head2`, then I proceed with the OPNsense installation. I won't go into much detail about it here, as I already documented it in the [proof of concept]({{< ref "post/12-opnsense-virtualization-highly-available" >}}).
|
||||||
|
|
||||||
|
After the installation of both OPNsense instances, I give each of them an IP in the *Mgmt* network:
|
||||||
|
- `cerbere-head1`: `192.168.88.2/24`
|
||||||
|
- `cerbere-head2`: `192.168.88.3/24`
|
||||||
|
|
||||||
|
I give them the other OPNsense node as gateway (`192.168.88.1`) to allow me to reach them from my laptop in another VLAN.
|
||||||
|
|
||||||
|
---
|
||||||
|
## Configure OPNsense
|
||||||
|
|
||||||
|
Initially, I considered restoring my existing OPNsense configuration and adapting it to the new setup.
|
||||||
|
|
||||||
|
Then I decided to start over, to document and share the process. That part was getting so long that I preferred to create a dedicated post instead.
|
||||||
|
|
||||||
|
📖 You can find the details of the full OPNsense configuration in that [article]({{< ref "post/13-opnsense-full-configuration" >}}), covering HA, DNS, DHCP, VPN and reverse proxy.
|
||||||
|
|
||||||
|
---
|
||||||
|
## Proxmox VM High Availability
|
||||||
|
|
||||||
|
Resources (VMs or LXC containers) in Proxmox VE can be tagged as highly available; let's see how to set it up.
|
||||||
|
|
||||||
|
### Proxmox HA Requirements
|
||||||
|
|
||||||
|
First, your Proxmox cluster must allow it. There are some requirements:
|
||||||
|
- At least 3 nodes to have quorum
|
||||||
|
- Shared storage for your resources
|
||||||
|
- Time synchronized
|
||||||
|
- Reliable network
|
||||||
|
|
||||||
|
A fencing mechanism must be enabled. Fencing is the process of isolating a failed cluster node to ensure it no longer accesses shared resources. This prevents split-brain situations and allows Proxmox HA to safely restart affected VMs on healthy nodes. By default, it uses the Linux software watchdog, *softdog*, which is good enough for me.
|
||||||
|
|
||||||
|
In Proxmox VE 8, it was possible to create HA groups based on resources, locations, etc. In Proxmox VE 9, these have been replaced by HA affinity rules. This is actually the main reason behind my Proxmox VE cluster upgrade, which I've detailed in that [post]({{< ref "post/14-proxmox-cluster-upgrade-8-to-9-ceph" >}}).
|
||||||
|
|
||||||
|
### Configure VM HA
|
||||||
|
|
||||||
|
The Proxmox cluster is able to provide HA for the resources, but you need to define the rules.
|
||||||
|
|
||||||
|
In `Datacenter` > `HA`, you can see the status and manage the resources. In the `Resources` panel, I click on `Add` and pick the resource to configure as HA from the list, here `cerbere-head1` with ID 122. Then in the dialog, I define the maximum number of restarts and relocations, keep `Failback` enabled and set the requested state to `started`:
|
||||||
|

|
||||||
|
|
||||||
|
The Proxmox cluster will now make sure this VM is started. I do the same for the other OPNsense VM, `cerbere-head2`.
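For reference, the same HA registration can be done from a node's shell with the `ha-manager` CLI. A sketch, assuming my VM IDs (122 and 123):

```shell
# Register both OPNsense VMs as HA resources (VM IDs are from my cluster).
ha-manager add vm:122 --state started --max_restart 1 --max_relocate 1
ha-manager add vm:123 --state started --max_restart 1 --max_relocate 1
# Show the current HA manager status
ha-manager status
```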
|
||||||
|
|
||||||
|
### HA Affinity Rules
|
||||||
|
|
||||||
|
Great, but I don't want them on the same node. This is where the new Proxmox VE 9 feature, HA affinity rules, comes in. Proxmox allows creating node affinity and resource affinity rules. I don't mind which node they run on, but I don't want them together: I need a resource affinity rule.
|
||||||
|
|
||||||
|
In `Datacenter` > `HA` > `Affinity Rules`, I add a new HA resource affinity rule. I select both VMs and pick the option `Keep Separate`:
|
||||||
|

|
||||||
|
|
||||||
|
✅ My OPNsense VMs are now fully ready!
|
||||||
|
|
||||||
|
---
|
||||||
|
## Migration
|
||||||
|
|
||||||
|
🚀 Time to make it real!
|
||||||
|
|
||||||
|
I'm not gonna lie, I'm quite excited. I've been working toward this moment for days.
|
||||||
|
|
||||||
|
### The Migration Plan
|
||||||
|
|
||||||
|
I have my physical OPNsense box directly connected to my ISP box. I want to swap it for the VM cluster. (To avoid writing the word OPNsense on every line, I'll simply call them "the box" and "the VM".)
|
||||||
|
|
||||||
|
Here is the plan:
|
||||||
|
1. Backup of the box configuration.
|
||||||
|
2. Disable DHCP server on the box.
|
||||||
|
3. Change IP addresses of the box.
|
||||||
|
4. Change VIP on the VM.
|
||||||
|
5. Disable gateway on VM.
|
||||||
|
6. Configure DHCP on both VMs.
|
||||||
|
7. Enable mDNS repeater on VM.
|
||||||
|
8. Replicate services on VM.
|
||||||
|
9. Move the Ethernet cable.
|
||||||
|
### Rollback Strategy
|
||||||
|
|
||||||
|
None. 😎
|
||||||
|
|
||||||
|
I'm kidding: the rollback consists of restoring the box configuration, shutting down the OPNsense VMs and plugging the Ethernet cable back into the box.
|
||||||
|
|
||||||
|
### Verification Plan
|
||||||
|
|
||||||
|
To validate the migration, I'm drawing up a checklist:
|
||||||
|
1. WAN DHCP lease in the VM.
|
||||||
|
2. Ping from my PC to the VIP of the User VLAN.
|
||||||
|
3. Ping cross VLAN.
|
||||||
|
4. SSH into my machines.
|
||||||
|
5. Renew DHCP lease.
|
||||||
|
6. Check `ipconfig`
|
||||||
|
7. Test internet website.
|
||||||
|
8. Check firewall logs.
|
||||||
|
9. Check my webservices.
|
||||||
|
10. Verify that my internal webservices are not accessible from outside.
|
||||||
|
11. Test VPN.
|
||||||
|
12. Check all IoT devices.
|
||||||
|
13. Check Home Assistant features.
|
||||||
|
14. Check if the TV works.
|
||||||
|
15. Test the Chromecast.
|
||||||
|
16. Print something.
|
||||||
|
17. Verify DNS blocklist.
|
||||||
|
18. Speedtest.
|
||||||
|
19. Switchover.
|
||||||
|
20. Failover.
|
||||||
|
21. Disaster Recovery.
|
||||||
|
22. Champagne!
|
||||||
|
|
||||||
|
Will it work? Let's find out!
|
||||||
|
|
||||||
|
### Migration Steps
|
||||||
|
|
||||||
|
1. **Backup of the box configuration.**
|
||||||
|
|
||||||
|
On my physical OPNsense instance, in `System` > `Configuration` > `Backups`, I click the `Download configuration` button, which gives me the precious XML file. The one that saved my ass the [last time]({{< ref "post/10-opnsense-crash-disk-panic" >}}).
|
||||||
|
|
||||||
|
2. **Disable DHCP server on the box.**
|
||||||
|
|
||||||
|
In `Services` > `ISC DHCPv4`, and for all my interfaces, I disable the DHCP server. I only serve DHCPv4 in my network.
|
||||||
|
|
||||||
|
3. **Change IP addresses of the box.**
|
||||||
|
|
||||||
|
In `Interfaces`, and for all my interfaces, I change the IP of the firewall from `.1` to `.253`. I want to reuse the same IP address as the VIP, and keep this instance reachable if needed.
|
||||||
|
|
||||||
|
As soon as I click `Apply`, I lose communication, which is expected.
|
||||||
|
|
||||||
|
4. **Change VIP on the VM.**
|
||||||
|
|
||||||
|
On my master VM, In `Interfaces` > `Virtual IPs` > `Settings`, I change the VIP address for each interface and set it to `.1`.
|
||||||
|
|
||||||
|
5. **Disable gateway on VM.**
|
||||||
|
|
||||||
|
In `System` > `Gateways` > `Configuration`, I disable the `LAN_GW` which is not needed anymore.
|
||||||
|
|
||||||
|
6. **Configure DHCP on both VMs.**
|
||||||
|
|
||||||
|
On both VMs, in `Services` > `Dnsmasq DNS & DHCP`, I enable the service on my 5 interfaces.
|
||||||
|
|
||||||
|
7. **Enable mDNS repeater on VM.**
|
||||||
|
|
||||||
|
In `Services` > `mDNS Repeater`, I enable the service and also enable CARP Failover.
|
||||||
|
|
||||||
|
The service does not start. I'll look into that problem later.
|
||||||
|
|
||||||
|
8. **Replicate services on VM.**
|
||||||
|
|
||||||
|
In `System` > `High Availability` > `Status`, I click the button to `Synchronize and reconfigure all`.
|
||||||
|
|
||||||
|
9. **Move the Ethernet cable.**
|
||||||
|
|
||||||
|
Physically in my rack, I unplug the Ethernet cable from the WAN port (`igc0`) of my physical OPNsense box and plug it into port 15 of my UniFi switch.
|
||||||
|
|
||||||
|
---
|
||||||
|
## Verification
|
||||||
|
|
||||||
|
😮💨 I take a deep breath and start the verification phase.
|
||||||
|
|
||||||
|
### Checklist
|
||||||
|
|
||||||
|
- ✅ WAN DHCP lease in the VM.
|
||||||
|
- ✅ Ping from my PC to the VIP of the User VLAN.
|
||||||
|
- ⚠️ Ping cross VLAN.
|
||||||
|
Pings are working, but I observe some drops, about 10%.
|
||||||
|
- ✅ SSH into my machines.
|
||||||
|
- ✅ Renew DHCP lease.
|
||||||
|
- ✅ Check `ipconfig`
|
||||||
|
- ❌ Test internet website. → ✅
|
||||||
|
A few websites are working, but everything is incredibly slow... It must be the DNS. I try to look up a random domain; it works. But I can't resolve `google.com`. I restart the Unbound DNS service and everything works now. It is always the DNS...
|
||||||
|
- ⚠️ Check firewall logs.
|
||||||
|
A few flows are blocked, nothing critical.
|
||||||
|
- ✅ Check my webservices.
|
||||||
|
- ✅ Verify that my internal webservices are not accessible from outside.
|
||||||
|
- ✅ Test VPN.
|
||||||
|
- ✅ Check all IoT devices.
|
||||||
|
- ✅ Check Home Assistant features.
|
||||||
|
- ✅ Check if the TV works.
|
||||||
|
- ❌ Test the Chromecast.
|
||||||
|
It is related to the mDNS service being unable to start. I can start it if I uncheck the `CARP Failover` option. The Chromecast is visible now. → ⚠️
|
||||||
|
- ✅ Print something.
|
||||||
|
- ✅ Verify DNS blocklist.
|
||||||
|
- ✅ Speedtest.
|
||||||
|
I observe roughly a 12% bandwidth decrease (from 940 Mbps to 825 Mbps).
|
||||||
|
- ❌ Switchover.
|
||||||
|
The switchover barely works: a lot of packets are dropped during the switch. The resulting service is not great: no more internet access, and my webservices are unreachable.
|
||||||
|
- ⌛ Failover.
|
||||||
|
- ⌛ Disaster Recovery.
|
||||||
|
To be tested later.
|
||||||
|
|
||||||
|
📝 Well, the results are pretty good, not perfect, but satisfying!
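As a quick sanity check, the speedtest numbers from the checklist can be turned into a percentage:

```shell
# Bandwidth before/after the migration, from my speedtests (Mbps).
before=940
after=825
drop=$(( (before - after) * 100 / before ))
echo "Bandwidth drop: ${drop}%"   # prints: Bandwidth drop: 12%
```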
|
||||||
|
### Problem Solving
|
||||||
|
|
||||||
|
I focus on resolving remaining problems experienced during the tests.
|
||||||
|
|
||||||
|
1. **DNS**
|
||||||
|
|
||||||
|
During the switchover, the internet connection is not working. No DNS; it is always the DNS.
|
||||||
|
|
||||||
|
It's because the backup node does not have a gateway while passive. Without a gateway, DNS resolution fails. After the switchover, it still has unresolved domains in its cache. This problem also leads to another issue: while passive, I can't update the system.
|
||||||
|
|
||||||
|
**Solution**: Create a gateway pointing to the other node, with a higher priority number than the WAN gateway (a higher number means a lower priority). This way, that gateway is not active while the node is master.
|
||||||
|
|
||||||
|
2. **Reverse Proxy**
|
||||||
|
|
||||||
|
During the switchover, every webservice I host (reverse proxy/layer 4 proxy) returns this error: `SSL_ERROR_INTERNAL_ERROR_ALERT`. After checking the services synchronized through XMLRPC Sync, Caddy and the mDNS repeater were not selected. This is because these services were installed after the initial HA configuration.
|
||||||
|
|
||||||
|
**Solution**: Add Caddy to XMLRPC Sync.
|
||||||
|
|
||||||
|
3. **Packet Drops**
|
||||||
|
|
||||||
|
I observe about 10% packet drops for pings from any VLAN to the *Mgmt* VLAN. I don't have this problem for the other VLANs.
|
||||||
|
|
||||||
|
The *Mgmt* VLAN is the native one in my network, which might be the reason behind this issue. It is the only network not defined in the Proxmox SDN, and I don't want to have to tag it.
|
||||||
|
|
||||||
|
**Solution**: Disable the Proxmox firewall on this interface for the VM. I actually disabled them all and updated the documentation above. I'm not sure why this caused that kind of problem, but disabling it fixed my issue (I could reproduce the behavior by enabling the firewall again).
|
||||||
|
|
||||||
|
4. **CARP Script**
|
||||||
|
|
||||||
|
During a switchover, the CARP event script is triggered as many times as there are interfaces. I have 5 virtual IPs, so the script reconfigures my WAN interface 5 times.
|
||||||
|
|
||||||
|
**Solution**: Rework the script to read the WAN interface state and only reconfigure the interface when needed:
|
||||||
|
```php
|
||||||
|
#!/usr/local/bin/php
|
||||||
|
<?php
|
||||||
|
/**
|
||||||
|
* OPNsense CARP event script
|
||||||
|
* - Enables/disables the WAN interface only when needed
|
||||||
|
* - Avoids reapplying config when CARP triggers multiple times
|
||||||
|
*/
|
||||||
|
|
||||||
|
require_once("config.inc");
|
||||||
|
require_once("interfaces.inc");
|
||||||
|
require_once("util.inc");
|
||||||
|
require_once("system.inc");
|
||||||
|
|
||||||
|
// Read CARP event arguments
|
||||||
|
$subsystem = !empty($argv[1]) ? $argv[1] : '';
|
||||||
|
$type = !empty($argv[2]) ? $argv[2] : '';
|
||||||
|
|
||||||
|
// Accept only MASTER/BACKUP events
|
||||||
|
if (!in_array($type, ['MASTER', 'BACKUP'])) {
|
||||||
|
// Ignore CARP INIT, DEMOTED, etc.
|
||||||
|
exit(0);
|
||||||
|
}
|
||||||
|
|
||||||
|
// Validate subsystem name format, expected pattern: <ifname>@<vhid>
|
||||||
|
if (!preg_match('/^[a-z0-9_]+@\S+$/i', $subsystem)) {
|
||||||
|
log_error("Malformed subsystem argument: '{$subsystem}'.");
|
||||||
|
exit(0);
|
||||||
|
}
|
||||||
|
|
||||||
|
// Interface key to manage
|
||||||
|
$ifkey = 'wan';
|
||||||
|
// Determine whether WAN interface is currently enabled
|
||||||
|
$ifkey_enabled = !empty($config['interfaces'][$ifkey]['enable']) ? true : false;
|
||||||
|
|
||||||
|
// MASTER event
|
||||||
|
if ($type === "MASTER") {
|
||||||
|
// Enable WAN only if it's currently disabled
|
||||||
|
if (!$ifkey_enabled) {
|
||||||
|
log_msg("CARP event: switching to '$type', enabling interface '$ifkey'.", LOG_WARNING);
|
||||||
|
$config['interfaces'][$ifkey]['enable'] = '1';
|
||||||
|
write_config("enable interface '$ifkey' due to CARP event '$type'", false);
|
||||||
|
interface_configure(false, $ifkey, false, false);
|
||||||
|
} else {
|
||||||
|
log_msg("CARP event: already '$type' for interface '$ifkey', nothing to do.");
|
||||||
|
}
|
||||||
|
|
||||||
|
// BACKUP event
|
||||||
|
} else {
|
||||||
|
// Disable WAN only if it's currently enabled
|
||||||
|
if ($ifkey_enabled) {
|
||||||
|
log_msg("CARP event: switching to '$type', disabling interface '$ifkey'.", LOG_WARNING);
|
||||||
|
unset($config['interfaces'][$ifkey]['enable']);
|
||||||
|
write_config("disable interface '$ifkey' due to CARP event '$type'", false);
|
||||||
|
interface_configure(false, $ifkey, false, false);
|
||||||
|
} else {
|
||||||
|
log_msg("CARP event: already '$type' for interface '$ifkey', nothing to do.");
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
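For reference, OPNsense picks up CARP event hooks from its syshook directory; installing the script could look like this (the target filename is my own choice):

```shell
# Deploy the CARP hook; OPNsense runs the scripts in this directory
# on every CARP state change.
cp carp_wan.php /usr/local/etc/rc.syshook.d/carp/10-wan
chmod +x /usr/local/etc/rc.syshook.d/carp/10-wan
```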
|
||||||
|
|
||||||
|
5. **mDNS Repeater**
|
||||||
|
|
||||||
|
The mDNS repeater refuses to start when I select the `CARP Failover` option.
|
||||||
|
|
||||||
|
**Solution**: The machine requires a reboot for this service to start in CARP-aware mode.
|
||||||
|
|
||||||
|
6. **IPv6 Address**
|
||||||
|
|
||||||
|
My `cerbere-head1` node is crying in the log file while the other is not. Here is the message spat out every second while it is master:
|
||||||
|
```plaintext
|
||||||
|
Warning rtsold <interface_up> vtnet1 is disabled.
|
||||||
|
```
|
||||||
|
|
||||||
|
Another one I get several times after a switchback:
|
||||||
|
```plaintext
|
||||||
|
Error dhcp6c transmit failed: Can't assign requested address
|
||||||
|
```
|
||||||
|
|
||||||
|
This is related to IPv6. I observe that my main node does not have a global IPv6 address, only a link-local one. It also does not have an IPv6 gateway. My secondary node, on the other hand, has both the global address and the gateway.
|
||||||
|
|
||||||
|
I'm no IPv6 expert; after a couple of hours of searching, I give up on IPv6. If anyone out there can help, it would be really appreciated!
|
||||||
|
|
||||||
|
**Workaround**: Remove DHCPv6 for my WAN interface.
|
||||||
|
|
||||||
|
### Confirmation
|
||||||
|
|
||||||
|
Now that everything is fixed, I can evaluate the failover performance.
|
||||||
|
|
||||||
|
1. **Switchover**
|
||||||
|
|
||||||
|
When manually entering CARP maintenance mode from the WebGUI, no packet drop is observed. Impressive.
|
||||||
|
|
||||||
|
2. **Failover**
|
||||||
|
|
||||||
|
To simulate a failover, I kill the active OPNsense VM. Here I observe only one packet dropped. Awesome.
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
3. **Disaster Recovery**
|
||||||
|
|
||||||
|
A disaster recovery is what would happen after a full Proxmox cluster stop, for example after an electrical outage. I didn't have the time (or the courage) to do that; I'd prefer to prepare a bit better to avoid collateral damage. But this kind of scenario must surely be evaluated.
|
||||||
|
|
||||||
|
### Extra Benefits
|
||||||
|
|
||||||
|
Beyond the fact that this new setup is more resilient, I noticed a few other benefits.
|
||||||
|
|
||||||
|
My rack is tiny and space is tight. The whole thing generates quite a lot of heat, exceeding 40°C at the top of the rack in summer. Reducing the number of powered-on machines lowered the temperature: I gained **1.5°C** after shutting down the old OPNsense box, cool!
|
||||||
|
|
||||||
|
Power consumption is also a concern: my tiny datacenter was drawing 85W on average. Here again, I observed a small decrease, about 8W lower. Considering that this runs 24/7, that's not negligible.
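Back-of-the-envelope, that 8 W saving adds up over a year:

```shell
# Yearly energy saved by the ~8 W average reduction.
watts=8
kwh_per_year=$(( watts * 24 * 365 / 1000 ))
echo "${kwh_per_year} kWh saved per year"   # prints: 70 kWh saved per year
```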
|
||||||
|
|
||||||
|
Finally, I also removed the box itself and its power cable. Rack slots are very limited, so that's another good point.
|
||||||
|
|
||||||
|
---
|
||||||
|
## Conclusion
|
||||||
|
|
||||||
|
🎉 I did it guys! I'm very proud of the results, proud of myself.
|
||||||
|
|
||||||
|
From my [first OPNsense box crash]({{< ref "post/10-opnsense-crash-disk-panic" >}}), through thinking up a solution and the HA [proof of concept]({{< ref "post/12-opnsense-virtualization-highly-available" >}}), to this migration: this has been quite a long project, but an extremely interesting one.
|
||||||
|
|
||||||
|
🎯 Setting objectives is great, but reaching them is even better.
|
||||||
|
|
||||||
|
Now I'm going to leave OPNsense aside for a bit, to be able to re-focus on my Kubernetes journey!
|
||||||
|
|
||||||
|
As always, if you have questions, remarks or a solution to my IPv6 problem, I'll be really happy to hear from you.
|
||||||
170
content/post/16-how-I-deploy-application.fr.md
Normal file
@@ -0,0 +1,170 @@
|
|||||||
|
---
|
||||||
|
slug: how-I-deploy-application
|
||||||
|
title: Comment je Déploie des Applications Aujourd’hui
|
||||||
|
description: La méthode que j’utilise aujourd’hui pour déployer de nouvelles applications dans mon homelab. Workflow simple tirant parti de Docker Compose dans une VM sur Proxmox VE
|
||||||
|
date: 2026-01-31
|
||||||
|
draft: false
|
||||||
|
tags:
|
||||||
|
- docker
|
||||||
|
- proxmox
|
||||||
|
- opnsense
|
||||||
|
- traefik
|
||||||
|
- gitea
|
||||||
|
categories:
|
||||||
|
- homelab
|
||||||
|
---
|
||||||
|
## Intro
|
||||||
|
|
||||||
|
Dans cet article, je ne vais pas expliquer les bonnes pratiques pour déployer des applications. À la place, je veux documenter comment je déploie actuellement de nouvelles applications dans mon homelab.
|
||||||
|
|
||||||
|
Considérez cet article comme un snapshot. C’est comme ça que les choses fonctionnent vraiment aujourd’hui, sachant que dans un futur proche j’aimerais évoluer vers un workflow plus orienté GitOps.
|
||||||
|
|
||||||
|
La méthode que j’utilise est assez simple. J’ai essayé de la standardiser autant que possible, mais elle implique encore pas mal d’étapes manuelles. J’expliquerai aussi comment je mets à jour les applications, ce qui est, à mon avis, la plus grande faiblesse de cette configuration. À mesure que le nombre d’applications augmente, garder le tout à jour demande de plus en plus de temps.
|
||||||
|
|
||||||
|
|
||||||
|
---
|
||||||
|
## Overview de la Plateforme
|
||||||
|
|
||||||
|
Avant d’entrer dans le workflow, voici un rapide aperçu des principaux composants impliqués.
|
||||||
|
### Docker
|
||||||
|
|
||||||
|
Docker est la base de ma stack applicative. Quand c’est possible, je déploie les applications sous forme de conteneurs.
|
||||||
|
|
||||||
|
J’utilise Docker Compose depuis des années. À l’époque, tout tournait sur un seul serveur physique. Aujourd’hui, mon installation est basée sur des VM, et je pourrais migrer vers Docker Swarm, mais j’ai choisi de ne pas le faire. Cela peut avoir du sens dans certains scénarios, mais ce n’est pas aligné avec là où je veux aller à long terme.
|
||||||
|
|
||||||
|
Pour l’instant, je m’appuie toujours sur une seule VM pour héberger toutes les applications Docker. Cette VM est plus ou moins un clone de mon ancien serveur physique, simplement virtualisé.
|
||||||
|
|
||||||
|
### Proxmox VE
|
||||||
|
|
||||||
|
Cette VM est hébergée sur un cluster Proxmox VE, composé de trois nœuds et utilisant Ceph comme stockage distribué.
|
||||||
|
|
||||||
|
Cela me donne de la haute disponibilité et facilite grandement la gestion des VM, même si le workload Docker n'est pas hautement disponible.
|
||||||
|
|
||||||
|
### Traefik
|
||||||
|
|
||||||
|
Traefik tourne directement sur l’hôte Docker et fait office de reverse proxy.
|
||||||
|
|
||||||
|
Il est responsable d’acheminer le trafic HTTPS vers les bons conteneurs et de gérer automatiquement les certificats TLS via Let’s Encrypt. Cela garde la configuration au niveau des applications simple et centralisée.
|
||||||
|
|
||||||
|
### OPNsense
|
||||||
|
|
||||||
|
OPNsense est mon routeur, pare-feu et agit aussi comme reverse proxy.
|
||||||
|
|
||||||
|
Le trafic HTTPS entrant est transféré vers Traefik en utilisant le plugin Caddy avec des règles Layer 4. Le TLS n’est pas terminé au niveau du pare-feu. Il est transmis à Traefik, qui gère l’émission et le renouvellement des certificats.
|
||||||
|
|
||||||
|
### Gitea
|
||||||
|
|
||||||
|
Gitea est un dépôt Git self-hosted, j’ai une instance qui tourne dans mon homelab.
|
||||||
|
|
||||||
|
Dans Gitea, j’ai un dépôt privé qui contient toutes mes configurations Docker Compose. Chaque application a son propre dossier, ce qui rend le dépôt facile à parcourir et à maintenir.
|
||||||
|
|
||||||
|
---
|
||||||
|
## Déployer une Nouvelle Application
|
||||||
|
|
||||||
|
Pour standardiser les déploiements, j’utilise un template `docker-compose.yml` qui ressemble à ceci :
|
||||||
|
```yml
|
||||||
|
services:
|
||||||
|
NAME:
|
||||||
|
image: IMAGE
|
||||||
|
container_name: NAME
|
||||||
|
volumes:
|
||||||
|
- /appli/data/NAME/:/
|
||||||
|
environment:
|
||||||
|
- TZ=Europe/Paris
|
||||||
|
networks:
|
||||||
|
- web
|
||||||
|
labels:
|
||||||
|
- traefik.enable=true
|
||||||
|
- traefik.http.routers.NAME.rule=Host(`HOST.vezpi.com`)
|
||||||
|
- traefik.http.routers.NAME.entrypoints=https
|
||||||
|
- traefik.http.routers.NAME.tls.certresolver=letsencrypt
|
||||||
|
- traefik.http.services.NAME.loadbalancer.server.port=PORT
|
||||||
|
restart: always
|
||||||
|
|
||||||
|
networks:
|
||||||
|
web:
|
||||||
|
external: true
|
||||||
|
```
|
||||||
|
|
||||||
|
Laissez-moi expliquer.
|
||||||
|
|
||||||
|
Pour l’image, selon l’application, le registre utilisé peut varier, mais j’utilise quand même Docker Hub par défaut. Quand j’essaie une nouvelle application, je peux utiliser le tag `latest` au début. Ensuite, si je choisis de la garder, je préfère épingler la version actuelle plutôt que `latest`.
|
||||||
|
|
||||||
|
J’utilise des montages de volumes pour tout ce qui est stateful. Chaque application a son propre dossier dans le filesystem `/appli/data`.
|
||||||
|
|
||||||
|
Quand une application doit être accessible en HTTPS, je relie le conteneur qui sert les requêtes au réseau `web`, qui est géré par Traefik et je lui associe des labels. Les `entrypoint` et `certresolver` sont définis dans ma configuration Traefik. L’URL définie dans `Host()` est celle qui sera utilisée pour accéder à l’application. Elle doit être identique à ce qui est défini dans la route Layer4 du plugin Caddy d’OPNsense.
|
||||||
|
|
||||||
|
Si plusieurs conteneurs doivent communiquer entre eux, j’ajoute un réseau `backend` qui sera créé lors du déploiement de la stack, dédié à l’application. Ainsi, aucun port n’a besoin d’être ouvert sur l’hôte.
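
Pour illustrer, voici un sketch minimal de cette disposition, avec des noms de services purement hypothétiques :

```yml
services:
  app:
    image: IMAGE
    networks:
      - web      # trafic HTTPS via Traefik
      - backend  # communication interne avec la base
  db:
    image: postgres:16
    networks:
      - backend  # jamais attaché à "web", aucun port publié sur l'hôte

networks:
  web:
    external: true
  backend: {}    # créé au déploiement de la stack, dédié à l'application
```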
|
||||||
|
|
||||||
|
### Étapes de Déploiement
|
||||||
|
|
||||||
|
La plupart du travail est effectué depuis VS Code :
|
||||||
|
- Créer un nouveau dossier dans ce dépôt, avec le nom de l’application.
|
||||||
|
- Copier le template ci-dessus dans ce dossier.
|
||||||
|
- Adapter le template avec les valeurs fournies par la documentation de l’application.
|
||||||
|
- Créer un fichier `.env` pour les secrets si nécessaire. Ce fichier est ignoré par `.gitignore`.
|
||||||
|
- Démarrer les services directement depuis VS Code en utilisant l’extension Docker.
|
||||||
|
|
||||||
|
Puis dans l’interface Web OPNsense, je mets à jour deux routes Layer 4 pour le plugin Caddy :
|
||||||
|
- Selon que l’application doit être exposée sur Internet ou non, j’ai une route _Internal_ et une route _External_. J’ajoute l’URL donnée à Traefik dans l’une d’elles.
|
||||||
|
- J’ajoute aussi cette URL dans une autre route pour rediriger le challenge HTTP Let’s Encrypt vers Traefik.
|
||||||
|
|
||||||
|
Une fois terminé, je teste l’URL. Si tout est correctement configuré, l’application devrait être accessible en HTTPS.
|
||||||
|
|
||||||
|
Quand tout fonctionne comme prévu, je commit le nouveau dossier de l’application dans le dépôt.
|
||||||
|
|
||||||
|
---
|
||||||
|
## Mettre à Jour une Application
|
||||||
|
|
||||||
|
Les mises à jour d’applications sont encore entièrement manuelles.
|
||||||
|
|
||||||
|
Je n’utilise pas d’outils automatisés comme Watchtower pour l’instant. Environ une fois par mois, je cherche de nouvelles versions en regardant Docker Hub, les releases GitHub ou la documentation de l’application.
|
||||||
|
|
||||||
|
Pour chaque application que je veux mettre à jour, je passe en revue:
|
||||||
|
- Nouvelles fonctionnalités
|
||||||
|
- Breaking changes
|
||||||
|
- Chemins de mise à niveau si nécessaire
|
||||||
|
|
||||||
|
La plupart du temps, les mises à jour sont simples:
|
||||||
|
|
||||||
|
- Mettre à jour le tag de l’image dans le fichier Docker Compose
|
||||||
|
- Redémarrer la stack.
|
||||||
|
- Vérifier que les conteneurs redémarrent correctement
|
||||||
|
- Consulter les logs Docker
|
||||||
|
- Tester l’application pour détecter des régressions
|
||||||
|
|
||||||
|
Si ça fonctionne, je continue à mettre à niveau étape par étape jusqu’à atteindre la dernière version disponible.
|
||||||
|
|
||||||
|
Sinon, je débogue jusqu’à corriger le problème. Les retours arrière sont pénibles.
|
||||||
|
|
||||||
|
Une fois la dernière version atteinte, je commit les changements dans le dépôt.
|
||||||
|
|
||||||
|
---
|
||||||
|
## Avantages et inconvénients
|
||||||
|
|
||||||
|
Qu’est-ce qui fonctionne bien et qu’est-ce qui fonctionne moins ?
|
||||||
|
|
||||||
|
### Avantages
|
||||||
|
|
||||||
|
- Modèle simple, une VM, un fichier compose par application.
|
||||||
|
- Facile à déployer, idéal pour tester une application.
|
||||||
|
- Emplacement central pour les configurations.
|
||||||
|
|
||||||
|
### Inconvénients
|
||||||
|
|
||||||
|
- La VM Docker unique est un point de défaillance unique.
|
||||||
|
- Les mises à jour manuelles ne passent pas à l’échelle quand le nombre d’applications augmente.
|
||||||
|
- Devoir déclarer l’URL dans Caddy est fastidieux.
|
||||||
|
- Difficile de suivre ce qui est en ligne et ce qui ne l’est pas.
|
||||||
|
- Les secrets dans .env sont pratiques mais basiques.
|
||||||
|
- Pas de moyen rapide de rollback.
|
||||||
|
- Les opérations sur la VM sont critiques.
|
||||||
|
|
||||||
|
---
|
||||||
|
## Conclusion
|
||||||
|
|
||||||
|
Cette configuration fonctionne, et elle m’a bien servi jusqu’ici. Elle est simple et intuitive. Cependant, elle est aussi très manuelle, surtout pour les mises à jour et la maintenance à long terme.
|
||||||
|
|
||||||
|
À mesure que le nombre d’applications augmente, cette approche ne passe clairement pas très bien à l’échelle. C’est l’une des principales raisons pour lesquelles je regarde vers GitOps et des workflows plus déclaratifs pour l’avenir.
|
||||||
|
|
||||||
|
Pour l'instant, cependant, c'est ainsi que je déploie des applications dans mon homelab, et cet article sert de point de référence pour savoir par où j'ai commencé.
|
||||||
**New file:** `content/post/16-how-I-deploy-application.md` (+169 lines)
---
|
||||||
|
slug: how-I-deploy-application
|
||||||
|
title: How I Deploy Applications Today
|
||||||
|
description: The method I use today to deploy new applications in my homelab. A simple workflow taking advantage of Docker Compose in a VM on Proxmox VE
|
||||||
|
date: 2026-01-31
|
||||||
|
draft: false
|
||||||
|
tags:
|
||||||
|
- docker
|
||||||
|
- proxmox
|
||||||
|
- opnsense
|
||||||
|
- traefik
|
||||||
|
- gitea
|
||||||
|
categories:
|
||||||
|
- homelab
|
||||||
|
---
|
||||||
|
## Intro
|
||||||
|
|
||||||
|
In this post, I am not going to explain best practices for deploying applications. Instead, I want to document how I am currently deploying new applications in my homelab.
|
||||||
|
|
||||||
|
Think of this article as a snapshot in time. This is how things really work today, knowing that in the near future I would like to move toward a more GitOps-oriented workflow.
|
||||||
|
|
||||||
|
The method I use is fairly simple. I have tried to standardize it as much as possible, but it still involves quite a few manual steps. I will also explain how I update applications, which is, in my opinion, the biggest weakness of this setup. As the number of applications keeps growing, keeping everything up to date requires more and more time.
|
||||||
|
|
||||||
|
---
|
||||||
|
## Platform Overview
|
||||||
|
|
||||||
|
Before diving into the workflow, here is a quick overview of the main components involved.
|
||||||
|
### Docker
|
||||||
|
|
||||||
|
Docker is the foundation of my application stack. Whenever possible, I deploy applications as containers.
|
||||||
|
|
||||||
|
I have been using Docker Compose for years. At the time, everything was running on a single physical server. Today, my setup is VM-based, and I could migrate to Docker Swarm, but I have chosen not to. It might make sense in some scenarios, but it is not aligned with where I want to go long term.
|
||||||
|
|
||||||
|
For now, I still rely on a single VM to host all Docker applications. This VM is more or less a clone of my old physical server, just virtualized.
|
||||||
|
|
||||||
|
### Proxmox VE
|
||||||
|
|
||||||
|
This VM is hosted on a Proxmox VE cluster, composed of three nodes and using Ceph as a distributed storage backend.
|
||||||
|
|
||||||
|
This gives me high availability and makes VM management much easier, even though the Docker workloads themselves are not highly available.
|
||||||
|
|
||||||
|
### Traefik
|
||||||
|
|
||||||
|
Traefik runs directly on the Docker host and acts as the reverse proxy.
|
||||||
|
|
||||||
|
It is responsible for routing the HTTPS traffic to the correct containers and for managing TLS certificates automatically using Let’s Encrypt. This keeps application-level configuration simple and centralized.
|
||||||
|
|
||||||
|
### OPNsense
|
||||||
|
|
||||||
|
OPNsense is my router and firewall, and it also acts as a reverse proxy.
|
||||||
|
|
||||||
|
Incoming HTTPS traffic is forwarded to Traefik using the Caddy plugin with Layer 4 rules. TLS is not terminated at the firewall level. It is passed through to Traefik, which handles certificate issuance and renewal.
|
||||||
|
|
||||||
|
### Gitea
|
||||||
|
|
||||||
|
Gitea is a self-hosted Git service; I have one instance running in my homelab.
|
||||||
|
|
||||||
|
Inside Gitea, I have a private repository that contains all my Docker Compose configurations. Each application has its own folder, making the repository easy to navigate and maintain.
|
||||||
|
|
||||||
|
---
|
||||||
|
## Deploying a New Application
|
||||||
|
|
||||||
|
To standardize deployments, I use a `docker-compose.yml` template that looks like this:
|
||||||
|
```yml
|
||||||
|
services:
|
||||||
|
NAME:
|
||||||
|
image: IMAGE
|
||||||
|
container_name: NAME
|
||||||
|
volumes:
|
||||||
|
- /appli/data/NAME/:/
|
||||||
|
environment:
|
||||||
|
- TZ=Europe/Paris
|
||||||
|
networks:
|
||||||
|
- web
|
||||||
|
labels:
|
||||||
|
- traefik.enable=true
|
||||||
|
- traefik.http.routers.NAME.rule=Host(`HOST.vezpi.com`)
|
||||||
|
- traefik.http.routers.NAME.entrypoints=https
|
||||||
|
- traefik.http.routers.NAME.tls.certresolver=letsencrypt
|
||||||
|
- traefik.http.services.NAME.loadbalancer.server.port=PORT
|
||||||
|
restart: always
|
||||||
|
|
||||||
|
networks:
|
||||||
|
web:
|
||||||
|
external: true
|
||||||
|
```
|
||||||
|
|
||||||
|
Let me explain.
|
||||||
|
|
||||||
|
For the image, depending on the application, the registry used can differ, but I still use Docker Hub by default. When I try a new application, I might use the `latest` tag at first. Then, if I choose to keep it, I prefer to pin the current version rather than track `latest`.
|
||||||
|
|
||||||
|
I use volume binds for everything stateful. Every application gets its own folder in the `/appli/data` filesystem.
|
||||||
|
|
||||||
|
When an application needs to be reachable over HTTPS, I attach the container serving the requests to the `web` network, which is managed by Traefik, and I add labels to it. The `entrypoint` and `certresolver` are defined in my Traefik configuration. The URL defined in `Host()` is the one used to access the application. It needs to match what is defined in the Layer 4 route of the OPNsense Caddy plugin.
|
||||||
|
|
||||||
|
If several containers need to talk to each other, I add a `backend` network dedicated to the application, created when the stack is deployed. This way, no ports need to be opened on the host.
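
As an illustration, a minimal sketch of that layout, with purely hypothetical service names:

```yml
services:
  app:
    image: IMAGE
    networks:
      - web      # HTTPS traffic via Traefik
      - backend  # internal communication with the database
  db:
    image: postgres:16
    networks:
      - backend  # never attached to "web", no port published on the host

networks:
  web:
    external: true
  backend: {}    # created when the stack is deployed, dedicated to the app
```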
|
||||||
|
|
||||||
|
### Steps to Deploy
|
||||||
|
|
||||||
|
Most of the work is done from VS Code:
|
||||||
|
- Create a new folder in that repository, with the application name.
|
||||||
|
- Copy the template above inside this folder.
|
||||||
|
- Adapt the template with the values given by the application documentation.
|
||||||
|
- Create a `.env` file for secrets if needed. This file is ignored by `.gitignore`.
|
||||||
|
- Start the services directly from VS Code using the Docker extension.
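
For the secrets step, here is a hypothetical example (the variable name is made up): Docker Compose automatically loads a `.env` file placed next to `docker-compose.yml` and substitutes its variables.

```yml
# .env (git-ignored) would contain, e.g.:  DB_PASSWORD=changeme
services:
  NAME:
    image: IMAGE
    environment:
      - DB_PASSWORD=${DB_PASSWORD}  # substituted from .env at deploy time
```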
|
||||||
|
|
||||||
|
|
||||||
|
Then in the OPNsense WebUI, I update two Layer 4 routes for the Caddy plugin:
|
||||||
|
- Depending on whether the application should be exposed to the internet or not, I have an *Internal* and an *External* route. I add the URL given to Traefik to one of them.
|
||||||
|
- I also add this URL in another route to redirect the Let's Encrypt HTTP challenge to Traefik.
|
||||||
|
|
||||||
|
Once complete, I test the URL. If everything is configured correctly, the application should be reachable over HTTPS.
|
||||||
|
|
||||||
|
When everything works as expected, I commit the new application folder to the repository.
|
||||||
|
|
||||||
|
---
|
||||||
|
## Updating an Application
|
||||||
|
|
||||||
|
Application updates are still entirely manual.
|
||||||
|
|
||||||
|
I do not use automated tools like Watchtower for now. About once a month, I check for new versions by looking at Docker Hub, GitHub releases, or the application documentation.
|
||||||
|
|
||||||
|
For each application I want to update, I review:
|
||||||
|
- New features
|
||||||
|
- Breaking changes
|
||||||
|
- Upgrade paths if required
|
||||||
|
|
||||||
|
Most of the time, updates are straightforward:
|
||||||
|
- Bump the image tag in the Docker Compose file
|
||||||
|
- Restart the stack.
|
||||||
|
- Verify that the containers restart properly
|
||||||
|
- Check Docker logs
|
||||||
|
- Test the application to detect regressions
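
The tag bump itself can be sketched like this; the image name and versions below are illustrative, not taken from my actual stack:

```shell
# Throwaway compose file to demonstrate the bump (illustrative values)
cat > /tmp/docker-compose.yml <<'EOF'
services:
  app:
    image: nginx:1.25.3
EOF

# Pin the next version in place
sed -i 's|image: nginx:1.25.3|image: nginx:1.25.4|' /tmp/docker-compose.yml
grep 'image:' /tmp/docker-compose.yml

# Then, on the Docker host:
#   docker compose up -d && docker compose logs -f
```

Bumping one small step at a time makes it easy to spot which version introduced a regression.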
|
||||||
|
|
||||||
|
If it works, I continue upgrading step by step until I reach the latest available version.
|
||||||
|
|
||||||
|
If not, I debug until I fix the problem. Rollbacks are painful.
|
||||||
|
|
||||||
|
Once the latest version is reached, I commit the changes to the repository.
|
||||||
|
|
||||||
|
---
|
||||||
|
## Pros and Cons
|
||||||
|
|
||||||
|
What works well and what doesn't?
|
||||||
|
|
||||||
|
### Pros
|
||||||
|
|
||||||
|
- Simple model, one VM, one compose file per application.
|
||||||
|
- Easy to deploy, great to test an application.
|
||||||
|
- Central location for the configurations.
|
||||||
|
|
||||||
|
### Cons
|
||||||
|
|
||||||
|
- Single Docker VM is a single point of failure.
|
||||||
|
- Manual updates don’t scale as the app count grows.
|
||||||
|
- Having to declare the URL in Caddy is tedious.
|
||||||
|
- Hard to keep track of what is deployed and what is not.
|
||||||
|
- Secrets in `.env` files are convenient but basic.
|
||||||
|
- No fast way to rollback.
|
||||||
|
- Operations on the VM are critical.
|
||||||
|
|
||||||
|
---
|
||||||
|
## Conclusion
|
||||||
|
|
||||||
|
This setup works, and it has served me well so far. It is simple and intuitive. However, it is also very manual, especially when it comes to updates and long-term maintenance.
|
||||||
|
|
||||||
|
As the number of applications grows, this approach clearly does not scale very well. That is one of the main reasons why I am looking toward GitOps and more declarative workflows for the future.
|
||||||
|
|
||||||
|
For now, though, this is how I deploy applications in my homelab, and this post serves as a reference point for where I started.
|
||||||
**New file:** `content/post/17-semaphore-ui-interface-ansible-terraform.fr.md` (+259 lines)
---
|
||||||
|
slug: semaphore-ui-interface-ansible-terraform
|
||||||
|
title: Semaphore UI, une excellente interface pour Ansible et Terraform
|
||||||
|
description: Démonstration de Semaphore UI, une interface web pour exécuter des playbooks Ansible, du code Terraform et bien plus. Installation avec Docker et exemples rapides.
|
||||||
|
date: 2026-02-09
|
||||||
|
draft: false
|
||||||
|
tags:
|
||||||
|
- semaphore-ui
|
||||||
|
- ansible
|
||||||
|
- terraform
|
||||||
|
- proxmox
|
||||||
|
- docker
|
||||||
|
categories:
|
||||||
|
- homelab
|
||||||
|
---
|
||||||
|
## Intro
|
||||||
|
|
||||||
|
Dans mon homelab, j'aime expérimenter avec des outils comme Ansible et Terraform. L'interface principale est le CLI, que j'adore, mais parfois une jolie interface web est juste agréable.
|
||||||
|
|
||||||
|
Après avoir configuré mon cluster OPNsense, je voulais un moyen de le tenir à jour selon un calendrier. Pour moi, l'automatisation passe par Ansible, mais comment automatiser et planifier des playbooks ?
|
||||||
|
|
||||||
|
Au travail j'utilise Red Hat Ansible Automation Platform, qui est excellent, mais overkill pour mon lab. C'est ainsi que j'ai découvert Semaphore UI. Voyons ce qu'il peut faire.
|
||||||
|
|
||||||
|
---
|
||||||
|
## Qu'est‑ce que Semaphore UI
|
||||||
|
|
||||||
|
[Semaphore UI](https://semaphoreui.com/docs/) est une interface web élégante conçue pour exécuter de l'automatisation avec des outils comme Ansible et Terraform, et même des scripts Bash, Powershell ou Python.
|
||||||
|
|
||||||
|
Initialement créé sous le nom Ansible Semaphore, le projet fournissait un front-end simple destiné uniquement à l'exécution de playbooks Ansible. Au fil du temps, la communauté l'a fait évoluer en une plateforme de contrôle d'automatisation multi‑outils.
|
||||||
|
|
||||||
|
C'est une application autonome écrite en Go avec des dépendances minimales, capable d'utiliser différents backends de base de données, tels que PostgreSQL, MySQL ou BoltDB.
|
||||||
|
|
||||||
|
---
|
||||||
|
## Installation
|
||||||
|
|
||||||
|
Semaphore UI prend en charge plusieurs méthodes d'[installation](https://semaphoreui.com/docs/category/installation) : Docker, Kubernetes, gestionnaire de paquets ou simple binaire.
|
||||||
|
|
||||||
|
J'ai utilisé Docker pour mon installation, vous pouvez voir comment je déploie actuellement des applications dans ce [post]({{< ref "post/16-how-I-deploy-application" >}})
|
||||||
|
|
||||||
|
Voici mon fichier `docker-compose.yml` que j'ai configuré en utilisant PostgreSQL :
|
||||||
|
```yaml
|
||||||
|
services:
|
||||||
|
semaphore:
|
||||||
|
image: semaphoreui/semaphore:v2.16.45
|
||||||
|
container_name: semaphore_ui
|
||||||
|
environment:
|
||||||
|
- TZ=Europe/Paris
|
||||||
|
- SEMAPHORE_DB_USER=${POSTGRES_USER}
|
||||||
|
- SEMAPHORE_DB_PASS=${POSTGRES_PASSWORD}
|
||||||
|
- SEMAPHORE_DB_HOST=postgres
|
||||||
|
- SEMAPHORE_DB_PORT=5432
|
||||||
|
- SEMAPHORE_DB_DIALECT=postgres
|
||||||
|
- SEMAPHORE_DB=${POSTGRES_DB}
|
||||||
|
- SEMAPHORE_PLAYBOOK_PATH=/tmp/semaphore/
|
||||||
|
- SEMAPHORE_ADMIN_PASSWORD=${SEMAPHORE_ADMIN_PASSWORD}
|
||||||
|
- SEMAPHORE_ADMIN_NAME=${SEMAPHORE_ADMIN_NAME}
|
||||||
|
- SEMAPHORE_ADMIN_EMAIL=${SEMAPHORE_ADMIN_EMAIL}
|
||||||
|
- SEMAPHORE_ADMIN=${SEMAPHORE_ADMIN}
|
||||||
|
- SEMAPHORE_ACCESS_KEY_ENCRYPTION=${SEMAPHORE_ACCESS_KEY_ENCRYPTION}
|
||||||
|
- SEMAPHORE_LDAP_ACTIVATED='no'
|
||||||
|
# - SEMAPHORE_LDAP_HOST=dc01.local.example.com
|
||||||
|
# - SEMAPHORE_LDAP_PORT='636'
|
||||||
|
# - SEMAPHORE_LDAP_NEEDTLS='yes'
|
||||||
|
# - SEMAPHORE_LDAP_DN_BIND='uid=bind_user,cn=users,cn=accounts,dc=local,dc=shiftsystems,dc=net'
|
||||||
|
# - SEMAPHORE_LDAP_PASSWORD='ldap_bind_account_password'
|
||||||
|
# - SEMAPHORE_LDAP_DN_SEARCH='dc=local,dc=example,dc=com'
|
||||||
|
# - SEMAPHORE_LDAP_SEARCH_FILTER="(\u0026(uid=%s)(memberOf=cn=ipausers,cn=groups,cn=accounts,dc=local,dc=example,dc=com))"
|
||||||
|
depends_on:
|
||||||
|
- postgres
|
||||||
|
networks:
|
||||||
|
- backend
|
||||||
|
- web
|
||||||
|
labels:
|
||||||
|
- traefik.enable=true
|
||||||
|
- traefik.http.routers.semaphore.rule=Host(`semaphore.vezpi.com`)
|
||||||
|
- traefik.http.routers.semaphore.entrypoints=https
|
||||||
|
- traefik.http.routers.semaphore.tls.certresolver=letsencrypt
|
||||||
|
- traefik.http.services.semaphore.loadbalancer.server.port=3000
|
||||||
|
restart: unless-stopped
|
||||||
|
|
||||||
|
postgres:
|
||||||
|
image: postgres:14
|
||||||
|
hostname: postgres
|
||||||
|
container_name: semaphore_postgres
|
||||||
|
volumes:
|
||||||
|
- /appli/data/semaphore/db:/var/lib/postgresql/data
|
||||||
|
environment:
|
||||||
|
- POSTGRES_USER=${POSTGRES_USER}
|
||||||
|
- POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
|
||||||
|
- POSTGRES_DB=${POSTGRES_DB}
|
||||||
|
networks:
|
||||||
|
- backend
|
||||||
|
restart: unless-stopped
|
||||||
|
|
||||||
|
networks:
|
||||||
|
backend:
|
||||||
|
web:
|
||||||
|
external: true
|
||||||
|
```
|
||||||
|
|
||||||
|
Pour générer la clé de chiffrement des access keys (`SEMAPHORE_ACCESS_KEY_ENCRYPTION`), j'utilise cette commande :
|
||||||
|
```bash
|
||||||
|
head -c32 /dev/urandom | base64
|
||||||
|
```
|
||||||
|
|
||||||
|
Avec Semaphore en fonctionnement, faisons rapidement le tour de l'UI et connectons-la à un dépôt.
|
||||||
|
|
||||||
|
---
|
||||||
|
## Découverte
|
||||||
|
|
||||||
|
Après avoir démarré la stack, je peux atteindre la page de connexion à l'URL :
|
||||||
|

|
||||||
|
|
||||||
|
Pour me connecter, j'utilise les identifiants définis par `SEMAPHORE_ADMIN_NAME`/`SEMAPHORE_ADMIN_PASSWORD`.
|
||||||
|
|
||||||
|
Au premier accès, Semaphore me demande de créer un projet. J'ai créé le projet Homelab :
|
||||||
|

|
||||||
|
|
||||||
|
La première chose que je veux faire est d'ajouter mon dépôt _homelab_ (vous pouvez trouver son miroir sur Github [ici](https://github.com/Vezpi/homelab)). Dans `Repository`, je clique sur le bouton `New Repository`, et j'ajoute l'URL du repo. Je ne spécifie pas d'identifiants car le dépôt est public :
|
||||||
|

|
||||||
|
|
||||||
|
ℹ️ Avant de continuer, je déploie 3 VM à des fins de test : `sem01`, `sem02` et `sem03`. Je les ai créées avec Terraform via [ce projet](https://github.com/Vezpi/Homelab/tree/main/terraform/projects/semaphore-vms).
|
||||||
|
|
||||||
|
Pour interagir avec ces VM, je dois configurer des identifiants. Dans le `Key Store`, j'ajoute la première donnée d'identification, une clé SSH pour mon utilisateur :
|
||||||
|

|
||||||
|
|
||||||
|
Ensuite je crée un nouvel `Inventory`. J'utilise le format d'inventaire Ansible (le seul disponible). Je sélectionne la clé SSH créée précédemment et choisis le type `Static`. Dans les champs je renseigne les 3 hôtes créés avec leur FQDN :
|
||||||
|

|
||||||
|
|
||||||
|
✅ Avec un projet, un repo, des identifiants et un inventaire en place, je peux avancer et tester l'exécution d'un playbook Ansible.
|
||||||
|
|
||||||
|
---
|
||||||
|
## Lancer un Playbook Ansible
|
||||||
|
|
||||||
|
Je veux tester quelque chose de simple : installer un serveur web avec une page personnalisée sur ces 3 VM. Je crée le playbook `install_nginx.yml` :
|
||||||
|
```yaml
|
||||||
|
---
|
||||||
|
- name: Demo Playbook - Install Nginx and Serve Hostname Page
|
||||||
|
hosts: all
|
||||||
|
become: true
|
||||||
|
|
||||||
|
tasks:
|
||||||
|
- name: Ensure apt cache is updated
|
||||||
|
ansible.builtin.apt:
|
||||||
|
update_cache: true
|
||||||
|
cache_valid_time: 3600
|
||||||
|
|
||||||
|
- name: Install nginx
|
||||||
|
ansible.builtin.apt:
|
||||||
|
name: nginx
|
||||||
|
state: present
|
||||||
|
|
||||||
|
- name: Create index.html with hostname
|
||||||
|
ansible.builtin.copy:
|
||||||
|
dest: /var/www/html/index.html
|
||||||
|
content: |
|
||||||
|
<html>
|
||||||
|
<head><title>Demo</title></head>
|
||||||
|
<body>
|
||||||
|
<h1>Hostname: {{ inventory_hostname }}</h1>
|
||||||
|
</body>
|
||||||
|
</html>
|
||||||
|
owner: www-data
|
||||||
|
group: www-data
|
||||||
|
mode: "0644"
|
||||||
|
|
||||||
|
- name: Ensure nginx is running
|
||||||
|
ansible.builtin.service:
|
||||||
|
name: nginx
|
||||||
|
state: started
|
||||||
|
enabled: true
|
||||||
|
```
|
||||||
|
|
||||||
|
Dans Semaphore UI, je peux maintenant créer mon premier `Task Template` pour un playbook Ansible. Je lui donne un nom, le chemin du playbook (depuis le dossier racine du repo), le dépôt et sa branche :
|
||||||
|

|
||||||
|
|
||||||
|
Il est temps de lancer le playbook ! Dans la liste des task templates, je clique sur le bouton ▶️ :
|
||||||
|

|
||||||
|
|
||||||
|
Le playbook se lance et je peux suivre la sortie en temps réel :
|
||||||
|

|
||||||
|
|
||||||
|
Je peux aussi consulter les exécutions précédentes :
|
||||||
|

|
||||||
|
|
||||||
|
|
||||||
|
✅ Enfin, je peux confirmer que le travail est fini en vérifiant l'URL sur le port 80 (http) :
|
||||||
|

|
||||||
|
|
||||||
|
Gérer des playbooks Ansible dans Semaphore UI est assez simple et vraiment pratique. L'interface est très soignée.
|
||||||
|
|
||||||
|
Il existe aussi beaucoup d'options de personnalisation lors de la configuration d'un task template. Je peux utiliser des variables via un survey, spécifier un limit ou des tags. J'apprécie vraiment cela.
|
||||||
|
|
||||||
|
---
|
||||||
|
## Déploiement avec Terraform
|
||||||
|
|
||||||
|
Alors que l'exécution des playbooks Ansible était simple dès le départ, le déploiement avec Terraform sur Proxmox VE a été un peu différent. Avant de commencer, je détruis les 3 VM déployées précédemment.
|
||||||
|
|
||||||
|
Auparavant depuis le CLI, j'interagissais avec Terraform sur le cluster Proxmox en utilisant une clé SSH. Je n'ai pas réussi à le faire fonctionner depuis Semaphore UI. J'ai dû utiliser un nom d'utilisateur avec un mot de passe à la place.
|
||||||
|
|
||||||
|
Je me suis dit que c'était une bonne occasion d'utiliser Ansible pour créer un utilisateur Proxmox dédié. Ma première exécution a échoué avec :
|
||||||
|
```plaintext
|
||||||
|
Unable to encrypt nor hash, passlib must be installed. No module named 'passlib'
|
||||||
|
```
|
||||||
|
|
||||||
|
C'est apparemment un problème connu de l'environnement Python de Semaphore. Comme contournement, j'ai installé `passlib` directement dans le conteneur :
|
||||||
|
```bash
|
||||||
|
docker exec -it semaphore_ui pip install passlib
|
||||||
|
```
|
||||||
|
|
||||||
|
Avec cela en place, le playbook a réussi et j'ai pu créer l'utilisateur :
|
||||||
|
```yaml
|
||||||
|
---
|
||||||
|
- name: Create Terraform local user for Proxmox
|
||||||
|
hosts: nodes
|
||||||
|
become: true
|
||||||
|
tasks:
|
||||||
|
|
||||||
|
- name: Create terraform user
|
||||||
|
ansible.builtin.user:
|
||||||
|
name: "{{ terraform_user }}"
|
||||||
|
password: "{{ terraform_password | password_hash('sha512') }}"
|
||||||
|
shell: /bin/bash
|
||||||
|
|
||||||
|
- name: Create sudoers file for terraform user
|
||||||
|
ansible.builtin.copy:
|
||||||
|
dest: /etc/sudoers.d/{{ terraform_user }}
|
||||||
|
mode: '0440'
|
||||||
|
content: |
|
||||||
|
{{ terraform_user }} ALL=(root) NOPASSWD: /sbin/pvesm
|
||||||
|
{{ terraform_user }} ALL=(root) NOPASSWD: /sbin/qm
|
||||||
|
{{ terraform_user }} ALL=(root) NOPASSWD: /usr/bin/tee /var/lib/vz/*
|
||||||
|
```
|
||||||
|
|
||||||
|
Ensuite je crée un variable group `pve_vm`. Un variable group me permet de définir plusieurs variables et secrets ensemble :
|
||||||
|

|
||||||
|
|
||||||
|
Puis je crée un nouveau task template, cette fois de type Terraform Code. Je lui donne un nom, le chemin du projet Terraform, un workspace, le dépôt avec sa branche et le variable group :
|
||||||
|

|
||||||
|
|
||||||
|
Lancer le template me donne quelques options supplémentaires liées à Terraform :
|
||||||
|

|
||||||
|
|
||||||
|
Après le plan Terraform, il me propose d'appliquer, d'annuler ou d'arrêter :
|
||||||
|

|
||||||
|
|
||||||
|
Enfin, après avoir cliqué sur ✅ pour appliquer, j'ai pu regarder Terraform construire les VM, comme avec le CLI. À la fin, les VM ont été déployées avec succès sur Proxmox :
|
||||||
|

|
||||||
|
|
||||||
|
---
|
||||||
|
## Conclusion
|
||||||
|
|
||||||
|
Voilà pour mes tests de Semaphore UI, j'espère que cela vous aidera à voir ce que vous pouvez en faire.
|
||||||
|
|
||||||
|
Dans l'ensemble, l'interface est propre et agréable à utiliser. Je peux tout à fait m'imaginer planifier des playbooks Ansible avec elle, comme les mises à jour OPNsense dont je parlais en intro.
|
||||||
|
|
||||||
|
Pour Terraform, je l'utiliserai probablement pour lancer des VM éphémères pour des tests. J'aimerais utiliser le backend HTTP pour tfstate, mais cela nécessite la version Pro.
|
||||||
|
|
||||||
|
Pour conclure, Semaphore UI est un excellent outil, intuitif, esthétique et pratique. Beau travail de la part du projet !
|
||||||
259
content/post/17-semaphore-ui-interface-ansible-terraform.md
Normal file
@@ -0,0 +1,259 @@
---
slug: semaphore-ui-interface-ansible-terraform
title: Semaphore UI, a Great Interface for Ansible & Terraform
description: Demonstration of Semaphore UI, a web interface to run Ansible playbooks, Terraform code and even more. Installation with Docker and quick examples.
date: 2026-02-09
draft: false
tags:
- semaphore-ui
- ansible
- terraform
- proxmox
- docker
categories:
- homelab
---
## Intro

In my homelab, I like to play with tools like Ansible and Terraform. The primary interface is the CLI, which I love, but sometimes a fancy web UI is just nice.

After setting up my OPNsense cluster, I wanted a way to keep it up to date on a schedule. Automation means Ansible to me, but how do you automate and schedule playbooks?

At work I use Red Hat Ansible Automation Platform, which is great, but overkill for my lab. That’s how I found Semaphore UI. Let’s see what it can do.

---

## What is Semaphore UI

[Semaphore UI](https://semaphoreui.com/docs/) is a sleek web interface designed to run automation with tools like Ansible and Terraform, and even Bash, PowerShell or Python scripts.

It began as Ansible Semaphore, a web interface created to provide a simple front-end for running Ansible playbooks only. Over time, the community evolved the project into a multi-tool automation control plane.

It is a self-contained Go application with minimal dependencies, capable of using different database backends, such as PostgreSQL, MySQL, or BoltDB.

---

## Installation

Semaphore UI supports several [installation](https://semaphoreui.com/docs/category/installation) methods: Docker, Kubernetes, a package manager, or a plain binary.

I used Docker for my setup; you can see how I currently deploy applications in this [post]({{< ref "post/16-how-I-deploy-application" >}}).

Here is my `docker-compose.yml` file, configured with PostgreSQL:
```yaml
services:
  semaphore:
    image: semaphoreui/semaphore:v2.16.45
    container_name: semaphore_ui
    environment:
      - TZ=Europe/Paris
      - SEMAPHORE_DB_USER=${POSTGRES_USER}
      - SEMAPHORE_DB_PASS=${POSTGRES_PASSWORD}
      - SEMAPHORE_DB_HOST=postgres
      - SEMAPHORE_DB_PORT=5432
      - SEMAPHORE_DB_DIALECT=postgres
      - SEMAPHORE_DB=${POSTGRES_DB}
      - SEMAPHORE_PLAYBOOK_PATH=/tmp/semaphore/
      - SEMAPHORE_ADMIN_PASSWORD=${SEMAPHORE_ADMIN_PASSWORD}
      - SEMAPHORE_ADMIN_NAME=${SEMAPHORE_ADMIN_NAME}
      - SEMAPHORE_ADMIN_EMAIL=${SEMAPHORE_ADMIN_EMAIL}
      - SEMAPHORE_ADMIN=${SEMAPHORE_ADMIN}
      - SEMAPHORE_ACCESS_KEY_ENCRYPTION=${SEMAPHORE_ACCESS_KEY_ENCRYPTION}
      - SEMAPHORE_LDAP_ACTIVATED='no'
      # - SEMAPHORE_LDAP_HOST=dc01.local.example.com
      # - SEMAPHORE_LDAP_PORT='636'
      # - SEMAPHORE_LDAP_NEEDTLS='yes'
      # - SEMAPHORE_LDAP_DN_BIND='uid=bind_user,cn=users,cn=accounts,dc=local,dc=shiftsystems,dc=net'
      # - SEMAPHORE_LDAP_PASSWORD='ldap_bind_account_password'
      # - SEMAPHORE_LDAP_DN_SEARCH='dc=local,dc=example,dc=com'
      # - SEMAPHORE_LDAP_SEARCH_FILTER="(&(uid=%s)(memberOf=cn=ipausers,cn=groups,cn=accounts,dc=local,dc=example,dc=com))"
    depends_on:
      - postgres
    networks:
      - backend
      - web
    labels:
      - traefik.enable=true
      - traefik.http.routers.semaphore.rule=Host(`semaphore.vezpi.com`)
      - traefik.http.routers.semaphore.entrypoints=https
      - traefik.http.routers.semaphore.tls.certresolver=letsencrypt
      - traefik.http.services.semaphore.loadbalancer.server.port=3000
    restart: unless-stopped

  postgres:
    image: postgres:14
    hostname: postgres
    container_name: semaphore_postgres
    volumes:
      - /appli/data/semaphore/db:/var/lib/postgresql/data
    environment:
      - POSTGRES_USER=${POSTGRES_USER}
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
      - POSTGRES_DB=${POSTGRES_DB}
    networks:
      - backend
    restart: unless-stopped

networks:
  backend:
  web:
    external: true
```

To generate the access key encryption secret (`SEMAPHORE_ACCESS_KEY_ENCRYPTION`), I use this command:

```bash
head -c32 /dev/urandom | base64
```
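
As a quick sanity check (my own note, not from the Semaphore docs): 32 random bytes always encode to 44 Base64 characters, so the value is easy to verify before pasting it into the `.env` file:

```shell
# Generate the secret and check its encoded length: 32 random bytes
# always produce exactly 44 Base64 characters.
KEY=$(head -c32 /dev/urandom | base64 | tr -d '\n')
echo "${#KEY}"
# → 44
```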

With Semaphore running, let’s take a quick tour of the UI and wire it up to a repo.

---

## Discovery

After starting the stack, I can reach the login page at the URL:



To log in, I use the credentials defined by `SEMAPHORE_ADMIN_NAME`/`SEMAPHORE_ADMIN_PASSWORD`.

On first login, Semaphore prompted me to create a project. I created the Homelab project:



The first thing I want to do is add my *homelab* repository (you can find its mirror on GitHub [here](https://github.com/Vezpi/homelab)). In `Repository`, I click the `New Repository` button and add the repo URL. I don't specify credentials because the repo is public:



ℹ️ Before continuing, I deploy 3 VMs for testing purposes: `sem01`, `sem02` and `sem03`. I created them using Terraform with [this project](https://github.com/Vezpi/Homelab/tree/main/terraform/projects/semaphore-vms).

To interact with these VMs, I need to configure credentials. In the `Key Store`, I add the first credential, an SSH key for my user:



Then I create a new `Inventory`. I'm using the Ansible inventory format (the only one available). I pick the SSH key created earlier and set the type to `Static`. In the fields, I enter the 3 hosts with their FQDNs:



✅ With a project, repo, credentials, and inventory in place, I can move forward and try running an Ansible playbook.

---
## Launching an Ansible playbook

I want to test something simple: install a web server with a custom page on these 3 VMs. I create the playbook `install_nginx.yml`:
```yaml
---
- name: Demo Playbook - Install Nginx and Serve Hostname Page
  hosts: all
  become: true

  tasks:
    - name: Ensure apt cache is updated
      ansible.builtin.apt:
        update_cache: true
        cache_valid_time: 3600

    - name: Install nginx
      ansible.builtin.apt:
        name: nginx
        state: present

    - name: Create index.html with hostname
      ansible.builtin.copy:
        dest: /var/www/html/index.html
        content: |
          <html>
            <head><title>Demo</title></head>
            <body>
              <h1>Hostname: {{ inventory_hostname }}</h1>
            </body>
          </html>
        owner: www-data
        group: www-data
        mode: "0644"

    - name: Ensure nginx is running
      ansible.builtin.service:
        name: nginx
        state: started
        enabled: true
```

In Semaphore UI, I can now create my first `Task Template` for an Ansible playbook. I give it a name, the playbook path (from the root folder of the repo), the repository and its branch:



Time to launch the playbook! In the task templates list, I click on the ▶️ button:



The playbook launches and I can follow the output in real time:



I can also review previous runs:



✅ Finally, I can confirm the job is done by checking the URL on port 80 (http):


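The same check can be scripted instead of using a browser; a minimal sketch, assuming my lab hostnames `sem01`–`sem03` resolve from where you run it:

```shell
# Fetch the demo page from each VM and extract the <h1> line that
# the playbook templated in (hostnames are from my lab, adjust to yours).
for host in sem01 sem02 sem03; do
  curl -s "http://${host}/" | grep '<h1>'
done
```

Each line of output should show a different `Hostname:` value, proving the template rendered per host.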

Managing Ansible playbooks in Semaphore UI is pretty simple and really convenient. The interface is really sleek.

There are also a lot of customization options available when setting up the task template. I can use variables in a survey, or specify limits or tags. I really like it.

---
## Deploy with Terraform

While running Ansible playbooks was easy out of the box, deploying with Terraform on Proxmox VE was a bit different. Before starting, I destroy the 3 VMs deployed earlier.

Previously, from the CLI, Terraform interacted with the Proxmox cluster using an SSH key. I was not able to get that working from Semaphore UI, so I had to use a username with a password instead.

I told myself it was a good opportunity to use Ansible to create a dedicated Proxmox user. My first run failed with:

```plaintext
Unable to encrypt nor hash, passlib must be installed. No module named 'passlib'
```

This is apparently a known issue with Semaphore’s Python environment. As a workaround, I installed `passlib` directly in the container:

```bash
docker exec -it semaphore_ui pip install passlib
```
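
Note that this install disappears whenever the container is recreated. One way to make it stick (my own workaround, not something the Semaphore project documents) is to bake it into a derived image, based on the same tag as in the compose file:

```shell
# Hypothetical: write a Dockerfile for a derived Semaphore image with
# passlib preinstalled (same tag as in docker-compose.yml above).
cat > Dockerfile.semaphore <<'EOF'
FROM semaphoreui/semaphore:v2.16.45
RUN pip install passlib
EOF
```

You would then build it with `docker build -f Dockerfile.semaphore -t semaphore-passlib .` and point the `image:` key of the compose file at the new tag.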

With that in place, the playbook succeeded and I could create the user:

```yaml
---
- name: Create Terraform local user for Proxmox
  hosts: nodes
  become: true

  tasks:
    - name: Create terraform user
      ansible.builtin.user:
        name: "{{ terraform_user }}"
        password: "{{ terraform_password | password_hash('sha512') }}"
        shell: /bin/bash

    - name: Create sudoers file for terraform user
      ansible.builtin.copy:
        dest: /etc/sudoers.d/{{ terraform_user }}
        mode: '0440'
        content: |
          {{ terraform_user }} ALL=(root) NOPASSWD: /sbin/pvesm
          {{ terraform_user }} ALL=(root) NOPASSWD: /sbin/qm
          {{ terraform_user }} ALL=(root) NOPASSWD: /usr/bin/tee /var/lib/vz/*
```

Next I create a variable group `pve_vm`. A variable group lets me define multiple variables and secrets together:



Then I create a new task template, this time with the kind Terraform Code. I give it a name, the path of the terraform [project](https://github.com/Vezpi/Homelab/tree/main/terraform/projects/semaphore-vms), a workspace, the repository along with its branch, and the variable group:



Running the template gives me some additional options related to Terraform:



After the Terraform plan, I'm offered the choice to apply, cancel or stop:



Finally, after hitting ✅ to apply, I could watch Terraform build the VMs, just like using the CLI. At the end, the VMs were successfully deployed on Proxmox:



---

## Conclusion

That's it for my Semaphore UI tests; I hope this helps you see what you can do with it.

Overall, the interface is clean and pleasant to use. I can definitely see myself scheduling Ansible playbooks with it, like the OPNsense updates I mentioned in the intro.

For Terraform, I’ll probably use it to spin up short-lived VMs for tests. I’d love to use the HTTP backend for tfstate, but that requires the Pro version.

To conclude, Semaphore UI is a great tool, intuitive, good-looking, and practical. Nice work from the project!
223
content/post/18-create-nas-server-with-truenas.fr.md
Normal file
@@ -0,0 +1,223 @@
---
slug: create-nas-server-with-truenas
title: Construction et installation de mon NAS avec TrueNAS SCALE
description: "Guide pas à pas pour un NAS TrueNAS SCALE en homelab : choix du matériel, installation, pool ZFS et datasets, partages SMB/NFS et snapshots."
date: 2026-02-27
draft: false
tags:
- truenas
categories:
- homelab
---
## Introduction

Dans mon homelab, j'ai besoin d'un endroit pour stocker des données en dehors de mon cluster Proxmox VE.

Au départ, mon unique serveur physique avait 2 disques HDD de 2 To. Quand j'ai installé Proxmox dessus, ces disques sont restés attachés à l'hôte. Je les ai partagés via un serveur NFS dans un LXC, loin des bonnes pratiques.

Cet hiver, le nœud a commencé à montrer des signes de faiblesse, s'éteignant sans raison. Ce compagnon a maintenant 7 ans. Lorsqu'il est passé hors ligne, mes partages NFS ont disparu, entraînant la chute de quelques services dans mon homelab. Le remplacement du ventilateur du CPU l'a stabilisé, mais je veux maintenant un endroit plus sûr pour ces données.

Dans cet article, je vais vous expliquer comment j'ai construit mon NAS avec TrueNAS.

---

## Choisir la bonne plateforme

Depuis un moment je voulais un NAS. Pas un Synology ou QNAP prêt à l'emploi, même si je pense que ce sont de bons produits. Je voulais le construire moi‑même. L'espace est limité dans mon petit rack, et les boîtiers NAS compacts sont rares.

### Matériel

Je suis parti sur un NAS full‑flash. Pourquoi ?

- C'est rapide
- C'est ~~furieux~~ compact
- C'est silencieux
- Ça consomme moins d'énergie
- Ça chauffe moins

Le problème est le prix.

La vitesse réseau est de toute façon mon goulot d'étranglement, mais les autres avantages sont exactement ce que je veux. Je n'ai pas besoin d'une capacité massive, environ 2 To utilisables suffisent.

Mon premier choix était le [Aiffro K100](https://www.aiffro.com/fr/products/all-ssd-nas-k100). Mais la livraison vers la France a presque doublé le prix. Finalement j'ai opté pour un [Beelink ME mini](https://www.bee-link.com/products/beelink-me-mini-n150?variant=48678160236786).

Ce petit cube a :
- un CPU Intel N200
- 12 Go de RAM
- 2x Ethernet 2,5 Gbps
- jusqu'à 6x disques NVMe
- une puce eMMC de 64 Go pour l'OS

J'ai commencé avec 2 disques NVMe pour l'instant, de 2 To chacun.

### Logiciel

Maintenant que le matériel est choisi, quel logiciel vais‑je utiliser ?

Mes besoins étaient simples :
- partages NFS
- support ZFS
- capacités VM

J'ai considéré FreeNAS/TrueNAS, OpenMediaVault et Unraid. J'ai choisi TrueNAS SCALE 25.10 Community Edition. Pour être clair : FreeNAS a été renommé TrueNAS CORE (basé sur FreeBSD), tandis que TrueNAS SCALE est la gamme basée sur Linux. J'utilise SCALE.

---

## Installer TrueNAS

⚠️ J'ai installé TrueNAS sur la puce eMMC. Ce n'est pas recommandé, l'endurance de l'eMMC peut être un risque.

L'installation ne s'est pas déroulée aussi bien que prévu...

J'utilise [Ventoy](https://www.ventoy.net/en/index.html) pour garder plusieurs ISO sur une clé USB. J'étais en version 1.0.99, et l'ISO ne se lançait pas. La mise à jour vers la 1.1.10 a résolu le problème :



Mais j'ai ensuite rencontré un autre problème lors du lancement de l'installation sur mon périphérique de stockage eMMC :

```
Failed to find partition number 2 on mmcblk0
```

J'ai trouvé une solution sur ce [post](https://forums.truenas.com/t/installation-failed-on-emmc-odroid-h4/15317/12) :
- Entrer dans le shell

  

- Éditer le fichier `/lib/python3/dist-packages/truenas_installer/utils.py`
- Déplacer la ligne `await asyncio.sleep(1)` juste sous `for _try in range(tries):`
- Modifier la ligne 46 pour ajouter `+ 'p'` :
  `for partdir in filter(lambda x: x.is_dir() and x.name.startswith(device + 'p'), dir_contents):`

  

- Quitter le shell et lancer l'installation sans redémarrer

L'installateur a finalement pu passer :



Une fois l'installation terminée, j'ai éteint la machine. Ensuite je l'ai installée dans mon rack au-dessus des 3 nœuds Proxmox VE. J'ai branché les deux câbles Ethernet depuis mon switch et je l'ai mise sous tension.
## Configurer TrueNAS

Par défaut, TrueNAS utilise DHCP. J'ai trouvé son adresse MAC dans mon interface UniFi et créé une réservation DHCP. Dans OPNsense, j'ai ajouté un override host pour Dnsmasq. Dans le plugin Caddy, j'ai configuré un domaine pour TrueNAS pointant vers cette IP, puis j'ai redémarré.

✅ Après quelques minutes, TrueNAS est maintenant disponible sur [https://nas.vezpi.com](https://nas.vezpi.com/).

### Paramètres généraux

Pendant l'installation, je n'ai pas défini de mot de passe pour `truenas_admin`. La page de connexion m'a forcé à en choisir un :



Une fois le mot de passe mis à jour, j'arrive sur le tableau de bord. L'interface donne une bonne impression au premier abord :



J'explore rapidement l'interface ; la première chose que je fais est de changer le hostname en `granite` et de cocher la case en dessous pour hériter du domaine depuis DHCP :



Dans les `General Settings`, je change les paramètres de `Localization`. Je mets le Console Keyboard Map sur `French (AZERTY)` et le fuseau horaire sur `Europe/Paris`.

Je crée un nouvel utilisateur `vez`, avec le rôle `Full Admin` dans TrueNAS. J'autorise SSH uniquement pour l'authentification par clé, pas de mots de passe :



Finalement je retire le rôle admin de `truenas_admin` et verrouille le compte.

### Création du pool

Dans TrueNAS, un pool est une collection de stockage créée en combinant plusieurs disques en un espace unifié géré par ZFS.

Dans la page `Storage`, je trouve mes `Disks`, où je peux confirmer que TrueNAS voit mes deux NVMe :



De retour sur le `Storage Dashboard`, je clique sur le bouton `Create Pool`. Je nomme le pool `storage` parce que je suis vraiment inspiré pour lui donner un nom :



Puis je sélectionne la disposition `Mirror` :



J'explore rapidement les configurations optionnelles, mais les valeurs par défaut me conviennent : autotrim, compression, pas de dedup, etc. À la fin, avant de créer le pool, il y a une section `Review` :



Après avoir cliqué sur `Create Pool`, on m'avertit que tout sur les disques sera effacé, ce que je confirme. Finalement le pool est créé.
### Création des datasets

Un dataset est un système de fichiers à l'intérieur d'un pool. Il peut contenir des fichiers, des répertoires et des datasets enfants, et peut être partagé via NFS et/ou SMB. Il permet de gérer indépendamment les permissions, la compression, les snapshots et les quotas pour différents ensembles de données au sein du même pool de stockage.

#### Partage SMB

Créons maintenant mon premier dataset `files` pour partager des fichiers sur le réseau pour mes clients Windows, comme des ISO, etc. :



Lors de la création de datasets SMB dans SCALE, définissez le Share Type sur SMB afin que les bons ACL/xattr par défaut s'appliquent. TrueNAS me demande alors de démarrer/activer le service SMB :



Depuis mon portable Windows, j'essaie d'accéder à mon nouveau partage `\\granite.mgmt.vezpi.com\files`. Comme prévu, on me demande des identifiants.

Je crée un nouveau compte utilisateur avec la permission SMB.

✅ Succès : je peux parcourir et copier des fichiers.

#### Partage NFS

Je crée un autre dataset, `media`, et un enfant `photos`. Je crée un partage NFS à partir de ce dernier.

Sur mon serveur NFS actuel, les fichiers photos appartiennent à `root` (gérés par _Immich_). Plus tard je verrai comment migrer vers une version sans root.

⚠️ Pour l'instant je définis, dans les `Advanced Options`, le `Maproot User` et le `Maproot Group` sur `root`. Cela équivaut à l'option d'export NFS `no_root_squash` : le `root` local du client reste `root` sur le serveur, ne faites pas ça :



✅ Je monte le partage NFS sur un client, cela fonctionne bien.

Après la configuration initiale, mes datasets du pool `storage` ressemblent à :

- `backups`
  - `duplicati` : backend de stockage [Duplicati](https://duplicati.com/)
  - `proxmox` : futur Proxmox Backup Server
- `cloud` : données `Nextcloud`
- `files`
- `media`
  - `downloads`
  - `photos`
  - `videos`

J'ai mentionné les capacités VM dans mes exigences. Je ne couvrirai pas cela dans ce post, ce sera abordé la prochaine fois.
### Protection des données

Il est maintenant temps d'activer quelques fonctionnalités de protection des données :



Je veux créer des snapshots automatiques pour certains de mes datasets, ceux qui me tiennent le plus à cœur : mes fichiers cloud et les photos.

Créons des tâches de snapshot. Je clique sur le bouton `Add` à côté de `Periodic Snapshot Tasks` :
- cloud : snapshots quotidiens, conservés pendant 2 mois
- photos : snapshots quotidiens, conservés pendant 7 jours



Je pourrais aussi configurer une `Cloud Sync Task`, mais Duplicati gère déjà les sauvegardes hors site.

---

## Utilisation de TrueNAS

Maintenant que mon instance TrueNAS est configurée, je dois planifier la migration des données depuis mon serveur NFS actuel vers TrueNAS.

### Migration des données

Pour chacun de mes partages NFS actuels, sur un client, je monte le nouveau partage NFS pour synchroniser les données :

```
sudo mkdir /new_photos
sudo mount 192.168.88.30:/mnt/storage/media/photos /new_photos
sudo rsync -a --info=progress2 /data/photo/ /new_photos
```
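
Pour vérifier la copie (exemple de mon cru, avec les mêmes chemins que ci‑dessus), une comparaison récursive suffit :

```shell
# Comparaison récursive source/destination après la synchronisation :
# diff -r n'affiche rien et renvoie 0 si les contenus sont identiques.
diff -r /data/photo /new_photos && echo "OK : contenus identiques"
```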

À la fin, je pourrai décommissionner mon ancien serveur NFS sur le LXC. La disposition des datasets après migration ressemble à ceci :



### Application Android

Par curiosité, j'ai cherché sur le Play Store une application pour gérer une instance TrueNAS. J'ai trouvé [Nasdeck](https://play.google.com/store/apps/details?id=com.strtechllc.nasdeck&hl=fr&pli=1), qui est plutôt sympa. Voici quelques captures d'écran :



---

## Conclusion

Mon NAS est maintenant prêt à stocker mes données.

Je n'ai pas abordé les capacités VM car je vais bientôt les expérimenter pour installer Proxmox Backup Server en VM. De plus, je n'ai pas configuré les notifications ; je dois mettre en place une solution pour recevoir des alertes par email dans mon système de notification.

TrueNAS est un excellent produit. Il nécessite du matériel capable pour ZFS, mais l'expérience est excellente une fois configuré.

Étape suivante : déployer Proxmox Backup Server en tant que VM sur TrueNAS, puis revoir les permissions NFS pour passer Immich en mode sans root.
219
content/post/18-create-nas-server-with-truenas.md
Normal file
@@ -0,0 +1,219 @@
---
slug: create-nas-server-with-truenas
title: Build and Install of My NAS with TrueNAS SCALE
description: "Step-by-step TrueNAS SCALE homelab NAS build: hardware choice, installation, ZFS pool and datasets, SMB/NFS shares and snapshots."
date: 2026-02-27
draft: false
tags:
- truenas
categories:
- homelab
---
## Introduction

In my homelab, I need a place to store data outside of my Proxmox VE cluster.

In the beginning, my single physical server had two 2 TB HDDs. When I installed Proxmox on it, those disks stayed attached to the host. I shared them via an NFS server in an LXC, far from best practice.

This winter, the node started to fail, shutting down for no reason. This buddy is now 7 years old. When it went offline, my NFS shares disappeared, taking a few services in my homelab down with them. Replacing the CPU fan stabilized it, but I now want a safer home for that data.

In this article, I’ll walk you through how I built my NAS with TrueNAS.

---
## Choose the right platform

For a while I wanted a NAS. Not an out‑of‑the‑box Synology or QNAP, even though I think they’re great products. I wanted to build mine. Space is tight in my tiny rack, and small NAS cases are rare.

### Hardware

I went for an all‑flash NAS. Why?
- It's fast
- It's ~~furious~~ compact
- It's quiet
- It uses less power
- It runs cooler

The trade‑off is price.

Network speed is my bottleneck anyway, but the other benefits are exactly what I want. I don’t need massive capacity, about 2 TB usable is enough.

My first choice was the [Aiffro K100](https://www.aiffro.com/fr/products/all-ssd-nas-k100). But shipping to France nearly doubled the price. I ended up with a [Beelink ME mini](https://www.bee-link.com/products/beelink-me-mini-n150?variant=48678160236786).

This small cube has:
- an Intel N200 CPU
- 12 GB RAM
- 2x 2.5 Gbps Ethernet
- up to 6x NVMe drives
- a 64 GB eMMC chip for the OS

I started with 2 NVMe drives for now, 2 TB each.
### Software

Now that the hardware is chosen, which software will I use?

My requirements were simple:
- NFS shares
- ZFS support
- VM capabilities

I considered FreeNAS/TrueNAS, OpenMediaVault, and Unraid. I chose TrueNAS SCALE 25.10 Community Edition. For clarity: FreeNAS was renamed TrueNAS CORE (FreeBSD‑based), while TrueNAS SCALE is the Linux‑based line. I’m using SCALE.

---
## Install TrueNAS

⚠️ I installed TrueNAS on the eMMC chip. That’s not recommended, eMMC endurance can be a risk.

The install didn’t go as smoothly as expected...

I use [Ventoy](https://www.ventoy.net/en/index.html) to keep multiple ISOs on one USB stick. I was on version 1.0.99, and the ISO wouldn't launch. Updating to 1.1.10 fixed it:



But then I hit another problem when launching the installation on my eMMC storage device:

```
Failed to find partition number 2 on mmcblk0
```

I found a solution in this [post](https://forums.truenas.com/t/installation-failed-on-emmc-odroid-h4/15317/12):
- Enter the shell

  

- Edit the file `/lib/python3/dist-packages/truenas_installer/utils.py`
- Move the line `await asyncio.sleep(1)` right beneath `for _try in range(tries):`
- Edit line 46 to add `+ 'p'`:
  `for partdir in filter(lambda x: x.is_dir() and x.name.startswith(device + 'p'), dir_contents):`

  

- Exit the shell and start the installation without rebooting

The installer was finally able to get through:



Once the installation was complete, I shut down the machine. Then I installed it into my rack on top of the 3 Proxmox VE nodes. I plugged in both Ethernet cables from my switch and powered it up.
## Configure TrueNAS

By default, TrueNAS uses DHCP. I found its MAC address in my UniFi interface and created a DHCP reservation. In OPNsense, I added a Dnsmasq host override. In the Caddy plugin, I set up a domain for TrueNAS pointing to that IP, then rebooted.

✅ After a few minutes, TrueNAS is now available at https://nas.vezpi.com.

### General Settings

During install, I didn’t set a password for `truenas_admin`. The login page forced me to pick one:



Once the password is updated, I land on the dashboard. The UI feels great at first glance:



I quickly explore the interface; the first thing I do is change the hostname to `granite` and check the box below to let it inherit the domain from DHCP:



In the `General Settings`, I change the `Localization` settings. I set the Console Keyboard Map to `French (AZERTY)` and the Timezone to `Europe/Paris`.

I create a new user `vez`, with the `Full Admin` role within TrueNAS. I allow SSH for key‑based auth only, no passwords:



Finally I remove the admin role from `truenas_admin` and lock the account.
### Pool creation

In TrueNAS, a pool is a storage collection created by combining multiple disks into a unified ZFS‑managed space.

In the `Storage` page, I can find my `Disks`, where I can confirm TrueNAS sees both NVMe drives:



Back in the `Storage Dashboard`, I click the `Create Pool` button. I name the pool `storage` because I'm really inspired to give it a name:



Then I select the `Mirror` layout:



I quickly explore the optional configurations, but the defaults are fine for me: autotrim, compression, no dedup, etc. At the end, before creating the pool, there is a `Review` section:



After hitting `Create Pool`, I'm warned that everything on the disks will be wiped, which I confirm. Finally the pool is created.
### Datasets creation
|
||||||
|
|
||||||
|
A dataset is a filesystem inside a pool. It can contains files, directories and child datasets, it can be shared using NFS and/or SMB. It allows you to independently manage permissions, compression, snapshots, and quotas for different sets of data within the same storage pool.
|
||||||
|
|
||||||
|
#### SMB share
|
||||||
|
|
||||||
|
Let's now create my first dataset `files` to share files over the network for my Windows clients, like ISOs, etc:
|
||||||
|

|
||||||
|
|
||||||
|
When creating SMB datasets in SCALE, set Share Type to SMB so the right ACL/xattr defaults apply. TrueNAS then prompts me to start/enable the SMB service:


From my Windows laptop, I try to access my new share `\\granite.mgmt.vezpi.com\files`. As expected, I'm prompted for credentials.

I create a new user account with SMB permission.

✅ Success: I can browse and copy files.
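The share is also reachable from Linux clients; a minimal sketch using `cifs-utils` (hostname and username taken from above, the mount point is arbitrary):

```bash
sudo mkdir -p /mnt/files
# Prompts for the SMB password of the TrueNAS user
sudo mount -t cifs //granite.mgmt.vezpi.com/files /mnt/files \
  -o username=vez,uid=$(id -u),gid=$(id -g)
```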
|
||||||
|
|
||||||
|
#### NFS share
|
||||||
|
|
||||||
|
I create another dataset: `media`, and a child `photos`. I create a NFS share from the latter.
|
||||||
|
|
||||||
|
On my current NFS server, the files for the photos are owned by `root` (managed by *Immich*). Later I'll see how I can migrate towards a root-less version.
|
||||||
|
|
||||||
|
⚠️ For now I set, in `Advanced Options`, the `Maproot User` and `Maproot Group` to `root`. This is equivalent to the NFS attribute `no_squash_root`, the local `root` of the client stays `root` on the server, don't do that:
|
||||||
|



✅ I mount the NFS share on a client and it works fine.

After the initial setup, the datasets in my `storage` pool look like:

- `backups`
    - `duplicati`: [Duplicati](https://duplicati.com/) storage backend
    - `proxmox`: future Proxmox Backup Server
- `cloud`: `Nextcloud` data
- `files`
- `media`
    - `downloads`
    - `photos`
    - `videos`
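For reference, the same layout could be created from the TrueNAS shell with plain ZFS commands (a sketch; the UI additionally applies share-type ACL presets that these commands don't):

```bash
zfs create storage/backups
zfs create storage/backups/duplicati
zfs create storage/backups/proxmox
zfs create storage/cloud
zfs create storage/files
zfs create storage/media
zfs create storage/media/downloads
zfs create storage/media/photos
zfs create storage/media/videos
```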
I mentioned VM capabilities in my requirements. I won't cover them in this post; that's for next time.
### Data protection

Now it's time to enable some data protection features:



I want automatic snapshots for the datasets I care about most: my cloud files and photos.

Let's create the snapshot tasks. I click the `Add` button next to `Periodic Snapshot Tasks`:

- cloud: daily snapshots, kept for 2 months
- photos: daily snapshots, kept for 7 days


I could also set up a `Cloud Sync Task`, but Duplicati already handles offsite backups.
|
||||||
|
|
||||||
|
---
|
||||||
|
## Using TrueNAS
|
||||||
|
|
||||||
|
Now my TrueNAS instance is configured, I need to plan the migration of the data from my current NFS server to TrueNAS.
|
||||||
|
### Data migration
|
||||||
|
|
||||||
|
For each of my current NFS shares, on a client, I mount the new NFS share to synchronize the data:
|
||||||
|
```
|
||||||
|
sudo mkdir /new_photos
|
||||||
|
sudo mount 192.168.88.30:/mnt/storage/media/photos /new_photos
|
||||||
|
sudo rsync -a --info=progress2 /data/photo/ /new_photos
|
||||||
|
```
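After the initial copy, a checksum-based dry run is a cheap way to confirm nothing was missed; a sketch using the same paths as the sync above:

```bash
# --checksum re-reads file contents; --dry-run changes nothing.
# No output means both trees are identical.
rsync -a --checksum --dry-run --itemize-changes /data/photo/ /new_photos
```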
|
||||||
|
|
||||||
|
At the end, I could decommission my old NFS server on the LXC. The dataset layout after migration looks like this:
|
||||||
|

|
||||||
|
|
||||||
|
### Android application
|
||||||
|
|
||||||
|
Out of curiosity, I've checked on the Google Play store for an app to manage a TrueNAS instance. I've found [Nasdeck](https://play.google.com/store/apps/details?id=com.strtechllc.nasdeck&hl=fr&pli=1), which is quite nice. Here some screenshots:
|
||||||
|

|
||||||
|
|
||||||
|
---
|
||||||
|
## Conclusion
|
||||||
|
|
||||||
|
My NAS is now ready to store my data.
|
||||||
|
|
||||||
|
I didn't address VM capabilities as I will experience it soon to install Proxmox Backup Server as VM. Also I didn't configure notifications, I need to setup a solution to receive email alerts to my notification system.
|
||||||
|
|
||||||
|
TrueNAS is a great product. It needs capable hardware for ZFS, but the experience is excellent once set up.
|
||||||
|
|
||||||
|
Next step: deploy Proxmox Backup Server as a VM on TrueNAS, then revisit NFS permissions to go root‑less for Immich.
|
||||||
@@ -15,7 +15,7 @@ categories:

One of the most satisfying aspects of building my homelab is being able to apply production-grade tooling to it. I wanted to define my entire infrastructure as code, and the first step I tackled was deploying Virtual Machines with **Terraform** on **Proxmox**.

In this article, I guide you step by step through creating a simple VM on Proxmox VE 8 using Terraform, based on a **cloud-init** template I detailed in [this article]({{< ref "post/1-proxmox-cloud-init-vm-template" >}}). Everything runs from a dedicated LXC container that centralizes the management of my infrastructure.

📝 The full code used in this article is available in my [Homelab GitHub repository](https://github.com/Vezpi/Homelab)

@@ -102,6 +102,43 @@ pveum role add TerraformUser -privs "\

⚠️ The list of available privileges changed in PVE 9.0; use this command instead:
```bash
pveum role add TerraformUser -privs "\
Datastore.Allocate \
Datastore.AllocateSpace \
Datastore.Audit \
Pool.Allocate \
Pool.Audit \
Sys.Audit \
Sys.Console \
Sys.Modify \
Sys.Syslog \
VM.Allocate \
VM.Audit \
VM.Clone \
VM.Config.CDROM \
VM.Config.Cloudinit \
VM.Config.CPU \
VM.Config.Disk \
VM.Config.HWType \
VM.Config.Memory \
VM.Config.Network \
VM.Config.Options \
VM.Console \
VM.Migrate \
VM.GuestAgent.Audit \
VM.GuestAgent.FileRead \
VM.GuestAgent.FileWrite \
VM.GuestAgent.FileSystemMgmt \
VM.GuestAgent.Unrestricted \
VM.PowerMgmt \
Mapping.Audit \
Mapping.Use \
SDN.Audit \
SDN.Use"
```

2. **Create the User `terraformer`**
```bash
pveum user add terraformer@pve --password <password>
```

@@ -112,7 +149,7 @@ pveum user add terraformer@pve --password <password>
```bash
pveum aclmod / -user terraformer@pve -role TerraformUser
```

4. **Create the API Token for the User `terraformer`**
```bash
pveum user token add terraformer@pve terraform -expire 0 -privsep 0 -comment "Terraform token"
```
@@ -15,7 +15,7 @@ categories:

One of the most satisfying parts of building a homelab is getting to apply production-grade tooling to a personal setup. I’ve been working on defining my entire infrastructure as code, and the first piece I tackled was VM deployment with **Terraform** on **Proxmox**.

In this article, I’ll walk you through creating a simple VM on Proxmox VE 8 using Terraform, based on a **cloud-init** template I covered in [this article]({{< ref "post/1-proxmox-cloud-init-vm-template" >}}). Everything runs from a dedicated LXC container where I manage my whole infrastructure.

📝 The full code used in this article is available in my [Homelab GitHub repository](https://github.com/Vezpi/Homelab)

@@ -102,6 +102,43 @@ pveum role add TerraformUser -privs "\

⚠️ The list of available privileges changed in PVE 9.0; use this command instead:
```bash
pveum role add TerraformUser -privs "\
Datastore.Allocate \
Datastore.AllocateSpace \
Datastore.Audit \
Pool.Allocate \
Pool.Audit \
Sys.Audit \
Sys.Console \
Sys.Modify \
Sys.Syslog \
VM.Allocate \
VM.Audit \
VM.Clone \
VM.Config.CDROM \
VM.Config.Cloudinit \
VM.Config.CPU \
VM.Config.Disk \
VM.Config.HWType \
VM.Config.Memory \
VM.Config.Network \
VM.Config.Options \
VM.Console \
VM.Migrate \
VM.GuestAgent.Audit \
VM.GuestAgent.FileRead \
VM.GuestAgent.FileWrite \
VM.GuestAgent.FileSystemMgmt \
VM.GuestAgent.Unrestricted \
VM.PowerMgmt \
Mapping.Audit \
Mapping.Use \
SDN.Audit \
SDN.Use"
```

2. **Create the User `terraformer`**
```bash
pveum user add terraformer@pve --password <password>
```
@@ -11,8 +11,20 @@ Hi there, how are you ?

I'm ==testing==

## Emoji

I've changed the Images location

🚀💡🔧🔁⚙️📝📌✅⚠️🍒❌ℹ️⌛🚨🎉📖🔥

→

[post]({{< ref "post/0-template" >}})

List:
- One
- Two
- Three

Checklist:
- [ ] Not Checked
- [x] Checked

Look this is ~~strike~~ !