The Challenge
The environment evolved over many years and faced several real‑world constraints:
- Vendor limitations — VMware’s free edition restricted backups, API access, and live migration.
- Ageing hardware — initial hosts were DL380 G5 and DL360 G6 servers sourced from the secondary market.
- Limited storage — the QNAP NAS provided only modest iSCSI/NFS capacity, restricting VDI retention depth.
- No budget for commercial backup tools — all backup operations had to be built using Linux CLI tools.
- Tape capacity limits — LTO4 and LTO5 tapes could not hold 30 days of VDI snapshots as the environment grew.
- Need for air‑gapped protection — backups had to be physically removable and independent of cloud services.
- Zero‑downtime requirements — hypervisor upgrades and hardware refreshes had to be performed live.
The challenge was to maintain a reliable, recoverable virtualisation platform with long‑term retention while working entirely within these constraints.
The Solution
1. Migration from VMware to XenServer/XCP‑ng
To escape VMware’s limitations, the platform was migrated to XenServer 7 and later upgraded in‑place to XCP‑ng while workloads remained online. This provided:
- Live migration
- Open storage formats
- Flexible upgrade paths
- Full CLI access
- No licensing restrictions
A three‑node DL360 G6 cluster was built, backed by a QNAP NAS providing iSCSI and NFS storage for VM disks and backups.
2. Hardware Evolution to DL360 G9
As workloads increased, two G6 nodes were replaced with DL360 G9 servers. The mixed‑generation pool remained stable thanks to XCP‑ng’s broad hardware support, and the newer nodes improved:
- CPU performance
- RAM capacity
- Power efficiency
- Overall reliability
3. Fully CLI‑Driven Tape Backup Architecture
A dedicated DL360 G6 backup server was equipped with two tape drives:
- LTO4 on /dev/st0
- LTO5 on /dev/st1
The backup system was built entirely using:
- tar
- mt
- cron
- Xen Orchestra streaming
- NFS/iSCSI storage
Key Engineering Features
a. Streaming VDI backups directly to tape
Xen Orchestra streamed full and incremental VDI chains at ~110 MB/s directly into tar, avoiding:
- filesystem overhead
- millions of tiny files
- tape shoe‑shining
- local disk requirements
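The streaming idea can be sketched in a few lines of shell. This is a minimal illustration rather than the original job: it assumes GNU tar and assumes the exported VDI files are visible under a mounted directory, since the exact hand-off from Xen Orchestra to tar is not described here. The function name `write_stream_to_tape`, the directory layout, and the blocking factor of 512 are all hypothetical.

```shell
#!/bin/sh
# Sketch: write a directory of VDI exports straight to a tape device with tar,
# without building an intermediate archive on local disk.
# Assumptions (not from the original setup): GNU tar, the function name,
# the source directory, and the blocking factor.

write_stream_to_tape() {
    src_dir=$1            # directory holding the exported VDI files (hypothetical)
    tape_dev=$2           # tape device, e.g. /dev/st1 for the LTO5 drive
    blocking=${3:-512}    # 512 x 512-byte blocks = 256 KiB per record

    # A large blocking factor hands the drive big records so it can keep
    # streaming instead of stopping and repositioning (shoe-shining).
    tar -b "$blocking" -cf "$tape_dev" -C "$src_dir" .
}
```

Because tar treats the target as an ordinary file, the same function can be tried against a scratch file before pointing it at a real drive, for example `write_stream_to_tape /mnt/vdi-exports /tmp/test.img`.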
b. Compressing file‑level data locally
Zimbra and Samba data were compressed into a single backup.tar.gz to maximise tape efficiency and simplify restores.
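A minimal sketch of that compression step, assuming GNU tar. The helper name `make_file_archive` and the Samba share path in the usage note are hypothetical; only the single `backup.tar.gz` output comes from the description above.

```shell
#!/bin/sh
# Sketch: bundle several data directories into one gzip-compressed tar archive.
# Assumptions: GNU tar; the function name is illustrative only.

make_file_archive() {
    archive=$1
    shift
    n=$#
    # Rewrite each directory argument as "-C <parent> <name>" so members are
    # stored with short relative paths (zimbra/..., samba/...).
    while [ "$n" -gt 0 ]; do
        dir=$1
        shift
        set -- "$@" -C "$(dirname "$dir")" "$(basename "$dir")"
        n=$((n - 1))
    done
    tar -czf "$archive" "$@"
}
```

A call might look like `make_file_archive /backup/backup.tar.gz /opt/zimbra /srv/samba`, where `/srv/samba` stands in for wherever the shares actually lived. A single archive means a restore needs only one `tar -xzf` pass over the tape contents.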
c. Splitting workloads across two tape drives
To maximise throughput:
- LTO4 handled the compressed archive
- LTO5 handled VDI streams and metadata
This allowed both drives to run at full speed simultaneously.
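One way to drive two tape jobs at once from a single script is plain shell job control. The sketch below is an assumption about how such a split could be wired up, not the original script; `run_both_drives` and the two job names in the usage note are hypothetical.

```shell
#!/bin/sh
# Sketch: run one job per tape drive concurrently and report failure if
# either job fails. Function and job names are illustrative only.

run_both_drives() {
    job_lto4=$1    # command or function that writes to the LTO4 drive
    job_lto5=$2    # command or function that writes to the LTO5 drive

    "$job_lto4" &
    pid_lto4=$!
    "$job_lto5" &
    pid_lto5=$!

    status=0
    # Wait for each job separately so one failure does not hide the other.
    wait "$pid_lto4" || status=1
    wait "$pid_lto5" || status=1
    return "$status"
}
```

With two functions defined elsewhere, say `write_file_archive` and `write_vdi_streams`, the call would be `run_both_drives write_file_archive write_vdi_streams`. Each drive gets its own writer process, so neither has to wait for the other's data source.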
d. Automated tape rotation
Cron scripts:
- set tape block sizes
- generated archives
- streamed data to tape
- ejected tapes automatically
On‑site staff only had to remove the ejected tape and take it off‑site.
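The four steps in the list above can be sketched as one small script. Everything specific here is an assumption for illustration: the script path and schedule in the crontab comment, the function names, the use of `setblk 0` (variable block mode) and a tar blocking factor of 512, and the dry-run default. The `mt` subcommands used (`setblk`, `rewind`, `offline`) are those of the Linux mt-st tool.

```shell
#!/bin/sh
# Sketch of a nightly tape job: set block size, write the archive, eject.
# Example crontab entry (hypothetical path and time):
#   30 1 * * * DRY_RUN=0 /usr/local/sbin/tape-nightly.sh
# Defaults to printing the commands; set DRY_RUN=0 to execute them.

DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

nightly_tape_job() {
    tape_dev=$1    # e.g. /dev/st0
    archive=$2     # e.g. /backup/backup.tar.gz

    # Stop at the first failing step so a bad write never ends in an eject.
    run mt -f "$tape_dev" setblk 0 &&
    run mt -f "$tape_dev" rewind &&
    run tar -b 512 -cf "$tape_dev" -C "$(dirname "$archive")" "$(basename "$archive")" &&
    run mt -f "$tape_dev" offline    # rewind and unload the tape
}
```

In dry-run mode, `nightly_tape_job /dev/st0 /backup/backup.tar.gz` only prints the four commands it would run, which makes the sequence easy to review before it is trusted with a real drive.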
4. Retention Strategy
The system maintained:
- Daily full backups on tape
- Incremental VDI chains in Xen Orchestra
- 14‑day hypervisor snapshots
- Months of VDI history across rotated tapes
Retention depth was adjusted as VDI chains grew and QNAP storage became constrained, ensuring tapes remained within capacity while still providing long‑term recovery.
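The capacity trade-off is simple integer arithmetic. The sketch below uses the native (uncompressed) capacities of LTO4 and LTO5 cartridges, 800 GB and 1500 GB; the helper name `days_on_tape` and the daily backup size in the usage note are hypothetical, not measurements from this environment.

```shell
#!/bin/sh
# Sketch: how many whole days of backups fit on one cartridge.
# Capacities are native LTO figures; daily sizes are supplied by the caller.

LTO4_NATIVE_GB=800
LTO5_NATIVE_GB=1500

days_on_tape() {
    capacity_gb=$1
    daily_gb=$2
    if [ "$daily_gb" -le 0 ]; then
        echo "daily size must be a positive number of GB" >&2
        return 1
    fi
    # Integer division: a partial day does not count as a restorable day.
    echo $(( capacity_gb / daily_gb ))
}
```

For example, `days_on_tape "$LTO5_NATIVE_GB" 400` prints 3: at a hypothetical 400 GB per day, one LTO5 tape holds three days, so deeper history has to come from rotating tapes rather than from a single cartridge.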
Outcome
Despite limited hardware and no commercial backup tools, the platform delivered:
- Zero data‑loss incidents
- Reliable recovery from tape when required
- Continuous uptime during hypervisor migrations
- Long‑term retention without cloud dependency
- A fully auditable, vendor‑independent backup format
- Maximum value extracted from refurbished hardware