Ansible Patch Management: Automated OS Patching Across Linux and Windows Enterprise Fleets
By Luca Berton · Published 2024-01-01 · Category: installation
Automate OS patch management with Ansible. Rolling updates, pre-patch snapshots, compliance reporting, and zero-downtime patching for Linux and Windows fleets.
Introduction
Unpatched servers are among the most commonly exploited attack vectors in enterprise breaches, yet manual patching is slow, error-prone, and disruptive. Ansible automates the entire patch lifecycle (pre-patch checks, snapshots, rolling updates, post-patch validation, and compliance reporting) across both Linux and Windows fleets, using rolling batches so that services stay available while hosts are patched.
See also: RHSB-2024-001 Leaky Vessels: runc (CVE-2024-21626)
Patch Workflow
  Pre-Patch         Patch          Post-Patch          Report
┌───────────┐    ┌─────────┐    ┌──────────────┐    ┌──────────┐
│ Snapshot  │───→│ Apply   │───→│ Health Check │───→│ Generate │
│ Backup    │    │ Updates │    │ Validate     │    │ Report   │
│ Pre-check │    │ Reboot  │    │ Services OK  │    │ Close CR │
└───────────┘    └─────────┘    └──────────────┘    └──────────┘
Linux Patching
Full Patch Playbook
---
- name: Linux patch management
  hosts: linux_servers
  become: true
  serial: "25%"
  max_fail_percentage: 10
  vars:
    patch_date: "{{ ansible_date_time.date }}"
    reboot_required: true

  pre_tasks:
    # Informational only: failed_when: false records the result without stopping the play
    - name: Pre-patch health check
      ansible.builtin.uri:
        url: "http://{{ inventory_hostname }}:{{ app_port | default(8080) }}/health"
        status_code: 200
      register: pre_health
      failed_when: false
    - name: Remove from load balancer
      ansible.builtin.uri:
        url: "{{ lb_api }}/pools/{{ lb_pool }}/members/{{ inventory_hostname }}"
        method: DELETE
      delegate_to: localhost
      when: lb_api is defined
    - name: Wait for connections to drain
      ansible.builtin.wait_for:
        timeout: 30
      when: lb_api is defined

  tasks:
    # --- RHEL/CentOS/Rocky/Alma ---
    - name: Apply security updates (RHEL family)
      ansible.builtin.dnf:
        name: "*"
        state: latest
        security: true
        update_cache: true
      register: dnf_result
      when: ansible_os_family == 'RedHat'
    # --- Ubuntu/Debian ---
    - name: Update apt cache
      ansible.builtin.apt:
        update_cache: true
        cache_valid_time: 3600
      when: ansible_os_family == 'Debian'
    # upgrade: safe applies every available safe upgrade, not only security fixes
    - name: Apply available updates (Debian family)
      ansible.builtin.apt:
        upgrade: safe
        autoremove: true
      register: apt_result
      when: ansible_os_family == 'Debian'
    # This flag file is written on Debian/Ubuntu; RHEL hosts rely on dnf_result below
    - name: Check if reboot required
      ansible.builtin.stat:
        path: /var/run/reboot-required
      register: reboot_flag
    - name: Reboot if needed
      ansible.builtin.reboot:
        reboot_timeout: 600
        msg: "Rebooting for kernel/security updates"
      when: >
        reboot_required and
        ((reboot_flag.stat.exists | default(false)) or
        (dnf_result.changed | default(false)) or
        (apt_result.changed | default(false)))

  post_tasks:
    - name: Wait for server to be ready
      ansible.builtin.wait_for_connection:
        timeout: 300
    - name: Post-patch health check
      ansible.builtin.uri:
        url: "http://{{ inventory_hostname }}:{{ app_port | default(8080) }}/health"
        status_code: 200
      retries: 10
      delay: 15
      register: post_health
      until: post_health.status == 200
    - name: Verify critical services
      ansible.builtin.systemd:
        name: "{{ item }}"
      register: svc_status
      loop: "{{ critical_services | default(['sshd']) }}"
    - name: Assert services running
      ansible.builtin.assert:
        that: item.status.ActiveState == 'active'
        fail_msg: "Service {{ item.item }} not running after patching!"
      loop: "{{ svc_status.results }}"
    - name: Re-add to load balancer
      ansible.builtin.uri:
        url: "{{ lb_api }}/pools/{{ lb_pool }}/members"
        method: POST
        body_format: json
        body:
          address: "{{ inventory_hostname }}"
      delegate_to: localhost
      when: lb_api is defined
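The reboot check in the playbook above relies on /var/run/reboot-required, a flag file that Debian-family systems write but RHEL-family systems do not. A minimal sketch of an equivalent check for RHEL follows, assuming the needs-restarting utility (from the dnf-utils or yum-utils package, depending on the release) is installed. The register name rhel_reboot_check is illustrative and not part of the original playbook.

```yaml
- name: Check whether a reboot is needed (RHEL family)
  ansible.builtin.command: needs-restarting -r
  register: rhel_reboot_check        # hypothetical variable name
  changed_when: false
  # needs-restarting -r exits 0 when no reboot is needed and 1 when one is
  failed_when: rhel_reboot_check.rc not in [0, 1]
  when: ansible_os_family == 'RedHat'

- name: Reboot RHEL hosts that report a pending reboot
  ansible.builtin.reboot:
    reboot_timeout: 600
    msg: "Rebooting for kernel/security updates"
  when:
    - ansible_os_family == 'RedHat'
    - rhel_reboot_check.rc == 1
```

Used this way, RHEL hosts reboot only when the updated packages actually need it, rather than after every change reported by dnf.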
Selective Patching
# security: true selects every security advisory regardless of severity;
# the dnf module has no severity filter
- name: Apply security updates only
  ansible.builtin.dnf:
    name: "*"
    state: latest
    security: true
    bugfix: false
  when: patch_scope == 'critical_only'
- name: Exclude specific packages
  ansible.builtin.dnf:
    name: "*"
    state: latest
    exclude:
      - kernel*
      - docker*
      - kubelet*
  when: patch_scope == 'no_kernel'
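Neither task above filters by severity, because ansible.builtin.dnf exposes only the security and bugfix switches. One way to narrow the run further is to call the dnf CLI directly. The sketch below assumes that dnf's --sec-severity and --cve options are available on the target release and that dnf prints "Nothing to do" when no advisory matches. target_cve and critical_result are hypothetical names.

```yaml
- name: Apply only advisories rated Critical (sketch)
  ansible.builtin.command: dnf upgrade --sec-severity=Critical -y
  register: critical_result          # hypothetical variable name
  # Assumption: dnf reports "Nothing to do" when no advisory matches
  changed_when: "'Nothing to do' not in critical_result.stdout"
  when: patch_scope == 'critical_only'

- name: Apply the fix for one specific CVE (sketch)
  ansible.builtin.command: "dnf upgrade --cve {{ target_cve }} -y"
  when: target_cve is defined        # hypothetical variable, e.g. passed with -e
```

The command module gives up the idempotence reporting of the dnf module, which is why the first task sets changed_when explicitly.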
See also: Ansible for Windows Server Enterprise Management: Active Directory, IIS, and Group Policy
Windows Patching
- name: Windows patch management
  hosts: windows_servers
  serial: "25%"
  vars:
    # Defined here as well: the snapshot name below depends on it
    patch_date: "{{ ansible_date_time.date }}"
  tasks:
    - name: Pre-patch snapshot (VMware)
      community.vmware.vmware_guest_snapshot:
        hostname: "{{ vcenter_host }}"
        username: "{{ vcenter_user }}"
        password: "{{ vcenter_pass }}"
        validate_certs: false
        datacenter: DC01
        name: "{{ inventory_hostname }}"
        state: present
        snapshot_name: "pre-patch-{{ patch_date }}"
        quiesce: true
      delegate_to: localhost
    - name: Install Windows updates
      ansible.windows.win_updates:
        category_names:
          - SecurityUpdates
          - CriticalUpdates
          - UpdateRollups
        state: installed
        reboot: true
        reboot_timeout: 1800
      register: win_updates
    - name: Report installed updates
      ansible.builtin.debug:
        msg: |
          Host: {{ inventory_hostname }}
          Updates installed: {{ win_updates.installed_update_count }}
          Reboot required: {{ win_updates.reboot_required }}
    - name: Post-patch verification
      ansible.windows.win_service:
        name: "{{ item }}"
      register: win_svc
      loop: "{{ critical_windows_services }}"
    - name: Remove snapshot after validation (7-day delay via scheduled job)
      ansible.windows.win_powershell:
        script: |
          # Schedule snapshot removal for 7 days from now
          $trigger = New-ScheduledTaskTrigger -Once -At (Get-Date).AddDays(7)
          $action = New-ScheduledTaskAction -Execute "PowerShell" `
            -Argument "-Command Write-Host 'Snapshot cleanup placeholder'"
          # -Force overwrites a task left over from a previous patch cycle
          Register-ScheduledTask -TaskName "CleanupSnapshot" -Trigger $trigger -Action $action -Force
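Before installing anything, it can help to see which updates a host would receive. The sketch below uses the state: searched mode of ansible.windows.win_updates to list pending updates without applying them. The register name pending_win_updates is illustrative, and the category list simply mirrors the install task.

```yaml
- name: List pending Windows updates without installing them
  ansible.windows.win_updates:
    category_names:
      - SecurityUpdates
      - CriticalUpdates
      - UpdateRollups
    state: searched
  register: pending_win_updates      # hypothetical variable name

- name: Show what would be installed
  ansible.builtin.debug:
    msg: >-
      {{ inventory_hostname }} has
      {{ pending_win_updates.found_update_count }} pending update(s):
      {{ pending_win_updates.updates | dict2items | map(attribute='value.title') | list }}
```

Running this as a dry run ahead of the maintenance window gives change approvers a concrete list of titles per host.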
Compliance Reporting
- name: Generate patch compliance report
  hosts: all
  become: true
  tasks:
    # dnf check-update exits 100 when updates are available and 0 when there are none
    - name: Check for pending updates (RHEL)
      ansible.builtin.command: dnf check-update --security
      register: pending_updates
      failed_when: false
      changed_when: false
      when: ansible_os_family == 'RedHat'
    - name: Check kernel version
      ansible.builtin.command: uname -r
      register: running_kernel
      changed_when: false
    - name: Check last patch date
      ansible.builtin.stat:
        path: /var/log/dnf.log
      register: dnf_log
      when: ansible_os_family == 'RedHat'
    - name: Build compliance data
      ansible.builtin.set_fact:
        compliance:
          hostname: "{{ inventory_hostname }}"
          os: "{{ ansible_distribution }} {{ ansible_distribution_version }}"
          kernel: "{{ running_kernel.stdout }}"
          # Holds the dnf exit code (0 or 100), not a count of updates
          pending_security_updates: "{{ pending_updates.rc | default(0) }}"
          last_patched: "{{ dnf_log.stat.mtime | default(0) | int }}"
          compliant: "{{ (pending_updates.rc | default(0)) == 0 }}"

- name: Compile report
  hosts: localhost
  tasks:
    - name: Generate compliance report
      ansible.builtin.template:
        src: compliance-report.j2
        dest: "/reports/patch-compliance-{{ ansible_date_time.date }}.html"
      vars:
        all_compliance: "{{ groups['all'] | map('extract', hostvars, 'compliance') | list }}"
        total_hosts: "{{ groups['all'] | length }}"
        compliant_hosts: "{{ all_compliance | selectattr('compliant', 'true') | list | length }}"
        compliance_pct: "{{ (compliant_hosts | int / total_hosts | int * 100) | round(1) }}%"
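The report play reads the compliance fact from every host in groups['all'], so a host that was unreachable during the first play has no such fact and can break the template. A minimal sketch of a more defensive aggregation follows. The variable names reported and missing are illustrative, and the play assumes it runs in the same playbook as the fact-collection play so that hostvars are populated.

```yaml
- name: Summarise only hosts that reported compliance data
  hosts: localhost
  gather_facts: false
  vars:
    # Hosts that set the compliance fact in the earlier play
    reported: >-
      {{ groups['all'] | map('extract', hostvars)
         | selectattr('compliance', 'defined')
         | map(attribute='compliance') | list }}
    # Hosts with no data, e.g. unreachable during collection
    missing: >-
      {{ groups['all'] | map('extract', hostvars)
         | rejectattr('compliance', 'defined')
         | map(attribute='inventory_hostname') | list }}
  tasks:
    - name: Show coverage of the report
      ansible.builtin.debug:
        msg: "{{ reported | length }} host(s) reported, missing: {{ missing }}"
```

Listing the missing hosts explicitly keeps them visible in the audit trail instead of silently lowering or inflating the compliance percentage.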
See also: Ansible for Healthcare: HIPAA Compliance, EHR Systems, and Medical Device Management
Patch Scheduling with AAP
# AAP Schedule: Monthly Patch Tuesday + 3 days
#   Week 1: Dev/Test   (auto-approve)
#   Week 2: Staging    (auto-approve)
#   Week 3: Production (requires approval)
#
# Workflow Template:
# ┌──────────┐    ┌──────────┐    ┌──────────┐
# │ Snapshot │───→│ Patch    │───→│ Validate │
# │ All VMs  │    │ Servers  │    │ Services │
# └──────────┘    └──────────┘    └──────────┘
#                      │                │
#                 ┌────┴────┐     ┌─────┴─────┐
#                 │ Failed? │     │ Generate  │
#                 │ Rollback│     │ Report    │
#                 └─────────┘     └───────────┘
Rollback
# patch_date must be passed in (for example with -e) and match the date in the snapshot name
- name: Rollback failed patch
  hosts: "{{ failed_hosts }}"
  become: true
  tasks:
    - name: Rollback last dnf transaction
      ansible.builtin.command: dnf history undo last -y
      when:
        - ansible_os_family == 'RedHat'
        - rollback_method | default('package') != 'snapshot'
    - name: Revert VM snapshot
      community.vmware.vmware_guest_snapshot:
        hostname: "{{ vcenter_host }}"
        username: "{{ vcenter_user }}"
        password: "{{ vcenter_pass }}"
        validate_certs: false
        datacenter: DC01
        name: "{{ inventory_hostname }}"
        state: revert
        snapshot_name: "pre-patch-{{ patch_date }}"
      delegate_to: localhost
      when: rollback_method | default('package') == 'snapshot'
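The rollback play above runs separately and needs a list of failed hosts. An alternative is to roll back inline with block and rescue, so that a host that fails validation undoes its own transaction. This is a sketch under stated assumptions rather than a drop-in replacement: it covers only RHEL-family hosts, reuses the /health endpoint from the Linux playbook, and the register name patch_result is illustrative.

```yaml
- name: Patch with inline rollback (sketch)
  hosts: linux_servers
  become: true
  serial: "25%"
  max_fail_percentage: 10
  tasks:
    - name: Patch and validate
      when: ansible_os_family == 'RedHat'
      block:
        - name: Apply security updates
          ansible.builtin.dnf:
            name: "*"
            state: latest
            security: true
          register: patch_result     # hypothetical variable name

        - name: Validate application health
          ansible.builtin.uri:
            url: "http://{{ inventory_hostname }}:{{ app_port | default(8080) }}/health"
            status_code: 200
          retries: 10
          delay: 15
          register: health
          until: health.status == 200
      rescue:
        # Undo only if this run changed packages; otherwise "last"
        # would point at an unrelated, older transaction
        - name: Undo the dnf transaction from this run
          ansible.builtin.command: dnf history undo last -y
          when: patch_result is defined and patch_result is changed

        - name: Fail the host so it counts toward max_fail_percentage
          ansible.builtin.fail:
            msg: "Rolled back {{ inventory_hostname }} after: {{ ansible_failed_task.name }}"
```

The explicit fail task matters: a rescue section that completes successfully marks the host as recovered, which would hide the failure from max_fail_percentage.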
Best Practices
Rolling updates with serial: Never patch 100% of hosts simultaneously; use 25% batches with health checks
Pre-patch snapshots: Always have a rollback path
Health checks before and after: Automated service validation catches regressions
Load balancer integration: Drain connections before patching; re-add after validation
Separate kernel updates: Kernel patches require a reboot; handle them separately from application patches
Compliance reporting: Generate reports after each patch cycle for audit
Test patches in dev first: Staggered rollout from dev to staging to production
Exclude critical packages when needed: Skip Docker/K8s updates during OS patching
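The first practice can be taken one step further with a canary batch. The serial keyword accepts a list of batch sizes, so a single host can be patched first before the play moves on to 25% batches. A minimal sketch, reusing the linux_servers group and the dnf task from the playbook above:

```yaml
- name: Rolling patch with a canary batch (sketch)
  hosts: linux_servers
  become: true
  # One canary host first, then 25% of the group per batch
  serial:
    - 1
    - "25%"
  # Abort the run when more than 10% of the hosts in a batch fail
  max_fail_percentage: 10
  tasks:
    - name: Apply security updates
      ansible.builtin.dnf:
        name: "*"
        state: latest
        security: true
      when: ansible_os_family == 'RedHat'
```

If the canary host fails, its batch is 100% failed, so the play stops before any other host is touched.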
FAQ
How do I handle patch failures?
max_fail_percentage: 10 aborts the play once more than 10% of the hosts in the current batch have failed. Use the rollback playbook for the affected hosts. VMware snapshots provide a fast rollback path.
How often should I patch?
Monthly for routine updates (align with Patch Tuesday). Immediately for critical CVEs (CVSS 9.0+). Ansible makes emergency patching as fast as routine patching.
Can I patch without rebooting?
For most package updates, yes. Kernel updates require a reboot unless live patching is in place: kpatch (RHEL) or Livepatch (Ubuntu) apply kernel fixes without a reboot, and Ansible can manage both.
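As a rough illustration of the kpatch route on RHEL, the sketch below installs the kpatch tooling and reports which live patches are loaded. It assumes the kpatch package is available from the host's repositories. The live patch modules themselves are delivered separately and are not shown here; kpatch_list is an illustrative register name.

```yaml
- name: Kernel live patching status on RHEL (sketch)
  hosts: linux_servers
  become: true
  tasks:
    - name: Install the kpatch tooling
      ansible.builtin.dnf:
        name: kpatch
        state: present
      when: ansible_os_family == 'RedHat'

    - name: List loaded and installed live patch modules
      ansible.builtin.command: kpatch list
      register: kpatch_list          # hypothetical variable name
      changed_when: false
      when: ansible_os_family == 'RedHat'

    - name: Show live patch status
      ansible.builtin.debug:
        var: kpatch_list.stdout_lines
      when: ansible_os_family == 'RedHat'
```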
Conclusion
Ansible patch management transforms OS patching from a weekend maintenance window into an automated, rolling process with built-in safety nets. Pre-patch snapshots, health checks, load balancer integration, and compliance reporting ensure patches are applied safely and verifiably across your entire fleet.
Related Articles
• Ansible Compliance Automation • Ansible Windows Server Management • Ansible VMware Automation