Azure · PowerShell · Python · AI Foundry · Security · Compliance · Automation

Azure Security Audit Automation with AI-Powered Analysis

A suite of PowerShell and Python runbooks that automate multi-subscription Azure security audits, store structured results in Azure Storage, and feed into an AI Foundry index for natural-language querying and firewall rule comparison.

The Problem

Security audits across multi-subscription Azure environments are time-consuming and inconsistent when done manually. An analyst pulling NSG rules, firewall configurations, public IP exposures, and Entra ID access reviews across a dozen subscriptions is going to miss things — not from negligence, but from volume and fatigue.

The specific ask: automate the collection, make it queryable, and surface anomalies without requiring an analyst to know exactly what to look for ahead of time.

The Architecture

The solution is a multi-layer pipeline: automated collection, structured storage, and AI-assisted analysis.

Layer 1: Data Collection Runbooks

A set of PowerShell and Python runbooks executes on a schedule, sweeping the configured subscriptions and pulling:

  • Network Security Group rules — all inbound/outbound rules across all NSGs, flagged by subscription and resource group
  • Azure Firewall rules — application rules, network rules, DNAT rules, with policy inheritance resolved
  • Public IP inventory — all public IPs with associated resource type, DNS label, and allocation method
  • Entra ID audit data — groups, users, role assignments, access review results
  • Tenable scan results — pulled directly from the Tenable API as both CSV (for indexing) and PDF (for human review)
# NSG collection — simplified example
$output = @()
foreach ($sub in Get-AzSubscription) {
    # Scope subsequent Az cmdlets to this subscription
    Set-AzContext -SubscriptionId $sub.Id | Out-Null
    foreach ($nsg in Get-AzNetworkSecurityGroup) {
        $rules = $nsg.SecurityRules | Select-Object Name, Priority, Direction,
            Access, Protocol, SourceAddressPrefix, DestinationPortRange
        $output += [PSCustomObject]@{
            Subscription  = $sub.Name
            NSG           = $nsg.Name
            ResourceGroup = $nsg.ResourceGroupName
            Rules         = $rules
        }
    }
}
$output | Export-Csv -Path $outputPath -NoTypeInformation

Layer 2: Structured Storage

All output lands in Azure Storage accounts, partitioned by date and audit category. The folder structure is intentional — it allows the data to be consumed by multiple downstream tools without transformation.

audits/
  2025-03-01/
    nsg-rules.csv
    firewall-rules.csv
    public-ips.csv
    entra-groups.csv
    entra-roles.csv
    access-reviews.csv
    tenable-scan.csv
    tenable-scan.pdf

Layer 3: AI Foundry Index and Query Interface

A Python runbook feeds the CSV data into an Azure AI Foundry vector index on each audit cycle. The index is queried through a web portal using GPT-4o (Azure deployment), enabling natural-language questions over the audit data.

# Feed CSVs into AI Foundry index
from pathlib import Path

from azure.ai.projects import AIProjectClient
from azure.identity import DefaultAzureCredential

credential = DefaultAzureCredential()
client = AIProjectClient(endpoint=FOUNDRY_ENDPOINT, credential=credential)
index_client = client.indexes.get(index_name=INDEX_NAME)

for csv_path in audit_csvs:
    with open(csv_path, "rb") as f:
        index_client.upload_file(file=f, filename=Path(csv_path).name)

Note

GPT handles natural-language queries well — "show me all NSG rules that allow inbound access from the internet on port 22" works exactly as expected. But firewall rule comparison between two audit cycles requires deterministic logic, not inference.

The Firewall Comparison Problem

AI models don't naturally do accurate diff-style comparisons on structured rule sets — they'll hallucinate additions or miss deletions in large rule tables. This required custom logic in the portal.

The comparison engine reads two audit cycle CSVs directly, computes the rule delta using a hash-based approach, and passes the structured diff to GPT as grounded context. GPT's job then is interpretation and explanation, not computation.

import pandas as pd

def compare_firewall_rules(baseline: pd.DataFrame, current: pd.DataFrame) -> dict:
    # Hash each rule row once; membership tests against the sets are then O(1)
    baseline_keys = baseline.apply(lambda r: hash(tuple(r)), axis=1)
    current_keys = current.apply(lambda r: hash(tuple(r)), axis=1)

    added = current[~current_keys.isin(set(baseline_keys))]
    removed = baseline[~baseline_keys.isin(set(current_keys))]

    return {"added": added.to_dict("records"), "removed": removed.to_dict("records")}

GPT receives the diff and returns a human-readable analysis with citations back to specific rules — including security implications of the changes.
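The grounding step is straightforward: serialize the computed diff and make it the only evidence in the prompt. A minimal sketch of that step (the prompt wording and function name are illustrative, not the production prompt):

```python
import json

def build_diff_prompt(diff: dict) -> str:
    """Turn a computed rule delta into grounded context for the model."""
    return (
        "You are reviewing Azure Firewall rule changes between two audit cycles.\n"
        "Explain the security implications of each change, citing specific rules.\n"
        "Report only changes present in the diff below; do not infer others.\n\n"
        f"ADDED RULES:\n{json.dumps(diff['added'], indent=2)}\n\n"
        f"REMOVED RULES:\n{json.dumps(diff['removed'], indent=2)}"
    )
```

Since the model never sees the raw rule tables, it cannot hallucinate a change that the comparison engine did not compute.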

Status

The runbook collection layer and Azure Storage architecture are in production. The AI Foundry indexing pipeline is operational and query logic is working in the proof-of-concept portal. The full analyst-facing UI is in active development.

Design Principles

Deterministic where it matters. AI is used for language, synthesis, and surfacing patterns — not for computation or comparison. Rule diffs, IP lists, and access reviews are resolved by code before GPT sees them.

Auditable storage. Every audit cycle is a dated partition. Historical comparison is available without any special tooling — just CSV files in blob storage.

Tenable integration. Vulnerability scan results feed the same index, enabling correlation between firewall posture and known CVEs in the same query interface.

OWASP and PCI framing. Network exposure queries are written to surface PCI-relevant conditions first — public IPs on cardholder data environment resources, NSG rules allowing broad internet access, privileged Entra roles without MFA enforcement.
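The "broad internet access" condition, for instance, is a deterministic filter over the collected NSG CSV rather than an AI judgment. A sketch using the column names produced by the collection runbook (the source-prefix list is an assumption, not exhaustive):

```python
import pandas as pd

# Source prefixes treated as "the whole internet" (assumed list, not exhaustive)
BROAD_SOURCES = {"*", "Internet", "0.0.0.0/0"}

def flag_internet_exposed(nsg_rules: pd.DataFrame) -> pd.DataFrame:
    """Return inbound Allow rules whose source covers the entire internet."""
    mask = (
        (nsg_rules["Direction"] == "Inbound")
        & (nsg_rules["Access"] == "Allow")
        & (nsg_rules["SourceAddressPrefix"].isin(BROAD_SOURCES))
    )
    return nsg_rules[mask]
```

Running filters like this before indexing means the PCI-relevant findings are already flagged rows in the data, which GPT can then cite directly.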