Your Hub and Spoke Network Is Costing You $11,000 a Year for Traffic That Doesn't Need a Firewall

  • Writer: Logan Hemphill
  • 5 days ago
  • 8 min read

You're running Azure Firewall Standard in your hub VNet right now, and there's a solid chance that 60% or more of the traffic flowing through it is simple east west communication that NSGs could handle for free. That's roughly $11,000 a year in deployment costs alone, before data processing charges, for a service that's mostly acting as an expensive traffic passthrough.


I see this pattern constantly. A team reads the Cloud Adoption Framework, follows the landing zone reference architecture, deploys hub and spoke with Azure Firewall Standard sitting in the middle, and calls it done. Six months later they're wondering why their networking line item keeps climbing and their spoke to spoke latency is worse than it should be.


Here's the deal. Hub and spoke is a good pattern. It's not the right pattern for every organization, and it's definitely not the right pattern at the scale most teams deploy it.


The "best practice" trap


Microsoft's Cloud Adoption Framework recommends hub and spoke as the default networking topology for Azure landing zones. The reference architecture puts Azure Firewall in the hub for centralized traffic inspection. For large enterprises with regulatory requirements around deep packet inspection, threat intelligence filtering, and centralized egress control, that architecture makes sense.


But most of the environments I walk into aren't large enterprises with those requirements. They're organizations running 5 to 20 spoke VNets, hosting internal line of business applications, with maybe one or two public facing workloads behind Application Gateway. Their security needs are real, but they don't require Layer 7 inspection on every packet moving between spokes.


The problem is that "best practice" gets treated as "only practice." Teams deploy Azure Firewall Standard because the reference architecture shows it, not because they've evaluated whether their traffic patterns actually need it. And once it's deployed, nobody questions the $912/month fixed cost because it's "infrastructure."


What Azure Firewall Standard actually costs you


Let's break the numbers down so this isn't abstract.

Azure Firewall Standard charges $1.25/hour for the deployment itself, a fixed cost whether you're pushing 1 GB or 1 TB through it. That works out to roughly $912/month, or about $10,950/year, just to have it running.


On top of that, you pay $0.016/GB for data processing. If you're pushing 500 GB/month of spoke to spoke traffic through the firewall, that's another $8/month. The data processing cost is almost negligible compared to the deployment cost, which makes the fixed cost the real issue.


Now compare that to NSGs, which are free. No deployment cost. No per GB charge for the NSG itself. You'll still pay VNet peering costs if traffic crosses VNets (about $0.01/GB per direction for same region peering), but that's pennies compared to the firewall's fixed overhead. They operate at Layer 3 and 4, handling IP and port based allow/deny rules. For east west traffic between spokes where you just need to control which subnets can talk to which services on which ports, NSGs do the job.
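To make the comparison concrete, here's a back-of-the-envelope calculation you can paste into any Log Analytics query window as a KQL print statement. The prices are the list prices quoted above (verify them against the current Azure pricing page), and the 500 GB/month volume is the illustrative figure from earlier:

```kql
// Back-of-the-envelope monthly cost comparison
// Assumptions: list prices as quoted in this post, 730 hours/month,
// 500 GB/month of spoke to spoke traffic
let FirewallHourlyUSD = 1.25;    // Azure Firewall Standard deployment, $/hour
let FirewallPerGBUSD = 0.016;    // firewall data processing, $/GB
let PeeringPerGBUSD = 0.02;      // same region VNet peering, ~$0.01/GB each direction
let MonthlyGB = 500.0;
print
    FirewallMonthlyUSD = FirewallHourlyUSD * 730 + FirewallPerGBUSD * MonthlyGB,  // 912.50 + 8.00 = 920.50
    NsgWithPeeringMonthlyUSD = PeeringPerGBUSD * MonthlyGB                        // 10.00
```

Even at ten times that traffic volume, the peering-plus-NSG path stays under the firewall's fixed deployment cost by a wide margin.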


If your environment genuinely needs the Layer 7 application rules and threat intelligence based filtering that Azure Firewall Standard provides (IDPS and TLS inspection require stepping up to Premium), then the $912/month is justified. But I'd estimate that 70% of the organizations I've worked with are paying for those capabilities and not using them. Their firewall rules are almost entirely network rules and NAT rules, and the network rules are exactly the kind of Layer 3/4 filtering NSGs handle natively.


The latency tax nobody talks about


Cost isn't the only issue. Hub and spoke with centralized firewall inspection adds latency to every spoke to spoke communication. Traffic from Spoke A to Spoke B has to route through the hub, hit the firewall, get inspected, and then route to the destination. That's an extra 2 to 3 milliseconds on every request compared to direct VNet peering.


For most workloads, 2 to 3 milliseconds doesn't matter. For latency sensitive workloads like real time APIs, database replication, or microservices with high call volumes, it adds up fast. I had a client running a microservices architecture across three spoke VNets. Their service to service calls were averaging 15ms when they should have been under 5ms. The extra hops through the hub firewall accounted for the majority of that overhead.


The fix wasn't complicated. We identified which spoke to spoke traffic actually needed firewall inspection (traffic touching their PCI scoped workloads) and which didn't (internal APIs communicating between their application tier and their data tier). For the traffic that didn't need deep inspection, we set up direct VNet peering with NSG rules. Their latency dropped to where it should have been, and the firewall's processing load decreased noticeably.


How to audit what your firewall is actually doing


Before you make any changes, you need to understand your traffic patterns. Here's a KQL query (in two variants, depending on your logging mode) you can run against your Azure Firewall's diagnostic logs in Log Analytics to see what percentage of your traffic is using network rules versus application rules:


If your environment uses the newer structured logs (resource specific tables), run this:

// Azure Firewall rule hit analysis (structured logs)
// Uses resource-specific tables AZFWNetworkRule and AZFWApplicationRule
union AZFWNetworkRule, AZFWApplicationRule
| where TimeGenerated > ago(30d)
| summarize
    NetworkRuleHits = countif(Type == "AZFWNetworkRule"),
    AppRuleHits = countif(Type == "AZFWApplicationRule")
| extend TotalHits = NetworkRuleHits + AppRuleHits
| extend NetworkRulePercent = round(100.0 * NetworkRuleHits / TotalHits, 1)
| extend AppRulePercent = round(100.0 * AppRuleHits / TotalHits, 1)
| project TotalHits, NetworkRuleHits, NetworkRulePercent, AppRuleHits, AppRulePercent

If you're still on the legacy AzureDiagnostics mode, use this version instead:

// Azure Firewall rule hit analysis (legacy diagnostics)
AzureDiagnostics
| where Category == "AzureFirewallNetworkRule" or Category == "AzureFirewallApplicationRule"
| where TimeGenerated > ago(30d)
| summarize
    NetworkRuleHits = countif(Category == "AzureFirewallNetworkRule"),
    AppRuleHits = countif(Category == "AzureFirewallApplicationRule")
| extend TotalHits = NetworkRuleHits + AppRuleHits
| extend NetworkRulePercent = round(100.0 * NetworkRuleHits / TotalHits, 1)
| extend AppRulePercent = round(100.0 * AppRuleHits / TotalHits, 1)
| project TotalHits, NetworkRuleHits, NetworkRulePercent, AppRuleHits, AppRulePercent

Not sure which mode you're on? Check your firewall's Diagnostic Settings in the portal. If you see tables like AZFWNetworkRule, you're on structured logs. If everything goes to AzureDiagnostics, you're on legacy.


If your NetworkRulePercent is above 80%, that's a strong signal that most of your firewall traffic is doing work that NSGs could handle. Network rules operate at Layer 3/4, the same layer as NSGs. You're paying for Layer 7 capabilities you aren't using.


Here's another query to identify your top spoke to spoke traffic flows. This one uses the structured logs table:

// Top spoke-to-spoke flows through Azure Firewall (structured logs, last 30 days)
AZFWNetworkRule
| where TimeGenerated > ago(30d)
| summarize FlowCount = count() by SourceIp, DestinationIp, DestinationPort, Action
| order by FlowCount desc
| take 25

If you're on legacy diagnostics, the equivalent parses the msg_s field:

// Top spoke-to-spoke flows (legacy diagnostics, last 30 days)
AzureDiagnostics
| where Category == "AzureFirewallNetworkRule"
| where TimeGenerated > ago(30d)
| parse msg_s with Protocol " request from " SourceIP ":" SourcePort " to " DestIP ":" DestPort ". Action: " Action
| where isnotempty(SourceIP) and isnotempty(DestIP)
| summarize FlowCount = count() by SourceIP, DestIP, DestPort, Action
| order by FlowCount desc
| take 25

This tells you exactly which traffic is crossing through the firewall and how often. Look at the results and ask yourself: does this traffic actually need deep packet inspection, or would a simple allow rule on an NSG accomplish the same thing?
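To turn those per-IP results into peering decisions, roll the flows up to subnet level. Here's a sketch, assuming structured logs and /24 subnet sizing (adjust the prefix to match your address plan):

```kql
// Roll per-IP flows up to /24 subnets to see which spoke pairs dominate
AZFWNetworkRule
| where TimeGenerated > ago(30d)
| extend SourceSubnet = format_ipv4_mask(SourceIp, 24),
         DestSubnet = format_ipv4_mask(DestinationIp, 24)
| summarize FlowCount = count() by SourceSubnet, DestSubnet
| order by FlowCount desc
| take 10
```

The top subnet pairs here are your candidates for direct peering with NSGs.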


The right sizing approach


I'm not saying rip out your hub and spoke architecture. I'm saying right size it to what you actually need.


For organizations with fewer than 10 spoke VNets and no regulatory requirement for deep packet inspection: Consider whether you need Azure Firewall at all. Direct VNet peering with well designed NSGs gives you network segmentation, traffic control, and zero additional deployment cost. If you need centralized egress filtering for outbound internet traffic, Azure Firewall Basic at $288/month is less than a third of Standard's cost and handles the basics.


For organizations with regulatory requirements on specific workloads: Don't put everything through the firewall. Use Azure Firewall Premium for the traffic that needs inspection (PCI scoped workloads, traffic touching sensitive data stores, outbound internet traffic) and direct peering with NSGs for everything else. This is a hybrid approach, and it's how I'd do it if I were in your shoes.


For large enterprises with 20+ spoke VNets: Evaluate whether Virtual WAN makes more sense than self managed hub and spoke. At scale, Virtual WAN's managed routing and lower operational overhead can offset its costs.


Here's an Azure Resource Graph query to see your current VNet peering topology and identify how many peerings flow through your hub:

// List all VNet peerings and their states
resources
| where type =~ "microsoft.network/virtualnetworks"
| mv-expand peering = properties.virtualNetworkPeerings
| project
    VNetName = name,
    VNetResourceGroup = resourceGroup,
    PeeringName = tostring(peering.name),
    PeeringState = tostring(peering.properties.peeringState),
    RemoteVNet = tostring(split(tostring(peering.properties.remoteVirtualNetwork.id), "/")[8]),
    AllowForwardedTraffic = tostring(peering.properties.allowForwardedTraffic),
    UseRemoteGateways = tostring(peering.properties.useRemoteGateways)
| order by VNetName asc

If AllowForwardedTraffic and UseRemoteGateways are both true on most of your spokes, those spokes are configured to send traffic through the hub; the actual forced routing comes from UDRs pointing at the firewall. That's the traffic you should audit.
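A quick companion query if you're not sure which VNet is the hub in a sprawling subscription: the hub is almost always the VNet with the most peerings. This also runs in Azure Resource Graph Explorer:

```kql
// Spot the hub: the VNet with the most peerings
resources
| where type =~ "microsoft.network/virtualnetworks"
| extend PeeringCount = array_length(properties.virtualNetworkPeerings)
| project VNetName = name, VNetResourceGroup = resourceGroup, PeeringCount
| order by PeeringCount desc
```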


What this looks like in practice


I worked with a client last year who was running a classic hub and spoke with Azure Firewall Standard. They had 12 spoke VNets, mostly internal applications, one public facing web app, and a VPN gateway for their on premises connectivity. Their monthly Azure Firewall cost was $912 for deployment plus about $40 in data processing. Call it $950/month.


We looked at their firewall logs and found that 85% of the rule hits were network rules. Of those, the vast majority were spoke to spoke traffic between their application and database tiers. The application rules they actually used were limited to their outbound internet filtering for the public facing web app.


Here's what we changed. We kept Azure Firewall Standard for the public facing workload's outbound traffic and the VPN gateway traffic coming from on premises. We set up direct VNet peering between the application and database spoke VNets with NSG rules controlling the allowed ports and protocols. We removed the UDR (user defined route) that was forcing that internal traffic through the firewall.


The result: their Azure Firewall's data processing dropped by about 70% because most of the traffic was no longer flowing through it. The deployment cost stayed the same because you can't scale that down, but the overall architecture was simpler, faster, and positioned them to eventually evaluate whether Azure Firewall Basic would meet their reduced needs. If they downgrade to Basic, they'd save roughly $624/month ($7,488/year) on deployment alone.


The NSG design that actually works


If you're going to shift traffic off the firewall and onto NSGs, the NSG design matters. Poorly designed NSGs create the same kind of management headaches that drive teams to centralized firewalls in the first place.


Two things I always recommend. First, use Application Security Groups (ASGs) instead of hardcoding IP addresses in your NSG rules. ASGs let you group VMs by function (web servers, app servers, database servers) and write rules against those groups. When you add a new VM, you assign it to the right ASG and it automatically inherits the correct network rules. No IP address management.

Second, keep your NSG rules focused. Don't try to replicate a firewall's complexity in NSGs. NSGs work best with clear, simple rules: allow traffic from the app tier ASG to the database tier ASG on port 1433, deny everything else. If you find yourself writing 50 NSG rules on a single subnet, that's a sign you might need Azure Firewall for that specific traffic flow after all.


What to do next


Run the KQL queries above against your environment. If you find that 80% or more of your firewall traffic is network rules, you have room to simplify. Start with the highest volume spoke to spoke flows. Set up direct peering with NSGs for the traffic that doesn't need deep inspection. Keep the firewall for what it's actually good at: Layer 7 inspection, threat intelligence, and centralized outbound filtering.

Don't try to do this all at once. Pick one spoke to spoke flow, implement direct peering with NSGs, validate that security controls are equivalent, and measure the latency and cost impact. Then move to the next one.


The goal isn't to eliminate your firewall. The goal is to stop paying $11,000 a year for it to do work that a free service handles just fine.



If this sounds like your environment, I do free 30 minute Azure cost reviews. It's a conversation to see if there's a fit, not a sales pitch.
