Receive Backlog Queue

Linux Kernel Tuning for High Performance Networking Series

John H Patton
Level Up Coding


The receive backlog queue is the first of the queues where packets are stored during the 3-way handshake. To review the 3-way handshake, here’s the link to the primer:

This article covers the receive backlog queue in Linux kernels v2.4.20 through current, although some of these kernel settings may also exist in earlier versions.

TCP Receive Queue and netdev_max_backlog

Each CPU core can hold a number of received packets in a queue before the network stack is able to process them. If the queue fills faster than the stack can drain it, new packets are dropped and a dropped-packet counter is incremented. On servers with high burst traffic, the net.core.netdev_max_backlog setting should be increased to maximize the number of packets that can be queued for processing.

This is a per-CPU-core setting, so it needs to be set with that in mind.
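The current limit can be checked with sysctl. The stock default is typically 1000, though it varies by kernel and distribution, so the output below is only illustrative:

# sysctl net.core.netdev_max_backlog
net.core.netdev_max_backlog = 1000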

To determine whether this setting needs to be increased, the /proc/net/softnet_stat file can be monitored for dropped packets. Only the first three columns are of interest here:

# cat /proc/net/softnet_stat
000dfbfa 000000df 00000022 ...
000f2d7a 000000b1 0000003e ...
...

Each row in this file corresponds to a CPU core, and the values are hexadecimal counters: the first column is the number of packets the core has processed, the second column is the number of packets dropped, and the third column is time_squeeze, the number of times packet processing stopped because the processing budget or time limit was exhausted while work remained.
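Because the counters are hexadecimal, converting them to decimal makes monitoring easier. A minimal sketch, assuming GNU awk (for strtonum) and that rows map to CPU cores in order:

# awk '{ printf "cpu%d dropped=%d squeezed=%d\n", NR-1, strtonum("0x"$2), strtonum("0x"$3) }' /proc/net/softnet_stat
cpu0 dropped=223 squeezed=34
cpu1 dropped=177 squeezed=62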

Receive Backlog

If column 2 increases under load, this indicates that the kernel setting net.core.netdev_max_backlog should be increased. Keep increasing until this column stabilizes.

A recommended approach is to increase the backlog value by the change in column 2 over a 10-second period, plus some buffer. For example:

T0:  000000df
T10: 000001b2
T20: 00000291
Largest 10-second increase: 0xDF (223)

In the above example, an increase of the backlog by 256 (the 223-packet delta rounded up with some headroom) might give the receive queue enough capacity. Apply the new setting and monitor under load for a while to ensure it is sized correctly.
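A minimal sketch of that workflow using bash and sysctl follows. The 10-second window, the single CPU row being sampled, and the 32-packet headroom are all assumptions to adjust for your environment:

#!/bin/bash
# Sample the drop counter (column 2, hex) for CPU 0 twice, 10 seconds apart.
d1=$((16#$(awk 'NR==1 {print $2}' /proc/net/softnet_stat)))
sleep 10
d2=$((16#$(awk 'NR==1 {print $2}' /proc/net/softnet_stat)))

# New limit = current limit + drops seen in the interval + a small headroom buffer.
cur=$(sysctl -n net.core.netdev_max_backlog)
sysctl -w net.core.netdev_max_backlog=$((cur + (d2 - d1) + 32))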

Time Squeeze

If column 3 increases under load, it is an indication that the receive path was unable to process all available packets before the CPU budget was exhausted. The network stack uses two softirqs for packet processing, one for transmits and one for receives. The receive softirq polls the driver for new packets, and that polling is called NAPI (New API) polling. The only time received packets can be removed from the network interface card (NIC) for processing is during a NAPI poll.
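Per-CPU transmit and receive softirq activity is visible in /proc/softirqs, which is a quick way to confirm where the receive work is landing. The counts below are illustrative only:

# grep -E 'NET_TX|NET_RX' /proc/softirqs
      NET_TX:     104512      98231
      NET_RX:   19834712   20011458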

These kernel settings affect time squeeze:

  1. net.core.dev_weight
    Maximum number of packets the driver can receive during a single NAPI poll, per CPU.
  2. net.core.netdev_budget
    Maximum number of packets received in one NAPI polling cycle, totaled across all interfaces. A cycle also cannot exceed the time limit set by netdev_budget_usecs below.
  3. net.core.netdev_budget_usecs
    Maximum duration, in microseconds, of one NAPI polling cycle.

NAPI polling ends when netdev_budget_usecs have elapsed or netdev_budget packets have been processed, whichever comes first.
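All three settings can be inspected in a single sysctl call. The values shown are common stock defaults but differ across kernel versions and distributions, so treat them as illustrative:

# sysctl net.core.dev_weight net.core.netdev_budget net.core.netdev_budget_usecs
net.core.dev_weight = 64
net.core.netdev_budget = 300
net.core.netdev_budget_usecs = 2000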

When the third column in softnet_stat increases, it is typically because high-bandwidth (10Gbps+) traffic is adding more packets to the interface receive buffer than can be pulled off during NAPI polling. To let each cycle remove more packets without exceeding the budget, slowly increase net.core.netdev_budget until the third column stabilizes. The number of packets processed by NAPI is also limited by the netdev_budget_usecs time, so both settings may need adjustment.
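For example, a cautious bump might look like the following, with the final values persisted to a drop-in file so they survive a reboot. The file name and the numbers here are placeholders, not recommendations:

# sysctl -w net.core.netdev_budget=600
# sysctl -w net.core.netdev_budget_usecs=4000

# cat /etc/sysctl.d/90-napi-tuning.conf
net.core.netdev_budget = 600
net.core.netdev_budget_usecs = 4000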

Conclusion

Remember that larger netdev values take more CPU time to process and can adversely affect performance if set too high, so keep the changes small and incremental.

If any of the information in this article is inaccurate, please post a comment and I’ll update the article to correct the information.
