Merge tag 'net-next-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Paolo Abeni:
 "Core:

   - Introduce a config option to tweak MAX_SKB_FRAGS. Increasing the
     default value allows for better BIG TCP performances

   - Reduce compound page head access for zero-copy data transfers

   - RPS/RFS improvements, avoiding unneeded NET_RX_SOFTIRQ when
     possible

   - Threaded NAPI improvements, adding defer skb free support and
     unneeded softirq avoidance

   - Address dst_entry reference count scalability issues, via false
     sharing avoidance and optimize refcount tracking

   - Add lockless accesses annotation to sk_err[_soft]

   - Optimize again the skb struct layout

   - Extends the skb drop reasons to make it usable by multiple
     subsystems

   - Better const qualifier awareness for socket casts

  BPF:

   - Add skb and XDP typed dynptrs which allow BPF programs for more
     ergonomic and less brittle iteration through data and
     variable-sized accesses

   - Add a new BPF netfilter program type and minimal support to hook
     BPF programs to netfilter hooks such as prerouting or forward

   - Add more precise memory usage reporting for all BPF map types

   - Adds support for using {FOU,GUE} encap with an ipip device
     operating in collect_md mode and add a set of BPF kfuncs for
     controlling encap params

   - Allow BPF programs to detect at load time whether a particular
     kfunc exists or not, and also add support for this in light
     skeleton

   - Bigger batch of BPF verifier improvements to prepare for upcoming
     BPF open-coded iterators allowing for less restrictive looping
     capabilities

   - Rework RCU enforcement in the verifier, add kptr_rcu and enforce
     BPF programs to NULL-check before passing such pointers into kfunc

   - Add support for kptrs in percpu hashmaps, percpu LRU hashmaps and
     in local storage maps

   - Enable RCU semantics for task BPF kptrs and allow referenced kptr
     tasks to be stored in BPF maps

   - Add support for refcounted local kptrs to the verifier for allowing
     shared ownership, useful for adding a node to both the BPF list and
     rbtree

   - Add BPF verifier support for ST instructions in
     convert_ctx_access() which will help new -mcpu=v4 clang flag to
     start emitting them

   - Add ARM32 USDT support to libbpf

   - Improve bpftool's visual program dump which produces the control
     flow graph in a DOT format by adding C source inline annotations

  Protocols:

   - IPv4: Allow adding to IPv4 address a 'protocol' tag. Such value
     indicates the provenance of the IP address

   - IPv6: optimize route lookup, dropping unneeded R/W lock acquisition

   - Add the handshake upcall mechanism, allowing the user-space to
     implement generic TLS handshake on kernel's behalf

   - Bridge: support per-{Port, VLAN} neighbor suppression, increasing
     resilience to nodes failures

   - SCTP: add support for Fair Capacity and Weighted Fair Queueing
     schedulers

   - MPTCP: delay first subflow allocation up to its first usage. This
     will allow for later better LSM interaction

   - xfrm: Remove inner/outer modes from input/output path. These are
     not needed anymore

   - WiFi:
      - reduced neighbor report (RNR) handling for AP mode
      - HW timestamping support
      - support for randomized auth/deauth TA for PASN privacy
      - per-link debugfs for multi-link
      - TC offload support for mac80211 drivers
      - mac80211 mesh fast-xmit and fast-rx support
      - enable Wi-Fi 7 (EHT) mesh support

  Netfilter:

   - Add nf_tables 'brouting' support, to force a packet to be routed
     instead of being bridged

   - Update bridge netfilter and ovs conntrack helpers to handle IPv6
     Jumbo packets properly, i.e. fetch the packet length from
     hop-by-hop extension header. This is needed for BIT TCP support

   - The iptables 32bit compat interface isn't compiled in by default
     anymore

   - Move ip(6)tables builtin icmp matches to the udptcp one. This has
     the advantage that icmp/icmpv6 match doesn't load the
     iptables/ip6tables modules anymore when iptables-nft is used

   - Extended netlink error report for netdevice in flowtables and
     netdev/chains. Allow for incrementally add/delete devices to netdev
     basechain. Allow to create netdev chain without device

  Driver API:

   - Remove redundant Device Control Error Reporting Enable, as PCI core
     has already error reporting enabled at enumeration time

   - Move Multicast DB netlink handlers to core, allowing devices other
     then bridge to use them

   - Allow the page_pool to directly recycle the pages from safely
     localized NAPI

   - Implement lockless TX queue stop/wake combo macros, allowing for
     further code de-duplication and sanitization

   - Add YNL support for user headers and struct attrs

   - Add partial YNL specification for devlink

   - Add partial YNL specification for ethtool

   - Add tc-mqprio and tc-taprio support for preemptible traffic classes

   - Add tx push buf len param to ethtool, specifies the maximum number
     of bytes of a transmitted packet a driver can push directly to the
     underlying device

   - Add basic LED support for switch/phy

   - Add NAPI documentation, stop relaying on external links

   - Convert dsa_master_ioctl() to netdev notifier. This is a
     preparatory work to make the hardware timestamping layer selectable
     by user space

   - Add transceiver support and improve the error messages for CAN-FD
     controllers

  New hardware / drivers:

   - Ethernet:
      - AMD/Pensando core device support
      - MediaTek MT7981 SoC
      - MediaTek MT7988 SoC
      - Broadcom BCM53134 embedded switch
      - Texas Instruments CPSW9G ethernet switch
      - Qualcomm EMAC3 DWMAC ethernet
      - StarFive JH7110 SoC
      - NXP CBTX ethernet PHY

   - WiFi:
      - Apple M1 Pro/Max devices
      - RealTek rtl8710bu/rtl8188gu
      - RealTek rtl8822bs, rtl8822cs and rtl8821cs SDIO chipset

   - Bluetooth:
      - Realtek RTL8821CS, RTL8851B, RTL8852BS
      - Mediatek MT7663, MT7922
      - NXP w8997
      - Actions Semi ATS2851
      - QTI WCN6855
      - Marvell 88W8997

   - Can:
      - STMicroelectronics bxcan stm32f429

  Drivers:

   - Ethernet NICs:
      - Intel (1G, icg):
         - add tracking and reporting of QBV config errors
         - add support for configuring max SDU for each Tx queue
      - Intel (100G, ice):
         - refactor mailbox overflow detection to support Scalable IOV
         - GNSS interface optimization
      - Intel (i40e):
         - support XDP multi-buffer
      - nVidia/Mellanox:
         - add the support for linux bridge multicast offload
         - enable TC offload for egress and engress MACVLAN over bond
         - add support for VxLAN GBP encap/decap flows offload
         - extend packet offload to fully support libreswan
         - support tunnel mode in mlx5 IPsec packet offload
         - extend XDP multi-buffer support
         - support MACsec VLAN offload
         - add support for dynamic msix vectors allocation
         - drop RX page_cache and fully use page_pool
         - implement thermal zone to report NIC temperature
      - Netronome/Corigine:
         - add support for multi-zone conntrack offload
      - Solarflare/Xilinx:
         - support offloading TC VLAN push/pop actions to the MAE
         - support TC decap rules
         - support unicast PTP

   - Other NICs:
      - Broadcom (bnxt): enforce software based freq adjustments only on
        shared PHC NIC
      - RealTek (r8169): refactor to addess ASPM issues during NAPI poll
      - Micrel (lan8841): add support for PTP_PF_PEROUT
      - Cadence (macb): enable PTP unicast
      - Engleder (tsnep): add XDP socket zero-copy support
      - virtio-net: implement exact header length guest feature
      - veth: add page_pool support for page recycling
      - vxlan: add MDB data path support
      - gve: add XDP support for GQI-QPL format
      - geneve: accept every ethertype
      - macvlan: allow some packets to bypass broadcast queue
      - mana: add support for jumbo frame

   - Ethernet high-speed switches:
      - Microchip (sparx5): Add support for TC flower templates

   - Ethernet embedded switches:
      - Broadcom (b54):
         - configure 6318 and 63268 RGMII ports
      - Marvell (mv88e6xxx):
         - faster C45 bus scan
      - Microchip:
         - lan966x:
            - add support for IS1 VCAP
            - better TX/RX from/to CPU performances
         - ksz9477: add ETS Qdisc support
         - ksz8: enhance static MAC table operations and error handling
         - sama7g5: add PTP capability
      - NXP (ocelot):
         - add support for external ports
         - add support for preemptible traffic classes
      - Texas Instruments:
         - add CPSWxG SGMII support for J7200 and J721E

   - Intel WiFi (iwlwifi):
      - preparation for Wi-Fi 7 EHT and multi-link support
      - EHT (Wi-Fi 7) sniffer support
      - hardware timestamping support for some devices/firwmares
      - TX beacon protection on newer hardware

   - Qualcomm 802.11ax WiFi (ath11k):
      - MU-MIMO parameters support
      - ack signal support for management packets

   - RealTek WiFi (rtw88):
      - SDIO bus support
      - better support for some SDIO devices (e.g. MAC address from
        efuse)

   - RealTek WiFi (rtw89):
      - HW scan support for 8852b
      - better support for 6 GHz scanning
      - support for various newer firmware APIs
      - framework firmware backwards compatibility

   - MediaTek WiFi (mt76):
      - P2P support
      - mesh A-MSDU support
      - EHT (Wi-Fi 7) support
      - coredump support"

* tag 'net-next-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2078 commits)
  net: phy: hide the PHYLIB_LEDS knob
  net: phy: marvell-88x2222: remove unnecessary (void*) conversions
  tcp/udp: Fix memleaks of sk and zerocopy skbs with TX timestamp.
  net: amd: Fix link leak when verifying config failed
  net: phy: marvell: Fix inconsistent indenting in led_blink_set
  lan966x: Don't use xdp_frame when action is XDP_TX
  tsnep: Add XDP socket zero-copy TX support
  tsnep: Add XDP socket zero-copy RX support
  tsnep: Move skb receive action to separate function
  tsnep: Add functions for queue enable/disable
  tsnep: Rework TX/RX queue initialization
  tsnep: Replace modulo operation with mask
  net: phy: dp83867: Add led_brightness_set support
  net: phy: Fix reading LED reg property
  drivers: nfc: nfcsim: remove return value check of `dev_dir`
  net: phy: dp83867: Remove unnecessary (void*) conversions
  net: ethtool: coalesce: try to make user settings stick twice
  net: mana: Check if netdev/napi_alloc_frag returns single page
  net: mana: Rename mana_refill_rxoob and remove some empty lines
  net: veth: add page_pool stats
  ...
This commit is contained in:
Linus Torvalds
2023-04-26 16:07:23 -07:00
1925 changed files with 138826 additions and 47261 deletions

View File

@@ -76,8 +76,8 @@ struct {
/*
* The trace events for cpu_idle and cpu_frequency are taken from:
* /sys/kernel/debug/tracing/events/power/cpu_idle/format
* /sys/kernel/debug/tracing/events/power/cpu_frequency/format
* /sys/kernel/tracing/events/power/cpu_idle/format
* /sys/kernel/tracing/events/power/cpu_frequency/format
*
* These two events have same format, so define one common structure.
*/

View File

@@ -65,7 +65,7 @@ static void Usage(void);
static void read_trace_pipe2(void);
static void do_error(char *msg, bool errno_flag);
#define DEBUGFS "/sys/kernel/debug/tracing/"
#define TRACEFS "/sys/kernel/tracing/"
static struct bpf_program *bpf_prog;
static struct bpf_object *obj;
@@ -77,7 +77,7 @@ static void read_trace_pipe2(void)
FILE *outf;
char *outFname = "hbm_out.log";
trace_fd = open(DEBUGFS "trace_pipe", O_RDONLY, 0);
trace_fd = open(TRACEFS "trace_pipe", O_RDONLY, 0);
if (trace_fd < 0) {
printf("Error opening trace_pipe\n");
return;
@@ -315,6 +315,7 @@ static int run_bpf_prog(char *prog, int cg_id)
fout = fopen(fname, "w");
fprintf(fout, "id:%d\n", cg_id);
fprintf(fout, "ERROR: Could not lookup queue_stats\n");
fclose(fout);
} else if (stats_flag && qstats.lastPacketTime >
qstats.firstPacketTime) {
long long delta_us = (qstats.lastPacketTime -

View File

@@ -39,8 +39,8 @@ struct {
/* Taken from the current format defined in
* include/trace/events/ib_umad.h
* and
* /sys/kernel/debug/tracing/events/ib_umad/ib_umad_read/format
* /sys/kernel/debug/tracing/events/ib_umad/ib_umad_write/format
* /sys/kernel/tracing/events/ib_umad/ib_umad_read/format
* /sys/kernel/tracing/events/ib_umad/ib_umad_write/format
*/
struct ib_umad_rw_args {
u64 pad;

View File

@@ -5,7 +5,7 @@ NS1=lwt_ns1
VETH0=tst_lwt1a
VETH1=tst_lwt1b
BPF_PROG=lwt_len_hist.bpf.o
TRACE_ROOT=/sys/kernel/debug/tracing
TRACE_ROOT=/sys/kernel/tracing
function cleanup {
# To reset saved histogram, remove pinned map

View File

@@ -110,7 +110,7 @@ static inline int update_counts(void *ctx, u32 pid, u64 delta)
}
#if 1
/* taken from /sys/kernel/debug/tracing/events/sched/sched_switch/format */
/* taken from /sys/kernel/tracing/events/sched/sched_switch/format */
struct sched_switch_args {
unsigned long long pad;
char prev_comm[TASK_COMM_LEN];

View File

@@ -21,10 +21,10 @@
#define DEFAULT_FREQ 99
#define DEFAULT_SECS 5
#define MAX_IPS 8192
#define PAGE_OFFSET 0xffff880000000000
static int map_fd;
static int nr_cpus;
static long _text_addr;
static void usage(void)
{
@@ -108,7 +108,7 @@ static void print_ip_map(int fd)
/* sort and print */
qsort(counts, max, sizeof(struct ipcount), count_cmp);
for (i = 0; i < max; i++) {
if (counts[i].ip > PAGE_OFFSET) {
if (counts[i].ip > _text_addr) {
sym = ksym_search(counts[i].ip);
if (!sym) {
printf("ksym not found. Is kallsyms loaded?\n");
@@ -169,6 +169,13 @@ int main(int argc, char **argv)
return 2;
}
/* used to determine whether the address is kernel space */
_text_addr = ksym_get_addr("_text");
if (!_text_addr) {
fprintf(stderr, "ERROR: no '_text' in /proc/kallsyms\n");
return 3;
}
/* create perf FDs for each CPU */
nr_cpus = sysconf(_SC_NPROCESSORS_ONLN);
links = calloc(nr_cpus, sizeof(struct bpf_link *));

View File

@@ -235,7 +235,7 @@ static int test_debug_fs_uprobe(char *binary_path, long offset, bool is_return)
struct bpf_link *link;
ssize_t bytes;
snprintf(buf, sizeof(buf), "/sys/kernel/debug/tracing/%s_events",
snprintf(buf, sizeof(buf), "/sys/kernel/tracing/%s_events",
event_type);
kfd = open(buf, O_WRONLY | O_TRUNC, 0);
CHECK_PERROR_RET(kfd < 0);
@@ -252,7 +252,7 @@ static int test_debug_fs_uprobe(char *binary_path, long offset, bool is_return)
close(kfd);
kfd = -1;
snprintf(buf, sizeof(buf), "/sys/kernel/debug/tracing/events/%ss/%s/id",
snprintf(buf, sizeof(buf), "/sys/kernel/tracing/events/%ss/%s/id",
event_type, event_alias);
efd = open(buf, O_RDONLY, 0);
CHECK_PERROR_RET(efd < 0);

View File

@@ -21,7 +21,7 @@ IP_LOCAL="192.168.99.1"
PROG_SRC="test_lwt_bpf.c"
BPF_PROG="test_lwt_bpf.o"
TRACE_ROOT=/sys/kernel/debug/tracing
TRACE_ROOT=/sys/kernel/tracing
CONTEXT_INFO=$(cat ${TRACE_ROOT}/trace_options | grep context)
function lookup_mac()

View File

@@ -7,7 +7,7 @@
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
/* from /sys/kernel/debug/tracing/events/task/task_rename/format */
/* from /sys/kernel/tracing/events/task/task_rename/format */
struct task_rename {
__u64 pad;
__u32 pid;
@@ -21,7 +21,7 @@ int prog(struct task_rename *ctx)
return 0;
}
/* from /sys/kernel/debug/tracing/events/fib/fib_table_lookup/format */
/* from /sys/kernel/tracing/events/fib/fib_table_lookup/format */
struct fib_table_lookup {
__u64 pad;
__u32 tb_id;