Merge tag 'rcu.2023.02.10a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu

Pull RCU updates from Paul McKenney:

 - Documentation updates

 - Miscellaneous fixes, perhaps most notably:

    - Throttling callback invocation based on the number of callbacks
      that are now ready to invoke instead of on the total number of
      callbacks

    - Several patches that suppress false-positive boot-time
      diagnostics, for example, due to lockdep not yet being initialized

    - Make expedited RCU CPU stall warnings dump stacks of any tasks
      that are blocking the stalled grace period. (Normal RCU CPU stall
      warnings have done this for many years)

    - Lazy-callback fixes to avoid delays during boot, suspend, and
      resume. (Note that lazy callbacks must be explicitly enabled, so
      this should not (yet) affect production use cases)

 - Make kfree_rcu() and friends take advantage of polled grace periods,
   thus reducing memory footprint by almost two orders of magnitude,
   admittedly on a microbenchmark

   This also begins the transition from kfree_rcu(p) to
   kfree_rcu_mightsleep(p). This transition was motivated by bugs where
   kfree_rcu(p), which can block, was typed instead of the intended
   kfree_rcu(p, rh)

 - SRCU updates, perhaps most notably fixing a bug that causes SRCU to
   fail when booted on a system with a non-zero boot CPU. This
   surprising situation actually happens for kdump kernels on the
   powerpc architecture

   This also adds an srcu_down_read() and srcu_up_read(), which act
   like srcu_read_lock() and srcu_read_unlock(), but allow an SRCU
   read-side critical section to be handed off from one task to another

 - Clean up the now-useless SRCU Kconfig option

   There are a few more commits that are not yet acked or pulled into
   maintainer trees, and these will be in a pull request for a later
   merge window

 - RCU-tasks updates, perhaps most notably these fixes:

    - A strange interaction between PID-namespace unshare and the
      RCU-tasks grace period that results in a low-probability but very
      real hang

    - A race between an RCU tasks rude grace period on a single-CPU
      system and CPU-hotplug addition of the second CPU that can result
      in a too-short grace period

    - A race between shrinking RCU tasks down to a single callback list
      and queuing a new callback to some other CPU, but where that
      queuing is delayed for more than an RCU grace period. This can
      result in that callback being stranded on the non-boot CPU

 - Torture-test updates and fixes

 - Torture-test scripting updates and fixes

 - Provide additional RCU CPU stall-warning information in kernels
   built with CONFIG_RCU_CPU_STALL_CPUTIME=y, and restore the full
   five-minute timeout limit for expedited RCU CPU stall warnings

* tag 'rcu.2023.02.10a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (80 commits)
  rcu/kvfree: Add kvfree_rcu_mightsleep() and kfree_rcu_mightsleep()
  kernel/notifier: Remove CONFIG_SRCU
  init: Remove "select SRCU"
  fs/quota: Remove "select SRCU"
  fs/notify: Remove "select SRCU"
  fs/btrfs: Remove "select SRCU"
  fs: Remove CONFIG_SRCU
  drivers/pci/controller: Remove "select SRCU"
  drivers/net: Remove "select SRCU"
  drivers/md: Remove "select SRCU"
  drivers/hwtracing/stm: Remove "select SRCU"
  drivers/dax: Remove "select SRCU"
  drivers/base: Remove CONFIG_SRCU
  rcu: Disable laziness if lazy-tracking says so
  rcu: Track laziness during boot and suspend
  rcu: Remove redundant call to rcu_boost_kthread_setaffinity()
  rcu: Allow up to five minutes expedited RCU CPU stall-warning timeouts
  rcu: Align the output of RCU CPU stall warning messages
  rcu: Add RCU stall diagnosis information
  sched: Add helper nr_context_switches_cpu()
  ...
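The kfree_rcu(p) to kfree_rcu_mightsleep(p) transition mentioned above is easiest to see with the two forms side by side. The sketch below is illustrative only; struct demo_obj and demo_free() are invented names, not anything from the series::

	#include <linux/rcupdate.h>
	#include <linux/slab.h>

	/* Hypothetical RCU-protected object, for illustration only. */
	struct demo_obj {
		int value;
		struct rcu_head rh;	/* needed by two-argument kfree_rcu() */
	};

	static void demo_free(struct demo_obj *atomic_ctx_obj,
			      struct demo_obj *process_ctx_obj)
	{
		/*
		 * Two-argument form: queues an RCU callback and never blocks,
		 * so it may be used from atomic context.  The second argument
		 * names the rcu_head field within the structure.
		 */
		kfree_rcu(atomic_ctx_obj, rh);

		/*
		 * Single-argument form, now spelled kfree_rcu_mightsleep():
		 * needs no rcu_head, but may block waiting for a grace
		 * period, so it is only legal from sleepable context.
		 */
		kfree_rcu_mightsleep(process_ctx_obj);
	}

The rename makes it much harder to reach for the blocking single-argument form by accident when the two-argument form was intended.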
@@ -8,7 +8,7 @@ Although RCU is usually used to protect read-mostly data structures,
 it is possible to use RCU to provide dynamic non-maskable interrupt
 handlers, as well as dynamic irq handlers. This document describes
 how to do this, drawing loosely from Zwane Mwaikambo's NMI-timer
-work in "arch/x86/kernel/traps.c".
+work in an old version of "arch/x86/kernel/traps.c".
 
 The relevant pieces of code are listed below, each followed by a
 brief explanation::
@@ -116,7 +116,7 @@ Answer to Quick Quiz:
 
 This same sad story can happen on other CPUs when using
 a compiler with aggressive pointer-value speculation
-optimizations.
+optimizations. (But please don't!)
 
 More important, the rcu_dereference_sched() makes it
 clear to someone reading the code that the pointer is
@@ -38,7 +38,7 @@ by having call_rcu() directly invoke its arguments only if it was called
 from process context. However, this can fail in a similar manner.
 
 Suppose that an RCU-based algorithm again scans a linked list containing
-elements A, B, and C in process contexts, but that it invokes a function
+elements A, B, and C in process context, but that it invokes a function
 on each element as it is scanned. Suppose further that this function
 deletes element B from the list, then passes it to call_rcu() for deferred
 freeing. This may be a bit unconventional, but it is perfectly legal
@@ -59,7 +59,8 @@ Example 3: Death by Deadlock
 Suppose that call_rcu() is invoked while holding a lock, and that the
 callback function must acquire this same lock. In this case, if
 call_rcu() were to directly invoke the callback, the result would
-be self-deadlock.
+be self-deadlock *even if* this invocation occurred from a later
+call_rcu() invocation a full grace period later.
 
 In some cases, it would possible to restructure to code so that
 the call_rcu() is delayed until after the lock is released. However,
@@ -85,6 +86,14 @@ Quick Quiz #2:
 
 :ref:`Answers to Quick Quiz <answer_quick_quiz_up>`
 
+It is important to note that userspace RCU implementations *do*
+permit call_rcu() to directly invoke callbacks, but only if a full
+grace period has elapsed since those callbacks were queued. This is
+the case because some userspace environments are extremely constrained.
+Nevertheless, people writing userspace RCU implementations are strongly
+encouraged to avoid invoking callbacks from call_rcu(), thus obtaining
+the deadlock-avoidance benefits called out above.
+
 Summary
 -------
 
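To make the self-deadlock scenario in the hunk above concrete, here is a minimal sketch of the usual pattern that avoids it; struct foo, foo_lock, and the function names are invented for illustration::

	#include <linux/rcupdate.h>
	#include <linux/rculist.h>
	#include <linux/slab.h>
	#include <linux/spinlock.h>

	struct foo {
		struct list_head list;
		struct rcu_head rh;
	};

	static DEFINE_SPINLOCK(foo_lock);

	/* RCU callback: runs later from softirq context, lock not held. */
	static void foo_reclaim(struct rcu_head *rh)
	{
		struct foo *fp = container_of(rh, struct foo, rh);

		spin_lock(&foo_lock);	/* safe: updater released it long ago */
		/* ... final bookkeeping under the lock ... */
		spin_unlock(&foo_lock);
		kfree(fp);
	}

	static void foo_del(struct foo *fp)
	{
		spin_lock_bh(&foo_lock);	/* _bh: the callback runs in softirq */
		list_del_rcu(&fp->list);
		/*
		 * If call_rcu() invoked foo_reclaim() directly here, the
		 * callback's spin_lock(&foo_lock) would self-deadlock.
		 * Deferring the callback past a grace period avoids that.
		 */
		call_rcu(&fp->rh, foo_reclaim);
		spin_unlock_bh(&foo_lock);
	}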
@@ -69,9 +69,8 @@ checking of rcu_dereference() primitives:
 value of the pointer itself, for example, against NULL.
 
 The rcu_dereference_check() check expression can be any boolean
-expression, but would normally include a lockdep expression. However,
-any boolean expression can be used. For a moderately ornate example,
-consider the following::
+expression, but would normally include a lockdep expression. For a
+moderately ornate example, consider the following::
 
 file = rcu_dereference_check(fdt->fd[fd],
 lockdep_is_held(&files->file_lock) ||
@@ -97,10 +96,10 @@ code, it could instead be written as follows::
 atomic_read(&files->count) == 1);
 
 This would verify cases #2 and #3 above, and furthermore lockdep would
-complain if this was used in an RCU read-side critical section unless one
-of these two cases held. Because rcu_dereference_protected() omits all
-barriers and compiler constraints, it generates better code than do the
-other flavors of rcu_dereference(). On the other hand, it is illegal
+complain even if this was used in an RCU read-side critical section unless
+one of these two cases held. Because rcu_dereference_protected() omits
+all barriers and compiler constraints, it generates better code than do
+the other flavors of rcu_dereference(). On the other hand, it is illegal
 to use rcu_dereference_protected() if either the RCU-protected pointer
 or the RCU-protected data that it points to can change concurrently.
 
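As a concrete (and hypothetical; gp, gp_lock, and struct gizmo are invented names) illustration of the reader/updater split described above::

	#include <linux/rcupdate.h>
	#include <linux/spinlock.h>

	struct gizmo {
		int val;
	};

	static DEFINE_SPINLOCK(gp_lock);
	static struct gizmo __rcu *gp;

	/* Reader: needs rcu_dereference()'s ordering and compiler constraints. */
	static int gizmo_read_val(void)
	{
		struct gizmo *g;
		int ret = -1;

		rcu_read_lock();
		g = rcu_dereference(gp);
		if (g)
			ret = g->val;
		rcu_read_unlock();
		return ret;
	}

	/*
	 * Updater: gp_lock prevents the pointer from changing, so the cheaper
	 * rcu_dereference_protected() is both legal and generates better code.
	 */
	static void gizmo_bump(void)
	{
		struct gizmo *g;

		spin_lock(&gp_lock);
		g = rcu_dereference_protected(gp, lockdep_is_held(&gp_lock));
		if (g)
			g->val++;
		spin_unlock(&gp_lock);
	}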
@@ -77,15 +77,17 @@ Frequently Asked Questions
 search for the string "Patent" in Documentation/RCU/RTFP.txt to find them.
 Of these, one was allowed to lapse by the assignee, and the
 others have been contributed to the Linux kernel under GPL.
+Many (but not all) have long since expired.
 There are now also LGPL implementations of user-level RCU
 available (https://liburcu.org/).
 
 - I hear that RCU needs work in order to support realtime kernels?
 
-Realtime-friendly RCU can be enabled via the CONFIG_PREEMPT_RCU
+Realtime-friendly RCU are enabled via the CONFIG_PREEMPTION
 kernel configuration parameter.
 
 - Where can I find more information on RCU?
 
 See the Documentation/RCU/RTFP.txt file.
-Or point your browser at (http://www.rdrop.com/users/paulmck/RCU/).
+Or point your browser at (https://docs.google.com/document/d/1X0lThx8OK0ZgLMqVoXiR4ZrGURHrXK6NyLRbeXe3Xac/edit)
+or (https://docs.google.com/document/d/1GCdQC8SDbb54W1shjEXqGZ0Rq8a6kIeYutdSIajfpLA/edit?usp=sharing).
@@ -19,8 +19,9 @@ Follow these rules to keep your RCU code working properly:
 can reload the value, and won't your code have fun with two
 different values for a single pointer! Without rcu_dereference(),
 DEC Alpha can load a pointer, dereference that pointer, and
-return data preceding initialization that preceded the store of
-the pointer.
+return data preceding initialization that preceded the store
+of the pointer. (As noted later, in recent kernels READ_ONCE()
+also prevents DEC Alpha from playing these tricks.)
 
 In addition, the volatile cast in rcu_dereference() prevents the
 compiler from deducing the resulting pointer value. Please see
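For readers who want both sides of this rule in one place, here is a minimal publish/subscribe sketch; struct blob and the function names are illustrative, not from the document::

	#include <linux/errno.h>
	#include <linux/rcupdate.h>
	#include <linux/slab.h>

	struct blob {
		int a;
		int b;
	};

	static struct blob __rcu *global_blob;

	/* Publisher: initialize fully, then make the pointer visible. */
	static int blob_publish(int a, int b)
	{
		struct blob *p = kmalloc(sizeof(*p), GFP_KERNEL);

		if (!p)
			return -ENOMEM;
		p->a = a;
		p->b = b;
		/* Orders the initialization above before the pointer store. */
		rcu_assign_pointer(global_blob, p);
		return 0;
	}

	/*
	 * Subscriber: rcu_dereference() orders the load of ->a after the load
	 * of the pointer, even on DEC Alpha, and keeps the compiler honest.
	 */
	static int blob_read_a(void)
	{
		struct blob *p;
		int ret = 0;

		rcu_read_lock();
		p = rcu_dereference(global_blob);
		if (p)
			ret = p->a;
		rcu_read_unlock();
		return ret;
	}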
@@ -34,7 +35,7 @@ Follow these rules to keep your RCU code working properly:
 takes on the role of the lockless_dereference() primitive that
 was removed in v4.15.
 
-- You are only permitted to use rcu_dereference on pointer values.
+- You are only permitted to use rcu_dereference() on pointer values.
 The compiler simply knows too much about integral values to
 trust it to carry dependencies through integer operations.
 There are a very few exceptions, namely that you can temporarily
@@ -240,6 +241,7 @@ precautions. To see this, consider the following code fragment::
 struct foo *q;
 int r1, r2;
 
+rcu_read_lock();
 p = rcu_dereference(gp2);
 if (p == NULL)
 return;
@@ -248,7 +250,10 @@ precautions. To see this, consider the following code fragment::
 if (p == q) {
 /* The compiler decides that q->c is same as p->c. */
 r2 = p->c; /* Could get 44 on weakly order system. */
+} else {
+r2 = p->c - r1; /* Unconditional access to p->c. */
 }
+rcu_read_unlock();
 do_something_with(r1, r2);
 }
 
@@ -297,6 +302,7 @@ Then one approach is to use locking, for example, as follows::
 struct foo *q;
 int r1, r2;
 
+rcu_read_lock();
 p = rcu_dereference(gp2);
 if (p == NULL)
 return;
@@ -306,7 +312,12 @@ Then one approach is to use locking, for example, as follows::
 if (p == q) {
 /* The compiler decides that q->c is same as p->c. */
 r2 = p->c; /* Locking guarantees r2 == 144. */
+} else {
+spin_lock(&q->lock);
+r2 = q->c - r1;
+spin_unlock(&q->lock);
 }
+rcu_read_unlock();
 spin_unlock(&p->lock);
 do_something_with(r1, r2);
 }
@@ -364,7 +375,7 @@ the exact value of "p" even in the not-equals case. This allows the
 compiler to make the return values independent of the load from "gp",
 in turn destroying the ordering between this load and the loads of the
 return values. This can result in "p->b" returning pre-initialization
-garbage values.
+garbage values on weakly ordered systems.
 
 In short, rcu_dereference() is *not* optional when you are going to
 dereference the resulting pointer.
@@ -430,7 +441,7 @@ member of the rcu_dereference() to use in various situations:
 SPARSE CHECKING OF RCU-PROTECTED POINTERS
 -----------------------------------------
 
-The sparse static-analysis tool checks for direct access to RCU-protected
+The sparse static-analysis tool checks for non-RCU access to RCU-protected
 pointers, which can result in "interesting" bugs due to compiler
 optimizations involving invented loads and perhaps also load tearing.
 For example, suppose someone mistakenly does something like this::
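A minimal sketch of the __rcu annotation that sparse checks; struct conf and cur_conf are invented names, and the sparse run is the usual "make C=1"::

	#include <linux/rcupdate.h>

	struct conf {
		int level;
	};

	/* The __rcu annotation marks this pointer as RCU-protected for sparse. */
	static struct conf __rcu *cur_conf;

	static int conf_level(void)
	{
		struct conf *c;
		int level = 0;

		rcu_read_lock();
		/*
		 * Correct: rcu_dereference() strips the __rcu address space
		 * and adds the needed compiler/ordering constraints.  A bare
		 * "cur_conf->level" here would instead draw a sparse warning
		 * (typically "dereference of noderef expression").
		 */
		c = rcu_dereference(cur_conf);
		if (c)
			level = c->level;
		rcu_read_unlock();
		return level;
	}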
@@ -5,37 +5,12 @@ RCU and Unloadable Modules
 
 [Originally published in LWN Jan. 14, 2007: http://lwn.net/Articles/217484/]
 
-RCU (read-copy update) is a synchronization mechanism that can be thought
-of as a replacement for read-writer locking (among other things), but with
-very low-overhead readers that are immune to deadlock, priority inversion,
-and unbounded latency. RCU read-side critical sections are delimited
-by rcu_read_lock() and rcu_read_unlock(), which, in non-CONFIG_PREEMPTION
-kernels, generate no code whatsoever.
-
-This means that RCU writers are unaware of the presence of concurrent
-readers, so that RCU updates to shared data must be undertaken quite
-carefully, leaving an old version of the data structure in place until all
-pre-existing readers have finished. These old versions are needed because
-such readers might hold a reference to them. RCU updates can therefore be
-rather expensive, and RCU is thus best suited for read-mostly situations.
-
-How can an RCU writer possibly determine when all readers are finished,
-given that readers might well leave absolutely no trace of their
-presence? There is a synchronize_rcu() primitive that blocks until all
-pre-existing readers have completed. An updater wishing to delete an
-element p from a linked list might do the following, while holding an
-appropriate lock, of course::
-
-	list_del_rcu(p);
-	synchronize_rcu();
-	kfree(p);
-
-But the above code cannot be used in IRQ context -- the call_rcu()
-primitive must be used instead. This primitive takes a pointer to an
-rcu_head struct placed within the RCU-protected data structure and
-another pointer to a function that may be invoked later to free that
-structure. Code to delete an element p from the linked list from IRQ
-context might then be as follows::
+RCU updaters sometimes use call_rcu() to initiate an asynchronous wait for
+a grace period to elapse. This primitive takes a pointer to an rcu_head
+struct placed within the RCU-protected data structure and another pointer
+to a function that may be invoked later to free that structure. Code to
+delete an element p from the linked list from IRQ context might then be
+as follows::
 
 	list_del_rcu(p);
 	call_rcu(&p->rcu, p_callback);
@@ -54,7 +29,7 @@ IRQ context. The function p_callback() might be defined as follows::
 Unloading Modules That Use call_rcu()
 -------------------------------------
 
-But what if p_callback is defined in an unloadable module?
+But what if the p_callback() function is defined in an unloadable module?
 
 If we unload the module while some RCU callbacks are pending,
 the CPUs executing these callbacks are going to be severely
@@ -67,20 +42,21 @@ grace period to elapse, it does not wait for the callbacks to complete.
 
 One might be tempted to try several back-to-back synchronize_rcu()
 calls, but this is still not guaranteed to work. If there is a very
-heavy RCU-callback load, then some of the callbacks might be deferred
-in order to allow other processing to proceed. Such deferral is required
-in realtime kernels in order to avoid excessive scheduling latencies.
+heavy RCU-callback load, then some of the callbacks might be deferred in
+order to allow other processing to proceed. For but one example, such
+deferral is required in realtime kernels in order to avoid excessive
+scheduling latencies.
 
 
 rcu_barrier()
 -------------
 
-We instead need the rcu_barrier() primitive. Rather than waiting for
-a grace period to elapse, rcu_barrier() waits for all outstanding RCU
-callbacks to complete. Please note that rcu_barrier() does **not** imply
-synchronize_rcu(), in particular, if there are no RCU callbacks queued
-anywhere, rcu_barrier() is within its rights to return immediately,
-without waiting for a grace period to elapse.
+This situation can be handled by the rcu_barrier() primitive. Rather
+than waiting for a grace period to elapse, rcu_barrier() waits for all
+outstanding RCU callbacks to complete. Please note that rcu_barrier()
+does **not** imply synchronize_rcu(), in particular, if there are no RCU
+callbacks queued anywhere, rcu_barrier() is within its rights to return
+immediately, without waiting for anything, let alone a grace period.
 
 Pseudo-code using rcu_barrier() is as follows:
 
@@ -89,83 +65,86 @@ Pseudo-code using rcu_barrier() is as follows:
 3. Allow the module to be unloaded.
 
 There is also an srcu_barrier() function for SRCU, and you of course
-must match the flavor of rcu_barrier() with that of call_rcu(). If your
-module uses multiple flavors of call_rcu(), then it must also use multiple
-flavors of rcu_barrier() when unloading that module. For example, if
-it uses call_rcu(), call_srcu() on srcu_struct_1, and call_srcu() on
-srcu_struct_2, then the following three lines of code will be required
-when unloading::
+must match the flavor of srcu_barrier() with that of call_srcu().
+If your module uses multiple srcu_struct structures, then it must also
+use multiple invocations of srcu_barrier() when unloading that module.
+For example, if it uses call_rcu(), call_srcu() on srcu_struct_1, and
+call_srcu() on srcu_struct_2, then the following three lines of code
+will be required when unloading::
 
 1 rcu_barrier();
 2 srcu_barrier(&srcu_struct_1);
 3 srcu_barrier(&srcu_struct_2);
 
-The rcutorture module makes use of rcu_barrier() in its exit function
-as follows::
+If latency is of the essence, workqueues could be used to run these
+three functions concurrently.
+
+An ancient version of the rcutorture module makes use of rcu_barrier()
+in its exit function as follows::
 
 1 static void
 2 rcu_torture_cleanup(void)
 3 {
 4 int i;
 5
 6 fullstop = 1;
 7 if (shuffler_task != NULL) {
 8 VERBOSE_PRINTK_STRING("Stopping rcu_torture_shuffle task");
 9 kthread_stop(shuffler_task);
 10 }
 11 shuffler_task = NULL;
 12
 13 if (writer_task != NULL) {
 14 VERBOSE_PRINTK_STRING("Stopping rcu_torture_writer task");
 15 kthread_stop(writer_task);
 16 }
 17 writer_task = NULL;
 18
 19 if (reader_tasks != NULL) {
 20 for (i = 0; i < nrealreaders; i++) {
 21 if (reader_tasks[i] != NULL) {
 22 VERBOSE_PRINTK_STRING(
 23 "Stopping rcu_torture_reader task");
 24 kthread_stop(reader_tasks[i]);
 25 }
 26 reader_tasks[i] = NULL;
 27 }
 28 kfree(reader_tasks);
 29 reader_tasks = NULL;
 30 }
 31 rcu_torture_current = NULL;
 32
 33 if (fakewriter_tasks != NULL) {
 34 for (i = 0; i < nfakewriters; i++) {
 35 if (fakewriter_tasks[i] != NULL) {
 36 VERBOSE_PRINTK_STRING(
 37 "Stopping rcu_torture_fakewriter task");
 38 kthread_stop(fakewriter_tasks[i]);
 39 }
 40 fakewriter_tasks[i] = NULL;
 41 }
 42 kfree(fakewriter_tasks);
 43 fakewriter_tasks = NULL;
 44 }
 45
 46 if (stats_task != NULL) {
 47 VERBOSE_PRINTK_STRING("Stopping rcu_torture_stats task");
 48 kthread_stop(stats_task);
 49 }
 50 stats_task = NULL;
 51
 52 /* Wait for all RCU callbacks to fire. */
 53 rcu_barrier();
 54
 55 rcu_torture_stats_print(); /* -After- the stats thread is stopped! */
 56
 57 if (cur_ops->cleanup != NULL)
 58 cur_ops->cleanup();
 59 if (atomic_read(&n_rcu_torture_error))
 60 rcu_torture_print_module_parms("End of test: FAILURE");
 61 else
 62 rcu_torture_print_module_parms("End of test: SUCCESS");
 63 }
 
 Line 6 sets a global variable that prevents any RCU callbacks from
 re-posting themselves. This will not be necessary in most cases, since
@@ -190,16 +169,17 @@ Quick Quiz #1:
 :ref:`Answer to Quick Quiz #1 <answer_rcubarrier_quiz_1>`
 
 Your module might have additional complications. For example, if your
-module invokes call_rcu() from timers, you will need to first cancel all
-the timers, and only then invoke rcu_barrier() to wait for any remaining
+module invokes call_rcu() from timers, you will need to first refrain
+from posting new timers, cancel (or wait for) all the already-posted
+timers, and only then invoke rcu_barrier() to wait for any remaining
 RCU callbacks to complete.
 
-Of course, if you module uses call_rcu(), you will need to invoke
+Of course, if your module uses call_rcu(), you will need to invoke
 rcu_barrier() before unloading. Similarly, if your module uses
 call_srcu(), you will need to invoke srcu_barrier() before unloading,
 and on the same srcu_struct structure. If your module uses call_rcu()
-**and** call_srcu(), then you will need to invoke rcu_barrier() **and**
-srcu_barrier().
+**and** call_srcu(), then (as noted above) you will need to invoke
+rcu_barrier() **and** srcu_barrier().
 
 
 Implementing rcu_barrier()
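Putting the advice above together, a module's exit path might look roughly like the following sketch. All demo_* names are invented, and the timer handler that checks demo_exiting before re-arming is not shown::

	#include <linux/module.h>
	#include <linux/rcupdate.h>
	#include <linux/srcu.h>
	#include <linux/timer.h>

	static struct timer_list demo_timer;	/* set up via timer_setup() at init */
	DEFINE_STATIC_SRCU(demo_srcu);
	static bool demo_exiting;

	static void __exit demo_exit(void)
	{
		/* 1. Stop posting new callbacks: quiesce the callback sources. */
		WRITE_ONCE(demo_exiting, true);	/* handler stops re-arming */
		del_timer_sync(&demo_timer);

		/* 2. Wait for callbacks already queued via call_rcu(). */
		rcu_barrier();

		/* 3. Separately wait for callbacks queued via call_srcu(). */
		srcu_barrier(&demo_srcu);

		/* Only now is it safe for the callback code to be unloaded. */
	}
	module_exit(demo_exit);
	MODULE_LICENSE("GPL");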
@@ -211,27 +191,40 @@ queues. His implementation queues an RCU callback on each of the per-CPU
 callback queues, and then waits until they have all started executing, at
 which point, all earlier RCU callbacks are guaranteed to have completed.
 
-The original code for rcu_barrier() was as follows::
+The original code for rcu_barrier() was roughly as follows::
 
 1 void rcu_barrier(void)
 2 {
 3 BUG_ON(in_interrupt());
 4 /* Take cpucontrol mutex to protect against CPU hotplug */
 5 mutex_lock(&rcu_barrier_mutex);
 6 init_completion(&rcu_barrier_completion);
-7 atomic_set(&rcu_barrier_cpu_count, 0);
+7 atomic_set(&rcu_barrier_cpu_count, 1);
 8 on_each_cpu(rcu_barrier_func, NULL, 0, 1);
-9 wait_for_completion(&rcu_barrier_completion);
-10 mutex_unlock(&rcu_barrier_mutex);
-11 }
+9 if (atomic_dec_and_test(&rcu_barrier_cpu_count))
+10 complete(&rcu_barrier_completion);
+11 wait_for_completion(&rcu_barrier_completion);
+12 mutex_unlock(&rcu_barrier_mutex);
+13 }
 
-Line 3 verifies that the caller is in process context, and lines 5 and 10
+Line 3 verifies that the caller is in process context, and lines 5 and 12
 use rcu_barrier_mutex to ensure that only one rcu_barrier() is using the
 global completion and counters at a time, which are initialized on lines
 6 and 7. Line 8 causes each CPU to invoke rcu_barrier_func(), which is
 shown below. Note that the final "1" in on_each_cpu()'s argument list
 ensures that all the calls to rcu_barrier_func() will have completed
-before on_each_cpu() returns. Line 9 then waits for the completion.
+before on_each_cpu() returns. Line 9 removes the initial count from
+rcu_barrier_cpu_count, and if this count is now zero, line 10 finalizes
+the completion, which prevents line 11 from blocking. Either way,
+line 11 then waits (if needed) for the completion.
+
+.. _rcubarrier_quiz_2:
+
+Quick Quiz #2:
+Why doesn't line 8 initialize rcu_barrier_cpu_count to zero,
+thereby avoiding the need for lines 9 and 10?
+
+:ref:`Answer to Quick Quiz #2 <answer_rcubarrier_quiz_2>`
 
 This code was rewritten in 2008 and several times thereafter, but this
 still gives the general idea.
@@ -239,21 +232,21 @@ still gives the general idea.
 The rcu_barrier_func() runs on each CPU, where it invokes call_rcu()
 to post an RCU callback, as follows::
 
 1 static void rcu_barrier_func(void *notused)
 2 {
 3 int cpu = smp_processor_id();
 4 struct rcu_data *rdp = &per_cpu(rcu_data, cpu);
 5 struct rcu_head *head;
 6
 7 head = &rdp->barrier;
 8 atomic_inc(&rcu_barrier_cpu_count);
 9 call_rcu(head, rcu_barrier_callback);
 10 }
 
 Lines 3 and 4 locate RCU's internal per-CPU rcu_data structure,
 which contains the struct rcu_head that needed for the later call to
 call_rcu(). Line 7 picks up a pointer to this struct rcu_head, and line
-8 increments a global counter. This counter will later be decremented
+8 increments the global counter. This counter will later be decremented
 by the callback. Line 9 then registers the rcu_barrier_callback() on
 the current CPU's queue.
 
@@ -261,33 +254,34 @@ The rcu_barrier_callback() function simply atomically decrements the
 rcu_barrier_cpu_count variable and finalizes the completion when it
 reaches zero, as follows::
 
 1 static void rcu_barrier_callback(struct rcu_head *notused)
 2 {
 3 if (atomic_dec_and_test(&rcu_barrier_cpu_count))
 4 complete(&rcu_barrier_completion);
 5 }
 
-.. _rcubarrier_quiz_2:
+.. _rcubarrier_quiz_3:
 
-Quick Quiz #2:
+Quick Quiz #3:
 What happens if CPU 0's rcu_barrier_func() executes
 immediately (thus incrementing rcu_barrier_cpu_count to the
 value one), but the other CPU's rcu_barrier_func() invocations
 are delayed for a full grace period? Couldn't this result in
 rcu_barrier() returning prematurely?
 
-:ref:`Answer to Quick Quiz #2 <answer_rcubarrier_quiz_2>`
+:ref:`Answer to Quick Quiz #3 <answer_rcubarrier_quiz_3>`
 
 The current rcu_barrier() implementation is more complex, due to the need
 to avoid disturbing idle CPUs (especially on battery-powered systems)
 and the need to minimally disturb non-idle CPUs in real-time systems.
-However, the code above illustrates the concepts.
+In addition, a great many optimizations have been applied. However,
+the code above illustrates the concepts.
 
 
 rcu_barrier() Summary
 ---------------------
 
-The rcu_barrier() primitive has seen relatively little use, since most
+The rcu_barrier() primitive is used relatively infrequently, since most
 code using RCU is in the core kernel rather than in modules. However, if
 you are using RCU from an unloadable module, you need to use rcu_barrier()
 so that your module may be safely unloaded.
@@ -302,7 +296,8 @@ Quick Quiz #1:
 Is there any other situation where rcu_barrier() might
 be required?
 
-Answer: Interestingly enough, rcu_barrier() was not originally
+Answer:
+Interestingly enough, rcu_barrier() was not originally
 implemented for module unloading. Nikita Danilov was using
 RCU in a filesystem, which resulted in a similar situation at
 filesystem-unmount time. Dipankar Sarma coded up rcu_barrier()
@@ -318,13 +313,48 @@ Answer: Interestingly enough, rcu_barrier() was not originally
 .. _answer_rcubarrier_quiz_2:
 
 Quick Quiz #2:
+Why doesn't line 8 initialize rcu_barrier_cpu_count to zero,
+thereby avoiding the need for lines 9 and 10?
+
+Answer:
+Suppose that the on_each_cpu() function shown on line 8 was
+delayed, so that CPU 0's rcu_barrier_func() executed and
+the corresponding grace period elapsed, all before CPU 1's
+rcu_barrier_func() started executing. This would result in
+rcu_barrier_cpu_count being decremented to zero, so that line
+11's wait_for_completion() would return immediately, failing to
+wait for CPU 1's callbacks to be invoked.
+
+Note that this was not a problem when the rcu_barrier() code
+was first added back in 2005. This is because on_each_cpu()
+disables preemption, which acted as an RCU read-side critical
+section, thus preventing CPU 0's grace period from completing
+until on_each_cpu() had dealt with all of the CPUs. However,
+with the advent of preemptible RCU, rcu_barrier() no longer
+waited on nonpreemptible regions of code in preemptible kernels,
+that being the job of the new rcu_barrier_sched() function.
+
+However, with the RCU flavor consolidation around v4.20, this
+possibility was once again ruled out, because the consolidated
+RCU once again waits on nonpreemptible regions of code.
+
+Nevertheless, that extra count might still be a good idea.
+Relying on these sort of accidents of implementation can result
+in later surprise bugs when the implementation changes.
+
+:ref:`Back to Quick Quiz #2 <rcubarrier_quiz_2>`
+
+.. _answer_rcubarrier_quiz_3:
+
+Quick Quiz #3:
 What happens if CPU 0's rcu_barrier_func() executes
 immediately (thus incrementing rcu_barrier_cpu_count to the
 value one), but the other CPU's rcu_barrier_func() invocations
 are delayed for a full grace period? Couldn't this result in
 rcu_barrier() returning prematurely?
 
-Answer: This cannot happen. The reason is that on_each_cpu() has its last
+Answer:
+This cannot happen. The reason is that on_each_cpu() has its last
 argument, the wait flag, set to "1". This flag is passed through
 to smp_call_function() and further to smp_call_function_on_cpu(),
 causing this latter to spin until the cross-CPU invocation of
@@ -14,19 +14,19 @@ Using 'nulls'
 =============
 
 Using special makers (called 'nulls') is a convenient way
-to solve following problem :
+to solve following problem.
 
-A typical RCU linked list managing objects which are
-allocated with SLAB_TYPESAFE_BY_RCU kmem_cache can
-use following algos :
+Without 'nulls', a typical RCU linked list managing objects which are
+allocated with SLAB_TYPESAFE_BY_RCU kmem_cache can use the following
+algorithms:
 
-1) Lookup algo
---------------
+1) Lookup algorithm
+-------------------
 
 ::
 
-rcu_read_lock()
 begin:
+rcu_read_lock()
 obj = lockless_lookup(key);
 if (obj) {
 if (!try_get_ref(obj)) // might fail for free objects
@@ -38,6 +38,7 @@ use following algos :
 */
 if (obj->key != key) { // not the object we expected
 put_ref(obj);
+rcu_read_unlock();
 goto begin;
 }
 }
@@ -52,9 +53,9 @@ but a version with an additional memory barrier (smp_rmb())
 {
 struct hlist_node *node, *next;
 for (pos = rcu_dereference((head)->first);
 pos && ({ next = pos->next; smp_rmb(); prefetch(next); 1; }) &&
 ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1; });
 pos = rcu_dereference(next))
 if (obj->key == key)
 return obj;
 return NULL;
@@ -64,9 +65,9 @@ And note the traditional hlist_for_each_entry_rcu() misses this smp_rmb()::
 
 struct hlist_node *node;
 for (pos = rcu_dereference((head)->first);
 pos && ({ prefetch(pos->next); 1; }) &&
 ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1; });
 pos = rcu_dereference(pos->next))
 if (obj->key == key)
 return obj;
 return NULL;
@@ -82,36 +83,32 @@ Quoting Corey Minyard::
 solved by pre-fetching the "next" field (with proper barriers) before
 checking the key."
 
-2) Insert algo
---------------
+2) Insertion algorithm
+----------------------
 
 We need to make sure a reader cannot read the new 'obj->obj_next' value
-and previous value of 'obj->key'. Or else, an item could be deleted
+and previous value of 'obj->key'. Otherwise, an item could be deleted
 from a chain, and inserted into another chain. If new chain was empty
-before the move, 'next' pointer is NULL, and lockless reader can
-not detect it missed following items in original chain.
+before the move, 'next' pointer is NULL, and lockless reader can not
+detect the fact that it missed following items in original chain.
 
 ::
 
 /*
 * Please note that new inserts are done at the head of list,
 * not in the middle or end.
 */
 obj = kmem_cache_alloc(...);
 lock_chain(); // typically a spin_lock()
 obj->key = key;
-/*
-* we need to make sure obj->key is updated before obj->next
-* or obj->refcnt
-*/
-smp_wmb();
-atomic_set(&obj->refcnt, 1);
+atomic_set_release(&obj->refcnt, 1); // key before refcnt
 hlist_add_head_rcu(&obj->obj_node, list);
 unlock_chain(); // typically a spin_unlock()
 
 
-3) Remove algo
---------------
+3) Removal algorithm
+--------------------
 
 Nothing special here, we can use a standard RCU hlist deletion.
 But thanks to SLAB_TYPESAFE_BY_RCU, beware a deleted object can be reused
 very very fast (before the end of RCU grace period)
@@ -133,7 +130,7 @@ Avoiding extra smp_rmb()
 ========================
 
 With hlist_nulls we can avoid extra smp_rmb() in lockless_lookup()
-and extra smp_wmb() in insert function.
+and extra _release() in insert function.
 
 For example, if we choose to store the slot number as the 'nulls'
 end-of-list marker for each slot of the hash table, we can detect
@@ -142,59 +139,61 @@ to another chain) checking the final 'nulls' value if
 the lookup met the end of chain. If final 'nulls' value
 is not the slot number, then we must restart the lookup at
 the beginning. If the object was moved to the same chain,
-then the reader doesn't care : It might eventually
+then the reader doesn't care: It might occasionally
 scan the list again without harm.
 
 
-1) lookup algo
---------------
+1) lookup algorithm
+-------------------
 
 ::
 
 head = &table[slot];
-rcu_read_lock();
 begin:
+rcu_read_lock();
 hlist_nulls_for_each_entry_rcu(obj, node, head, member) {
 if (obj->key == key) {
-if (!try_get_ref(obj)) // might fail for free objects
-goto begin;
-if (obj->key != key) { // not the object we expected
-put_ref(obj);
+if (!try_get_ref(obj)) { // might fail for free objects
+rcu_read_unlock();
 goto begin;
 }
-goto out;
+if (obj->key != key) { // not the object we expected
+put_ref(obj);
+rcu_read_unlock();
+goto begin;
+}
+goto out;
+}
+}
 
+// If the nulls value we got at the end of this lookup is
+// not the expected one, we must restart lookup.
+// We probably met an item that was moved to another chain.
+if (get_nulls_value(node) != slot) {
+put_ref(obj);
+rcu_read_unlock();
+goto begin;
 }
-/*
-* if the nulls value we got at the end of this lookup is
-* not the expected one, we must restart lookup.
-* We probably met an item that was moved to another chain.
-*/
-if (get_nulls_value(node) != slot)
-goto begin;
 obj = NULL;
 
 out:
 rcu_read_unlock();
 
-2) Insert function
-------------------
+2) Insert algorithm
+-------------------
 
 ::
 
 /*
 * Please note that new inserts are done at the head of list,
 * not in the middle or end.
 */
 obj = kmem_cache_alloc(cachep);
 lock_chain(); // typically a spin_lock()
 obj->key = key;
+atomic_set_release(&obj->refcnt, 1); // key before refcnt
 /*
-* changes to obj->key must be visible before refcnt one
+* insert obj in RCU way (readers might be traversing chain)
 */
-smp_wmb();
-atomic_set(&obj->refcnt, 1);
-/*
-* insert obj in RCU way (readers might be traversing chain)
-*/
 hlist_nulls_add_head_rcu(&obj->obj_node, list);
 unlock_chain(); // typically a spin_unlock()
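The lookup and insert algorithms above leave the hash table, the SLAB_TYPESAFE_BY_RCU cache, and the reference counting implicit. The following sketch fills in one plausible layout; all demo_* names are invented, not part of the document::

	#include <linux/atomic.h>
	#include <linux/errno.h>
	#include <linux/init.h>
	#include <linux/list_nulls.h>
	#include <linux/rculist_nulls.h>
	#include <linux/slab.h>
	#include <linux/spinlock.h>

	struct demo_entry {
		struct hlist_nulls_node node;
		unsigned long key;
		atomic_t refcnt;
	};

	#define DEMO_SLOTS 16
	static struct hlist_nulls_head demo_table[DEMO_SLOTS];
	static spinlock_t demo_locks[DEMO_SLOTS];
	static struct kmem_cache *demo_cache;

	static int __init demo_setup(void)
	{
		int i;

		for (i = 0; i < DEMO_SLOTS; i++) {
			/* Each chain ends in a 'nulls' marker encoding its slot. */
			INIT_HLIST_NULLS_HEAD(&demo_table[i], i);
			spin_lock_init(&demo_locks[i]);
		}

		/*
		 * SLAB_TYPESAFE_BY_RCU: freed objects may be reused right away,
		 * but the memory stays type-stable across the grace period,
		 * which is why the lookup must recheck ->key after taking a
		 * reference.
		 */
		demo_cache = kmem_cache_create("demo_entry",
					       sizeof(struct demo_entry), 0,
					       SLAB_TYPESAFE_BY_RCU, NULL);
		return demo_cache ? 0 : -ENOMEM;
	}

	/* The try_get_ref() used by the pseudo-code above, spelled out. */
	static bool demo_try_get_ref(struct demo_entry *e)
	{
		return atomic_inc_not_zero(&e->refcnt);
	}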
@@ -25,10 +25,10 @@ warnings:
 
 - A CPU looping with bottom halves disabled.
 
-- For !CONFIG_PREEMPTION kernels, a CPU looping anywhere in the kernel
-without invoking schedule(). If the looping in the kernel is
-really expected and desirable behavior, you might need to add
-some calls to cond_resched().
+- For !CONFIG_PREEMPTION kernels, a CPU looping anywhere in the
+kernel without potentially invoking schedule(). If the looping
+in the kernel is really expected and desirable behavior, you
+might need to add some calls to cond_resched().
 
 - Booting Linux using a console connection that is too slow to
 keep up with the boot-time console-message rate. For example,
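As an illustration of the cond_resched() advice in the item above, here is a hypothetical long-running loop (the function and its arguments are invented)::

	#include <linux/bitops.h>
	#include <linux/sched.h>

	/*
	 * A long-running loop in process context.  Without the periodic
	 * cond_resched(), a !CONFIG_PREEMPTION kernel could spin here for
	 * longer than the RCU CPU stall timeout and trigger a warning.
	 */
	static void demo_scrub(unsigned long *bitmap, unsigned long nbits)
	{
		unsigned long i;

		for (i = 0; i < nbits; i++) {
			if (test_and_clear_bit(i, bitmap)) {
				/* ... expensive per-bit work ... */
			}

			/* Give the scheduler (and thus RCU) a chance. */
			if ((i % 1024) == 0)
				cond_resched();
		}
	}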
@@ -108,16 +108,17 @@ warnings:
 
 - A bug in the RCU implementation.
 
-- A hardware failure. This is quite unlikely, but has occurred
-at least once in real life. A CPU failed in a running system,
-becoming unresponsive, but not causing an immediate crash.
-This resulted in a series of RCU CPU stall warnings, eventually
-leading the realization that the CPU had failed.
+- A hardware failure. This is quite unlikely, but is not at all
+uncommon in large datacenter. In one memorable case some decades
+back, a CPU failed in a running system, becoming unresponsive,
+but not causing an immediate crash. This resulted in a series
+of RCU CPU stall warnings, eventually leading the realization
+that the CPU had failed.
 
-The RCU, RCU-sched, and RCU-tasks implementations have CPU stall warning.
-Note that SRCU does *not* have CPU stall warnings. Please note that
-RCU only detects CPU stalls when there is a grace period in progress.
-No grace period, no CPU stall warnings.
+The RCU, RCU-sched, RCU-tasks, and RCU-tasks-trace implementations have
+CPU stall warning. Note that SRCU does *not* have CPU stall warnings.
+Please note that RCU only detects CPU stalls when there is a grace period
+in progress. No grace period, no CPU stall warnings.
 
 To diagnose the cause of the stall, inspect the stack traces.
 The offending function will usually be near the top of the stack.
@@ -205,16 +206,21 @@ RCU_STALL_RAT_DELAY
 rcupdate.rcu_task_stall_timeout
 -------------------------------
 
-This boot/sysfs parameter controls the RCU-tasks stall warning
-interval. A value of zero or less suppresses RCU-tasks stall
-warnings. A positive value sets the stall-warning interval
-in seconds. An RCU-tasks stall warning starts with the line:
+This boot/sysfs parameter controls the RCU-tasks and
+RCU-tasks-trace stall warning intervals. A value of zero or less
+suppresses RCU-tasks stall warnings. A positive value sets the
+stall-warning interval in seconds. An RCU-tasks stall warning
+starts with the line:
 
 INFO: rcu_tasks detected stalls on tasks:
 
 And continues with the output of sched_show_task() for each
 task stalling the current RCU-tasks grace period.
 
+An RCU-tasks-trace stall warning starts (and continues) similarly:
+
+INFO: rcu_tasks_trace detected stalls on tasks
+
 
 Interpreting RCU's CPU Stall-Detector "Splats"
 ==============================================
@@ -248,7 +254,8 @@ dynticks counter, which will have an even-numbered value if the CPU
 is in dyntick-idle mode and an odd-numbered value otherwise.  The hex
 number between the two "/"s is the value of the nesting, which will be
 a small non-negative number if in the idle loop (as shown above) and a
-very large positive number otherwise.
+very large positive number otherwise.  The number following the final
+"/" is the NMI nesting, which will be a small non-negative number.

 The "softirq=" portion of the message tracks the number of RCU softirq
 handlers that the stalled CPU has executed.  The number before the "/"
@@ -383,3 +390,95 @@ for example, "P3421".

 It is entirely possible to see stall warnings from normal and from
 expedited grace periods at about the same time during the same run.

+RCU_CPU_STALL_CPUTIME
+=====================
+
+In kernels built with CONFIG_RCU_CPU_STALL_CPUTIME=y or booted with
+rcupdate.rcu_cpu_stall_cputime=1, the following additional information
+is supplied with each RCU CPU stall warning::
+
+  rcu:          hardirqs   softirqs   csw/system
+  rcu:  number:      624         45            0
+  rcu: cputime:       69          1         2425   ==> 2500(ms)
+
+These statistics are collected during the sampling period.  The values
+in row "number:" are the number of hard interrupts, number of soft
+interrupts, and number of context switches on the stalled CPU.  The
+first three values in row "cputime:" indicate the CPU time in
+milliseconds consumed by hard interrupts, soft interrupts, and tasks
+on the stalled CPU.  The last number is the measurement interval, again
+in milliseconds.  Because user-mode tasks normally do not cause RCU CPU
+stalls, these tasks are typically kernel tasks, which is why only the
+system CPU time is considered.
+
+The sampling period is shown as follows::
+
+  |<------------first timeout---------->|<-----second timeout----->|
+  |<--half timeout-->|<--half timeout-->|                          |
+  |                  |<--first period-->|                          |
+  |                  |<-----------second sampling period---------->|
+  |                  |                  |                          |
+           snapshot time point      1st-stall                 2nd-stall
+
+The following describes four typical scenarios:
+
+1. A CPU looping with interrupts disabled.
+
+   ::
+
+     rcu:          hardirqs   softirqs   csw/system
+     rcu:  number:        0          0            0
+     rcu: cputime:        0          0            0   ==> 2500(ms)
+
+   Because interrupts have been disabled throughout the measurement
+   interval, there are no interrupts and no context switches.
+   Furthermore, because CPU time consumption was measured using interrupt
+   handlers, the system CPU consumption is misleadingly measured as zero.
+   This scenario will normally also have "(0 ticks this GP)" printed on
+   this CPU's summary line.
+
+2. A CPU looping with bottom halves disabled.
+
+   This is similar to the previous example, but with a non-zero number
+   of, and CPU time consumed by, hard interrupts, along with non-zero
+   CPU time consumed by in-kernel execution::
+
+     rcu:          hardirqs   softirqs   csw/system
+     rcu:  number:      624          0            0
+     rcu: cputime:       49          0         2446   ==> 2500(ms)
+
+   The fact that there are zero softirqs gives a hint that these were
+   disabled, perhaps via local_bh_disable().  It is of course possible
+   that there were no softirqs, perhaps because all events that would
+   result in softirq execution are confined to other CPUs.  In this case,
+   the diagnosis should continue as shown in the next example.
+
+3. A CPU looping with preemption disabled.
+
+   Here, only the number of context switches is zero::
+
+     rcu:          hardirqs   softirqs   csw/system
+     rcu:  number:      624         45            0
+     rcu: cputime:       69          1         2425   ==> 2500(ms)
+
+   This situation hints that the stalled CPU was looping with preemption
+   disabled.
+
+4. No looping, but massive hard and soft interrupts.
+
+   ::
+
+     rcu:          hardirqs   softirqs   csw/system
+     rcu:  number:       xx         xx            0
+     rcu: cputime:       xx         xx            0   ==> 2500(ms)
+
+   Here, the number and CPU time of hard interrupts are all non-zero,
+   but the number of context switches and the in-kernel CPU time consumed
+   are zero.  The number and cputime of soft interrupts will usually be
+   non-zero, but could be zero, for example, if the CPU was spinning
+   within a single hard interrupt handler.
+
+   If this type of RCU CPU stall warning can be reproduced, you can
+   narrow it down by looking at /proc/interrupts or by writing code to
+   trace each interrupt, for example, by referring to show_interrupts().
@@ -206,7 +206,11 @@ values for memory may require disabling the callback-flooding tests
 using the --bootargs parameter discussed below.

 Sometimes additional debugging is useful, and in such cases the --kconfig
-parameter to kvm.sh may be used, for example, ``--kconfig 'CONFIG_KASAN=y'``.
+parameter to kvm.sh may be used, for example, ``--kconfig 'CONFIG_RCU_EQS_DEBUG=y'``.
+In addition, there are the --gdb, --kasan, and --kcsan parameters.
+Note that --gdb limits you to one scenario per kvm.sh run and requires
+that you have another window open from which to run ``gdb`` as instructed
+by the script.

 Kernel boot arguments can also be supplied, for example, to control
 rcutorture's module parameters.  For example, to test a change to RCU's
@@ -219,10 +223,17 @@ require disabling rcutorture's callback-flooding tests::
    --bootargs 'rcutorture.fwd_progress=0'

 Sometimes all that is needed is a full set of kernel builds.  This is
-what the --buildonly argument does.
+what the --buildonly parameter does.

-Finally, the --trust-make argument allows each kernel build to reuse what
-it can from the previous kernel build.
+The --duration parameter can override the default run time of 30 minutes.
+For example, ``--duration 2d`` would run for two days, ``--duration 3h``
+would run for three hours, ``--duration 5m`` would run for five minutes,
+and ``--duration 45s`` would run for 45 seconds.  This last can be useful
+for tracking down rare boot-time failures.
+
+Finally, the --trust-make parameter allows each kernel build to reuse what
+it can from the previous kernel build.  Please note that without the
+--trust-make parameter, your tags files may be demolished.

 There are additional more arcane arguments that are documented in the
 source code of the kvm.sh script.
@@ -291,3 +302,73 @@ the following summary at the end of the run on a 12-CPU system::
    TREE07 ------- 167347 GPs (30.9902/s) [rcu: g1079021 f0x0 ] n_max_cbs: 478732
    CPU count limited from 16 to 12
    TREE09 ------- 752238 GPs (139.303/s) [rcu: g13075057 f0x0 ] n_max_cbs: 99011
+
+
+Repeated Runs
+=============
+
+Suppose that you are chasing down a rare boot-time failure.  Although you
+could use kvm.sh, doing so will rebuild the kernel on each run.  If you
+need (say) 1,000 runs to have confidence that you have fixed the bug,
+these pointless rebuilds can become extremely annoying.
+
+This is why kvm-again.sh exists.
+
+Suppose that a previous kvm.sh run left its output in this directory::
+
+   tools/testing/selftests/rcutorture/res/2022.11.03-11.26.28
+
+Then this run can be re-run without rebuilding as follows::
+
+   kvm-again.sh tools/testing/selftests/rcutorture/res/2022.11.03-11.26.28
+
+A few of the original run's kvm.sh parameters may be overridden, perhaps
+most notably --duration and --bootargs.  For example::
+
+   kvm-again.sh tools/testing/selftests/rcutorture/res/2022.11.03-11.26.28 \
+      --duration 45s
+
+would re-run the previous test, but for only 45 seconds, thus facilitating
+tracking down the aforementioned rare boot-time failure.
+
+
+Distributed Runs
+================
+
+Although kvm.sh is quite useful, its testing is confined to a single
+system.  It is not all that hard to use your favorite framework to cause
+(say) 5 instances of kvm.sh to run on your 5 systems, but this will very
+likely unnecessarily rebuild kernels.  In addition, manually distributing
+the desired rcutorture scenarios across the available systems can be
+painstaking and error-prone.
+
+And this is why the kvm-remote.sh script exists.
+
+If the following command works::
+
+   ssh system0 date
+
+and if it also works for system1, system2, system3, system4, and system5,
+and all of these systems have 64 CPUs, you can type::
+
+   kvm-remote.sh "system0 system1 system2 system3 system4 system5" \
+      --cpus 64 --duration 8h --configs "5*CFLIST"
+
+This will build each default scenario's kernel on the local system, then
+spread each of five instances of each scenario over the systems listed,
+running each scenario for eight hours.  At the end of the runs, the
+results will be gathered, recorded, and printed.  Most of the parameters
+that kvm.sh will accept can be passed to kvm-remote.sh, but the list of
+systems must come first.
+
+The kvm.sh ``--dryrun scenarios`` argument is useful for working out
+how many scenarios may be run in one batch across a group of systems.
+
+You can also re-run a previous remote run in a manner similar to kvm.sh::
+
+   kvm-remote.sh "system0 system1 system2 system3 system4 system5" \
+      tools/testing/selftests/rcutorture/res/2022.11.03-11.26.28-remote \
+      --duration 24h
+
+In this case, most of the kvm-again.sh parameters may be supplied following
+the pathname of the old run-results directory.
@@ -16,18 +16,23 @@ to start learning about RCU:
 | 6.  The RCU API, 2019 Edition          https://lwn.net/Articles/777036/
 |     2019 Big API Table                 https://lwn.net/Articles/777165/

+For those preferring video:
+
+| 1.  Unraveling RCU Mysteries: Fundamentals          https://www.linuxfoundation.org/webinars/unraveling-rcu-usage-mysteries
+| 2.  Unraveling RCU Mysteries: Additional Use Cases  https://www.linuxfoundation.org/webinars/unraveling-rcu-usage-mysteries-additional-use-cases
+

 What is RCU?

 RCU is a synchronization mechanism that was added to the Linux kernel
 during the 2.5 development effort that is optimized for read-mostly
-situations.  Although RCU is actually quite simple once you understand it,
-getting there can sometimes be a challenge.  Part of the problem is that
-most of the past descriptions of RCU have been written with the mistaken
-assumption that there is "one true way" to describe RCU.  Instead,
-the experience has been that different people must take different paths
-to arrive at an understanding of RCU.  This document provides several
-different paths, as follows:
+situations.  Although RCU is actually quite simple, making effective use
+of it requires you to think differently about your code.  Another part
+of the problem is the mistaken assumption that there is "one true way" to
+describe and to use RCU.  Instead, the experience has been that different
+people must take different paths to arrive at an understanding of RCU,
+depending on their experiences and use cases.  This document provides
+several different paths, as follows:

 :ref:`1. RCU OVERVIEW <1_whatisRCU>`

@@ -157,34 +162,36 @@ rcu_read_lock()
 ^^^^^^^^^^^^^^^
    void rcu_read_lock(void);

-   Used by a reader to inform the reclaimer that the reader is
-   entering an RCU read-side critical section.  It is illegal
-   to block while in an RCU read-side critical section, though
-   kernels built with CONFIG_PREEMPT_RCU can preempt RCU
-   read-side critical sections.  Any RCU-protected data structure
-   accessed during an RCU read-side critical section is guaranteed to
-   remain unreclaimed for the full duration of that critical section.
-   Reference counts may be used in conjunction with RCU to maintain
-   longer-term references to data structures.
+   This temporal primitive is used by a reader to inform the
+   reclaimer that the reader is entering an RCU read-side critical
+   section.  It is illegal to block while in an RCU read-side
+   critical section, though kernels built with CONFIG_PREEMPT_RCU
+   can preempt RCU read-side critical sections.  Any RCU-protected
+   data structure accessed during an RCU read-side critical section
+   is guaranteed to remain unreclaimed for the full duration of that
+   critical section.  Reference counts may be used in conjunction
+   with RCU to maintain longer-term references to data structures.

 rcu_read_unlock()
 ^^^^^^^^^^^^^^^^^
    void rcu_read_unlock(void);

-   Used by a reader to inform the reclaimer that the reader is
-   exiting an RCU read-side critical section.  Note that RCU
-   read-side critical sections may be nested and/or overlapping.
+   This temporal primitive is used by a reader to inform the
+   reclaimer that the reader is exiting an RCU read-side critical
+   section.  Note that RCU read-side critical sections may be nested
+   and/or overlapping.

 synchronize_rcu()
 ^^^^^^^^^^^^^^^^^
    void synchronize_rcu(void);

-   Marks the end of updater code and the beginning of reclaimer
-   code.  It does this by blocking until all pre-existing RCU
-   read-side critical sections on all CPUs have completed.
-   Note that synchronize_rcu() will **not** necessarily wait for
-   any subsequent RCU read-side critical sections to complete.
-   For example, consider the following sequence of events::
+   This temporal primitive marks the end of updater code and the
+   beginning of reclaimer code.  It does this by blocking until
+   all pre-existing RCU read-side critical sections on all CPUs
+   have completed.  Note that synchronize_rcu() will **not**
+   necessarily wait for any subsequent RCU read-side critical
+   sections to complete.  For example, consider the following
+   sequence of events::

           CPU 0                  CPU 1                 CPU 2
      -----------------   -------------------------   ---------------
@@ -211,13 +218,13 @@ synchronize_rcu()
    to be useful in all but the most read-intensive situations,
    synchronize_rcu()'s overhead must also be quite small.

-   The call_rcu() API is a callback form of synchronize_rcu(),
-   and is described in more detail in a later section.  Instead of
-   blocking, it registers a function and argument which are invoked
-   after all ongoing RCU read-side critical sections have completed.
-   This callback variant is particularly useful in situations where
-   it is illegal to block or where update-side performance is
-   critically important.
+   The call_rcu() API is an asynchronous callback form of
+   synchronize_rcu(), and is described in more detail in a later
+   section.  Instead of blocking, it registers a function and
+   argument which are invoked after all ongoing RCU read-side
+   critical sections have completed.  This callback variant is
+   particularly useful in situations where it is illegal to block
+   or where update-side performance is critically important.

    However, the call_rcu() API should not be used lightly, as use
    of the synchronize_rcu() API generally results in simpler code.
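To make the callback form concrete, here is a minimal sketch of call_rcu() usage; the struct foo, foo_release(), and foo_retire() names are illustrative only, not taken from the document::

    #include <linux/rcupdate.h>
    #include <linux/slab.h>

    struct foo {
        int a;
        struct rcu_head rcu;
    };

    /* Invoked only after all pre-existing RCU readers have finished. */
    static void foo_release(struct rcu_head *rcu)
    {
        struct foo *fp = container_of(rcu, struct foo, rcu);

        kfree(fp);
    }

    static void foo_retire(struct foo *old_fp)
    {
        /* Does not block: registers foo_release() for deferred invocation. */
        call_rcu(&old_fp->rcu, foo_release);
    }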
@@ -236,11 +243,13 @@ rcu_assign_pointer()
    would be cool to be able to declare a function in this manner.
    (Compiler experts will no doubt disagree.)

-   The updater uses this function to assign a new value to an
+   The updater uses this spatial macro to assign a new value to an
    RCU-protected pointer, in order to safely communicate the change
-   in value from the updater to the reader.  This macro does not
-   evaluate to an rvalue, but it does execute any memory-barrier
-   instructions required for a given CPU architecture.
+   in value from the updater to the reader.  This is a spatial (as
+   opposed to temporal) macro.  It does not evaluate to an rvalue,
+   but it does execute any memory-barrier instructions required
+   for a given CPU architecture.  Its ordering properties are that
+   of a store-release operation.

    Perhaps just as important, it serves to document (1) which
    pointers are protected by RCU and (2) the point at which a
@@ -255,14 +264,15 @@ rcu_dereference()
    Like rcu_assign_pointer(), rcu_dereference() must be implemented
    as a macro.

-   The reader uses rcu_dereference() to fetch an RCU-protected
-   pointer, which returns a value that may then be safely
-   dereferenced.  Note that rcu_dereference() does not actually
-   dereference the pointer, instead, it protects the pointer for
-   later dereferencing.  It also executes any needed memory-barrier
-   instructions for a given CPU architecture.  Currently, only Alpha
-   needs memory barriers within rcu_dereference() -- on other CPUs,
-   it compiles to nothing, not even a compiler directive.
+   The reader uses the spatial rcu_dereference() macro to fetch
+   an RCU-protected pointer, which returns a value that may
+   then be safely dereferenced.  Note that rcu_dereference()
+   does not actually dereference the pointer, instead, it
+   protects the pointer for later dereferencing.  It also
+   executes any needed memory-barrier instructions for a given
+   CPU architecture.  Currently, only Alpha needs memory barriers
+   within rcu_dereference() -- on other CPUs, it compiles to a
+   volatile load.

    Common coding practice uses rcu_dereference() to copy an
    RCU-protected pointer to a local variable, then dereferences
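A minimal publish/read pair may help tie the two macros in the hunks above together; this is only a sketch, and gbl_foo, foo_update(), and foo_get_a() are hypothetical names::

    #include <linux/rcupdate.h>
    #include <linux/slab.h>
    #include <linux/spinlock.h>

    struct foo {
        int a;
    };

    static struct foo __rcu *gbl_foo;

    /* Updater: publish a new version with store-release semantics. */
    static void foo_update(struct foo *new_fp, spinlock_t *lock)
    {
        struct foo *old_fp;

        spin_lock(lock);
        old_fp = rcu_dereference_protected(gbl_foo, lockdep_is_held(lock));
        rcu_assign_pointer(gbl_foo, new_fp);
        spin_unlock(lock);
        synchronize_rcu();      /* wait for pre-existing readers */
        kfree(old_fp);
    }

    /* Reader: fetch and dereference under rcu_read_lock(). */
    static int foo_get_a(void)
    {
        int a;

        rcu_read_lock();
        a = rcu_dereference(gbl_foo)->a;
        rcu_read_unlock();
        return a;
    }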
@@ -355,12 +365,15 @@ reader, updater, and reclaimer.
        synchronize_rcu() & call_rcu()


-The RCU infrastructure observes the time sequence of rcu_read_lock(),
+The RCU infrastructure observes the temporal sequence of rcu_read_lock(),
 rcu_read_unlock(), synchronize_rcu(), and call_rcu() invocations in
 order to determine when (1) synchronize_rcu() invocations may return
 to their callers and (2) call_rcu() callbacks may be invoked.  Efficient
 implementations of the RCU infrastructure make heavy use of batching in
 order to amortize their overhead over many uses of the corresponding APIs.
+The rcu_assign_pointer() and rcu_dereference() invocations communicate
+spatial changes via stores to and loads from the RCU-protected pointer in
+question.

 There are at least three flavors of RCU usage in the Linux kernel. The diagram
 above shows the most common one. On the updater side, the rcu_assign_pointer(),
@@ -392,7 +405,9 @@ b. RCU applied to networking data structures that may be subjected
 c. RCU applied to scheduler and interrupt/NMI-handler tasks.

 Again, most uses will be of (a).  The (b) and (c) cases are important
-for specialized uses, but are relatively uncommon.
+for specialized uses, but are relatively uncommon.  The SRCU, RCU-Tasks,
+RCU-Tasks-Rude, and RCU-Tasks-Trace have similar relationships among
+their assorted primitives.

 .. _3_whatisRCU:

@@ -468,7 +483,7 @@ So, to sum up:
 -  Within an RCU read-side critical section, use rcu_dereference()
    to dereference RCU-protected pointers.

--  Use some solid scheme (such as locks or semaphores) to
+-  Use some solid design (such as locks or semaphores) to
    keep concurrent updates from interfering with each other.

 -  Use rcu_assign_pointer() to update an RCU-protected pointer.
@@ -579,6 +594,14 @@ to avoid having to write your own callback::

    kfree_rcu(old_fp, rcu);

+If the occasional sleep is permitted, the single-argument form may
+be used, omitting the rcu_head structure from struct foo.
+
+   kfree_rcu(old_fp);
+
+This variant of kfree_rcu() almost never blocks, but might do so by
+invoking synchronize_rcu() in response to memory-allocation failure.
+
 Again, see checklist.rst for additional rules governing the use of RCU.

 .. _5_whatisRCU:
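A short sketch of the two forms described above; struct foo and struct bar are hypothetical, and the second form relies on the kfree_rcu_mightsleep() alias introduced elsewhere in this series::

    #include <linux/rcupdate.h>
    #include <linux/slab.h>

    struct foo {
        int a;
        struct rcu_head rcu;    /* needed only for the two-argument form */
    };

    static void foo_retire(struct foo *old_fp)
    {
        /* Never blocks: queues the kfree() for after a grace period. */
        kfree_rcu(old_fp, rcu);
    }

    struct bar {
        int b;                  /* no rcu_head member required */
    };

    static void bar_retire(struct bar *old_bp)
    {
        /* Single-pointer form: may occasionally sleep, as noted above. */
        kfree_rcu_mightsleep(old_bp);
    }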
@@ -596,7 +619,7 @@ lacking both functionality and performance.  However, they are useful
 in getting a feel for how RCU works.  See kernel/rcu/update.c for a
 production-quality implementation, and see:

-   http://www.rdrop.com/users/paulmck/RCU
+   https://docs.google.com/document/d/1X0lThx8OK0ZgLMqVoXiR4ZrGURHrXK6NyLRbeXe3Xac/edit

 for papers describing the Linux kernel RCU implementation.  The OLS'01
 and OLS'02 papers are a good introduction, and the dissertation provides
@@ -929,6 +952,8 @@ unfortunately any spinlock in a ``SLAB_TYPESAFE_BY_RCU`` object must be
 initialized after each and every call to kmem_cache_alloc(), which renders
 reference-free spinlock acquisition completely unsafe.  Therefore, when
 using ``SLAB_TYPESAFE_BY_RCU``, make proper use of a reference counter.
+(Those willing to use a kmem_cache constructor may also use locking,
+including cache-friendly sequence locking.)

 With traditional reference counting -- such as that implemented by the
 kref library in Linux -- there is typically code that runs when the last
@@ -1047,6 +1072,30 @@ sched::
    rcu_read_lock_sched_held


+RCU-Tasks::
+
+   Critical sections       Grace period                Barrier
+
+   N/A                     call_rcu_tasks              rcu_barrier_tasks
+                           synchronize_rcu_tasks
+
+
+RCU-Tasks-Rude::
+
+   Critical sections       Grace period                Barrier
+
+   N/A                     call_rcu_tasks_rude         rcu_barrier_tasks_rude
+                           synchronize_rcu_tasks_rude
+
+
+RCU-Tasks-Trace::
+
+   Critical sections       Grace period                Barrier
+
+   rcu_read_lock_trace     call_rcu_tasks_trace        rcu_barrier_tasks_trace
+   rcu_read_unlock_trace   synchronize_rcu_tasks_trace
+
+
 SRCU::

    Critical sections       Grace period                Barrier
@@ -1087,35 +1136,43 @@ list can be helpful:

 a. Will readers need to block?  If so, you need SRCU.

-b. What about the -rt patchset?  If readers would need to block
-   in an non-rt kernel, you need SRCU.  If readers would block
-   in a -rt kernel, but not in a non-rt kernel, SRCU is not
-   necessary.  (The -rt patchset turns spinlocks into sleeplocks,
-   hence this distinction.)
+b. Will readers need to block and are you doing tracing, for
+   example, ftrace or BPF?  If so, you need RCU-tasks,
+   RCU-tasks-rude, and/or RCU-tasks-trace.

-c. Do you need to treat NMI handlers, hardirq handlers,
+c. What about the -rt patchset?  If readers would need to block in
+   a non-rt kernel, you need SRCU.  If readers would block when
+   acquiring spinlocks in a -rt kernel, but not in a non-rt kernel,
+   SRCU is not necessary.  (The -rt patchset turns spinlocks into
+   sleeplocks, hence this distinction.)
+
+d. Do you need to treat NMI handlers, hardirq handlers,
    and code segments with preemption disabled (whether
    via preempt_disable(), local_irq_save(), local_bh_disable(),
    or some other mechanism) as if they were explicit RCU readers?
-   If so, RCU-sched is the only choice that will work for you.
+   If so, RCU-sched readers are the only choice that will work
+   for you, but since about v4.20 you can use the vanilla RCU
+   update primitives.

-d. Do you need RCU grace periods to complete even in the face
-   of softirq monopolization of one or more of the CPUs?  For
-   example, is your code subject to network-based denial-of-service
-   attacks?  If so, you should disable softirq across your readers,
-   for example, by using rcu_read_lock_bh().
+e. Do you need RCU grace periods to complete even in the face of
+   softirq monopolization of one or more of the CPUs?  For example,
+   is your code subject to network-based denial-of-service attacks?
+   If so, you should disable softirq across your readers, for
+   example, by using rcu_read_lock_bh().  Since about v4.20 you
+   can use the vanilla RCU update primitives.

-e. Is your workload too update-intensive for normal use of
+f. Is your workload too update-intensive for normal use of
    RCU, but inappropriate for other synchronization mechanisms?
    If so, consider SLAB_TYPESAFE_BY_RCU (which was originally
    named SLAB_DESTROY_BY_RCU).  But please be careful!

-f. Do you need read-side critical sections that are respected
-   even though they are in the middle of the idle loop, during
-   user-mode execution, or on an offlined CPU?  If so, SRCU is the
-   only choice that will work for you.
+g. Do you need read-side critical sections that are respected even
+   on CPUs that are deep in the idle loop, during entry to or exit
+   from user-mode execution, or on an offlined CPU?  If so, SRCU
+   and RCU Tasks Trace are the only choices that will work for you,
+   with SRCU being strongly preferred in almost all cases.

-g. Otherwise, use RCU.
+h. Otherwise, use RCU.

 Of course, this all assumes that you have determined that RCU is in fact
 the right tool for your job.
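For item (e) in the updated list above, a hedged sketch of a softirq-disabling reader; the connection list, its updater locking, and the conn_exists() helper are illustrative assumptions, not part of the document::

    #include <linux/rculist.h>
    #include <linux/rcupdate.h>

    struct conn {
        int id;
        struct list_head node;
    };

    static LIST_HEAD(conn_list);    /* updated under some lock, read under RCU-bh */

    /* Reader that also keeps local softirq (e.g. network RX) processing at bay. */
    static bool conn_exists(int id)
    {
        struct conn *c;
        bool found = false;

        rcu_read_lock_bh();
        list_for_each_entry_rcu(c, &conn_list, node) {
            if (c->id == id) {
                found = true;
                break;
            }
        }
        rcu_read_unlock_bh();
        return found;
    }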
@@ -5113,6 +5113,17 @@
            rcupdate.rcu_cpu_stall_timeout to be used (after
            conversion from seconds to milliseconds).

+   rcupdate.rcu_cpu_stall_cputime= [KNL]
+           Provide statistics on the cputime and count of
+           interrupts and tasks during the sampling period. For
+           multiple continuous RCU stalls, all sampling periods
+           begin at half of the first RCU stall timeout.
+
+   rcupdate.rcu_exp_stall_task_details= [KNL]
+           Print stack dumps of any tasks blocking the
+           current expedited RCU grace period during an
+           expedited RCU CPU stall warning.
+
    rcupdate.rcu_expedited= [KNL]
            Use expedited grace-period primitives, for
            example, synchronize_rcu_expedited() instead
@@ -181,7 +181,6 @@ void fw_devlink_purge_absent_suppliers(struct fwnode_handle *fwnode)
 }
 EXPORT_SYMBOL_GPL(fw_devlink_purge_absent_suppliers);

-#ifdef CONFIG_SRCU
 static DEFINE_MUTEX(device_links_lock);
 DEFINE_STATIC_SRCU(device_links_srcu);

@@ -220,47 +219,6 @@ static void device_link_remove_from_lists(struct device_link *link)
    list_del_rcu(&link->s_node);
    list_del_rcu(&link->c_node);
 }
-#else /* !CONFIG_SRCU */
-static DECLARE_RWSEM(device_links_lock);
-
-static inline void device_links_write_lock(void)
-{
-   down_write(&device_links_lock);
-}
-
-static inline void device_links_write_unlock(void)
-{
-   up_write(&device_links_lock);
-}
-
-int device_links_read_lock(void)
-{
-   down_read(&device_links_lock);
-   return 0;
-}
-
-void device_links_read_unlock(int not_used)
-{
-   up_read(&device_links_lock);
-}
-
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-int device_links_read_lock_held(void)
-{
-   return lockdep_is_held(&device_links_lock);
-}
-#endif
-
-static inline void device_link_synchronize_removal(void)
-{
-}
-
-static void device_link_remove_from_lists(struct device_link *link)
-{
-   list_del(&link->s_node);
-   list_del(&link->c_node);
-}
-#endif /* !CONFIG_SRCU */

 static bool device_is_ancestor(struct device *dev, struct device *target)
 {
@@ -1,7 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0-only
 menuconfig DAX
    tristate "DAX: direct access to differentiated memory"
-   select SRCU
    default m if NVDIMM_DAX

 if DAX
@@ -2,7 +2,6 @@
 config STM
    tristate "System Trace Module devices"
    select CONFIGFS_FS
-   select SRCU
    help
      A System Trace Module (STM) is a device exporting data in System
      Trace Protocol (STP) format as defined by MIPI STP standards.
@@ -6,7 +6,6 @@
 menuconfig MD
    bool "Multiple devices driver support (RAID and LVM)"
    depends on BLOCK
-   select SRCU
    help
      Support multiple physical spindles through a single logical device.
      Required for RAID and logical volume management.
@@ -334,7 +334,6 @@ config NETCONSOLE_DYNAMIC

 config NETPOLL
    def_bool NETCONSOLE
-   select SRCU

 config NET_POLL_CONTROLLER
    def_bool NETPOLL
@@ -258,7 +258,7 @@ config PCIE_MEDIATEK_GEN3
      MediaTek SoCs.

 config VMD
-   depends on PCI_MSI && X86_64 && SRCU && !UML
+   depends on PCI_MSI && X86_64 && !UML
    tristate "Intel Volume Management Device Driver"
    help
      Adds support for the Intel Volume Management Device (VMD). VMD is a
@@ -17,7 +17,6 @@ config BTRFS_FS
    select FS_IOMAP
    select RAID6_PQ
    select XOR_BLOCKS
-   select SRCU
    depends on PAGE_SIZE_LESS_THAN_256KB

    help
 fs/locks.c | 25
@@ -1890,7 +1890,6 @@ int generic_setlease(struct file *filp, long arg, struct file_lock **flp,
 }
 EXPORT_SYMBOL(generic_setlease);

-#if IS_ENABLED(CONFIG_SRCU)
 /*
  * Kernel subsystems can register to be notified on any attempt to set
  * a new lease with the lease_notifier_chain. This is used by (e.g.) nfsd
@@ -1924,30 +1923,6 @@ void lease_unregister_notifier(struct notifier_block *nb)
 }
 EXPORT_SYMBOL_GPL(lease_unregister_notifier);

-#else /* !IS_ENABLED(CONFIG_SRCU) */
-static inline void
-lease_notifier_chain_init(void)
-{
-}
-
-static inline void
-setlease_notifier(long arg, struct file_lock *lease)
-{
-}
-
-int lease_register_notifier(struct notifier_block *nb)
-{
-   return 0;
-}
-EXPORT_SYMBOL_GPL(lease_register_notifier);
-
-void lease_unregister_notifier(struct notifier_block *nb)
-{
-}
-EXPORT_SYMBOL_GPL(lease_unregister_notifier);
-
-#endif /* IS_ENABLED(CONFIG_SRCU) */

 /**
  * vfs_setlease - sets a lease on an open file
  * @filp: file pointer
@@ -1,7 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0-only
 config FSNOTIFY
    def_bool n
-   select SRCU

 source "fs/notify/dnotify/Kconfig"
 source "fs/notify/inotify/Kconfig"
@@ -6,7 +6,6 @@
 config QUOTA
    bool "Quota support"
    select QUOTACTL
-   select SRCU
    help
      If you say Y here, you will be able to set per user limits for disk
      usage (also called disk quotas). Currently, it works for the
@@ -52,6 +52,7 @@ DECLARE_PER_CPU(struct kernel_cpustat, kernel_cpustat);
 #define kstat_cpu(cpu) per_cpu(kstat, cpu)
 #define kcpustat_cpu(cpu) per_cpu(kernel_cpustat, cpu)

+extern unsigned long long nr_context_switches_cpu(int cpu);
 extern unsigned long long nr_context_switches(void);

 extern unsigned int kstat_irqs_cpu(unsigned int irq, int cpu);
@@ -67,6 +68,17 @@ static inline unsigned int kstat_softirqs_cpu(unsigned int irq, int cpu)
    return kstat_cpu(cpu).softirqs[irq];
 }

+static inline unsigned int kstat_cpu_softirqs_sum(int cpu)
+{
+   int i;
+   unsigned int sum = 0;
+
+   for (i = 0; i < NR_SOFTIRQS; i++)
+      sum += kstat_softirqs_cpu(i, cpu);
+
+   return sum;
+}
+
 /*
  * Number of interrupts per specific IRQ source, since bootup
  */
@@ -75,7 +87,7 @@ extern unsigned int kstat_irqs_usr(unsigned int irq);
 /*
  * Number of interrupts per cpu, since bootup
  */
-static inline unsigned int kstat_cpu_irqs_sum(unsigned int cpu)
+static inline unsigned long kstat_cpu_irqs_sum(unsigned int cpu)
 {
    return kstat_cpu(cpu).irqs_sum;
 }
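The helper added above just sums the per-softirq counters for one CPU. A hedged sketch of how a diagnostic might snapshot the counters this series exposes; the report_cpu_activity() wrapper is hypothetical::

    #include <linux/kernel.h>
    #include <linux/kernel_stat.h>

    /* Hypothetical per-CPU activity snapshot using the helpers above. */
    static void report_cpu_activity(int cpu)
    {
        unsigned long irqs = kstat_cpu_irqs_sum(cpu);
        unsigned int softirqs = kstat_cpu_softirqs_sum(cpu);
        unsigned long long csw = nr_context_switches_cpu(cpu);

        pr_info("cpu%d: hardirqs=%lu softirqs=%u csw=%llu\n",
                cpu, irqs, softirqs, csw);
    }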
@@ -139,7 +139,7 @@ static inline void hlist_nulls_add_tail_rcu(struct hlist_nulls_node *n,
    if (last) {
       n->next = last->next;
       n->pprev = &last->next;
-      rcu_assign_pointer(hlist_next_rcu(last), n);
+      rcu_assign_pointer(hlist_nulls_next_rcu(last), n);
    } else {
       hlist_nulls_add_head_rcu(n, h);
    }
@@ -238,6 +238,7 @@ void synchronize_rcu_tasks_rude(void);

 #define rcu_note_voluntary_context_switch(t) rcu_tasks_qs(t, false)
 void exit_tasks_rcu_start(void);
+void exit_tasks_rcu_stop(void);
 void exit_tasks_rcu_finish(void);
 #else /* #ifdef CONFIG_TASKS_RCU_GENERIC */
 #define rcu_tasks_classic_qs(t, preempt) do { } while (0)
@@ -246,6 +247,7 @@ void exit_tasks_rcu_finish(void);
 #define call_rcu_tasks call_rcu
 #define synchronize_rcu_tasks synchronize_rcu
 static inline void exit_tasks_rcu_start(void) { }
+static inline void exit_tasks_rcu_stop(void) { }
 static inline void exit_tasks_rcu_finish(void) { }
 #endif /* #else #ifdef CONFIG_TASKS_RCU_GENERIC */

@@ -374,11 +376,18 @@ static inline int debug_lockdep_rcu_enabled(void)
  * RCU_LOCKDEP_WARN - emit lockdep splat if specified condition is met
  * @c: condition to check
  * @s: informative message
+ *
+ * This checks debug_lockdep_rcu_enabled() before checking (c) to
+ * prevent early boot splats due to lockdep not yet being initialized,
+ * and rechecks it after checking (c) to prevent false-positive splats
+ * due to races with lockdep being disabled.  See commit 3066820034b5dd
+ * ("rcu: Reject RCU_LOCKDEP_WARN() false positives") for more detail.
  */
 #define RCU_LOCKDEP_WARN(c, s) \
    do { \
       static bool __section(".data.unlikely") __warned; \
-      if ((c) && debug_lockdep_rcu_enabled() && !__warned) { \
+      if (debug_lockdep_rcu_enabled() && (c) && \
+          debug_lockdep_rcu_enabled() && !__warned) { \
          __warned = true; \
          lockdep_rcu_suspicious(__FILE__, __LINE__, s); \
       } \
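Typical usage of the macro whose check ordering the hunk above adjusts is unchanged; as a sketch, with the helper name being hypothetical::

    #include <linux/rcupdate.h>

    /* Hypothetical check that the caller is inside an RCU read-side section. */
    static void assert_in_rcu_reader(void)
    {
        RCU_LOCKDEP_WARN(!rcu_read_lock_held(),
                         "expected to be called under rcu_read_lock()");
    }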
@@ -1004,6 +1013,9 @@ static inline notrace void rcu_read_unlock_sched_notrace(void)
 #define kvfree_rcu(...) KVFREE_GET_MACRO(__VA_ARGS__, \
    kvfree_rcu_arg_2, kvfree_rcu_arg_1)(__VA_ARGS__)

+#define kvfree_rcu_mightsleep(ptr) kvfree_rcu_arg_1(ptr)
+#define kfree_rcu_mightsleep(ptr) kvfree_rcu_mightsleep(ptr)
+
 #define KVFREE_GET_MACRO(_1, _2, NAME, ...) NAME
 #define kvfree_rcu_arg_2(ptr, rhf) \
 do { \
@@ -1011,8 +1023,7 @@ do { \
    \
    if (___p) { \
       BUILD_BUG_ON(!__is_kvfree_rcu_offset(offsetof(typeof(*(ptr)), rhf))); \
-      kvfree_call_rcu(&((___p)->rhf), (rcu_callback_t)(unsigned long) \
-         (offsetof(typeof(*(ptr)), rhf))); \
+      kvfree_call_rcu(&((___p)->rhf), (void *) (___p)); \
    } \
 } while (0)

@@ -1021,7 +1032,7 @@ do { \
    typeof(ptr) ___p = (ptr); \
    \
    if (___p) \
-      kvfree_call_rcu(NULL, (rcu_callback_t) (___p)); \
+      kvfree_call_rcu(NULL, (void *) (___p)); \
 } while (0)

 /*
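A usage sketch for the new single-pointer aliases defined above, assuming a hypothetical structure with no rcu_head member::

    #include <linux/rcupdate.h>
    #include <linux/slab.h>
    #include <linux/types.h>

    struct blob {
        size_t len;
        u8 data[];      /* no rcu_head needed for the single-pointer form */
    };

    static void blob_retire(struct blob *b)
    {
        /*
         * Single-pointer form: may sleep, for example by falling back
         * to synchronize_rcu() on memory-allocation failure, hence the
         * explicit _mightsleep name.
         */
        kvfree_rcu_mightsleep(b);
    }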
@@ -98,25 +98,25 @@ static inline void synchronize_rcu_expedited(void)
  */
 extern void kvfree(const void *addr);

-static inline void __kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
+static inline void __kvfree_call_rcu(struct rcu_head *head, void *ptr)
 {
    if (head) {
-      call_rcu(head, func);
+      call_rcu(head, (rcu_callback_t) ((void *) head - ptr));
       return;
    }

    // kvfree_rcu(one_arg) call.
    might_sleep();
    synchronize_rcu();
-   kvfree((void *) func);
+   kvfree(ptr);
 }

 #ifdef CONFIG_KASAN_GENERIC
-void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func);
+void kvfree_call_rcu(struct rcu_head *head, void *ptr);
 #else
-static inline void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
+static inline void kvfree_call_rcu(struct rcu_head *head, void *ptr)
 {
-   __kvfree_call_rcu(head, func);
+   __kvfree_call_rcu(head, ptr);
 }
 #endif

@@ -33,7 +33,7 @@ static inline void rcu_virt_note_context_switch(void)
 }

 void synchronize_rcu_expedited(void);
-void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func);
+void kvfree_call_rcu(struct rcu_head *head, void *ptr);

 void rcu_barrier(void);
 bool rcu_eqs_special_set(int cpu);
@@ -214,6 +214,34 @@ srcu_read_lock_notrace(struct srcu_struct *ssp) __acquires(ssp)
    return retval;
 }

+/**
+ * srcu_down_read - register a new reader for an SRCU-protected structure.
+ * @ssp: srcu_struct in which to register the new reader.
+ *
+ * Enter a semaphore-like SRCU read-side critical section.  Note that
+ * SRCU read-side critical sections may be nested.  However, it is
+ * illegal to call anything that waits on an SRCU grace period for the
+ * same srcu_struct, whether directly or indirectly.  Please note that
+ * one way to indirectly wait on an SRCU grace period is to acquire
+ * a mutex that is held elsewhere while calling synchronize_srcu() or
+ * synchronize_srcu_expedited().  But if you want lockdep to help you
+ * keep this stuff straight, you should instead use srcu_read_lock().
+ *
+ * The semaphore-like nature of srcu_down_read() means that the matching
+ * srcu_up_read() can be invoked from some other context, for example,
+ * from some other task or from an irq handler.  However, neither
+ * srcu_down_read() nor srcu_up_read() may be invoked from an NMI handler.
+ *
+ * Calls to srcu_down_read() may be nested, similar to the manner in
+ * which calls to down_read() may be nested.
+ */
+static inline int srcu_down_read(struct srcu_struct *ssp) __acquires(ssp)
+{
+   WARN_ON_ONCE(in_nmi());
+   srcu_check_nmi_safety(ssp, false);
+   return __srcu_read_lock(ssp);
+}
+
 /**
  * srcu_read_unlock - unregister a old reader from an SRCU-protected structure.
  * @ssp: srcu_struct in which to unregister the old reader.
@@ -254,6 +282,23 @@ srcu_read_unlock_notrace(struct srcu_struct *ssp, int idx) __releases(ssp)
    __srcu_read_unlock(ssp, idx);
 }

+/**
+ * srcu_up_read - unregister an old reader from an SRCU-protected structure.
+ * @ssp: srcu_struct in which to unregister the old reader.
+ * @idx: return value from corresponding srcu_read_lock().
+ *
+ * Exit an SRCU read-side critical section, but not necessarily from
+ * the same context as the matching srcu_down_read().
+ */
+static inline void srcu_up_read(struct srcu_struct *ssp, int idx)
+   __releases(ssp)
+{
+   WARN_ON_ONCE(idx & ~0x1);
+   WARN_ON_ONCE(in_nmi());
+   srcu_check_nmi_safety(ssp, false);
+   __srcu_read_unlock(ssp, idx);
+}
+
 /**
  * smp_mb__after_srcu_read_unlock - ensure full ordering after srcu_read_unlock
  *
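A hedged sketch of the hand-off these new primitives permit; the work-queue plumbing, my_srcu, and the handoff structure are illustrative assumptions rather than anything from this series::

    #include <linux/slab.h>
    #include <linux/srcu.h>
    #include <linux/workqueue.h>

    DEFINE_STATIC_SRCU(my_srcu);

    struct handoff {
        struct work_struct work;
        int idx;        /* cookie returned by srcu_down_read() */
    };

    static void finish_work(struct work_struct *work)
    {
        struct handoff *h = container_of(work, struct handoff, work);

        /* ... finish touching the SRCU-protected data ... */
        srcu_up_read(&my_srcu, h->idx); /* may run in a different task */
        kfree(h);
    }

    static void start_read(struct handoff *h)
    {
        h->idx = srcu_down_read(&my_srcu);  /* like srcu_read_lock()... */
        INIT_WORK(&h->work, finish_work);
        schedule_work(&h->work);            /* ...but released elsewhere */
    }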
@@ -49,7 +49,7 @@ struct srcu_data {
 struct srcu_node {
    spinlock_t __private lock;
    unsigned long srcu_have_cbs[4];   /* GP seq for children having CBs, but only */
-                                     /*  if greater than ->srcu_gq_seq. */
+                                     /*  if greater than ->srcu_gp_seq. */
    unsigned long srcu_data_have_cbs[4];   /* Which srcu_data structs have CBs for given GP? */
    unsigned long srcu_gp_seq_needed_exp;  /* Furthest future exp GP. */
    struct srcu_node *srcu_parent;         /* Next up in tree. */
@@ -1873,7 +1873,6 @@ config PERF_EVENTS
    default y if PROFILING
    depends on HAVE_PERF_EVENTS
    select IRQ_WORK
-   select SRCU
    help
      Enable kernel support for various performance events provided
      by software and hardware.
@@ -46,6 +46,9 @@ torture_param(int, shutdown_secs, 0, "Shutdown time (j), <= zero to disable.");
 torture_param(int, stat_interval, 60,
         "Number of seconds between stats printk()s");
 torture_param(int, stutter, 5, "Number of jiffies to run/halt test, 0=disable");
+torture_param(int, rt_boost, 2,
+        "Do periodic rt-boost. 0=Disable, 1=Only for rt_mutex, 2=For all lock types.");
+torture_param(int, rt_boost_factor, 50, "A factor determining how often rt-boost happens.");
 torture_param(int, verbose, 1,
         "Enable verbose debugging printk()s");

@@ -127,15 +130,50 @@ static void torture_lock_busted_write_unlock(int tid __maybe_unused)
    /* BUGGY, do not use in real life!!! */
 }

-static void torture_boost_dummy(struct torture_random_state *trsp)
+static void __torture_rt_boost(struct torture_random_state *trsp)
 {
-   /* Only rtmutexes care about priority */
+   const unsigned int factor = rt_boost_factor;
+
+   if (!rt_task(current)) {
+      /*
+       * Boost priority once every rt_boost_factor operations. When
+       * the task tries to take the lock, the rtmutex it will account
+       * for the new priority, and do any corresponding pi-dance.
+       */
+      if (trsp && !(torture_random(trsp) %
+                    (cxt.nrealwriters_stress * factor))) {
+         sched_set_fifo(current);
+      } else /* common case, do nothing */
+         return;
+   } else {
+      /*
+       * The task will remain boosted for another 10 * rt_boost_factor
+       * operations, then restored back to its original prio, and so
+       * forth.
+       *
+       * When @trsp is nil, we want to force-reset the task for
+       * stopping the kthread.
+       */
+      if (!trsp || !(torture_random(trsp) %
+                     (cxt.nrealwriters_stress * factor * 2))) {
+         sched_set_normal(current, 0);
+      } else /* common case, do nothing */
+         return;
+   }
+}
+
+static void torture_rt_boost(struct torture_random_state *trsp)
+{
+   if (rt_boost != 2)
+      return;
+
+   __torture_rt_boost(trsp);
 }
|
|
||||||
static struct lock_torture_ops lock_busted_ops = {
|
static struct lock_torture_ops lock_busted_ops = {
|
||||||
.writelock = torture_lock_busted_write_lock,
|
.writelock = torture_lock_busted_write_lock,
|
||||||
.write_delay = torture_lock_busted_write_delay,
|
.write_delay = torture_lock_busted_write_delay,
|
||||||
.task_boost = torture_boost_dummy,
|
.task_boost = torture_rt_boost,
|
||||||
.writeunlock = torture_lock_busted_write_unlock,
|
.writeunlock = torture_lock_busted_write_unlock,
|
||||||
.readlock = NULL,
|
.readlock = NULL,
|
||||||
.read_delay = NULL,
|
.read_delay = NULL,
|
||||||
@@ -179,7 +217,7 @@ __releases(torture_spinlock)
 static struct lock_torture_ops spin_lock_ops = {
 	.writelock	= torture_spin_lock_write_lock,
 	.write_delay	= torture_spin_lock_write_delay,
-	.task_boost	= torture_boost_dummy,
+	.task_boost	= torture_rt_boost,
 	.writeunlock	= torture_spin_lock_write_unlock,
 	.readlock	= NULL,
 	.read_delay	= NULL,
@@ -206,7 +244,7 @@ __releases(torture_spinlock)
 static struct lock_torture_ops spin_lock_irq_ops = {
 	.writelock	= torture_spin_lock_write_lock_irq,
 	.write_delay	= torture_spin_lock_write_delay,
-	.task_boost	= torture_boost_dummy,
+	.task_boost	= torture_rt_boost,
 	.writeunlock	= torture_lock_spin_write_unlock_irq,
 	.readlock	= NULL,
 	.read_delay	= NULL,
@@ -275,7 +313,7 @@ __releases(torture_rwlock)
 static struct lock_torture_ops rw_lock_ops = {
 	.writelock	= torture_rwlock_write_lock,
 	.write_delay	= torture_rwlock_write_delay,
-	.task_boost	= torture_boost_dummy,
+	.task_boost	= torture_rt_boost,
 	.writeunlock	= torture_rwlock_write_unlock,
 	.readlock	= torture_rwlock_read_lock,
 	.read_delay	= torture_rwlock_read_delay,
@@ -318,7 +356,7 @@ __releases(torture_rwlock)
 static struct lock_torture_ops rw_lock_irq_ops = {
 	.writelock	= torture_rwlock_write_lock_irq,
 	.write_delay	= torture_rwlock_write_delay,
-	.task_boost	= torture_boost_dummy,
+	.task_boost	= torture_rt_boost,
 	.writeunlock	= torture_rwlock_write_unlock_irq,
 	.readlock	= torture_rwlock_read_lock_irq,
 	.read_delay	= torture_rwlock_read_delay,
@@ -358,7 +396,7 @@ __releases(torture_mutex)
 static struct lock_torture_ops mutex_lock_ops = {
 	.writelock	= torture_mutex_lock,
 	.write_delay	= torture_mutex_delay,
-	.task_boost	= torture_boost_dummy,
+	.task_boost	= torture_rt_boost,
 	.writeunlock	= torture_mutex_unlock,
 	.readlock	= NULL,
 	.read_delay	= NULL,
@@ -456,7 +494,7 @@ static struct lock_torture_ops ww_mutex_lock_ops = {
 	.exit		= torture_ww_mutex_exit,
 	.writelock	= torture_ww_mutex_lock,
 	.write_delay	= torture_mutex_delay,
-	.task_boost	= torture_boost_dummy,
+	.task_boost	= torture_rt_boost,
 	.writeunlock	= torture_ww_mutex_unlock,
 	.readlock	= NULL,
 	.read_delay	= NULL,
@@ -474,37 +512,6 @@ __acquires(torture_rtmutex)
 	return 0;
 }
 
-static void torture_rtmutex_boost(struct torture_random_state *trsp)
-{
-	const unsigned int factor = 50000;	/* yes, quite arbitrary */
-
-	if (!rt_task(current)) {
-		/*
-		 * Boost priority once every ~50k operations. When the
-		 * task tries to take the lock, the rtmutex it will account
-		 * for the new priority, and do any corresponding pi-dance.
-		 */
-		if (trsp && !(torture_random(trsp) %
-			      (cxt.nrealwriters_stress * factor))) {
-			sched_set_fifo(current);
-		} else /* common case, do nothing */
-			return;
-	} else {
-		/*
-		 * The task will remain boosted for another ~500k operations,
-		 * then restored back to its original prio, and so forth.
-		 *
-		 * When @trsp is nil, we want to force-reset the task for
-		 * stopping the kthread.
-		 */
-		if (!trsp || !(torture_random(trsp) %
-			       (cxt.nrealwriters_stress * factor * 2))) {
-			sched_set_normal(current, 0);
-		} else /* common case, do nothing */
-			return;
-	}
-}
-
 static void torture_rtmutex_delay(struct torture_random_state *trsp)
 {
 	const unsigned long shortdelay_us = 2;
@@ -530,10 +537,18 @@ __releases(torture_rtmutex)
 	rt_mutex_unlock(&torture_rtmutex);
 }
 
+static void torture_rt_boost_rtmutex(struct torture_random_state *trsp)
+{
+	if (!rt_boost)
+		return;
+
+	__torture_rt_boost(trsp);
+}
+
 static struct lock_torture_ops rtmutex_lock_ops = {
 	.writelock	= torture_rtmutex_lock,
 	.write_delay	= torture_rtmutex_delay,
-	.task_boost	= torture_rtmutex_boost,
+	.task_boost	= torture_rt_boost_rtmutex,
 	.writeunlock	= torture_rtmutex_unlock,
 	.readlock	= NULL,
 	.read_delay	= NULL,
@@ -600,7 +615,7 @@ __releases(torture_rwsem)
 static struct lock_torture_ops rwsem_lock_ops = {
 	.writelock	= torture_rwsem_down_write,
 	.write_delay	= torture_rwsem_write_delay,
-	.task_boost	= torture_boost_dummy,
+	.task_boost	= torture_rt_boost,
 	.writeunlock	= torture_rwsem_up_write,
 	.readlock	= torture_rwsem_down_read,
 	.read_delay	= torture_rwsem_read_delay,
@@ -652,7 +667,7 @@ static struct lock_torture_ops percpu_rwsem_lock_ops = {
 	.exit		= torture_percpu_rwsem_exit,
 	.writelock	= torture_percpu_rwsem_down_write,
 	.write_delay	= torture_rwsem_write_delay,
-	.task_boost	= torture_boost_dummy,
+	.task_boost	= torture_rt_boost,
 	.writeunlock	= torture_percpu_rwsem_up_write,
 	.readlock	= torture_percpu_rwsem_down_read,
 	.read_delay	= torture_rwsem_read_delay,
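After these hunks every lock_torture_ops flavor shares one boosting implementation: torture_rt_boost() boosts only when rt_boost=2, torture_rt_boost_rtmutex() boosts whenever rt_boost is nonzero, and rt_boost_factor scales how rarely either happens. A small sketch of the operation counts implied by the modulo tests above; the helper names and parameters below are illustrative, not part of the patch:

/* Sketch: expected number of write-lock operations between priority changes. */
static unsigned long ops_per_boost(unsigned long nwriters, unsigned long factor)
{
	return nwriters * factor;	/* matches the !rt_task() branch above */
}

static unsigned long ops_per_deboost(unsigned long nwriters, unsigned long factor)
{
	return nwriters * factor * 2;	/* matches the rt_task() branch above */
}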
@@ -456,7 +456,6 @@ int raw_notifier_call_chain(struct raw_notifier_head *nh,
 }
 EXPORT_SYMBOL_GPL(raw_notifier_call_chain);
 
-#ifdef CONFIG_SRCU
 /*
  *	SRCU notifier chain routines.    Registration and unregistration
  *	use a mutex, and call_chain is synchronized by SRCU (no locks).
@@ -573,8 +572,6 @@ void srcu_init_notifier_head(struct srcu_notifier_head *nh)
 }
 EXPORT_SYMBOL_GPL(srcu_init_notifier_head);
 
-#endif /* CONFIG_SRCU */
-
 static ATOMIC_NOTIFIER_HEAD(die_chain);
 
 int notrace notify_die(enum die_val val, const char *str,
@@ -244,7 +244,24 @@ void zap_pid_ns_processes(struct pid_namespace *pid_ns)
 		set_current_state(TASK_INTERRUPTIBLE);
 		if (pid_ns->pid_allocated == init_pids)
 			break;
+		/*
+		 * Release tasks_rcu_exit_srcu to avoid following deadlock:
+		 *
+		 * 1) TASK A unshare(CLONE_NEWPID)
+		 * 2) TASK A fork() twice -> TASK B (child reaper for new ns)
+		 *    and TASK C
+		 * 3) TASK B exits, kills TASK C, waits for TASK A to reap it
+		 * 4) TASK A calls synchronize_rcu_tasks()
+		 *                  -> synchronize_srcu(tasks_rcu_exit_srcu)
+		 * 5) *DEADLOCK*
+		 *
+		 * It is considered safe to release tasks_rcu_exit_srcu here
+		 * because we assume the current task can not be concurrently
+		 * reaped at this point.
+		 */
+		exit_tasks_rcu_stop();
 		schedule();
+		exit_tasks_rcu_start();
 	}
 	__set_current_state(TASK_RUNNING);
 
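The hunk above drops the SRCU read-side critical section that the exit path entered via exit_tasks_rcu_start() across the sleep, because the wakeup here can itself depend on a synchronize_srcu() of the same srcu_struct. A generic sketch of that pattern, using a hypothetical my_srcu and done() rather than anything from the patch:

/*
 * Sketch only: if the event that ends the wait can be blocked behind
 * synchronize_srcu(&my_srcu), the reader must not span the sleep, so
 * drop the read lock across it and re-enter afterwards.
 */
DEFINE_STATIC_SRCU(my_srcu);

static void wait_with_srcu_dropped(bool (*done)(void))
{
	int idx;

	idx = srcu_read_lock(&my_srcu);
	while (!done()) {
		srcu_read_unlock(&my_srcu, idx);
		schedule_timeout_interruptible(1);	/* sleep outside the reader */
		idx = srcu_read_lock(&my_srcu);
	}
	srcu_read_unlock(&my_srcu, idx);
}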
@@ -82,7 +82,7 @@ config RCU_CPU_STALL_TIMEOUT
 config RCU_EXP_CPU_STALL_TIMEOUT
 	int "Expedited RCU CPU stall timeout in milliseconds"
 	depends on RCU_STALL_COMMON
-	range 0 21000
+	range 0 300000
 	default 0
 	help
 	  If a given expedited RCU grace period extends more than the
@@ -92,6 +92,19 @@ config RCU_EXP_CPU_STALL_TIMEOUT
 	  says to use the RCU_CPU_STALL_TIMEOUT value converted from
 	  seconds to milliseconds.
 
+config RCU_CPU_STALL_CPUTIME
+	bool "Provide additional RCU stall debug information"
+	depends on RCU_STALL_COMMON
+	default n
+	help
+	  Collect statistics during the sampling period, such as the number of
+	  (hard interrupts, soft interrupts, task switches) and the cputime of
+	  (hard interrupts, soft interrupts, kernel tasks) are added to the
+	  RCU stall report. For multiple continuous RCU stalls, all sampling
+	  periods begin at half of the first RCU stall timeout.
+	  The boot option rcupdate.rcu_cpu_stall_cputime has the same function
+	  as this one, but will override this if it exists.
+
 config RCU_TRACE
 	bool "Enable tracing for RCU"
 	depends on DEBUG_KERNEL
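The new 0..300000 range corresponds to an upper limit of five minutes for the expedited stall timeout. Both knobs can also be set at boot; the line below is only an illustration, and the rcupdate.rcu_exp_cpu_stall_timeout name is an assumption modeled on the existing stall-timeout parameters rather than something shown in this hunk:

	rcupdate.rcu_cpu_stall_cputime=1 rcupdate.rcu_exp_cpu_stall_timeout=300000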
@@ -224,6 +224,8 @@ extern int rcu_cpu_stall_ftrace_dump;
 extern int rcu_cpu_stall_suppress;
 extern int rcu_cpu_stall_timeout;
 extern int rcu_exp_cpu_stall_timeout;
+extern int rcu_cpu_stall_cputime;
+extern bool rcu_exp_stall_task_details __read_mostly;
 int rcu_jiffies_till_stall_check(void);
 int rcu_exp_jiffies_till_stall_check(void);
 
@@ -447,14 +449,20 @@ do { \
 /* Tiny RCU doesn't expedite, as its purpose in life is instead to be tiny. */
 static inline bool rcu_gp_is_normal(void) { return true; }
 static inline bool rcu_gp_is_expedited(void) { return false; }
+static inline bool rcu_async_should_hurry(void) { return false; }
 static inline void rcu_expedite_gp(void) { }
 static inline void rcu_unexpedite_gp(void) { }
+static inline void rcu_async_hurry(void) { }
+static inline void rcu_async_relax(void) { }
 static inline void rcu_request_urgent_qs_task(struct task_struct *t) { }
 #else /* #ifdef CONFIG_TINY_RCU */
 bool rcu_gp_is_normal(void);     /* Internal RCU use. */
 bool rcu_gp_is_expedited(void);  /* Internal RCU use. */
+bool rcu_async_should_hurry(void);  /* Internal RCU use. */
 void rcu_expedite_gp(void);
 void rcu_unexpedite_gp(void);
+void rcu_async_hurry(void);
+void rcu_async_relax(void);
 void rcupdate_announce_bootup_oddness(void);
 #ifdef CONFIG_TASKS_RCU_GENERIC
 void show_rcu_tasks_gp_kthreads(void);
@@ -89,7 +89,7 @@ static void rcu_segcblist_set_len(struct rcu_segcblist *rsclp, long v)
 }
 
 /* Get the length of a segment of the rcu_segcblist structure. */
-static long rcu_segcblist_get_seglen(struct rcu_segcblist *rsclp, int seg)
+long rcu_segcblist_get_seglen(struct rcu_segcblist *rsclp, int seg)
 {
 	return READ_ONCE(rsclp->seglen[seg]);
 }
@@ -15,6 +15,8 @@ static inline long rcu_cblist_n_cbs(struct rcu_cblist *rclp)
 	return READ_ONCE(rclp->len);
 }
 
+long rcu_segcblist_get_seglen(struct rcu_segcblist *rsclp, int seg);
+
 /* Return number of callbacks in segmented callback list by summing seglen. */
 long rcu_segcblist_n_segment_cbs(struct rcu_segcblist *rsclp);
 
@@ -399,7 +399,7 @@ static int torture_readlock_not_held(void)
 	return rcu_read_lock_bh_held() || rcu_read_lock_sched_held();
 }
 
-static int rcu_torture_read_lock(void) __acquires(RCU)
+static int rcu_torture_read_lock(void)
 {
 	rcu_read_lock();
 	return 0;
@@ -441,7 +441,7 @@ rcu_read_delay(struct torture_random_state *rrsp, struct rt_read_seg *rtrsp)
 	}
 }
 
-static void rcu_torture_read_unlock(int idx) __releases(RCU)
+static void rcu_torture_read_unlock(int idx)
 {
 	rcu_read_unlock();
 }
@@ -625,7 +625,7 @@ static struct srcu_struct srcu_ctld;
 static struct srcu_struct *srcu_ctlp = &srcu_ctl;
 static struct rcu_torture_ops srcud_ops;
 
-static int srcu_torture_read_lock(void) __acquires(srcu_ctlp)
+static int srcu_torture_read_lock(void)
 {
 	if (cur_ops == &srcud_ops)
 		return srcu_read_lock_nmisafe(srcu_ctlp);
@@ -652,7 +652,7 @@ srcu_read_delay(struct torture_random_state *rrsp, struct rt_read_seg *rtrsp)
 	}
 }
 
-static void srcu_torture_read_unlock(int idx) __releases(srcu_ctlp)
+static void srcu_torture_read_unlock(int idx)
 {
 	if (cur_ops == &srcud_ops)
 		srcu_read_unlock_nmisafe(srcu_ctlp, idx);
@@ -814,13 +814,13 @@ static void synchronize_rcu_trivial(void)
 	}
 }
 
-static int rcu_torture_read_lock_trivial(void) __acquires(RCU)
+static int rcu_torture_read_lock_trivial(void)
 {
 	preempt_disable();
 	return 0;
 }
 
-static void rcu_torture_read_unlock_trivial(int idx) __releases(RCU)
+static void rcu_torture_read_unlock_trivial(int idx)
 {
 	preempt_enable();
 }
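The __acquires()/__releases() markings stripped above are sparse context annotations; on a normal build they compile away, and sparse uses them to check that acquire and release paths balance. A minimal sketch of the usual pairing, on a hypothetical lock rather than the torture-test ones:

/* Sketch: sparse context annotations on a matched pair of helpers. */
static void my_lock(spinlock_t *lock) __acquires(lock)
{
	spin_lock(lock);
}

static void my_unlock(spinlock_t *lock) __releases(lock)
{
	spin_unlock(lock);
}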
@@ -76,6 +76,8 @@ torture_param(int, verbose_batched, 0, "Batch verbose debugging printk()s");
 // Wait until there are multiple CPUs before starting test.
 torture_param(int, holdoff, IS_BUILTIN(CONFIG_RCU_REF_SCALE_TEST) ? 10 : 0,
	      "Holdoff time before test start (s)");
+// Number of typesafe_lookup structures, that is, the degree of concurrency.
+torture_param(long, lookup_instances, 0, "Number of typesafe_lookup structures.");
 // Number of loops per experiment, all readers execute operations concurrently.
 torture_param(long, loops, 10000, "Number of loops per experiment.");
 // Number of readers, with -1 defaulting to about 75% of the CPUs.
@@ -124,7 +126,7 @@ static int exp_idx;
 
 // Operations vector for selecting different types of tests.
 struct ref_scale_ops {
-	void (*init)(void);
+	bool (*init)(void);
 	void (*cleanup)(void);
 	void (*readsection)(const int nloops);
 	void (*delaysection)(const int nloops, const int udl, const int ndl);
@@ -162,8 +164,9 @@ static void ref_rcu_delay_section(const int nloops, const int udl, const int ndl
 	}
 }
 
-static void rcu_sync_scale_init(void)
+static bool rcu_sync_scale_init(void)
 {
+	return true;
 }
 
 static struct ref_scale_ops rcu_ops = {
@@ -315,9 +318,10 @@ static struct ref_scale_ops refcnt_ops = {
 // Definitions for rwlock
 static rwlock_t test_rwlock;
 
-static void ref_rwlock_init(void)
+static bool ref_rwlock_init(void)
 {
 	rwlock_init(&test_rwlock);
+	return true;
 }
 
 static void ref_rwlock_section(const int nloops)
@@ -351,9 +355,10 @@ static struct ref_scale_ops rwlock_ops = {
 // Definitions for rwsem
 static struct rw_semaphore test_rwsem;
 
-static void ref_rwsem_init(void)
+static bool ref_rwsem_init(void)
 {
 	init_rwsem(&test_rwsem);
+	return true;
 }
 
 static void ref_rwsem_section(const int nloops)
@@ -523,6 +528,237 @@ static struct ref_scale_ops clock_ops = {
 	.name		= "clock"
 };
 
+////////////////////////////////////////////////////////////////////////
+//
+// Methods leveraging SLAB_TYPESAFE_BY_RCU.
+//
+
+// Item to look up in a typesafe manner.  Array of pointers to these.
+struct refscale_typesafe {
+	atomic_t rts_refctr;  // Used by all flavors
+	spinlock_t rts_lock;
+	seqlock_t rts_seqlock;
+	unsigned int a;
+	unsigned int b;
+};
+
+static struct kmem_cache *typesafe_kmem_cachep;
+static struct refscale_typesafe **rtsarray;
+static long rtsarray_size;
+static DEFINE_TORTURE_RANDOM_PERCPU(refscale_rand);
+static bool (*rts_acquire)(struct refscale_typesafe *rtsp, unsigned int *start);
+static bool (*rts_release)(struct refscale_typesafe *rtsp, unsigned int start);
+
+// Conditionally acquire an explicit in-structure reference count.
+static bool typesafe_ref_acquire(struct refscale_typesafe *rtsp, unsigned int *start)
+{
+	return atomic_inc_not_zero(&rtsp->rts_refctr);
+}
+
+// Unconditionally release an explicit in-structure reference count.
+static bool typesafe_ref_release(struct refscale_typesafe *rtsp, unsigned int start)
+{
+	if (!atomic_dec_return(&rtsp->rts_refctr)) {
+		WRITE_ONCE(rtsp->a, rtsp->a + 1);
+		kmem_cache_free(typesafe_kmem_cachep, rtsp);
+	}
+	return true;
+}
+
+// Unconditionally acquire an explicit in-structure spinlock.
+static bool typesafe_lock_acquire(struct refscale_typesafe *rtsp, unsigned int *start)
+{
+	spin_lock(&rtsp->rts_lock);
+	return true;
+}
+
+// Unconditionally release an explicit in-structure spinlock.
+static bool typesafe_lock_release(struct refscale_typesafe *rtsp, unsigned int start)
+{
+	spin_unlock(&rtsp->rts_lock);
+	return true;
+}
+
+// Unconditionally acquire an explicit in-structure sequence lock.
+static bool typesafe_seqlock_acquire(struct refscale_typesafe *rtsp, unsigned int *start)
+{
+	*start = read_seqbegin(&rtsp->rts_seqlock);
+	return true;
+}
+
+// Conditionally release an explicit in-structure sequence lock.  Return
+// true if this release was successful, that is, if no retry is required.
+static bool typesafe_seqlock_release(struct refscale_typesafe *rtsp, unsigned int start)
+{
+	return !read_seqretry(&rtsp->rts_seqlock, start);
+}
+
+// Do a read-side critical section with the specified delay in
+// microseconds and nanoseconds inserted so as to increase probability
+// of failure.
+static void typesafe_delay_section(const int nloops, const int udl, const int ndl)
+{
+	unsigned int a;
+	unsigned int b;
+	int i;
+	long idx;
+	struct refscale_typesafe *rtsp;
+	unsigned int start;
+
+	for (i = nloops; i >= 0; i--) {
+		preempt_disable();
+		idx = torture_random(this_cpu_ptr(&refscale_rand)) % rtsarray_size;
+		preempt_enable();
+retry:
+		rcu_read_lock();
+		rtsp = rcu_dereference(rtsarray[idx]);
+		a = READ_ONCE(rtsp->a);
+		if (!rts_acquire(rtsp, &start)) {
+			rcu_read_unlock();
+			goto retry;
+		}
+		if (a != READ_ONCE(rtsp->a)) {
+			(void)rts_release(rtsp, start);
+			rcu_read_unlock();
+			goto retry;
+		}
+		un_delay(udl, ndl);
+		// Remember, seqlock read-side release can fail.
+		if (!rts_release(rtsp, start)) {
+			rcu_read_unlock();
+			goto retry;
+		}
+		b = READ_ONCE(rtsp->a);
+		WARN_ONCE(a != b, "Re-read of ->a changed from %u to %u.\n", a, b);
+		b = rtsp->b;
+		rcu_read_unlock();
+		WARN_ON_ONCE(a * a != b);
+	}
+}
+
+// Because the acquisition and release methods are expensive, there
+// is no point in optimizing away the un_delay() function's two checks.
+// Thus simply define typesafe_read_section() as a simple wrapper around
+// typesafe_delay_section().
+static void typesafe_read_section(const int nloops)
+{
+	typesafe_delay_section(nloops, 0, 0);
+}
+
+// Allocate and initialize one refscale_typesafe structure.
+static struct refscale_typesafe *typesafe_alloc_one(void)
+{
+	struct refscale_typesafe *rtsp;
+
+	rtsp = kmem_cache_alloc(typesafe_kmem_cachep, GFP_KERNEL);
+	if (!rtsp)
+		return NULL;
+	atomic_set(&rtsp->rts_refctr, 1);
+	WRITE_ONCE(rtsp->a, rtsp->a + 1);
+	WRITE_ONCE(rtsp->b, rtsp->a * rtsp->a);
+	return rtsp;
+}
+
+// Slab-allocator constructor for refscale_typesafe structures created
+// out of a new slab of system memory.
+static void refscale_typesafe_ctor(void *rtsp_in)
+{
+	struct refscale_typesafe *rtsp = rtsp_in;
+
+	spin_lock_init(&rtsp->rts_lock);
+	seqlock_init(&rtsp->rts_seqlock);
+	preempt_disable();
+	rtsp->a = torture_random(this_cpu_ptr(&refscale_rand));
+	preempt_enable();
+}
+
+static struct ref_scale_ops typesafe_ref_ops;
+static struct ref_scale_ops typesafe_lock_ops;
+static struct ref_scale_ops typesafe_seqlock_ops;
+
+// Initialize for a typesafe test.
+static bool typesafe_init(void)
+{
+	long idx;
+	long si = lookup_instances;
+
+	typesafe_kmem_cachep = kmem_cache_create("refscale_typesafe",
+						 sizeof(struct refscale_typesafe), sizeof(void *),
+						 SLAB_TYPESAFE_BY_RCU, refscale_typesafe_ctor);
+	if (!typesafe_kmem_cachep)
+		return false;
+	if (si < 0)
+		si = -si * nr_cpu_ids;
+	else if (si == 0)
+		si = nr_cpu_ids;
+	rtsarray_size = si;
+	rtsarray = kcalloc(si, sizeof(*rtsarray), GFP_KERNEL);
+	if (!rtsarray)
+		return false;
+	for (idx = 0; idx < rtsarray_size; idx++) {
+		rtsarray[idx] = typesafe_alloc_one();
+		if (!rtsarray[idx])
+			return false;
+	}
+	if (cur_ops == &typesafe_ref_ops) {
+		rts_acquire = typesafe_ref_acquire;
+		rts_release = typesafe_ref_release;
+	} else if (cur_ops == &typesafe_lock_ops) {
+		rts_acquire = typesafe_lock_acquire;
+		rts_release = typesafe_lock_release;
+	} else if (cur_ops == &typesafe_seqlock_ops) {
+		rts_acquire = typesafe_seqlock_acquire;
+		rts_release = typesafe_seqlock_release;
+	} else {
+		WARN_ON_ONCE(1);
+		return false;
+	}
+	return true;
+}
+
+// Clean up after a typesafe test.
+static void typesafe_cleanup(void)
+{
+	long idx;
+
+	if (rtsarray) {
+		for (idx = 0; idx < rtsarray_size; idx++)
+			kmem_cache_free(typesafe_kmem_cachep, rtsarray[idx]);
+		kfree(rtsarray);
+		rtsarray = NULL;
+		rtsarray_size = 0;
+	}
+	kmem_cache_destroy(typesafe_kmem_cachep);
+	typesafe_kmem_cachep = NULL;
+	rts_acquire = NULL;
+	rts_release = NULL;
+}
+
+// The typesafe_init() function distinguishes these structures by address.
+static struct ref_scale_ops typesafe_ref_ops = {
+	.init		= typesafe_init,
+	.cleanup	= typesafe_cleanup,
+	.readsection	= typesafe_read_section,
+	.delaysection	= typesafe_delay_section,
+	.name		= "typesafe_ref"
+};
+
+static struct ref_scale_ops typesafe_lock_ops = {
+	.init		= typesafe_init,
+	.cleanup	= typesafe_cleanup,
+	.readsection	= typesafe_read_section,
+	.delaysection	= typesafe_delay_section,
+	.name		= "typesafe_lock"
+};
+
+static struct ref_scale_ops typesafe_seqlock_ops = {
+	.init		= typesafe_init,
+	.cleanup	= typesafe_cleanup,
+	.readsection	= typesafe_read_section,
+	.delaysection	= typesafe_delay_section,
+	.name		= "typesafe_seqlock"
+};
+
 static void rcu_scale_one_reader(void)
 {
 	if (readdelay <= 0)
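The new typesafe_*_ops measure the cost of the SLAB_TYPESAFE_BY_RCU pattern: with that slab flag, an object's memory remains a valid instance of its type for as long as an RCU reader can still reach it, even though the object itself may be freed and reallocated under the reader, so every lookup must revalidate whatever it found. A condensed sketch of that pattern under assumed names (struct foo, foo_cache, and foo_lookup() are illustrations, not anything from the patch):

/*
 * Condensed sketch of the SLAB_TYPESAFE_BY_RCU lookup pattern exercised above.
 */
struct foo {
	atomic_t refcnt;
	int key;
};

static struct kmem_cache *foo_cache;	/* created with SLAB_TYPESAFE_BY_RCU */

static struct foo *foo_lookup(struct foo __rcu **table, int idx, int key)
{
	struct foo *p;

	rcu_read_lock();
	p = rcu_dereference(table[idx]);
	if (p && atomic_inc_not_zero(&p->refcnt)) {	/* conditional acquire */
		if (p->key == key) {			/* revalidate identity */
			rcu_read_unlock();
			return p;			/* caller must drop refcnt */
		}
		if (atomic_dec_and_test(&p->refcnt))
			kmem_cache_free(foo_cache, p);
	}
	rcu_read_unlock();
	return NULL;
}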
@@ -812,6 +1048,7 @@ ref_scale_init(void)
 	static struct ref_scale_ops *scale_ops[] = {
 		&rcu_ops, &srcu_ops, RCU_TRACE_OPS RCU_TASKS_OPS &refcnt_ops, &rwlock_ops,
 		&rwsem_ops, &lock_ops, &lock_irq_ops, &acqrel_ops, &clock_ops,
+		&typesafe_ref_ops, &typesafe_lock_ops, &typesafe_seqlock_ops,
 	};
 
 	if (!torture_init_begin(scale_type, verbose))
@@ -833,7 +1070,10 @@ ref_scale_init(void)
 		goto unwind;
 	}
 	if (cur_ops->init)
-		cur_ops->init();
+		if (!cur_ops->init()) {
+			firsterr = -EUCLEAN;
+			goto unwind;
+		}
 
 	ref_scale_print_module_parms(cur_ops, "Start of test");
 
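With the scale_ops[] additions above, the new flavors are selected like any other refscale test through the scale_type module parameter, using the .name strings typesafe_ref, typesafe_lock, and typesafe_seqlock. An illustrative invocation, using only parameters visible in these hunks (lookup_instances=-1 requests one structure per CPU per the sign convention in typesafe_init()):

	modprobe refscale scale_type=typesafe_seqlock loops=10000 lookup_instances=-1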
@@ -154,7 +154,7 @@ static void init_srcu_struct_data(struct srcu_struct *ssp)
  */
 static inline bool srcu_invl_snp_seq(unsigned long s)
 {
-	return rcu_seq_state(s) == SRCU_SNP_INIT_SEQ;
+	return s == SRCU_SNP_INIT_SEQ;
 }
 
 /*
@@ -469,24 +469,59 @@ static bool srcu_readers_active_idx_check(struct srcu_struct *ssp, int idx)
 
 	/*
 	 * If the locks are the same as the unlocks, then there must have
-	 * been no readers on this index at some time in between. This does
-	 * not mean that there are no more readers, as one could have read
-	 * the current index but not have incremented the lock counter yet.
+	 * been no readers on this index at some point in this function.
+	 * But there might be more readers, as a task might have read
+	 * the current ->srcu_idx but not yet have incremented its CPU's
+	 * ->srcu_lock_count[idx] counter.  In fact, it is possible
+	 * that most of the tasks have been preempted between fetching
+	 * ->srcu_idx and incrementing ->srcu_lock_count[idx].  And there
+	 * could be almost (ULONG_MAX / sizeof(struct task_struct)) tasks
+	 * in a system whose address space was fully populated with memory.
+	 * Call this quantity Nt.
 	 *
-	 * So suppose that the updater is preempted here for so long
-	 * that more than ULONG_MAX non-nested readers come and go in
-	 * the meantime.  It turns out that this cannot result in overflow
-	 * because if a reader modifies its unlock count after we read it
-	 * above, then that reader's next load of ->srcu_idx is guaranteed
-	 * to get the new value, which will cause it to operate on the
-	 * other bank of counters, where it cannot contribute to the
-	 * overflow of these counters.  This means that there is a maximum
-	 * of 2*NR_CPUS increments, which cannot overflow given current
-	 * systems, especially not on 64-bit systems.
+	 * So suppose that the updater is preempted at this point in the
+	 * code for a long time.  That now-preempted updater has already
+	 * flipped ->srcu_idx (possibly during the preceding grace period),
+	 * done an smp_mb() (again, possibly during the preceding grace
+	 * period), and summed up the ->srcu_unlock_count[idx] counters.
+	 * How many times can a given one of the aforementioned Nt tasks
+	 * increment the old ->srcu_idx value's ->srcu_lock_count[idx]
+	 * counter, in the absence of nesting?
 	 *
-	 * OK, how about nesting?  This does impose a limit on nesting
-	 * of floor(ULONG_MAX/NR_CPUS/2), which should be sufficient,
-	 * especially on 64-bit systems.
+	 * It can clearly do so once, given that it has already fetched
+	 * the old value of ->srcu_idx and is just about to use that value
+	 * to index its increment of ->srcu_lock_count[idx].  But as soon as
+	 * it leaves that SRCU read-side critical section, it will increment
+	 * ->srcu_unlock_count[idx], which must follow the updater's above
+	 * read from that same value.  Thus, as soon the reading task does
+	 * an smp_mb() and a later fetch from ->srcu_idx, that task will be
+	 * guaranteed to get the new index.  Except that the increment of
+	 * ->srcu_unlock_count[idx] in __srcu_read_unlock() is after the
+	 * smp_mb(), and the fetch from ->srcu_idx in __srcu_read_lock()
+	 * is before the smp_mb().  Thus, that task might not see the new
+	 * value of ->srcu_idx until the -second- __srcu_read_lock(),
+	 * which in turn means that this task might well increment
+	 * ->srcu_lock_count[idx] for the old value of ->srcu_idx twice,
+	 * not just once.
+	 *
+	 * However, it is important to note that a given smp_mb() takes
+	 * effect not just for the task executing it, but also for any
+	 * later task running on that same CPU.
+	 *
+	 * That is, there can be almost Nt + Nc further increments of
+	 * ->srcu_lock_count[idx] for the old index, where Nc is the number
+	 * of CPUs.  But this is OK because the size of the task_struct
+	 * structure limits the value of Nt and current systems limit Nc
+	 * to a few thousand.
+	 *
+	 * OK, but what about nesting?  This does impose a limit on
+	 * nesting of half of the size of the task_struct structure
+	 * (measured in bytes), which should be sufficient.  A late 2022
+	 * TREE01 rcutorture run reported this size to be no less than
+	 * 9408 bytes, allowing up to 4704 levels of nesting, which is
+	 * comfortably beyond excessive.  Especially on 64-bit systems,
+	 * which are unlikely to be configured with an address space fully
+	 * populated with memory, at least not anytime soon.
 	 */
 	return srcu_readers_lock_idx(ssp, idx) == unlocks;
 }
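In concrete terms, the bound sketched in the rewritten comment works out to: nesting limit = sizeof(struct task_struct) / 2 = 9408 / 2 = 4704 levels on the reported TREE01 configuration, and at most roughly Nt + Nc stray increments of the old ->srcu_lock_count[idx] in the non-nested case, both comfortably below the point at which an unsigned long counter could wrap.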
@@ -726,7 +761,7 @@ static void srcu_gp_start(struct srcu_struct *ssp)
 	int state;
 
 	if (smp_load_acquire(&ssp->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
-		sdp = per_cpu_ptr(ssp->sda, 0);
+		sdp = per_cpu_ptr(ssp->sda, get_boot_cpu_id());
 	else
 		sdp = this_cpu_ptr(ssp->sda);
 	lockdep_assert_held(&ACCESS_PRIVATE(ssp, lock));
@@ -837,7 +872,8 @@ static void srcu_gp_end(struct srcu_struct *ssp)
 	/* Initiate callback invocation as needed. */
 	ss_state = smp_load_acquire(&ssp->srcu_size_state);
 	if (ss_state < SRCU_SIZE_WAIT_BARRIER) {
-		srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, 0), cbdelay);
+		srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, get_boot_cpu_id()),
+					cbdelay);
 	} else {
 		idx = rcu_seq_ctr(gpseq) % ARRAY_SIZE(snp->srcu_have_cbs);
 		srcu_for_each_node_breadth_first(ssp, snp) {
@@ -914,7 +950,7 @@ static void srcu_funnel_exp_start(struct srcu_struct *ssp, struct srcu_node *snp
 	if (snp)
 		for (; snp != NULL; snp = snp->srcu_parent) {
 			sgsne = READ_ONCE(snp->srcu_gp_seq_needed_exp);
-			if (rcu_seq_done(&ssp->srcu_gp_seq, s) ||
+			if (WARN_ON_ONCE(rcu_seq_done(&ssp->srcu_gp_seq, s)) ||
 			    (!srcu_invl_snp_seq(sgsne) && ULONG_CMP_GE(sgsne, s)))
 				return;
 			spin_lock_irqsave_rcu_node(snp, flags);
@@ -941,6 +977,9 @@ static void srcu_funnel_exp_start(struct srcu_struct *ssp, struct srcu_node *snp
 *
 * Note that this function also does the work of srcu_funnel_exp_start(),
 * in some cases by directly invoking it.
+ *
+ * The srcu read lock should be hold around this function. And s is a seq snap
+ * after holding that lock.
 */
 static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
				  unsigned long s, bool do_norm)
@@ -961,7 +1000,7 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
 	if (snp_leaf)
 		/* Each pass through the loop does one level of the srcu_node tree. */
 		for (snp = snp_leaf; snp != NULL; snp = snp->srcu_parent) {
-			if (rcu_seq_done(&ssp->srcu_gp_seq, s) && snp != snp_leaf)
+			if (WARN_ON_ONCE(rcu_seq_done(&ssp->srcu_gp_seq, s)) && snp != snp_leaf)
 				return; /* GP already done and CBs recorded. */
 			spin_lock_irqsave_rcu_node(snp, flags);
 			snp_seq = snp->srcu_have_cbs[idx];
@@ -998,8 +1037,8 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
 	if (!do_norm && ULONG_CMP_LT(ssp->srcu_gp_seq_needed_exp, s))
 		WRITE_ONCE(ssp->srcu_gp_seq_needed_exp, s);
 
-	/* If grace period not already done and none in progress, start it. */
-	if (!rcu_seq_done(&ssp->srcu_gp_seq, s) &&
+	/* If grace period not already in progress, start it. */
+	if (!WARN_ON_ONCE(rcu_seq_done(&ssp->srcu_gp_seq, s)) &&
 	    rcu_seq_state(ssp->srcu_gp_seq) == SRCU_STATE_IDLE) {
 		WARN_ON_ONCE(ULONG_CMP_GE(ssp->srcu_gp_seq, ssp->srcu_gp_seq_needed));
 		srcu_gp_start(ssp);
@@ -1059,10 +1098,11 @@ static void srcu_flip(struct srcu_struct *ssp)
 
 	/*
 	 * Ensure that if the updater misses an __srcu_read_unlock()
-	 * increment, that task's next __srcu_read_lock() will see the
-	 * above counter update.  Note that both this memory barrier
-	 * and the one in srcu_readers_active_idx_check() provide the
-	 * guarantee for __srcu_read_lock().
+	 * increment, that task's __srcu_read_lock() following its next
+	 * __srcu_read_lock() or __srcu_read_unlock() will see the above
+	 * counter update.  Note that both this memory barrier and the
+	 * one in srcu_readers_active_idx_check() provide the guarantee
+	 * for __srcu_read_lock().
 	 */
 	smp_mb(); /* D */  /* Pairs with C. */
 }
@@ -1161,7 +1201,7 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
 	idx = __srcu_read_lock_nmisafe(ssp);
 	ss_state = smp_load_acquire(&ssp->srcu_size_state);
 	if (ss_state < SRCU_SIZE_WAIT_CALL)
-		sdp = per_cpu_ptr(ssp->sda, 0);
+		sdp = per_cpu_ptr(ssp->sda, get_boot_cpu_id());
 	else
 		sdp = raw_cpu_ptr(ssp->sda);
 	spin_lock_irqsave_sdp_contention(sdp, &flags);
@@ -1497,7 +1537,7 @@ void srcu_barrier(struct srcu_struct *ssp)
 
 	idx = __srcu_read_lock_nmisafe(ssp);
 	if (smp_load_acquire(&ssp->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
-		srcu_barrier_one_cpu(ssp, per_cpu_ptr(ssp->sda, 0));
+		srcu_barrier_one_cpu(ssp, per_cpu_ptr(ssp->sda, get_boot_cpu_id()));
 	else
 		for_each_possible_cpu(cpu)
 			srcu_barrier_one_cpu(ssp, per_cpu_ptr(ssp->sda, cpu));
@@ -384,6 +384,7 @@ static int rcu_tasks_need_gpcb(struct rcu_tasks *rtp)
 {
 	int cpu;
 	unsigned long flags;
+	bool gpdone = poll_state_synchronize_rcu(rtp->percpu_dequeue_gpseq);
 	long n;
 	long ncbs = 0;
 	long ncbsnz = 0;
@@ -425,21 +426,23 @@ static int rcu_tasks_need_gpcb(struct rcu_tasks *rtp)
 			WRITE_ONCE(rtp->percpu_enqueue_shift, order_base_2(nr_cpu_ids));
 			smp_store_release(&rtp->percpu_enqueue_lim, 1);
 			rtp->percpu_dequeue_gpseq = get_state_synchronize_rcu();
+			gpdone = false;
 			pr_info("Starting switch %s to CPU-0 callback queuing.\n", rtp->name);
 		}
 		raw_spin_unlock_irqrestore(&rtp->cbs_gbl_lock, flags);
 	}
-	if (rcu_task_cb_adjust && !ncbsnz &&
-	    poll_state_synchronize_rcu(rtp->percpu_dequeue_gpseq)) {
+	if (rcu_task_cb_adjust && !ncbsnz && gpdone) {
 		raw_spin_lock_irqsave(&rtp->cbs_gbl_lock, flags);
 		if (rtp->percpu_enqueue_lim < rtp->percpu_dequeue_lim) {
 			WRITE_ONCE(rtp->percpu_dequeue_lim, 1);
 			pr_info("Completing switch %s to CPU-0 callback queuing.\n", rtp->name);
 		}
-		for (cpu = rtp->percpu_dequeue_lim; cpu < nr_cpu_ids; cpu++) {
-			struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rtp->rtpcpu, cpu);
+		if (rtp->percpu_dequeue_lim == 1) {
+			for (cpu = rtp->percpu_dequeue_lim; cpu < nr_cpu_ids; cpu++) {
+				struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rtp->rtpcpu, cpu);
 
-			WARN_ON_ONCE(rcu_segcblist_n_cbs(&rtpcp->cblist));
+				WARN_ON_ONCE(rcu_segcblist_n_cbs(&rtpcp->cblist));
+			}
 		}
 		raw_spin_unlock_irqrestore(&rtp->cbs_gbl_lock, flags);
 	}
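The hunk above replaces an inline poll_state_synchronize_rcu() test with a gpdone snapshot taken at function entry and cleared whenever a new cookie is recorded. The underlying polled grace-period API works as sketched below; every name other than the two RCU functions is illustrative:

/* Sketch of the polled grace-period API used above. */
static unsigned long my_cookie;

static void my_record_gp(void)
{
	/* Capture a cookie representing the current RCU grace-period state. */
	my_cookie = get_state_synchronize_rcu();
}

static bool my_gp_elapsed(void)
{
	/* True once a full RCU grace period has elapsed since my_record_gp(). */
	return poll_state_synchronize_rcu(my_cookie);
}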
@@ -560,8 +563,9 @@ static int __noreturn rcu_tasks_kthread(void *arg)
 static void synchronize_rcu_tasks_generic(struct rcu_tasks *rtp)
 {
 	/* Complain if the scheduler has not started.  */
-	WARN_ONCE(rcu_scheduler_active == RCU_SCHEDULER_INACTIVE,
-		  "synchronize_rcu_tasks called too soon");
+	if (WARN_ONCE(rcu_scheduler_active == RCU_SCHEDULER_INACTIVE,
+		      "synchronize_%s() called too soon", rtp->name))
+		return;
 
 	// If the grace-period kthread is running, use it.
 	if (READ_ONCE(rtp->kthread_ptr)) {
@@ -827,11 +831,21 @@ static void rcu_tasks_pertask(struct task_struct *t, struct list_head *hop)
 static void rcu_tasks_postscan(struct list_head *hop)
 {
 	/*
-	 * Wait for tasks that are in the process of exiting.  This
-	 * does only part of the job, ensuring that all tasks that were
-	 * previously exiting reach the point where they have disabled
-	 * preemption, allowing the later synchronize_rcu() to finish
-	 * the job.
+	 * Exiting tasks may escape the tasklist scan. Those are vulnerable
+	 * until their final schedule() with TASK_DEAD state. To cope with
+	 * this, divide the fragile exit path part in two intersecting
+	 * read side critical sections:
+	 *
+	 * 1) An _SRCU_ read side starting before calling exit_notify(),
+	 *    which may remove the task from the tasklist, and ending after
+	 *    the final preempt_disable() call in do_exit().
+	 *
+	 * 2) An _RCU_ read side starting with the final preempt_disable()
+	 *    call in do_exit() and ending with the final call to schedule()
+	 *    with TASK_DEAD state.
+	 *
+	 * This handles the part 1). And postgp will handle part 2) with a
+	 * call to synchronize_rcu().
 	 */
 	synchronize_srcu(&tasks_rcu_exit_srcu);
 }
@@ -898,7 +912,10 @@ static void rcu_tasks_postgp(struct rcu_tasks *rtp)
 	 *
 	 * In addition, this synchronize_rcu() waits for exiting tasks
 	 * to complete their final preempt_disable() region of execution,
-	 * cleaning up after the synchronize_srcu() above.
+	 * cleaning up after synchronize_srcu(&tasks_rcu_exit_srcu),
+	 * enforcing the whole region before tasklist removal until
+	 * the final schedule() with TASK_DEAD state to be an RCU TASKS
+	 * read side critical section.
 	 */
 	synchronize_rcu();
 }
@@ -988,27 +1005,42 @@ void show_rcu_tasks_classic_gp_kthread(void)
 EXPORT_SYMBOL_GPL(show_rcu_tasks_classic_gp_kthread);
 #endif // !defined(CONFIG_TINY_RCU)
 
-/* Do the srcu_read_lock() for the above synchronize_srcu(). */
+/*
+ * Contribute to protect against tasklist scan blind spot while the
+ * task is exiting and may be removed from the tasklist. See
+ * corresponding synchronize_srcu() for further details.
+ */
 void exit_tasks_rcu_start(void) __acquires(&tasks_rcu_exit_srcu)
 {
-	preempt_disable();
 	current->rcu_tasks_idx = __srcu_read_lock(&tasks_rcu_exit_srcu);
-	preempt_enable();
 }
 
-/* Do the srcu_read_unlock() for the above synchronize_srcu(). */
-void exit_tasks_rcu_finish(void) __releases(&tasks_rcu_exit_srcu)
+/*
+ * Contribute to protect against tasklist scan blind spot while the
+ * task is exiting and may be removed from the tasklist. See
+ * corresponding synchronize_srcu() for further details.
+ */
+void exit_tasks_rcu_stop(void) __releases(&tasks_rcu_exit_srcu)
 {
 	struct task_struct *t = current;
 
-	preempt_disable();
 	__srcu_read_unlock(&tasks_rcu_exit_srcu, t->rcu_tasks_idx);
-	preempt_enable();
-	exit_tasks_rcu_finish_trace(t);
+}
+
+/*
+ * Contribute to protect against tasklist scan blind spot while the
+ * task is exiting and may be removed from the tasklist. See
+ * corresponding synchronize_srcu() for further details.
+ */
+void exit_tasks_rcu_finish(void)
+{
+	exit_tasks_rcu_stop();
+	exit_tasks_rcu_finish_trace(current);
 }
 
 #else /* #ifdef CONFIG_TASKS_RCU */
 void exit_tasks_rcu_start(void) { }
+void exit_tasks_rcu_stop(void) { }
 void exit_tasks_rcu_finish(void) { exit_tasks_rcu_finish_trace(current); }
 #endif /* #else #ifdef CONFIG_TASKS_RCU */
 
@@ -1036,9 +1068,6 @@ static void rcu_tasks_be_rude(struct work_struct *work)
 // Wait for one rude RCU-tasks grace period.
 static void rcu_tasks_rude_wait_gp(struct rcu_tasks *rtp)
 {
-	if (num_online_cpus() <= 1)
-		return;	// Fastpath for only one CPU.
-
 	rtp->n_ipis += cpumask_weight(cpu_online_mask);
 	schedule_on_each_cpu(rcu_tasks_be_rude);
 }
@@ -1815,23 +1844,21 @@ static void test_rcu_tasks_callback(struct rcu_head *rhp)
 
 static void rcu_tasks_initiate_self_tests(void)
 {
-	unsigned long j = jiffies;
-
 	pr_info("Running RCU-tasks wait API self tests\n");
 #ifdef CONFIG_TASKS_RCU
-	tests[0].runstart = j;
+	tests[0].runstart = jiffies;
 	synchronize_rcu_tasks();
 	call_rcu_tasks(&tests[0].rh, test_rcu_tasks_callback);
 #endif
 
 #ifdef CONFIG_TASKS_RUDE_RCU
-	tests[1].runstart = j;
+	tests[1].runstart = jiffies;
 	synchronize_rcu_tasks_rude();
 	call_rcu_tasks_rude(&tests[1].rh, test_rcu_tasks_callback);
 #endif
 
 #ifdef CONFIG_TASKS_TRACE_RCU
-	tests[2].runstart = j;
+	tests[2].runstart = jiffies;
 	synchronize_rcu_tasks_trace();
 	call_rcu_tasks_trace(&tests[2].rh, test_rcu_tasks_callback);
 #endif
@@ -246,15 +246,12 @@ bool poll_state_synchronize_rcu(unsigned long oldstate)
 EXPORT_SYMBOL_GPL(poll_state_synchronize_rcu);
 
 #ifdef CONFIG_KASAN_GENERIC
-void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
+void kvfree_call_rcu(struct rcu_head *head, void *ptr)
 {
-	if (head) {
-		void *ptr = (void *) head - (unsigned long) func;
-
+	if (head)
 		kasan_record_aux_stack_noalloc(ptr);
-	}
 
-	__kvfree_call_rcu(head, func);
+	__kvfree_call_rcu(head, ptr);
 }
 EXPORT_SYMBOL_GPL(kvfree_call_rcu);
 #endif
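Judging by the arithmetic on the deleted line, the old rcu_callback_t argument of this KASAN wrapper did not identify a function at all but carried the byte offset of the rcu_head within the object being freed, which is why the object pointer had to be recovered by subtraction; the new prototype passes that pointer in directly, so no decoding is needed. This reading is inferred from the deleted line rather than spelled out in the hunk itself.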
@@ -144,14 +144,16 @@ static int rcu_scheduler_fully_active __read_mostly;
 
 static void rcu_report_qs_rnp(unsigned long mask, struct rcu_node *rnp,
			       unsigned long gps, unsigned long flags);
-static void rcu_init_new_rnp(struct rcu_node *rnp_leaf);
-static void rcu_cleanup_dead_rnp(struct rcu_node *rnp_leaf);
 static void rcu_boost_kthread_setaffinity(struct rcu_node *rnp, int outgoingcpu);
 static void invoke_rcu_core(void);
 static void rcu_report_exp_rdp(struct rcu_data *rdp);
 static void sync_sched_exp_online_cleanup(int cpu);
 static void check_cb_ovld_locked(struct rcu_data *rdp, struct rcu_node *rnp);
 static bool rcu_rdp_is_offloaded(struct rcu_data *rdp);
+static bool rcu_rdp_cpu_online(struct rcu_data *rdp);
+static bool rcu_init_invoked(void);
+static void rcu_cleanup_dead_rnp(struct rcu_node *rnp_leaf);
+static void rcu_init_new_rnp(struct rcu_node *rnp_leaf);
 
 /*
  * rcuc/rcub/rcuop kthread realtime priority. The "rcuop"
@@ -214,27 +216,6 @@ EXPORT_SYMBOL_GPL(rcu_get_gp_kthreads_prio);
  */
 #define PER_RCU_NODE_PERIOD 3	/* Number of grace periods between delays for debugging. */
 
-/*
- * Compute the mask of online CPUs for the specified rcu_node structure.
- * This will not be stable unless the rcu_node structure's ->lock is
- * held, but the bit corresponding to the current CPU will be stable
- * in most contexts.
- */
-static unsigned long rcu_rnp_online_cpus(struct rcu_node *rnp)
-{
-	return READ_ONCE(rnp->qsmaskinitnext);
-}
-
-/*
- * Is the CPU corresponding to the specified rcu_data structure online
- * from RCU's perspective?  This perspective is given by that structure's
- * ->qsmaskinitnext field rather than by the global cpu_online_mask.
- */
-static bool rcu_rdp_cpu_online(struct rcu_data *rdp)
-{
-	return !!(rdp->grpmask & rcu_rnp_online_cpus(rdp->mynode));
-}
-
 /*
  * Return true if an RCU grace period is in progress.  The READ_ONCE()s
  * permit this function to be invoked without holding the root rcu_node
@@ -734,46 +715,6 @@ void rcu_request_urgent_qs_task(struct task_struct *t)
 	smp_store_release(per_cpu_ptr(&rcu_data.rcu_urgent_qs, cpu), true);
 }
 
-#if defined(CONFIG_PROVE_RCU) && defined(CONFIG_HOTPLUG_CPU)
-
-/*
- * Is the current CPU online as far as RCU is concerned?
- *
- * Disable preemption to avoid false positives that could otherwise
- * happen due to the current CPU number being sampled, this task being
- * preempted, its old CPU being taken offline, resuming on some other CPU,
- * then determining that its old CPU is now offline.
- *
- * Disable checking if in an NMI handler because we cannot safely
- * report errors from NMI handlers anyway.  In addition, it is OK to use
- * RCU on an offline processor during initial boot, hence the check for
- * rcu_scheduler_fully_active.
- */
-bool rcu_lockdep_current_cpu_online(void)
-{
-	struct rcu_data *rdp;
-	bool ret = false;
-
-	if (in_nmi() || !rcu_scheduler_fully_active)
-		return true;
-	preempt_disable_notrace();
-	rdp = this_cpu_ptr(&rcu_data);
-	/*
-	 * Strictly, we care here about the case where the current CPU is
-	 * in rcu_cpu_starting() and thus has an excuse for rdp->grpmask
-	 * not being up to date.  So arch_spin_is_locked() might have a
-	 * false positive if it's held by some *other* CPU, but that's
-	 * OK because that just means a false *negative* on the warning.
-	 */
-	if (rcu_rdp_cpu_online(rdp) || arch_spin_is_locked(&rcu_state.ofl_lock))
-		ret = true;
-	preempt_enable_notrace();
-	return ret;
-}
-EXPORT_SYMBOL_GPL(rcu_lockdep_current_cpu_online);
-
-#endif /* #if defined(CONFIG_PROVE_RCU) && defined(CONFIG_HOTPLUG_CPU) */
-
 /*
|
/*
|
||||||
* When trying to report a quiescent state on behalf of some other CPU,
|
* When trying to report a quiescent state on behalf of some other CPU,
|
||||||
* it is our responsibility to check for and handle potential overflow
|
* it is our responsibility to check for and handle potential overflow
|
||||||
@@ -925,6 +866,24 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp)
|
|||||||
rdp->rcu_iw_gp_seq = rnp->gp_seq;
|
rdp->rcu_iw_gp_seq = rnp->gp_seq;
|
||||||
irq_work_queue_on(&rdp->rcu_iw, rdp->cpu);
|
irq_work_queue_on(&rdp->rcu_iw, rdp->cpu);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
if (rcu_cpu_stall_cputime && rdp->snap_record.gp_seq != rdp->gp_seq) {
|
||||||
|
int cpu = rdp->cpu;
|
||||||
|
struct rcu_snap_record *rsrp;
|
||||||
|
struct kernel_cpustat *kcsp;
|
||||||
|
|
||||||
|
kcsp = &kcpustat_cpu(cpu);
|
||||||
|
|
||||||
|
rsrp = &rdp->snap_record;
|
||||||
|
rsrp->cputime_irq = kcpustat_field(kcsp, CPUTIME_IRQ, cpu);
|
||||||
|
rsrp->cputime_softirq = kcpustat_field(kcsp, CPUTIME_SOFTIRQ, cpu);
|
||||||
|
rsrp->cputime_system = kcpustat_field(kcsp, CPUTIME_SYSTEM, cpu);
|
||||||
|
rsrp->nr_hardirqs = kstat_cpu_irqs_sum(rdp->cpu);
|
||||||
|
rsrp->nr_softirqs = kstat_cpu_softirqs_sum(rdp->cpu);
|
||||||
|
rsrp->nr_csw = nr_context_switches_cpu(rdp->cpu);
|
||||||
|
rsrp->jiffies = jiffies;
|
||||||
|
rsrp->gp_seq = rdp->gp_seq;
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
return 0;
|
return 0;
|
||||||
@@ -1350,13 +1309,6 @@ static void rcu_strict_gp_boundary(void *unused)
|
|||||||
invoke_rcu_core();
|
invoke_rcu_core();
|
||||||
}
|
}
|
||||||
|
|
||||||
// Has rcu_init() been invoked? This is used (for example) to determine
|
|
||||||
// whether spinlocks may be acquired safely.
|
|
||||||
static bool rcu_init_invoked(void)
|
|
||||||
{
|
|
||||||
return !!rcu_state.n_online_cpus;
|
|
||||||
}
|
|
||||||
|
|
||||||
// Make the polled API aware of the beginning of a grace period.
|
// Make the polled API aware of the beginning of a grace period.
|
||||||
static void rcu_poll_gp_seq_start(unsigned long *snap)
|
static void rcu_poll_gp_seq_start(unsigned long *snap)
|
||||||
{
|
{
|
||||||
@@ -2091,92 +2043,6 @@ rcu_check_quiescent_state(struct rcu_data *rdp)
 rcu_report_qs_rdp(rdp);
 }

-/*
-* Near the end of the offline process. Trace the fact that this CPU
-* is going offline.
-*/
-int rcutree_dying_cpu(unsigned int cpu)
-{
-bool blkd;
-struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
-struct rcu_node *rnp = rdp->mynode;
-
-if (!IS_ENABLED(CONFIG_HOTPLUG_CPU))
-return 0;
-
-blkd = !!(READ_ONCE(rnp->qsmask) & rdp->grpmask);
-trace_rcu_grace_period(rcu_state.name, READ_ONCE(rnp->gp_seq),
-blkd ? TPS("cpuofl-bgp") : TPS("cpuofl"));
-return 0;
-}
-
-/*
-* All CPUs for the specified rcu_node structure have gone offline,
-* and all tasks that were preempted within an RCU read-side critical
-* section while running on one of those CPUs have since exited their RCU
-* read-side critical section. Some other CPU is reporting this fact with
-* the specified rcu_node structure's ->lock held and interrupts disabled.
-* This function therefore goes up the tree of rcu_node structures,
-* clearing the corresponding bits in the ->qsmaskinit fields. Note that
-* the leaf rcu_node structure's ->qsmaskinit field has already been
-* updated.
-*
-* This function does check that the specified rcu_node structure has
-* all CPUs offline and no blocked tasks, so it is OK to invoke it
-* prematurely. That said, invoking it after the fact will cost you
-* a needless lock acquisition. So once it has done its work, don't
-* invoke it again.
-*/
-static void rcu_cleanup_dead_rnp(struct rcu_node *rnp_leaf)
-{
-long mask;
-struct rcu_node *rnp = rnp_leaf;
-
-raw_lockdep_assert_held_rcu_node(rnp_leaf);
-if (!IS_ENABLED(CONFIG_HOTPLUG_CPU) ||
-WARN_ON_ONCE(rnp_leaf->qsmaskinit) ||
-WARN_ON_ONCE(rcu_preempt_has_tasks(rnp_leaf)))
-return;
-for (;;) {
-mask = rnp->grpmask;
-rnp = rnp->parent;
-if (!rnp)
-break;
-raw_spin_lock_rcu_node(rnp); /* irqs already disabled. */
-rnp->qsmaskinit &= ~mask;
-/* Between grace periods, so better already be zero! */
-WARN_ON_ONCE(rnp->qsmask);
-if (rnp->qsmaskinit) {
-raw_spin_unlock_rcu_node(rnp);
-/* irqs remain disabled. */
-return;
-}
-raw_spin_unlock_rcu_node(rnp); /* irqs remain disabled. */
-}
-}
-
-/*
-* The CPU has been completely removed, and some other CPU is reporting
-* this fact from process context. Do the remainder of the cleanup.
-* There can only be one CPU hotplug operation at a time, so no need for
-* explicit locking.
-*/
-int rcutree_dead_cpu(unsigned int cpu)
-{
-struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
-struct rcu_node *rnp = rdp->mynode; /* Outgoing CPU's rdp & rnp. */
-
-if (!IS_ENABLED(CONFIG_HOTPLUG_CPU))
-return 0;
-
-WRITE_ONCE(rcu_state.n_online_cpus, rcu_state.n_online_cpus - 1);
-/* Adjust any no-longer-needed kthreads. */
-rcu_boost_kthread_setaffinity(rnp, -1);
-// Stop-machine done, so allow nohz_full to disable tick.
-tick_dep_clear(TICK_DEP_BIT_RCU);
-return 0;
-}
-
 /*
 * Invoke any RCU callbacks that have made it to the end of their grace
 * period. Throttle as specified by rdp->blimit.
@@ -2209,7 +2075,7 @@ static void rcu_do_batch(struct rcu_data *rdp)
 */
 rcu_nocb_lock_irqsave(rdp, flags);
 WARN_ON_ONCE(cpu_is_offline(smp_processor_id()));
-pending = rcu_segcblist_n_cbs(&rdp->cblist);
+pending = rcu_segcblist_get_seglen(&rdp->cblist, RCU_DONE_TAIL);
 div = READ_ONCE(rcu_divisor);
 div = div < 0 ? 7 : div > sizeof(long) * 8 - 2 ? sizeof(long) * 8 - 2 : div;
 bl = max(rdp->blimit, pending >> div);
@@ -2727,10 +2593,11 @@ static void check_cb_ovld(struct rcu_data *rdp)
 }

 static void
-__call_rcu_common(struct rcu_head *head, rcu_callback_t func, bool lazy)
+__call_rcu_common(struct rcu_head *head, rcu_callback_t func, bool lazy_in)
 {
 static atomic_t doublefrees;
 unsigned long flags;
+bool lazy;
 struct rcu_data *rdp;
 bool was_alldone;

@@ -2755,6 +2622,7 @@ __call_rcu_common(struct rcu_head *head, rcu_callback_t func, bool lazy)
 kasan_record_aux_stack_noalloc(head);
 local_irq_save(flags);
 rdp = this_cpu_ptr(&rcu_data);
+lazy = lazy_in && !rcu_async_should_hurry();

 /* Add the callback to our list. */
 if (unlikely(!rcu_segcblist_is_enabled(&rdp->cblist))) {
@@ -2876,13 +2744,15 @@ EXPORT_SYMBOL_GPL(call_rcu);

 /**
 * struct kvfree_rcu_bulk_data - single block to store kvfree_rcu() pointers
+* @list: List node. All blocks are linked between each other
+* @gp_snap: Snapshot of RCU state for objects placed to this bulk
 * @nr_records: Number of active pointers in the array
-* @next: Next bulk object in the block chain
 * @records: Array of the kvfree_rcu() pointers
 */
 struct kvfree_rcu_bulk_data {
+struct list_head list;
+unsigned long gp_snap;
 unsigned long nr_records;
-struct kvfree_rcu_bulk_data *next;
 void *records[];
 };

@@ -2898,26 +2768,28 @@ struct kvfree_rcu_bulk_data {
 * struct kfree_rcu_cpu_work - single batch of kfree_rcu() requests
 * @rcu_work: Let queue_rcu_work() invoke workqueue handler after grace period
 * @head_free: List of kfree_rcu() objects waiting for a grace period
-* @bkvhead_free: Bulk-List of kvfree_rcu() objects waiting for a grace period
+* @bulk_head_free: Bulk-List of kvfree_rcu() objects waiting for a grace period
 * @krcp: Pointer to @kfree_rcu_cpu structure
 */

 struct kfree_rcu_cpu_work {
 struct rcu_work rcu_work;
 struct rcu_head *head_free;
-struct kvfree_rcu_bulk_data *bkvhead_free[FREE_N_CHANNELS];
+struct list_head bulk_head_free[FREE_N_CHANNELS];
 struct kfree_rcu_cpu *krcp;
 };

 /**
 * struct kfree_rcu_cpu - batch up kfree_rcu() requests for RCU grace period
 * @head: List of kfree_rcu() objects not yet waiting for a grace period
-* @bkvhead: Bulk-List of kvfree_rcu() objects not yet waiting for a grace period
+* @head_gp_snap: Snapshot of RCU state for objects placed to "@head"
+* @bulk_head: Bulk-List of kvfree_rcu() objects not yet waiting for a grace period
 * @krw_arr: Array of batches of kfree_rcu() objects waiting for a grace period
 * @lock: Synchronize access to this structure
 * @monitor_work: Promote @head to @head_free after KFREE_DRAIN_JIFFIES
 * @initialized: The @rcu_work fields have been initialized
-* @count: Number of objects for which GP not started
+* @head_count: Number of objects in rcu_head singular list
+* @bulk_count: Number of objects in bulk-list
 * @bkvcache:
 * A simple cache list that contains objects for reuse purpose.
 * In order to save some per-cpu space the list is singular.
@@ -2935,13 +2807,20 @@ struct kfree_rcu_cpu_work {
 * the interactions with the slab allocators.
 */
 struct kfree_rcu_cpu {
+// Objects queued on a linked list
+// through their rcu_head structures.
 struct rcu_head *head;
-struct kvfree_rcu_bulk_data *bkvhead[FREE_N_CHANNELS];
+unsigned long head_gp_snap;
+atomic_t head_count;
+
+// Objects queued on a bulk-list.
+struct list_head bulk_head[FREE_N_CHANNELS];
+atomic_t bulk_count[FREE_N_CHANNELS];
+
 struct kfree_rcu_cpu_work krw_arr[KFREE_N_BATCHES];
 raw_spinlock_t lock;
 struct delayed_work monitor_work;
 bool initialized;
-int count;

 struct delayed_work page_cache_work;
 atomic_t backoff_page_cache_fill;
@@ -3029,82 +2908,51 @@ drain_page_cache(struct kfree_rcu_cpu *krcp)
 return freed;
 }

-/*
+static void
-* This function is invoked in workqueue context after a grace period.
+kvfree_rcu_bulk(struct kfree_rcu_cpu *krcp,
-* It frees all the objects queued on ->bkvhead_free or ->head_free.
+struct kvfree_rcu_bulk_data *bnode, int idx)
-*/
-static void kfree_rcu_work(struct work_struct *work)
 {
 unsigned long flags;
-struct kvfree_rcu_bulk_data *bkvhead[FREE_N_CHANNELS], *bnext;
+int i;
-struct rcu_head *head, *next;
-struct kfree_rcu_cpu *krcp;
-struct kfree_rcu_cpu_work *krwp;
-int i, j;

-krwp = container_of(to_rcu_work(work),
+debug_rcu_bhead_unqueue(bnode);
-struct kfree_rcu_cpu_work, rcu_work);
-krcp = krwp->krcp;

-raw_spin_lock_irqsave(&krcp->lock, flags);
+rcu_lock_acquire(&rcu_callback_map);
-// Channels 1 and 2.
+if (idx == 0) { // kmalloc() / kfree().
-for (i = 0; i < FREE_N_CHANNELS; i++) {
+trace_rcu_invoke_kfree_bulk_callback(
-bkvhead[i] = krwp->bkvhead_free[i];
+rcu_state.name, bnode->nr_records,
-krwp->bkvhead_free[i] = NULL;
+bnode->records);
-}

-// Channel 3.
+kfree_bulk(bnode->nr_records, bnode->records);
-head = krwp->head_free;
+} else { // vmalloc() / vfree().
-krwp->head_free = NULL;
+for (i = 0; i < bnode->nr_records; i++) {
-raw_spin_unlock_irqrestore(&krcp->lock, flags);
+trace_rcu_invoke_kvfree_callback(
+rcu_state.name, bnode->records[i], 0);

-// Handle the first two channels.
+vfree(bnode->records[i]);
-for (i = 0; i < FREE_N_CHANNELS; i++) {
-for (; bkvhead[i]; bkvhead[i] = bnext) {
-bnext = bkvhead[i]->next;
-debug_rcu_bhead_unqueue(bkvhead[i]);
-
-rcu_lock_acquire(&rcu_callback_map);
-if (i == 0) { // kmalloc() / kfree().
-trace_rcu_invoke_kfree_bulk_callback(
-rcu_state.name, bkvhead[i]->nr_records,
-bkvhead[i]->records);
-
-kfree_bulk(bkvhead[i]->nr_records,
-bkvhead[i]->records);
-} else { // vmalloc() / vfree().
-for (j = 0; j < bkvhead[i]->nr_records; j++) {
-trace_rcu_invoke_kvfree_callback(
-rcu_state.name,
-bkvhead[i]->records[j], 0);
-
-vfree(bkvhead[i]->records[j]);
-}
-}
-rcu_lock_release(&rcu_callback_map);
-
-raw_spin_lock_irqsave(&krcp->lock, flags);
-if (put_cached_bnode(krcp, bkvhead[i]))
-bkvhead[i] = NULL;
-raw_spin_unlock_irqrestore(&krcp->lock, flags);
-
-if (bkvhead[i])
-free_page((unsigned long) bkvhead[i]);
-
-cond_resched_tasks_rcu_qs();
 }
 }
+rcu_lock_release(&rcu_callback_map);
+
+raw_spin_lock_irqsave(&krcp->lock, flags);
+if (put_cached_bnode(krcp, bnode))
+bnode = NULL;
+raw_spin_unlock_irqrestore(&krcp->lock, flags);
+
+if (bnode)
+free_page((unsigned long) bnode);
+
+cond_resched_tasks_rcu_qs();
+}
+
+static void
+kvfree_rcu_list(struct rcu_head *head)
+{
+struct rcu_head *next;
+
-/*
-* This is used when the "bulk" path can not be used for the
-* double-argument of kvfree_rcu(). This happens when the
-* page-cache is empty, which means that objects are instead
-* queued on a linked list through their rcu_head structures.
-* This list is named "Channel 3".
-*/
 for (; head; head = next) {
-unsigned long offset = (unsigned long)head->func;
+void *ptr = (void *) head->func;
-void *ptr = (void *)head - offset;
+unsigned long offset = (void *) head - ptr;

 next = head->next;
 debug_rcu_head_unqueue((struct rcu_head *)ptr);
@@ -3119,16 +2967,72 @@ static void kfree_rcu_work(struct work_struct *work)
 }
 }

+/*
+* This function is invoked in workqueue context after a grace period.
+* It frees all the objects queued on ->bulk_head_free or ->head_free.
+*/
+static void kfree_rcu_work(struct work_struct *work)
+{
+unsigned long flags;
+struct kvfree_rcu_bulk_data *bnode, *n;
+struct list_head bulk_head[FREE_N_CHANNELS];
+struct rcu_head *head;
+struct kfree_rcu_cpu *krcp;
+struct kfree_rcu_cpu_work *krwp;
+int i;
+
+krwp = container_of(to_rcu_work(work),
+struct kfree_rcu_cpu_work, rcu_work);
+krcp = krwp->krcp;
+
+raw_spin_lock_irqsave(&krcp->lock, flags);
+// Channels 1 and 2.
+for (i = 0; i < FREE_N_CHANNELS; i++)
+list_replace_init(&krwp->bulk_head_free[i], &bulk_head[i]);
+
+// Channel 3.
+head = krwp->head_free;
+krwp->head_free = NULL;
+raw_spin_unlock_irqrestore(&krcp->lock, flags);
+
+// Handle the first two channels.
+for (i = 0; i < FREE_N_CHANNELS; i++) {
+// Start from the tail page, so a GP is likely passed for it.
+list_for_each_entry_safe(bnode, n, &bulk_head[i], list)
+kvfree_rcu_bulk(krcp, bnode, i);
+}
+
+/*
+* This is used when the "bulk" path can not be used for the
+* double-argument of kvfree_rcu(). This happens when the
+* page-cache is empty, which means that objects are instead
+* queued on a linked list through their rcu_head structures.
+* This list is named "Channel 3".
+*/
+kvfree_rcu_list(head);
+}

 static bool
 need_offload_krc(struct kfree_rcu_cpu *krcp)
 {
 int i;

 for (i = 0; i < FREE_N_CHANNELS; i++)
-if (krcp->bkvhead[i])
+if (!list_empty(&krcp->bulk_head[i]))
 return true;

-return !!krcp->head;
+return !!READ_ONCE(krcp->head);
+}
+
+static int krc_count(struct kfree_rcu_cpu *krcp)
+{
+int sum = atomic_read(&krcp->head_count);
+int i;
+
+for (i = 0; i < FREE_N_CHANNELS; i++)
+sum += atomic_read(&krcp->bulk_count[i]);
+
+return sum;
 }

 static void
@@ -3136,7 +3040,7 @@ schedule_delayed_monitor_work(struct kfree_rcu_cpu *krcp)
 {
 long delay, delay_left;

-delay = READ_ONCE(krcp->count) >= KVFREE_BULK_MAX_ENTR ? 1:KFREE_DRAIN_JIFFIES;
+delay = krc_count(krcp) >= KVFREE_BULK_MAX_ENTR ? 1:KFREE_DRAIN_JIFFIES;
 if (delayed_work_pending(&krcp->monitor_work)) {
 delay_left = krcp->monitor_work.timer.expires - jiffies;
 if (delay < delay_left)
@@ -3146,6 +3050,44 @@ schedule_delayed_monitor_work(struct kfree_rcu_cpu *krcp)
 queue_delayed_work(system_wq, &krcp->monitor_work, delay);
 }

+static void
+kvfree_rcu_drain_ready(struct kfree_rcu_cpu *krcp)
+{
+struct list_head bulk_ready[FREE_N_CHANNELS];
+struct kvfree_rcu_bulk_data *bnode, *n;
+struct rcu_head *head_ready = NULL;
+unsigned long flags;
+int i;
+
+raw_spin_lock_irqsave(&krcp->lock, flags);
+for (i = 0; i < FREE_N_CHANNELS; i++) {
+INIT_LIST_HEAD(&bulk_ready[i]);
+
+list_for_each_entry_safe_reverse(bnode, n, &krcp->bulk_head[i], list) {
+if (!poll_state_synchronize_rcu(bnode->gp_snap))
+break;
+
+atomic_sub(bnode->nr_records, &krcp->bulk_count[i]);
+list_move(&bnode->list, &bulk_ready[i]);
+}
+}
+
+if (krcp->head && poll_state_synchronize_rcu(krcp->head_gp_snap)) {
+head_ready = krcp->head;
+atomic_set(&krcp->head_count, 0);
+WRITE_ONCE(krcp->head, NULL);
+}
+raw_spin_unlock_irqrestore(&krcp->lock, flags);
+
+for (i = 0; i < FREE_N_CHANNELS; i++) {
+list_for_each_entry_safe(bnode, n, &bulk_ready[i], list)
+kvfree_rcu_bulk(krcp, bnode, i);
+}
+
+if (head_ready)
+kvfree_rcu_list(head_ready);
+}
+
 /*
 * This function is invoked after the KFREE_DRAIN_JIFFIES timeout.
 */
@@ -3156,26 +3098,31 @@ static void kfree_rcu_monitor(struct work_struct *work)
 unsigned long flags;
 int i, j;

+// Drain ready for reclaim.
+kvfree_rcu_drain_ready(krcp);
+
 raw_spin_lock_irqsave(&krcp->lock, flags);

 // Attempt to start a new batch.
 for (i = 0; i < KFREE_N_BATCHES; i++) {
 struct kfree_rcu_cpu_work *krwp = &(krcp->krw_arr[i]);

-// Try to detach bkvhead or head and attach it over any
+// Try to detach bulk_head or head and attach it over any
 // available corresponding free channel. It can be that
 // a previous RCU batch is in progress, it means that
 // immediately to queue another one is not possible so
 // in that case the monitor work is rearmed.
-if ((krcp->bkvhead[0] && !krwp->bkvhead_free[0]) ||
+if ((!list_empty(&krcp->bulk_head[0]) && list_empty(&krwp->bulk_head_free[0])) ||
-(krcp->bkvhead[1] && !krwp->bkvhead_free[1]) ||
+(!list_empty(&krcp->bulk_head[1]) && list_empty(&krwp->bulk_head_free[1])) ||
-(krcp->head && !krwp->head_free)) {
+(READ_ONCE(krcp->head) && !krwp->head_free)) {

 // Channel 1 corresponds to the SLAB-pointer bulk path.
 // Channel 2 corresponds to vmalloc-pointer bulk path.
 for (j = 0; j < FREE_N_CHANNELS; j++) {
-if (!krwp->bkvhead_free[j]) {
+if (list_empty(&krwp->bulk_head_free[j])) {
-krwp->bkvhead_free[j] = krcp->bkvhead[j];
+atomic_set(&krcp->bulk_count[j], 0);
-krcp->bkvhead[j] = NULL;
+list_replace_init(&krcp->bulk_head[j],
+&krwp->bulk_head_free[j]);
 }
 }

@@ -3183,11 +3130,10 @@ static void kfree_rcu_monitor(struct work_struct *work)
 // objects queued on the linked list.
 if (!krwp->head_free) {
 krwp->head_free = krcp->head;
-krcp->head = NULL;
+atomic_set(&krcp->head_count, 0);
+WRITE_ONCE(krcp->head, NULL);
 }

-WRITE_ONCE(krcp->count, 0);
-
 // One work is per one batch, so there are three
 // "free channels", the batch can handle. It can
 // be that the work is in the pending state when
@@ -3197,6 +3143,8 @@ static void kfree_rcu_monitor(struct work_struct *work)
 }
 }

+raw_spin_unlock_irqrestore(&krcp->lock, flags);
+
 // If there is nothing to detach, it means that our job is
 // successfully done here. In case of having at least one
 // of the channels that is still busy we should rearm the
@@ -3204,8 +3152,6 @@ static void kfree_rcu_monitor(struct work_struct *work)
 // still in progress.
 if (need_offload_krc(krcp))
 schedule_delayed_monitor_work(krcp);
-
-raw_spin_unlock_irqrestore(&krcp->lock, flags);
 }

 static enum hrtimer_restart
@@ -3288,10 +3234,11 @@ add_ptr_to_bulk_krc_lock(struct kfree_rcu_cpu **krcp,
 return false;

 idx = !!is_vmalloc_addr(ptr);
+bnode = list_first_entry_or_null(&(*krcp)->bulk_head[idx],
+struct kvfree_rcu_bulk_data, list);

 /* Check if a new block is required. */
-if (!(*krcp)->bkvhead[idx] ||
+if (!bnode || bnode->nr_records == KVFREE_BULK_MAX_ENTR) {
-(*krcp)->bkvhead[idx]->nr_records == KVFREE_BULK_MAX_ENTR) {
 bnode = get_cached_bnode(*krcp);
 if (!bnode && can_alloc) {
 krc_this_cpu_unlock(*krcp, *flags);
@@ -3315,17 +3262,15 @@ add_ptr_to_bulk_krc_lock(struct kfree_rcu_cpu **krcp,
 if (!bnode)
 return false;

-/* Initialize the new block. */
+// Initialize the new block and attach it.
 bnode->nr_records = 0;
-bnode->next = (*krcp)->bkvhead[idx];
+list_add(&bnode->list, &(*krcp)->bulk_head[idx]);
-
-/* Attach it to the head. */
-(*krcp)->bkvhead[idx] = bnode;
 }

-/* Finally insert. */
+// Finally insert and update the GP for this page.
-(*krcp)->bkvhead[idx]->records
+bnode->records[bnode->nr_records++] = ptr;
-[(*krcp)->bkvhead[idx]->nr_records++] = ptr;
+bnode->gp_snap = get_state_synchronize_rcu();
+atomic_inc(&(*krcp)->bulk_count[idx]);

 return true;
 }
@@ -3342,26 +3287,21 @@ add_ptr_to_bulk_krc_lock(struct kfree_rcu_cpu **krcp,
 * be free'd in workqueue context. This allows us to: batch requests together to
 * reduce the number of grace periods during heavy kfree_rcu()/kvfree_rcu() load.
 */
-void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
+void kvfree_call_rcu(struct rcu_head *head, void *ptr)
 {
 unsigned long flags;
 struct kfree_rcu_cpu *krcp;
 bool success;
-void *ptr;

-if (head) {
+/*
-ptr = (void *) head - (unsigned long) func;
+* Please note there is a limitation for the head-less
-} else {
+* variant, that is why there is a clear rule for such
-/*
+* objects: it can be used from might_sleep() context
-* Please note there is a limitation for the head-less
+* only. For other places please embed an rcu_head to
-* variant, that is why there is a clear rule for such
+* your data.
-* objects: it can be used from might_sleep() context
+*/
-* only. For other places please embed an rcu_head to
+if (!head)
-* your data.
-*/
 might_sleep();
-ptr = (unsigned long *) func;
-}

 // Queue the object but don't yet schedule the batch.
 if (debug_rcu_head_queue(ptr)) {
@@ -3382,14 +3322,16 @@ void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
 // Inline if kvfree_rcu(one_arg) call.
 goto unlock_return;

-head->func = func;
+head->func = ptr;
 head->next = krcp->head;
-krcp->head = head;
+WRITE_ONCE(krcp->head, head);
+atomic_inc(&krcp->head_count);
+
+// Take a snapshot for this krcp.
+krcp->head_gp_snap = get_state_synchronize_rcu();
 success = true;
 }

-WRITE_ONCE(krcp->count, krcp->count + 1);
-
 // Set timer to drain after KFREE_DRAIN_JIFFIES.
 if (rcu_scheduler_active == RCU_SCHEDULER_RUNNING)
 schedule_delayed_monitor_work(krcp);
@@ -3420,7 +3362,7 @@ kfree_rcu_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
 for_each_possible_cpu(cpu) {
 struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu);

-count += READ_ONCE(krcp->count);
+count += krc_count(krcp);
 count += READ_ONCE(krcp->nr_bkv_objs);
 atomic_set(&krcp->backoff_page_cache_fill, 1);
 }
@@ -3437,7 +3379,7 @@ kfree_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
 int count;
 struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu);

-count = krcp->count;
+count = krc_count(krcp);
 count += drain_page_cache(krcp);
 kfree_rcu_monitor(&krcp->monitor_work.work);

@@ -3461,15 +3403,12 @@ static struct shrinker kfree_rcu_shrinker = {
 void __init kfree_rcu_scheduler_running(void)
 {
 int cpu;
-unsigned long flags;

 for_each_possible_cpu(cpu) {
 struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu);

-raw_spin_lock_irqsave(&krcp->lock, flags);
 if (need_offload_krc(krcp))
 schedule_delayed_monitor_work(krcp);
-raw_spin_unlock_irqrestore(&krcp->lock, flags);
 }
 }

@@ -3485,9 +3424,10 @@ void __init kfree_rcu_scheduler_running(void)
 */
 static int rcu_blocking_is_gp(void)
 {
-if (rcu_scheduler_active != RCU_SCHEDULER_INACTIVE)
+if (rcu_scheduler_active != RCU_SCHEDULER_INACTIVE) {
+might_sleep();
 return false;
-might_sleep(); /* Check for RCU read-side critical section. */
+}
 return true;
 }

@@ -3711,7 +3651,9 @@ EXPORT_SYMBOL_GPL(start_poll_synchronize_rcu_full);
 * If @false is returned, it is the caller's responsibility to invoke this
 * function later on until it does return @true. Alternatively, the caller
 * can explicitly wait for a grace period, for example, by passing @oldstate
-* to cond_synchronize_rcu() or by directly invoking synchronize_rcu().
+* to either cond_synchronize_rcu() or cond_synchronize_rcu_expedited()
+* on the one hand or by directly invoking either synchronize_rcu() or
+* synchronize_rcu_expedited() on the other.
 *
 * Yes, this function does not take counter wrap into account.
 * But counter wrap is harmless. If the counter wraps, we have waited for
@@ -3722,6 +3664,12 @@ EXPORT_SYMBOL_GPL(start_poll_synchronize_rcu_full);
 * completed. Alternatively, they can use get_completed_synchronize_rcu()
 * to get a guaranteed-completed grace-period state.
 *
+* In addition, because oldstate compresses the grace-period state for
+* both normal and expedited grace periods into a single unsigned long,
+* it can miss a grace period when synchronize_rcu() runs concurrently
+* with synchronize_rcu_expedited(). If this is unacceptable, please
+* instead use the _full() variant of these polling APIs.
+*
 * This function provides the same memory-ordering guarantees that
 * would be provided by a synchronize_rcu() that was invoked at the call
 * to the function that provided @oldstate, and that returned at the end
@@ -4079,6 +4027,155 @@ retry:
 }
 EXPORT_SYMBOL_GPL(rcu_barrier);

+/*
+* Compute the mask of online CPUs for the specified rcu_node structure.
+* This will not be stable unless the rcu_node structure's ->lock is
+* held, but the bit corresponding to the current CPU will be stable
+* in most contexts.
+*/
+static unsigned long rcu_rnp_online_cpus(struct rcu_node *rnp)
+{
+return READ_ONCE(rnp->qsmaskinitnext);
+}
+
+/*
+* Is the CPU corresponding to the specified rcu_data structure online
+* from RCU's perspective? This perspective is given by that structure's
+* ->qsmaskinitnext field rather than by the global cpu_online_mask.
+*/
+static bool rcu_rdp_cpu_online(struct rcu_data *rdp)
+{
+return !!(rdp->grpmask & rcu_rnp_online_cpus(rdp->mynode));
+}
+
+#if defined(CONFIG_PROVE_RCU) && defined(CONFIG_HOTPLUG_CPU)
+
+/*
+* Is the current CPU online as far as RCU is concerned?
+*
+* Disable preemption to avoid false positives that could otherwise
+* happen due to the current CPU number being sampled, this task being
+* preempted, its old CPU being taken offline, resuming on some other CPU,
+* then determining that its old CPU is now offline.
+*
+* Disable checking if in an NMI handler because we cannot safely
+* report errors from NMI handlers anyway. In addition, it is OK to use
+* RCU on an offline processor during initial boot, hence the check for
+* rcu_scheduler_fully_active.
+*/
+bool rcu_lockdep_current_cpu_online(void)
+{
+struct rcu_data *rdp;
+bool ret = false;
+
+if (in_nmi() || !rcu_scheduler_fully_active)
+return true;
+preempt_disable_notrace();
+rdp = this_cpu_ptr(&rcu_data);
+/*
+* Strictly, we care here about the case where the current CPU is
+* in rcu_cpu_starting() and thus has an excuse for rdp->grpmask
+* not being up to date. So arch_spin_is_locked() might have a
+* false positive if it's held by some *other* CPU, but that's
+* OK because that just means a false *negative* on the warning.
+*/
+if (rcu_rdp_cpu_online(rdp) || arch_spin_is_locked(&rcu_state.ofl_lock))
+ret = true;
+preempt_enable_notrace();
+return ret;
+}
+EXPORT_SYMBOL_GPL(rcu_lockdep_current_cpu_online);
+
+#endif /* #if defined(CONFIG_PROVE_RCU) && defined(CONFIG_HOTPLUG_CPU) */
+
+// Has rcu_init() been invoked? This is used (for example) to determine
+// whether spinlocks may be acquired safely.
+static bool rcu_init_invoked(void)
+{
+return !!rcu_state.n_online_cpus;
+}
+
+/*
+* Near the end of the offline process. Trace the fact that this CPU
+* is going offline.
+*/
+int rcutree_dying_cpu(unsigned int cpu)
+{
+bool blkd;
+struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
+struct rcu_node *rnp = rdp->mynode;
+
+if (!IS_ENABLED(CONFIG_HOTPLUG_CPU))
+return 0;
+
+blkd = !!(READ_ONCE(rnp->qsmask) & rdp->grpmask);
+trace_rcu_grace_period(rcu_state.name, READ_ONCE(rnp->gp_seq),
+blkd ? TPS("cpuofl-bgp") : TPS("cpuofl"));
+return 0;
+}
+
+/*
+* All CPUs for the specified rcu_node structure have gone offline,
+* and all tasks that were preempted within an RCU read-side critical
+* section while running on one of those CPUs have since exited their RCU
+* read-side critical section. Some other CPU is reporting this fact with
+* the specified rcu_node structure's ->lock held and interrupts disabled.
+* This function therefore goes up the tree of rcu_node structures,
+* clearing the corresponding bits in the ->qsmaskinit fields. Note that
+* the leaf rcu_node structure's ->qsmaskinit field has already been
+* updated.
+*
+* This function does check that the specified rcu_node structure has
+* all CPUs offline and no blocked tasks, so it is OK to invoke it
+* prematurely. That said, invoking it after the fact will cost you
+* a needless lock acquisition. So once it has done its work, don't
+* invoke it again.
+*/
+static void rcu_cleanup_dead_rnp(struct rcu_node *rnp_leaf)
+{
+long mask;
+struct rcu_node *rnp = rnp_leaf;
+
+raw_lockdep_assert_held_rcu_node(rnp_leaf);
+if (!IS_ENABLED(CONFIG_HOTPLUG_CPU) ||
+WARN_ON_ONCE(rnp_leaf->qsmaskinit) ||
+WARN_ON_ONCE(rcu_preempt_has_tasks(rnp_leaf)))
+return;
+for (;;) {
+mask = rnp->grpmask;
+rnp = rnp->parent;
+if (!rnp)
+break;
+raw_spin_lock_rcu_node(rnp); /* irqs already disabled. */
+rnp->qsmaskinit &= ~mask;
+/* Between grace periods, so better already be zero! */
+WARN_ON_ONCE(rnp->qsmask);
+if (rnp->qsmaskinit) {
+raw_spin_unlock_rcu_node(rnp);
+/* irqs remain disabled. */
+return;
+}
+raw_spin_unlock_rcu_node(rnp); /* irqs remain disabled. */
+}
+}
+
+/*
+* The CPU has been completely removed, and some other CPU is reporting
+* this fact from process context. Do the remainder of the cleanup.
+* There can only be one CPU hotplug operation at a time, so no need for
+* explicit locking.
+*/
+int rcutree_dead_cpu(unsigned int cpu)
+{
+if (!IS_ENABLED(CONFIG_HOTPLUG_CPU))
+return 0;
+
+WRITE_ONCE(rcu_state.n_online_cpus, rcu_state.n_online_cpus - 1);
+// Stop-machine done, so allow nohz_full to disable tick.
+tick_dep_clear(TICK_DEP_BIT_RCU);
+return 0;
+}
+
 /*
 * Propagate ->qsinitmask bits up the rcu_node tree to account for the
 * first CPU in a given leaf rcu_node structure coming online. The caller
@@ -4408,11 +4505,13 @@ static int rcu_pm_notify(struct notifier_block *self,
 switch (action) {
 case PM_HIBERNATION_PREPARE:
 case PM_SUSPEND_PREPARE:
+rcu_async_hurry();
 rcu_expedite_gp();
 break;
 case PM_POST_HIBERNATION:
 case PM_POST_SUSPEND:
 rcu_unexpedite_gp();
+rcu_async_relax();
 break;
 default:
 break;
@@ -4766,7 +4865,7 @@ struct workqueue_struct *rcu_gp_wq;
 static void __init kfree_rcu_batch_init(void)
 {
 int cpu;
-int i;
+int i, j;

 /* Clamp it to [0:100] seconds interval. */
 if (rcu_delay_page_cache_fill_msec < 0 ||
@@ -4786,8 +4885,14 @@ static void __init kfree_rcu_batch_init(void)
 for (i = 0; i < KFREE_N_BATCHES; i++) {
 INIT_RCU_WORK(&krcp->krw_arr[i].rcu_work, kfree_rcu_work);
 krcp->krw_arr[i].krcp = krcp;
+
+for (j = 0; j < FREE_N_CHANNELS; j++)
+INIT_LIST_HEAD(&krcp->krw_arr[i].bulk_head_free[j]);
 }

+for (i = 0; i < FREE_N_CHANNELS; i++)
+INIT_LIST_HEAD(&krcp->bulk_head[i]);
+
 INIT_DELAYED_WORK(&krcp->monitor_work, kfree_rcu_monitor);
 INIT_DELAYED_WORK(&krcp->page_cache_work, fill_page_cache_func);
 krcp->initialized = true;
@@ -4838,6 +4943,8 @@ void __init rcu_init(void)
|
|||||||
// Kick-start any polled grace periods that started early.
|
// Kick-start any polled grace periods that started early.
|
||||||
if (!(per_cpu_ptr(&rcu_data, cpu)->mynode->exp_seq_poll_rq & 0x1))
|
if (!(per_cpu_ptr(&rcu_data, cpu)->mynode->exp_seq_poll_rq & 0x1))
|
||||||
(void)start_poll_synchronize_rcu_expedited();
|
(void)start_poll_synchronize_rcu_expedited();
|
||||||
|
|
||||||
|
rcu_test_sync_prims();
|
||||||
}
|
}
|
||||||
|
|
||||||
#include "tree_stall.h"
|
#include "tree_stall.h"
|
||||||
|
@@ -158,6 +158,23 @@ union rcu_noqs {
|
|||||||
u16 s; /* Set of bits, aggregate OR here. */
|
u16 s; /* Set of bits, aggregate OR here. */
|
||||||
};
|
};
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Record the snapshot of the core stats at half of the first RCU stall timeout.
|
||||||
|
* The member gp_seq is used to ensure that all members are updated only once
|
||||||
|
* during the sampling period. The snapshot is taken only if this gp_seq is not
|
||||||
|
* equal to rdp->gp_seq.
|
||||||
|
*/
|
||||||
|
struct rcu_snap_record {
|
||||||
|
unsigned long gp_seq; /* Track rdp->gp_seq counter */
|
||||||
|
u64 cputime_irq; /* Accumulated cputime of hard irqs */
|
||||||
|
u64 cputime_softirq;/* Accumulated cputime of soft irqs */
|
||||||
|
u64 cputime_system; /* Accumulated cputime of kernel tasks */
|
||||||
|
unsigned long nr_hardirqs; /* Accumulated number of hard irqs */
|
||||||
|
unsigned int nr_softirqs; /* Accumulated number of soft irqs */
|
||||||
|
unsigned long long nr_csw; /* Accumulated number of task switches */
|
||||||
|
unsigned long jiffies; /* Track jiffies value */
|
||||||
|
};
|
||||||
|
|
||||||
/* Per-CPU data for read-copy update. */
|
/* Per-CPU data for read-copy update. */
|
||||||
struct rcu_data {
|
struct rcu_data {
|
||||||
/* 1) quiescent-state and grace-period handling : */
|
/* 1) quiescent-state and grace-period handling : */
|
||||||
@@ -262,6 +279,8 @@ struct rcu_data {
|
|||||||
short rcu_onl_gp_flags; /* ->gp_flags at last online. */
|
short rcu_onl_gp_flags; /* ->gp_flags at last online. */
|
||||||
unsigned long last_fqs_resched; /* Time of last rcu_resched(). */
|
unsigned long last_fqs_resched; /* Time of last rcu_resched(). */
|
||||||
unsigned long last_sched_clock; /* Jiffies of last rcu_sched_clock_irq(). */
|
unsigned long last_sched_clock; /* Jiffies of last rcu_sched_clock_irq(). */
|
||||||
|
struct rcu_snap_record snap_record; /* Snapshot of core stats at half of */
|
||||||
|
/* the first RCU stall timeout */
|
||||||
|
|
||||||
long lazy_len; /* Length of buffered lazy callbacks. */
|
long lazy_len; /* Length of buffered lazy callbacks. */
|
||||||
int cpu;
|
int cpu;
|
||||||
|
@@ -11,6 +11,7 @@
|
|||||||
|
|
||||||
static void rcu_exp_handler(void *unused);
|
static void rcu_exp_handler(void *unused);
|
||||||
static int rcu_print_task_exp_stall(struct rcu_node *rnp);
|
static int rcu_print_task_exp_stall(struct rcu_node *rnp);
|
||||||
|
static void rcu_exp_print_detail_task_stall_rnp(struct rcu_node *rnp);
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* Record the start of an expedited grace period.
|
* Record the start of an expedited grace period.
|
||||||
@@ -667,8 +668,11 @@ static void synchronize_rcu_expedited_wait(void)
|
|||||||
mask = leaf_node_cpu_bit(rnp, cpu);
|
mask = leaf_node_cpu_bit(rnp, cpu);
|
||||||
if (!(READ_ONCE(rnp->expmask) & mask))
|
if (!(READ_ONCE(rnp->expmask) & mask))
|
||||||
continue;
|
continue;
|
||||||
|
preempt_disable(); // For smp_processor_id() in dump_cpu_task().
|
||||||
dump_cpu_task(cpu);
|
dump_cpu_task(cpu);
|
||||||
|
preempt_enable();
|
||||||
}
|
}
|
||||||
|
rcu_exp_print_detail_task_stall_rnp(rnp);
|
||||||
}
|
}
|
||||||
jiffies_stall = 3 * rcu_exp_jiffies_till_stall_check() + 3;
|
jiffies_stall = 3 * rcu_exp_jiffies_till_stall_check() + 3;
|
||||||
panic_on_rcu_stall();
|
panic_on_rcu_stall();
|
||||||
@@ -811,6 +815,36 @@ static int rcu_print_task_exp_stall(struct rcu_node *rnp)
|
|||||||
return ndetected;
|
return ndetected;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Scan the current list of tasks blocked within RCU read-side critical
|
||||||
|
* sections, dumping the stack of each that is blocking the current
|
||||||
|
* expedited grace period.
|
||||||
|
*/
|
||||||
|
static void rcu_exp_print_detail_task_stall_rnp(struct rcu_node *rnp)
|
||||||
|
{
|
||||||
|
unsigned long flags;
|
||||||
|
struct task_struct *t;
|
||||||
|
|
||||||
|
if (!rcu_exp_stall_task_details)
|
||||||
|
return;
|
||||||
|
raw_spin_lock_irqsave_rcu_node(rnp, flags);
|
||||||
|
if (!READ_ONCE(rnp->exp_tasks)) {
|
||||||
|
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
t = list_entry(rnp->exp_tasks->prev,
|
||||||
|
struct task_struct, rcu_node_entry);
|
||||||
|
list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry) {
|
||||||
|
/*
|
||||||
|
* We could be printing a lot while holding a spinlock.
|
||||||
|
* Avoid triggering hard lockup.
|
||||||
|
*/
|
||||||
|
touch_nmi_watchdog();
|
||||||
|
sched_show_task(t);
|
||||||
|
}
|
||||||
|
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
|
||||||
|
}
|
||||||
|
|
||||||
#else /* #ifdef CONFIG_PREEMPT_RCU */
|
#else /* #ifdef CONFIG_PREEMPT_RCU */
|
||||||
|
|
||||||
/* Request an expedited quiescent state. */
|
/* Request an expedited quiescent state. */
|
||||||
@@ -883,6 +917,15 @@ static int rcu_print_task_exp_stall(struct rcu_node *rnp)
|
|||||||
return 0;
|
return 0;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Because preemptible RCU does not exist, we never have to print out
|
||||||
|
* tasks blocked within RCU read-side critical sections that are blocking
|
||||||
|
* the current expedited grace period.
|
||||||
|
*/
|
||||||
|
static void rcu_exp_print_detail_task_stall_rnp(struct rcu_node *rnp)
|
||||||
|
{
|
||||||
|
}
|
||||||
|
|
||||||
#endif /* #else #ifdef CONFIG_PREEMPT_RCU */
|
#endif /* #else #ifdef CONFIG_PREEMPT_RCU */
|
||||||
|
|
||||||
/**
|
/**
|
||||||
|
@@ -39,7 +39,7 @@ int rcu_exp_jiffies_till_stall_check(void)
 	// CONFIG_RCU_EXP_CPU_STALL_TIMEOUT, so check the allowed range.
 	// The minimum clamped value is "2UL", because at least one full
 	// tick has to be guaranteed.
-	till_stall_check = clamp(msecs_to_jiffies(cpu_stall_timeout), 2UL, 21UL * HZ);
+	till_stall_check = clamp(msecs_to_jiffies(cpu_stall_timeout), 2UL, 300UL * HZ);
 
 	if (cpu_stall_timeout && jiffies_to_msecs(till_stall_check) != cpu_stall_timeout)
 		WRITE_ONCE(rcu_exp_cpu_stall_timeout, jiffies_to_msecs(till_stall_check));
@@ -428,6 +428,35 @@ static bool rcu_is_rcuc_kthread_starving(struct rcu_data *rdp, unsigned long *jp
 	return j > 2 * HZ;
 }
 
+static void print_cpu_stat_info(int cpu)
+{
+	struct rcu_snap_record rsr, *rsrp;
+	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
+	struct kernel_cpustat *kcsp = &kcpustat_cpu(cpu);
+
+	if (!rcu_cpu_stall_cputime)
+		return;
+
+	rsrp = &rdp->snap_record;
+	if (rsrp->gp_seq != rdp->gp_seq)
+		return;
+
+	rsr.cputime_irq     = kcpustat_field(kcsp, CPUTIME_IRQ, cpu);
+	rsr.cputime_softirq = kcpustat_field(kcsp, CPUTIME_SOFTIRQ, cpu);
+	rsr.cputime_system  = kcpustat_field(kcsp, CPUTIME_SYSTEM, cpu);
+
+	pr_err("\t         hardirqs   softirqs   csw/system\n");
+	pr_err("\t number: %8ld %10d %12lld\n",
+		kstat_cpu_irqs_sum(cpu) - rsrp->nr_hardirqs,
+		kstat_cpu_softirqs_sum(cpu) - rsrp->nr_softirqs,
+		nr_context_switches_cpu(cpu) - rsrp->nr_csw);
+	pr_err("\tcputime: %8lld %10lld %12lld   ==> %d(ms)\n",
+		div_u64(rsr.cputime_irq - rsrp->cputime_irq, NSEC_PER_MSEC),
+		div_u64(rsr.cputime_softirq - rsrp->cputime_softirq, NSEC_PER_MSEC),
+		div_u64(rsr.cputime_system - rsrp->cputime_system, NSEC_PER_MSEC),
+		jiffies_to_msecs(jiffies - rsrp->jiffies));
+}
+
 /*
  * Print out diagnostic information for the specified stalled CPU.
  *
@@ -484,6 +513,8 @@ static void print_cpu_stall_info(int cpu)
 	       data_race(rcu_state.n_force_qs) - rcu_state.n_force_qs_gpstart,
 	       rcuc_starved ? buf : "",
 	       falsepositive ? " (false positive?)" : "");
+
+	print_cpu_stat_info(cpu);
 }
 
 /* Complain about starvation of grace-period kthread. */
@@ -588,7 +619,7 @@ static void print_other_cpu_stall(unsigned long gp_seq, unsigned long gps)
 
 	for_each_possible_cpu(cpu)
 		totqlen += rcu_get_n_cbs_cpu(cpu);
-	pr_cont("\t(detected by %d, t=%ld jiffies, g=%ld, q=%lu ncpus=%d)\n",
+	pr_err("\t(detected by %d, t=%ld jiffies, g=%ld, q=%lu ncpus=%d)\n",
 	       smp_processor_id(), (long)(jiffies - gps),
 	       (long)rcu_seq_current(&rcu_state.gp_seq), totqlen, rcu_state.n_online_cpus);
 	if (ndetected) {
@@ -649,7 +680,7 @@ static void print_cpu_stall(unsigned long gps)
 	raw_spin_unlock_irqrestore_rcu_node(rdp->mynode, flags);
 	for_each_possible_cpu(cpu)
 		totqlen += rcu_get_n_cbs_cpu(cpu);
-	pr_cont("\t(t=%lu jiffies g=%ld q=%lu ncpus=%d)\n",
+	pr_err("\t(t=%lu jiffies g=%ld q=%lu ncpus=%d)\n",
 	       jiffies - gps,
 	       (long)rcu_seq_current(&rcu_state.gp_seq), totqlen, rcu_state.n_online_cpus);
 
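print_cpu_stat_info() prints only deltas, so the same counters must have been snapshotted into rdp->snap_record earlier in the grace period. The following is a hedged, illustrative sketch of that snapshot step rather than the actual tree code; my_snap_cpu_stats() is a hypothetical name, while the rcu_snap_record fields are exactly the ones whose deltas are printed above.

    /*
     * Illustrative only: capture the per-CPU counters that
     * print_cpu_stat_info() later subtracts from the current values.
     */
    static void my_snap_cpu_stats(struct rcu_data *rdp, int cpu)
    {
            struct rcu_snap_record *rsrp = &rdp->snap_record;
            struct kernel_cpustat *kcsp = &kcpustat_cpu(cpu);

            rsrp->cputime_irq     = kcpustat_field(kcsp, CPUTIME_IRQ, cpu);
            rsrp->cputime_softirq = kcpustat_field(kcsp, CPUTIME_SOFTIRQ, cpu);
            rsrp->cputime_system  = kcpustat_field(kcsp, CPUTIME_SYSTEM, cpu);
            rsrp->nr_hardirqs = kstat_cpu_irqs_sum(cpu);
            rsrp->nr_softirqs = kstat_cpu_softirqs_sum(cpu);
            rsrp->nr_csw = nr_context_switches_cpu(cpu);
            rsrp->jiffies = jiffies;
            rsrp->gp_seq = rdp->gp_seq;     /* lets the gp_seq check above pass */
    }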
@@ -144,8 +144,45 @@ bool rcu_gp_is_normal(void)
 }
 EXPORT_SYMBOL_GPL(rcu_gp_is_normal);
 
-static atomic_t rcu_expedited_nesting = ATOMIC_INIT(1);
+static atomic_t rcu_async_hurry_nesting = ATOMIC_INIT(1);
+/*
+ * Should call_rcu() callbacks be processed with urgency or are
+ * they OK being executed with arbitrary delays?
+ */
+bool rcu_async_should_hurry(void)
+{
+	return !IS_ENABLED(CONFIG_RCU_LAZY) ||
+	       atomic_read(&rcu_async_hurry_nesting);
+}
+EXPORT_SYMBOL_GPL(rcu_async_should_hurry);
+
+/**
+ * rcu_async_hurry - Make future async RCU callbacks not lazy.
+ *
+ * After a call to this function, future calls to call_rcu()
+ * will be processed in a timely fashion.
+ */
+void rcu_async_hurry(void)
+{
+	if (IS_ENABLED(CONFIG_RCU_LAZY))
+		atomic_inc(&rcu_async_hurry_nesting);
+}
+EXPORT_SYMBOL_GPL(rcu_async_hurry);
+
+/**
+ * rcu_async_relax - Make future async RCU callbacks lazy.
+ *
+ * After a call to this function, future calls to call_rcu()
+ * will be processed in a lazy fashion.
+ */
+void rcu_async_relax(void)
+{
+	if (IS_ENABLED(CONFIG_RCU_LAZY))
+		atomic_dec(&rcu_async_hurry_nesting);
+}
+EXPORT_SYMBOL_GPL(rcu_async_relax);
+
+static atomic_t rcu_expedited_nesting = ATOMIC_INIT(1);
 /*
  * Should normal grace-period primitives be expedited? Intended for
  * use within RCU. Note that this function takes the rcu_expedited
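To make the intent of these hooks concrete, here is a minimal sketch of a caller that brackets a latency-sensitive phase so that callbacks it queues are not subject to CONFIG_RCU_LAZY batching. Everything named my_* is hypothetical; only rcu_async_hurry(), rcu_async_relax(), call_rcu(), and kfree() are real kernel interfaces, and the pairing follows the kernel-doc comments above.

    #include <linux/rcupdate.h>
    #include <linux/slab.h>

    struct my_obj {
            struct rcu_head rh;
            /* ... payload ... */
    };

    static void my_obj_free_cb(struct rcu_head *rhp)
    {
            kfree(container_of(rhp, struct my_obj, rh));
    }

    /*
     * Hypothetical resume-time path: per the kernel-doc above, call_rcu()
     * invocations made while "hurried" are processed in a timely fashion
     * rather than being lazily batched under CONFIG_RCU_LAZY=y.
     */
    static void my_resume_cleanup(struct my_obj *p)
    {
            rcu_async_hurry();
            call_rcu(&p->rh, my_obj_free_cb);
            rcu_async_relax();
    }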
@@ -195,6 +232,7 @@ static bool rcu_boot_ended __read_mostly;
 void rcu_end_inkernel_boot(void)
 {
 	rcu_unexpedite_gp();
+	rcu_async_relax();
 	if (rcu_normal_after_boot)
 		WRITE_ONCE(rcu_normal, 1);
 	rcu_boot_ended = true;
@@ -220,6 +258,7 @@ void rcu_test_sync_prims(void)
 {
 	if (!IS_ENABLED(CONFIG_PROVE_RCU))
 		return;
+	pr_info("Running RCU synchronous self tests\n");
 	synchronize_rcu();
 	synchronize_rcu_expedited();
 }
@@ -508,6 +547,10 @@ int rcu_cpu_stall_timeout __read_mostly = CONFIG_RCU_CPU_STALL_TIMEOUT;
 module_param(rcu_cpu_stall_timeout, int, 0644);
 int rcu_exp_cpu_stall_timeout __read_mostly = CONFIG_RCU_EXP_CPU_STALL_TIMEOUT;
 module_param(rcu_exp_cpu_stall_timeout, int, 0644);
+int rcu_cpu_stall_cputime __read_mostly = IS_ENABLED(CONFIG_RCU_CPU_STALL_CPUTIME);
+module_param(rcu_cpu_stall_cputime, int, 0644);
+bool rcu_exp_stall_task_details __read_mostly;
+module_param(rcu_exp_stall_task_details, bool, 0644);
 #endif /* #ifdef CONFIG_RCU_STALL_COMMON */
 
 // Suppress boot-time RCU CPU stall warnings and rcutorture writer stall
@@ -555,9 +598,12 @@ struct early_boot_kfree_rcu {
 static void early_boot_test_call_rcu(void)
 {
 	static struct rcu_head head;
+	int idx;
 	static struct rcu_head shead;
 	struct early_boot_kfree_rcu *rhp;
 
+	idx = srcu_down_read(&early_srcu);
+	srcu_up_read(&early_srcu, idx);
 	call_rcu(&head, test_callback);
 	early_srcu_cookie = start_poll_synchronize_srcu(&early_srcu);
 	call_srcu(&early_srcu, &shead, test_callback);
@@ -586,6 +632,7 @@ static int rcu_verify_early_boot_tests(void)
 		early_boot_test_counter++;
 		srcu_barrier(&early_srcu);
 		WARN_ON_ONCE(!poll_state_synchronize_srcu(&early_srcu, early_srcu_cookie));
+		cleanup_srcu_struct(&early_srcu);
 	}
 	if (rcu_self_test_counter != early_boot_test_counter) {
 		WARN_ON(1);
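The early-boot test above exercises srcu_down_read()/srcu_up_read(), which, unlike srcu_read_lock()/srcu_read_unlock(), allow an SRCU read-side critical section to end in a different task than the one that began it. A minimal hedged sketch of such a handoff via a workqueue follows; my_srcu, struct my_handoff, my_start_handoff(), and my_handoff_work() are all hypothetical names introduced for illustration.

    #include <linux/slab.h>
    #include <linux/srcu.h>
    #include <linux/workqueue.h>

    DEFINE_STATIC_SRCU(my_srcu);

    struct my_handoff {
            struct work_struct work;
            int idx;                        /* SRCU index handed to the worker */
    };

    static void my_handoff_work(struct work_struct *w)
    {
            struct my_handoff *h = container_of(w, struct my_handoff, work);

            /* ... read-side accesses protected by my_srcu ... */
            srcu_up_read(&my_srcu, h->idx); /* reader ends in this task */
            kfree(h);
    }

    static int my_start_handoff(void)
    {
            struct my_handoff *h = kmalloc(sizeof(*h), GFP_KERNEL);

            if (!h)
                    return -ENOMEM;
            h->idx = srcu_down_read(&my_srcu);      /* reader begins in this task */
            INIT_WORK(&h->work, my_handoff_work);
            schedule_work(&h->work);
            return 0;
    }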
@@ -5342,6 +5342,11 @@ bool single_task_running(void)
 }
 EXPORT_SYMBOL(single_task_running);
 
+unsigned long long nr_context_switches_cpu(int cpu)
+{
+	return cpu_rq(cpu)->nr_switches;
+}
+
 unsigned long long nr_context_switches(void)
 {
 	int i;
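nr_context_switches_cpu() simply exposes the per-runqueue nr_switches counter; the stall-warning code above uses it to compute the csw/system delta. A hedged sketch of the same snapshot-and-delta pattern outside RCU, where my_csw_snap, my_snap_csw(), and my_report_csw() are hypothetical, and the header assumed to declare the helper is the one that already declares nr_context_switches().

    #include <linux/kernel_stat.h>  /* assumed home of the nr_context_switches*() declarations */
    #include <linux/percpu.h>
    #include <linux/printk.h>

    static DEFINE_PER_CPU(unsigned long long, my_csw_snap);

    /* Record the current context-switch count for @cpu at the start of an interval. */
    static void my_snap_csw(int cpu)
    {
            per_cpu(my_csw_snap, cpu) = nr_context_switches_cpu(cpu);
    }

    /* Report how many context switches @cpu performed since the snapshot. */
    static void my_report_csw(int cpu)
    {
            pr_info("cpu %d: %llu context switches since snapshot\n",
                    cpu, nr_context_switches_cpu(cpu) - per_cpu(my_csw_snap, cpu));
    }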
@@ -450,7 +450,7 @@ unsigned long
 torture_random(struct torture_random_state *trsp)
 {
 	if (--trsp->trs_count < 0) {
-		trsp->trs_state += (unsigned long)local_clock();
+		trsp->trs_state += (unsigned long)local_clock() + raw_smp_processor_id();
 		trsp->trs_count = TORTURE_RANDOM_REFRESH;
 	}
 	trsp->trs_state = trsp->trs_state * TORTURE_RANDOM_MULT +
@@ -915,7 +915,7 @@ void torture_kthread_stopping(char *title)
 	VERBOSE_TOROUT_STRING(buf);
 	while (!kthread_should_stop()) {
 		torture_shutdown_absorb(title);
-		schedule_timeout_uninterruptible(1);
+		schedule_timeout_uninterruptible(HZ / 20);
 	}
 }
 EXPORT_SYMBOL_GPL(torture_kthread_stopping);
@@ -10,10 +10,9 @@
 T="`mktemp -d ${TMPDIR-/tmp}/configcheck.sh.XXXXXX`"
 trap 'rm -rf $T' 0
 
-cat $1 > $T/.config
+sed -e 's/"//g' < $1 > $T/.config
 
-cat $2 | sed -e 's/\(.*\)=n/# \1 is not set/' -e 's/^#CHECK#//' |
-grep -v '^CONFIG_INITRAMFS_SOURCE' |
+sed -e 's/"//g' -e 's/\(.*\)=n/# \1 is not set/' -e 's/^#CHECK#//' < $2 |
 awk '
 {
 	print "if grep -q \"" $0 "\" < '"$T/.config"'";
@@ -10,7 +10,7 @@
 #
 # Authors: Paul E. McKenney <paulmck@kernel.org>
 
-egrep 'Badness|WARNING:|Warn|BUG|===========|BUG: KCSAN:|Call Trace:|Oops:|detected stalls on CPUs/tasks:|self-detected stall on CPU|Stall ended before state dump start|\?\?\? Writer stall state|rcu_.*kthread starved for|!!!' |
+grep -E 'Badness|WARNING:|Warn|BUG|===========|BUG: KCSAN:|Call Trace:|Oops:|detected stalls on CPUs/tasks:|self-detected stall on CPU|Stall ended before state dump start|\?\?\? Writer stall state|rcu_.*kthread starved for|!!!' |
 grep -v 'ODEBUG: ' |
 grep -v 'This means that this is a DEBUG kernel and it is' |
 grep -v 'Warning: unable to open an initial console' |
@@ -44,10 +44,10 @@ fi
 ncpus="`getconf _NPROCESSORS_ONLN`"
 make -j$((2 * ncpus)) $TORTURE_KMAKE_ARG > $resdir/Make.out 2>&1
 retval=$?
-if test $retval -ne 0 || grep "rcu[^/]*": < $resdir/Make.out | egrep -q "Stop|Error|error:|warning:" || egrep -q "Stop|Error|error:" < $resdir/Make.out
+if test $retval -ne 0 || grep "rcu[^/]*": < $resdir/Make.out | grep -E -q "Stop|Error|error:|warning:" || grep -E -q "Stop|Error|error:" < $resdir/Make.out
 then
 	echo Kernel build error
-	egrep "Stop|Error|error:|warning:" < $resdir/Make.out
+	grep -E "Stop|Error|error:|warning:" < $resdir/Make.out
 	echo Run aborted.
 	exit 3
 fi
@@ -32,11 +32,11 @@ for i in ${rundir}/*/Make.out
 do
 	scenariodir="`dirname $i`"
 	scenariobasedir="`echo ${scenariodir} | sed -e 's/\.[0-9]*$//'`"
-	if egrep -q "error:|warning:|^ld: .*undefined reference to" < $i
+	if grep -E -q "error:|warning:|^ld: .*undefined reference to" < $i
 	then
-		egrep "error:|warning:|^ld: .*undefined reference to" < $i > $i.diags
+		grep -E "error:|warning:|^ld: .*undefined reference to" < $i > $i.diags
 		files="$files $i.diags $i"
-	elif ! test -f ${scenariobasedir}/vmlinux && ! test -f "${rundir}/re-run"
+	elif ! test -f ${scenariobasedir}/vmlinux && ! test -f ${scenariobasedir}/vmlinux.xz && ! test -f "${rundir}/re-run"
 	then
 		echo No ${scenariobasedir}/vmlinux file > $i.diags
 		files="$files $i.diags $i"
@@ -186,7 +186,7 @@ do
 		fi
 		;;
 	--kconfig|--kconfigs)
-		checkarg --kconfig "(Kconfig options)" $# "$2" '^CONFIG_[A-Z0-9_]\+=\([ynm]\|[0-9]\+\)\( CONFIG_[A-Z0-9_]\+=\([ynm]\|[0-9]\+\)\)*$' '^error$'
+		checkarg --kconfig "(Kconfig options)" $# "$2" '^CONFIG_[A-Z0-9_]\+=\([ynm]\|[0-9]\+\|"[^"]*"\)\( CONFIG_[A-Z0-9_]\+=\([ynm]\|[0-9]\+\|"[^"]*"\)\)*$' '^error$'
 		TORTURE_KCONFIG_ARG="`echo "$TORTURE_KCONFIG_ARG $2" | sed -e 's/^ *//' -e 's/ *$//'`"
 		shift
 		;;
@@ -585,7 +585,7 @@ awk < $T/cfgcpu.pack \
 echo kvm-end-run-stats.sh "$resdir/$ds" "$starttime" >> $T/script
 
 # Extract the tests and their batches from the script.
-egrep 'Start batch|Starting build\.' $T/script | grep -v ">>" |
+grep -E 'Start batch|Starting build\.' $T/script | grep -v ">>" |
 	sed -e 's/:.*$//' -e 's/^echo //' -e 's/-ovf//' |
 	awk '
 	/^----Start/ {
@@ -622,7 +622,7 @@ then
 elif test "$dryrun" = sched
 then
 	# Extract the test run schedule from the script.
-	egrep 'Start batch|Starting build\.' $T/script | grep -v ">>" |
+	grep -E 'Start batch|Starting build\.' $T/script | grep -v ">>" |
 		sed -e 's/:.*$//' -e 's/^echo //'
 	nbuilds="`grep 'Starting build\.' $T/script |
 		grep -v ">>" | sed -e 's/:.*$//' -e 's/^echo //' |
@@ -65,7 +65,7 @@ then
 fi
 
 grep --binary-files=text 'torture:.*ver:' $file |
-	egrep --binary-files=text -v '\(null\)|rtc: 000000000* ' |
+	grep -E --binary-files=text -v '\(null\)|rtc: 000000000* ' |
 	sed -e 's/^(initramfs)[^]]*] //' -e 's/^\[[^]]*] //' |
 	sed -e 's/^.*ver: //' |
 	awk '
@@ -128,17 +128,17 @@ then
 	then
 		summary="$summary Badness: $n_badness"
 	fi
-	n_warn=`grep -v 'Warning: unable to open an initial console' $file | grep -v 'Warning: Failed to add ttynull console. No stdin, stdout, and stderr for the init process' | egrep -c 'WARNING:|Warn'`
+	n_warn=`grep -v 'Warning: unable to open an initial console' $file | grep -v 'Warning: Failed to add ttynull console. No stdin, stdout, and stderr for the init process' | grep -E -c 'WARNING:|Warn'`
 	if test "$n_warn" -ne 0
 	then
 		summary="$summary Warnings: $n_warn"
 	fi
-	n_bugs=`egrep -c '\bBUG|Oops:' $file`
+	n_bugs=`grep -E -c '\bBUG|Oops:' $file`
 	if test "$n_bugs" -ne 0
 	then
 		summary="$summary Bugs: $n_bugs"
 	fi
-	n_kcsan=`egrep -c 'BUG: KCSAN: ' $file`
+	n_kcsan=`grep -E -c 'BUG: KCSAN: ' $file`
 	if test "$n_kcsan" -ne 0
 	then
 		if test "$n_bugs" = "$n_kcsan"
@@ -158,7 +158,7 @@ then
 	then
 		summary="$summary lockdep: $n_badness"
 	fi
-	n_stalls=`egrep -c 'detected stalls on CPUs/tasks:|self-detected stall on CPU|Stall ended before state dump start|\?\?\? Writer stall state' $file`
+	n_stalls=`grep -E -c 'detected stalls on CPUs/tasks:|self-detected stall on CPU|Stall ended before state dump start|\?\?\? Writer stall state' $file`
 	if test "$n_stalls" -ne 0
 	then
 		summary="$summary Stalls: $n_stalls"