livepatch: Simplify API by removing registration step

The possibility to re-enable a registered patch was useful for immediate
patches where the livepatch module had to stay until the system reboot.
The improved consistency model makes it possible to achieve the same
result by unloading and loading the livepatch module again.

Also we are going to add a feature called atomic replace. It will make
it possible to create a patch that replaces all already registered
patches. The aim is to handle dependent patches more securely. It will
obsolete the stack of patches that helped to handle the dependencies so
far. Then it might be unclear when re-enabling a cumulative patch is
safe.

Supporting all these modes would be complicated. Instead, we can make
the API and the code easier to understand.

Therefore, remove the two-step public API. All the checks and init calls
are moved from klp_register_patch() to klp_enable_patch(). Also the patch
is automatically freed, including the sysfs interface, when the
transition to the disabled state is completed.

As a result, there is never a disabled patch on the top of the stack.
Therefore we do not need to check the stack in __klp_enable_patch().
And we can simplify the check in __klp_disable_patch().
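
The simplified stacking rule can be modeled in plain userspace C. This is
only a sketch: the array-based stack, model_can_disable() and MODEL_EBUSY
are made-up stand-ins for illustration, not the kernel's list code or its
actual return value.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_EBUSY 16	/* stand-in error value for this model only */

/* Hypothetical model: patches in the order they were enabled. */
struct patch_stack {
	const char *names[8];
	size_t count;
};

/*
 * Every patch on the stack is enabled, because a disabled patch is
 * freed and dropped. So the only question left is whether the patch
 * is the last one on the stack.
 */
static int model_can_disable(const struct patch_stack *s, size_t idx)
{
	assert(idx < s->count);
	return (idx == s->count - 1) ? 0 : -MODEL_EBUSY;
}
```

With three stacked patches, only the most recently enabled one passes
the check; the state of its neighbours never has to be inspected.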

Also the API and logic are much simpler. It is enough to call
klp_enable_patch() from the module_init() callback. The patch can be
disabled by writing '0' into /sys/kernel/livepatch/<patch>/enabled. Then
the module can be removed once the transition finishes and the sysfs
interface is freed.
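
The shape of a livepatch module under the one-step API might look roughly
like the sketch below. It is written so that it compiles in userspace:
the struct definitions and the klp_enable_patch() body are simplified
stand-ins (a real module would get them from <linux/livepatch.h>), and
example_func/new_example_func are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace stand-ins so the sketch compiles on its own. The "enabled"
 * field is bookkeeping for this model only.
 */
struct klp_func {
	const char *old_name;
	void (*new_func)(void);
};

struct klp_object {
	const char *name;	/* NULL stands for vmlinux */
	struct klp_func *funcs;
};

struct klp_patch {
	struct klp_object *objs;
	bool enabled;
};

/* Stub: returns 0 on success, a negative value on failure. */
static int klp_enable_patch(struct klp_patch *patch)
{
	assert(patch != NULL);
	if (!patch->objs)
		return -1;
	patch->enabled = true;
	return 0;
}

/* Hypothetical replacement for a hypothetical function "example_func". */
static void new_example_func(void)
{
}

static struct klp_func funcs[] = {
	{ .old_name = "example_func", .new_func = new_example_func },
	{ 0 }
};

static struct klp_object objs[] = {
	{ .name = NULL, .funcs = funcs },
	{ 0 }
};

static struct klp_patch patch = { .objs = objs };

/* One-step API: the init callback only enables the patch. */
static int livepatch_init(void)
{
	return klp_enable_patch(&patch);
}

/* No unregistration is needed, so the exit callback stays empty. */
static void livepatch_exit(void)
{
}
```

In a real module the two callbacks would be wired up with module_init()
and module_exit(). An error returned from the init callback refuses the
module load, which is why enabling from there is convenient.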

The only problem is how to free the structures and kobjects safely.
The operation is triggered from the sysfs interface. We cannot put
the related kobject from there because it would cause lock inversion
between klp_mutex and kernfs locks, see kn->count lockdep map.

Therefore, offload the free task to a workqueue. It is perfectly fine:

  + The patch can no longer be used in the livepatch operations.

  + The module cannot be removed until the free operation finishes
    and module_put() is called.

  + The operation is already asynchronous when the first
    klp_try_complete_transition() fails and another call
    is queued with a delay.
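
The ordering argued above can be illustrated with a single-threaded toy
model. Everything in it is an assumption made for illustration: the
struct, the flags and the one-slot "workqueue" are not kernel API, and
the transition is assumed to complete immediately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one patch; none of these names are kernel API. */
struct patch_model {
	bool enabled;
	bool sysfs_present;	/* the kobject still exists */
	int module_refs;	/* stands in for the module reference */
};

static bool in_sysfs_store;		/* inside the sysfs write handler */
static bool klp_mutex_held;		/* models klp_mutex */
static struct patch_model *pending;	/* one-slot stand-in for a workqueue */

/* Runs later from the "worker", never from the sysfs handler. */
static void free_work_fn(struct patch_model *p)
{
	assert(!in_sysfs_store && !klp_mutex_held);
	p->sysfs_present = false;	/* put the kobject */
	p->module_refs--;		/* module_put() comes last */
}

/* Models writing '0' to "enabled" with a transition that ends at once. */
static void enabled_store_zero(struct patch_model *p)
{
	in_sysfs_store = true;
	klp_mutex_held = true;

	p->enabled = false;
	pending = p;		/* queue the free work instead of freeing here */

	klp_mutex_held = false;
	in_sysfs_store = false;
}

/* Models the worker picking up the queued item. */
static void run_pending_work(void)
{
	struct patch_model *p = pending;

	pending = NULL;
	if (p)
		free_work_fn(p);
}
```

Between enabled_store_zero() and run_pending_work() the patch is already
disabled but the sysfs entry and the module reference are still present,
which is the window in which the module cannot be removed yet.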

Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Acked-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Petr Mladek
2019-01-09 13:43:23 +01:00
committed by Jiri Kosina
parent 68007289bf
commit 958ef1e39d
9 changed files with 171 additions and 332 deletions


@@ -12,12 +12,11 @@ Table of Contents:
 4. Livepatch module
 4.1. New functions
 4.2. Metadata
-4.3. Livepatch module handling
 5. Livepatch life-cycle
-5.1. Registration
+5.1. Loading
 5.2. Enabling
 5.3. Disabling
-5.4. Unregistration
+5.4. Removing
 6. Sysfs
 7. Limitations
@@ -298,117 +297,89 @@ into three levels:
 see the "Consistency model" section.
-4.3. Livepatch module handling
-------------------------------
-The usual behavior is that the new functions will get used when
-the livepatch module is loaded. For this, the module init() function
-has to register the patch (struct klp_patch) and enable it. See the
-section "Livepatch life-cycle" below for more details about these
-two operations.
-Module removal is only safe when there are no users of the underlying
-functions. This is the reason why the force feature permanently disables
-the removal. The forced tasks entered the functions but we cannot say
-that they returned back. Therefore it cannot be decided when the
-livepatch module can be safely removed. When the system is successfully
-transitioned to a new patch state (patched/unpatched) without being
-forced it is guaranteed that no task sleeps or runs in the old code.
 5. Livepatch life-cycle
 =======================
-Livepatching defines four basic operations that define the life cycle of each
-live patch: registration, enabling, disabling and unregistration. There are
-several reasons why it is done this way.
-First, the patch is applied only when all patched symbols for already
-loaded objects are found. The error handling is much easier if this
-check is done before particular functions get redirected.
-Second, it might take some time until the entire system is migrated with
-the hybrid consistency model being used. The patch revert might block
-the livepatch module removal for too long. Therefore it is useful to
-revert the patch using a separate operation that might be called
-explicitly. But it does not make sense to remove all information until
-the livepatch module is really removed.
+Livepatching can be described by four basic operations:
+loading, enabling, disabling, removing.
-5.1. Registration
------------------
+5.1. Loading
+------------
-Each patch first has to be registered using klp_register_patch(). This makes
-the patch known to the livepatch framework. Also it does some preliminary
-computing and checks.
+The only reasonable way is to enable the patch when the livepatch kernel
+module is being loaded. For this, klp_enable_patch() has to be called
+in the module_init() callback. There are two main reasons:
-In particular, the patch is added into the list of known patches. The
-addresses of the patched functions are found according to their names.
-The special relocations, mentioned in the section "New functions", are
-applied. The relevant entries are created under
-/sys/kernel/livepatch/<name>. The patch is rejected when any operation
-fails.
+First, only the module has an easy access to the related struct klp_patch.
+Second, the error code might be used to refuse loading the module when
+the patch cannot get enabled.
 5.2. Enabling
 -------------
-Registered patches might be enabled either by calling klp_enable_patch() or
-by writing '1' to /sys/kernel/livepatch/<name>/enabled. The system will
-start using the new implementation of the patched functions at this stage.
+The livepatch gets enabled by calling klp_enable_patch() from
+the module_init() callback. The system will start using the new
+implementation of the patched functions at this stage.
-When a patch is enabled, livepatch enters into a transition state where
-tasks are converging to the patched state. This is indicated by a value
-of '1' in /sys/kernel/livepatch/<name>/transition. Once all tasks have
-been patched, the 'transition' value changes to '0'. For more
-information about this process, see the "Consistency model" section.
+First, the addresses of the patched functions are found according to their
+names. The special relocations, mentioned in the section "New functions",
+are applied. The relevant entries are created under
+/sys/kernel/livepatch/<name>. The patch is rejected when any above
+operation fails.
-If an original function is patched for the first time, a function
-specific struct klp_ops is created and an universal ftrace handler is
-registered.
+Second, livepatch enters into a transition state where tasks are converging
+to the patched state. If an original function is patched for the first
+time, a function specific struct klp_ops is created and an universal
+ftrace handler is registered[*]. This stage is indicated by a value of '1'
+in /sys/kernel/livepatch/<name>/transition. For more information about
+this process, see the "Consistency model" section.
-Functions might be patched multiple times. The ftrace handler is registered
-only once for the given function. Further patches just add an entry to the
-list (see field `func_stack`) of the struct klp_ops. The last added
-entry is chosen by the ftrace handler and becomes the active function
-replacement.
+Finally, once all tasks have been patched, the 'transition' value changes
+to '0'.
-Note that the patches might be enabled in a different order than they were
-registered.
+[*] Note that functions might be patched multiple times. The ftrace handler
+is registered only once for a given function. Further patches just add
+an entry to the list (see field `func_stack`) of the struct klp_ops.
+The right implementation is selected by the ftrace handler, see
+the "Consistency model" section.
 5.3. Disabling
 --------------
-Enabled patches might get disabled either by calling klp_disable_patch() or
-by writing '0' to /sys/kernel/livepatch/<name>/enabled. At this stage
-either the code from the previously enabled patch or even the original
-code gets used.
+Enabled patches might get disabled by writing '0' to
+/sys/kernel/livepatch/<name>/enabled.
-When a patch is disabled, livepatch enters into a transition state where
-tasks are converging to the unpatched state. This is indicated by a
-value of '1' in /sys/kernel/livepatch/<name>/transition. Once all tasks
-have been unpatched, the 'transition' value changes to '0'. For more
-information about this process, see the "Consistency model" section.
+First, livepatch enters into a transition state where tasks are converging
+to the unpatched state. The system starts using either the code from
+the previously enabled patch or even the original one. This stage is
+indicated by a value of '1' in /sys/kernel/livepatch/<name>/transition.
+For more information about this process, see the "Consistency model"
+section.
-Here all the functions (struct klp_func) associated with the to-be-disabled
+Second, once all tasks have been unpatched, the 'transition' value changes
+to '0'. All the functions (struct klp_func) associated with the to-be-disabled
 patch are removed from the corresponding struct klp_ops. The ftrace handler
 is unregistered and the struct klp_ops is freed when the func_stack list
 becomes empty.
-Patches must be disabled in exactly the reverse order in which they were
-enabled. It makes the problem and the implementation much easier.
+Third, the sysfs interface is destroyed.
+Note that patches must be disabled in exactly the reverse order in which
+they were enabled. It makes the problem and the implementation much easier.
-5.4. Unregistration
--------------------
+5.4. Removing
+-------------
-Disabled patches might be unregistered by calling klp_unregister_patch().
-This can be done only when the patch is disabled and the code is no longer
-used. It must be called before the livepatch module gets unloaded.
-At this stage, all the relevant sys-fs entries are removed and the patch
-is removed from the list of known patches.
+Module removal is only safe when there are no users of functions provided
+by the module. This is the reason why the force feature permanently
+disables the removal. Only when the system is successfully transitioned
+to a new patch state (patched/unpatched) without being forced it is
+guaranteed that no task sleeps or runs in the old code.
 6. Sysfs