
Re: [PATCH 1/3 gnumach] vm_map: Fix deadlock in vm code


From: Samuel Thibault
Subject: Re: [PATCH 1/3 gnumach] vm_map: Fix deadlock in vm code
Date: Wed, 21 Feb 2024 19:44:08 +0100
User-agent: NeoMutt/20170609 (1.8.3)

Damien Zammit, on Wed, 21 Feb 2024 13:45:59 +0000, wrote:
> Authored-by: Sergey Bugaev <bugaevc@gmail.com>
> ---
>  vm/vm_map.c | 32 ++++++++++++++++++++++----------
>  1 file changed, 22 insertions(+), 10 deletions(-)
> 
> diff --git a/vm/vm_map.c b/vm/vm_map.c
> index f221c532..e4672260 100644
> --- a/vm/vm_map.c
> +++ b/vm/vm_map.c
> @@ -1424,8 +1424,9 @@ vm_map_pageable_scan(struct vm_map *map,
>                    struct vm_map_entry *start,
>                    struct vm_map_entry *end)
>  {
> -     struct vm_map_entry *entry;
> -     boolean_t do_wire_faults;
> +     struct vm_map_entry     *entry;
> +     struct vm_map_entry     entry_copy;
> +     boolean_t               do_wire_faults;
>  
>       /*
>        * Pass 1. Update counters and prepare wiring faults.
> @@ -1549,15 +1550,30 @@ vm_map_pageable_scan(struct vm_map *map,
>        * If we are wiring in the kernel map or a submap of it,
>        * unlock the map to avoid deadlocks. [...]

IIRC you got to see which kind of deadlock we are protecting from here?
It would be very useful to document it here, so that the next people
reading this code know what it is.
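
For what it's worth, here is a guess at the kind of comment that could
be added above the unlock; the scenario it describes is an assumption,
not something stated in the patch:

        /*
         * (Assumed scenario, to be verified:)  wiring this range calls
         * vm_fault_wire(), which faults the pages in and may need to
         * allocate kernel memory (page tables, VM structures) along
         * the way.  Such allocations can themselves need the kernel
         * map lock, i.e. the very lock we are holding here, so keeping
         * the kernel map locked across the faults could deadlock.
         */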

>        [...] We trust that the
>        * kernel threads are well-behaved, and therefore will
>        * not do anything destructive to this region of the map
>        * while we have it unlocked.

Are we not here precisely in the case where they are not well-behaved?
The comment probably deserves an update, to make the situation clearer.

> We cannot trust user threads
>        * to do the same.
>        *
> +      * Once we unlock the map, even well-intentioned operations
> +      * on adjacent VM regions can end up affecting our entry,
> +      * due to clipping and coalescing entries.  So, make a
> +      * temporary copy of the entry, and pass that to vm_fault_wire()
> +      * instead of the original.

We need to be very careful with such a thing. If it is a copy that is
given to further function calls, we need to make sure that they won't
try to modify it, i.e. add const qualifiers on the entry parameter all
the way down to vm_fault_wire and the callees it passes the entry to.
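
For illustration, and assuming the current prototypes are
vm_fault_wire(vm_map_t, vm_map_entry_t) and
vm_fault_wire_fast(vm_map_t, vm_offset_t, vm_map_entry_t), the
const-ification could look like the sketch below; note that since
vm_map_entry_t is a pointer typedef, the const has to be spelled on the
pointed-to struct:

        /* Hypothetical const-qualified prototypes (not in the tree): */
        extern void vm_fault_wire(vm_map_t map,
                                  const struct vm_map_entry *entry);

        extern kern_return_t vm_fault_wire_fast(vm_map_t map, vm_offset_t va,
                                                const struct vm_map_entry *entry);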

That being said, this doesn't seem completely safe to me: in the end,
vm_fault_wire_fast uses the object field, so are we sure that the object
won't disappear? Or are we sure it won't because "kernel threads are
well-behaved"? (But I don't really know how we can be sure of this, so
this probably also deserves documenting.)
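
Purely as an illustration of that concern, one conceivable guard would
be to take an extra reference on the object across the unlocked window;
this is not part of the patch, and whether it is needed or sufficient
is exactly the open question above:

        vm_object_t     object = start->object.vm_object;

        assert(start->vme_next == end);
        entry_copy = *start;
        if (object != VM_OBJECT_NULL)
                vm_object_reference(object);    /* pin the object */
        vm_map_unlock(map);                     /* trust me ... */
        vm_fault_wire(map, &entry_copy);
        vm_map_lock(map);
        if (object != VM_OBJECT_NULL)
                vm_object_deallocate(object);   /* drop the extra reference */
        return;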

> +      *
>        * HACK HACK HACK HACK
>        */
>       if (vm_map_pmap(map) == kernel_pmap) {
> +             /*
> +              *      TODO: Support wiring more than one entry
> +              *      in the kernel map at a time.
> +              */
> +             assert(start->vme_next == end);

Are we sure that this can currently never happen?

> +             entry_copy = *start;
>               vm_map_unlock(map); /* trust me ... */
> -     } else {
> -             vm_map_lock_set_recursive(map);
> -             vm_map_lock_write_to_read(map);
> +             vm_fault_wire(map, &entry_copy);

This is also assuming that for non-kernel_pmap there is just one entry
for which to call vm_fault_wire, while userland can very well ask for
wiring a wide range of memory, spanning various objects and whatnot.

AIUI you'd rather want the converse: inline the code for the kernel_pmap
case, which is much more trivial since start->vme_next == end means
there is no for loop to perform; just call vm_fault_wire with the
locking steps around it, and be done.
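
If I read that right, the special case would be shaped roughly as below,
keeping the existing multi-entry loop for all other maps; this is only a
sketch of the intended structure, not a replacement hunk:

        if (vm_map_pmap(map) == kernel_pmap) {
                /* Single-entry kernel-map case: copy the entry, drop
                 * the lock, wire, retake the lock, and return. */
                assert(start->vme_next == end);
                entry_copy = *start;
                vm_map_unlock(map);     /* trust me ... */
                vm_fault_wire(map, &entry_copy);
                vm_map_lock(map);
                return;
        }

        /* Every other map goes through the existing loop, under the
         * recursive read lock. */
        vm_map_lock_set_recursive(map);
        vm_map_lock_write_to_read(map);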

> +             vm_map_lock(map);
> +             return;
>       }
>  
> +     vm_map_lock_set_recursive(map);
> +     vm_map_lock_write_to_read(map);
> +
>       for (entry = start; entry != end; entry = entry->vme_next) {
>               /*
>                * The wiring count can only be 1 if it was
> @@ -1572,11 +1588,7 @@ vm_map_pageable_scan(struct vm_map *map,
>               }
>       }
>  
> -     if (vm_map_pmap(map) == kernel_pmap) {
> -             vm_map_lock(map);
> -     } else {
> -             vm_map_lock_clear_recursive(map);
> -     }
> +     vm_map_lock_clear_recursive(map);
>  }


