bug-hurd

Re: Future use of OSKit facilities in gnumach


From: John Tobey
Subject: Re: Future use of OSKit facilities in gnumach
Date: Sun, 2 Mar 2003 21:37:25 -0500
User-agent: Mutt/1.3.28i

> This stuff is really all using the "minimal" code.  This is just stuff for
> booting and such, and things like _exit to reboot and printf for the
> boot-time output on the minimal console.

> The use now is pretty minimal and if anything there will be less use rather
> than more.

Thanks Roland, that's what I was hoping.

> The existing oskit-on-unix code ought to do you just fine for most of the
> "hardware" support, i.e. replacing the boot path, the minimal console,
> device driver, timer, etc. interfaces.  Really the only hard work you have
> is the hard work of your plan, i.e. context switching and address spaces.

No, that is the FUN part.  I've got that working rather well, if I may
say so (rewritten since my first try).  The hard part is putting it
all together.

> I am certainly receptive to making more parts of GNU Mach replaceable with
> OSKit modules.

Thanks.

> For your hack, you probably can get a lot done without going the most
> modular oskity route.  Most of what you need to replace is in locore and
> cswitch.

Don't forget about Makefile.in and configure.in.  My locore
"replacement" follows, just as a teaser for anyone else who may be
interested.  It is the only bit of assembly code, but not the only
x86/Linux-specific code.  The rest is a lot of ptrace(), shared
memory, and syscall numbers.  I am aiming for under 10 context
switches per emulated syscall... too slow for an IRC server, but fast
enough to get my brother to try the Hurd.  Oh, and NCPUS will be > 1
by default thanks to linuxthreads.

-- 
John Tobey <jtobey@john-edwin-tobey.org>
\____^-^
/\  /\
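
For readers who want to see the data the listing below consumes: every op ends in `ret` while ESP points into the pseudo-code, so the array reads as a run of 32-bit words, each op address followed by that op's arguments. The C sketch below shows one way the kernel side might lay out such an array. It is an illustration only, not code from OTOP: `struct pcode`, `put`, `put_syscall`, and the `OP_*` values are hypothetical, and real op addresses would presumably be the load address of `tramp` plus the offsets recorded in `otop_tramp_info`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical placeholder op addresses (assumption, not from the post). */
#define OP_SYSCALL   0x1000u
#define OP_CHECK_RET 0x2000u
#define OP_BREAK     0x3000u

/* A growable view over a caller-supplied buffer of 32-bit words. */
struct pcode {
    uint32_t *words;
    size_t    len;       /* words written so far */
    size_t    cap;       /* capacity in words */
    int       overflow;  /* set if a word did not fit */
};

/* Append one word, or record that the buffer was too small. */
static void put(struct pcode *p, uint32_t w)
{
    assert(p != NULL && p->words != NULL);
    if (p->len < p->cap)
        p->words[p->len++] = w;
    else
        p->overflow = 1;
}

/* Append an op_syscall pseudo-instruction: op address, arg count,
 * syscall number, then the args.  Returns -1 for more than 6 args,
 * which the int $0x80 register convention cannot carry. */
static int put_syscall(struct pcode *p, uint32_t nr,
                       uint32_t nargs, const uint32_t *args)
{
    if (nargs > 6)
        return -1;
    put(p, OP_SYSCALL);
    put(p, nargs);               /* Arg0: number of syscall args */
    put(p, nr);                  /* Arg1: syscall number */
    for (uint32_t i = 0; i < nargs; i++)
        put(p, args[i]);         /* Args 2-7: the syscall args */
    return 0;
}
```

Under these assumptions, a program that performs write(1, buf, 5) (syscall 4 on i386), checks the result, and then breaks to the kernel is eight words long: the op_syscall address, 3, 4, 1, buf, 5, the op_check_ret address, and the op_break address.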


## void tramp(int const* pseudo_code)

# Wait for and service a kernel request to execute a syscall, jump
# to another location, or perform housekeeping.  Repeat ad infinitum.

# These comments use "kernel" to mean the OTOP Mach kernel process
# running on the "host" Linux kernel.

# This is assembly because it must not use the stack.  This code
# executes host system calls, but Hurd user code is prevented from
# doing so by PTRACE_SYSCALL.  Thus, Hurd user code is less
# privileged and yet runs in our address space.  If it can write our
# stack, it may be able to gain privilege.  Therefore, we trust only
# read-only memory controlled by the kernel.  This is a concern only
# in the presence of untrusted, running threads.

# This code assumes that:
#
#     * PSEUDO_CODE points to an array of pseudo-instructions, each
#           beginning with a pointer to an "op" defined in this code,
#           followed by any arguments required by the op.

## Define constants here instead of looking for them in headers.
## I consider it more likely that the headers will cause problems on
## a given system than that the values will change.

## Syscall numbers from <asm/unistd.h>

__NR_rt_sigreturn  = 173

## The ubiquitous page size.

PAGE_SIZE     = 4096

## Offsets in ucontext_t structure.
UC_SS_SP      = 8               # uc_stack.ss_sp
US_SS_SIZE    = 16              # uc_stack.ss_size

        ## Type "@function" fools "objdump -d" into disassembling it.
        .type   tramp,@function
tramp:
        ## C entry point.
        xorl    %ebp, %ebp      # tell debuggers there's no stack
        popl    %eax            # discard return address
        xorl    %eax, %eax      # start in a non-error state
op_goto:
        popl    %esp            # get pseudo-code address
        ret                     # run pseudo-code

        ## op_break executes a software breakpoint that lets the
        ## controlling process examine our syscall return value.
op_break:
        int3                    # breakpoint
op_resume:
        ret                     # resumed, jump to next op

        ## op_syscall executes one host syscall via int $0x80.
        ## Arg0 is the number of syscall args (0-6).  Arg1 is the syscall
        ## number.  Args 2-7, if present, are the syscall args.
op_syscall:
        popl    %ebp            # get number of args
        popl    %eax            # get syscall number
        testl   %ebp, %ebp      # is arg count 0?
        je      syscall_trap    # yes, done
        popl    %ebx            # 1st syscall arg
        decl    %ebp            # is arg count 1?
        je      syscall_trap    # yes, done
        popl    %ecx            # 2nd syscall arg
        decl    %ebp            # is arg count 2?
        je      syscall_trap    # yes, done
        popl    %edx            # 3rd syscall arg
        decl    %ebp            # is arg count 3?
        je      syscall_trap    # yes, done
        popl    %esi            # 4th syscall arg
        decl    %ebp            # is arg count 4?
        je      syscall_trap    # yes, done
        popl    %edi            # 5th syscall arg
        decl    %ebp            # is arg count 5?
        je      syscall_trap    # yes, done
        popl    %ebp            # 6th syscall arg
syscall_trap:
        int     $0x80           # syscall
        ret                     # jump to next op

        ## op_check_ret normally happens after op_syscall when there
        ## is only one expected "successful" return value.
        ## op_check_ret does nothing if the return value indicates
        ## success.  Otherwise, it raises SIGSEGV.
op_check_ret:
        cmpl    $-PAGE_SIZE, %eax       # error result?
        ja      raise_SIGSEGV   # yes, error, yield control
        ret                     # success, jump to next op

raise_SIGSEGV:
        hlt                     # issue a privileged instruction so we "crash"
                                # but maintain register state
        jmp     very_wrong      # the hlt instruction isn't supposed to return

        .align  16
tramp_handler:
        ## Figure out where the base of the signal stack is and store
        ## ESP there.
        movl    12(%esp), %ecx  # third arg is ucontext_t pointer
        movl    UC_SS_SP(%ecx), %eax    # get low address of signal stack
        addl    US_SS_SIZE(%ecx), %eax  # add signal stack size
        movl    %esp, (%eax)    # store ESP there

        ## We need to jump back into Hurd code.  We are granted one
        ## syscall.  (We need it, because user-mode ix86 seems to lack
        ## any way to set the instruction and stack pointers
        ## simultaneously.)  The stack is set up so that a RET
        ## instruction will cause execution of a little piece of code
        ## that calls rt_sigreturn, a special syscall tailored to this
        ## purpose.  But that assumes that no other thread has snuck
        ## in and modified either the return address (jumped to by
        ## RET) or the little trampoline that it points to.  If either
        ## of these things were modified, Hurd code could trick us
        ## into executing any syscall, which could be unsafe.  So we
        ## call rt_sigreturn directly.  rt_sigreturn and sigreturn are
        ## unique among Linux syscalls in that they take their one
        ## argument in ESP.
        popl    %eax            # simulate a "ret" instruction
        movl    $__NR_rt_sigreturn, %eax        # rt_sigreturn
        int     $0x80           # syscall

        ## Something is very wrong if we get here.
very_wrong:
        hlt                     # sigreturn is not supposed to return
        jmp     very_wrong      # refuse to continue

        ## Here begins the op array, which requires alignment.
        .align  4
tramp_end:
        .size   tramp, . - tramp

## tramp_info structure

.section        .rodata
.align  4

.globl otop_tramp_info
        .type   otop_tramp_info, @object
otop_tramp_info:
        .long   tramp
        .long   tramp_handler - tramp
        .long   tramp_end - tramp
        .long   op_goto - tramp
        .long   op_break - tramp
        .long   op_resume - tramp
        .long   op_syscall - tramp
        .long   op_check_ret - tramp
        .long   raise_SIGSEGV - tramp
        .size   otop_tramp_info, . - otop_tramp_info
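
The comparison in op_check_ret relies on the Linux convention that a raw syscall reports failure by returning a negated errno value, in the range -4095 to -1, in EAX. A minimal C sketch of the same unsigned test follows; `is_syscall_error` is a hypothetical name used only for illustration, and the 32-bit type mirrors the i386 register width.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Mirrors "cmpl $-PAGE_SIZE, %eax; ja raise_SIGSEGV": an unsigned
 * "above" comparison against 0xFFFFF000, so only the top 4095
 * values of the 32-bit range count as errors. */
static int is_syscall_error(uint32_t eax)
{
    return eax > (uint32_t)-PAGE_SIZE;
}
```

Because the comparison is unsigned, large valid results such as high addresses returned by mmap are not mistaken for errors, which a signed "less than zero" test would get wrong.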



