[gomd-devel] Re: Chpox support in gomd

From: Gian Paolo Ghilardi
Subject: [gomd-devel] Re: Chpox support in gomd
Date: Fri, 31 Oct 2003 19:51:24 +0100

Hi Olexander.

First of all, thanks for the quick reply.

> Hello Gian,
> Thank you for your interest in the chpox project and for wishing
> to integrate chpox support into gomd. Good user-level
> support should extend the capabilities of
> checkpoint/restart and make it more convenient to use. I like your
> development plan and wish to make some suggestions.

Your comments are always welcome. Thanks!

> I think that the first feature to be implemented should be support for the
> basic operations (registering, checkpointing, rotation of checkpoint
> files). So Phase#1 should be the most rational. Probably one thread is
> enough because registering and sending checkpoint signals are rather fast.

Yes, you're right. Moreover, I think it's safer to manage a single thread
than several.
As checkpointing is important, gomd must support it in the safest way.

> I also think that it is necessary to provide some kind of checkpoint file
> versioning, so that the user can choose not the last checkpoint but
> some earlier one.

The information for every registered process is stored in a chpoxProc object.
The idea is to provide a counter to track the number of dumps (and the related
dump filenames) and to automatically clean up the oldest dumps.
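
To make the counter idea concrete, here is a minimal sketch of how a per-process dump counter and automatic cleanup could work. It is not the actual chpoxProc class: the class name ChpoxProcSketch, the filename pattern, and the retention limit are all hypothetical, and deleting the expired files from disk is left to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical sketch only: NOT the real gomd chpoxProc class.
// It tracks how many dumps were taken for one registered process and
// which dump filenames are still inside the retention window.
class ChpoxProcSketch {
public:
    // maxDumps is an assumed setting: how many recent dumps to keep.
    ChpoxProcSketch(int pid, std::size_t maxDumps)
        : pid_(pid), maxDumps_(maxDumps), dumpCount_(0) {
        assert(maxDumps > 0);
    }

    // Returns the filename to use for the next dump and records it.
    // Filenames that fall out of the retention window are appended to
    // `expired` so the caller can remove those files from disk.
    std::string nextDump(std::deque<std::string>& expired) {
        ++dumpCount_;
        // Assumed naming scheme: chpox_<pid>_<counter>.dump
        std::string name = "chpox_" + std::to_string(pid_) + "_" +
                           std::to_string(dumpCount_) + ".dump";
        kept_.push_back(name);
        while (kept_.size() > maxDumps_) {
            expired.push_back(kept_.front());  // oldest dump goes first
            kept_.pop_front();
        }
        return name;
    }

    unsigned long dumpCount() const { return dumpCount_; }
    const std::deque<std::string>& keptDumps() const { return kept_; }

private:
    int pid_;
    std::size_t maxDumps_;
    unsigned long dumpCount_;
    std::deque<std::string> kept_;  // oldest at front, newest at back
};
```

Because the older filenames stay in the kept list until they expire, the user could still pick an earlier checkpoint than the last one, which is the versioning you suggested.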

> This should also provide additional backup copies in
> case your last checkpoint becomes broken. At first we thought to
> integrate it with chpox, but then decided that this is not the kernel's
> business and it is better to implement it at user level.
> Another thing I can suggest
> is automatic restart of checkpointed processes after a system
> reboot. Many users should like this feature.

Wow! This is a really nice idea (and maybe I know how to implement that).

> There may also be
> some problems. The first concerns registering. If you need to register a
> single process, it is quite easy. Old chpox versions supported only
> single processes.
> Now chpox has been extended
> to support communicating processes. So it is necessary to specify
> several processes that need to be checkpointed together. Currently this
> can be done by registering the parent pid and setting the flag to
> checkpoint children.

Already supported. The user can define the dump mechanism via a macro in the
constants.h file (CHPOX_REGISTER_MECHANISM). At the moment this macro is
hard-coded, but in the future the user will be able to define the mechanism at
Please refer to

> That is enough for communication via pipes (in
> most cases). But now we are working on support for sockets on a single
> machine. It will be necessary to checkpoint together processes that
> have no common parent.
> So the registering mechanism may change.
> Another problem is that checkpointing/restart may be a large security hole,
> and it is necessary to keep this in mind.

Yes. I'll discuss this critical issue with the other developers.
Security is important.

> If we have new ideas we shall inform you. If you have any questions,
> feel free to contact me.
> Regards
> Olexander Sudakov

Thanks a lot for your help.

Gian Paolo Ghilardi
Gomd Team

PS: Yesterday the first code to support chpox was committed to CVS (it was a
few hours of work, so the code may be ugly). If you have a few minutes to
spare, could you check and comment on the registerProcess() function in the
file below? I'd like to know if the steps are correct. Thanks.
