bug-bash

Re: Parallelism a la make -j <n> / GNU parallel


From: Linda Walsh
Subject: Re: Parallelism a la make -j <n> / GNU parallel
Date: Sat, 12 May 2012 08:35:35 -0700
User-agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.24) Gecko/20100228 Lightning/0.9 Thunderbird/2.0.0.24 Mnenhy/0.7.6.666

Ole Tange wrote:
> On Sat, May 12, 2012 at 9:34 AM, Linda Walsh <bash@tlinx.org> wrote:
>> Ole Tange wrote:
>>> Can you explain how that idea would differ from sem (Part of GNU
>>> Parallel)?
>> Because gnu parallel is written in perl?  And well, writing it in
>> perl.... that's near easy... did that about ... 8 years ago? in perl...
>> to encode albums in FLAC or LAME -- about 35-45 seconds/album... on my old
>> machine.  But perl broke the script, multiple times .. (upgrades in perl)...
>
> I have been the maintainer of GNU Parallel for the past 11 years. It
> has cost years of work to get it to work on all platforms for all
> versions for all corner cases. It has never broken because of a perl
> upgrade. So I am quite baffled when you say it is near easy.
>
> Maybe you really mean that it is easy to get it to work for some
> platforms for some versions for some corner cases? I will agree to
> that, but I would never characterize that as production quality code.
>
>> So am rewriting it...
>>
>> Doing it in shell... that would be a 'new' challenge... ;-)
>
> I fully understand the thrill in doing something again that has
----
        I think you missed the 'wink'... and the following comment:

    "And people called me masochistic for trying to write complex
   progs in shell ... "

Notice the original adjective before the word 'easy'.  It's vital to the meaning -- a classic example of the relative definition of "near".

Example: Travel to the nearest solar system outside ours is near easy compared to travel to the next galaxy... *ahem*...*cough* *cough*.


Sorry, but I'm sure it WASN'T easy to get it to work natively on all platforms, especially if Win32 or Strawberry Perl is included in that "all".

        Even getting it to work correctly on one platform is less than trivial --
mine runs jobs based on the number of CPUs, and a simple check of /proc/cpuinfo
won't work so well on non-*nix-compatible systems... so there was no attempt at
comparison, and mine isn't even working right now (it won't start, in perl --
mostly due to usage of 'use Exporter'/EXPORTS and multiple packages in the same
file, and the semantics in those areas changing quite a bit).
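Counting CPUs portably is part of what makes even the one-platform case non-trivial. A hypothetical helper (not from the original post) might try several probes before falling back to /proc/cpuinfo:

```shell
#!/bin/sh
# Sketch of a portable CPU-count probe.  The function name and fallback
# order are illustrative assumptions, not anyone's actual implementation.
cpu_count() {
    if command -v nproc >/dev/null 2>&1; then
        nproc                                  # GNU coreutils
    elif getconf _NPROCESSORS_ONLN >/dev/null 2>&1; then
        getconf _NPROCESSORS_ONLN              # POSIX-ish systems
    elif [ -r /proc/cpuinfo ]; then
        grep -c '^processor' /proc/cpuinfo     # Linux fallback
    else
        echo 1                                 # give up: assume one CPU
    fi
}

cpu_count
```

Each probe degrades gracefully: a missing `nproc` or an unsupported `getconf` variable just falls through to the next branch.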

If you did it all in one package, OR did it in straight OO (no Export/Imports), you would likely have avoided most of the problems I ran into.

        As was suggested by others -- I also found a need to use hashes to
map exit-status values from each job back to their original caller, to identify
which job failed (if one failed).
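In bash that mapping can be done with associative arrays (bash 4+): one hash from PID to the command it ran, and one from command to its exit status. A minimal sketch, with placeholder commands:

```shell
#!/usr/bin/env bash
# Sketch: map each child's exit status back to the job that produced it.
# The three commands are stand-ins for real jobs.
declare -A job_of status_of

for cmd in 'true' 'false' 'sleep 0'; do
    $cmd &
    job_of[$!]=$cmd          # remember which command this PID is running
done

for pid in "${!job_of[@]}"; do
    wait "$pid"              # wait <pid> returns that child's exit status
    status_of[${job_of[$pid]}]=$?
done

# Report any job that failed.
for cmd in "${!status_of[@]}"; do
    if [ "${status_of[$cmd]}" -ne 0 ]; then
        echo "FAILED: $cmd (exit ${status_of[$cmd]})"
    fi
done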

        Also one would want to trap Control-C, so one can terminate outstanding
jobs.
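A trap along these lines (a sketch under the assumption that child PIDs are collected in an array; names are illustrative) forwards the interrupt to any still-running children before exiting:

```shell
#!/usr/bin/env bash
# Sketch: trap Control-C (SIGINT) so outstanding background jobs are
# terminated rather than left running.  The sleeps stand in for real jobs.
pids=()

cleanup() {
    for pid in "${pids[@]}"; do
        kill "$pid" 2>/dev/null   # forward termination to each child
    done
    exit 130                      # conventional exit status after SIGINT
}
trap cleanup INT

sleep 1 & pids+=($!)
sleep 1 & pids+=($!)

wait   # with no Control-C, this just collects the children normally
```

Without the trap, an interrupted parent can leave its background children orphaned and still running.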

I had no need for a semaphore -- and was able to use sleep without fear of waking up multiple jobs, as I only scheduled as many jobs as desired at one time.  I.e., say I'm running 20 convert jobs, but my CPU count is 2 and my extra loading factor is 2, allowing 4 jobs to run at once.  The first four run immediately, but the next 16 don't get scheduled for execution until one of the first ones exits; then they are scheduled one by one as running processes finish.

So for me, my 'semaphore' count was 'k1+k2', where k1 was based on the number of CPUs in the system and k2 was specific to how much overlap the particular machine I was running on needed to keep the CPUs at 100%.  Once I decide on those, I just have to do simple accounting of the number of launched procs and the number of child deaths to maintain a running score.  So my limiting semaphore was based on outstanding children -- if I wanted something not based on outstanding children, I might have to use a real semaphore...  But my algorithm didn't require an arbitrary number,
so no need for that added complexity.
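The k1+k2 counting scheme can be sketched in a few lines of bash (4.3+ for `wait -n`); the values and the sleep "work" are illustrative, not from the original script:

```shell
#!/usr/bin/env bash
# Sketch of the counting scheme above: cap concurrency at k1+k2 and
# launch the next job only when one of the running ones has exited.
k1=2                 # pretend CPU count
k2=2                 # extra loading factor for this machine
max=$((k1 + k2))
running=0

for job in 1 2 3 4 5 6 7 8; do
    ( sleep 0.2 ) &                  # stand-in for a real convert job
    running=$((running + 1))
    if [ "$running" -ge "$max" ]; then
        wait -n                      # bash 4.3+: block until any one child exits
        running=$((running - 1))
    fi
done
wait   # collect the stragglers
echo "all jobs done, concurrency capped at $max"
```

No semaphore object is needed: the `running` counter plus `wait -n` does the accounting, exactly because the limit is defined in terms of outstanding children.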



> already been done - especially if it can be done better (Case in
> point: GNU Parallel vs. xargs). I also understand the concept of doing
> something that has already been done - just to see if you can do it
> yourself (e.g. I wrote a quine just to see if I could).
>
> What I do not understand is wanting help to do something that has
> already been done better.

----
        Huh?

        I think people were discussing ways it might be done in shell, not
asking for help with any specific implementation -- or some were asking that
something like parallel be a built-in (my suggestion of a ".dll" (or .so)).

        The challenge would be writing something as complex as 'parallel' in
shell... which is being discussed, in more primitive forms -- beyond which, I think it is left as an exercise for the reader...

And of course, seeing yours was written in perl, I couldn't help but muse a bit on having a "perl builtin" keyword, as most installations having perl already have it built as a dynamic lib. However, not knowing how or why it would be beneficial/useful to do that off the top of my head, I left that idea with "..."...

Does that make my idiosyncratic and culturally-"contexted"-obfuscating statement(s) more clear? ;-)


Linda



