pspp-dev

Re: Allocating many workspaces.


From: Ben Pfaff
Subject: Re: Allocating many workspaces.
Date: Sat, 17 Mar 2012 15:27:12 -0700
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (gnu/linux)

John Darrington <address@hidden> writes:

> On Sat, Mar 17, 2012 at 12:15:17PM -0700, Ben Pfaff wrote:
>      John Darrington <address@hidden> writes:
>      
>      Ah, yes.  I've been aware of related problems for a long time,
>      but I haven't come up with a good solution.  One must limit the
>      total memory allocated, not the memory allocated per-instance, of
>      course, but the proper way to distribute the available memory
>      among the competing users is not obvious.  I guess that the
>      easiest way is first-come-first-served.  That might be just fine
>      in the common case, so perhaps we should implement it that way as
>      a first cut.
>
> Unless the number of cases per instance is known a priori
> (which in general it isn't), I don't see any better alternative
> to first-come-first-served.  Perhaps decadically decreasing
> allocations might be one way, on the assumption that if there are
> many instances, then hopefully they are small ones.
>
> Is it feasible to have workspaces which dynamically change
> their allocation or is that not possible?

For casereaders, it's easy enough to change the allocation
dynamically, since casereaders are able to dump all of their
in-memory data to disk.
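
To make the idea concrete, here is a rough sketch in Python of a shared,
first-come-first-served budget with one consumer that gives its memory
back by dumping to disk.  This is not PSPP code: MemoryBudget,
SpillableBuffer, and the fixed per-row cost are all hypothetical names
and assumptions chosen for illustration.

```python
import pickle
import tempfile


class MemoryBudget:
    """One global budget shared by every consumer (hypothetical)."""

    def __init__(self, total_bytes):
        self.total_bytes = total_bytes
        self.used_bytes = 0

    def try_reserve(self, n_bytes):
        # First come, first served: grant the request only if it still fits.
        if self.used_bytes + n_bytes > self.total_bytes:
            return False
        self.used_bytes += n_bytes
        return True

    def release(self, n_bytes):
        self.used_bytes -= n_bytes


class SpillableBuffer:
    """Keeps rows in memory while the budget allows, then moves to disk."""

    def __init__(self, budget, row_bytes):
        self.budget = budget
        self.row_bytes = row_bytes  # assumption: fixed cost per row
        self.rows = []
        self.spill_file = None

    def append(self, row):
        if self.spill_file is None and self.budget.try_reserve(self.row_bytes):
            self.rows.append(row)
            return
        if self.spill_file is None:
            self._spill()
        pickle.dump(row, self.spill_file)

    def _spill(self):
        # Dump everything held in memory and return it to the shared budget.
        self.spill_file = tempfile.TemporaryFile()
        for row in self.rows:
            pickle.dump(row, self.spill_file)
        self.budget.release(len(self.rows) * self.row_bytes)
        self.rows = []

    def __iter__(self):
        if self.spill_file is None:
            yield from self.rows
            return
        self.spill_file.seek(0)
        while True:
            try:
                yield pickle.load(self.spill_file)
            except EOFError:
                break
        self.spill_file.seek(0, 2)  # reposition at the end for later appends
```

A consumer with no on-disk fallback would only have try_reserve() and
would have to fail when it returns False, which is the case that
worries me for categoricals.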

>      For categoricals, though, what's the fallback if the memory usage
>      becomes too high?  Can we fall back to some kind of on-disk
>      storage, or do we just fail?  "Just fail" is probably not a good
>      way to go, if first-come-first-served is the strategy we use,
>      because it means that unrelated memory use (e.g. for cases) can
>      cause even a small number of categories to break.
>
> Maybe we should do the "just fail" option in the first instance and see
> if we can improve it later.

OK.

>      Here's another idea that comes to mind: is there a maximum number
>      of categories that makes sense?  Would a "max categories" setting
>      defaulting to, say, 1000, still allow most users to get real work
>      done in realistic cases?
>
> 1000 would be much too high.  How many machines can allocate 64GB of heap?
> "Realistic cases" is somewhat subjective.  I cannot envisage more than
> 20 categories being involved in most instances, but who knows?

I mean, 1000 categories per instance, not 1000 instances.
Presumably, 1000 categories do not need much memory (a few
kilobytes?) unless the space for categories is, say, O(n**2) in
the number of categories (I haven't looked).
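
A back-of-the-envelope sketch of the two cases, in Python.  The figure
of 8 bytes per entry (one double) is an assumption for illustration,
not a measurement of the actual categoricals code.

```python
BYTES_PER_ENTRY = 8  # assumption: one double per stored entry


def linear_bytes(n_categories):
    """Space if storage grows as O(n) in the number of categories."""
    return n_categories * BYTES_PER_ENTRY


def quadratic_bytes(n_categories):
    """Space if storage grows as O(n**2) in the number of categories."""
    return n_categories * n_categories * BYTES_PER_ENTRY


for n in (20, 1000):
    print(n, linear_bytes(n), quadratic_bytes(n))
```

Under that assumption, 1000 categories cost 8,000 bytes in the linear
case and 8,000,000 bytes in the quadratic case, per instance.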
-- 
Ben Pfaff 
http://benpfaff.org


