gnuastro-devel
[bug #59904] Large aperture can easily fill memory in sort-based match


From: Mohammad Akhlaghi
Subject: [bug #59904] Large aperture can easily fill memory in sort-based match
Date: Mon, 18 Jan 2021 08:08:13 -0500 (EST)
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:84.0) Gecko/20100101 Firefox/84.0

URL:
  <https://savannah.gnu.org/bugs/?59904>

                 Summary: Large aperture can easily fill memory in sort-based match
                 Project: GNU Astronomy Utilities
            Submitted by: makhlaghi
            Submitted on: Mon 18 Jan 2021 01:08:10 PM UTC
                Category: Match
                Severity: 3 - Normal
              Item Group: Crash
                  Status: Confirmed
                 Privacy: Public
             Assigned to: None
             Open/Closed: Open
         Discussion Lock: Any

    _______________________________________________________

Details:

The sort-based match algorithm currently used in the Match program builds a
linked list of nearby points (within the given aperture) between the two
catalogs to find the best match between them. This is necessary to make sure
that the order of the input catalogs doesn't affect the final result (see the
comments in the 'match_coordinates_second_in_first' function for more).

However, this has a bad side effect: when both catalogs contain many points
(for example, hundreds of thousands or more) and the aperture is large (usually
by mistake!), the lists created for each point in each catalog can easily fill
the whole system RAM, causing Match to crash!

For example, the two commands below download the ID, RA and Dec of the same
(randomly selected) region in Gaia DR2 and eDR3 (each file is 72MB, with more
than 2 million rows):


astquery gaia --dataset=dr2 -csource_id,ra,dec \
         --center=281.6553922,11.4038964 --radius=2 -odr2.fits
astquery gaia --dataset=edr3 -csource_id,ra,dec \
         --center=281.6553922,11.4038964 --radius=2 -oedr3.fits


When we later try to match these two by RA and Dec with the command below, the
RAM consumption will exceed 10GB and cause a crash on many systems!


astmatch dr2.fits edr3.fits --ccol1=ra,dec --ccol2=ra,dec --aperture=1 


To fix this problem, we should avoid keeping the more distant elements in the
list and only keep the top N nearest elements (for example N=10). We just have
to use a structure like Gnuastro's "Ordered list of size_t" structure:
'gal_list_osizet_t'.

Until this problem is fixed, you can avoid it by decreasing the aperture to a
physically meaningful value (in the case of Gaia, something like 0.5 arcsec:
'--aperture=0.5/3600').




    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?59904>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/
