bug-guix
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#24450: [PATCHv2] Re: pypi importer outputs strange character series


From: Maxim Cournoyer
Subject: bug#24450: [PATCHv2] Re: pypi importer outputs strange character series in optional dependency case.
Date: Mon, 10 Jun 2019 22:30:45 +0900
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.2 (gnu/linux)

Hello again!

Ricardo Wurmus <address@hidden> writes:

> On to the next:
>
>> From 73e27235cac1275ba7671fd2364325cf5788cb3c Mon Sep 17 00:00:00 2001
>> From: Maxim Cournoyer <address@hidden>
>> Date: Thu, 28 Mar 2019 00:26:02 -0400
>> Subject: [PATCH 5/9] import: pypi: Support more types of archives.
>>
>> This change enables the PyPI importer to look for requirements in a source
>> archive of a different type than "tar.gz" or "tar.bz2".
>
> Okay.
>
>> * guix/import/pypi.scm: (guess-requirements)[tarball-directory]: Rename to...
>> [archive-root-directory]: this. Use COMPRESSED-FILED? to determine if an
>> archive is supported or not.
>
> Nitpick: please use “...this.” and leave two spaces between sentences.

Done.

> Typo: it should be COMPRESSED-FILE?

Fixed.

>> [guess-requirements-from-source]: Adapt to use the new method, and use unzip
>> to extract ZIP archives.
>
> s/method/procedure/

Done.

> Please also mention that “compute-inputs” has been adjusted.

Done.

>> -  (define (tarball-directory url)
>> -    ;; Given the URL of the package's tarball, return the name of the 
>> directory
>> +  (define (archive-root-directory url)
>> +    ;; Given the URL of the package's archive, return the name of the 
>> directory
>>      ;; that will be created upon decompressing it. If the filetype is not
>>      ;; supported, return #f.
>> -    ;; TODO: Support more archive formats.
>> -    (let ((basename (substring url (+ 1 (string-rindex url #\/)))))
>> -      (cond
>> -       ((string-suffix? ".tar.gz" basename)
>> -        (string-drop-right basename 7))
>> -       ((string-suffix? ".tar.bz2" basename)
>> -        (string-drop-right basename 8))
>> -       (else
>> +    (if (compressed-file? url)
>> +        (let ((root-directory (file-sans-extension (basename url))))
>> +          (if (string=? "tar" (file-extension root-directory))
>> +              (file-sans-extension root-directory)
>> +              root-directory))
>>          (begin
>> -          (warning (G_ "Unsupported archive format: \
>> -cannot determine package dependencies"))
>> -          #f)))))
>> +          (warning (G_ "Unsupported archive format (~a): \
>> +cannot determine package dependencies") (file-extension url))
>> +          #f)))
>
> I think the double application of file-sans-extension and the
> intermediate variable name “root-directory” for something that is a file
> is a little confusing, but I don’t have a better proposal (other than to
> replace file-extension and file-sans-extension with a match expression).

Done, w.r.t. using "match":

--8<---------------cut here---------------start------------->8---
@@ -198,10 +198,12 @@ be extracted in a temporary directory."
     ;; that will be created upon decompressing it. If the filetype is not
     ;; supported, return #f.
     (if (compressed-file? url)
-        (let ((root-directory (file-sans-extension (basename url))))
-          (if (string=? "tar" (file-extension root-directory))
-              (file-sans-extension root-directory)
-              root-directory))
+        (match (file-sans-extension (basename url))
+          (root-directory
+           (match (file-extension root-directory)
+             ("tar"
+              (file-sans-extension root-directory))
+             (_ root-directory))))
         (begin
           (warning (G_ "Unsupported archive format (~a): \
 cannot determine package dependencies") (file-extension url))
--8<---------------cut here---------------end--------------->8---


>>    (define (read-wheel-metadata wheel-archive)
>>      ;; Given WHEEL-ARCHIVE, a ZIP Python wheel archive, return the package's
>> @@ -246,16 +243,20 @@ cannot determine package dependencies"))
>
>>    (define (guess-requirements-from-source)
>>      ;; Return the package's requirements by guessing them from the source.
>> -    (let ((dirname (tarball-directory source-url)))
>> +    (let ((dirname (archive-root-directory source-url))
>> +          (extension (file-extension source-url)))
>>        (if (string? dirname)
>>            (call-with-temporary-directory
>>             (lambda (dir)
>>               (let* ((pypi-name (string-take dirname (string-rindex dirname 
>> #\-)))
>>                      (requires.txt (string-append dirname "/" pypi-name
>>                                                   ".egg-info" 
>> "/requires.txt"))
>> -                    (exit-code (parameterize ((current-error-port 
>> (%make-void-port "rw+"))
>> -                                              (current-output-port 
>> (%make-void-port "rw+")))
>> -                                 (system* "tar" "xf" tarball "-C" dir 
>> requires.txt))))
>> +                    (exit-code
>> +                     (parameterize ((current-error-port (%make-void-port 
>> "rw+"))
>> +                                    (current-output-port (%make-void-port 
>> "rw+")))
>> +                       (if (string=? "zip" extension)
>> +                           (system* "unzip" archive "-d" dir requires.txt)
>> +                           (system* "tar" "xf" archive "-C" dir 
>> requires.txt)))))
>
> I guess this is why I’m not too happy with this: we’re checking in
> multiple places if the format is supported but then forget about this
> again until the next time we need to do something to the file.

I don't see much of a problem with the current design since there are
two questions being answered:

1) What should be the directory name of the extracted package (retrieved
   from the base name of the archive).
2) What extractor should be used (zip vs tar).

These two questions are orthogonal, and that the same primitive get used
to answer both is an implementation, or rather, an optimization detail.

> I wonder if we could do better and answer the question just once.

The questions are different :-). We could optimize, but that would be at
the price of expressiveness (squash the two questions into one solving
space).

What do you think?

Maxim





reply via email to

[Prev in Thread] Current Thread [Next in Thread]