bug-wget

[bug #62516] "--convert-links" converts frags unnecessarily


From: anonymous
Subject: [bug #62516] "--convert-links" converts frags unnecessarily
Date: Tue, 24 May 2022 08:31:15 -0400 (EDT)

URL:
  <https://savannah.gnu.org/bugs/?62516>

                 Summary: "--convert-links" converts frags unnecessarily
                 Project: GNU Wget
            Submitted by: None
            Submitted on: Tue 24 May 2022 12:31:13 PM UTC
                Category: Code Architecture
                Severity: 3 - Normal
                Priority: 5 - Normal
                  Status: None
                 Privacy: Public
             Assigned to: None
         Originator Name: IsSkyfalls_
        Originator Email: gnu-bugs-submit@skyfalls.xyz
             Open/Closed: Open
                 Release: trunk
         Discussion Lock: Any
        Operating System: GNU/Linux
         Reproducibility: Every Time
           Fixed Release: None
         Planned Release: None
              Regression: None
           Work Required: None
          Patch Included: No


    _______________________________________________________

Follow-up Comments:


-------------------------------------------------------
Date: Tue 24 May 2022 12:31:13 PM UTC By: Anonymous
Hello!

When using wget with the --convert-links flag, every <a href="#xxx"> gets
turned into <a href="page.html#xxx">. This occurs when the href contains only
the fragment (#xxx). In this situation the href doesn't need to be changed,
since it already points to the same resource. The rewrite also breaks some
really horrible code (which is how I discovered this problem).
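
To illustrate the expected behavior, here is a minimal Python sketch (not Wget's actual C code). It assumes a hypothetical converter function, convert_href, that leaves fragment-only references alone and only rewrites links that name the document explicitly. The function names and the local file name are made up for the example; only urljoin and urldefrag are real standard-library calls.

```python
from urllib.parse import urljoin, urldefrag

# URL of the demo page from this report.
BASE = "https://skyfalls-mage-test.b-cdn.net/wget-hash.html"


def is_fragment_only(href: str) -> bool:
    """True for references such as "#p1" that carry nothing but a fragment."""
    return href.startswith("#")


def convert_href(base: str, href: str, local_name: str) -> str:
    """Hypothetical link converter (an assumption, not Wget's implementation).

    Fragment-only hrefs already point into the current document, so they
    are returned unchanged. Absolute or relative links that resolve to the
    base document are rewritten to the local file name, keeping the fragment.
    Anything else is returned as-is to keep the sketch short.
    """
    if is_fragment_only(href):
        return href
    resolved = urldefrag(urljoin(base, href))
    if resolved.url == urldefrag(base).url:
        suffix = "#" + resolved.fragment if resolved.fragment else ""
        return local_name + suffix
    return href


if __name__ == "__main__":
    for link in ("#p1", BASE + "#p2", "wget-hash.html#p3"):
        print(link, "->", convert_href(BASE, link, "wget-hash.html"))
```

With this approach "#p1" stays "#p1", while an absolute link to the same page still becomes a local reference.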

I hosted a demo page for easy testing:
> wget --convert-links https://skyfalls-mage-test.b-cdn.net/wget-hash.html
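
One way to check the result after running the command is to list the href values in the saved file and see which ones were expanded from a bare fragment. The Python sketch below is only an illustration: the sample markup is invented for the example (the demo page's real content is not reproduced here), and the helper names are hypothetical.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collects the href attribute of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def collect_hrefs(html_text):
    """Return all <a href> values in document order."""
    parser = HrefCollector()
    parser.feed(html_text)
    parser.close()
    return parser.hrefs


def expanded_fragment_links(hrefs, local_name):
    """Return hrefs of the form "<local_name>#frag", i.e. the rewritten ones."""
    prefix = local_name + "#"
    return [h for h in hrefs if h.startswith(prefix)]


if __name__ == "__main__":
    # Invented sample standing in for a converted page.
    sample = '<a href="#p1">one</a> <a href="wget-hash.html#p2">two</a>'
    hrefs = collect_hrefs(sample)
    print(expanded_fragment_links(hrefs, "wget-hash.html"))
```

To inspect a real download, read the saved wget-hash.html from disk and pass its text to collect_hrefs instead of the sample string.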


----
And also the debug output:

DEBUG output created by Wget 1.21.3 on linux-gnu.

Reading HSTS entries from /home/skyfalls/.wget-hsts
URI encoding = ‘UTF-8’
Converted file name 'wget-hash.html' (UTF-8) -> 'wget-hash.html' (UTF-8)
--2022-05-24 19:58:22--  https://skyfalls-mage-test.b-cdn.net/wget-hash.html
Loaded CA certificate '/etc/ssl/certs/ca-certificates.crt'
Certificates loaded: 283
Resolving skyfalls-mage-test.b-cdn.net (skyfalls-mage-test.b-cdn.net)...
138.199.24.218
Caching skyfalls-mage-test.b-cdn.net => 138.199.24.218
Connecting to skyfalls-mage-test.b-cdn.net
(skyfalls-mage-test.b-cdn.net)|138.199.24.218|:443... connected.
Created socket 3.
Releasing 0x0000557931b66e90 (new refcount 1).

---request begin---
GET /wget-hash.html HTTP/1.1
Host: skyfalls-mage-test.b-cdn.net
User-Agent: Wget/1.21.3
Accept: */*
Accept-Encoding: identity
Connection: Keep-Alive

---request end---
HTTP request sent, awaiting response... 
---response begin---
HTTP/1.1 200 OK
Date: Tue, 24 May 2022 11:58:23 GMT
Content-Type: text/html
Content-Length: 591
Connection: keep-alive
Vary: Accept-Encoding
Server: BunnyCDN-SG1-782
CDN-PullZone: 621948
CDN-Uid: 1de2e70a-3826-41df-bbaa-7c302575448b
CDN-RequestCountryCode: CN
Cache-Control: public, max-age=2592000
Last-Modified: Tue, 24 May 2022 11:55:08 GMT
CDN-StorageServer: DE-200
CDN-FileServer: 313
CDN-ProxyVer: 1.02
CDN-RequestPullSuccess: True
CDN-RequestPullCode: 206
CDN-CachedAt: 05/24/2022 11:57:58
CDN-EdgeStorageId: 641
CDN-Status: 200
CDN-RequestId: e03468d12293a75b8500da2a1ac491c5
CDN-Cache: HIT
Accept-Ranges: bytes

---response end---
200 OK
Registered socket 3 for persistent reuse.
Length: 591 [text/html]
Saving to: ‘wget-hash.html’

wget-hash.html                       
100%[=========================================================================>]
    591  --.-KB/s    in 0s      

2022-05-24 19:58:23 (96.7 MB/s) - ‘wget-hash.html’ saved [591/591]

Scanning wget-hash.html (from
https://skyfalls-mage-test.b-cdn.net/wget-hash.html)
Loaded wget-hash.html (size 591).

URI encoding = ‘UTF-8’
wget-hash.html:
merge(‘https://skyfalls-mage-test.b-cdn.net/wget-hash.html’, ‘#p1’) ->
https://skyfalls-mage-test.b-cdn.net/wget-hash.html#p1
appending ‘https://skyfalls-mage-test.b-cdn.net/wget-hash.html’ to
urlpos.
URI encoding = ‘UTF-8’
wget-hash.html:
merge(‘https://skyfalls-mage-test.b-cdn.net/wget-hash.html’, ‘#p2’) ->
https://skyfalls-mage-test.b-cdn.net/wget-hash.html#p2
appending ‘https://skyfalls-mage-test.b-cdn.net/wget-hash.html’ to
urlpos.
URI encoding = ‘UTF-8’
wget-hash.html:
merge(‘https://skyfalls-mage-test.b-cdn.net/wget-hash.html’, ‘#p3’) ->
https://skyfalls-mage-test.b-cdn.net/wget-hash.html#p3
appending ‘https://skyfalls-mage-test.b-cdn.net/wget-hash.html’ to
urlpos.
nofollow in wget-hash.html: 0
URI encoding = ‘UTF-8’
will convert url https://skyfalls-mage-test.b-cdn.net/wget-hash.html to local
wget-hash.html
URI encoding = ‘UTF-8’
will convert url https://skyfalls-mage-test.b-cdn.net/wget-hash.html to local
wget-hash.html
URI encoding = ‘UTF-8’
will convert url https://skyfalls-mage-test.b-cdn.net/wget-hash.html to local
wget-hash.html
Converting links in wget-hash.html... 3.
TO_RELATIVE: https://skyfalls-mage-test.b-cdn.net/wget-hash.html to
wget-hash.html at position 271 in wget-hash.html.
TO_RELATIVE: https://skyfalls-mage-test.b-cdn.net/wget-hash.html to
wget-hash.html at position 291 in wget-hash.html.
TO_RELATIVE: https://skyfalls-mage-test.b-cdn.net/wget-hash.html to
wget-hash.html at position 311 in wget-hash.html.
3-0
Converted links in 1 files in 0 seconds.








    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?62516>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/



