[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Octave-bug-tracker] [bug #59706] Avoid "canonicalize_file_name" on Wind
From: |
Rik |
Subject: |
[Octave-bug-tracker] [bug #59706] Avoid "canonicalize_file_name" on Windows |
Date: |
Mon, 21 Dec 2020 12:22:17 -0500 (EST) |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36 |
Follow-up Comment #15, bug #59706 (project octave):
In terms of performance, and possibly clarity, would it be better to use
push_back() rather than += operator for strings in this context:
if ((full_file_name.length () == 2 && full_file_name[1] == ':')
|| pat.is_match (full_file_name))
full_file_name += '\\';
It is true that the '+' operator has an overload for the prototype operator +
(string, char) so maybe there is no performance difference.
Also, performing a regular expression search is always going to be slow, and
most of the time it will be unnecessary. Maybe add a quick test to check
whether the first two characters are `\` which is a requirement for a UNC name
before looking for the full pattern. For example
if (full_file_name.length () > 4 && full_file_name[0] == '``' &&
full_file_name[1] == '``' && pat.is_match (full_file_name))
full_file_name += '\\';
There is also a FIXME note in the code
// FIXME: Does this pattern match all possible UNC roots?
I don't think it does capture everything. For one thing, it appears that the
text between backslashes is not optional. There must be at least one
character which means that '*' is not appropriate and '+' should be used.
Also, the servername is allowed to be specified as an IP address which means a
form like "10.1.1.1" is allowed. The '\w' character lass does not include the
period '.' character so that should be added. I am using documentation for
the UNC format from
https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-dtyp/62e862f4-2a51-452e-8eeb-dc4ff5ee33cc.
The definition of the UNC format is
UNC = "\\" host-name "\" share-name [ "\" object-name ]
host-name = "[" IPv6address "]" / IPv4address / reg-name
; IPv6address, IPv4address, and reg-name as specified in [RFC3986]
share-name = 1*80pchar
pchar = %x20-21 / %x23-29 / %x2D-2E / %x30-39 / %x40-5A /
%x5E-7B / %x7D-FF
object-name = *path-name [ "\" file-name ]
path-name = 1*255pchar
file-name = 1*255fchar [ ":" stream-name [ ":" stream-type ] ]
fchar = %x20-21 / %x23-29 / %x2B-2E / %x30-39 / %x3B / %x3D /
%x40-5B / %x5D-7B / %x7D-FF
stream-name = *schar
schar = %x01-2E / %x30-39 / %x3B-5B /%x5D-FF
stream-type = 1*schar
So, a modest improvement to the regexp would be
regexp pat (R"(^\\\\[\w.-]+\\[\w\$-]+$)")
This would catch IP4 addresses, but still not IP6 addresses. It would also
catch common share names such as "admin$".
_______________________________________________________
Reply to this item at:
<https://savannah.gnu.org/bugs/?59706>
_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/
- [Octave-bug-tracker] [bug #59706] Add function that normalizes a file path, (continued)
- [Octave-bug-tracker] [bug #59706] Add function that normalizes a file path, anonymous, 2020/12/18
- [Octave-bug-tracker] [bug #59706] Add function that normalizes a file path, Rik, 2020/12/20
- [Octave-bug-tracker] [bug #59706] Avoid "canonicalize_file_name" on Windows, Markus Mützel, 2020/12/21
- [Octave-bug-tracker] [bug #59706] Avoid "canonicalize_file_name" on Windows, Markus Mützel, 2020/12/21
- [Octave-bug-tracker] [bug #59706] Avoid "canonicalize_file_name" on Windows, Markus Mützel, 2020/12/21
- [Octave-bug-tracker] [bug #59706] Avoid "canonicalize_file_name" on Windows, anonymous, 2020/12/21
- [Octave-bug-tracker] [bug #59706] Avoid "canonicalize_file_name" on Windows, Markus Mützel, 2020/12/21
- [Octave-bug-tracker] [bug #59706] Avoid "canonicalize_file_name" on Windows, anonymous, 2020/12/21
- [Octave-bug-tracker] [bug #59706] Avoid "canonicalize_file_name" on Windows, anonymous, 2020/12/21
- [Octave-bug-tracker] [bug #59706] Avoid "canonicalize_file_name" on Windows, Rik, 2020/12/21
- [Octave-bug-tracker] [bug #59706] Avoid "canonicalize_file_name" on Windows,
Rik <=
- [Octave-bug-tracker] [bug #59706] Avoid "canonicalize_file_name" on Windows, Markus Mützel, 2020/12/21
- [Octave-bug-tracker] [bug #59706] Avoid "canonicalize_file_name" on Windows, Markus Mützel, 2020/12/21
- [Octave-bug-tracker] [bug #59706] Avoid "canonicalize_file_name" on Windows, Markus Mützel, 2020/12/21
- [Octave-bug-tracker] [bug #59706] Avoid "canonicalize_file_name" on Windows, Rik, 2020/12/21
- [Octave-bug-tracker] [bug #59706] Avoid "canonicalize_file_name" on Windows, John W. Eaton, 2020/12/21
- [Octave-bug-tracker] [bug #59706] Avoid "canonicalize_file_name" on Windows, Rik, 2020/12/21
- [Octave-bug-tracker] [bug #59706] Avoid "canonicalize_file_name" on Windows, Markus Mützel, 2020/12/22
- [Octave-bug-tracker] [bug #59706] Avoid "canonicalize_file_name" on Windows, Markus Mützel, 2020/12/22
- [Octave-bug-tracker] [bug #59706] Avoid "canonicalize_file_name" on Windows, Markus Mützel, 2020/12/22
- [Octave-bug-tracker] [bug #59706] Avoid "canonicalize_file_name" on Windows, Rik, 2020/12/22