Re: [PATCH] Accept more :tangle-mode specification forms

emacs-orgmode
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH] Accept more :tangle-mode specification forms

From:	Tim Cross
Subject:	Re: [PATCH] Accept more :tangle-mode specification forms
Date:	Mon, 22 Nov 2021 00:51:19 +1100
User-agent:	mu4e 1.7.5; emacs 28.0.60
Timothy <tecosaur@gmail.com> writes:

> Hi Tim,
>
>>> The parsing of “555” to the integer 555 is done by
>>> org-babel-parse-header-arguments, and so can’t really be changed.
>>
>> I don't understand this. Why can't it be changed?
>
> Well, it can't be changed without changing
> org-babel-parse-header-arguments, which is quite a major change and I
> suspect may be a breaking change.
>
> Maybe it's fine? I just don't feel confident about this.
>
>> When I said disable base 10 values, what I meant was generate an error
>> if the value looks like it is base 10 i.e. does not have a leading o or
>> #o. This makes it clear it has to be specified as an octal value and
>> will alert users to this fact.
>
> This is all well and good, but the point I'm making is that since due to
> the current behaviour of org-babel-parse-header-arguments we can't
> distinguish between :tangle-mode (identity #o555) and :tangle-mode 365
> putting this idea /into practice/ doesn't look like it will be easy.
>

I don't see the complication here. When you call
org-babel-parse-header-arguments, it returns an alist of argument name
and value. If you had a header like :tangle-mode 555, you get the value
(:tangle-mode . 555), if you have :tangle-mode o555, you get
(:tangle-mode . "o555"). My suggestion is simply to look at the value
and if it is not a string, reject it outright. If it is a string, then
match that string to the allowed formats (i.e. an octal value if it
starts with "#0o" or "o", a symbolic value if it matches a regexp for
symbolic representation and even possiblly your ls -l format if it
matches that. This is not a change in org-babel-parse-arguments, but
rather a change in how we interpret those arguments. 


> It's also worth noting that with regard to the recent changes that have
> been made, this is not a new issue.

Which I don't think is a valid argument for the status quo. Allowing
base 10 mode specifications is just a bad idea. The fact it has been
permitted in the past is likely just an oversight rather than
intentional behaviour. Given the potential security issues involved
here, I think we can legitimately obsolete such features (using
acceptable change management approaches with suitable notification for
users etc).


>
>>> I think the ls-style is quite valuable for two reasons. It’s
>>> well-established thanks to ls, and is particularly good for
>>> understanding permissions at a glance.
>>
>> I would agree it is an established way to display the permissions
>> associated with a filesystem object, but disagree it is a well
>> established way to set such values - I know of no other tool which uses
>> this format.
>
> The driving motivation here is that while the tools which you mention do
> not use this for setting permissions (e.g. chmod), they are only used
> for /setting/ permissions. Since Org files aren't write-only, and
> occasionally we go back and look at what we've written :P I think
> allowing a format that is optimised for ease-of-reading instead of
> ease-of-writing makes some sense. It is in this respect that I think
> ls -l style is a good idea.
>

I understand your motivation. I dispute your claim it is an established
and common convention. 

>> It is also not the same as the ls -l display (no object
>> type indicator). The ls -l also displays sticky bit, uid and gid. Does
>> your format also support setting these values (something which can be
>> done with the octal or symbolic formats) i.e. support s, S, t and T for
>> the 'executable' bit for user/group?
>
> Ah, I'm afraid that I'm not that up-together on sticky bits. I suspect
> that it's not just the ls -l style that will need tweaking. I'll read up
> on this in the next few days and update the ML.
>

This is very difficult Timothy. I really do appreciate all the work you
have done here and your patience, which makes what I'm going to say all
the more difficult. My big concern here is that you don't understand
this area with sufficient depth of understanding to really appreciate
some of the subtleties involved here. This makes me concerned that any
implementation of a new way to specify mode settings may have some
unexpected side effects or hidden issues which could be exploited in
unexpected ways (worse case) or simply have unexpected bugs which result
in mode settings which are not what the user intended. As someone who
has used Unix since 1980 and Linux since 1995, I've experienced my share
of issues relating to file modes and their interaction with different
filesystems, NFS, different platforms, character sets, locales etc.
Implementing a new approach as you are proposing is something I would be
extremely wary about.

If you do plan to go forward with this approach, please ensure you are
very confident in your understanding of the sticky bit, setuid/setgid
and associated interaction with execute etc. 

I suspect you do have the skills to implement correctly given sufficient
time, testing and motivation. However, I am concerned your initial stab
at this may be somewhat naive and requires more consideration. My main
concern is that once you take all the subtleties into consideration, the
complexity of your ls -l approach will exceed its utility and
maintainability. I could also be completely wrong.  

>> Personally, I prefer the symbolic form as it is shorter and clear. I
>> find the ls -l form too easy to get wrong (especially with getting the
>> number of '-' correct).
>
> Isn't choice a great thing? :D In seriousness, this is exactly why I
> think it's worth providing these options. At least thinking back to when
> I started using Linux a few years ago I would have said the same thing
> about the octal form, and was completely unfamiliar with chmod. I expect
> different people to have different preferences with these three options,
> but for one of them to be something most people are happy with.
>

Choice is good. However, as I mentioned ina previous post, the problem
with choice is complexity and the biggest source of security problems is
complexity. Keep in mind that what you do now is something which will
need to be maintained by others in the future. It needs to be clear,
concise and as simple as possible (but of course, no simpler!). 


>>> Tim suggested that invalid forms should cause tangling to fail, but I feel 
>>> this
>>> may be a bit much. Personally I’m inclined to either
>>> • warn, and don’t touch the file permissions (this is what currently occurs)
>>> • use a very conservative file permission (e.g. rw——-).
>>
>> I'm unsure on this. My concern is people may not notice the warning and
>> then be surprised later. Given the potential security issues, a later
>> surprise is something to be avoided even if it is inconvenient. With
>> respect to the default action to take, I would suggest we also need to
>> look at the default umask setting for the user and if that is more
>> restrictive, use that rather than some value like rw------- For all we
>> know, the user has set a specific umask for a valid reason and will be
>> surprised if emacs just ignores that to do what it wants (another
>> surprise to be avoided).
>
>> The user is not required to specify a mode. However, if they do and if
>> we cannot interpret what they have specified without ambiguity, we
>> should throw an error and cease processing. Making a guess as to what
>> they intended in this situation is IMO a big mistake.
>
> I don't see how using the default permissions is a potential security
> risk? Could it surprise the user, yes, but that seems unavoidable when
> the user is doing something invalid.

Consider this simple scenario. The tangled output is being written to a
directory which requires specific permissions for security reasons -
perhaps it is some type of ftp server where permissions will determine
what can be seen by whom. The data might contain sensitive information.
The user specifies what they thing is the correct mode specification,
but the7y get it wrong. Org comes along and looks at what they specified
and decides it cannot interpret what the user has specified, so applies
a default (and potentially completely wrong) mode, exposing sensitive
data to others who should not have had access.

Use a default if the user does not specify a mode, but if they specify a
mode and org cannot interpret what they specified, then do nothing -
don't generate the tangled output at all. This is the only safe
approach. 

>
>> My only preference for "#o" over just "o" prefix is to clearly signal to
>> the user that it is an octal value and avoid the situation where you
>> might glance a value and see the leading 'o' as a '0', missing the point
>> it is an octal value not a decimal one. However, this seems like a low
>> risk, so I'm happy either way.
>
> Well, I've got "o" for now but that's not set in stone. With a lower
> case "o" I feel the risk for confusion with "0" is quite low (besides
> which a "0" prefix seems to be the C-style for octal anyway).
>

I am not overly concerned about this one. I merely mention the use of
"#0" as a possible approach and why. If the preference is just to have
'o', that is fine with me. My main point is that when it comes to
parsing the value, having "#o" as a prefix for octal values is possibly
slightly easier wrt regexp specification than just 'o' as 'o' could also
be the first character for a symbolic mode specification as in o+w.
[Prev in Thread]
Current Thread
[Next in Thread]
Re: [PATCH] Accept more :tangle-mode specification forms, (continued)
Prev by Date: Re: A mobile clocking solution?
Next by Date: Re: [PATCH] Accept more :tangle-mode specification forms
Previous by thread: Re: [PATCH] Accept more :tangle-mode specification forms
Next by thread: Re: [PATCH] Accept more :tangle-mode specification forms
Index(es):
- Date
- Thread