emacs-orgmode
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] ob-shell-test, test-ob-shell and introduction


From: Matt
Subject: Re: [PATCH] ob-shell-test, test-ob-shell and introduction
Date: Sat, 01 Jan 2022 23:32:00 -0500
User-agent: Zoho Mail

Apologies for the book.  I've been sitting on this stuff for two months and am 
wondering how to proceed.

IANAL but AFAIK/CT, my contract contains an exception for making contributions 
to projects like Org. I've gotten confirmation from my manager and by HR.  
However, until the CEO signs the FSF disclaimer, I can't officially contribute. 
 I'm confident that I can publish changes (e.g. to a personal website); the FSF 
just can't accept my changes (yet).

I could start working on ob-shell.el separately now and publish that myself. 
It's not clear how I would then contribute those changes back once I'm cleared 
by the FSF.  I'm inclined towards some refactoring and I'm not sure how I could 
break that down in such a way that if it took two more months to get the 
copyright stuff worked out that anyone could make sense of the changes.  I 
would much prefer to gradually submit patches, discuss them, and then 
eventually have them merged in turn.  If I have a heap of changes elsewhere, I 
feel like it would be harder to have a conversion about them.

Regardless, I think I should define test cases first.  If those are considered 
valid, then any refactoring would be moot if they pass, right?  With (agreed 
upon) test cases, it shouldn't matter if I refactor as long as functionality 
remains the same?

Overall, what advice do you have?

It looks to me like ob-shell.el could use some love in other respects besides 
async evaluation.  I've detailed where below, partly for my own organization 
and partly for posterity, but mainly because this isn't my house, so to speak, 
and I don't want to barge in and start rearranging the furniture and eating 
whatever's in the fridge.  Or, is it like Worg in that once I have the keys I 
can drive where I like, so long as there're no crashes?

I'm interested in people's thoughts on these notes on ob-shell.el:

* Tests
There are several code paths, like shebang, cmdline, and basic execution, which 
don't always have something to do with one another in the code.  Having tests 
would be really helpful to make sure everything jives.  Before doing anything 
with the code base, I intend to create tests for all the functionality.

* 2D-array
I documented two options for the =:var= header[fn:1]. The ob-shell.el code 
actually defines a third option for 2D-arrays.  I couldn't get it to work.  It 
was one of several things not documented anywhere, at least as far as I could 
find, and which I had to figure out straight from the code.  Between not being 
great at shell scripting and having a hard time deciphering that ob-shell.el 
code, I'm not sure 2D-arrays are actually fully or correctly implemented.

* M-up behavior <<M-up>>
The =org-babel-load-session:shell= code path only works when M-up is used on a 
code block[fn:2]. Is M-up actually documented anywhere?  Furthermore, the 
=org-babel-load-session:shell= only works for the "shell" language, which is 
not actually a "proper" shell (i.e. it's not listed in 
=org-babel-shell-names=). The M-up defaults to using $ESHELL or shell-file-name 
through the =shell= command.

For example, try calling M-up on these:

#+comment: (opaquely) calls the system shell
#+begin_src shell :session my-shell
echo "hello, world!"  #+end_src

#+comment: fails
#+begin_src sh :session my-sh
echo "hello, world!"  #+end_src

#+comment: fails
#+begin_src bash :session my-bash
echo "hello, world!"  #+end_src

To fix this, there needs to be an =org-babel-load-session:<lang>= for each 
language in =org-babel-shell-names=.  This would probably make the most sense 
in =org-babel-shell-initialize=.  However, that function 
[[org-babel-shell-initialize][needs attention]].

* Refactoring <<refactoring>>
The ob-shell.el code appears inconsistent to me.  Clearly, this is somewhat 
subjective.  I've tried to give a rationale for each point to make it less so.  
My goal is to be maintainer of ob-shell.el, but that's not forever.  These 
things were an obstacle for me and my aim is to remove them for the next person.

** =org-babel-shell-initialize= <<org-babel-shell-initialize>>
As alluded to elsewhere, =org-babel-shell-initialize= doesn't appear to adhere 
to the (elisp) Coding Conventions,

#+begin_quote
   • Constructs that define a function or variable should be macros, not 
functions, and their names should start with ‘define-’.  The macro should 
receive the name to be defined as the first argument.  That will help various 
tools find the definition automatically.  Avoid constructing the names in the 
macro itself, since that would confuse these tools.  #+end_quote

The =org-babel-shell-initialize= function defines =org-babel-execute:<lang>=, 
=org-babel-variable-assignments:<lang>=, and 
=org-babel-default-header-args:<lang>= for the "languages" in 
=org-babel-shell-names=.  As it stands, that =org-babel-shell-initialize= is a 
function does no harm (aside from being confusing by way of straying from 
convention).  However, if the [[M-up][M-up]] issue is to be resolved, it seems 
to me that =org-babel-shell-initialize= should be updated to match the 
convention (i.e. be a macro).

** Organization
Definitions are introduced in different orders.  For example, the 
=org-babel-shell-initialize= function whose sole purpose is to work with 
=org-babel-shell-names= is defined before the reader knows what 
=org-babel-shell-names= is.  Later, this pattern of defining the component 
pieces after they're used is reversed. For example, 
=org-babel-load-session:shell= relies on =org-babel-prep-session:shell= which 
is defined first.  I find defining terms before they're used makes a document 
more easy to comprehend than otherwise.  I want to reorder the definitions.

Similarly, some functions could use breaking out.  The most important is 
probably =org-babel-sh-evaluate= which handles the various header arguments.  
There are various paths of execution each requiring separate logic, yet all 
live in the one large function.  Breaking these out would allow them to have 
separate docstrings and, I expect, be easier to understand since the logic of 
one would be (lexically) separated from the rest.

Other functionality might be better served by consolidating functions. There's 
a lot of fiddly code dedicated to variable assignment. Actually, much of the 
ob-shell.el file is related to variable assignment. The assignments are done 
using separate functions, yet all apply to the same task. They'll never be used 
for anything else.  Do they need to be split out?  Is there a technical reason? 
I don't see one.  Does it help comprehension?  When split out as they are, I 
found it hard to make sense of the context they're used in since they're 
defined apart from the logic that uses them (i.e. what header uses them, what 
form does the header take, etc.).  I think it's worth seeing if there's a 
better way to present that part of the code base.

** Naming
The following apply to all shells, not just "sh" and should be updated to be 
"shell". This looks like cruft from when ob-shell.el was called ob-sh.el AFAICT.

- =org-babel-sh-evaluate=
- =org-babel-sh-eoe-indicator=
- =org-babel-sh-eoe-output=
- =org-babel--variable-assignments:sh-generic=
- =org-babel-sh-var-to-sh=
- =org-babel-sh-var-to-string=
- =org-babel-sh-initiate-session=
- =org-babel-sh-evaluate=
- =org-babel-sh-strip-weird-long-prompt=

Generally speaking, I find the Org Babel code base tricky to read (especially 
at first).  I spent a good deal of time untangling what lived where and who did 
what. I can play along fine now that I'm familiar. However, since understanding 
took longer than I think was necessary, I want to detail the pain points as 
they have made contributing to Babel harder.

Overall, Babel somewhat breaks the (elisp) Coding Conventions for naming,

#+begin_quote
You should choose a short word to distinguish your program from other Lisp 
programs.  #+end_quote

I understand the variable/function name prefix to be the file name, typically.  
The file name is often the package name, or more precisely the feature provided 
by the file[fn:3]. For Org Babel, there's not a solid file-to-prefix relation.  
We say "Org Babel", but the main functionality is in ob-core and the various 
"ob-" files either extend or implement implied behavior (e.g. 
=org-babel-<lang>-execute=).  Is the "program" ob-core, ob-lang, or the whole 
suite of files?  This is a subjective question which the Org Babel "program" 
answers with, "the whole suite of files".  All components, across all "ob-" 
files, bear the name "org-babel-". This is still something that trips me up: is 
the current symbol core or not?  Who is responsible for what?

I would expect the core API to have its own prefix. The extensions would then 
define their code and have a different prefix, "ob-<lang>-". This way, 
readers/contributors could open the pertinent ob-* file, see the expected 
symbol prefix (e.g. "ob-shell-") and another prefix (e.g. "org-babel-") and be 
able to distinguish which is which.  As it stands, ob-core.el could be renamed 
to org-babel.el or the "org-babel-" prefix could be changed to "ob-core-".

Another possible solution, or a stopgap, would be to have a document detailing 
the Org Babel API[fn:4].

* Process interaction
Emacs has several different ways of interacting with processes. The ob-shell.el 
code uses a few of them. Since async is another way to interact with a process, 
a single process pattern could be used. The goal would be to make each of the 
different functionalities provided by ob-shell.el have a similar 
implementation.  The expectation is that this would benefit maintenance.

* Footnotes

[fn:1] 
https://orgmode.org/worg/org-contrib/babel/languages/ob-doc-shell.html#orgfa6b7c5

[fn:2] M-up is bound to =org-metaup-hook= and
=ob-core:org-babel-load-in-session= by default.

[fn:3] It's not clear to me if there's a technical definition for an
Emacs package.

[fn:4] I may extend my personal notes into a document detailing the
Org API. http://excalamus.com/2021-11-03-org_babel.html




reply via email to

[Prev in Thread] Current Thread [Next in Thread]