Re: CVS diff and unknown files.

info-cvs

From:	Paul Sander
Subject:	Re: CVS diff and unknown files.
Date:	Wed, 2 Feb 2005 01:58:31 -0800


On Feb 1, 2005, at 8:16 AM, address@hidden wrote:

Paul Sander <address@hidden> writes:

On Jan 31, 2005, at 9:30 AM, address@hidden wrote:

That's not to say that we will *always* know at add time that the
commit will fail; failures can occur due to problems in their content
which are clearly not ready to check at add time.
Well, if I understand correctly, you intentionally want to have weaker
checks at add-time than at commit-time. Instead, you can do it in the
commit-time trigger by skipping some of the tests if the file in
question is a new one.


Oh, so what you're saying is that rather than making CVS differentiate
between the two sets of triggers, you want the commitinfo script to do
it instead. I suppose that's doable.
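
As a rough illustration of what "the script does it instead" could look like, here is a minimal sketch of one hook serving both events. Everything in it is an assumption made for illustration: the `phase` argument, the function names, and the two sample checks are invented, and stock commitinfo passes no such argument.

```python
# Hypothetical sketch: one hook script serving both add time and commit time.
# The `phase` argument is an assumption; stock commitinfo passes no such flag.

def name_is_lowercase(path, content):
    """Name policy: cheap to check as soon as the file is named."""
    return path == path.lower()

def has_rcs_id(path, content):
    """Content policy: only meaningful once the file is ready for review."""
    return "$Id" in content

def checks_for(phase):
    """Return the checks to run; add time gets a subset of commit time."""
    checks = [name_is_lowercase]
    if phase == "commit":
        checks.append(has_rcs_id)
    return checks

def failed_checks(phase, path, content=""):
    """Names of the checks that fail for this file in this phase."""
    return [c.__name__ for c in checks_for(phase) if not c(path, content)]

print(failed_checks("add", "foo.h"))               # []
print(failed_checks("commit", "foo.h", "int x;"))  # ['has_rcs_id']
```

The conditional inside `checks_for` is the part every existing script would have to grow under this variant.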


Exactly! I took too much time to explain this. Should be either my or
your fault (or both).

But I'll counter with this: Why not combine commitinfo, loginfo, and
taginfo the same way? My argument is that add-time triggers differ
from commitinfo triggers in the same way that commitinfo triggers
differ from those others: They run at different times for different
purposes.


All the triggers you've mentioned have no common tasks to do. On the
other hand, add-time checks you have in mind are a subset of commit-time
checks you have in mind. The subset is easily achieved by bypassing some
of the checks.

Okay, so you understand why it's a bad idea to overload too much functionality. I argue that combining add-time and commit-time triggers is also overloading things too much. There's the established commit-time feature that is being modified so that it now runs twice rather than once. If this were indeed done, then thousands of existing CVS admins are inconvenienced because they must rewrite their existing commitinfo triggers to bypass their current operation at add-time, and both the CVS authors and the administrators must work through a change of interface in an existing feature.

If you examine the implementation, you'd realize that CVS implements its trigger support pretty much in a generic fashion, basically invoking a function call with different parameters depending on the trigger. So there's really no invention going on when adding a new trigger. And it turns out that overloading an existing one is demonstrably harmful.
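
To make the "generic fashion" point concrete, here is a small sketch of a dispatcher that treats every trigger the same way and differs only in which table it reads. This is not CVS's actual code: the in-memory tables, the function names, and the "addinfo" table name are all invented for illustration.

```python
import re

# Hypothetical stand-ins for trigger scripts: return True to accept.
def reject_uppercase(paths):
    return all(p == p.lower() for p in paths)

def accept_all(paths):
    return True

# Hypothetical in-memory stand-ins for administrative files: each table is a
# list of (directory pattern, script) pairs. "addinfo" is an invented name.
TRIGGER_TABLES = {
    "commitinfo": [(r"^src/", accept_all)],
    "addinfo": [(r"^src/", reject_uppercase)],
}

def run_trigger(table_name, directory, paths):
    """Generic dispatcher: same code path, different table per trigger."""
    for pattern, script in TRIGGER_TABLES.get(table_name, []):
        if re.search(pattern, directory) and not script(paths):
            return False
    return True

print(run_trigger("addinfo", "src/lib", ["Foo.h"]))     # False
print(run_trigger("commitinfo", "src/lib", ["Foo.h"]))  # True
```

In a design like this, a new trigger is one more table and one more call site; the existing tables and their scripts are untouched.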

I would have no objection to add-time-script if there were a separate
server operation called "add-file-to-the-working-copy", or
"check-if-file-name-is-ok-for-repository", but AFAIK there are currently
no such operations. I consider inventing new server operation just to
implement some policy that you have in mind and that could be
implemented using existing functionality anyway to be an overkill,
sorry.

If you go back and read Mark's postings, he mentions that my proposal requires no changes to the client/server protocol, so it's really not inventing a new server operation in a very real sense. It changes the behavior of the server, but if at add-time you invoke the option to avoid contacting the server, then why do you care?

You need to get out of the server. The server's not the one we're
trying to make happy here. It's the user that matters, and the tools
must bend to the users, not the other way around.

From an implementation standpoint, there's no difference between the
triggers, other than the conditions in which they fire and the files
in which they're configured. Thus the argument you make in the above
paragraph simply makes no sense.


It's you who needs to get out of the server, not me. You propose server
changes, not me. If the functionality has nothing to do with the server,
then get out of the server indeed.

Yes, you do propose server changes: You propose invoking an existing trigger in a second place. I argue that implementing a second one performs the same function that you propose, but in a cleaner way that preserves existing practice.

If I do in fact understand you to be recommending that commitinfo be
used at add time, then I must disagree. Triggers used at commit time
to check the content of files (like Greg's RCS Id trigger, for
example) are not appropriate to use at add time. Things like checking
that the user has the right to add a file, or checking that the name
of the file complies with policy, are legitimate to check at add time.
They can also be checked at commit time, and the current proposal does
that because the add-time triggers are optional.


So the only thing you need to know in the commitinfo is what user is
trying to do and decide what checks are required based on this
information. If commitinfo currently doesn't have all the required
information, maybe it's better to fix it instead of inventing yet
another hack in the form of add-time triggers?

What you're really proposing is an alternative implementation for add-time
triggers. Think about it.


It's how it looks from your point of view. From my point of view I
propose alternative implementation of the functionality you need without
inventing anything new, being add-time triggers or something else.

What you're recommending is to overload something in a way that is not appropriate.

What you suggest is yet another reincarnation of commit-time triggers
"run at different time" where "time" is defined in terms of operations
on the working copy. Think about it.

I have thought about it. Apparently so has Greg. Something you should understand is that he and I have some history of adversity in this forum. You may have noticed that he and I both agree that overloading commit-time triggers in the way you describe is a bad idea. That really means something.

If users think the wrappers are in their way, they'll drop down to
CVS to add the file, causing more significant breakage later.

Either trust your users or arrange things so that it won't be possible
for them to invoke cvs commands directly. But even if you choose the
latter way, those that do think the wrappers are in their way will
probably find a way to bypass them anyway.


First, I've already related a horror in which users were trusted and
failed. This is why we need safeguards on the circumstances in which
they can perform certain actions.


They will fail no matter what you do then.

Second, if a wrapper invokes CVS, then there's no way to effectively
hide CVS from the users.


You like to bring wrong statements into discussion very much. If you
don't know a way, it doesn't necessarily mean there is none.

Alright, I know of two ways on Unix to effectively isolate users from parts of the filesystem: Closed directory permissions and chroot jails. Giving users limited access to applications involves hiding the entire application behind these walls and opening holes via setid wrappers or writing a new client/server application to wrap the existing application. These solutions are expensive and easy to mess up. Adding a new trigger to CVS is far simpler, it's generally applicable, shops that don't need it need not use it, and in this case clients can opt out.

This is why policy must be enforced below the level of the CVS command
line.


CVS command line tools are no more than wrappers on top of CVS
client/server protocol. They can't enforce anything that the protocol
itself can't enforce. In turn, the protocol can only enforce what
clients can or can't do *to the server*.


That is all true.

You seem to be thinking that in the client/server model the server is a
boss. The reality is closer to the opposite. In fact it is client who is
boss. It commands what server should do, and then server does (or
doesn't) what the boss has told him to do. There is no way for server
to force client to do (or not to do) something.

Nope, it's a negotiated effort. The server can refuse to cooperate as long as the client makes unreasonable requests. That's the basis of policy enforcement: In the end, the server really is in control because it can always say "no".

In the case of the add-time trigger, it's really a request for a sanity check in which the server can inform the client that it notices an approaching problem. It's built into the semantics of my proposed solution that the CVS client can either heed the warning (by refusing to record an addition in the Entries file) or ignore the warning (by opting out of making the request). Commit time is when the server decides whether or not to cooperate, based in part on whether or not the client chose to ignore an earlier warning.

Beyond that, there is absolutely no implication that the server has more control over the user's workspace than what I have described in the past two paragraphs.
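
The semantics described in the past paragraphs can be sketched roughly as follows. This is only a toy model under stated assumptions: `add_file`, `commit_file`, the `local_only` flag, and the two sample checks are invented names, and a plain Python set stands in for the Entries file. None of it is CVS code.

```python
def name_problems(path):
    """Stand-in for the server-side add-time check (name policy only)."""
    return [] if path == path.lower() else [f"{path}: name must be lower-case"]

def content_problems(content):
    """Stand-in for checks that only make sense at commit time."""
    return [] if "$Id" in content else ["missing $Id keyword"]

def add_file(entries, path, local_only=False):
    """Record `path` in `entries` unless the server's sanity check objects.
    With local_only=True the client opts out and never asks the server."""
    problems = [] if local_only else name_problems(path)
    if not problems:
        entries.add(path)
    return problems

def commit_file(entries, path, content):
    """Commit time: add-time checks are redone, then the fuller set runs."""
    if path not in entries:
        return ["not added: run add first"]
    return name_problems(path) + content_problems(content)

entries = set()
print(add_file(entries, "Foo.h"))                   # warning heeded, not recorded
print(add_file(entries, "Foo.h", local_only=True))  # opted out, recorded
print(commit_file(entries, "Foo.h", "/* $Id$ */"))  # server finally says "no"
```

The only place the server refuses outright is `commit_file`; at add time the client decides whether to ask at all.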

It's at that time that they finally learn the value of the process.


If somebody prefers to learn on his own mistakes, -- let him do it
unless it doesn't break others work.


Do you really mean, "let him do it unless it breaks others work"? How
do you prevent them from making mistakes that harm others?


By not allowing him to break the repository, obviously. That's the goal
of server-side policies, -- prevent users from breaking others work. You
just can't prevent them from doing their mistakes in their working copy,
for example, one can run 'rm -rf' in his working copy after two days of
active development.

Okay, consider my horror story in which the user essentially did a "cvs rm *" recursively from the top of the project, followed by a commit. Technically, the repository wasn't broken, and the action was (eventually) reversible. On the other hand, hiding the entire source base from the rest of the team wasn't the right thing to do because it had a real and serious negative effect on the whole project. In our case, it took hundreds of man-hours to recover. TWICE!!! It's this kind of stupid, foreseeable, and expensive mistake that simple automated policies are good at avoiding.

They won't know which mistakes are harmful until they try them.
Yes, they will learn anyway as soon as they try to commit their changes,
i.e., as soon as they try to break others work.

That's fine, but you've forgotten my argument that there are classes of mistakes that can be caught early, while they're still limited to the user's workspace. And if they are in fact caught at that time, they give the user the opportunity to save many hours of work. This is the kind of thing that I'm advocating so strongly in this thread.

That's exactly the point of enforcing policy: It allows the entire
project to learn from the mistakes of others so that when
inexperienced users repeat past mistakes, everybody is protected.


You can't force anybody to learn unless she wishes to learn. You can't
protect user from himself. Either the user wishes to know if she adheres
to policies, in which case she will run whatever commands you tell her
early, or the user doesn't wish to know, in which case you have very
little to do about it.

Actually, you can protect the user from himself in a limited way. (It may be that you can't stop someone from shooting himself in the foot, but you can give him bulletproof shoes, for example. You can also give him an empty gun, but that doesn't help anyone if he's the tribe's hunter.) And you can protect the project from people who don't adhere to policy for whatever reasons.

Tell me something. Do you consider policies to be frivolous exercises by power mongers designed to get in your way for arbitrary reasons, or do you consider policies to be well-considered tools designed to aid the completion of the project more efficiently? What is your experience to back up your belief?

BTW, the definition of "client/server" implies the presence of a
network for communication between the two parts,
Wrong. Client and server can work on the same computer without any
network. Moreover, client/server idiom is heavily used inside
"monolithic" programs as well. Basically, client/server only means that
there is some "server" that makes some operations on behalf of
"clients", how clients communicate with the server is a secondary issue.


While your statement is true, the fact is that client and server
programs typically operate on different machines. The fact that they
do is what gives the client/server paradigm its power. Sure, I can
write client/server programs that use loopback interfaces, Unix domain
sockets, or even named pipes, but they offer no benefit because
function calls are more efficient than RPC calls.


Aren't you aware that client/server paradigm is widely used inside
"monolithic" programs as well? There are always things left to be
learned in this world.


Frankly, I don't care.  And it's not relevant to the topic at hand.


No, it's very relevant to the topic at hand. It's your ignorance of the
true meaning of the client/server model that makes you believe
everything is fine with your proposed design. The client/server model is
mostly about splitting of responsibilities, not about media through
which client and server communicate.

Indeed it is about splitting responsibilities. But in the context of these few paragraphs, the client/server paradigm is but one means to accomplish a goal of dividing and conquering a problem. The same can be done with well-designed APIs in a monolithic application, and in fact there have been times when such APIs were reimplemented, converted from simple function calls to RPC calls. These facts are still irrelevant to the discussion of add-time triggers, because the argument is still relevant to CVS' local mode, which is not a client/server implementation.

Well, you've asked "why would you possibly want to deliberately
proceed toward a dead end" and I tried to provide an example when
the end is not actually dead. The problem is that unless I add the
files, 'cvs diff' doesn't work on them, and if your proposal is
implemented, and I bypass the add-time checks, I have no way to
repeat them later without committing the files.


Okay, so when you say "I need the files to be added to the working
copy anyway" what you really mean is that you need an entry in the CVS
metadata so that you can do other stuff that won't work without it.
Fine, my proposed add-time trigger implementation allows for this.


*How do I repeat skipped checks later with your proposal?* I'm already
tired of asking this same question again and again just to get no answer.


Have you not been reading my replies?

Answer 1: If you don't ignore the error condition (i.e. you care about why "cvs add" fails) then correct it and repeat the attempt to add the file.

Answer 2: If you do ignore the error condition (by telling "cvs add" to act locally), then attempt to commit the file. The add-time checks will be redone, followed by a more comprehensive set of checks.


Now I ask you:  Why is this unreasonable?

Well, this time I've found an answer below, the answer is "run 'cvs -n
commit'", but it doesn't fit into your model. That's what I'm
advocating, -- you don't in fact need anything else but "cvs -n commit"
to let user know ASAP about possible problems.

I've stated on at least two occasions related to this thread that after the work is done, it doesn't really matter how you get there. But the add-time checks influence the journey. If you happen to be in a shop where policies are all voluntary to the point that you must ask if you're in compliance, fine. Some of us, and not just myself, think that policies are compulsory, and therefore the tests should run automatically. Therein lies the reason why "cvs -n commit" alone is not sufficient.

Here's another potential benefit: Most projects have some criteria
that submitted patches must have before they will be accepted. These
criteria might include naming conventions (e.g. all files named in
lower-case). If you intend to send a patch to the project, there's a
greater probability that it will be accepted the first time if you
subject yourself to some of their policies.


Then I need full commit-time checks, not your limited add-time checks.
Better spend your brain forces on improvements of the former than on
invention of the latter.

Full commit-time checks are certainly needed. But as I keep stating, more limited add-time checks are also strongly desirable. The fact that *you* don't need, want, or understand them does not mean that I should not have them. You can always leave them turned off.

This is a complementary view of my proposal: Have a set of triggers
that runs at add and commit times, and another set that runs only at
commit time. It so happens that the second set is already implemented.


There is actually no sense to make them different if you indeed insist
on informing users about problems ASAP.

Invoking commitinfo and passing an argument to distinguish between add
time and commit time, which seems to be what you have in mind, is an
alternative implementation.


Yes that's what I have in mind.

I think mine is better, and it certainly has fewer backward
compatibility issues.


Backward-compatible wrong design is not necessarily better than a sane
one even if it's not backward-compatible, though I think there is still
a way to keep backward compatibility in my case either.

Most of us in this forum seem to think that the existing design is a sane one. I happen to think that the implementation is poor, but that's a different argument. Either way, neither can be changed without significant inconvenience to the existing community. This is why I don't recommend changing commitinfo in any way.

Ah, so the sole purpose of invention of add-time triggers is in fact an
attempt to not break compatibility with the current commit-time
scripts, right? If so, it's better to explicitly state it to avoid
misunderstanding similar to mine.
I thought that was exposed in an earlier message. Sorry for not making it
clear.
No, there was not even such a word as "compatibility" in your earlier
messages.

If I propose a way to implement my variant and make it
backward-compatible, will you agree that mine is better? My
understanding is that no, you won't. So compatibility is not in fact the
main point of our disagreement.


Well, we won't know until you suggest it, will we?  :-)

On the other hand you already know my opinion of overloading functions, which is that doing so is usually bad unless there are *very* compelling reasons to do so. If your recommendation involves overloading commit-time triggers then you may be right. Still, it won't hurt to try.

Seriously, think about what you're saying. You want to do the following,
dramatically oversimplified:

edit foo.h
cvs add foo.h
edit foo.h
cc FOO.C
cvs commit foo.h

It's at this point you want to fail, after you've done all the
work. In my opinion it's better all around if you fail after two
steps, not five.
You've missed my point. I have nothing against step 2 warning me about
the problems, maybe even by default, if I still have the ability to
suppress the warnings. I'm strongly against step 2 refusing to add the
file as I believe it would just disturb instead of help: if step 2
refuses to add the file, it doesn't prevent me from doing steps 3 and 4.
In fact it doesn't even prevent me from invoking 'cvs commit foo.h', but
instead of getting a comprehensive explanation from the server why my
change isn't accepted, I'll get the less useful message "run cvs add
first".
Are you saying that if step 2 fails, you would proceed with steps 3
and 4 and maybe attempt 5 without correcting the failure condition?
I said what I said, your suggested _failure_ to add the file to the
working copy doesn't prevent me from doing the rest of steps. It means
that it's not any better than _warning_. And I even showed why it is
slightly worse.
Well, what do you expect if you ignore the warning in step 2?
With your proposal? I expect your "enforced policy" to somehow prevent me
from continuing my way to the "dead end". You claimed you will be able
to enforce your favorite policy on my working copy. You failed.

Under the condition that you ignore the warning in step 2, I never, ever suggested that I could prevent you from wandering down the dead end. Remember, the whole point behind being able to opt out of performing the add-time check is to empower the user to do things that he may need to redo later.

Sure, you can change your behavior and move the add down to right
before the commit, and that is consistent with the working styles
of many people, but then they get what they ask for.


So even with your proposals implemented you fail to actually impose
policies you are trying to impose with the proposals, -- too bad.

That's another thing I'm trying to explain, -- you have no way to
actually impose policies on the client side. Thinking otherwise is no
more than self-delusion.


Well, that's what I get for pandering to the crowd that thinks that
they simply must be able to add a file while working offline. If an
add-time connection to the server were a requirement (i.e. there were
no feature to opt-out of add-time triggers) then this wouldn't be a
problem. Don't blame me for trying to meet your requirements.

I don't. I blame you for the failure to meet your own requirements. Even
if you make invocation of triggers by "cvs add" an absolute must, the
users will simply not run "cvs add" for days until they need to commit
their changes due to your own expectations of your users behavior.


If they're breaching policy in their file additions, then because the
changes will ultimately be rejected, it's not in their best interest
to delay the discovery of the violations.


If it's not in their interest then warning is more appropriate than
failure to complete operation, I believe. My whole point here is that
failure to add the file doesn't prevent your users from misbehavior any
more than a warning, so why failure?

Because in my experience, users don't heed warnings. Whenever I need to get their attention, I need to hit them over the head with a failure. Case in point: When's the last time you shipped a large project that compiled completely without warnings? How long did it take you to tire of reading them and start to ignore them? At least with the opt-out method, you're forced to at least review the condition and perform some discrete action to ignore it.

Also, from a standpoint of process automation, I want the failure as early as possible because I can either stop early and let a human correct the problem and re-run a smaller amount of lost work, or because I can make decisions based on the failure and adjust the control path of my process.
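
As a sketch of why an early hard failure is convenient for automation, the following runner stops at the first failing step and reports how much work was done before it. The function name and the step names are invented for illustration; the steps loosely mirror the edit/add/compile/commit sequence discussed elsewhere in this thread.

```python
def run_process(steps):
    """Run (name, callable) steps in order; stop at the first failure.
    Returns (names of completed steps, name of failed step or None)."""
    completed = []
    for name, step in steps:
        if not step():
            return completed, name
        completed.append(name)
    return completed, None

# Hypothetical steps: the add fails, so nothing after it is attempted.
steps = [
    ("edit", lambda: True),
    ("add", lambda: False),
    ("compile", lambda: True),
    ("commit", lambda: True),
]
print(run_process(steps))  # (['edit'], 'add')
```

A caller can branch on the name of the failed step, which is the "adjust the control path" case; a warning printed to a log offers nothing comparable to branch on.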

and running commit-time triggers at add-time is not appropriate.
Why? Because it doesn't match the policy you have in mind? Then change
your commit-time triggers so that they don't do some tests at add-time.
No, it's not because the triggers don't match the policy that I have
in mind. It's because there are certain tests that simply are not
appropriate to perform at add time. Scanning the contents of the new
files, for example, is not appropriate.
No, it is appropriate. You said yourself many times that you need to
check everything ASAP. If the file has contents, why don't you check for
its validity?

No, I never said "you need to check everything ASAP". I said "you need to perform appropriate checks ASAP". There's a big difference.

At add-time, chances are very high that the file's contents simply aren't ready for review. That's why I don't check its validity. To assume otherwise would be to require the user to defer running "cvs add" until right before running "cvs commit". While that is the work habit of many users, it's not the general case, and I won't intrude into their style in quite that way.

On the other hand, checking certain attributes of the file, like its name, is reasonable to do at add-time. This is why the checks performed at add time are necessarily a subset of those performed at commit time.
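
One way to picture the subset relationship is to tag each check with whether it needs the file's contents, and derive the add-time set from that. This is a sketch only: the `Check` type, the `needs_content` flag, and the sample checks are invented for illustration and do not correspond to anything in CVS.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Check:
    name: str
    needs_content: bool            # True if the file must be ready for review
    run: Callable[[str, str], bool]

# Hypothetical policy: one attribute check, one content check.
COMMIT_TIME_CHECKS = [
    Check("lowercase_name", False, lambda path, text: path == path.lower()),
    Check("has_rcs_id", True, lambda path, text: "$Id" in text),
]

def add_time_checks(checks):
    """Add time runs only the checks that do not look at the contents."""
    return [c for c in checks if not c.needs_content]

print([c.name for c in add_time_checks(COMMIT_TIME_CHECKS)])  # ['lowercase_name']
```

Because the add-time list is filtered from the commit-time list, it cannot contain a check that commit time would skip.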

Like I said in another post, after the commit has completed, the
effects of the two implementations are identical.

What we're arguing about is what's the right way to make the journey
to that point. My way has merit.

Sorry, but the only real merit I guessed is backwards compatibility with
current commit scripts. Anything else?


Backward compatibility is a good one.


My solution could be made backward-compatible as well.

The scripts will be easier to write if the two sets are kept separate
and invoked separately, also.


No, they will be easier to write only when an administrator will try to
implement a policy similar to those you have in your mind where the
checks are different. As I believe the checks must be the same, I in
fact like it very much that implementing inherently broken policies is
slightly more difficult, -- it will force admins to think twice before
actually doing that.

Hmmmm... Let's see. Suppose for a moment that we implement add-time checks according to my proposal. We now have a set of checks that run both at add-time and at commit-time. Administrators working in your style have a method to do that, too.

Mark also mentioned some good ones.


Sorry, I've somehow missed that. Care to repeat or give a reference?

Look in the info-cvs archives for the following message IDs: address@hidden and address@hidden

You've admitted that you're not familiar with the CVS design.
Please don't argue issues of complexity in a vacuum.


Yes, I'm not familiar with the CVS design, hopefully though it's not
that broken that introducing additional feature is less complex than
tuning an existing one to meet new requirements.


The two implementations are equally complex.


Resulting application complexity is what actually matters in the long
term, not the complexity of turning current implementation to the new
one.

Check the code yourself and consider how you might implement both features. I believe both implementations are equally complex to build and that neither implementation significantly increases the overall complexity of the application. However, I believe that your method increases the complexity of the administration tasks slightly, for the reason that conditional actions within scripts invoked by commitinfo will be more difficult to write. My method increases administrative tasks even more slightly because the administrator must decide which one of two files is the right one in which to register the trigger scripts, but the scripts themselves are simpler.

And that's one of problems with your proposed add-time triggers
solution. The user needs a way to invoke later the checks "cvs add"
didn't do in this case. That's the scenario where user needs separate
"add file to working copy" and "check file for validity against
repository" actions. Every time I ask you how do I repeat the add-time
check, you don't give satisfactory answer.
[...]
I want to know how do I invoke the checks that have been skipped during
off-line work when I go on-line?
cvs [-n] commit
Cool! So that's what "cvs add" in fact didn't do when I worked off-line?
This fits perfectly into my model! That's exactly my point, -- if any,
"cvs add" should perform the same checks "cvs -n commit" does!

Whoa! You're missing something important. The moment a new file has been successfully registered with "cvs add", conditions change. Repeating the same set of add-time checks alone is no longer appropriate. Assuming you don't abandon your work, the next logical step is to commit the work. Therefore, the work becomes subject to the more stringent commit-time checks. The commit-time checks are a superset of the add-time checks because the user is given the option to successfully register a new file while defeating the add-time checks.

Well, suppose you are designer and I'm a user then. I, the user, ask
you, the designer, to explain me why do you think I, the user, will
never need plain and simple "add new file to the working copy" user
operation.
The answer is: You can add anything you want to your workspace, but if
you intend to commit it then you must comply with the design. And I
will tell you at the moment you declare your intent if I think you're
not in compliance.
Well, what is the command to just add the file to my working copy
without intent to commit it? Please don't tell me there is one as then
you agree that "add new file to the working copy" is a useful user
operation. Please don't tell me there is none as I need it indeed.
I have never disputed that "add new file to the working copy" is a
useful operation.
Really?! What then did you mean here when you've answered YESSSSS here:

/begin quote
me> Ah, now I see. I suggest "add new file to the working copy" to be a
me> useful user operation, and you believe it is not? So the minimum
me> semantics of "cvs add" you agree with is something like "add new
me> file to the working copy but only after you make sure the file path
me> is OK with respect to the repository".

you> YESSSSS!!!!
/end quote

Don't you even care to understand questions before answering, or you've
already changed your opinion?

In my usage model, I am in 100% agreement that the following is a useful operation: "add new file to the working copy but only after the add-time triggers completed successfully"

In the statements above, my "make sure the file path is OK with respect to the repository" is a special case of your "add-time triggers completed successfully".

I have also been convinced that some users, other than myself, might have legitimate reason to turn off add-time triggers at the moment that "cvs add" is invoked. That means I can accept that "add new file to the working copy even under failure conditions detected by add-time triggers" is a valid requirement. I can get over it.

--

Paul Sander | "Lets stick to the new mistakes and get rid of the old
address@hidden | ones" -- William Brown
