Re: finding sloc changes between two tags

From: Paul Sander
Subject: Re: finding sloc changes between two tags
Date: Tue, 19 Feb 2008 00:06:21 -0800

On Feb 18, 2008, at 8:40 PM, yeti wrote:

On Feb 19, 4:38 am, Paul Sander <address@hidden> wrote:
For this particular metric, I usually run the two versions through a
beautifier with standard settings, then diff the output of that.

On Feb 18, 2008, at 10:17 AM, Rick Genter wrote:

From: address@hidden
On Behalf Of Ted Stern

But that regexp handles only C++ comments. I don't know of a way to recognize /* ... [text containing newlines] ... */. Possibly another
diff utility has that option (xxdiff, tkdiff?).

You could write an awk or perl script to filter the multiline comments
out, save the output to a file, then diff those files. I, however,
consider comments to be as important as (or even more important than)
the non-comment lines in source code, and don't understand the use case.

Hi guys,

Thanks for all those answers. I had thought, however, that this would be
a fairly common problem and that there might be a standard solution.
Keeping your suggestions in mind, I ran:

cvs diff -wlcbBC20 -r rev1 -r rev2 my_file.c \
  | perl -0777 -pe 's{/\*.*?\*/}{}gs' \
  | diffstat >> FileToHoldInfo.txt

The idea is to get enough context lines, then eliminate the comments
from the diff output, and finally use diffstat to gather stats. Do you
think this is the correct way?

I think that this method will work only if the comments are completely enclosed within the context displayed by the diff program. It will fail (i.e., produce incorrect output), for example, if a short sentence is added to the end of a 50-line comment. Or to the beginning of one. Or to the middle of a 100-line comment. It also fails if someone arbitrarily inserts or removes newlines in the code itself.
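The first failure case can be reproduced with a small sketch. It is an illustration only: it uses Python's difflib in unified format rather than cvs diff in context format, the 50-line comment and the added sentence are made-up test data, and the helper names (strip_comments, unified, changed_lines) are hypothetical.

```python
import difflib
import re

# Same pattern as the Perl one-liner: non-greedy match of /* ... */ across newlines.
COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)


def strip_comments(text):
    """Remove every /* ... */ span that is fully contained in text."""
    return COMMENT.sub("", text)


def unified(old, new, context=20):
    """Return a unified diff of two strings with the given number of context lines."""
    return "\n".join(
        difflib.unified_diff(
            old.splitlines(), new.splitlines(), "old", "new",
            n=context, lineterm="",
        )
    )


def changed_lines(diff_text):
    """Return the added/removed lines of a unified diff, skipping the file headers."""
    return [
        line for line in diff_text.splitlines()
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


# Made-up input: a 50-line comment, with one sentence added in the middle.
body = [f" * line {i}" for i in range(50)]
old = "\n".join(["/*"] + body + [" */", "int x = 1;"])
new = "\n".join(
    ["/*"] + body[:25] + [" * a short added sentence"] + body[25:]
    + [" */", "int x = 1;"]
)

# Filtering the diff output: neither /* nor */ falls inside the 20 context
# lines, so the regex has nothing to match and the comment edit is counted.
filtered_after_diff = changed_lines(strip_comments(unified(old, new)))

# Filtering each version first, then diffing: the comment edit disappears.
filtered_before_diff = changed_lines(unified(strip_comments(old), strip_comments(new)))

print(len(filtered_after_diff), len(filtered_before_diff))
```

Stripping comments from each revision before running diff avoids the dependence on context size, at the cost of checking out both revisions first.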

This is where beautifiers such as the "indent" program come in. A beautifier normalizes the format of the source code based on the syntax of the programming language and the policies specified on its command line. It leaves comments in place, so additional filtering (like your Perl one-liner above) might be necessary.

After the two versions have been reduced to standard formats, you can apply the diff program with minimal arguments. Its output can be used to count the number of lines inserted, deleted, and changed.
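A rough sketch of that counting step, under stated assumptions: normalize below is only a crude stand-in for a real beautifier (it strips /* */ comments, collapses whitespace, and drops blank lines, but does not re-flow statements the way "indent" would), and both function names are hypothetical. How a replaced block is split into "changed" versus "inserted" or "deleted" lines is one possible convention, not the one any particular diff tool uses.

```python
import difflib
import re


def normalize(source):
    """Crude beautifier stand-in: strip /* */ comments, collapse runs of
    whitespace, and drop blank lines. Returns a list of lines."""
    source = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)
    lines = (" ".join(line.split()) for line in source.splitlines())
    return [line for line in lines if line]


def count_changes(old_source, new_source):
    """Count inserted, deleted, and changed lines between two normalized versions."""
    old, new = normalize(old_source), normalize(new_source)
    counts = {"inserted": 0, "deleted": 0, "changed": 0}
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "insert":
            counts["inserted"] += j2 - j1
        elif tag == "delete":
            counts["deleted"] += i2 - i1
        elif tag == "replace":
            # Pair old and new lines one-to-one as "changed"; any surplus on
            # either side is counted as deleted or inserted.
            paired = min(i2 - i1, j2 - j1)
            counts["changed"] += paired
            counts["deleted"] += (i2 - i1) - paired
            counts["inserted"] += (j2 - j1) - paired
    return counts
```

With a real beautifier, the two normalized files would instead come from running it on each checked-out revision, and only the counting part of this sketch would apply.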