bug-datamash

Re: [Bug-datamash] Feature request: percentiles


From: Assaf Gordon
Subject: Re: [Bug-datamash] Feature request: percentiles
Date: Sun, 12 Mar 2017 21:56:55 -0400

Hello Barry,

Sorry for the delayed response.

> On Mar 6, 2017, at 02:57, Barry Nisly <address@hidden> wrote:
> 
> I just found out about datamash and I want to thank you for creating such a 
> useful tool.

Thank you for your kind words.

> My request is to add percentile in addition to the quartile calculations. 
> 
> I typically deal with latencies and am interested in 90, 95, or 99 
> percentiles. Arbitrary percentiles would be great but, in looking at the 
> code, it doesn’t seem easy to implement. Creating hardcoded percentile 
> calculations (e.g., 90, 95, 99) would be simple (adding the opcodes and 
> connecting them to percentile_value() in src/utils.c).
> 
> Ideally, I could specify an arbitrary percentile, e.g., ‘percentile_93’ and 
> have the parser parse out the percentile and pass it along with the 
> ‘percentile’ opcode.
> 
> I may take a crack at implementing this as time permits and if there is any 
> interest in the feature.

I like this idea very much.

If I may suggest:
There are already two operations that accept a parameter: 'bin' and 'strbin'.
In their case the optional parameter determines the bucket size.
For example, with the default bucket size of 100:
   seq 1 500 | datamash --full bin 1
versus a bucket size of 10:
   seq 1 500 | datamash --full bin:10 1

The parser (in op-parser.c) already takes the value after a ':' and uses it as 
a parameter.
The function op-parser.c:set_op_params() checks if the parameter can be used 
with the requested operation.

I would try to implement a 'percentile' operation exactly in that way (in terms 
of parsing).
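
To make the parsing idea concrete, here is a minimal, standalone sketch of
splitting a specification such as 'percentile:93' into an operation name and a
numeric parameter, then checking that the parameter suits the operation. It is
not the code in op-parser.c: 'struct op_spec', 'parse_op_spec()' and
'percentile_param_ok()' are hypothetical names, and the accepted range of
0..100 is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for a parsed "name[:param]" specification.
   This is an illustration only, not datamash's internal structure. */
struct op_spec
{
  char name[32];   /* operation name, e.g. "percentile" */
  bool has_param;  /* true if a ":value" suffix was given */
  long param;      /* parsed value; meaningful only if has_param */
};

/* Split SPEC at the first ':' and parse the optional integer after it.
   Returns false on an empty or overlong name, or a malformed number. */
static bool
parse_op_spec (const char *spec, struct op_spec *out)
{
  assert (spec != NULL && out != NULL);

  const char *colon = strchr (spec, ':');
  size_t name_len = colon ? (size_t) (colon - spec) : strlen (spec);
  if (name_len == 0 || name_len >= sizeof out->name)
    return false;

  memcpy (out->name, spec, name_len);
  out->name[name_len] = '\0';
  out->has_param = false;
  out->param = 0;

  if (!colon)
    return true;   /* no parameter: the caller applies its own default */

  char *end;
  errno = 0;
  long v = strtol (colon + 1, &end, 10);
  if (errno != 0 || end == colon + 1 || *end != '\0')
    return false;  /* empty, out-of-range, or trailing garbage */

  out->has_param = true;
  out->param = v;
  return true;
}

/* Assumed validity rule for a hypothetical 'percentile' operation:
   a parameter must be present and lie in 0..100. */
static bool
percentile_param_ok (const struct op_spec *op)
{
  return strcmp (op->name, "percentile") == 0
         && op->has_param && op->param >= 0 && op->param <= 100;
}
```

With this sketch, 'percentile:93' yields the name "percentile" and the
parameter 93, while 'percentile:9x' is rejected as malformed. Whether a
missing parameter should be an error or fall back to a default is a design
choice left open here.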

In terms of processing, it should probably be a case very similar to 
OP_QUARTILE_1/3/IQR/MEDIAN in 'fields-ops.c'.
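
As a rough illustration of the processing side, the sketch below sorts the
values and picks the requested percentile by linear interpolation between the
two closest ranks. This is one common definition among several, chosen here
as an assumption; it is not necessarily what percentile_value() in src/utils.c
does. 'percentile_sketch()' and 'cmp_double()' are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* qsort comparator for doubles in ascending order. */
static int
cmp_double (const void *a, const void *b)
{
  double x = *(const double *) a;
  double y = *(const double *) b;
  return (x > y) - (x < y);
}

/* Sort VALUES in place and return the P-th percentile (0 <= P <= 100),
   interpolating linearly between the two nearest ranks.
   Hypothetical helper for illustration only. */
static double
percentile_sketch (double *values, size_t n, double p)
{
  assert (values != NULL && n > 0 && p >= 0.0 && p <= 100.0);

  qsort (values, n, sizeof values[0], cmp_double);

  /* Fractional zero-based rank of the requested percentile. */
  double h = (double) (n - 1) * (p / 100.0);
  size_t lo = (size_t) h;   /* h >= 0, so truncation equals floor */

  if (lo + 1 >= n)
    return values[n - 1];   /* p == 100, or a single-element input */

  return values[lo] + (h - (double) lo) * (values[lo + 1] - values[lo]);
}
```

Under this definition the median is simply the case p = 50 and the first and
third quartiles are p = 25 and p = 75, which is why a parameterized operation
could sit next to the existing quartile cases.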

Please do try your hand at it, and I'm happy to help make it work. Also feel 
free to send partial patches and we'll discuss and improve them.
I apologize in advance if my replies are a bit delayed; things are a bit 
hectic at work at the moment.

regards,
 - assaf







