Re: [Bug-datamash] datamash: selecting columns that include dashes
From: Assaf Gordon
Subject: Re: [Bug-datamash] datamash: selecting columns that include dashes
Date: Tue, 6 Aug 2019 16:51:59 -0600
User-agent: Mutt/1.11.4 (2019-03-13)
Hello,
On Wed, Jul 24, 2019 at 09:00:48AM +0200, Renato Alves wrote:
> I'm trying to use datamash to aggregate and subset TSV data.
> If I use integers for column identifiers everything works as expected.
>
> $ datamash -H sum 2
> sum(A_Chlor_T1h_r1-metaG)
> 22960
>
> However if I try to specify column names datamash gets confused with the dash
> in the column name
>
> $ datamash -H sum A_Chlor_T1h_r1-metaG
> datamash: field range for 'sum' must be numeric
>
> I tried different variations of single ('' or "") and double quoting ("''" or
> '""') as well as escaping ("\-" or '\-') but couldn't find a way to subset
> using the column name.
> Since the order of columns is not maintained on all the files I'm using,
> using position-based indexing is unreliable.
>
> Is there some syntax that allows selecting column names that contain dashes?
> Additionally, is there any kind of support for wildcards when using column
> names? (i.e. something like "A_Chlor_T*")
>
Thank you for this nice idea (and sorry for the delayed reply).
I've added this feature to the git repository, here:
https://git.savannah.gnu.org/cgit/datamash.git/commit/?id=0f01d130b5d2f0157464e20e2e0f4a85b2e8899b
You can now escape the dash in a column name with a backslash.
Any of these shell quoting styles works:

    datamash -H sum A_Chlor_T1h_r1\\-metaG < input.txt
    datamash -H sum 'A_Chlor_T1h_r1\-metaG' < input.txt
    datamash -H sum "A_Chlor_T1h_r1\\-metaG" < input.txt
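The three spellings differ only in how the shell quotes the backslash; in each case the program should receive the same argument, the column name with a single backslash before the dash. A minimal sketch to check this in a POSIX shell, using printf as a stand-in for datamash (the variable names are only for illustration):

```shell
# Capture the argument the shell actually delivers for each quoting style.
unquoted=$(printf '%s' A_Chlor_T1h_r1\\-metaG)    # unquoted: \\ collapses to \
single=$(printf '%s' 'A_Chlor_T1h_r1\-metaG')     # single quotes: \ is literal
double=$(printf '%s' "A_Chlor_T1h_r1\\-metaG")    # double quotes: \\ collapses to \

# Print all three; each line reads A_Chlor_T1h_r1\-metaG
printf '%s\n' "$unquoted" "$single" "$double"
```

This only shows what the shell passes along; interpreting the escaped dash still requires a datamash build that includes the commit above.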
regards,
- assaf