bug-datamash
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

quoted fields and decimal separators


From: code
Subject: quoted fields and decimal separators
Date: Mon, 03 Jan 2022 08:49:07 +0100
User-agent: Roundcube Webmail/1.2.3

Hello,

Thanks for datamash, it totally made my day. I am trying to proof to my wife that I can do on the commandline what she can do on a spradsheet and datamash has been of great help.

I have run into difficulties when analyzing some bank statements unfortunately.

1. They are automatically generated by the bank and therefore all fields are quoted
   (even fields that contain pure numbers).
2. Also (since it is a german bank) it uses "," for a decimal separator instead of ".". This is not a german problem though, many countries use commas instead of dots (see
   https://en.wikipedia.org/wiki/Decimal_separator#Usage_worldwide )

At the end of this mail you will find some example csvs to illustrate the point.

I was able to solve my problems with some sed/tr magic (I just stripped away the quoting and replaced all ","-characters.) But it was neither easy, fast, elegant nor stable against more complicated input. It would probably have failed completely if the text fields had contained any of the following characters: ,;"

I therefore would like to see the following features in datamash:

1. allow to specify a quoting character for fields (default none)
2. allow to specify a character that is then used as a decimal separator (ideally in both input and output)

I would like to hear your thoughts on this.

Thanks a lot,
Johannes


Original Format of the bank statement (heavily censored so that I am comfortable sharing it, but still formatted like a MT940 CSV). What I want to do is this to sum up the "Betrag" column, gouped bt the "Buchungstext": datamash -H groupby 4 sum 9

"Auftragskonto";"Buchungstag";"Valutadatum";"Buchungstext";"Verwendungszweck";"Beguenstigter/Zahlungspflichtiger";"Kontonummer";"BLZ";"Betrag";"Waehrung";"Info"
"DE123456";"07.12.21";"07.12.21";"BARGELDAUSZAHLUNG";"07.12/12.05UHR Somewhere";"GA NR00000000 BLZ123456 0";"";"7654321";"-50,00";"EUR";"Umsatz gebucht" "DE123456";"06.12.21";"06.12.21";"GUTSCHR. UEBERWEISUNG";"SVWZ+Taschengeld";"Johannes";"DE654321";"G123456";"10,00";"EUR";"Umsatz gebucht" "DE123456";"03.12.21";"03.12.21";"KARTENZAHLUNG";"SVWZ+2021-12-01T08.37 Something";"VERKEHRS AG SOMEWHERE";"DE123456";"ZZZYYYXXX";"-21,98";"EUR";"Umsatz gebucht" "DE123456";"01.12.21";"01.12.21";"ENTGELTABSCHLUSS";"Entgeltabrechnungsiehe Anlage";"";"";"7654321";"-4,90";"EUR";"Umsatz gebucht" "DE123456";"04.11.21";"04.11.21";"GUTSCHR. UEBERWEISUNG";"SVWZ+Taschengeld";"Johannes";"DE654321";"G123456";"10,00";"EUR";"Umsatz gebucht" "DE123456";"02.11.21";"01.11.21";"DAUERAUFTRAG";"SVWZ+Spende Lage der Nation";"Lage der Nation Media GmbH& Co. KG";"NL123456";"BUNABCDEF";"-5,00";"EUR";"Umsatz gebucht" "DE123456";"02.11.21";"01.11.21";"DAUERAUFTRAG";"SVWZ+Spende Qutebrowser";"Qutebrowser";"CH123456";"POABCDEF";"-5,00";"EUR";"Umsatz gebucht" "DE123456";"02.11.21";"01.11.21";"DAUERAUFTRAG";"SVWZ+Spende Netzpolitik.org";"netzpolitik.org e. V.";"DE123456";"GENABCDEF";"-7,50";"EUR";"Umsatz gebucht" "DE123456";"02.11.21";"30.10.21";"ENTGELTABSCHLUSS";"Entgeltabrechnungsiehe Anlage";"";"";"123456";"-4,90";"EUR";"Umsatz gebucht"

A more minimal example (I want datamash -H groupby 1 sum 2):

"method";"money_difference"
"transfer";"10,00"
"credit card";"-4,90"
"transfer";"10,00"

An evil example that makes scripting around more difficult (I want datamash -H groupby 2 sum 3):

"evil_string";"method";"money_difference"
"say "hello"";"transfer, immediately";"10,00"
";";"credit card, at some point";"-4,90"
",";"transfer, immediately";"10,00"



reply via email to

[Prev in Thread] Current Thread [Next in Thread]