[bug#57151] [PATCH 1/2] gnu: Add tesseract-ocr-tessdata-fast.

From:	Maxim Cournoyer
Subject:	[bug#57151] [PATCH 1/2] gnu: Add tesseract-ocr-tessdata-fast.
Date:	Fri, 12 Aug 2022 08:52:25 -0400
User-agent:	Gnus/5.13 (Gnus v5.13) Emacs/28.1 (gnu/linux)

Hi Simon,

Simon South <simon@simonsouth.net> writes:

> Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:
>> * gnu/packages/ocr.scm (tesseract-ocr-tessdata-fast): New variable.
>
> Maxim,
>
> Would it not be better to generate a separate package for each of the
> languages and scripts this data covers, as is done by Debian for
> instance?  The entire dataset is about a gigabyte in size and supports
> more than a hundred languages yet I imagine most people would be using
> only one or two.
>
> This would mean tesseract-ocr could simply propagate the
> "tesseract-ocr-tessdata-fast-eng" package rather than cherry-picking a
> specific file, and would establish a convention that would be necessary
> for packaging the "best" dataset as well, if that's desired.

That's a good idea!  I think we could have both, just as Debian also has a
'tesseract-ocr-all' package covering all the languages/scripts.  That
way, the individual variants could be added later by those interested,
eh :-).

A procedure returning a language-specific package variant would make
sense for that.
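
Such a procedure might look roughly like the untested sketch below.
It assumes the patch is applied, so that (gnu packages ocr) exports
tesseract-ocr-tessdata-fast, and that the upstream source keeps one
"LANG.traineddata" file per language at its top level.  The procedure
name, the choice of copy-build-system and the "share/tessdata/" install
directory are hypothetical, picked only for illustration; they are not
taken from the patch.

```scheme
;; Hypothetical sketch only; names and paths are assumptions.
(use-modules (guix packages)
             (guix build-system copy)
             (gnu packages ocr))

(define (tesseract-ocr-tessdata-fast-for language)
  "Return a variant of TESSERACT-OCR-TESSDATA-FAST that installs only the
trained data file for LANGUAGE, a string such as \"eng\"."
  (package
    (inherit tesseract-ocr-tessdata-fast)
    (name (string-append "tesseract-ocr-tessdata-fast-" language))
    ;; Set explicitly so the sketch does not depend on how the parent
    ;; package is built.
    (build-system copy-build-system)
    (arguments
     ;; Copy a single file from the source tree into the output.  The
     ;; target directory is an assumption.
     `(#:install-plan
       '((,(string-append language ".traineddata")
          "share/tessdata/"))))))

;; Example: the variant tesseract-ocr could propagate by default.
(define tesseract-ocr-tessdata-fast-eng
  (tesseract-ocr-tessdata-fast-for "eng"))
```

With something along these lines, the full package could remain as the
"all languages" option while per-language variants are generated on
demand.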

Thanks,

Maxim
