emacs-orgmode
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Help with a (query) replacement


From: Juan Manuel Macías
Subject: Re: Help with a (query) replacement
Date: Sat, 12 Nov 2022 16:29:52 +0000

Juan Manuel Macías writes:

> In the case of PDFs, I would use pdftotext. It converts the PDF to plain
> text and (in theory) removes hyphens from the PDF after conversion. The
> resulting plain text is somewhat ugly (page numbers and other elements
> are preserved), but if you just want to copy/paste text, I think it's
> enough.

And if you don't want to mess with the command line, you can also use
calibre here to convert from PDF to plain text or even Epub (the latter
is better because Epub is a tagged format and then you can have more
control over how to process that, for example by converting it to Org or
Markdown with pandoc). Calibre will do its best to preserve the
structure of the PDF, removing hyphens and other unnecessary elements.
But keep in mind that this process is largely heuristic, and the
conversion is not 100% perfect. However, it works acceptably well.

https://calibre-ebook.com/about



reply via email to

[Prev in Thread] Current Thread [Next in Thread]