gettext returns UTF-8 characters as question marks
From: Babak Razmjoo
Subject: gettext returns UTF-8 characters as question marks
Date: Fri, 27 Nov 2020 08:17:19 +0000
I'm sure this is not a bug in gettext, only my ignorance. I have a sample C++ program here:
```
#include <iostream>
#include <locale.h>
#include <libintl.h>

void printMessages()
{
    const char* msg1 = gettext("Spiders are the only web developers who enjoy finding bugs");
    const char* msg2 = gettext("A man, A horse, and a Gun");
    std::cout << "Joke : " << msg1 << "\nMovie : " << msg2 << "\n";
}

int main(int argc, const char *argv[])
{
    std::cout << "فارسی با استفاده از استریم\n";
    if (setlocale(LC_MESSAGES, "fa_IR.UTF8")) {
        bindtextdomain("testgettext", ".");
        textdomain("testgettext");
        printMessages();
        return 0;
    }
    else
        return 1;
}
```
I used xgettext to extract a messages.po file, then translated that file into Persian like this:
```
# Laziness sucks.
# Copyright (C) 2020 Babak
# This file is distributed under the same license as the PACKAGE package.
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: testgettext 0.1\n"
"Report-Msgid-Bugs-To: nowhere\n"
"POT-Creation-Date: 2020-11-25 21:31+0330\n"
"PO-Revision-Date: 2020-11-26 23:17+3:30\n"
"Last-Translator: Babak <EMAIL@ADDRESS>\n"
"Language: fa_IR\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
#: main.cc:7
msgid "Spiders are the only web developers who enjoy finding bugs"
msgstr "عنکبوتها تنها توسعه دهندگان وبی هستند که از یافتن باگ لذت می برند"
#: main.cc:8
msgid "A man, A horse, and a Gun"
msgstr "یک مرد، یک اسب، یک اسلحه"
```
I also used msgfmt to compile it into `testgettext.mo`. This is the directory layout of all these files:
```
.
├── fa_IR
│   └── LC_MESSAGES
│       └── testgettext.mo
├── fa.po
├── main.cc
├── messages.po
└── testgettext
```
Now whenever I run the compiled program `testgettext`, I see this output:
فارسی با استفاده از استریم
Joke : ???????? ???? ????? ??????? ??? ????? ?? ?? ????? ??? ??? ?? ????
Movie : ?? ???? ?? ???? ?? ?????
As you can see, all characters of the translated messages are displayed as '?'. But I get the correct result using the gettext program:
$ export TEXTDOMAINDIR=.
$ export LANG=fa_IR.UTF-8
$ gettext testgettext "Spiders are the only web developers who enjoy finding bugs"
عنکبوتها تنها توسعه دهندگان وبی هستند که از یافتن باگ لذت می برند
What have I done wrong, and what should I do to get the right output from my sample program?
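A possible explanation, offered as a hedged sketch rather than a confirmed diagnosis: the program sets only LC_MESSAGES, so LC_CTYPE stays in the "C" locale, and glibc's gettext converts translations to that locale's character set, replacing characters it cannot represent with '?'. The gettext command-line program initializes every locale category from the environment, which would account for its correct output. Assuming that is the cause, one workaround is to request UTF-8 output with `bind_textdomain_codeset`. The helper name `useUtf8Catalog` below is hypothetical and not part of the original program.

```cpp
#include <cassert>
#include <libintl.h>

// Hypothetical helper (an illustration, not part of the original program).
// Binds `domain` to the catalog directory `dir`, asks gettext to return
// translations for that domain as UTF-8 bytes regardless of LC_CTYPE,
// and makes `domain` the current text domain.
// Returns false if bindtextdomain or textdomain reports failure.
bool useUtf8Catalog(const char* domain, const char* dir)
{
    assert(domain != nullptr && dir != nullptr);

    if (bindtextdomain(domain, dir) == nullptr)
        return false;

    // Assumption: the question marks come from conversion to the "C"
    // locale's charset; forcing UTF-8 here avoids that conversion.
    // The return value is ignored to keep the sketch short.
    bind_textdomain_codeset(domain, "UTF-8");

    return textdomain(domain) != nullptr;
}
```

Calling `useUtf8Catalog("testgettext", ".")` in `main` in place of the separate `bindtextdomain` and `textdomain` calls should make `printMessages()` emit UTF-8. An alternative under the same assumption is `setlocale(LC_ALL, "fa_IR.UTF8")`, which sets LC_CTYPE along with the other categories.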