Qt4-preview-feedback Archive, April 2005
Qt-4 Beta 2 And Locale Problems in Turkish


Message 1 in thread

Hi;

First of all, let me describe the problem(s) with the Turkish locale.

Turkish has four letter "I"s. English has only two, a lowercase dotted i
and an uppercase dotless I, but Turkish has lowercase and uppercase
forms of both the dotted and the dotless I.

So the result of changing the case of these letters differs in the
Turkish locale.
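
To make the four-letter mapping concrete, here is a minimal sketch in plain
C++ (no Qt). All function names are hypothetical and the tables cover only
ASCII plus the two Turkish-specific code points, U+0130 (dotted capital I)
and U+0131 (dotless small i). It is an illustration of the pairing
i <-> U+0130 and U+0131 <-> I, not how Qt implements case conversion.

```cpp
#include <string>

// Hypothetical helpers for illustration only; none of these are Qt functions.

// Locale-independent upper-casing, restricted to ASCII plus dotless i.
char32_t toUpperDefault(char32_t c) {
    if (c >= U'a' && c <= U'z') return c - (U'a' - U'A');
    if (c == U'\u0131') return U'I';   // dotless small i -> I
    return c;
}

// Locale-independent lower-casing, restricted to ASCII plus dotted capital I.
char32_t toLowerDefault(char32_t c) {
    if (c >= U'A' && c <= U'Z') return c + (U'a' - U'A');
    if (c == U'\u0130') return U'i';   // dotted capital I -> i
    return c;
}

// Turkish tailoring: the dotted pair is i / U+0130, the dotless pair U+0131 / I.
char32_t toUpperTurkish(char32_t c) {
    return c == U'i' ? U'\u0130' : toUpperDefault(c);
}

char32_t toLowerTurkish(char32_t c) {
    return c == U'I' ? U'\u0131' : toLowerDefault(c);
}

// Apply a per-character mapping to a whole UTF-32 string.
std::u32string mapString(std::u32string s, char32_t (*f)(char32_t)) {
    for (char32_t &c : s) c = f(c);
    return s;
}
```

With these helpers, upper-casing "title" gives "TITLE" under the default
rules but a dotted capital I (U+0130) in second position under the Turkish
rules, which is exactly the kind of mismatch that breaks keyword comparisons.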

Turkish also has modified versions of the Latin letters "s", "o", "c",
"u" and "g" (ş, ö, ç, ü and ğ).

A much more detailed explanation can be found at
http://www.i18nguy.com/unicode/turkish-i18n.html



And Qt has issues with these situations.

Qt-4 Beta 2 cannot be compiled under a Turkish locale. I'm not sure what
causes this, but it may be due to the i/I case conversion problems
described above. This behaviour was reported to this list before by
İsmail Dönmez ( http://lists.trolltech.com/qt4-preview-feedback/2005-04/msg00169.html )

Currently QString has no locale-dependent case conversion function.
This has also been discussed here before
( http://lists.trolltech.com/qt4-preview-feedback/2005-04/msg00051.html ) and reported to Trolltech ( as Issue N68627 ).

According to that thread, the Qt developers don't want any
locale-dependent functions in QString. QString and QCString work in a
locale-independent way and will continue to do so. The only
locale-dependent function is localeAwareCompare().

The first problem arises here: many developers think QString is
locale-aware, so applications either don't work in the Turkish locale or
work with bugs. For example, KDE has problems with Turkish because of
this ( like #93770, #101211, #91089 ). QStringList is also
locale-independent, so sort() does not give the right order unless it is
overridden.
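
As a rough illustration of the sorting issue, the sketch below (plain C++,
not Qt) compares strings by position in the Turkish alphabet instead of by
code point. The names kTurkishAlphabet, turkishRank and turkishLess are
hypothetical, and the comparator assumes lowercase input only; a real
implementation would use proper collation data.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical illustration: the 29 lowercase letters of the Turkish
// alphabet in dictionary order (c-cedilla after c, dotless i before i, ...).
const std::u32string kTurkishAlphabet =
    U"abc" U"\u00E7" U"defg" U"\u011F" U"h" U"\u0131" U"ijklmno"
    U"\u00F6" U"prs" U"\u015F" U"tu" U"\u00FC" U"vyz";

// Rank of a character: its alphabet position, or (after all Turkish
// letters) its code point if it is not in the alphabet.
std::size_t turkishRank(char32_t c) {
    std::size_t pos = kTurkishAlphabet.find(c);
    return pos != std::u32string::npos ? pos : kTurkishAlphabet.size() + c;
}

// Strict weak ordering usable with std::sort.
bool turkishLess(const std::u32string &a, const std::u32string &b) {
    return std::lexicographical_compare(
        a.begin(), a.end(), b.begin(), b.end(),
        [](char32_t x, char32_t y) { return turkishRank(x) < turkishRank(y); });
}
```

Passing turkishLess to std::sort puts a word starting with c-cedilla right
after the words starting with c, whereas a plain code-point sort places it
after every word starting with z.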

The second problem is that the locale-aware function localeAwareCompare()
does not work for the Turkish locale.

Third, although QByteArray works in a locale-dependent way, it cannot
convert Turkish characters, because it uses glibc's tolower() function,
which also can't handle Unicode characters.
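
The limitation can be shown with a small sketch in standard C++. The helper
lowerBytewise is hypothetical and only mimics a byte-by-byte tolower() loop;
it is not QByteArray's actual code. It assumes the program runs in the
default "C" locale, where only ASCII A-Z are mapped. Because tolower() takes
a single byte, it cannot map a multi-byte UTF-8 sequence in any locale.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: lower-case a byte string one byte at a time.
std::string lowerBytewise(std::string s) {
    for (char &ch : s) {
        // The cast to unsigned char avoids undefined behaviour for bytes >= 0x80.
        ch = static_cast<char>(std::tolower(static_cast<unsigned char>(ch)));
    }
    return s;
}
```

Given the UTF-8 bytes for "I" followed by the dotted capital I (0xC4 0xB0),
the ASCII "I" becomes "i" while the two-byte sequence passes through
unchanged.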

As a result, Qt can't handle Turkish characters, unless I've missed
something. Dummy code for testing these problems can be found at
http://cekirdek.uludag.org.tr/~caglar/main.cpp

I fully understand that you don't want to mix QLocale's responsibilities
with QString and will not add any locale-dependent function to the
QString family. One solution would be to implement locale-aware
toLower()/toUpper()/sort() functions in QLocale, just like toDouble().
This would also solve the "application keywords must not be
locale-dependent" problem, because QString would keep working in Latin1.

What do you think, will Qt-4 final be Turkish-friendly? And what
about Qt-3?

Yours
-- 
 [ signature omitted ] 



Message 2 in thread

S.Çağlar Onur wrote:
> Hi;
> 
> First of all, let me describe the problem(s) with the Turkish locale.
> 
> Turkish has four letter "I"s. English has only two, a lowercase dotted i
> and an uppercase dotless I, but Turkish has lowercase and uppercase
> forms of both the dotted and the dotless I.
> 
> So the result of changing the case of these letters differs in the
> Turkish locale.
> 
> Turkish also has modified versions of the Latin letters "s", "o", "c",
> "u" and "g".
> 
> A much more detailed explanation can be found at
> http://www.i18nguy.com/unicode/turkish-i18n.html

I don't have any solution to your problems, but I was wondering what
Trolltech is planning to do about this. From what I've heard, the
QString class is not intended to be locale-aware, but how can that work?
If it has comparison, to upper/lower, and similar operations, then if
it's not locale-aware, how do they work? Only for ASCII? Binary bitwise
comparisons???

I'm having trouble understanding how you can have a Unicode-based string
class with these operations that doesn't use at least some locale for
the work, and why it shouldn't be locale-aware...

-- 
 [ signature omitted ] 

Message 3 in thread

Hello,

On Wednesday 20 April 2005 01:58, S.Çağlar Onur wrote:
... snip ...
> What do you think, will Qt-4 final be Turkish-friendly? And what
> about Qt-3?

The problem Caglar mentions isn't specific to the Turkish locale. AFAIK
some other locales suffer from the same type of problem. Azerbaijani is
one locale that I know has exactly the same problem as Turkish.

Will there be a sane solution for locale-dependent case conversions? Or
what are we supposed to do to fix the programs that assume
QString::lower() is locale-dependent?

regards,
-- 
 [ signature omitted ] 



Message 4 in thread

On Wednesday 20 April 2005 16:06, Barış Metin wrote:
> On Wednesday 20 April 2005 01:58, S.Çağlar Onur wrote:
> ... snip ...
>
> > What do you think, will Qt-4 final be Turkish-friendly? And what
> > about Qt-3?
>
> The problem Caglar mentions isn't specific to the Turkish locale. AFAIK
> some other locales suffer from the same type of problem. Azerbaijani is
> one locale that I know has exactly the same problem as Turkish.
>
> Will there be a sane solution for locale-dependent case conversions? Or
> what are we supposed to do to fix the programs that assume
> QString::lower() is locale-dependent?

It's sad that this thread didn't attract much interest, but I think the
current situation is ambiguous for developers, and we need a
clarification on the topic.

regards,
-- 
 [ signature omitted ] 

Message 5 in thread


> I don't have any solution to your problems, but I was
> wondering what Trolltech is planning to do about this. From
> what I've heard, the QString class is not intended to be
> locale-aware, but how can that work?

The trolls would have to confirm (well, I suppose I could look at the
code), but it would probably use the case folding rules from
http://www.unicode.org/Public/UNIDATA/CaseFolding.txt

I don't really know whether it would follow the Simple or Full profile
(the difference being whether case folding can cause length changes).
Regardless, the four i's of the Turkic languages (tr_TR and az_AZ) are
the only exceptions to locale-independent case folding in that listing,
though there are some additional exceptions in
http://www.unicode.org/Public/UNIDATA/SpecialCasing.txt, mostly dealing
with ligatures.
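
To illustrate the Simple/Full distinction and the Turkic exception, here is
a small sketch in plain C++. The names FoldMode, foldChar and fold are
hypothetical, and the table holds only ASCII plus three entries copied from
CaseFolding.txt; it says nothing about what Qt does internally.

```cpp
#include <string>

enum class FoldMode { Simple, Full };

// Fold one code point. Only ASCII and three CaseFolding.txt entries are covered.
std::u32string foldChar(char32_t c, FoldMode mode, bool turkic) {
    if (turkic) {
        if (c == U'I') return U"\u0131";       // 0049; T; 0131
        if (c == U'\u0130') return U"i";       // 0130; T; 0069
    }
    if (c >= U'A' && c <= U'Z')                // common (C) mappings for ASCII
        return std::u32string(1, static_cast<char32_t>(c + (U'a' - U'A')));
    if (mode == FoldMode::Full) {
        if (c == U'\u00DF') return U"ss";      // 00DF; F; 0073 0073
        if (c == U'\u0130') return U"i\u0307"; // 0130; F; 0069 0307
    }
    return std::u32string(1, c);               // no mapping: unchanged
}

// Fold a whole UTF-32 string; Full mode may change its length.
std::u32string fold(const std::u32string &s, FoldMode mode, bool turkic = false) {
    std::u32string out;
    for (char32_t c : s) out += foldChar(c, mode, turkic);
    return out;
}
```

Folding the six-character word "STRA" + sharp s + "E" keeps six characters
in Simple mode but yields the seven-character "strasse" in Full mode, and a
capital I folds to the dotless small i only when the Turkic flag is set.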

> If it has comparison, to upper/lower, and similar operations,
> then if it's not locale-aware, how do they work? Only for
> ASCII? Binary bitwise comparisons???
> 
> I'm having trouble understanding how you can have a Unicode-based
> string class with these operations that doesn't use at least
> some locale for the work, and why it shouldn't be locale-aware...
>
> --
> Brad Pepers
> brad@xxxxxxxxxxxxxxx

Message 6 in thread

On Thursday 21 April 2005 17:03, Puetz Kevin A wrote:
> > I don't have any solution to your problems, but I was
> > wondering what Trolltech is planning to do about this. From
> > what I've heard, the QString class is not intended to be
> > locale-aware, but how can that work?
>
> The trolls would have to confirm (well, I suppose I could look at the code),
> but it would probably use the case folding rules from
> http://www.unicode.org/Public/UNIDATA/CaseFolding.txt
>
> I don't really know whether it would follow the Simple or Full profile (the
> difference being whether case folding can cause length changes).

We're following the simple profile. 

> Regardless, the four i's of the Turkic languages (tr_TR and az_AZ) are the
> only exceptions to locale-independent case folding in that listing, though
> there are some additional exceptions in
> http://www.unicode.org/Public/UNIDATA/SpecialCasing.txt, mostly dealing
> with ligatures.

Yes. We intend to add methods of dealing with locale dependent case folding to 
QLocale in a future release. As Kevin noted, there are only a few locales 
that have exceptions to the general case folding rules.

Best regards,
Lars
