[Cuis] About adding a Unicode handling porting layer

Germán Arduino garduino at gmail.com
Fri Feb 1 09:57:39 CST 2013


Thanks Hannes, this is very useful to me.

My next step in porting is to polish WebClient and, among other
things, Unicode is an issue.

Germán.

2013/2/1 H. Hirzel <hannes.hirzel at gmail.com>:
> Thank you Juan,
> for adding the Unicode fix so that pasting text through the clipboard
> does not silently lose characters. More things like this (including
> comments) later.
>
> I have realized that what I wrote earlier is wrong. Cuis reads and
> saves files in ISO8859-15 by default, not in a Unicode encoding. However,
> it is not too difficult to read and write a Unicode file.
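
A minimal sketch of the difference between the two file encodings, written
in Python rather than Cuis Smalltalk. The sample text and the helper name
utf8_to_latin9 are assumptions made for illustration; nothing here is the
Cuis file API.

```python
# The same text stored as ISO-8859-15 bytes and as UTF-8 bytes.
text = "Germán pays 5 €"

latin9_bytes = text.encode("iso8859_15")  # always one byte per character
utf8_bytes = text.encode("utf-8")         # one to four bytes per character

# Reading UTF-8 bytes as if they were ISO-8859-15 does not fail,
# but it garbles every non-ASCII character.
misread = utf8_bytes.decode("iso8859_15")


def utf8_to_latin9(data: bytes) -> bytes:
    """Convert UTF-8 file content to ISO-8859-15 (hypothetical helper).

    Raises UnicodeEncodeError if a character is not in ISO-8859-15.
    """
    return data.decode("utf-8").encode("iso8859_15")
```

A file written in one encoding has to be decoded with that same encoding;
the bytes alone do not say which one was used.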
>
> I have started some notes on this here
> https://github.com/hhzl/Cuis-Add-Ons/blob/master/UnicodeNotes.md
>
> Regards
> Hannes
>
> On 1/23/13, Juan Vuletich <juan at jvuletich.org> wrote:
>> Thanks Hannes, just integrated this.
>>
>> Cheers,
>> Juan Vuletich
>>
>> H. Hirzel wrote:
>>> The attached change set prevents Cuis from silently ignoring
>>> characters which are not in ISO 8859-15.
>>>
>>> For example, if you paste a text snippet which contains the letter
>>> Omega (Ω) into a TextWindow it is displayed as &#937;
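
The fallback described here can be sketched in Python as follows. This is
only an illustration of the idea, not the attached change set; the function
name to_latin9_with_references is hypothetical.

```python
import html


def to_latin9_with_references(text: str) -> str:
    """Keep ISO 8859-15 characters; turn all others into &#nnn; references."""
    encoded = text.encode("iso8859_15", errors="xmlcharrefreplace")
    return encoded.decode("iso8859_15")


pasted = "Resistance: 10 Ω"
print(to_latin9_with_references(pasted))  # prints "Resistance: 10 &#937;"

# The opposite direction (references back to characters) could use the
# standard library's html.unescape.
restored = html.unescape("Resistance: 10 &#937;")
```

Characters that ISO 8859-15 does cover, such as á or €, pass through
unchanged.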
>>>
>>> The part which does it the other way round is not included.
>>>
>>> --Hannes
>>>
>>>
>>>
>>> On 1/22/13, H. Hirzel <hannes.hirzel at gmail.com> wrote:
>>>
>>>> Hello Germán
>>>>
>>>> On 1/22/13, Germán Arduino <garduino at gmail.com> wrote:
>>>>
>>>>> Nice if you will develop the needed code!
>>>>>
>>>>> The first need I have is in the Swazoo methods that I mentioned in
>>>>> another mail, but I think that is simpler; I just wasn't
>>>>> aware of the support already in place in Cuis itself.
>>>>>
>>>> Yes, it took me some time as well to find out that Cuis does indeed
>>>> have some limited Unicode support.
>>>>
>>>> Juan originally wrote that Cuis had dropped Unicode support.
>>>>
>>>> When I look at Cuis from the outside, I cannot say that this is the
>>>> case, as Cuis consumes and writes UTF8 text files. Unicode text
>>>> snippets pasted through the clipboard into a Cuis TextEditor also come
>>>> through fine. The only limitation is that internally it only handles the
>>>> code points which are in https://de.wikipedia.org/wiki/ISO_8859-15.
>>>> And if I work in a Cuis workspace with
>>>>
>>>>     nn asCharacter
>>>>
>>>> where nn is an Integer
>>>>
>>>>    nn must belong to ISO_8859-15
>>>>
>>>>
>>>> ISO_8859-15 is good for most European languages. If we had an
>>>> add-on to cater for the occasional Unicode characters which do
>>>> not fall into the set covered by ISO_8859-15, that would make UTF8 text
>>>> file processing with Cuis safe.
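
To make the coverage question concrete, here is a small Python sketch that
tests whether a Unicode code point has a place in ISO 8859-15. It works on
Unicode code points, which is an assumption of this example and not
necessarily the numbering that `nn asCharacter` uses inside Cuis; the
function name in_latin9 is hypothetical.

```python
def in_latin9(code_point: int) -> bool:
    """Return True if the Unicode code point is encodable in ISO 8859-15.

    Assumes code_point is a valid Unicode code point (0 to 0x10FFFF).
    """
    try:
        chr(code_point).encode("iso8859_15")
        return True
    except UnicodeEncodeError:
        return False


in_latin9(0x00E1)  # á, covered
in_latin9(0x20AC)  # €, covered although its code point is above 255
in_latin9(0x00A4)  # ¤, not covered: ISO 8859-15 gave its slot to €
in_latin9(0x03A9)  # Ω, not covered
```

The ¤ and € cases show that "fits in one byte" and "is below 256 in
Unicode" are not the same test.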
>>>>
>>>>
>>>> --Hannes
>>>>
>>>>
>>>>
>>>>> Germán.
>>>>>
>>>>> 2013/1/22 H. Hirzel <hannes.hirzel at gmail.com>:
>>>>>
>>>>>> Hello Germán and Juan
>>>>>>
>>>>>> As we have seen we can say that Cuis handles Unicode to a certain
>>>>>> limited extent.
>>>>>>
>>>>>> I will post a summary writeup of what I know about it later. I am
>>>>>> interested in working on or contributing to an add-on which loads
>>>>>> Unicode support into Cuis.
>>>>>>
>>>>>> For general work I need
>>>>>>
>>>>>> a)
>>>>>> an add-on so that Cuis can process arbitrary UTF8 text files. However,
>>>>>> the majority of the content characters will fall into the
>>>>>>   https://de.wikipedia.org/wiki/ISO_8859-15
>>>>>> range. So it is fine if the other characters are rendered as \unnn or
>>>>>> &#nnn;
>>>>>>
>>>>>> b)
>>>>>> Another, more rewarding but maybe more difficult, way would be to
>>>>>> replace the String class with a class which handles 16-bit characters
>>>>>> instead of 8-bit characters. In terms of structure all would remain
>>>>>> the same. Characters would be 16-bit like in Java.
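
One detail worth keeping in mind for option b): a 16-bit character, as in
Java, is a UTF-16 code unit, and code points above U+FFFF need two of them.
The Python sketch below only counts code units to show this; the function
name utf16_units is hypothetical and says nothing about how a Cuis String
replacement would be built.

```python
def utf16_units(text: str) -> int:
    """Number of 16-bit code units needed to store text as UTF-16."""
    return len(text.encode("utf-16-be")) // 2


utf16_units("Ω")            # 1: fits in a single 16-bit unit
utf16_units("€")            # 1
utf16_units("\U0001D11E")   # 2: musical G clef, stored as a surrogate pair
```

So a 16-bit String would cover the characters most text uses in one unit
each, while still needing special handling for the rest.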
>>>>>>
>>>>>>
>>>>>> This will come later. At the moment I am working on ContentPack
>>>>>> version 2 which will run on Cuis, Squeak and Pharo.
>>>>>>
>>>>>> Kind regards
>>>>>>
>>>>>> --Hannes
>>>>>>
>>>>>>
>>>>>>> 2013/1/22 Germán Arduino <garduino at gmail.com>:
>>>>>>>
>>>>>>>> Thanks for the comments, Hannes / Juan:
>>>>>>>>
>>>>>>>> I will look into it when I have time, or if you prefer, Hannes, and
>>>>>>>> want to help, I will integrate it when I finish with Aida.
>>>>>>>>
>>>>>>>> Germán.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 2013/1/21 Juan Vuletich <juan at jvuletich.org>:
>>>>>>>>
>>>>>>>>> Hi Germán,
>>>>>>>>>
>>>>>>>>> Cool! Just a remark: Cuis does include conversion to/from utf-8
>>>>>>>>> for the charset it supports (ISO-8859-15, covering nearly all the
>>>>>>>>> Latin alphabets).
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Juan Vuletich
>>>>>>>>>
>>>>>>>>> Germán Arduino wrote:
>>>>>>>>>
>>>>>>>>>> Hi:
>>>>>>>>>>
>>>>>>>>>> The first versions of Sport and Swazoo working in Cuis 4.1 with all
>>>>>>>>>> tests green are ready to install.
>>>>>>>>>>
>>>>>>>>>> The changes I did in Swazoo are:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> - Avoid Unicode support that doesn't exist in Cuis
>>>>>>>>>>
>>>>>>>>>>
>>>>>> ......
>>>>>>
>>>>>> _______________________________________________
>>>>>> Cuis mailing list
>>>>>> Cuis at jvuletich.org
>>>>>> http://jvuletich.org/mailman/listinfo/cuis_jvuletich.org
>>>>>>
>>>>>
>>>>> --
>>>>> Sincerely,
>>>>> Germán Arduino
>>>>> about.me/garduino
>>>>>



