[Cuis] About adding a Unicode handling porting layer

H. Hirzel hannes.hirzel at gmail.com
Fri Feb 1 08:42:35 CST 2013


Thank you Juan,
for adding the Unicode fix so that pasting text through the clipboard
does not silently lose characters. More things like this (including
comments) later.
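
As an illustration of what "not silently losing characters" means, here
is a minimal sketch in Python (not Cuis Smalltalk, and not the actual
change set). It assumes the fallback is the decimal &#nnn; form
mentioned further down in this thread; the function names are made up
for the example.

```python
# Sketch only: contrasts dropping unsupported characters with keeping
# them as decimal numeric character references (&#nnn;).
# Assumption: ISO 8859-15 is the target charset, as discussed in the thread.

LATIN9 = "iso8859_15"  # Python's codec name for ISO 8859-15


def encode_dropping(text: str) -> bytes:
    """Old behaviour (illustrative): unsupported characters vanish."""
    return text.encode(LATIN9, errors="ignore")


def encode_keeping(text: str) -> bytes:
    """New behaviour (illustrative): unsupported characters become &#nnn;."""
    return text.encode(LATIN9, errors="xmlcharrefreplace")


sample = "Omega: Ω, euro: €"
print(encode_dropping(sample))  # the Omega is lost, the euro sign survives
print(encode_keeping(sample))   # the Omega is kept as a character reference
```

The euro sign is part of ISO 8859-15, so only the Omega needs the
fallback.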

I have realized that what I wrote earlier is wrong. By default, Cuis
reads and saves files in ISO 8859-15, not in Unicode. However, it
is not too difficult to read and write a Unicode file.
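
A rough sketch of such a round trip, again in Python rather than Cuis
Smalltalk and only under assumptions: the text is held internally as
ISO 8859-15 bytes, other characters are stored as &#nnn; references,
and the helper names are hypothetical. Byte strings stand in for file
contents so the example is self-contained.

```python
import re

# Sketch only: UTF-8 in, ISO 8859-15 (plus &#nnn; references) inside,
# UTF-8 out. All names here are hypothetical, not Cuis API.

LATIN9 = "iso8859_15"
_NCR = re.compile(r"&#(\d+);")


def utf8_to_internal(data: bytes) -> bytes:
    """Decode UTF-8 and re-encode as ISO 8859-15, keeping other characters as &#nnn;."""
    return data.decode("utf-8").encode(LATIN9, errors="xmlcharrefreplace")


def _expand(match: "re.Match[str]") -> str:
    """Turn one &#nnn; reference back into its character when it is a valid code point."""
    code_point = int(match.group(1))
    if code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        return match.group(0)  # leave invalid references untouched
    return chr(code_point)


def internal_to_utf8(data: bytes) -> bytes:
    """Decode ISO 8859-15, expand &#nnn; references, and encode as UTF-8."""
    text = data.decode(LATIN9)
    return _NCR.sub(_expand, text).encode("utf-8")


original = "Ωmega costs 5 €".encode("utf-8")
internal = utf8_to_internal(original)
print(internal)
print(internal_to_utf8(internal) == original)
```

One known weakness of this sketch: a file that already contains the
literal text &#937; would come back with an Omega, so the round trip is
not exact for such input.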

I have started some notes on this here
https://github.com/hhzl/Cuis-Add-Ons/blob/master/UnicodeNotes.md

Regards
Hannes

On 1/23/13, Juan Vuletich <juan at jvuletich.org> wrote:
> Thanks Hannes, just integrated this.
>
> Cheers,
> Juan Vuletich
>
> H. Hirzel wrote:
>> The attached change set prevents Cuis from silently ignoring
>> characters which are not in ISO 8859-15.
>>
>> For example if you paste a text snippet which contains the letter
>> Omega (Ω) into a TextWindow it is displayed as &#937;
>>
>> The part which does it the other way round is not included.
>>
>> --Hannes
>>
>>
>>
>> On 1/22/13, H. Hirzel <hannes.hirzel at gmail.com> wrote:
>>
>>> Hello Germán
>>>
>>> On 1/22/13, Germán Arduino <garduino at gmail.com> wrote:
>>>
>>>> It would be nice if you developed the needed code!
>>>>
>>>> The first need I have is in the Swazoo methods that I mentioned in
>>>> another mail, but I think that is simpler; I just was not
>>>> aware of the support already in place in Cuis itself.
>>>>
>>> Yes, it also took me some time to find out that Cuis indeed has
>>> some limited Unicode support.
>>>
>>> Juan originally wrote that Cuis had dropped Unicode support.
>>>
>>> Looking at Cuis from the outside, I cannot say that this is the
>>> case, since Cuis reads and writes UTF-8 text files. Unicode text
>>> snippets pasted through the clipboard into a Cuis TextEditor also
>>> come through fine. The only limitation is that internally it only
>>> handles the code points which are in
>>> https://de.wikipedia.org/wiki/ISO_8859-15.
>>> And if I work in a Cuis workspace with
>>>
>>>     nn asCharacter
>>>
>>> where nn is an Integer
>>>
>>>    nn must belong to ISO_8859-15
>>>
>>>
>>> ISO_8859-15 is good for most European languages. If we had an
>>> Add-On to cater for the occasional Unicode characters which do
>>> not fall into the set covered by ISO_8859-15, that would make UTF-8
>>> text file processing with Cuis safe.
>>>
>>>
>>> --Hannes
>>>
>>>
>>>
>>>> Germán.
>>>>
>>>> 2013/1/22 H. Hirzel <hannes.hirzel at gmail.com>:
>>>>
>>>>> Hello Germán and Juan
>>>>>
>>>>> As we have seen we can say that Cuis handles Unicode to a certain
>>>>> limited extent.
>>>>>
>>>>> I will post a summary writeup of what I know about it later. I am
>>>>> interested in working on or contributing to an add-on which loads
>>>>> Unicode support into Cuis.
>>>>>
>>>>> For general work I need
>>>>>
>>>>> a)
>>>>> an add-on so that Cuis can process arbitrary UTF-8 text files. However,
>>>>> the majority of the content characters will fall into the
>>>>>   https://de.wikipedia.org/wiki/ISO_8859-15
>>>>> range. So it is fine if the other characters are rendered as \unnn or
>>>>> &#nnn;
>>>>>
>>>>> b)
>>>>> Another, more rewarding but maybe more difficult way would be to
>>>>> replace the String class with a class which handles 16-bit characters
>>>>> instead of 8-bit characters. In terms of structure all would remain
>>>>> the same. Characters would be 16-bit like in Java.
>>>>>
>>>>>
>>>>> This will come later. At the moment I am working on ContentPack
>>>>> version 2 which will run on Cuis, Squeak and Pharo.
>>>>>
>>>>> Kind regards
>>>>>
>>>>> --Hannes
>>>>>
>>>>>
>>>>>> 2013/1/22 Germán Arduino <garduino at gmail.com>:
>>>>>>
>>>>>>> Thanks for the comments Hannes / Juan:
>>>>>>>
>>>>>>> I will look into it when I have time, or if you prefer, Hannes, and
>>>>>>> want to help, I will integrate it when I finish with Aida.
>>>>>>>
>>>>>>> Germán.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 2013/1/21 Juan Vuletich <juan at jvuletich.org>:
>>>>>>>
>>>>>>>> Hi Germán,
>>>>>>>>
>>>>>>>> Cool! Just a remark: Cuis does include conversion to/from UTF-8
>>>>>>>> for the charset it supports (ISO-8859-15, covering nearly all the
>>>>>>>> Latin alphabets).
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Juan Vuletich
>>>>>>>>
>>>>>>>> Germán Arduino wrote:
>>>>>>>>
>>>>>>>>> Hi:
>>>>>>>>>
>>>>>>>>> The first versions of Sport and Swazoo working in Cuis 4.1 with all
>>>>>>>>> tests green are ready to install.
>>>>>>>>>
>>>>>>>>> The changes I did in Swazoo are:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> - Avoid Unicode support that doesn't exist in Cuis
>>>>>>>>>
>>>>>>>>>
>>>>> ......
>>>>>
>>>>> _______________________________________________
>>>>> Cuis mailing list
>>>>> Cuis at jvuletich.org
>>>>> http://jvuletich.org/mailman/listinfo/cuis_jvuletich.org
>>>>>
>>>>
>>>> --
>>>> Sincerely,
>>>> Germán Arduino
>>>> about.me/garduino
