[Cuis] (Minimal) requirements for Unicode support?

Juan Vuletich juan at jvuletich.org
Fri Feb 22 13:09:50 CST 2013


Hi Hannes,

I had trouble with my internet connection. The updates are now on
GitHub. Please check them. The test I added is your UnicodeTest, verbatim.

Cheers,
Juan Vuletich

On 2/21/2013 7:00 PM, H. Hirzel wrote:
> Hello Juan
>
> On 2/21/13, Juan Vuletich <juan at jvuletich.org> wrote:
>> Thanks for this. I added your new test.
> Yes, thank you. I have seen
>       StringTest new testAsUtf8
> and
>       StringTest new testAsUtf8WithNCRs
>
> or is it something different?
>
>
>
>
>> BTW, in later changes, I tweaked the protocol a bit, adding a flag to
>> skip a trailing null. This was needed for the Windows clipboard.
> fromUtf8 is now
>
>
> String>>
> fromUtf8: aByteArray hex: useHexForNCRs trimLastNull: doTrimLastNullChar
> 	"Convert the given string from UTF-8 to the internal encoding: ISO
> Latin 9 (ISO 8859-15)"
> 	"For Unicode chars not in ISO Latin 9 (ISO 8859-15), embed Decimal
> NCRs or Hexadecimal NCRs according to useHexForNCRs.
> 	
> 	See http://en.wikipedia.org/wiki/Numeric_character_reference
> 	See http://rishida.net/tools/conversion/. Tests prepared there.
> 	
> 	Note: The conversion of NCRs is reversible. See #asUtf8:
> 	This allows handling the full Unicode in Cuis tools, that can only
> display the Latin alphabet, by editing the NCRs.
> 	The conversions can be done when reading / saving files, or when
> pasting from Clipboard and storing back on it."
>
>
>
>> Can you update the examples again?
> Which examples? I have added you as a collaborator for
> https://github.com/hhzl/Cuis-Add-Ons
> so that you can mark it directly through the GitHub web interface.
>
>
>> BTW, some of the Cuis methods in UnicodeNotes.md are outdated as well...
> OK, noted.
>
> Regards
> --Hannes
>
>
>> On 2/13/2013 2:46 PM, H. Hirzel wrote:
>>> Hello Juan
>>>
>>>> On 2/8/13, Juan Vuletich <juan at jvuletich.org> wrote:
>>>>> Unfortunately, this means I broke the examples at
>>>>> https://github.com/hhzl/Cuis-Add-Ons/blob/master/UnicodeNotes.md .
>>> I have updated the file UnicodeNotes.md
>>>
>>> and I wrote a test class (attached) which shows how to read and write a
>>> UTF8 file.
>>>
>>> test5ReadWriteUtf8
>>> 	"see UnicodeNotes.md"
>>> 	"self new test5ReadWriteUtf8"
>>> 	| stream content byteArray byteArray2 |
>>>
>>> 	"read UTF8 Unicode file into internal string with NCRs"
>>> 	"for NCR see http://en.wikipedia.org/wiki/Numeric_character_reference"
>>> 	stream := (FileStream fileNamed: self class fileName) binary.
>>> 	byteArray := stream contentsOfEntireFile.
>>> 	content := String fromUtf8: byteArray.
>>> 	"NCRs were added to 'content' as needed"
>>>
>>> 	"write internal string back to UTF8 file with NCRs converted back to UTF8 chars"
>>> 	stream := (FileStream forceNewFileNamed: self class fileName2) binary.
>>> 	stream nextPutAll: (content asUtf8: true).  "true means: convert NCRs back to UTF8"
>>> 	stream close.
>>>
>>> 	"compare the two versions: what is in file 'fileName' with what is in file 'fileName2'"
>>> 	stream := (FileStream fileNamed: self class fileName) binary.
>>> 	byteArray := stream contentsOfEntireFile.
>>> 	stream close.
>>>
>>> 	stream := (FileStream fileNamed: self class fileName2) binary.
>>> 	byteArray2 := stream contentsOfEntireFile.
>>> 	stream close.
>>>
>>> 	self assert: byteArray = byteArray2.
>>>
>>>
>>> BTW, according to the 'Official name and variants' section of
>>> http://en.wikipedia.org/wiki/UTF8, UTF8 should be written all in
>>> uppercase.
>>>
>>> As of now, I can use Cuis 4.1-1590 as is for my work, which includes
>>> reading and writing UTF8-encoded text files (including HTML files). So,
>>> as far as I am concerned, further extension of Cuis Unicode support
>>> can be put on the back burner for some time.
>>>
>>> However, it might still be worthwhile to consider maintaining
>>> TextConverter and UTF8Converter classes for compatibility and other
>>> reasons. More on this later.
>>>
>>> Thank you for the update
>>>
>>> https://github.com/jvuletich/Cuis/blob/master/UpdatesSinceLastRelease/1590-InvertibleUTF8Conversion-JuanVuletich-2013Feb08-08h11m-jmv.1.cs.st
>>>
>>> and
>>>
>>> kind regards
>>>
>>> Hannes Hirzel
>>>
>>>
>>> _______________________________________________
>>> Cuis mailing list
>>> Cuis at jvuletich.org
>>> http://jvuletich.org/mailman/listinfo/cuis_jvuletich.org
>>
>
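
The reversible NCR conversion described in the quoted fromUtf8:hex:trimLastNull: comment can be sketched outside the image as follows. This is a minimal illustration in Python, not the Cuis implementation: the names from_utf8, as_utf8, and their keyword parameters are hypothetical and only mirror the Smalltalk selectors, and the "internal string" is modelled as plain ISO 8859-15 bytes.

```python
import re

LATIN9 = "iso8859_15"  # Python's codec name for ISO 8859-15 (Latin 9)

# Matches decimal (&#945;) and hexadecimal (&#x3B1;) numeric character references.
_NCR = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def from_utf8(data, use_hex=False, trim_last_null=False):
    """Hypothetical analogue of fromUtf8:hex:trimLastNull: (an assumption,
    not the Cuis code). Returns Latin 9 bytes; characters that Latin 9
    cannot represent are embedded as decimal or hexadecimal NCRs."""
    if trim_last_null and data.endswith(b"\x00"):
        data = data[:-1]  # drop one trailing null byte, e.g. from a clipboard buffer
    out = bytearray()
    for ch in data.decode("utf-8"):
        try:
            out += ch.encode(LATIN9)
        except UnicodeEncodeError:
            ncr = f"&#x{ord(ch):X};" if use_hex else f"&#{ord(ch)};"
            out += ncr.encode("ascii")
    return bytes(out)


def _ncr_to_char(match):
    hex_digits, dec_digits = match.groups()
    code = int(hex_digits, 16) if hex_digits else int(dec_digits)
    # Leave references that are not valid Unicode scalar values untouched.
    if code > 0x10FFFF or 0xD800 <= code <= 0xDFFF:
        return match.group(0)
    return chr(code)


def as_utf8(latin9, convert_ncrs=True):
    """Hypothetical analogue of asUtf8: (an assumption). Encodes Latin 9
    bytes as UTF-8, optionally turning NCRs back into the characters."""
    text = latin9.decode(LATIN9)
    if convert_ncrs:
        text = _NCR.sub(_ncr_to_char, text)
    return text.encode("utf-8")


original = "Grüße € α".encode("utf-8")
internal = from_utf8(original)   # b'Gr\xfc\xdfe \xa4 &#945;'
round_trip = as_utf8(internal)   # same bytes as original
```

Note that in this sketch the round trip is exact only when the input does not already contain literal text that looks like an NCR; such text would be converted to the referenced character on the way back.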




