[Cuis] (Minimal) requirements for Unicode support?
Juan Vuletich
juan at jvuletich.org
Fri Feb 22 13:09:50 CST 2013
Hi Hannes,
I had trouble with my internet connection. Now, the updates are at
GitHub. Please check them. The test I added is your UnicodeTest, verbatim.
Cheers,
Juan Vuletich
On 2/21/2013 7:00 PM, H. Hirzel wrote:
> Hello Juan
>
> On 2/21/13, Juan Vuletich<juan at jvuletich.org> wrote:
>> Thanks for this. I added your new test.
> Yes, thank you. I have seen
> StringTest new testAsUtf8
> and
> StringTest new testAsUtf8WithNCRs
>
> or is it something different?
>
>
>
>
>> BTW, in later changes, I tweaked the protocol a bit, adding a flag to
>> skip a trailing null. This was needed for Windows clipboard.
> fromUtf8 is now
>
>
> String>>
> fromUtf8: aByteArray hex: useHexForNCRs trimLastNull: doTrimLastNullChar
> "Convert the given string from UTF-8 to the internal encoding: ISO
> Latin 9 (ISO 8859-15)"
> "For unicode chars not in ISO Latin 9 (ISO 8859-15), embed Decimal
> NCRs or Hexadecimal NCRs according to useHexForNCRs.
>
> See http://en.wikipedia.org/wiki/Numeric_character_reference
> See http://rishida.net/tools/conversion/. Tests prepared there.
>
> Note: The conversion of NCRs is reversible. See #asUtf8:
> This allows handling the full Unicode in Cuis tools, that can only
> display the Latin alphabet, by editing the NCRs.
> The conversions can be done when reading / saving files, or when
> pasting from Clipboard and storing back on it."
>
>
>
>> Can you update the examples again?
> Which examples? I have added you as a collaborator for
> https://github.com/hhzl/Cuis-Add-Ons
> so that you can mark it directly through the github web interface.
>
>
>> BTW, some of the Cuis methods in UnicodeNotes.md are outdated as well...
> OK, noted.
>
> Regards
> --Hannes
>
>
>> On 2/13/2013 2:46 PM, H. Hirzel wrote:
>>> Hello Juan
>>>
>>>> On 2/8/13, Juan Vuletich<juan at jvuletich.org> wrote:
>>>>> Unfortunately, this means I broke the examples at
>>>>> https://github.com/hhzl/Cuis-Add-Ons/blob/master/UnicodeNotes.md .
>>> I have updated the file UnicodeNotes.md
>>>
>>> and I wrote a test class (attached) which shows how to read and write a
>>> UTF8 file.
>>>
>>> test5ReadWriteUtf8
>>>
>>> "see UnicodeNotes.md"
>>>
>>> "self new test5ReadWriteUtf8"
>>> | stream content byteArray byteArray2 |
>>>
>>> "read UTF8 Unicode file into internal string with NCRs"
>>> "for NCR see http://en.wikipedia.org/wiki/Numeric_character_reference"
>>>
>>> stream := (FileStream fileNamed: self class fileName) binary.
>>> byteArray := stream contentsOfEntireFile.
>>> content := String fromUtf8: byteArray.
>>> "NCRs were added to 'content' as needed"
>>>
>>> "write internal string back to UTF8 file with NCRs converted back to
>>> UTF8 chars"
>>> stream := (FileStream forceNewFileNamed: self class fileName2) binary.
>>> stream nextPutAll: (content asUtf8: true). "true means: convert NCRs
>>> back to UTF8"
>>> stream close.
>>>
>>> "compare the two versions: what is in file 'fileName' with what
>>> is in file 'fileName2'"
>>> stream := (FileStream fileNamed: self class fileName) binary.
>>> byteArray := stream contentsOfEntireFile.
>>> stream close.
>>>
>>> stream := (FileStream fileNamed: self class fileName2) binary.
>>> byteArray2 := stream contentsOfEntireFile.
>>> stream close.
>>>
>>> self assert: byteArray = byteArray2.
>>>
>>>
>>> BTW according to http://en.wikipedia.org/wiki/UTF8 'Official name and
>>> variants'
>>> UTF8 should all be uppercase.
>>>
>>> As of now I can use Cuis 4.1-1590 as is for my work which includes
>>> reading and writing UTF8 encoded text files (including HTML files). So
>>> as far as I am concerned further extended Cuis Unicode support might
>>> be put on the back burner for some time.
>>>
>>> However it might still be worthwhile considering maintaining a
>>> TextConverter and UTF8Converter class for compatibility and other
>>> reasons. More on this later.
>>>
>>> Thank you for the update
>>>
>>> https://github.com/jvuletich/Cuis/blob/master/UpdatesSinceLastRelease/1590-InvertibleUTF8Conversion-JuanVuletich-2013Feb08-08h11m-jmv.1.cs.st
>>>
>>> and
>>>
>>> kind regards
>>>
>>> Hannes Hirzel
>>>
>>>
>>> _______________________________________________
>>> Cuis mailing list
>>> Cuis at jvuletich.org
>>> http://jvuletich.org/mailman/listinfo/cuis_jvuletich.org
>>