[Cuis] Ropes & Unicode
H. Hirzel
hannes.hirzel at gmail.com
Sat Feb 16 12:00:19 CST 2013
Interesting observation, Ken.
This may be taken as encouragement to move ahead with the
implementation of Ropes.
According to http://static.rust-lang.org/doc/0.5/std/rope.html
"Ropes are a high-level representation of text that offers much better
performance than strings for common operations, and generally reduce
memory allocations and copies, while only entailing a small
degradation of less common operations."
[...]
"In addition, the tree structure of ropes makes them suitable as a
form of index to speed-up access to Unicode characters by index in
long chunks of text."
The basic string type in Rust also holds UTF-8 encoded characters
(as of version 0.3): http://dl.rust-lang.org/doc/0.3/tutorial.html
Should the Rust language Ropes API
http://static.rust-lang.org/doc/0.5/std/rope.html#type-rope
be taken as a model for the Cuis implementation?
So far there are 11 methods in the Cuis Ropes implementation:
Rope selectors
a Set(#asString
#,
#stringRepresentation
#first
#doesNotUnderstand:
#last
#asText
#printOn:
#copyReplaceFrom:to:with:
#printString
#asRope)
Interesting candidates from the Rust language Rope API are
Function append_char - Add one char to the end of the rope
Function prepend_char - Add one char to the beginning of the rope
Function append_str - Add one string to the end of the rope
Function char_at - The character at position pos
Function char_len - The number of characters in the rope
Function cmp - Compare two ropes by Unicode lexicographical order.
Function eq - Returns true if both ropes have the same content
(regardless of their structure), false otherwise
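Function ge - Presumably "greater than or equal" (the doc summary
shows only "# Arguments")
Function gt - Presumably "greater than" (the doc summary shows only
"# Arguments")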
Function iter_chars - Loop through a rope, char by char, until the end.
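
To make these candidates concrete, here is a minimal sketch of a rope
offering the same operations. It is written in Python purely for
illustration; it is neither the Rust std::rope code nor the Cuis
implementation. The class name Rope, the helper append_rope and the cached
length field are my own assumptions. Positions count Unicode code points,
because that is how Python indexes its strings.

```python
class Rope:
    """Immutable rope: a leaf holding a str, or a node joining two ropes."""

    def __init__(self, text="", left=None, right=None):
        self.text = text      # leaf payload (unused for concat nodes)
        self.left = left
        self.right = right
        # Cache the length so char_len is O(1) and char_at can steer by it.
        if left is None:
            self.length = len(text)
        else:
            self.length = left.length + right.length

    def is_leaf(self):
        return self.left is None

    def append_rope(self, other):
        # Concatenation allocates one node; no characters are copied.
        return Rope(left=self, right=other)

    def append_str(self, s):
        return self.append_rope(Rope(s))

    def append_char(self, c):
        return self.append_str(c)

    def prepend_char(self, c):
        return Rope(c).append_rope(self)

    def char_len(self):
        return self.length

    def char_at(self, pos):
        if not 0 <= pos < self.length:
            raise IndexError(pos)
        node = self
        # Walk down, using the left subtree's length as an index.
        while not node.is_leaf():
            if pos < node.left.length:
                node = node.left
            else:
                pos -= node.left.length
                node = node.right
        return node.text[pos]

    def iter_chars(self):
        # Iterative in-order walk, so deep ropes do not hit recursion limits.
        stack = [self]
        while stack:
            node = stack.pop()
            if node.is_leaf():
                yield from node.text
            else:
                stack.append(node.right)
                stack.append(node.left)

    def eq(self, other):
        # Same content regardless of tree structure.
        return self.length == other.length and all(
            a == b for a, b in zip(self.iter_chars(), other.iter_chars()))
```

For example, Rope("ab").append_str("c") and Rope("a").append_str("bc") have
different shapes but compare equal with eq. Note that this sketch never
rebalances the tree, which a real implementation would need to do.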
I would say the high priorities are
a) Rope construction (i.e. appending and prepending instances of
Character and String)
b) Streaming over a Rope. This is a typical operation when you deal
with a large text file.
c) Finding a subrope
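
Points b) and c) can be sketched together, since a search can be built on
top of streaming. The following is only an illustration in Python under
simplifying assumptions of mine: a rope is either a plain string (a leaf) or
a pair (left, right), and the names chunks and find are made up for this
example. The search keeps a small overlap between consecutive leaves so that
a match spanning a leaf boundary is not missed.

```python
def chunks(rope):
    """Stream the leaf strings of a rope from left to right."""
    stack = [rope]
    while stack:
        node = stack.pop()
        if isinstance(node, str):
            if node:
                yield node
        else:
            left, right = node
            stack.append(right)
            stack.append(left)


def find(rope, needle):
    """Index of the first occurrence of needle in rope, or -1."""
    if not needle:
        return 0
    keep = len(needle) - 1   # overlap carried across leaf boundaries
    window = ""
    offset = 0               # rope index of window[0]
    for chunk in chunks(rope):
        window += chunk
        hit = window.find(needle)
        if hit >= 0:
            return offset + hit
        # No match yet: only the last `keep` characters can still be
        # the start of a match that continues in the next leaf.
        drop = max(len(window) - keep, 0)
        offset += drop
        window = window[drop:]
    return -1
```

The rope is never flattened into one big string; at any moment only one
leaf plus the overlap is held in the window, which is the property that
matters for large text files.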
And of course performance tests with random data, to see at what text
size Ropes start to be more efficient to work with than Strings.
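
Such a test could look roughly like the harness below. It is a sketch in
Python, not a measurement of the Cuis code: the rope is again modelled as
nested (left, right) pairs, all function names are hypothetical, and only
one operation (prepending single characters) is timed. The rope side builds
an unbalanced tree and is not charged for flattening or later reads, so the
numbers only show how the cost of the edit itself grows with text size.

```python
import random
import string
import timeit


def random_text(n, seed=0):
    """Reproducible random ASCII letters of length n."""
    rng = random.Random(seed)
    return "".join(rng.choice(string.ascii_letters) for _ in range(n))


def prepend_with_strings(base, chars):
    s = base
    for c in chars:
        s = c + s            # copies the whole string on every edit
    return s


def prepend_with_rope(base, chars):
    rope = base
    for c in chars:
        rope = (c, rope)     # one new pair node; the base is shared
    return rope


def flatten(rope):
    """Turn a nested-pair rope back into a string (used to check results)."""
    parts = []
    stack = [rope]
    while stack:
        node = stack.pop()
        if isinstance(node, str):
            parts.append(node)
        else:
            stack.append(node[1])
            stack.append(node[0])
    return "".join(parts)


def compare(size, edits=1000, repeat=3):
    """Best-of-`repeat` timings in seconds: (strings, rope)."""
    base = random_text(size)
    chars = random_text(edits, seed=1)
    t_str = min(timeit.repeat(
        lambda: prepend_with_strings(base, chars), number=1, repeat=repeat))
    t_rope = min(timeit.repeat(
        lambda: prepend_with_rope(base, chars), number=1, repeat=repeat))
    return t_str, t_rope


if __name__ == "__main__":
    for size in (1_000, 10_000, 100_000):
        t_str, t_rope = compare(size)
        print(f"size={size:>7}  strings={t_str:.6f}s  rope={t_rope:.6f}s")
```

Running compare over a range of sizes, and over the other operations
(appending, indexing, searching), would give the crossover points.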
--Hannes
On 2/16/13, Ken Dickey <Ken.Dickey at whidbey.com> wrote:
> BTW,
>
> Doing a web search on +Rope +Unicode, I found that Mozilla is developing a
> programming language called Rust which uses Ropes with packed UTF-8
> strings.
>
> The internal documentation suggests heavy users of strings use ropes
> instead.
>
> Note:
> http://static.rust-lang.org/doc/0.5/std/rope.html
>
> FYI,
> -KenD