[Cuis] Ropes & Unicode

H. Hirzel hannes.hirzel at gmail.com
Sat Feb 16 12:00:19 CST 2013


Interesting observation, Ken

This may be taken as confirmation that we should move on with the
implementation of Ropes.

According to http://static.rust-lang.org/doc/0.5/std/rope.html

"Ropes are a high-level representation of text that offers much better
performance than strings for common operations, and generally reduce
memory allocations and copies, while only entailing a small
degradation of less common operations."

.....
"In addition, the tree structure of ropes makes them suitable as a
form of index to speed-up access to Unicode characters by index in
long chunks of text."
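
To make the "tree as index" remark concrete, here is a minimal sketch in Python (not Rust, and not the Cuis code). It assumes leaves hold UTF-8 bytes and every node caches its character count. The names Leaf, Concat, leaf, concat and char_at are made up for this illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Leaf:
    data: bytes   # one chunk of text, UTF-8 encoded
    chars: int    # number of code points in data


@dataclass(frozen=True)
class Concat:
    left: "Node"
    right: "Node"
    chars: int    # cached total of code points below this node


Node = Union[Leaf, Concat]


def leaf(text: str) -> Leaf:
    """Build a leaf from a Python string (assumed helper)."""
    return Leaf(text.encode("utf-8"), len(text))


def concat(left: Node, right: Node) -> Concat:
    """Join two ropes without copying any text."""
    return Concat(left, right, left.chars + right.chars)


def char_at(rope: Node, pos: int) -> str:
    """Return the character at index pos, decoding only one leaf."""
    if not 0 <= pos < rope.chars:
        raise IndexError(pos)
    # Walk down the tree, using the cached counts to pick a side.
    while isinstance(rope, Concat):
        if pos < rope.left.chars:
            rope = rope.left
        else:
            pos -= rope.left.chars
            rope = rope.right
    return rope.data.decode("utf-8")[pos]
```

With a flat UTF-8 string, finding character n means scanning from the start. Here the cached counts narrow the search to a single leaf, so only that chunk has to be decoded.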


The basic string type in Rust also holds UTF-8 encoded characters
(as of version 0.3): http://dl.rust-lang.org/doc/0.3/tutorial.html

Should the Rust language Ropes API
    http://static.rust-lang.org/doc/0.5/std/rope.html#type-rope
be taken as a model for the Cuis implementation?

So far there are 11 methods in the Cuis Ropes implementation:

    Rope selectors
     a Set(#asString
              #,
              #stringRepresentation
              #first
              #doesNotUnderstand:
              #last
              #asText
              #printOn:
              #copyReplaceFrom:to:with:
              #printString
              #asRope)

Interesting candidates from the Rust language Rope API are

Function append_char - Add one char to the end of the rope
Function prepend_char - Add one char to the beginning of the rope
Function append_str - Add one string to the end of the rope

Function char_at - The character at position pos
Function char_len - The number of characters in the rope
Function cmp - Compare two ropes by Unicode lexicographical order.
Function eq - Returns true if both ropes have the same content
(regardless of their structure), false otherwise
Function ge - Greater than or equal comparison of two ropes
Function gt - Greater than comparison of two ropes
Function iter_chars - Loop through a rope, char by char, until the end.
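
As a rough illustration of what eq, cmp and iter_chars imply, here is a small Python sketch. It assumes a rope is either a plain string (a leaf) or a (left, right) pair, and it compares by code point, which is only an approximation of "Unicode lexicographical order". The names rope_cmp and rope_eq are invented here; this is not the Rust or Cuis code.

```python
from itertools import zip_longest


def iter_chars(rope):
    """Yield the characters of a rope from left to right.

    A rope is assumed to be a str leaf or a (left, right) tuple.
    An explicit stack avoids deep recursion on unbalanced ropes.
    """
    stack = [rope]
    while stack:
        node = stack.pop()
        if isinstance(node, str):
            yield from node
        else:
            left, right = node
            stack.append(right)   # visited after the left side
            stack.append(left)


def rope_cmp(a, b):
    """Return -1, 0 or 1, comparing contents by code point order."""
    for x, y in zip_longest(iter_chars(a), iter_chars(b)):
        if x is None:             # a ran out first, so a is a prefix of b
            return -1
        if y is None:             # b ran out first
            return 1
        if x != y:
            return -1 if x < y else 1
    return 0


def rope_eq(a, b):
    """True if both ropes hold the same text, whatever their shape."""
    return rope_cmp(a, b) == 0
```

Two ropes built in different ways, for example ("ab", ("c", "d")) and (("a", "bc"), "d"), hold the same text, so equality has to look at content rather than tree shape.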

I would say the high priorities are

a) Rope construction (i.e. appending and prepending instances of
Character and String)
b) streaming over a Rope. This is a typical operation when you deal
with a large text file.
c) finding a subrope

And of course performance tests with random data to see where it
starts to be more efficient to deal with Ropes than Strings.
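
A possible shape for such a test, sketched in Python: build the same text by repeated prepending, once as a flat string and once as a rope, and time both for growing input sizes. The rope here is just nested (left, right) pairs, and all function names are invented for this sketch. The sizes and the chunk length of 64 are arbitrary choices.

```python
import random
import string
import time


def random_chunks(count, size, seed=0):
    """Return count random ASCII strings of the given size."""
    rng = random.Random(seed)
    return ["".join(rng.choices(string.ascii_letters, k=size))
            for _ in range(count)]


def build_string(chunks):
    """Prepend every chunk to a flat string (copies the text each time)."""
    text = ""
    for chunk in chunks:
        text = chunk + text
    return text


def build_rope(chunks):
    """Prepend every chunk to a rope (one new node each time)."""
    rope = ""
    for chunk in chunks:
        rope = (chunk, rope)
    return rope


def flatten(rope):
    """Turn a rope back into a string, to check both builds agree."""
    parts, stack = [], [rope]
    while stack:
        node = stack.pop()
        if isinstance(node, str):
            parts.append(node)
        else:
            stack.append(node[1])
            stack.append(node[0])
    return "".join(parts)


def time_it(build, chunks):
    """Seconds taken by one call of build(chunks)."""
    start = time.perf_counter()
    build(chunks)
    return time.perf_counter() - start


if __name__ == "__main__":
    for count in (100, 1000, 4000):
        data = random_chunks(count, 64)
        print(count, time_it(build_string, data), time_it(build_rope, data))
```

This measures construction only. Indexing, streaming and searching would need their own runs, since those are where a rope may be slower than a string.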

--Hannes

On 2/16/13, Ken Dickey <Ken.Dickey at whidbey.com> wrote:
> BTW,
>
> Doing a web search on +Rope +Unicode, I found that Mozilla is developing a
> programming language called Rust which uses Ropes with packed UTF-8
> strings.
>
> The internal documentation suggests heavy users of strings use ropes
> instead.
>
> Note:
> 	http://static.rust-lang.org/doc/0.5/std/rope.html
>
> FYI,
> -KenD



