Re: [Scheme-reports] mutable unicode strings
Per Bothner 02 Jul 2014 04:40 UTC
On 07/01/2014 08:29 PM, Jim Rees wrote:
> Java has StringBuffer, Python has StringIO, R7RS has
> open-output-string. Add seek support to ports in general and there
> you have it.
"Adding seek support to ports in general" is much more complicated
than what I'm proposing.
First, of course, ports in general do not support seek, so you're
limited to a feature of *some* ports.
File APIs that do support seeking typically only work for binary
files, at least if you use integer-index-based addressing.
That doesn't help with strings.
Some APIs allow seeking based on "cookies". However, they don't
allow insertion and deletion. Once you allow insertion and deletion
you have all kinds of extra complications and possible overheads.
For example, consider Emacs "marker" objects. I don't think we want to
require that a seek "cookie" be implemented as a marker, as they
are both complex to implement and heavy-duty.
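
To illustrate the overhead, here is a minimal Python sketch of a text
buffer with marker-like positions. It is not how Emacs implements
markers; the names MarkedText and Marker are hypothetical. The only
point is that every insertion or deletion has to visit all live markers.

```python
class Marker:
    """A position that should keep pointing at the same text across edits."""

    def __init__(self, pos):
        self.pos = pos


class MarkedText:
    """Hypothetical text buffer that keeps its markers valid across edits."""

    def __init__(self, text=""):
        self.text = text
        self.markers = []          # every live marker must be tracked

    def mark(self, pos):
        m = Marker(pos)
        self.markers.append(m)
        return m

    def insert(self, pos, s):
        self.text = self.text[:pos] + s + self.text[pos:]
        # Each marker at or after the insertion point has to be shifted.
        for m in self.markers:
            if m.pos >= pos:
                m.pos += len(s)

    def delete(self, start, end):
        self.text = self.text[:start] + self.text[end:]
        for m in self.markers:
            if m.pos >= end:
                m.pos -= end - start
            elif m.pos > start:
                m.pos = start      # marker was inside the deleted range


buf = MarkedText("hello world")
m = buf.mark(6)                    # points at the "w"
buf.insert(0, ">> ")
print(buf.text[m.pos])             # the marker still points at the "w"
```

This sketch also never forgets a marker; a real implementation would
need weak references or explicit release, which is part of why markers
are heavy-duty.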
A seekable writable textual Unicode port is a fairly complex beast.
One reason is that "writing" in the "interior" of a string (as
opposed to just appending at the end) is a near-useless feature unless
you can insert/delete/replace.
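
As a small illustration using the Python StringIO mentioned above: it
does allow seeking back to a position obtained from tell(), but a write
in the interior overwrites what is there instead of inserting. The
variable names are just for this example.

```python
import io

port = io.StringIO()
port.write("hello")
cookie = port.tell()      # position "cookie" for this port
port.write(" world")

port.seek(cookie)         # go back into the interior
port.write("XY")          # overwrites " w"; nothing is inserted

result = port.getvalue()
print(result)             # helloXYorld
```

To insert "XY" without losing " w" you would have to read the tail,
write the new text, and then rewrite the tail yourself.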
> This gives implementors more flexibility as to
> speed/space trade-offs rather than make the string type all things to
> all people.
I'm not sure what you're trying to say here. Seekable ports open
up a tremendous amount of conceptual complexity if you want them to
have anywhere close to the functionality of Java StringBuffer.
(What I'm proposing is basically the functionality of Java
StringBuffer in two new Scheme functions.)
What I'm proposing is a very simple extension that is trivial to
implement and adds no extra overhead to strings, beyond requiring
an indirection pointer (from the string object to a buffer so the
latter can be re-allocated). And that indirection seems hard to
avoid in any implementation that supports string-set! in combination
with full 21-bit Unicode, unless you store each character in at
least 3 bytes. Does any implementation do that?
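
A rough Python sketch of the indirection, assuming the buffer is stored
as UTF-8. This is not the proposed Scheme API; MutableString, replace
and string_set are hypothetical names used only for illustration. The
string object stays the same while the buffer it points to is replaced
whenever a character changes width or text is inserted or deleted.

```python
class MutableString:
    """Hypothetical string object: a fixed header pointing at a UTF-8 buffer."""

    def __init__(self, text=""):
        # Immutable bytes stand in for a separately allocated buffer.
        self._buf = text.encode("utf-8")

    def _width(self, off):
        """Encoded width in bytes of the character starting at byte `off`."""
        b = self._buf[off]
        if b < 0x80:
            return 1
        if b < 0xE0:
            return 2
        if b < 0xF0:
            return 3
        return 4

    def _offset(self, index):
        """Byte offset of the character at `index` (simple linear scan)."""
        off = 0
        for _ in range(index):
            off += self._width(off)
        return off

    def replace(self, start, end, text):
        """Replace characters [start, end) with `text`.

        Covers insertion (start == end) and deletion (text == "").
        """
        lo, hi = self._offset(start), self._offset(end)
        # Re-point the header at a new buffer; the object identity is unchanged.
        self._buf = self._buf[:lo] + text.encode("utf-8") + self._buf[hi:]

    def string_set(self, index, ch):
        """Analogue of string-set!: the new character may have a different width."""
        self.replace(index, index + 1, ch)

    def __str__(self):
        return self._buf.decode("utf-8")


s = MutableString("abc")
alias = s                      # a second reference to the same string object
s.string_set(1, "\u03bb")      # 1-byte "b" becomes 2-byte lambda: new buffer
s.replace(3, 3, "!")           # insertion at the end
print(str(alias))              # the alias sees both changes
```

Once string-set! already needs this indirection, insertion and deletion
come almost for free, as replace shows.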
--
--Per Bothner
per@bothner.com http://per.bothner.com/