Re: [Scheme-reports] fresh empty strings Per Bothner 24 Jan 2012 20:05 UTC

On 01/24/2012 10:58 AM, John Cowan wrote:
> Per Bothner scripsit:
>
>> (Kawa does use separate Java classes for mutable and immutable strings,
>> though it didn't used to - and I'm thinking about adding another class to
>> support O(1) indexing of strings containing non-basic-plane characters.)
>
> There's a case to be made for using three classes for the Latin-1, BMP, and
> full Unicode repertoires.  I know of one (non-Scheme) package that does this,
> plus using java.lang.String for immutable strings.
>
> On the other hand, in fairly capacious environments like the desktop,
> it may be the Right Thing to use only 32-bit mutable strings, especially
> considering how much more common string literals are in typical Scheme code.

First, of course you can also have non-BMP string literals, for which you'd
also want O(1) indexing.  Secondly, having the data buffer be 16-bit Unicode
(either a java.lang.String or an array of 16-bit chars) may be desirable for
interoperability - it makes the toString operation cheap.  If so, a
separate indexing table or cache may make sense.  Assuming most string
indexing is sequential (each index one more than the last), a single-element
cache of the most recent (code-point-index, buffer-index) pair may work well,
though I don't like a mutable cache when the string is immutable.  For
immutable strings, having an array that maps (say) every 16th code point to
its buffer position may be a good compromise between space and time.  One
can do the same for mutable strings, but of course updates are more
complicated to make efficient.
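
To make the every-16th-code-point table concrete, here is a minimal Java
sketch for the immutable case.  The class name SampledIndexString, the
STRIDE constant, and the method names are hypothetical, chosen only for
illustration; this is not Kawa's actual implementation.  It assumes the
buffer is a java.lang.String, so toString stays free.

```java
// Hypothetical sketch: an immutable string kept as UTF-16, plus a table
// recording the buffer index of every 16th code point.
final class SampledIndexString {
    private static final int STRIDE = 16; // assumed sampling interval
    private final String data;            // UTF-16 buffer
    private final int[] samples;          // samples[k] = buffer index of code point k*STRIDE
    private final int length;             // length in code points

    SampledIndexString(String data) {
        this.data = data;
        this.length = data.codePointCount(0, data.length());
        this.samples = new int[(length + STRIDE - 1) / STRIDE];
        int bufferIndex = 0;
        // One pass over the buffer, noting where each sampled code point starts.
        for (int cp = 0; cp < length; cp++) {
            if (cp % STRIDE == 0) {
                samples[cp / STRIDE] = bufferIndex;
            }
            bufferIndex += Character.charCount(data.codePointAt(bufferIndex));
        }
    }

    int length() {
        return length;
    }

    // Jump to the nearest sample at or before index, then step forward over
    // at most STRIDE - 1 code points.
    int codePointAt(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        int start = samples[index / STRIDE];
        int bufferIndex = data.offsetByCodePoints(start, index % STRIDE);
        return data.codePointAt(bufferIndex);
    }

    @Override
    public String toString() {
        return data; // no conversion needed
    }
}
```

With this layout a lookup scans at most 15 code points past a sample, and
the table costs one int per 16 code points.  A smaller stride trades more
space for less scanning.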
--
	--Per Bothner
per@bothner.com   http://per.bothner.com/
