Re: [Scheme-reports] mutable unicode strings
Sam Tobin-Hochstadt 03 Jul 2014 12:56 UTC
On Wed, Jul 2, 2014 at 5:04 PM, Per Bothner <per@bothner.com> wrote:
> On 07/02/2014 12:34 AM, Sam Tobin-Hochstadt wrote:
>> Racket stores strings as arrays of 4-byte characters.
>
> On 07/02/2014 07:06 AM, Michael Montague wrote:
>> Foment uses 4 bytes to store each character in a string.
>
> The follow-up question is: When you have a Scheme reference
> to a string, presumably that is a pointer to a string object header
> containing at least a size and maybe some kind of typecode. The
> actual data characters can be stored right next to the header (inlined),
> or the header can have a pointer to a separate array of data characters
> (indirected). The latter makes it easy to resize the array by re-allocating
> it; doing it inlined makes resizing difficult.
>
> I would argue that the ability to resize a mutable string is so important
> that it justifies the slight overhead of indirection. (Indirection may
> also be easier to implement, depending on how your memory allocation works.)
>
> In other words: Supporting string-replace! has no extra overheads beyond
> requiring an "indirect" representation. The latter is forced anyway if
> you support mutability and full Unicode, unless you use 3- or 4-byte characters.
> Even if you do use 3- or 4-byte characters, indirection is worth it, because
> mutable fixed-size strings are an essentially useless feature.
Racket provides mutable fixed-size strings with constant-time
string-ref and string-set! and no resizing. I think everyone agrees
that this is not a particularly great spot in the design space, but
it's where we ended up, starting from earlier Scheme designs.
Sam
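
For readers following the thread, the two layouts described in the quoted
text (inlined versus indirected) might be sketched in C roughly as below.
This is only an illustration under stated assumptions, not how Racket,
Foment, or any other implementation mentioned here actually lays out
strings. The names InlineString, IndirectString, indirect_ref,
indirect_set, and indirect_replace are hypothetical, and 4-byte
characters are assumed throughout.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Inlined layout: header and characters share one allocation.
   Growing the string may move the whole object, so every reference
   to it would have to be found and updated. */
typedef struct {
    size_t length;      /* number of characters */
    uint32_t chars[];   /* 4-byte characters stored right after the header */
} InlineString;

/* Indirected layout: references point at this header, which never moves;
   only the separately allocated character array is re-allocated. */
typedef struct {
    size_t length;      /* number of characters */
    uint32_t *chars;    /* separately allocated array of 4-byte characters */
} IndirectString;

/* Constant-time access, analogous to string-ref and string-set!. */
static uint32_t indirect_ref(const IndirectString *s, size_t i)
{
    assert(i < s->length);
    return s->chars[i];
}

static void indirect_set(IndirectString *s, size_t i, uint32_t c)
{
    assert(i < s->length);
    s->chars[i] = c;
}

/* Hypothetical analogue of string-replace!: replace the characters in
   [start, end) with the n characters at src. The string may grow or
   shrink. Returns 0 on success, -1 on a bad range or allocation failure. */
static int indirect_replace(IndirectString *s, size_t start, size_t end,
                            const uint32_t *src, size_t n)
{
    if (start > end || end > s->length)
        return -1;

    size_t tail = s->length - end;        /* characters after the replaced range */
    size_t new_len = start + n + tail;

    if (new_len > s->length) {
        /* Grow the data array; the header (and all references to it) stay put. */
        uint32_t *p = realloc(s->chars, new_len * sizeof *p);
        if (p == NULL)
            return -1;
        s->chars = p;
    }
    if (tail > 0)                         /* shift the tail to its new position */
        memmove(s->chars + start + n, s->chars + end, tail * sizeof *s->chars);
    if (n > 0)                            /* copy in the replacement characters */
        memcpy(s->chars + start, src, n * sizeof *src);

    s->length = new_len;
    return 0;
}
```

In this sketch, resizing touches only the chars pointer inside the
header, which is the property the quoted argument relies on. With the
inlined layout, the same operation would need a new allocation for the
whole object whenever the string grows. The cost of indirection is one
extra pointer dereference per access.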