Re: [Scheme-reports] Inconsistency of sequence copying procedures Alex Shinn (01 Jul 2012 21:09 UTC)

Re: [Scheme-reports] Inconsistency of sequence copying procedures Alex Shinn 01 Jul 2012 21:08 UTC

On Sun, Jul 1, 2012 at 9:24 AM, Marc Feeley <feeley@iro.umontreal.ca> wrote:
> Formal Comment
>
> Submitter's name: Marc Feeley
> Submitter's email: feeley at iro.umontreal.ca
> Relevant draft: r7rs draft 6
>
> Type: defect
> Priority: major
> Relevant section of draft: 6.7. Strings, 6.8. Vectors, 6.9. Bytevectors
>
> Summary: Inconsistency of sequence copying procedures
>
> R7RS has three vector-like data types: strings, vectors and
> bytevectors.  The inconsistencies in their properties and in their
> sequence copying procedures (names and API) make them harder than
> necessary for the programmer to remember.
>
> 1) self-evaluation inconsistencies
>
> Vectors and bytevectors have a similar external representation, yet
> bytevectors are self-evaluating (page 46) and vectors are not.  I do
> not care much which way it goes, but it should be the same for
> vectors and bytevectors.
>
> 2) sequence copying procedures inconsistencies
>
> Subsequences of strings can be extracted using the procedure substring
> which takes 3 required parameters, i.e.
>
>   (substring string start end)
>
> There is also a string-copy procedure which takes a single required
> parameter and returns a copy of the string.  These procedures are
> related like so:
>
>   (string-copy string) = (substring string 0 (string-length string))
>
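For readers who want to check the relation above, here is a minimal sketch in portable R7RS-style Scheme. The helper name same-copy? is hypothetical and only serves the illustration; it compares the two results and also confirms that each is a fresh object rather than the argument itself.

```scheme
(import (scheme base))

;; Hypothetical helper: checks that string-copy and a full-range
;; substring produce equal, newly allocated strings.
(define (same-copy? s)
  (let ((a (string-copy s))
        (b (substring s 0 (string-length s))))
    (and (string=? a b)     ; same contents
         (not (eq? a s))    ; each result is a new object,
         (not (eq? b s))))) ; not the argument itself

(same-copy? (string #\h #\e #\l #\l #\o))   ; => #t
```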
> Subsequences of vectors can be extracted using the procedure
> vector-copy only, which takes one required parameter and 3 optional
> parameters, i.e.
>
>   (vector-copy vector [start [end [fill]]])
>
> With a single parameter a copy of the whole vector is returned,
> otherwise a subsequence is returned.
>
> Subsequences of bytevectors can be extracted using the procedure
> bytevector-copy-partial, which takes 3 required parameters and behaves
> exactly like substring except for the fact that bytevectors are being
> processed and returned, i.e.
>
>   (bytevector-copy-partial bv start end)
>
> There is also a bytevector-copy procedure which takes a single
> required parameter and returns a copy of the bytevector.  These
> procedures are related like so:
>
>   (bytevector-copy bv) = (bytevector-copy-partial bv 0 (bytevector-length bv))
>
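The same relation can be tried out for bytevectors. The sketch below assumes an implementation whose bytevector-copy accepts optional start and end arguments, and defines bytevector-copy-partial on top of it in case the draft-6 name is not provided; that definition is an illustration, not the draft's own.

```scheme
(import (scheme base))

;; Assumption: bytevector-copy accepts optional start and end.
;; This definition only stands in for the draft-6 procedure.
(define (bytevector-copy-partial bv start end)
  (bytevector-copy bv start end))

(define bv (bytevector 1 2 3 4 5))

;; Whole-sequence copy expressed as a full-range partial copy.
(equal? (bytevector-copy bv)
        (bytevector-copy-partial bv 0 (bytevector-length bv)))   ; => #t

(bytevector-copy-partial bv 1 3)   ; => a bytevector holding 2 and 3
```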
> There are also 2 procedures to copy the content of a bytevector to
> another bytevector imperatively: bytevector-copy! and
> bytevector-copy-partial!.
>
> I do not see a good reason for having different APIs (mix of required
> and optional parameters) and naming conventions for similar
> operations.
>
> The naming convention could be based on the one which has been in
> place for strings for a long time, i.e. substring, subvector, and
> subbytevector for extracting subsequences.  The same API should
> be used consistently for all the procedures, in other words:
>
>   (substring     string     [start [end [fill]]])
>   (subvector     vector     [start [end [fill]]])
>   (subbytevector bytevector [start [end [fill]]])
>
> Note that it reads even better if bytevector operations are named using
> the SRFI-4 naming convention:
>
>   (substring   string   [start [end [fill]]])
>   (subvector   vector   [start [end [fill]]])
>   (subu8vector u8vector [start [end [fill]]])
>
> The functional copy procedures would remain for consistency:
>
>   (string-copy   string)   = (substring   string)
>   (vector-copy   vector)   = (subvector   vector)
>   (u8vector-copy u8vector) = (subu8vector u8vector)
>
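To make the proposed pattern concrete, here is a rough sketch of how a subvector with optional start and end might look. The fill argument is left out to keep it short, and the names subvector and my-vector-copy are only the proposal's suggestions and a hypothetical wrapper, not procedures of any draft.

```scheme
(import (scheme base) (scheme case-lambda))

;; Hypothetical subvector following the proposed optional-argument
;; pattern (without fill): missing arguments default to the whole vector.
(define subvector
  (case-lambda
    ((v) (subvector v 0 (vector-length v)))
    ((v start) (subvector v start (vector-length v)))
    ((v start end)
     (let ((result (make-vector (- end start))))
       (let loop ((i start))
         (if (< i end)
             (begin
               (vector-set! result (- i start) (vector-ref v i))
               (loop (+ i 1)))
             result))))))

;; Under this pattern the functional copy is just the one-argument case.
(define (my-vector-copy v) (subvector v))

(subvector (vector 'a 'b 'c 'd) 1 3)   ; => #(b c)
```

The string and bytevector versions would follow the same shape, with string-ref or bytevector-u8-ref in place of vector-ref.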
> The imperative partial copy procedure defined for bytevectors
>
>   (bytevector-copy-partial! from start end to at)
>
> should exist for other sequences too.  Better consistency would be
> achieved by exchanging the order of the destination and source, in
> order to benefit from the same pattern of optional parameters as the
> other procedures:
>
>   (substring-move!   to at from [start [end [fill]]])
>   (subvector-move!   to at from [start [end [fill]]])
>   (subu8vector-move! to at from [start [end [fill]]])
>
> I don't think the imperative copy operation performed by
> bytevector-copy! is sufficiently common to be included in R7RS (and
> applied to the other sequence types).  In any case the same operation
> could be obtained by using a ...-move! procedure with an additional
> constant 0 used for the "at" parameter.
>
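A sketch of the proposed argument order may help here. The procedure below is a hypothetical subvector-move! written from the description above (destination first, optional start and end last, no fill); it is not taken from any draft. It copies backward when the source and destination are the same vector and the ranges could overlap.

```scheme
(import (scheme base) (scheme case-lambda))

;; Hypothetical subvector-move!: copies from[start, end) into
;; to, beginning at index at.
(define subvector-move!
  (case-lambda
    ((to at from)
     (subvector-move! to at from 0 (vector-length from)))
    ((to at from start)
     (subvector-move! to at from start (vector-length from)))
    ((to at from start end)
     (if (and (eq? to from) (> at start))
         ;; Same vector, moving right: copy backward so that no
         ;; element is overwritten before it has been read.
         (let loop ((i (- end 1)) (j (+ at (- end start 1))))
           (when (>= i start)
             (vector-set! to j (vector-ref from i))
             (loop (- i 1) (- j 1))))
         ;; Otherwise a plain left-to-right copy is safe.
         (let loop ((i start) (j at))
           (when (< i end)
             (vector-set! to j (vector-ref from i))
             (loop (+ i 1) (+ j 1))))))))

(define dst (make-vector 5 0))
(subvector-move! dst 1 (vector 'a 'b 'c 'd) 1 3)
dst   ; => #(0 b c 0 0)
```

With this shape, the whole-sequence imperative copy is the three-argument call with 0 for at, for example (subvector-move! to 0 from).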
> Finally, I think the handling of the fill parameter is questionable.
> It is a bad idea for the fill parameter to have a default.  When fill
> is absent, it should be an error when start and end are not within the
> bounds of the sequence.  Otherwise, some index calculation errors
> (off-by-one on "end") may go unnoticed.  Moreover, when it is
> supplied, the fill should also be used when start is less than 0, for
> consistency with the case where end is greater than the length of the
> sequence.
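
To illustrate the fill behaviour suggested above, here is one possible reading as a sketch. The names copy-range and subvector/fill are hypothetical; without a fill the range must lie within the vector, and with a fill the result is padded on either side, including when start is negative.

```scheme
(import (scheme base) (scheme case-lambda))

;; Hypothetical helper: builds a vector of length (- end start),
;; pre-filled with fill, then copies the part of v that lies
;; inside the requested range.
(define (copy-range v start end fill)
  (let ((len (vector-length v))
        (result (make-vector (- end start) fill)))
    (let loop ((i (max start 0)))
      (if (< i (min end len))
          (begin
            (vector-set! result (- i start) (vector-ref v i))
            (loop (+ i 1)))
          result))))

;; Hypothetical subvector/fill: strict bounds when fill is absent,
;; padding on both sides when it is supplied.
(define subvector/fill
  (case-lambda
    ((v start end)
     (if (and (<= 0 start) (<= start end) (<= end (vector-length v)))
         (copy-range v start end #f)
         (error "subvector/fill: range out of bounds" start end)))
    ((v start end fill)
     (if (<= start end)
         (copy-range v start end fill)
         (error "subvector/fill: start is greater than end" start end)))))

(subvector/fill (vector 'a 'b 'c) -1 5 'x)   ; => #(x a b c x x)
```

With this reading, an off-by-one call such as (subvector/fill (vector 'a 'b 'c) 0 4) signals an error instead of silently padding.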

Thanks, ticket #437 filed.

--
Alex
