Re: [Scheme-reports] Bytevectors should be called u8vectors Jussi Piitulainen (03 Jul 2012 15:18 UTC)

Marc Feeley writes:

> I have a feeling that the use of "bytevector" in the names of
> procedures in R7RS small is due to WG2 concerns of extending the set
> of operations on bytevectors to 16, 32, etc bit width access
> operations.  Alaric Snell-Pym and others have pointed out that
> "blob" is a better name for such a data type.  I am not saying I
> prefer it, but perhaps that's the name WG2 and the community will
> prefer for that data type.  So committing to the name "bytevector"
> in R7RS small is premature.  On the other hand, you say that in your
> WG2 bytevector proposal, you were proposing to support u8vectors and
> the other SRFI-4 names.  So I don't see your position of preferring
> to standardize in R7RS small the "bytevector" names instead of the
> "u8vector" names.

Apologies if I am mistaken, but I think John distinguishes meaningfully
between the bytevector-* interface and the [fus]{8,16,32,64}vector-*
interface (somewhere in this thread), and this distinction should be
appreciated more. These are not alternative names for the same thing:
bytevector-* offsets are in bytes, while offsets in the other
interface are in the units indicated by the name. The bytevector
interface can interpret binary formats byte by byte in varying units.
The other interface fixes an interpretation as a homogeneous vector.
The two interfaces overlap in the 8-bit cases.

Let v be #u8(a b c d e f g h) for suitable integers a, b, ...
Let w be the same memory as a u16vector, disjoint type or not.

(bytevector-u8-ref v 3)  => d as unsigned-int8
(bytevector-s8-ref v 3)  => d as signed-int8

(u8vector-ref v 3)       => d as unsigned-int8

(bytevector-u16-ref w 3) => d, e as unsigned-int16
(bytevector-u16-ref w 4) => e, f as unsigned-int16

;; no access to d, e with u16vector-ref, er, (u16vector-ref w 3/2)?
(u16vector-ref w 2)      => e, f as unsigned-int16
(u16vector-ref w 3)      => g, h as unsigned-int16
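
To make the offsets concrete, here is a minimal sketch with numbers in
place of a, b, ... It assumes an R6RS system, where bytevector-u16-ref
takes an explicit endianness argument (omitted in the lines above).
Big-endian is an arbitrary choice for illustration. u16-elt-ref is a
hypothetical helper that only imitates an element-indexed u16vector-ref
over the same bytes; it is not SRFI 4 itself.

```scheme
(import (rnrs))

;; Concrete stand-ins for a..h; d is 200 so that the signed and
;; unsigned 8-bit readings differ.
(define v (u8-list->bytevector '(1 2 3 200 5 6 7 8)))

;; Hypothetical helper: element index i maps to byte offset 2*i,
;; which is how an element-indexed u16 view would address v.
(define (u16-elt-ref bv i)
  (bytevector-u16-ref bv (* 2 i) (endianness big)))

(define (show x) (display x) (newline))

;; Byte-offset interface: the index always counts bytes.
(show (bytevector-u8-ref v 3))                    ; d        => 200
(show (bytevector-s8-ref v 3))                    ; d signed => -56
(show (bytevector-u16-ref v 3 (endianness big)))  ; d, e     => 51205
(show (bytevector-u16-ref v 4 (endianness big)))  ; e, f     => 1286

;; Element-indexed view: the index counts 16-bit units, so the
;; pair d, e (byte offset 3) is not reachable.
(show (u16-elt-ref v 2))                          ; e, f     => 1286
(show (u16-elt-ref v 3))                          ; g, h     => 1800
```

The odd byte offset 3 works here because the non-native R6RS accessor
does not require alignment; the element-indexed helper can only ever
land on even byte offsets.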

Hm. I'm not sure if it makes much practical sense for u8vector and
bytevector to be disjoint types. For the other homogeneous types a
distinct written representation would be nice, at least in a REPL.

Both interfaces seem important to me.
