[Scheme-reports] Padding/placeholders (hashes) in numerical syntax Peter Bex 10 Aug 2011 19:09 UTC

Hello all,

While we're on the topic of numerical stuff in the standard, I'd like
to ask why the "padding"/placeholder digits for numbers (# characters
in place of digits inside a number) are kept around.

I've always wondered what the practical use of it is.  Apparently it's
supposed to be a way of saying "I know that these positions exist in
the number (the number is at least this precise), but I don't know
the exact values of these positions".  I've never seen this used in
practice (but I'd appreciate pointers to code using this to good effect).

Supporting this syntax seems pretty easy at first, but I've found it
to be quite hairy in practice; there are various corner cases you need
to deal with specially, and it interacts with the decimal syntax
since the dot can be anywhere between the hashes or digits.
Hashes are allowed only as trailing characters, and only after
*at least one* actual digit (e.g., ".#" is invalid).  The hashes, when
present, cause the number to be regarded as an inexact number, but
*only* when it *isn't* prefixed with "#e".
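
To make those corner cases concrete, here is a minimal sketch of the
rules above.  It is written in Python purely for illustration, and it
follows the R5RS <decimal 10> productions as I read them; exponents,
radix prefixes and complex numbers are deliberately left out, and the
name `classify` is my own invention.

```python
import re

# Unsigned decimal forms, following the R5RS <decimal 10> productions
# as I read them (exponent suffix omitted for brevity).
_DECIMAL = re.compile(
    r"""
      \d+ \#*            # digits, then optional hashes:   12   12##
    | \. \d+ \#*         # leading dot needs a real digit: .5   .5#
    | \d+ \. \d* \#*     # digits, dot, digits, hashes:    1.23###   1.#
    | \d+ \#+ \. \#*     # hashes before the dot:          12##.##
    """,
    re.VERBOSE,
)

def classify(text):
    """Return 'exact', 'inexact', or None if the text is not a valid number."""
    body = text
    forced = None
    # An explicit exactness prefix overrides whatever the digits imply.
    if body[:2].lower() in ("#e", "#i"):
        forced = "exact" if body[1].lower() == "e" else "inexact"
        body = body[2:]
    # Optional sign.
    if body[:1] in ("+", "-"):
        body = body[1:]
    if _DECIMAL.fullmatch(body) is None:
        return None
    if forced is not None:
        return forced
    # Without a prefix, a dot or any hash makes the number inexact.
    return "inexact" if ("." in body or "#" in body) else "exact"
```

With this sketch, ".#" and "1#2" are rejected, "1.23###" and "12##"
come out inexact, and "#e1.2#" comes out exact.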

The standard doesn't even say what an implementation is supposed to
fill in for the value.  Some (most?) implementations seem to read
a # as a zero, but Scheme48 reads it as a 5, *even when the radix is
not 10*.  I'd have expected it to use radix/2, since that's a good
"median" value for unknown digits: 1.23### is "rounded" to 1.23555,
but for some reason #x1# is read as #x15 rather than #x18.
Scheme48 also doesn't allow binary numbers with # digits, which is
probably a bug since the <uinteger 2> production seems to require it.

Chibi accepts placeholders as digits with a value of radix/2, which
I think is more intuitive.
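
The three behaviours described above can be compared side by side with
a small sketch.  Again this is Python for illustration only; the
function name and the policy names ("zero", "five", "half") are
hypothetical labels of mine, not anything taken from the
implementations mentioned.

```python
DIGITS = "0123456789abcdef"

def fill_placeholders(text, radix=10, policy="half"):
    """Replace every '#' in a digit string by a concrete digit.

    policy "zero": read # as 0
    policy "five": read # as 5, whatever the radix is
    policy "half": read # as radix/2
    """
    if policy == "zero":
        fill = "0"
    elif policy == "five":
        # Note: in radix 2 this produces a digit that is not binary.
        fill = "5"
    elif policy == "half":
        fill = DIGITS[radix // 2]
    else:
        raise ValueError("unknown policy: " + policy)
    return text.replace("#", fill)
```

For example, fill_placeholders("1.23###", 10, "five") gives "1.23555",
while the hexadecimal digits "1#" become "15" under "five" and "18"
under "half".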

R6RS cleaned up the numerical syntax by removing support for padding
characters.  R7RS followed R6RS in *adding* new productions for
infinities and NaN, but didn't follow in *removing* the # from
the <uinteger N> and <decimal 10> productions.
To me, padding syntax is a good example of feature piling; it
complicates reader implementation without providing any extra power
to the programmer.

I understand removing R5RS features breaks compatibility and isn't
an easy thing to do, but it's been done already with the
transcript-{on,off} procedures which virtually nobody uses.
Schemes that want to continue supporting it can still do so, since the
padding syntax can be added onto R6RS-like syntax without problems, but
those who want to drop the historical cruft could still choose to do so.

I think there's also a bug in the R7RS BNF for numbers; it doesn't seem
to allow a complex number whose real and imaginary components are both
infinite.  AFAICT, only an <ureal> can follow the sign after the first
number in the rectangular format.
Several Schemes I've tested this with simply allow "+inf.0+inf.0i" for
this.  The R6RS BNF seems to allow it, though (it's handled specially).
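
A toy model of the problem, in Python for illustration: the patterns
below are heavily simplified stand-ins for the real productions
(decimal digits only, no exponents or hashes), and the names STRICT
and RELAXED are mine.  STRICT models the reading described above, where
only a sign plus <ureal> may follow the first number; RELAXED
additionally accepts an infinity or NaN as the imaginary part.

```python
import re

UREAL = r"\d+(?:\.\d*)?"              # simplified unsigned real
INFNAN = r"[+-](?:inf|nan)\.0"        # +inf.0  -inf.0  +nan.0  -nan.0
REAL = rf"(?:[+-]?{UREAL}|{INFNAN})"  # a signed real, or an infinity/NaN

# Rectangular form where only sign + <ureal> may follow the real part.
STRICT = re.compile(rf"{REAL}[+-]{UREAL}i")

# Variant that also lets an infinity or NaN be the imaginary part.
RELAXED = re.compile(rf"{REAL}(?:[+-]{UREAL}|{INFNAN})i")

def accepts(pattern, text):
    """True if the whole text matches the given pattern."""
    return pattern.fullmatch(text) is not None
```

Under these assumptions "1+2i" is accepted by both patterns, while
"+inf.0+inf.0i" is accepted only by RELAXED.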

Cheers,
Peter
--
http://sjamaan.ath.cx
--
"The process of preparing programs for a digital computer
 is especially attractive, not only because it can be economically
 and scientifically rewarding, but also because it can be an aesthetic
 experience much like composing poetry or music."
							-- Donald Knuth
