Re: [Scheme-reports] ballot question #229: EQV? and NaN

Re: [Scheme-reports] ballot question #229: EQV? and NaN will@ccs.neu.edu 29 Sep 2011 22:31 UTC
John Cowan quoting me:
> > Bradley Lucier voted for and described the R6RS-compatible (and most
> > sensible) semantics for EQV? and NaN.
>
> I concede that I was wrong to say that "same" is the R6RS semantics.

And I concede that I was wrong to say that "same*" is the R6RS semantics.

The important point, however, is that the R6RS semantics allows the "same*"
semantics, whereas the "same" semantics described in the ballot question
forbids it.  Since "same*" is the most reasonable semantics for an
implementation, it would be better to adopt the R6RS semantics than the
"same" semantics.

> However, on closer examination I think that R6RS leaves the semantics of
> EQV? comparison between any NaN and any inexact number (including NaN)
> wide open.

The R6RS leaves that comparison *partly* open, but not "wide open."
According to R6RS section 11.5, EQV? must return #f if

    Obj1 and obj2 yield different results (in the sense of eqv?) when
    passed as arguments to any other procedure that can be defined as
    a finite composition of Scheme’s standard arithmetic procedures.

Consider this:

> (define (self-equal? x) (= x x))
> (self-equal? +nan.0)
#f
> (self-equal? 0.0)
#t
> (eqv? #f #t)
#f

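Since self-equal? is a finite composition of standard arithmetic
procedures and its results for +nan.0 and 0.0 are not eqv?, the R6RS
requires (eqv? +nan.0 0.0) to evaluate to #f.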

> > Yet ballot question #229 was written as though +nan.0 denotes a single
> > value.  It does not.
>
> Unless someone can meet the demand above, +nan.0 is only a single value in
> Scheme terms because of the identity of indiscernibles.

The identity of indiscernibles is valid only in certain contexts,
such as final algebras.  No extant standard for Scheme requires
all NaNs to be collapsed onto a single value.  If the R7RS were
to require that, every arithmetic operation on IEEE floating point
numbers would have to be followed by a NaN test so any NaN result
could be converted to the canonical value.  That would cost a lot
of performance in numerical code, and its only "benefit" would be
that Scheme would treat NaNs differently from other programming
languages.  That's a pseudo-benefit, achieved at real cost.
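
To make that cost concrete, here is a minimal sketch of what a
single-NaN requirement would in effect wrap around every flonum
operation.  It assumes an R6RS system; the names canonical-nan,
canonicalize, canonicalizing-fl+, and canonicalizing-fl* are
hypothetical and do not come from any standard or implementation.

```scheme
(import (rnrs))

;; Hypothetical canonical NaN; any one fixed NaN would serve.
(define canonical-nan (/ 0.0 0.0))

;; Hypothetical helper: map every NaN to the canonical one and
;; pass all other flonums through unchanged.
(define (canonicalize x)
  (if (flnan? x) canonical-nan x))

;; The extra test and branch that would follow each operation.
(define (canonicalizing-fl+ x y)
  (canonicalize (fl+ x y)))

(define (canonicalizing-fl* x y)
  (canonicalize (fl* x y)))
```

A compiler could inline the test, but the comparison and branch would
still sit on the result path of every inexact operation.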

Although none of Scheme's standard arithmetic procedures are
*required* to produce different results when applied to NaNs of
different bit patterns, it's easy to show that some of Scheme's
standard arithmetic procedures are *allowed* to produce different
results when applied to NaNs of different bit patterns, and it's also
easy to show that this is standard practice.

Here's another example:

> (begin (define nan1 (- +inf.0 +inf.0))
         (define nan2 (flasin 2.0)))
> (eqv? nan1 nan1)
#t
> (eqv? nan2 nan2)
#t
> (eqv? nan1 nan2)
#f
> (list nan1 nan2)
(+nan.0 +nan.0)

The behavior shown above can be duplicated in Larceny, Ikarus, and
Petite Chez on an old Pentium machine running a certain version of
Linux.  The same code might behave differently on machines that have different
processors or different trigonometric libraries, but all we need is
the fact that it's an allowed behavior in R6RS Scheme and actually
occurs on lots of machines because that's how the native floating
point hardware and libraries behave on those machines.
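
One way to look at the bit patterns behind the transcript above is to
store each NaN in a bytevector and read the bytes back as an integer.
This is a sketch under assumptions: it needs an R6RS system, the
helper name flonum-bits is hypothetical, and it assumes the
implementation stores a flonum's bits unchanged.  The patterns that
get printed depend on the hardware and libraries, so none are shown.

```scheme
(import (rnrs))

;; Hypothetical helper: the 64 bits of a flonum as an exact integer.
(define (flonum-bits x)
  (let ((bv (make-bytevector 8 0)))
    (bytevector-ieee-double-set! bv 0 x (endianness big))
    (bytevector-u64-ref bv 0 (endianness big))))

(define nan1 (- +inf.0 +inf.0))
(define nan2 (flasin 2.0))

;; Print each pattern in hexadecimal.  The two may or may not
;; differ on a given machine.
(display (number->string (flonum-bits nan1) 16)) (newline)
(display (number->string (flonum-bits nan2) 16)) (newline)
```

Where the two printed patterns differ, that difference is what EQV?
reports in the transcript.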

The IEEE-754 and IEEE-754R standards require arithmetic operations
to propagate NaN values (because a NaN's payload can provide a clue
to its origin).  Collapsing different NaN values onto a single
canonical value would require implementations to ignore that part
of the IEEE standards.
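
Payload propagation can be probed in the same way.  The sketch below
assumes an R6RS system; flonum-bits and bits->flonum are hypothetical
helpers, and the tag #x2A is an arbitrary choice.  Whether the tag
survives the addition depends on the hardware and the implementation,
so no output is claimed here.

```scheme
(import (rnrs))

;; Hypothetical helper: the 64 bits of a flonum as an exact integer.
(define (flonum-bits x)
  (let ((bv (make-bytevector 8 0)))
    (bytevector-ieee-double-set! bv 0 x (endianness big))
    (bytevector-u64-ref bv 0 (endianness big))))

;; Hypothetical helper: the flonum whose 64 bits are the integer n.
(define (bits->flonum n)
  (let ((bv (make-bytevector 8 0)))
    (bytevector-u64-set! bv 0 n (endianness big))
    (bytevector-ieee-double-ref bv 0 (endianness big))))

;; A quiet NaN whose low payload bits carry the arbitrary tag #x2A.
(define tagged-nan (bits->flonum #x7FF800000000002A))

;; An ordinary arithmetic operation with a NaN operand.
(define result (fl+ tagged-nan 1.0))

;; Compare the operand's pattern with the result's pattern.
(display (number->string (flonum-bits tagged-nan) 16)) (newline)
(display (number->string (flonum-bits result) 16)) (newline)
```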

> > The second technical mistake in the wording of ballot question #229
> > is its assertion that the "same" semantics "is what R6RS specifies."
> > That's just not true.  R6RS specifies the "same*" semantics for which
> > Bradley Lucier voted.
>
> I agree with the first point, but dispute the second, as shown above.

You're right.  The important point is that the R6RS semantics *allows*
the "same*" semantics, which is the most sensible and useful semantics.

The R6RS was the first standard for Scheme that tried to explain how
implementations of Scheme that use IEEE floating point numbers should
behave.  As such, it was the first standard for Scheme that gave any
real thought to how EQV? should behave on NaN arguments.  That led to
a careful rewrite of the specification for EQV? so it would *allow*
(but not *require*) the "same*" semantics that is generally recognized
as the most reasonable behavior for EQV? on NaNs.
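
As a rough illustration, the "same*" behavior could be modelled by a
predicate like the one below.  It assumes that "same*" means two NaNs
are EQV? exactly when their bit patterns are identical, which is an
assumption here because this message does not spell the option out.
The names flonum-bits and same*-eqv? are hypothetical, and a real
implementation would compare the bits directly rather than go through
a bytevector.

```scheme
(import (rnrs))

;; Hypothetical helper: the 64 bits of a flonum as an exact integer.
(define (flonum-bits x)
  (let ((bv (make-bytevector 8 0)))
    (bytevector-ieee-double-set! bv 0 x (endianness big))
    (bytevector-u64-ref bv 0 (endianness big))))

;; Hypothetical model of "same*": NaNs are compared by bit pattern,
;; and everything else is left to the system's own eqv?.
(define (same*-eqv? x y)
  (if (and (flonum? x) (flonum? y) (flnan? x) (flnan? y))
      (= (flonum-bits x) (flonum-bits y))
      (eqv? x y)))
```

Under this model a NaN is always same*-eqv? to itself, so it can be
found in a data structure, while NaNs of different origin stay
distinguishable.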

> I grant that we must revote because of my errors in description.
> However, my arguments for the "same" semantics, namely that +nan.0 is
> a single Scheme value no matter how many representations it has, and
> that it is appropriate to be able to use EQV? to determine if a NaN is a
> member of a data structure or not, still stand.

Then we'll have to argue about this for a while.  I'm not on the WG1
committee, so I don't get a vote.  Arguing is the best I can do.

Aubrey Jaffer wrote:
> There is only one NaN in SCM.  So the behavior of NaNs in SCM is
> independent of hardware and libraries:
>
>   (eqv? +nan.0 (/ 0.0 0.0)) ==> #t

That's fine.  Although the R5RS does not allow the SCM behavior shown
above, the R6RS allows it and the R7RS should also allow it.

On the other hand, the R7RS should not disallow the "same*" behavior.
The IEEE-754 and IEEE-754R standards allowed multiple values of NaN
because distinct values of NaN are useful for debugging numerical
codes.  The R7RS should not overrule the IEEE standards just to make
all of Scheme's inexact arithmetic run slower.

Will
