Formal Objection Submitter's name: Mark H Weaver Submitter's email: mhw@netris.org Relevant draft: R7RS draft 7 Type: defect Priority: major Relevant section of draft: Equivalence predicates Summary: Memoization is not possible in portable R7RS. Memoization requires an equality predicate that is guaranteed to return #f when its arguments are not operationally equivalent. The R6RS provides such a predicate (eqv?), but the R7RS draft does not. The R7RS draft 'eqv?' provides the needed guarantee only for IEEE-style flonums. For other inexact representations, the R7RS draft imposes a requirement that may conflict with the needed guarantee: it requires that if its arguments are non-IEEE-style inexact numbers and not both NaNs, then '=' and 'eqv?' must agree. Conflict Example 1: Signed zeroes ================================= Signed zeroes are meant to preserve the sign in case of a numerical underflow, so that the correct branch is selected along a branch cut. For example, although (= -1.0+0.0i -1.0-0.0i) => #t : (log -1.0+0.0i) => +pi*i (log -1.0-0.0i) => -pi*i This ensures that, in some numerical process, if variable 'z' approaches -1.0 from the negative imaginary side, (log z) will not abruptly change from -pi*i to +pi*i when the imaginary part of 'z' underflows. More generally, the usual convention is that (f +0.0) evaluates the limit of (f x) where 'x' approaches zero from above, and (f -0.0) evaluates the same limit from below. Although the signed zeroes are traditionally associated with IEEE 754, they are a useful concept that might be included in any other inexact numerical representation. In order to memoize 'log' properly, it must be possible to distinguish -1.0+0.0i from -1.0-0.0i, although those two numbers are '='. Conflict Example 2: Multiple precision arithmetic ================================================= Implementations may provide multiple representations of inexact numbers with different precisions, and primitive arithmetic operations may compute a result whose precision depends on the precision of the arguments. For example (sqrt x), where (= x 2.0), may yield a different result depending on the precision of 'x'. In order to memoize 'sqrt' properly, it must be possible to distinguish a low-precision representation of 2.0 from a high-precision representation of 2.0, although those two numbers are '='. History of eqv? on numbers ========================== The RRRS was the first Scheme report to define 'eqv?'. It defined 'eqv?' on inexact numbers as follows (this is actually from its definition of 'eq?', but 'eqv?' was defined as being the same as 'eq?' for inexact numbers): <http://dspace.mit.edu/bitstream/handle/1721.1/5600/AIM-848.pdf> (page 24) "Returns #!true if obj1 is identical in all respects to obj2, otherwise returns #!false. If there is any way at all that a user can distinguish obj1 and obj2, then eq? will return #!false." The R3RS explicitly defined EQV? in terms of operational equivalence: <http://people.csail.mit.edu/jaffer/r3rs_8.html> The eqv? procedure implements an approximation to the relation of operational equivalence. It returns #t if it can prove that obj1 and obj2 are operationally equivalent. If it can't, it always errs on the conservative side and returns #f. Here's the R3RS definition of operational equivalence: Two objects are operationally equivalent if and only if there is no way that they can be distinguished, using Scheme primitives other than eqv? or eq? or those like memq and assv whose meaning is defined explicitly in terms of eqv? or eq?. It is guaranteed that objects maintain their operational identity despite being named by variables or fetched from or stored into data structures. It then elaborates what this means for various types, including numbers: Two numbers are operationally equivalent if they are numerically equal (see =, section see section 6.5.4 Numerical operations) and are either both exact or both inexact (see section see section 6.5.2 Exactness). This elaboration is all that remained in the R4RS and R5RS. The R6RS was the first report to explicitly discuss IEEE floating point numbers, signed zeroes, and multiple precisions. The authors recognized the inadequacy of the R5RS definition of 'eqv?' in the presense of these new numbers, and changed the definition of 'eqv?' on inexact numbers to one based on operational equivalence. The R7RS working group has also recognized the inadequacy of the R5RS 'eqv?', and adopted language that provides the needed guarantee for IEEE-style floating point numbers. However, the current draft requires that implementations violate the needed guarantee for many types of non-IEEE numbers. Proposed solutions ================== Option 1: Adopt the definition of 'eqv?' on numbers attached below. Option 2: Remove the clauses of the R7RS 'eqv?' definition pertaining to non-IEEE-style inexact numbers, thus eliminating the conflict that prohibits implementations from providing the guarantee needed to implement portable memoization. Proposed R7RS definition for 'eqv?' on numbers ============================================== Objects obj_1 and obj_2 are /substantially different/ if and only if one of the following holds: * Obj_1 and obj_2 are both numbers, at least one is numerically equal to itself (see =), and they are not numerically equal (see =) to each other. * Obj_1 and obj_2 are not both numbers, and they are different (in the sense of eqv?). Inexact numbers z_1 and z_2 are /operationally equivalent/ if and only if for all procedures f that can be defined as a finite composition of the standard numerical operations specified in section 6.2.6, (f z_1) and (f z_2) either both raise exceptions or yield results that are not /substantially different/. The eqv? procedure returns #t if one of the following holds: [...] * Obj_1 and obj_2 are both exact numbers and are numerically equal (see =). * Obj_1 and obj_2 are both inexact numbers, at least one is numerically equal to itself (see =), and the implementation is able to prove that obj_1 and obj_2 are /operationally equivalent/. Implementations must be able to prove that two inexact numbers with the same internal representation are /operationally equivalent/. The eqv? procedure returns #f if one of the following holds: [...] * One of obj_1 and obj_2 is an exact number but the other is an inexact number. * Obj_1 and obj_2 are exact numbers for which the = procedure returns #f. * Obj_1 and obj_2 are inexact numbers, at least one is numerically equal to itself (see =), and the implementation is unable to prove that obj_1 and obj_2 are /operationally equivalent/. Rationale for the above definition ================================== The novel feature of this definition is the auxiliary predicate /substantially different/, which is needed to gracefully avoid circularities and the problems associated with NaNs, both of which plagued the R6RS definition. The circularity problem is addressed by defining /substantially different/ on numbers in terms of = instead of eqv?. The NaN problem (see below) is addressed by making sure that two numbers can only be /substantially different/ from each other if at least one of them is = to itself. Note that there is considerable freedom in how /substantially different/ is defined. As long as it is capable of making the most coarse distinctions between numbers, that's good enough, because it should always be possible to choose a procedure 'f' that amplifies even the finest distinction between any two inexact numbers that are not operationally equivalent. For example, suppose that in addition to the usual +0.0 and -0.0, an experimental numeric representation was able to distinguish (x/y+0.0) from (x/y-0.0) for any exact rational x/y. It would still be possible to amplify that distinction to an infinite difference by subtracting x/y and then taking the reciprocal. Note also that there is considerable freedom in the choice of procedures to allow in the construction of 'f'. The main requirements are that they are sufficient to amplify arbitrary fine distinctions into coarse ones that are /substantially different/, and that the procedures are pure functions, i.e. they must not use 'eqv?' or 'eq?' (directly or indirectly), they must not cause visible side effects, and their return values must depend only on their arguments. It needn't be a comprehensive set. The NaN problem =============== Alex Shinn has brought it to my attention that the R6RS definition of 'eqv?' has a fatal flaw. It is _never_ required to return #true when applied to inexact number objects, even if both arguments are the same object. Here is the relevant excerpt from section 11.5 of the R6RS (page 37 of the PDF): The eqv? procedure returns #t if one of the following holds: [...] * Obj1 and obj2 are both inexact number objects, are numerically equal (see =, section 11.7), and yield the same results (in the sense of eqv?) when passed as arguments to any other procedure that can be defined as a finite composition of Scheme’s standard arithmetic procedures. The problem has to do with NaNs. Since (= obj1 obj2) is needed for the above condition to apply, and a NaN object is not '=' to itself, it follows that (let ((x +nan.0)) (eqv? x x)) => unspecified. Now consider an arbitrary finite inexact number object 'z'. The R6RS only requires (eqv? z z) => #t if the above condition applies, which in turn requires that (eqv? (f z) (f z)) => #t for any procedure 'f' which is "a finite composition of Scheme's standard arithmetic procedures." This condition is never satisfied, because it is trivial to produce an 'f' that meets the above requirements and yet always returns +nan.0, e.g.: (lambda (z) (/ (* z 0.0) 0.0)) (lambda (z) +nan.0) _______________________________________________ Scheme-reports mailing list Scheme-reports@scheme-reports.org http://lists.scheme-reports.org/cgi-bin/mailman/listinfo/scheme-reports