The behavior of eq? and eqv? on procedure arguments affects quite a few optimizations, but three of those optimizations have been especially important in the history of Lisp, Scheme, and related languages. Understanding these three optimizations would go a long way toward improving our discussion of eq? and eqv?. The three optimizations I'm going to discuss are: A. implementing eq? by a simple pointer comparison B. implementing eqv? by examining the representations of procedures C. implementing delayed boxing for procedures Optimization A is explicitly sanctioned by the IEEE standard, the R4RS, and the R5RS. Many implementations of Scheme, including almost all compiled implementations, perform this optimization. Optimization B is very nearly explicit in the rationale given by the IEEE standard, the R4RS, and the R5RS for the behavior of eqv? on procedures and literals. Optimization C is based on the eta rule of lambda calculus, and is often implemented by compilers for languages based on the lambda calculus. It's similar to delayed boxing for floating point numbers, which is probably the most important technique available for improving the performance of floating point calculations in Scheme. Delayed boxing for numbers is the main reason eq? and eqv? are allowed to behave differently on numerical arguments. (I should mention that aggressively naive delayed boxing is not safe for space efficiency. This is more likely to be a problem with delayed boxing for numbers than for procedures, and those who write optimizing compilers will probably know of the problem and a possible solution, so I won't mention this further.) Although some Scheme compilers have performed optimization C, most current compilers do not. One reason this optimization is implemented so rarely by Scheme compilers is that, beginning with the R3RS, the behavior of eq? and eqv? on procedures has been constrained in ways that make optimization C hard to reconcile with optimization A. One key constraint is that, according to the R3RS: ...it is guaranteed that eqv? can recognize a procedure created at a given time by a given lambda expression as "being itself." This is useful for applications in which procedures are being used to implement objects with local state. That constraint can be satisfied by combining optimization B with optimization C. If eqv? compares two procedures, each represented as closures, by looking to see whether their code pointers coincide *and* their environment pointers coincide, and the delayed boxing of optimization C is done carefully (see below), then eqv? will always recognize the equality of every procedure with itself, even if delayed boxing causes that procedure to be represented by two or more different pointers. For that combination of optimizations B and C to satisfy the constraints of the R3RS and subsequent reports, the delayed boxing must be done carefully. If a closure consists of a code pointer and an environment pointer, then delayed boxing can arrange for the same environment pointer to be used for each delayed boxing. A simpler alternative is to use flat closures, although that requires eqv? to look at every slot of the environment part of the closure. Combining optimizations B and C has been possible under every Scheme report ever written. That combination is seldom seen, however, because the R3RS and subsequent reports required eq? as well as eqv? to recognize every procedure as "itself". That was the purpose of the (let ((p (lambda (x) x))) (eq? p p)) ==> #t example that was introduced by the R3RS and retained by the R4RS, IEEE, and R5RS standards. * * * That example was *not* intended to say eq? and eqv? must behave the same on procedures. How do I know? Because Jonathan Rees and I worked together on this. Although Jonathan was primary editor of the R3RS, I was the one who suggested restating the semantics of eqv? in terms of operational equivalence, and I wrote most of the prose for that specification. I was careful *not* to list procedures as a type on which eq? and eqv? would be guaranteed to behave the same. Furthermore, Jonathan and I added the notes that were relabelled as rationales in the R4RS, IEEE, and R5RS. That's how I know those rationales should *not* be misinterpreted to say eq? and eqv? are required to behave the same on procedures. If I recall correctly, the example above was a late addition to mollify those who regarded eq? (rather than eqv?) as the primary arbiter of object identity in Scheme. Those people didn't much care whether eqv? treated (lambda (x) x) and (lambda (y) y) as the same procedure, because they didn't regard eqv? as the primary arbiter of identity; they regarded eqv? as a half-measure along the way to equal?. For my part, I regard eq? as a fast but slightly broken version of eqv? that's guaranteed to behave the same as eqv? on some kinds of arguments (which were listed explicitly in the R3RS/R4RS/IEEE/R5RS/R6RS) but not others (only some of which were listed explicitly). I don't remember whether the failure to list procedures among the types on which eq? and eqv? might not behave the same was deliberate or an oversight, but I suspect it was deliberate. The rule of consensus was already beginning to be interpreted as a requirement for unanimity among the 17 listed authors, so editors of the R*RS series were afraid to add clarifications that might spur someone to raise an objection. In the R4RS/IEEE/R5RS, that semantics of procedure identity was formalized by the (conceptual) location tags. Although Richard Kelsey was primary editor of the R4RS, I came up with the idea of explaining the semantics of procedure identity in terms of these (conceptual) location tags, and I drafted the prose for the new specifications of eq? and eqv? as well as the paragraph that added location tags to the specification of lambda expressions. That history is why I can say with some authority that the R3RS, R4RS, IEEE, and R5RS standards do not require eq? and eqv? to behave the same on procedure arguments. They are constrained more tightly than I would like, but they are not constrained to behave the same. * * * This has been a long digression, but I believe the digression was needed to counter a common misconception. Having disposed of that misconception, I can now summarize the combinations of optimizations A, B, and C that are practical under constraints imposed by the R5RS, the R6RS, and by my proposed semantics for eq? and eqv?. R5RS R6RS my proposal ABC no no yes AB yes no yes AC no yes no BC yes yes yes A yes yes yes B yes yes yes C no yes no none yes yes yes The table above represents a judgement call. A sufficiently weird implementation of Scheme might implement a combination I regard as impractical. Indeed, some combinations I count as practical are pretty weird. The R6RS, for example, allows both eq? and eqv? to return false whenever either argument is a procedure, and that weird implementation justifies several "yes" items in the R6RS column. Implementing eq? by making it the same as eqv? is a lot less weird, and I assumed eq? is the same as eqv? for all rows that don't include optimization A. For rows "C" and "none", I assumed eq? is the same as eqv?, and that eqv? uses a simple pointer comparison when comparing procedures. * * * This message is long enough, and I have meetings to attend. In a subsequent message, I will explain why code that uses eq? to compare procedures usually works, even though it's probably incorrect in theory. I will also explain what implementors have been doing and can do in the future to allow that kind of incorrect legacy code to work in spite of itself. Will _______________________________________________ Scheme-reports mailing list Scheme-reports@scheme-reports.org http://lists.scheme-reports.org/cgi-bin/mailman/listinfo/scheme-reports