[Scheme-reports] WG1 Scheme as a language for CS1 Vincent Manis 09 May 2011 01:45 UTC

The small version of R7RS is supposed to be suitable as a teaching language. Now that the draft has been out for review for a few weeks, I feel ready to comment on the degree to which the WG1 language meets that requirement.

My criterion is: would the WG1 language be usable, unaugmented, for about 80 hours of instruction taking a student from essentially no background in programming and computer science to writing and testing substantial multi-module programs using interesting data structures?  Similarly, could someone use a book that used only the WG1 language to learn programming on their own? (Disclaimer: I once co-authored an introductory computer science text that used Scheme. That was getting on for 20 years ago, I have no intention of doing it again.)

By the way, if I were designing such a course today, I would include a lot of material on user interfaces and web services. The frameworks needed for this are of course out of scope for a core programming language; so a hypothetical text would have a name something like `Introduction to Computer Science using WG1 Scheme and the SuperWhizzy Web Framework'. The fact that I'm not commenting on the fictitious SuperWhizzy doesn't mean I think it's unimportant.

On the whole, I think the Draft succeeds admirably. While there is still a bit of work to do in cleaning up and tightening the specification, on the whole, the WG1 language has pretty much everything you would need in teaching people how to program. Some parts of the language would be pretty hard going for a non-mathematical student, e.g., the various division procedures; but in that case, a nice façade is provided (QUOTIENT, MODULUS, and REMAINDER). I doubt that I would cover every nook and cranny of the Draft if teaching an extended sequence in Scheme, but the match-up is pretty good.

There are, however, two areas in which I believe the language specified by the Draft is lacking. I am formally requesting that a ticket be created for each of these items, so that they can be considered.

1. WG1 quite properly eschews any kind of formatting language. Unfortunately, the Draft as provided makes it impossible to display a number with a fixed number of decimal places. Not only does this make output look ugly, but it causes the sin of False Accuracy (`Captain, the Enterprise will enter the black hole in exactly 17.867 seconds').

This may sound like a cosmetic issue, but I have sat through a LOT of computer science curriculum meetings in my life, and it's exactly the kind of detail someone opposed to Scheme will drag out. Similarly, if someone is considering learning Scheme, this `trivial' detail can be enough to discourage them.

I would therefore like to suggest augmenting NUMBER->STRING to accept a keyword indicating the kind of conversion expected. So

  (number->string 1.2345 'fixed 2)  => "1.23"

This doesn't break any existing programs, because a non-numeric radix isn't currently allowed. WG2 extensions to this could allow for E-notation, polar representation of complex numbers, etc. FIXED is the only conversion specifier I am requesting in WG1. (I'm not wedded to the particular symbol FIXED.)

Alternatively, WG1 could provide another procedure, e.g.

  (number->string-custom-format 1.2345 'fixed 2)

(I don't have time to think of a shorter name.)

This doesn't do the entire numeric formatting job (for example, it doesn't offer the ability to set a field width). But it does enough to be useful in real programs.

One might argue that NUMBER->STRING(-CUSTOM-FORMAT) can be written in Scheme. Indeed it can; so can SQRT, so one can just give learners some code. But this violates the pedagogical principle that you never show code that the student can't understand, and students often have serious trouble with understanding representations and conversions anyway. (One student, in a second-year course, once asked me, `I understand how the program converts from the string "01A2" to hexadecimal, but how exactly do you convert the result to binary?')

So I hope that a facility like this (present in Fortran I) can be added.

2. [The controversial issue] WG1 has already taken a vote and decided not to include random number generation in the language. I very respectfully disagree with that decision, and this is my opportunity to say why I think that the WG should revisit its decision.

First, the need for the facility. We all, of course, like to play games. A standard early exercise in my introductory programming courses was Matching Pennies, aka Coinflip. This takes something like 20 lines of code, and is a coherent, interesting (for five minutes or so) application, where beginners can see an entire program that does something useful. In later years, I tended to think that a completely redesigned CS1 course ought to use nothing but games as its domain (running a SE training program for a videogame development company can have that effect on you), rather than the WG1-forced rule of banishing games almost completely.

Of course, random numbers aren't just useful for games. They're useful in various algorithms (e.g., in vector Quicksort, where one technique swaps the pivot element with a randomly-chosen element to destroy ordering effects), and in testing and test data generation.

Indeed, one can give the learner a random number generator (`you are not expected to understand this'---http://www.cs.bell-labs.com/who/dmr/odd.html), but why not ask them to type in SQRT as well? Besides, typing errors in getting the random number generator from a book or other place into a file can result in very obscure bugs for students to ferret out. Even cut&paste can get mangled.

I believe that a very basic facility for generating random numbers is essential in a core language. I further believe that, if WG2's random number facilities provide all the features found in SRFI-27 (or maybe even more features), many casual users of random numbers will prefer a more spartan API. The proposed facility therefore serves a purpose even in a WG2 implementation, just as both (READ port) and (READ) do in core Scheme.

Second, the proposed API. This is taken from Dartmouth BASIC (the 1968 edition), and should cause no surprise to anyone. There is an optional module basicrandom, which hides a pseudorandom number generator that approximates a uniform distribution, and exports the following.

  (RANDOMIZE! [n])    basicrandom procedure

  Uses n in an implementation-dependent way to seed the random-number generator deterministically. If n is omitted, the implementation may either choose a value for n, non-deterministically, or report an implementation violation.

  (RANDOM-REAL)       basicrandom procedure

  Returns the next randomly-chosen inexact number in the semi-closed interval [0.0, 1.0).

  Implementations should document:
    - whether evaluating (RANDOMIZE!) results in an implementation violation
    - the statistical properties of and/or the technique used in the random number generator.

(Fussy technicality: Dartmouth BASIC did not allow RANDOMIZE 12345, but only RANDOMIZE, according to the manual.)

Third, the rationale. I'd be curious to know if anyone has read this far in the message :)

Names are given for attribution only, and refer to messages on the WG1 mailing list last November. No claim is made that the persons cited agree or disagree with the corresponding quoted text as of now.

`If I'm understanding correctly (medium probability :-) the dispute, apart from whether to say anything at all about randomness, is between the complexity needed to get it exactly right for complicated uses and the simplicity to make it usable by people like me. This could be resolved by specifying a simple interface on top of the complicated one, and putting the whole thing in a module.' (Brian Harvey).

YES!!!!

  Objections:

    - `For the record, I don't think we need randomness in WG1. A user can roll their own, given good integers etc. Yes, they can do it wrongly. But the same argument applies to all sorts of algorithms. I have textbooks full of algorithms that, at first sight, are overcomplicated, but fix all sorts of subtle problems in the 'obvious' approach (if there even is one). Why are random numbers special?' (Alaric Snell-Pym)

    Answer 1: Why is SQRT special? NUMBER->STRING? APPEND? *?

    Answer 2: Something a student will want to use within 3 hours of starting to learn Scheme ought to be built into the language.

    - this duplicates the module WG2 will come up with

    Not in any significant way. A WG2 implementation will provide this module as about a 10-line façade.

    - There are too many issues for WG2 ever to come up with a coherent proposal.

    If that's the case, that's WG2's problem. It doesn't affect WG1, though.

    - There are issues about reading and writing random states, and maybe other things too.

    Again, WG2 problems.

    - this will generate inferior random numbers

    Nothing in this proposal says anything about the technique to be used, which will depend upon the goals of the implementer. Indeed, an implementation might use C's rand, which is nowadays an indefensible technique (except when I do it in something quick and dirty :). I personally would be reluctant to use a library-defined random number generator in code intended to be portable, because I don't trust the authors of libraries. I don't keep up with the RNG literature, but the Mersenne Twister and WELL are very popular. However, as with many other issues, the quality of random numbers generated is a criterion that prospective users will consider before selecting a particular implementation. This specification simply requests that the implementer document the choices that were made.

    - `I'm beginning to come around to the viewpoint that this matter isn't ripe for standardization, despite the many languages that have standardized it already.' (John Cowan, last November).

    I think almost everyone can agree on the facility as given here, leaving it to the implementation to make appropriate choices and document them, and to WG2 to analyze the essential randomness of the universe.

    - What about RANDOM-INTEGER and PICK-RANDOM-ELEMENT-FROM-LIST?

    These actually ARE easy to write, given RANDOM-REAL, and writing each teaches the learner something useful about Scheme programming (for RANDOM-INTEGER, `how do TRUNCATE, CEILING, FLOOR, and ROUND differ?'). I have no objection to adding these two procedures to basicrandom, but would prefer to omit them to keep the module as small as possible.

    - What about normal, Pareto, negative exponential, and Poisson distributions? Multiple random states?

    Overengineering as far as WG1 is concerned.

    - Can you make the RNG a parameter, so that it works correctly in multi-threaded programs?

    Threads aren't in WG1. If WG2 includes threads, it can decide how that impacts basicrandom. I have an opinion on that, but won't express it here, because I want to rule WG2 out completely as an issue in designing basicrandom's interface.

    - Why the implementation violation?

    It's conceivable (though extremely unlikely) that a particular platform has no sources of random bits (other than doing what DEC FOCAL was apocryphally claimed to do, namely keeping a count of the number of iterations of the `waiting for a key' routine, and returning that; I was just looking at the PDP-8 FOCAL manual (on bitsavers.org), which describes the random number generator as `nonstatistical'). If an implementer doesn't feel they can truly produce a nondeterministic seed, they have the option of reporting a violation, rather than producing a (possibly biased) result. I doubt many implementations would take that option.

Conclusion: the basicrandom module as described here satisfies a real need on the part of both learners and experienced programmers; it leaves it to the implementation to make appropriate choices and document them; and it in no way precludes any (or no) WG2 module (a WG2 randomness facility that cannot support basicrandom as a façade is too strange to contemplate). I urge WG1 to reconsider its decision.

If anyone has gotten to this point, my hearty congratulations. I hope that WG1 will indeed create tickets on the issues I've raised.

-- vincent

_______________________________________________
Scheme-reports mailing list
Scheme-reports@scheme-reports.org
http://lists.scheme-reports.org/cgi-bin/mailman/listinfo/scheme-reports