Re: [Scheme-reports] Standard Feature Identifiers are too low-level

Alex Shinn 05 Jan 2012 04:45 UTC
[Sorry, the formatting in my previous post was garbled, which seems to be
a bug triggered by replying directly.  Hopefully this works.]

On Wed, Jan 4, 2012 at 12:38 AM, Marc Feeley <feeley@iro.umontreal.ca> wrote:

> I think the standard feature identifiers given in appendix B are too
> low-level and ill-conceived.  They are removing from the high-level
> nature of Scheme.  Specifically I object to:
>
> windows
>  This Scheme implementation is running on Windows.
>
> posix
>  This Scheme implementation is running on a POSIX system.
>
> unix, darwin, linux, bsd, freebsd, solaris, ...
>  Operating system flags (more than one may apply).
>
> i386, x86-64, ppc, sparc, jvm, clr, llvm, ...
>  CPU architecture flags.
>
> ilp32, lp64, ilp64, ...
>  C memory model flags.
>
> big-endian, little-endian
>  Byte order flags.

Thanks, this is a good objection, and one I held myself originally.
Fortunately, we have at least some empirical evidence for existing
features as used in different implementations.  In Chibi the following
features are cond-expanded on, in order of frequency:

  17: modules (whether the module system is supported (only in tests))
   6: string-streams (whether custom ports as FILE* objects are supported)
   5: complex (exact-complex in the draft, the name is an open ticket)
   4: threads (whether chibi was compiled with green threads support)
   2: bsd
   2: ratios
   2: full-unicode
   1: auto-force (whether all primitive ops implicitly force their args)

In addition there are a few scripts that cond-expand on the
implementation name between Chibi, Chicken and/or Gauche.  Since the
Chibi distribution is of course meant only for Chibi, there's not much
need for this right now.  However, as the initial attempts at portable
R6RS code have shown, the one thing that is absolutely necessary is
the ability to cond-expand on the implementation.  R7RS large will
hopefully reduce the need for this, but can't eliminate it.
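
As a minimal sketch of what cond-expanding on the implementation looks
like: the snippet below assumes each implementation provides its own
name as a feature identifier (chibi, chicken, gauche) and uses R7RS
small library names. The variable implementation-name is only for
illustration.

```scheme
(import (scheme base) (scheme write))

;; Pick a value per implementation.  The else clause keeps the code
;; loadable on implementations that are not listed here.
(cond-expand
  (chibi   (define implementation-name "chibi"))
  (chicken (define implementation-name "chicken"))
  (gauche  (define implementation-name "gauche"))
  (else    (define implementation-name "unknown")))

(display implementation-name)
(newline)
```

In real code each branch would typically import or define the
implementation-specific piece rather than a string.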

A quick grep for cond-expand in the Chicken core also turns up mainly
tests - the most platform dependent code is in the posix module, which
conditionally uses different files from the Makefile.

Searching the entire Chicken eggs repository turns up many, many uses
of cond-expand, including checks for different OSes (windows, mingw,
msvc, macosx, bsd, linux, etc.), 32-bit and 64-bit, big-endian and
little-endian, and various other features such as whether the code is
being compiled and whether unsafe operations are supported.  What I
can't find are any checks for the architecture (beyond bit size and
endianness, which are checked separately) or for the C memory model
flags.

> In the specific case of Gambit-C and probably other Scheme to C
> compilers, these features are only known when the C code is compiled
> by a specific C compiler for a particular platform.  Indeed it is
> possible, and a common use-case for me, to compile Scheme code to C,
> and to ship the C code so that the client can compile the C code in
> his specific environment (his choice of C compiler, his choice of
> OS, his choice of hardware, etc).  In a sense the C code is a
> portable assembly language.  So these features are not known when
> cond-expand is expanded.

Presumably this would not work for code that was actually
OS-dependent?  If you need networking support, then the code would be
wildly different for WINSOCK vs BSD vs Plan9.  If part of the
generated C code is a portable runtime that handles this
automatically, then I would argue that the tools used to make such a
runtime should as much as possible be available to the programmer.

An interesting approach for Scheme to C compilers would be to
translate the cond-expand into #if directives, with high-level
features substituted by Scheme and low-level values determined by the
C preprocessor.
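
A rough sketch of the low-level half of that idea, not taken from any
existing compiler: the procedure below turns a cond-expand feature
requirement into the text of a C preprocessor condition. The
FEATURE_ macro prefix and the names requirement->c, c-identifier and
join are hypothetical, and library requirements are not handled.

```scheme
(import (scheme base))

;; C identifiers cannot contain hyphens, so map them to underscores.
(define (c-identifier sym)
  (string-map (lambda (c) (if (char=? c #\-) #\_ c))
              (symbol->string sym)))

;; Join strings with a separator; EMPTY is returned for an empty list.
(define (join strs sep empty)
  (cond ((null? strs) empty)
        ((null? (cdr strs)) (car strs))
        (else (string-append (car strs) sep
                             (join (cdr strs) sep empty)))))

;; Translate a feature requirement into a C #if condition.
;; FEATURE_<name> is a hypothetical macro the C build would define.
(define (requirement->c req)
  (cond
    ((symbol? req)
     (string-append "defined(FEATURE_" (c-identifier req) ")"))
    ((and (pair? req) (eq? (car req) 'not))
     (string-append "!(" (requirement->c (cadr req)) ")"))
    ((and (pair? req) (eq? (car req) 'and))
     (string-append "(" (join (map requirement->c (cdr req)) " && " "1") ")"))
    ((and (pair? req) (eq? (car req) 'or))
     (string-append "(" (join (map requirement->c (cdr req)) " || " "0") ")"))
    (else (error "unsupported feature requirement" req))))
```

For example, (requirement->c '(and windows (not big-endian))) yields
"(defined(FEATURE_windows) && !(defined(FEATURE_big_endian)))", which
a compiler could emit after #if so that the choice is deferred until
the C code is compiled.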

> Let me stress that one of the important qualities of a high-level
> language is that they abstract from lower-level issues (such as
> processor type, operating system, endianness, etc).  This
> abstraction reduces the conceptual burden when programming and
> allows code to be portable.  It is strange to be able to know with
> cond-expand what the CPU architecture is, yet nothing in the
> semantics of Scheme actually depends on the CPU architecture (unless
> I've missed something when reading the r7rs spec).  Same goes for
> the other features.

Yet knowing the endianness is essential for reading many binary files,
and precisely because nothing in the semantics of Scheme depends on
it, there is no other way to determine the endianness in R7RS small.
Your proposed test

  (= 0 (u8vector-ref '#u16(1) 0))

is not possible because we don't have u16 vectors.
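
To illustrate how the byte order features would be used instead, here
is a sketch that decodes a 16-bit value stored in native byte order.
It assumes the bytevector procedures of R7RS small and the
big-endian and little-endian feature identifiers from appendix B; the
decode-u16-* names are made up for the example.

```scheme
(import (scheme base))

;; Decode an unsigned 16-bit integer at index i, low byte first.
(define (decode-u16-le bv i)
  (+ (bytevector-u8-ref bv i)
     (* 256 (bytevector-u8-ref bv (+ i 1)))))

;; Decode an unsigned 16-bit integer at index i, high byte first.
(define (decode-u16-be bv i)
  (+ (* 256 (bytevector-u8-ref bv i))
     (bytevector-u8-ref bv (+ i 1))))

;; Select the native decoder at expansion time.  If the implementation
;; provides neither feature, fail only when the procedure is called.
(cond-expand
  (big-endian    (define decode-u16-native decode-u16-be))
  (little-endian (define decode-u16-native decode-u16-le))
  (else (define (decode-u16-native bv i)
          (error "native byte order is unknown"))))
```

With the bytes 1 and 2, decode-u16-le gives 513 and decode-u16-be
gives 258.  Files with a fixed, documented byte order only need the
explicit decoders and no cond-expand at all.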

> There is another issue which concerns code mobility.  It is
> currently possible in Gambit-C, and probably other Scheme systems
> now or in the future, to migrate code from one execution environment
> to another at run time (think Termite's process migration, or an RPC
> mechanism).  In Gambit-C some code loaded in one instance of Gambit
> (A) can be serialized and sent over to a different instance of
> Gambit (B) and executed there.  A and B can have entirely different
> characteristics with respect to the above feature list (CPU
> architecture, operating system, endianness, word width, etc).
> Adding the above cond-expand features will hinder this code
> mobility.

Are there any non-trivial applications which can reliably transfer
arbitrary code between different operating systems?  One of the goals
of Java was portable code, yet even for single, fixed applications you
often need to download separate versions for separate platforms.
Large MapReduce data centers from Google and Amazon, which do transfer
code on the fly between machines, rely on a uniform platform running
the same version of Linux on the same architecture.

> It would be nice if some day code mobility was possible between
> different Scheme implementations.  Unfortunately, Scheme code making
> use of cond-expand will be hard to migrate between instances of
> different Scheme implementations.  So please keep in mind that in
> general, feature testing with cond-expand prevents code mobility
> between different Scheme implementations, and in the case of the
> above list of features it also prevents code mobility between
> instances of the same Scheme implementation.

I agree completely, but we have practical issues to deal with, and I
don't want to prevent people from writing useful libraries in the name
of future goals.  I think use of cond-expand in general should be
minimized, and in particular the use of low-level features.  Gradually,
uses of cond-expand should be replaced with new standardized
libraries, and the goal for "mobile" code should be to avoid use of
cond-expand altogether.

In addition to revising some of the features in ticket #323, I think
we should consider adding a disclaimer that cond-expand is considered
poor style, and should only be used when there's no alternative.

--
Alex
