[Scheme-reports] Issues with R7RS draft 8 section 7.1.1 <symbol element>
David A. Wheeler 12 Jan 2013 19:29 UTC
I just found some issues in R7RS draft 8 section 7.1.1
("Lexical structure") involving <symbol element> and <string element>.
Currently these productions are defined as follows:
<symbol element> -->
    <any character other than <vertical line> or \>
    | <string element> | " | \|
...
<string element> --> <any character other than " or \>
    | \a | \b | \t | \n | \r | \" | \\
    | \<intraline whitespace>* <line ending>
      <intraline whitespace>*
    | <inline hex escape>*
But these productions have two issues:
1. Having <symbol element> list double-quote is pointless, since that
is already covered by "<any character other than <vertical line> or \>".
2. More importantly, the reference to <string element> creates a needless
ambiguity, because <string element>'s own alternative
<any character other than " or \> ALSO matches almost all of the same
characters.
This causes problems if you try to directly implement these productions
in a typical tokenizer. You can work around it, but it'd be better
if that weren't necessary.
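As a rough illustration of the ambiguity, here is a minimal Python sketch
that counts how many alternatives of the current <symbol element> production
accept a given element. The predicate names, the simplified hex escape
pattern, and the omission of the line-continuation escape are all
assumptions made for this example; none of it comes from the draft.

```python
import re

# Simplified stand-ins for the draft's terminals (assumptions for this
# sketch): the line-continuation escape is left out, and the hex escape
# is modelled as \x<hex digits>; .
MNEMONIC_ESCAPES = {"\\a", "\\b", "\\t", "\\n", "\\r", '\\"', "\\\\"}
HEX_ESCAPE = re.compile(r"\\x[0-9a-fA-F]+;")


def string_element(s):
    """<string element> as currently written: a plain character or an escape."""
    plain = len(s) == 1 and s not in '"\\'
    return plain or s in MNEMONIC_ESCAPES or HEX_ESCAPE.fullmatch(s) is not None


# The four alternatives of the current <symbol element>, in order.
SYMBOL_ELEMENT_ALTERNATIVES = [
    lambda s: len(s) == 1 and s not in "|\\",  # any char other than | or \
    string_element,                            # <string element>
    lambda s: s == '"',                        # "
    lambda s: s == "\\|",                      # \|
]


def count_matches(s):
    """Number of <symbol element> alternatives that accept the element s."""
    return sum(1 for alt in SYMBOL_ELEMENT_ALTERNATIVES if alt(s))
```

Under these assumptions, count_matches("a") and count_matches('"') both
return 2, while an escape such as "\\n" is accepted by exactly one
alternative. A tokenizer that tries the alternatives literally therefore
has two ways to derive most ordinary characters.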
I think these would be better written as follows, which removes the
extraneous double-quote and adds a disambiguating <special string element>:
<symbol element> -->
    <any character other than <vertical line> or \>
    | <special string element> | \|
...
<string element> --> <any character other than " or \>
    | <special string element>
<special string element> -->
    \a | \b | \t | \n | \r | \" | \\
    | \<intraline whitespace>* <line ending>
      <intraline whitespace>*
    | <inline hex escape>*
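With the productions split this way, the first character of each element
decides which alternative applies: anything other than a backslash is a
plain character, and a backslash starts either a <special string element>
or \|. The following Python sketch shows that dispatch for the body of a
|...| symbol. The function name read_symbol_elements is hypothetical, the
hex escape is assumed to have the form \x<hex digits>; and the
line-continuation escape is omitted for brevity.

```python
import re

# Assumed shape of <inline hex escape>: \x<hex digits>;
HEX_ESCAPE = re.compile(r"\\x[0-9a-fA-F]+;")
# Mnemonic escapes from <special string element>, keyed by the char after "\".
MNEMONIC = {"a": "\a", "b": "\b", "t": "\t", "n": "\n", "r": "\r",
            '"': '"', "\\": "\\"}


def read_symbol_elements(body):
    """Decode the text between the vertical lines of a |...| symbol.

    Hypothetical helper for illustration; line continuations are not handled.
    """
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "|":
            raise ValueError("unescaped vertical line at index %d" % i)
        if ch != "\\":
            # <any character other than <vertical line> or \>
            out.append(ch)
            i += 1
            continue
        # A backslash: <special string element> or \|
        hex_match = HEX_ESCAPE.match(body, i)
        nxt = body[i + 1] if i + 1 < len(body) else ""
        if hex_match:
            out.append(chr(int(hex_match.group()[2:-1], 16)))
            i = hex_match.end()
        elif nxt == "|":
            out.append("|")
            i += 2
        elif nxt in MNEMONIC:
            out.append(MNEMONIC[nxt])
            i += 2
        else:
            raise ValueError("unknown escape at index %d" % i)
    return "".join(out)
```

Each branch corresponds to exactly one alternative of the revised
<symbol element>, so no backtracking or tie-breaking rule is needed.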
Thanks!
--- David A. Wheeler