@@ -96,15 +95,15 @@ The {EBNF::Writer} class can be used to write parsed grammars out, either as for
96
95
The formatted HTML results are designed to be appropriate for including in specifications.
97
96
98
97
### Parser Errors
99
-
On a parsing failure, and exception is raised with information that may be useful in determining the source of the error.
98
+
On a parsing failure, an exception is raised with information that may be useful in determining the source of the error.
100
99
101
100
## EBNF Grammar
102
101
The [EBNF][] variant used here is based on [W3C](https://w3.org/) [EBNF][]
103
102
(see [EBNF grammar](https://dryruby.github.io/ebnf/etc/ebnf.ebnf))
104
103
as defined in the
105
104
[XML 1.0 recommendation](https://www.w3.org/TR/REC-xml/), with minor extensions:
106
105
107
-
Note that the grammar includes an optional `[identifer]` in front of rule names, which can be in conflict with the `RANGE` terminal. It is typically not a problem, but if it comes up, try parsing with the `native` parser, add comments or sequences to disambiguate. EBNF does not have beginning of line checks as all whitespace is treated the same, so the common practice of identifying each rule inherently leads to such ambiguity.
106
+
Note that the grammar includes an optional `[number]` in front of rule names, which can conflict with the `RANGE` terminal. This is typically not a problem, but if it comes up, try parsing with the `native` parser, or add comments or sequences to disambiguate. EBNF does not have beginning-of-line checks, as all whitespace is treated the same, so the common practice of identifying each rule inherently leads to such ambiguity.
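The ambiguity described above can be illustrated with a minimal sketch. The two patterns below are simplified, hypothetical stand-ins (the names `RULE_NUMBER` and `BRACKET_SET` are assumptions for this example, not the parsers' terminal definitions); the rule-number character set is taken from the symbol pattern `([a-z] | [A-Z] | [0-9] | "_" | ".")+`.

```ruby
# Simplified, hypothetical patterns for illustration only;
# they are not the terminal definitions used by the parsers.
RULE_NUMBER = /\A\[[a-zA-Z0-9_.]+\]\z/  # e.g. [1], [2a], [x.y]
BRACKET_SET = /\A\[[^\]]+\]\z/          # e.g. [a-z], [abc], [0-9]

grammar = <<~EBNF
  [1] digit  ::= [0-9]
  [2] letter ::= [a-z]
EBNF

# Collect every bracketed token, ignoring line structure,
# since all whitespace is treated the same.
bracketed = grammar.scan(/\[[^\]]+\]/)

# Tokens that fit both patterns cannot be classified by shape alone.
ambiguous = bracketed.select { |tok| tok.match?(RULE_NUMBER) && tok.match?(BRACKET_SET) }

puts ambiguous.inspect
```

In this sketch `[1]` and `[2]` fit both shapes, while `[0-9]` and `[a-z]` only fit the bracket-set shape because of the `-`. A rule body ending in something like `[abc]` followed by an unnumbered rule is where the two readings would collide.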
108
107
109
108
The character set for EBNF is UTF-8.
110
109
@@ -116,7 +115,7 @@ which can also be proceeded by an optional number enclosed in square brackets to
116
115
117
116
[1] symbol ::= expression
118
117
119
-
(Note, this can introduce an ambiguity if the previous rule ends in a range or enum and the current rule has no identifier. In this case, enclosing `expression` within parentheses, or adding intervening comments can resolve the ambiguity.)
118
+
(Note, this introduces an ambiguity if the previous rule ends in a range or enum and the current rule has no number. The parsers dynamically determine the terminal rules for the `LHS` (the identifier, symbol, and `::=`) and `RANGE`.)
120
119
121
120
Symbols are written in CAPITAL CASE if they are the start symbol of a regular language (terminals); otherwise they are treated as non-terminal rules. Literal strings are quoted.
122
121
@@ -134,7 +133,7 @@ Within the expression on the right-hand side of a rule, the following expression
134
133
<tr><td><code>[^abc], [^#xN#xN#xN]</code></td>
135
134
<td>matches any UTF-8 R\_CHAR or HEX with a value not among the characters given. The last component may be '-'. Enumerations and ranges of excluded values may be mixed in one set of brackets.</td></tr>
136
135
<tr><td><code>"string"</code></td>
137
-
<td>matches a literal string matching that given inside the double quotes.</td></tr>
136
+
<td>matches, case-insensitively, the literal string given inside the double quotes.</td></tr>
138
137
<tr><td><code>'string'</code></td>
139
138
<td>matches a literal string matching that given inside the single quotes.</td></tr>
140
139
<tr><td><code>A (B | C)</code></td>
@@ -158,7 +157,8 @@ Within the expression on the right-hand side of a rule, the following expression
158
157
</table>
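The difference between the two string-literal rows in the table can be sketched as follows. This is a minimal illustration, assuming that single-quoted strings match exactly as written; `literal_regexp` is a hypothetical helper and not part of the gem's API.

```ruby
# Hypothetical helper, not part of the ebnf gem's API.
# Builds a Regexp that matches a whole string against a quoted EBNF literal.
def literal_regexp(quoted)
  case quoted
  when /\A"(.*)"\z/m
    # Double quotes: compare without regard to case.
    Regexp.new("\\A#{Regexp.escape(Regexp.last_match(1))}\\z", Regexp::IGNORECASE)
  when /\A'(.*)'\z/m
    # Single quotes: assumed here to compare exactly.
    Regexp.new("\\A#{Regexp.escape(Regexp.last_match(1))}\\z")
  else
    raise ArgumentError, "not a quoted literal: #{quoted.inspect}"
  end
end

puts literal_regexp('"select"').match?("SELECT")
puts literal_regexp("'select'").match?("SELECT")
```

Under these assumptions the first call reports a match and the second does not.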
159
158
160
159
* Comments include `//` and `#` through end of line (other than hex character), and `/* ... */` or `(* ... *)`, which may cross lines.
161
-
* All rules **MAY** start with an identifier, contained within square brackets. For example `[1] rule`, where the value within the brackets is a symbol `([a-z] | [A-Z] | [0-9] | "_" | ".")+`
160
+
* All rules **MAY** start with a number, contained within square brackets. For example `[1] rule`, where the value within the brackets is a symbol `([a-z] | [A-Z] | [0-9] | "_" | ".")+`, which is not retained after parsing.
161
+
* Symbols **MAY** be enclosed in angle brackets `<` and `>`, which are dropped when parsing.
162
162
* `@terminals` causes following rules to be treated as terminals. Any terminal which is all upper-case (e.g., `TERMINAL`), or any rules with expressions that match characters (`#xN`, `[a-z]`, `[^a-z]`, `[abc]`, `[^abc]`, `"string"`, `'string'`, or `A - B`), are also treated as terminals.
163
163
* `@pass` defines the expression used to detect whitespace, which is removed in processing.
164
164
* No support for `wfc` (well-formedness constraint) or `vc` (validity constraint).
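As a rough illustration of the items above on rule numbers and angle brackets, the sketch below extracts a rule name from a left-hand side while discarding the optional `[number]` and any enclosing `<` `>`. The `SYMBOL` and `LHS` patterns and the `rule_name` helper are hypothetical and simplified; they are not the parsers' own definitions.

```ruby
# Hypothetical, simplified patterns; not the gem's terminal definitions.
SYMBOL = /[a-zA-Z0-9_.]+/
LHS = /\A\s*(?:\[(?<number>#{SYMBOL})\]\s*)?(?:<(?<bracketed>#{SYMBOL})>|(?<bare>#{SYMBOL}))\s*::=/

# Returns the rule name, dropping the optional number and angle brackets,
# or nil when the line does not start with a rule's left-hand side.
def rule_name(line)
  m = LHS.match(line)
  return nil unless m

  m[:bracketed] || m[:bare]
end

puts rule_name("[1] <ebnf> ::= (declaration | rule)*")
puts rule_name("ebnf ::= (declaration | rule)*")
```

Both calls yield the same name, which mirrors the point that neither the number nor the angle brackets survive parsing.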
@@ -177,7 +177,7 @@ Intermediate representations of the grammar may be serialized to Lisp-like [S-Ex
177
177
178
178
is serialized as
179
179
180
-
(rule ebnf "1" (star (alt declaration rule)))
180
+
(rule ebnf (star (alt declaration rule)))
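To make the shape of this serialization concrete, here is a minimal sketch that renders a nested Ruby array as a Lisp-like string. The `sexp_string` helper and the array representation are assumptions made for this example; they are not the library's serializer.

```ruby
# Hypothetical helper for illustration; not the library's own serializer.
# Arrays become parenthesized lists, everything else is written via to_s.
def sexp_string(node)
  if node.is_a?(Array)
    "(#{node.map { |child| sexp_string(child) }.join(' ')})"
  else
    node.to_s
  end
end

rule = [:rule, :ebnf, [:star, [:alt, :declaration, :rule]]]
puts sexp_string(rule)
```

With the array shown, the output has the same form as the serialized rule above, with no rule number in it.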
181
181
182
182
Different components of an EBNF rule expression are transformed into their own operator: