dryruby
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
Lines changed: 2 additions & 2 deletions
diff --git a/Gemfile b/Gemfile
Lines changed: 1 addition & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 11 additions & 11 deletions
diff --git a/VERSION b/VERSION
Lines changed: 1 addition & 1 deletion
diff --git a/bin/ebnf b/bin/ebnf
Lines changed: 6 additions & 1 deletion
diff --git a/ebnf.gemspec b/ebnf.gemspec
Lines changed: 1 addition & 0 deletions
diff --git a/etc/ebnf.ebnf b/etc/ebnf.ebnf
Lines changed: 6 additions & 5 deletions
diff --git a/etc/ebnf.html b/etc/ebnf.html
Lines changed: 8 additions & 2 deletions
diff --git a/etc/ebnf.ll1.rb b/etc/ebnf.ll1.rb
Lines changed: 1 addition & 1 deletion
diff --git a/etc/ebnf.ll1.sxp b/etc/ebnf.ll1.sxp
Lines changed: 3 additions & 5 deletions
@@ -19,7 +19,7 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        ruby: ['3.0', 3.1, 3.2, ruby-head, jruby]
+        ruby: ['3.0', 3.1, 3.2, 3.3, ruby-head, jruby]
     steps:
       - name: Clone repository
         uses: actions/checkout@v3
@@ -33,6 +33,6 @@ jobs:
         run: ruby --version; bundle exec rspec spec || $ALLOW_FAILURES
       - name: Coveralls GitHub Action
         uses: coverallsapp/github-action@v2
-        if: "matrix.ruby == '3.2'"
+        if: "matrix.ruby == '3.3'"
         with:
           github-token: ${{ secrets.GITHUB_TOKEN }}
@@ -13,6 +13,7 @@ group :development do
   gem "redcarpet",      platforms: :mri
   gem "rocco",          platforms: :mri
   gem "pygmentize",     platforms: :mri
+  gem 'getoptlong'
 end
 
 group :development, :test do
 
@@ -26,10 +26,9 @@ As LL(1) grammars operate using `alt` and `seq` primitives, allowing for a match
 * Transform `a ::= b+` into `a ::= b b*`
 * Transform `a ::= b*` into `a ::= _empty | (b a)`
 * Transform `a ::= op1 (op2)` into two rules:
-  ```
-  a     ::= op1 _a_1
-  _a_1_ ::= op2
-  ```
+
+        a     ::= op1 _a_1
+        _a_1_ ::= op2
 
 Of note in this implementation is that the tokenizer and parser are streaming, so that they can process inputs of arbitrary size.
 
@@ -75,7 +74,7 @@ Generate formatted grammar using HTML (requires [Haml][Haml] gem):
 
 ### Parsing an ISO/IEC 14977 Grammar
 
-The EBNF gem can also parse [ISO/EIC 14977] Grammars (ISOEBNF) to [S-Expressions][S-Expression].
+The EBNF gem can also parse [ISO/IEC 14977][] Grammars (ISOEBNF) to [S-Expressions][S-Expression].
 
     grammar = EBNF.parse(File.open('./etc/iso-ebnf.isoebnf'), format: :isoebnf)
 
@@ -96,15 +95,15 @@ The {EBNF::Writer} class can be used to write parsed grammars out, either as for
 The formatted HTML results are designed to be appropriate for including in specifications.
 
 ### Parser Errors
-On a parsing failure, and exception is raised with information that may be useful in determining the source of the error.
+On a parsing failure, an exception is raised with information that may be useful in determining the source of the error.
 
 ## EBNF Grammar
 The [EBNF][] variant used here is based on [W3C](https://w3.org/) [EBNF][]
 (see [EBNF grammar](https://dryruby.github.io/ebnf/etc/ebnf.ebnf))
 as defined in the
 [XML 1.0 recommendation](https://www.w3.org/TR/REC-xml/), with minor extensions:
 
-Note that the grammar includes an optional `[identifer]` in front of rule names, which can be in conflict with the `RANGE` terminal. It is typically not a problem, but if it comes up, try parsing with the `native` parser,  add comments or sequences to disambiguate. EBNF does not have beginning of line checks as all whitespace is treated the same, so the common practice of identifying each rule inherently leads to such ambiguity.
+Note that the grammar includes an optional `[number]` in front of rule names, which can conflict with the `RANGE` terminal. This is typically not a problem, but if it comes up, try parsing with the `native` parser, or add comments or sequences to disambiguate. EBNF has no beginning-of-line checks, as all whitespace is treated the same, so the common practice of identifying each rule inherently leads to such ambiguity.
 
 The character set for EBNF is UTF-8.
 
@@ -116,7 +115,7 @@ which can also be proceeded by an optional number enclosed in square brackets to
 
     [1] symbol ::= expression
 
-(Note, this can introduce an ambiguity if the previous rule ends in a range or enum and the current rule has no identifier. In this case, enclosing `expression` within parentheses, or adding intervening comments can resolve the ambiguity.)
+(Note, this introduces an ambiguity if the previous rule ends in a range or enum and the current rule has no number. The parsers dynamically determine the terminal rules for the `LHS` (the identifier, symbol, and `::=`) and `RANGE`.)
 
 Symbols are written in CAPITAL CASE if they are the start symbol of a regular language (terminals), otherwise with they are treated as non-terminal rules. Literal strings are quoted.
 
@@ -134,7 +133,7 @@ Within the expression on the right-hand side of a rule, the following expression
   <tr><td><code>[^abc], [^#xN#xN#xN]</code></td>
     <td>matches any UTF-8 R\_CHAR or HEX with a value not among the characters given. The last component may be '-'. Enumerations and ranges of excluded values may be mixed in one set of brackets.</td></tr>
   <tr><td><code>"string"</code></td>
-    <td>matches a literal string matching that given inside the double quotes.</td></tr>
+    <td>matches, case-insensitively, a literal string matching that given inside the double quotes.</td></tr>
   <tr><td><code>'string'</code></td>
     <td>matches a literal string matching that given inside the single quotes.</td></tr>
   <tr><td><code>A (B | C)</code></td>
@@ -158,7 +157,8 @@ Within the expression on the right-hand side of a rule, the following expression
 </table>
 
 * Comments include `//` and `#` through end of line (other than hex character) and `/* ... */ (* ... *) which may cross lines`
-* All rules **MAY** start with an identifier, contained within square brackets. For example `[1] rule`, where the value within the brackets is a symbol `([a-z] | [A-Z] | [0-9] | "_" | ".")+`
+* All rules **MAY** start with a number, contained within square brackets. For example `[1] rule`, where the value within the brackets is a symbol `([a-z] | [A-Z] | [0-9] | "_" | ".")+`, which is not retained after parsing.
+* Symbols **MAY** be enclosed in angle brackets `<` and `>`, which are dropped when parsing.
 * `@terminals` causes following rules to be treated as terminals. Any terminal which is all upper-case (eg`TERMINAL`), or any rules with expressions that match characters (`#xN`, `[a-z]`, `[^a-z]`, `[abc]`, `[^abc]`, `"string"`, `'string'`, or `A - B`), are also treated as terminals.
 * `@pass` defines the expression used to detect whitespace, which is removed in processing.
 * No support for `wfc` (well-formedness constraint) or `vc` (validity constraint).
@@ -177,7 +177,7 @@ Intermediate representations of the grammar may be serialized to Lisp-like [S-Ex
 
 is serialized as
 
-    (rule ebnf "1" (star (alt declaration rule)))
+    (rule ebnf (star (alt declaration rule)))
 
 Different components of an EBNF rule expression are transformed into their own operator:
 
 
@@ -1 +1 @@
-2.5.0
+2.6.0
@@ -9,6 +9,7 @@ $:.unshift(File.expand_path(File.join(File.dirname(__FILE__), "..", 'lib')))
 require 'rubygems'
 require 'getoptlong'
 require 'ebnf'
+require 'rdf/spec'
 
 options = {
   output_format: :sxp,
@@ -86,7 +87,11 @@ end
 
 input = File.open(ARGV[0]) if ARGV[0]
 
-ebnf = EBNF.parse(input || STDIN, **options)
+logger = Logger.new(STDERR)
+logger.level = options[:level] || Logger::ERROR
+logger.formatter = lambda {|severity, datetime, progname, msg| "%5s %s\n" % [severity, msg]}
+
+ebnf = EBNF.parse(input || STDIN, logger: logger, **options)
 ebnf.make_bnf if options[:bnf] || options[:ll1]
 ebnf.make_peg if options[:peg]
 if options[:ll1]
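
Review note: the logger added in the hunk above can be exercised in isolation. The following is a minimal sketch, assuming only Ruby's standard `Logger` and `StringIO`; the helper name `build_cli_logger` is hypothetical and is not part of `bin/ebnf`. It mirrors the formatter and the `Logger::ERROR` default from the hunk, but writes to an arbitrary IO so the output can be inspected.

```ruby
require 'logger'
require 'stringio'

# Hypothetical helper: builds a logger configured like the one in the hunk,
# but against any IO object instead of STDERR.
def build_cli_logger(io, level = Logger::ERROR)
  logger = Logger.new(io)
  logger.level = level
  # Same formatter as the hunk: right-aligned severity, then the message.
  logger.formatter = lambda { |severity, _datetime, _progname, msg| "%5s %s\n" % [severity, msg] }
  logger
end

buffer = StringIO.new
logger = build_cli_logger(buffer)
logger.warn("skipped")    # below the ERROR threshold, so nothing is written
logger.error("bad rule")  # written as "ERROR bad rule"
print buffer.string
```

With the default level only errors and above reach the output; passing a lower level (as `options[:level]` does in the hunk) lets warnings through as well.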
 
@@ -35,6 +35,7 @@ Gem::Specification.new do |gem|
   gem.add_runtime_dependency     'rdf',             '~> 3.3' # Required by sxp
   gem.add_runtime_dependency     'htmlentities',    '~> 4.3'
   gem.add_runtime_dependency     'unicode-types',   '~> 1.8'
+  gem.add_runtime_dependency     'base64',          '~> 0.2'
   gem.add_development_dependency 'amazing_print',   '~> 1.4'
   gem.add_development_dependency 'rdf-spec',        '~> 3.3'
   gem.add_development_dependency 'rdf-turtle',      '~> 3.3'
 
@@ -5,9 +5,8 @@
 
     # Use the LHS terminal to match the identifier, rule name and assignment due to
     # confusion between the identifier and RANGE.
-    # Note, for grammars not using identifiers, it is still possible to confuse
-    # a rule ending with a range the next rule, as it may be interpreted as an identifier.
-    # In such case, best to enclose the rule in '()'.
+    # The PEG parser has special rules for matching LHS and RANGE
+    # so that RANGE is not confused with LHS.
     [3] rule        ::= LHS expression
 
     [4] expression  ::= alt
@@ -34,11 +33,13 @@
 
     [11] LHS        ::= ('[' SYMBOL ']' ' '+)? SYMBOL ' '* '::='
 
-    [12] SYMBOL     ::= ([a-z] | [A-Z] | [0-9] | '_' | '.')+
+    [12] SYMBOL     ::= '<' O_SYMBOL '>' | O_SYMBOL
+
+    [12a] O_SYMBOL  ::= ([a-z] | [A-Z] | [0-9] | '_' | '.')+
 
     [13] HEX        ::= '#x' ([a-f] | [A-F] | [0-9])+
 
-    [14] RANGE      ::= '[' ((R_CHAR '-' R_CHAR) | (HEX '-' HEX) | R_CHAR | HEX)+ '-'? ']' - LHS
+    [14] RANGE      ::= '[' ((R_CHAR '-' R_CHAR) | (HEX '-' HEX) | R_CHAR | HEX)+ '-'? ']'
 
     [15] O_RANGE    ::= '[^' ((R_CHAR '-' R_CHAR) | (HEX '-' HEX) | R_CHAR | HEX)+ '-'? ']'
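
Review note: the new `SYMBOL` and `O_SYMBOL` rules above can be approximated with plain regular expressions. This is an illustrative sketch only, not the gem's implementation; the module `SymbolSketch` and its `bare` method are hypothetical names. It shows the intended behavior that optional angle brackets are accepted and dropped.

```ruby
# Hypothetical sketch of rules [12] and [12a] as regular expressions.
module SymbolSketch
  # [12a] O_SYMBOL ::= ([a-z] | [A-Z] | [0-9] | '_' | '.')+
  O_SYMBOL = /[a-zA-Z0-9_.]+/
  # [12] SYMBOL ::= '<' O_SYMBOL '>' | O_SYMBOL
  SYMBOL = /\A(?:<(#{O_SYMBOL})>|(#{O_SYMBOL}))\z/

  # Returns the symbol name without angle brackets,
  # or nil when the text does not match SYMBOL.
  def self.bare(text)
    match = SYMBOL.match(text)
    match && (match[1] || match[2])
  end
end

p SymbolSketch.bare("<rule>")
p SymbolSketch.bare("rule")
p SymbolSketch.bare("<rule")
```

Both the bracketed and the bare form yield the same name, while an unbalanced bracket is rejected.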
 
 
@@ -1,4 +1,4 @@
-<!-- Generated with ebnf version 2.4.0. See https://github.com/dryruby/ebnf. -->
+<!-- Generated with ebnf version 2.5.0. See https://github.com/dryruby/ebnf. -->
 <table class="grammar">
   <tbody id="grammar-productions" class="ebnf">
     <tr id="grammar-production-ebnf">
@@ -77,6 +77,12 @@
       <td>[12]</td>
       <td><code>SYMBOL</code></td>
       <td>::=</td>
+      <td><code class="grammar-paren">(</code>'<code class="grammar-literal">&lt;</code>' <a href="#grammar-production-O_SYMBOL">O_SYMBOL</a> '<code class="grammar-literal">&gt;</code>'<code class="grammar-paren">)</code> <code class="grammar-alt">|</code> <a href="#grammar-production-O_SYMBOL">O_SYMBOL</a></td>
+    </tr>
+    <tr id="grammar-production-O_SYMBOL">
+      <td>[12a]</td>
+      <td><code>O_SYMBOL</code></td>
+      <td>::=</td>
       <td><code class="grammar-paren">(</code><code class="grammar-brac">[</code><code class="grammar-literal">a-z</code><code class="grammar-brac">]</code> <code class="grammar-alt">|</code> <code class="grammar-brac">[</code><code class="grammar-literal">A-Z</code><code class="grammar-brac">]</code> <code class="grammar-alt">|</code> <code class="grammar-brac">[</code><code class="grammar-literal">0-9</code><code class="grammar-brac">]</code> <code class="grammar-alt">|</code> '<code class="grammar-literal">_</code>' <code class="grammar-alt">|</code> '<code class="grammar-literal">.</code>'<code class="grammar-paren">)</code><code class="grammar-plus">+</code></td>
     </tr>
     <tr id="grammar-production-HEX">
@@ -89,7 +95,7 @@
       <td>[14]</td>
       <td><code>RANGE</code></td>
       <td>::=</td>
-      <td>'<code class="grammar-literal">[</code>' <code class="grammar-paren">(</code><code class="grammar-paren">(</code><a href="#grammar-production-R_CHAR">R_CHAR</a> '<code class="grammar-literal">-</code>' <a href="#grammar-production-R_CHAR">R_CHAR</a><code class="grammar-paren">)</code> <code class="grammar-alt">|</code> <code class="grammar-paren">(</code><a href="#grammar-production-HEX">HEX</a> '<code class="grammar-literal">-</code>' <a href="#grammar-production-HEX">HEX</a><code class="grammar-paren">)</code> <code class="grammar-alt">|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a> <code class="grammar-alt">|</code> <a href="#grammar-production-HEX">HEX</a><code class="grammar-paren">)</code><code class="grammar-plus">+</code> '<code class="grammar-literal">-</code>'<code class="grammar-opt">?</code> <code class="grammar-paren">(</code>'<code class="grammar-literal">]</code>' <code class="grammar-diff">-</code> <a href="#grammar-production-LHS">LHS</a><code class="grammar-paren">)</code></td>
+      <td>'<code class="grammar-literal">[</code>' <code class="grammar-paren">(</code><code class="grammar-paren">(</code><a href="#grammar-production-R_CHAR">R_CHAR</a> '<code class="grammar-literal">-</code>' <a href="#grammar-production-R_CHAR">R_CHAR</a><code class="grammar-paren">)</code> <code class="grammar-alt">|</code> <code class="grammar-paren">(</code><a href="#grammar-production-HEX">HEX</a> '<code class="grammar-literal">-</code>' <a href="#grammar-production-HEX">HEX</a><code class="grammar-paren">)</code> <code class="grammar-alt">|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a> <code class="grammar-alt">|</code> <a href="#grammar-production-HEX">HEX</a><code class="grammar-paren">)</code><code class="grammar-plus">+</code> '<code class="grammar-literal">-</code>'<code class="grammar-opt">?</code> '<code class="grammar-literal">]</code>'</td>
     </tr>
     <tr id="grammar-production-O_RANGE">
       <td>[15]</td>
 
@@ -1,4 +1,4 @@
-# This file is automatically generated by ebnf version 2.4.0
+# This file is automatically generated by ebnf version 2.5.0
 # Derived from etc/ebnf.ebnf
 module Meta
   START = :ebnf
 
@@ -100,13 +100,11 @@
       (seq '@pass' expression))
      (terminals _terminals (seq))
      (terminal LHS "11" (seq (opt (seq '[' SYMBOL ']' (plus ' '))) SYMBOL (star ' ') '::='))
-     (terminal SYMBOL "12" (plus (alt (range "a-z") (range "A-Z") (range "0-9") '_' '.')))
+     (terminal SYMBOL "12" (alt (seq '<' O_SYMBOL '>') O_SYMBOL))
+     (terminal O_SYMBOL "12a" (plus (alt (range "a-z") (range "A-Z") (range "0-9") '_' '.')))
      (terminal HEX "13" (seq '#x' (plus (alt (range "a-f") (range "A-F") (range "0-9")))))
      (terminal RANGE "14"
-      (seq '['
-       (plus (alt (seq R_CHAR '-' R_CHAR) (seq HEX '-' HEX) R_CHAR HEX))
-       (opt '-')
-       (diff ']' LHS)) )
+      (seq '[' (plus (alt (seq R_CHAR '-' R_CHAR) (seq HEX '-' HEX) R_CHAR HEX)) (opt '-') ']'))
      (terminal O_RANGE "15"
       (seq '[^' (plus (alt (seq R_CHAR '-' R_CHAR) (seq HEX '-' HEX) R_CHAR HEX)) (opt '-') ']'))
      (terminal STRING1 "16" (seq '"' (star (diff CHAR '"')) '"'))