Updated README.

2025-07-09 01:02:08 +00:00 · 2015-11-26 01:02:09 -05:00 · 37dfcbf860
commit 37dfcbf860
parent 33978023cf
1 changed file with 36 additions and 5 deletions
--- a/README.md
+++ b/README.md
@@ -32,10 +32,11 @@ int main(void) {
    // (2) Make a parser
    auto syntax = R"(
        # Grammar for Calculator...
-        Additive  <- Multitive '+' Additive / Multitive
-        Multitive <- Primary '*' Multitive / Primary
-        Primary   <- '(' Additive ')' / Number
-        Number    <- [0-9]+
+        Additive    <- Multitive '+' Additive / Multitive
+        Multitive   <- Primary '*' Multitive / Primary
+        Primary     <- '(' Additive ')' / Number
+        Number      <- [0-9]+
+        %whitespace <- [ \t]*
    )";

    parser parser(syntax);
@@ -67,7 +68,7 @@ int main(void) {
    parser.packrat_parsing(); // Enable packrat parsing.

    int val;
-    parser.parse("(1+2)*3", val);
+    parser.parse(" (1 + 2) * 3 ", val);

    assert(val == 9);
 }
@@ -200,6 +201,36 @@ parser["RULE"].after = [](any& dt) {
 };
 ```

+Ignoring Whitespaces
+--------------------
+
+As the first example shows, whitespace between tokens can be ignored automatically with the `%whitespace` rule.
+
+The `%whitespace` rule is applied in the following three cases:
+
+  * trailing spaces on tokens
+  * leading spaces on text
+  * trailing spaces on literal strings in rules
+
+These are valid tokens:
+
+```
+KEYWORD  <- 'keyword'
+WORD     <-  [a-zA-Z0-9] [a-zA-Z0-9-_]*        # no other rule is referenced
+IDENT    <-  < IDENT_START_CHAR IDENT_CHAR* >  # token boundary operator is used
+```
+
+The following grammar accepts ` one, "two three", four `.
+
+```
+ROOT         <- ITEM (',' ITEM)*
+ITEM         <- WORD / PHRASE
+WORD         <- [a-z]+
+PHRASE       <- '"' (!'"' .)* '"'
+
+%whitespace  <-  [ \t\r\n]*
+```
+
 Simple interface
 ----------------