How to find the last saved position when using "on_error"? #1534

@XvKuoMing

Suppose we have a grammar:

start: value
?value: object
        | array
        | string
        | SIGNED_NUMBER      -> number
        | "true"             -> true
        | "false"            -> false
        | "null"             -> null
array  : "[" [value ("," value)*] "]"
object : "{" [pair ("," pair)*] "}"
pair   : string ":" value
string : ESCAPED_STRING
%import common.ESCAPED_STRING
%import common.SIGNED_NUMBER
%import common.WS
%ignore WS

Now I have a string to validate:

"name": "John",
"age": 30,
"city": "New York
}

I want to write an on_error callback function that will:

  1. detect that there is a broken value after "city":
  2. give me the position of the last accepted token, so that I can, e.g., delete the broken "New York and put something else in its place

Here is my example:


from lark import Lark
from lark.reconstruct import Reconstructor
from lark import UnexpectedToken, UnexpectedCharacters

grammar = """
start: value
?value: object
        | array
        | string
        | SIGNED_NUMBER      -> number
        | "true"             -> true
        | "false"            -> false
        | "null"             -> null
array  : "[" [value ("," value)*] "]"
object : "{" [pair ("," pair)*] "}"
pair   : string ":" value
string : ESCAPED_STRING
%import common.ESCAPED_STRING
%import common.SIGNED_NUMBER
%import common.WS
%ignore WS
"""


def ignore_errors(e):
    if isinstance(e, UnexpectedCharacters):
        pass # skip
    elif isinstance(e, UnexpectedToken):
        if e.token.type == "$END":
            e.interactive_parser.accepts()  # returns the set of tokens accepted after "city":
            # I want to remove everything in the input string back to the last accepted token
    else:
        return False
    return True




p = Lark(grammar, parser="lalr", strict=False, maybe_placeholders=False)
r = p.parse("""{
    "name": "John",,
    "age": 30,#
    "city": "New York"
}""", on_error=ignore_errors)


new_json = Reconstructor(p).reconstruct(r)
print(new_json)

# print(r)


I hope I explained this in enough detail.
