Derivative Based Extended Regular Expression Matching Supporting Intersection, Complement and Lookarounds

September 25, 2023 ยท The Ethereal ยท ๐Ÿ› arXiv.org

๐Ÿ”ฎ THE ETHEREAL: The Ethereal
Pure theory โ€” exists on a plane beyond code

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Ian Erik Varatalu, Margus Veanes, Juhan-Peep Ernits arXiv ID 2309.14401 Category cs.FL: Formal Languages Cross-listed cs.PL Citations 6 Venue arXiv.org Last Checked 1 month ago
Abstract
Regular expressions are widely used in software. Various regular expression engines support different combinations of extensions to classical regular constructs such as Kleene star, concatenation, nondeterministic choice (union in terms of match semantics). The extensions include e.g. anchors, lookarounds, counters, backreferences. The properties of combinations of such extensions have been subject of active recent research. In the current paper we present a symbolic derivatives based approach to finding matches to regular expressions that, in addition to the classical regular constructs, also support complement, intersection and lookarounds (both negative and positive lookaheads and lookbacks). The theory of computing symbolic derivatives and determining nullability given an input string is presented that shows that such a combination of extensions yields a match semantics that corresponds to an effective Boolean algebra, which in turn opens up possibilities of applying various Boolean logic rewrite rules to optimize the search for matches. In addition to the theoretical framework we present an implementation of the combination of extensions to demonstrate the efficacy of the approach accompanied with practical examples.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

๐Ÿ“œ Similar Papers

In the same crypt โ€” Formal Languages