Regular Expressions and Onigmo, the Ruby regular expression engine
Regular expressions (regex) are powerful tools for finding and manipulating patterns in text. They are widely used in programming languages and text editors, though they are often treated as a black box. I always considered them one part programming and one part magic. The internet is full of articles about how regex are used, but very few dive deeply into how they are implemented. Today we will take a brief tour of the basic theory behind regular expressions, then delve into the implementation of Onigmo, the regular expression engine used in the Ruby programming language.
Brief Theory
I learned some of the theory behind regular expressions from reading “Engineering a Compiler” (Cooper & Torczon).
A regular expression specifies a recognizer. A recognizer is a type of finite-state automaton (FA) that focuses on either accepting or rejecting a...
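To make the accept-or-reject idea concrete, here is a minimal sketch of a hand-written recognizer in Ruby. It is only an illustration, not how Onigmo works internally. The pattern `ab*c`, the state names, and the `recognize?` method are all assumptions made up for this example.

```ruby
# Hypothetical deterministic finite automaton for the pattern ab*c,
# matched against the whole input. All names here are illustrative.
# Each state maps an input character to the next state; a missing
# entry means there is no valid transition.
TRANSITIONS = {
  start:  { "a" => :seen_a },
  seen_a: { "b" => :seen_a, "c" => :accept },
  accept: {}
}.freeze

ACCEPTING = [:accept].freeze

# Returns true if the automaton ends in an accepting state after
# consuming every character of the input, false otherwise.
def recognize?(input)
  state = :start
  input.each_char do |char|
    state = TRANSITIONS[state][char]
    return false if state.nil? # no transition available: reject
  end
  ACCEPTING.include?(state)
end

puts recognize?("abbc") # accepted: a, then two b's, then c
puts recognize?("abca") # rejected: nothing may follow the final c
```

For this one pattern, the method should agree with Ruby's own engine when the regex is anchored at both ends, for example `"abbc".match?(/\Aab*c\z/)`.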