I have been working on a parser, and one issue I have run into is string patterns (I am not even sure that is what they are called). I came across one in another question of mine, which contained s:gsub("%(([^()]+)%)", "substitute"), and I was given the pattern %(([^()]+)%). My current question is: how do I construct things like %(([^()]+)%)? My Lua 5.2 reference manual says that %a represents all letters, %c represents all control characters, %d represents all digits, and so on. How do I create my own patterns that look like %(([^()]+)%) and are more complex than just %w? I would appreciate any help. Thanks!
Note: I am not even sure if %(([^()]+)%) is what I want, because I don't know how it works.
The Lua site has the information you need: Documentation, More documentation, Tutorial. Since the documentation lists all of the pattern items, I'll just go over what your pattern means. The %( at the start means a literal opening parenthesis rather than the start of a capture. The next parenthesis opens a capture, meaning that whatever it matches is what string.match returns, what string.find returns as extra value(s) after the indices, and what string.gsub can use in the replacement. Captures can be referred to as %1 through %9 to create some pretty cool effects. Lastly, the part inside your capture, [^()]+, uses a negated set. The square brackets define a set that matches a single character, and the leading ^ inverts it, so [^()] matches any one character that is neither ( nor ). The + then repeats that one or more times.
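As a rough illustration of the breakdown above, the following sketch runs the pattern through string.match, string.find, and string.gsub. The sample string and the variable names are made up for this example and do not come from the question.

```lua
-- Sample input; the string itself is only an assumption for illustration.
local s = "call(foo) and (bar)"
local pattern = "%(([^()]+)%)"

-- string.match returns the capture (the text between the parentheses).
local inner = string.match(s, pattern)
print(inner)  --> foo

-- string.find returns the start and end indices, then the capture.
local first, last, cap = string.find(s, pattern)
print(first, last, cap)  --> 5  9  foo

-- In string.gsub, %1 in the replacement refers to the first capture.
local replaced, count = string.gsub(s, pattern, "[%1]")
print(replaced, count)  --> call[foo] and [bar]  2
```

Note that string.find reports the position of the whole match, parentheses included, while the capture holds only the text between them.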
Edit: The last %) is another literal parenthesis, this time a closing one, so overall your pattern matches a pair of parentheses with one or more non-parenthesis characters between them. An alternative worth looking into is %b(), but note that it matches balanced parentheses together with everything between them, nested parentheses included (e.g. it would match all of (guys()whatsup)).
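To make the difference concrete, here is a small sketch comparing %b() with the original pattern. The second test string and the variable names are assumptions chosen for the example.

```lua
local nested = "(guys()whatsup)"

-- %b() matches from an opening parenthesis to its balancing closing one.
local balanced = string.match(nested, "%b()")
print(balanced)  --> (guys()whatsup)

-- The original pattern needs at least one non-parenthesis character
-- directly between a "(" and a ")", so it finds nothing here.
local plain = string.match(nested, "%(([^()]+)%)")
print(plain)  --> nil

-- With nesting, the original pattern picks out the innermost group.
local innermost = string.match("f(a(b)c)", "%(([^()]+)%)")
print(innermost)  --> b
```

Which one is preferable probably depends on whether the parser needs the innermost group or the whole balanced expression.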
This is what I have in my notes file. I don't know if it's quite what you're looking for, but I think string captures are relevant. This is more of a comment than an answer, and it's also way above my pay grade.
http://lua-users.org/wiki/StringLibraryTutorial
Just like string.find() we can use patterns to search in strings. Patterns are covered in the PatternsTutorial. If a capture is used this can be referenced in the replacement string using the notation %capture_index, e.g.,
= string.gsub("banana", "(an)", "%1-") -- capture any occurrences of "an" and replace
ban-an-a 2
= string.gsub("banana", "a(n)", "a(%1)") -- brackets around n's which follow a's
ba(n)a(n)a 2
= string.gsub("banana", "(a)(n)", "%2%1") -- reverse any "an"s
bnanaa 2
If the replacement is a function, not a string, the arguments passed to the function are any captures that are made. If the function returns a string, the value returned is substituted back into the string.
= string.gsub("Hello Lua user", "(%w+)", print) -- print any words found
Hello Lua user 3
= string.gsub("Hello Lua user", "(%w+)", function(w) return string.len(w) end) -- replace with lengths
5 3 4 3
= string.gsub("banana", "(a)", string.upper) -- make all "a"s found uppercase
bAnAnA 3
= string.gsub("banana", "(a)(n)", function(a,b) return b..a end) -- reverse any "an"s
bnanaa 2
Pattern capture: The most commonly seen pattern capture instances could be
"(.-)", e.g. "{(.-)}" means capture any characters between the curly braces {} (lazy match, i.e. as few characters as possible)
"(.*)", e.g. "{(.*)}" means capture any characters between the curly braces {} (greedy match, i.e. as many characters as possible)
= string.gsub("The big {brown} fox jumped {over} the lazy {dog}.","{(.-)}", function(a) print(a) end )
brown
over
dog
= string.gsub("The big {brown} fox jumped {over} the lazy {dog}.","{(.*)}", function(a) print(a) end )
brown} fox jumped {over} the lazy {dog
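The same lazy versus greedy comparison can be written as a standalone script rather than an interactive session. This is only a sketch: the collect helper is a hypothetical name introduced here, and it uses string.gmatch to gather the captures instead of printing from inside string.gsub.

```lua
local s = "The big {brown} fox jumped {over} the lazy {dog}."

-- Hypothetical helper: collects every capture of a pattern into a table.
local function collect(str, pattern)
  local found = {}
  for capture in string.gmatch(str, pattern) do
    found[#found + 1] = capture
  end
  return found
end

-- ".-" takes as few characters as possible, so each pair of braces matches separately.
local lazy = collect(s, "{(.-)}")
print(table.concat(lazy, ", "))  --> brown, over, dog

-- ".*" takes as many characters as possible, so one match runs from the
-- first "{" to the last "}".
local greedy = collect(s, "{(.*)}")
print(#greedy, greedy[1])  --> 1  brown} fox jumped {over} the lazy {dog
```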