
Challenge: String pattern for getting numbers from a string?

Asked by
funyun 958 Moderation Voter
9 years ago

I've seen this question asked before where people want to get numbers from a string, and this is usually the answer:

text = "12..34abcd1.234abcd"
numbers = {}

for i = 1, string.len(text) do
    local pointer = string.sub(text, i, i)

    if tonumber(pointer) then
        table.insert(numbers, tonumber(pointer))
    end
end
print("{"..table.concat(numbers, ", ").."}")

...or something like that. That gives me each digit in the string. I want to get actual numbers from the string. I don't want {1, 2, 3, 4, 1, 2, 3, 4}, I want {12, 34, 1.234}. So, this is what I have:

pattern = "%d+%.*%d*"
text = "12..34abcd1.234abcd"
numbers = {}

for number in string.gmatch(text, pattern) do
    if tonumber(number) then
        table.insert(numbers, tonumber(number))
    end
end

print("{"..table.concat(numbers, ", ").."}")

...and that gives me exactly what I want. However, I want a little challenge. The pattern allows any number of consecutive decimal points, so "3..........5" would be matched even though it isn't a number. That's why I added the check on line 6. How can I remove the check on line 6 and still be guaranteed that every match in any string can be converted to a number?

Why would you even need that check on line 6 if you know what's going to be matched will be a digit? Spongocardo 1991 — 9y
I'm not looking for digits, I'm looking for numbers. 645, 1028, 1337, 3.1415, etc. If "3...5" was in the string and the check wasn't there, the script would put "3...5" in the table of numbers. funyun 958 — 9y
Still, why would you even need the check if what's going to be matched will be a number as stated in your string pattern? Spongocardo 1991 — 9y
The pattern checks for an arbitrary amount of decimals. "3.........5" would be matched, but that's not a number. funyun 958 — 9y
Ah, I see. Spongocardo 1991 — 9y

1 answer

Answered by
BlueTaslem 18071 Moderation Voter Administrator Community Moderator Super Administrator
9 years ago

Essentially, what you're saying is that the part after the decimal should be optional.

  • *: zero or more
  • +: one or more
  • ?: zero or one -- This is the one we are interested in!

Consider this JavaScript regexp (\ replaces %):

\d+(\.\d*)?

The parentheses group \.\d* into a single unit, and the ? makes that whole unit optional (zero or one). In other words, the decimal point followed by digits becomes optional as a whole.

Lua supports ?, but only on a single character class. It does not support grouping in conjunction with any other operators, so while most regular-expression systems let us do this, Lua does not. [1]
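To see this limitation concretely, here is a minimal sketch, assuming Lua 5.1 or Roblox's Luau. The names groupedPattern and matchesGrouped are made up for illustration. It tries the regex-style pattern directly, with % in place of \:

```lua
-- A quantifier in a Lua pattern applies to a single character class only,
-- so the "?" after ")" below is treated as a literal question mark.
local groupedPattern = "%d+(%.%d*)?"

-- string.find returns nil when nothing in the text matches.
local function matchesGrouped(text)
    return string.find(text, groupedPattern) ~= nil
end

print(matchesGrouped("10.5"))  -- false: there is no literal "?" to match
print(matchesGrouped("10"))    -- false: the "." inside the group is still required
print(matchesGrouped("10.5?")) -- true: only because the text contains a "?"
```

So the group is accepted as a capture, but it never becomes optional.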

In this case, it so happens that's not a problem. The key here is that what is before and after the %. is the same -- it's %d repeated a bunch. So this will be what we want:

%d+%.?%d*

Broken apart: %d+ %.? %d*

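What does this accept? (The spaces show how each match splits across the three pieces.)

  • 10 . 5
  • 10 .
  • 1
  • 1 2
  • 12 (the same text as the previous item, split differently; it doesn't really matter which one)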
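As a quick check, here is a sketch that runs this pattern over the sample string from the question, without the tonumber guard. It assumes Lua 5.1 or Luau, and the variable names are only for illustration:

```lua
local pattern = "%d+%.?%d*"
local text = "12..34abcd1.234abcd" -- sample string from the question

local numbers = {}
for match in string.gmatch(text, pattern) do
    -- Each match is digits, at most one ".", then optional digits,
    -- so it is passed to tonumber without a nil check.
    table.insert(numbers, tonumber(match))
end

print("{" .. table.concat(numbers, ", ") .. "}")
```

The three matches are "12.", "34" and "1.234", which convert to the numbers 12, 34 and 1.234.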

Note that this does not accept .25, because the first %d uses + (though that's probably OK). At least one of the two %d parts has to use +, or else a lone . would count as a valid number.
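The trade-off can be sketched like this. The two variant patterns are hypothetical alternatives written for comparison, not something you need to adopt:

```lua
-- "+" on the leading digits (the pattern above): ".25" only matches from "25" on.
local leading = string.match(".25", "%d+%.?%d*")        -- "25"
-- "+" on the trailing digits instead (hypothetical variant): ".25" matches whole,
-- but the trailing point of "10." is left out of the match.
local trailing = string.match(".25", "%d*%.?%d+")       -- ".25"
local trailingPoint = string.match("10.", "%d*%.?%d+")  -- "10"
-- "*" on both sides (hypothetical variant): a lone "." counts as a match.
local loneDot = string.match(".", "%d*%.?%d*")          -- "."

print(leading, trailing, trailingPoint, loneDot)
```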


Note: tonumber allows a bunch of other formats for numbers, so while everything %d+%.?%d* accepts will be a number, not all valid inputs to tonumber will be matched by this pattern.
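A few examples of that gap, as a sketch assuming Lua 5.1 or Luau. The ^ and $ anchors are an addition here so that the whole string has to fit the pattern, and the sample strings are just illustrations:

```lua
-- Anchored copy of the pattern: the entire string must match.
local anchored = "^%d+%.?%d*$"
local samples = { "0x10", "1e3", ".5", "-7" }

-- Collect strings that tonumber accepts but the anchored pattern rejects.
local onlyToNumber = {}
for _, s in ipairs(samples) do
    if tonumber(s) ~= nil and string.match(s, anchored) == nil then
        table.insert(onlyToNumber, s)
    end
end

print(table.concat(onlyToNumber, ", "))
```

All four samples (hexadecimal, exponent, leading decimal point, negative sign) end up in onlyToNumber.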


  1. Lua does not implement regular expressions: it does not support all features of regular languages (like using ? and * with groups), and it supports non-regular features (%b
