How can I split-by-hyphens, but not remove the hyphens?

Asked by
BlueTaslem 18071 Moderation Voter Administrator Community Moderator Super Administrator
9 years ago

I want to turn a string like "hello-world" into a list of strings: {"hello-", "world"}

I can easily split by hyphens:

local r = {}
for word in text:gmatch("[^-]+") do
    table.insert(r, word)
end

and I can approximately put it together by assuming each word was followed by a hyphen:

for i, word in ipairs(r) do
    if i < #r then
        -- append the hyphen to the stored entry, not to the loop variable
        r[i] = word .. "-"
    end
end

but that is assuming that there is precisely one hyphen after each word and that the string doesn't start or end with hyphens.

I was considering something like this:

local r = {}
for word in text:gmatch("%-|[^-]+") do
    table.insert(r, word)
end

but Lua string patterns don't support "either" (what I wanted from the |), so that pattern doesn't work.
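
To make the failure concrete, here is a small sketch of what that pattern seems to do in practice. The `collect` helper and the sample strings are made up for illustration; the point is only that `|` is read as an ordinary character.

```lua
-- "|" has no special meaning in a Lua pattern, so this pattern means:
-- a literal "-", then a literal "|", then one or more non-hyphen characters.
local pattern = "%-|[^-]+"

-- collect is a hypothetical helper that gathers every gmatch result.
local function collect(text)
    local r = {}
    for word in text:gmatch(pattern) do
        table.insert(r, word)
    end
    return r
end

local none = collect("hello-world")  -- the text contains no "-|", so nothing matches
local literal = collect("a-|b")      -- only the literal sequence "-|b" matches
print(#none, literal[1])
```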

Is there any way to do this without going character-by-character?


More examples:

"--a--b---c" --> {"--a--", "b---", "c"} or {"--", "a--", "b---", "c"} -- I don't care which

"hello-there-person" --> {"hello-", "there-", "person"}

"hello" --> {"hello"}

2 answers

Answered by
Unclear 1776 Moderation Voter
9 years ago

The string pattern [^-]*-* will do the trick, sort of. At the end it can also match an empty string, but that is easily filtered out with a simple # check.

This fulfills the examples given.

-- text = "hello-world" --> "hello-", "world"
-- text = "--a--b---c" --> "--", "a--", "b---", "c"
-- text = "hello-there-person" --> "hello-", "there-", "person"
-- text = "hello" --> "hello"
text = "--a--b---c"
for word in text:gmatch("[^-]*-*") do
    if #word > 0 then
        print(word)
    end
end

While it is true that there is no support for "either", so %-|[^-]+ is not possible, you can sidestep that in this particular case with the * operator, which matches 0 or more.

The strategy is to match a run of non-hyphen characters with [^-]* and then any hyphens that follow it with -*.
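
If a table is wanted rather than printed output, the same strategy could be wrapped in a function. This is only a sketch: `splitKeepHyphens` is a name invented here, and it uses the pattern from this answer unchanged.

```lua
-- splitKeepHyphens is a hypothetical helper name, not part of the original answer.
local function splitKeepHyphens(text)
    local pieces = {}
    -- Each match is a run of non-hyphens followed by a run of hyphens;
    -- either run may be empty.
    for piece in text:gmatch("[^-]*-*") do
        if #piece > 0 then  -- skip the empty match that can occur at the end
            table.insert(pieces, piece)
        end
    end
    return pieces
end

local parts = splitKeepHyphens("--a--b---c")
print(table.concat(parts, " | "))
```

Because both runs are optional, a string made only of hyphens still comes back as a single piece, and an empty string gives an empty table.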

Any idea what the rule is that lets you use `-*` instead of `%-*`? Is it because it's preceded by a `*`? BlueTaslem 18071 — 9y
Not entirely sure. It may be that * takes precedence over -, and so -* is grouped together as one set before - is registered as a modifier, because -*, -+, and -? still work. Unclear 1776 — 9y
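
As far as the reference manual describes it, `-` only acts as the lazy repetition operator when it directly follows a single character class. In `[^-]*-*` the second `-` comes right after a `*`, so it starts a new pattern item and appears to be read as a literal hyphen. That is an observation about how the matcher behaves rather than something the manual promises (it lists `-` among the magic characters), so `%-` is the safer spelling. A quick check, with `collect` as a made-up helper:

```lua
-- collect is a hypothetical helper: it joins all non-empty matches with commas.
local function collect(text, pattern)
    local r = {}
    for word in text:gmatch(pattern) do
        if #word > 0 then
            table.insert(r, word)
        end
    end
    return table.concat(r, ",")
end

local text = "--a--b---c"
local unescaped = collect(text, "[^-]*-*")   -- pattern as written in the answer
local escaped = collect(text, "[^-]*%-*")    -- same pattern with the hyphen escaped
print(unescaped == escaped)
```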
Answered by
Validark 1580 Snack Break Moderation Voter
9 years ago

Edit: OK, I remembered I can use *, and now my method is great :)

local k = {"hello-world", "--a--b---c", "hello-there-person", "hello"}

local listOfStrings = {}

for _,d in pairs(k) do
    d:gsub("(-*[^-]+-*)", function(x)
        table.insert(listOfStrings, x)
    end)
end
k[#k+1] = "-" and k[#k+1] = "1" will make this method fall flat. Unclear 1776 — 9y
The above comment was for the proposed string pattern -*%a+-* Unclear 1776 — 9y
-*[^-]+-* is not a proper fix to this. k[#k+1] = "-" still fails. Unclear 1776 — 9y
@YonaJune That isn't relevant to the intended purpose though. He won't have a string that is just "-", and if he does, he can do a simple check. Validark 1580 — 9y
Read the original string pattern he proposed. %-|[^-]+, where | is a binary operation meaning either %-+ or [^-]+. So yes, "-" is a valid case and you failed to catch it. Unclear 1776 — 9y
Agree to disagree. Validark 1580 — 9y
Only hyphens is something I need to handle, but it's not hard to special-case that BlueTaslem 18071 — 9y
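
For reference, the special case discussed in these comments might look something like the sketch below. The `split` wrapper and the fallback rule are assumptions made for illustration, not code from either poster; the pattern is the one from this answer.

```lua
-- split is a hypothetical wrapper around the gsub approach from this answer.
local function split(text)
    local r = {}
    text:gsub("(-*[^-]+-*)", function(x)
        table.insert(r, x)
    end)
    -- Assumed special case: the pattern needs at least one non-hyphen
    -- character, so a non-empty string of only hyphens yields no matches.
    -- In that case, return the whole string as a single piece.
    if #r == 0 and #text > 0 then
        table.insert(r, text)
    end
    return r
end

print(table.concat(split("--a--b---c"), " | "))
print(table.concat(split("-"), " | "))
```

Without the fallback, `split("-")` would return an empty table, which is the failure pointed out above.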
