4

How to Split a String While Keeping Separators?

Asked by
Benbebop 1049 Moderation Voter
3 years ago

I need to do some string splitting for this text compression algorithm, but it must keep the separators.

Pretty simple, say I have a string:

eabdeeceda

I want to split the string by "e"

string.split(str, "e")

Problem is that it removes the separators.

abd, c, da

I want

e, abd, e, e, c, e, da

I've tried to think of various manual methods but they all seem to fall flat in the end.

Here's my best attempt

function splitKeep(str, separator)
    local minimum = 0
    local tbl = {}
    repeat
        local a, b = string.find(str, separator, minimum) -- find first separator
        minimum = b -- set minimum
        local inner
        if string.sub(str, b, b+(b-a)) ~= separator then -- if the next section is not the same as the separator
            local c = string.find(str, separator, minimum) -- find the next separator
            inner = string.sub(str, b, c) -- get the string inbetween the end of the first separator and the start of the second
        end
        tbl[#tbl+1] = string.sub(str, b) -- insert the separator
        tbl[#tbl+1] = inner -- insert the inside string
    until string.find(str, separator, minimum) == nil
    return tbl
end

I already know this won't work: it won't return anything before the first separator or after the last separator. It's also pretty messy.

0
I like this question. DinozCreates 1070 — 3y

1 answer

3
Answered by
imKirda 4491 Moderation Voter Community Moderator
3 years ago
Edited 3 years ago

I would use a capture (parentheses) in this case. I don't know if split alone will work here, but string.gsub is what you would use; you can Google the String Formatting page, which explains everything about it. You would put a , after each e and then call string.split(str, ','). I'm on my phone, so sorry about the lack of detail, but here is an example:

local str = 'eabdeeceda'

str = string.gsub(str, '(e)', '%1,') 

print(string.split(str, ',')) 

The parentheses around e form a capture, which lets you refer to the match in the third argument: %1 refers to whatever was matched inside the first pair of parentheses, so in this case %1 is e, as it's the only letter in the parentheses.
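
The gsub approach above depends on the temporary comma never appearing in the input. As an alternative, here is a minimal sketch that avoids temporary separators by walking the string with string.find in plain mode. The name splitKeepSeparators is hypothetical and not from the original post, and the sketch assumes the separator is a non-empty literal string rather than a pattern.

```lua
-- Hypothetical helper (not from the original post): splits `str` on a literal
-- `separator` and keeps each separator as its own entry in the result.
local function splitKeepSeparators(str, separator)
    assert(separator ~= "", "separator must not be empty")
    local pieces = {}
    local position = 1
    while true do
        -- The fourth argument (plain = true) makes find treat `separator`
        -- as literal text instead of a pattern.
        local first, last = string.find(str, separator, position, true)
        if not first then
            break
        end
        if first > position then
            -- Text between the previous separator and this one.
            pieces[#pieces + 1] = string.sub(str, position, first - 1)
        end
        pieces[#pieces + 1] = separator
        position = last + 1
    end
    if position <= #str then
        -- Trailing text after the last separator.
        pieces[#pieces + 1] = string.sub(str, position)
    end
    return pieces
end

print(table.concat(splitKeepSeparators("eabdeeceda", "e"), ", "))
-- prints: e, abd, e, e, c, e, da
```

Because it works on bytes and never inserts a marker character, this should behave the same whatever characters the input contains, although a multi-byte separator has to be passed in exactly as it is encoded in the string.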

0
There is one small problem there: the string I'm inputting could contain any or all UTF-8 characters (possibly extending to UTF-16 or UTF-32, depending on how Roblox stores characters), so using other characters as separators could mess things up. Maybe I could get away with using something like NUL (string.char(0)) as the temporary separator, but I want to make sure this works for absolutely ever Benbebop 1049 — 3y
0
Oh, then I would just replace any characters that are the same as the separator with something else while splitting, save them in a table, and put them back into the string afterwards, if you get me? Or there is probably a better method I don't know about. imKirda 4491 — 3y
0
OK, about the pattern string for gsub: if I wanted to have multiple separators (tab, space and enter in my case), how would I do that? I could never get the hang of pattern strings. Benbebop 1049 — 3y
0
The pattern would be '\t\32?' where \t is tab, \32 is the ASCII code for a space (you can just put a normal space there, but I prefer ASCII) and ? stands for enter. Look at an ASCII table to find the code for enter, as I forgot it, sorry, but I think it's \13. Make sure they are in the right order. imKirda 4491 — 3y
0
How does order affect it? The tabs, spaces and enters can appear in any order in the string. Would the pattern need to be in the same order as in the string for this method? Or does it just need to be in descending order or something? Benbebop 1049 — 3y
0
Yeah, it must be in order, like: string.gsub('hello', 'ello', '*') returns h* while string.gsub('hello', 'olle', '*') returns hello, as nothing was substituted. imKirda 4491 — 3y
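
The order rule above applies when the pattern is a sequence of characters. When the separators can appear in any order, a character class (square brackets) matches any one of the listed characters, so order inside the brackets does not matter. The sketch below illustrates the difference; the sample text is made up, and it assumes "enter" means the newline character \n (code 10) rather than the carriage return \r (code 13).

```lua
local text = "one two\tthree\nfour"

-- Sequence: only matches a tab immediately followed by a space and a newline.
local _, sequenceCount = string.gsub(text, "\t \n", "")

-- Character class: matches any single tab, space, or newline, wherever it is.
local _, classCount = string.gsub(text, "[\t \n]", "")

print(sequenceCount, classCount) -- 0 matches for the sequence, 3 for the class

-- Wrapping the class in parentheses captures whichever separator matched,
-- so %1 can reinsert it in the replacement.
local marked = string.gsub(text, "([\t \n])", "<%1>")
```

If carriage returns can also occur in the input, \r can be added to the class in the same way.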