How to use string patterns to get multiple parts of a string?

Asked by

trecept 367

6 years ago

I've tried but couldn't figure out how. Basically, here is an example string: "3r8jr83ht9h//f4j9898f439//f9438j9438//jf499f438j". The length and contents of each part are randomised; only the // between the parts is fixed. How do I get each part of the string separated by the //? I want to get the first part (3r8jr83ht9h), then the second one (f4j9898f439) and the third (f9438j9438) separately, but I can't figure out how. Thanks

1 answer

Answered by

ScriptGuider 5640

6 years ago

Edited 6 years ago

Parsing

You may have heard the term parsing before; it is the action of breaking down a string of text into desirable components for the programmer (at least in this context). Most of the functions in the string library allow you to parse text by using special character classes to filter out what you really want. Some of these functions include match, find, gmatch, etc.

In your case, however, we don't just want one instance that follows a given pattern, we want every instance that follows a given pattern. Because of this, we will have to iterate over the entire string. Luckily, one of the functions in the string library, gmatch, returns what is known as an iterator. For now, you can think of an iterator as simply a function that returns the next element of a collection each time it is called. Generic for loops are built to consume iterators. Some iterator-related functions you may already be familiar with include pairs, ipairs, and next.
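To make the idea of an iterator concrete, here is a minimal sketch that calls the function returned by gmatch by hand instead of letting a for loop do it. The sample text and the variable names are only illustrative.

```lua
-- gmatch returns a function; each call yields the next match.
local nextWord = string.gmatch("one two three", "%a+")

local first = nextWord()   -- "one"
local second = nextWord()  -- "two"
local third = nextWord()   -- "three"
local fourth = nextWord()  -- nil, because there are no matches left

print(first, second, third)
```

A generic for loop does the same thing: it keeps calling the iterator and stops as soon as the first returned value is nil.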

That said, let's begin by constructing a for loop with the gmatch iterator, and deciding what string pattern we will need to find every chunk of text between the // sequence.

-- Sample string
local sample = "3r8jr83ht9h//f4j9898f439//f9438j9438//jf499f438j"

-- The pattern has no captures, so gmatch yields the whole match each time; one variable ("chunk") is enough.
for chunk in string.gmatch(sample, "[^//]+") do
    print(chunk)
end

The code snippet above will print every chunk of text between the // marks. So... why use [^//]+ as the pattern? Let's break it down:

[ ... ], Is a set, which matches any single character listed inside it.
^, The caret symbol, is an anchor when it is the first character of a pattern: it tells the parser to match only at the beginning of the string. As the first character inside a set it means something different, as explained below.
+, Is a quantifier, which matches one or more repetitions of the item before it, as many as possible.

However, this isn't quite the full story for deriving our pattern. The specific pattern we're using is referred to as a complement, which is a set with the caret symbol as its first element: [^ ... ]. In that position the caret is not an anchor; it negates the set.

You can find this on the ROBLOX wiki on this page if you would like to know more about complements. It basically tells the program to match any character that is not in the set after the caret symbol. Here's a brief example...

If a set tells you to find [abc], then the complement of that set ( [^abc] ) tells you to find everything EXCEPT a, b, or c.

Note that a set only ever holds single characters, so the second / in [^//]+ is redundant: the pattern behaves exactly like [^/]+. It therefore splits on every single /, not only on //. That is fine for the sample string, because its chunks contain no slashes of their own.
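The two roles of ^ are easy to mix up, so here is a short sketch that contrasts them and also checks how [^//]+ compares with [^/]+. The test strings are made up for illustration.

```lua
-- Inside a set, a leading ^ negates the set (a complement).
local rest = string.match("abcdef", "[^abc]+")
print(rest) -- def

-- At the very start of a pattern, ^ anchors the match to the start of the string.
local anchoredHit = string.match("hello world", "^hello")  -- "hello"
local anchoredMiss = string.match("hello world", "^world") -- nil

-- A set holds single characters, so the doubled slash adds nothing.
local a, b = {}, {}
for chunk in string.gmatch("ab//cd/ef", "[^//]+") do a[#a + 1] = chunk end
for chunk in string.gmatch("ab//cd/ef", "[^/]+") do b[#b + 1] = chunk end
print(table.concat(a, ",")) -- ab,cd,ef
print(table.concat(b, ",")) -- ab,cd,ef
```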

Anyways, I hope I didn't make that explanation too confusing; I can have a nasty habit of doing that... Wrapping things up, it may be more convenient to create a function for parsing your text, not only so you can reuse it, but also because it lets you organize your data a bit better. Here's an example:

local function parseString(text)
    local chunks = {} -- Store chunks of text in here

    -- Insert new chunks into chunk container
    for chunk in string.gmatch(text, "[^//]+") do
        chunks[#chunks+1] = chunk
    end

    -- Return table of text chunks
    return chunks
end

local textPacket = parseString("3r8jr83ht9h//f4j9898f439//f9438j9438//jf499f438j")
print(unpack(textPacket))
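If the chunks could ever contain a single / of their own, or if empty chunks between two delimiters need to be kept, a set-based pattern is not enough. One alternative, sketched below, is to search for the exact delimiter with string.find in plain mode. This is a different technique from the gmatch approach above, and splitOn is a hypothetical helper name, not a library function.

```lua
-- Hypothetical helper: split text on an exact, possibly multi-character delimiter.
local function splitOn(text, delimiter)
    assert(delimiter ~= "", "delimiter must not be empty")
    local chunks = {}
    local start = 1
    while true do
        -- The fourth argument (true) requests a plain search, so the
        -- delimiter is not interpreted as a pattern.
        local first, last = string.find(text, delimiter, start, true)
        if not first then
            break
        end
        chunks[#chunks + 1] = string.sub(text, start, first - 1)
        start = last + 1
    end
    -- Whatever follows the final delimiter is the last chunk.
    chunks[#chunks + 1] = string.sub(text, start)
    return chunks
end

local parts = splitOn("ab/c//def////gh", "//")
print(#parts) -- 4: "ab/c", "def", "" (empty), "gh"
```

Unlike the gmatch version, this keeps the single slash inside "ab/c" and reports the empty chunk between the two consecutive delimiters.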

Hope this helps, let me know if you have any questions!

great answer, one question, if I wanted each part of the string to be a variable e.g stringpart1 = the first part before //, and then stringpart2 = the second part etc, how would I do this? trecept 367 — 6y

That can be achieved by organizing your data into something like a table, which is demonstrated in the last part of the answer. For example, you could assign stringpart1 = textPacket[1], stringpart2 = textPacket[2], etc. ScriptGuider 5640 — 6y
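As a rough sketch of that suggestion, the snippet below repeats the parseString function from the answer and then assigns the chunks to separate variables, either by index or all at once. The unpackValues fallback is an assumption added so the snippet also runs on Lua versions where unpack lives in the table library.

```lua
local function parseString(text)
    local chunks = {}
    for chunk in string.gmatch(text, "[^/]+") do
        chunks[#chunks + 1] = chunk
    end
    return chunks
end

local textPacket = parseString("3r8jr83ht9h//f4j9898f439//f9438j9438//jf499f438j")

-- Option 1: index the table directly.
local stringpart1 = textPacket[1]
local stringpart2 = textPacket[2]

-- Option 2: assign several variables in one statement.
local unpackValues = table.unpack or unpack
local part1, part2, part3, part4 = unpackValues(textPacket)

print(stringpart1, stringpart2)
print(part3, part4)
```

Keeping the values in the table is usually the more flexible choice when the number of parts is not known in advance.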

thank you! perfect trecept 367 — 6y
