How to get text from a webpage?

Asked by
I_0KI 40
6 years ago

I know that to get the HTML contents of a webpage you would do

test = game:GetService("HttpService"):GetAsync("http://example.com")
print(test)

but how would you get just the text contents? I've done it before, I just forgot how.

Well... for something like that, the website would have to serve raw text. Wiscript 622, 6y

1 answer

Answered by 6 years ago
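-- Remove anything that looks like an HTML tag. string.gsub returns the
-- resulting string and the number of substitutions made.
local function stripTags(html)
  return string.gsub(html, "<[^>]+>", "")
end

-- Requires HTTP requests to be enabled in the game's settings.
local HttpService = game:GetService("HttpService")
local html = HttpService:GetAsync("http://example.com")
local text, numberOfTagsRemoved = stripTags(html)
print(text)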

Output:






Example Domain body { background-color: #f0f0f2; margin: 0; padding: 0; font-family: "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif; } div { width: 600px; margin: 5em auto; padding: 50px; background-color: #fff; border-radius: 1em; } a:link, a:visited { color: #38488f; text-decoration: none; } @media (max-width: 700px) { body { background-color: #fff; } div { width: auto; margin: 0 auto; border-radius: 0; padding: 1em; } } Example Domain This domain is established to be used for illustrative examples in documents. You may use this domain in examples without prior coordination or asking for permission. More information...
You could also do the following before calling stripTags(), to narrow down the content returned at the end:
- Remove everything that's not in the <body>
- Remove any <style> and <script> elements (this is a bit tricky to do without a full HTML parser)
WillieTehWierdo200 966, 6y
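The two steps suggested in this comment could be sketched roughly as below. This is only a pattern-based approximation, not a real HTML parser, so it can be fooled by unusual markup (for example a literal "</script>" inside a script string). The helper names extractBody, removeElements and htmlToText are made up for this example and are not part of HttpService or the original answer. The whitespace collapsing at the end is an extra step that is not in the comment, added only to tidy the result.

```lua
-- Hypothetical helpers; plain Lua string functions only.

-- Return only what sits between <body ...> and </body>.
-- Falls back to the whole input if no <body> tag is found.
local function extractBody(html)
  local lower = html:lower() -- same length as html, used only for searching
  local _, openEnd = lower:find("<body[^>]*>")
  if not openEnd then
    return html
  end
  local closeStart = lower:find("</body%s*>", openEnd + 1)
  return html:sub(openEnd + 1, (closeStart or #html + 1) - 1)
end

-- Remove every <tag ...>...</tag> element together with its contents.
-- Lua patterns have no case-insensitive flag, so positions are found in a
-- lowercased copy and the cuts are made on the original string.
local function removeElements(html, tag)
  local lower = html:lower()
  local pieces = {}
  local pos = 1
  while true do
    local openStart = lower:find("<" .. tag .. "[%s>]", pos)
    if not openStart then break end
    local _, closeEnd = lower:find("</" .. tag .. "%s*>", openStart)
    if not closeEnd then break end -- unclosed element: leave the rest alone
    table.insert(pieces, html:sub(pos, openStart - 1))
    pos = closeEnd + 1
  end
  table.insert(pieces, html:sub(pos))
  return table.concat(pieces)
end

-- Narrow to the body, drop <style>/<script>, then strip the remaining tags
-- with the same pattern the answer uses.
local function htmlToText(html)
  local body = extractBody(html)
  body = removeElements(body, "style")
  body = removeElements(body, "script")
  local text = body:gsub("<[^>]+>", "")
  -- Extra step (assumption, not from the comment): collapse whitespace runs.
  text = text:gsub("%s+", " ")
  text = text:gsub("^ ", "")
  text = text:gsub(" $", "")
  return text
end

-- In Roblox this would be called on the fetched page, e.g.:
-- print(htmlToText(game:GetService("HttpService"):GetAsync("http://example.com")))
```

On the example.com page from the answer, this should drop the CSS rules that show up in the output above, since they live in a <style> element inside <head>.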