I have written a program which retrieves information using the HTTP API and makes it into a long string. I need a way of parsing the string so that every substring which lies between <title>.....</title> is saved into a table. I believe the answer is to be found in using string.gmatch or string.gsub but I can't quite figure it out. Any help would be much appreciated!
Extracting data between tags from a string - Help!
Started by rich73, Aug 16 2012 07:14 AM
3 replies to this topic
#1
Posted 16 August 2012 - 07:14 AM
#2
Posted 16 August 2012 - 07:18 AM
I have never used those commands, but I am a huge believer in serialized tables. Take a look at the documentation; I think it works better than making a long string and splitting it. I admit that I really should learn those commands if I ever plan on making an OS, so that it can separate entered commands from their parameters, each step in a path, etc., but maybe I will get to that.
#3
Posted 16 August 2012 - 03:23 PM
To extract text from tags one can do something like this:
tag = "<title>foobar</title>" -- your incoming document
pattern = "<title>(.*)</title>" -- a Lua pattern (similar to a regular expression, but simpler); it matches any string that has this shape
str = string.match(tag, pattern) -- str now holds "foobar"
As you can see, tag and pattern are similar to one another. The part that differs, "(.*)", only says: match any character (.) zero or more times (*) and return what was matched (the parentheses).
If you want to learn more about it, read this.
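To get every title into a table, as the original question asks, the same idea can be combined with string.gmatch. This is only a minimal sketch: it assumes the tags are literally <title> and </title> with no attributes, and the sample string and the function name extractTitles are made up for illustration. With several tags in one string, the greedy "(.*)" would run from the first <title> to the last </title>, so the sketch uses the non-greedy "(.-)" instead:

```lua
-- Hypothetical sample input; in the real program this would be the long
-- string built from the HTTP response.
local document = "<title>First</title> some text <title>Second</title>\n<title>Third</title>"

-- Collect every substring found between <title> and </title>.
-- "(.-)" captures as few characters as possible, so each match stops
-- at the nearest "</title>" instead of the last one in the string.
local function extractTitles(text)
  local titles = {}
  for title in string.gmatch(text, "<title>(.-)</title>") do
    titles[#titles + 1] = title
  end
  return titles
end

local titles = extractTitles(document)
for i, title in ipairs(titles) do
  print(i, title)
end
```

Each pass of the loop hands back one capture, so the table ends up with one entry per <title> pair, in the order they appear in the string.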
#4
Posted 16 August 2012 - 11:30 PM
Thank you! I had researched it a bit more since I posted the question and found these commands and the like in the official Lua documentation, but your explanation is very clear and easy to follow. I didn't fully understand how the capture parentheses worked and how the pattern was put together, but this makes a lot of sense to me now. Thanks once again. (And the wiki link is great too!)