JMC REGEX LEARNING
Regular Expression (regex) usage in JMC
To make an action use regex in JMC, put a slash on each side of the pattern:
#action {/this means regex/}{not required here}
I'll list a set of basic expressions.
. matches any single character (wildcard)
\w matches one word character: a letter, a digit, or an underscore
\W matches one character that is NOT a word character
\d matches one digit
\D matches one character that is NOT a digit
\s matches one whitespace character (space, tab, etc.)
\S matches one non-whitespace character
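If you want to poke at these classes outside of JMC, here is a small sketch in Python. Python is only my pick for illustration; its re module is close to the regex in JMC but not identical, so treat this as a sanity check. The sample line is made up.

```python
import re

sample = "Bob_2 hits you!"  # made-up line, not real game output

# \w: letters, digits and underscore (re.ASCII keeps it to plain ASCII)
word_chars = re.findall(r"\w", sample, re.ASCII)
# \d: digits only
digits = re.findall(r"\d", sample, re.ASCII)
# \s: whitespace characters
spaces = re.findall(r"\s", sample)
# \W: anything that is not a word character
non_word = re.findall(r"\W", sample, re.ASCII)

print("".join(word_chars))
print(digits)
print(spaces)
print(non_word)
```

Notice that the 2 and the _ both count as \w, while the spaces and the ! only show up under \W.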
Quantifiers:
* matches 0 or more e.g. ".*" can match ANYTHING, even nothing. It means 0 or more of any character.
+ matches 1 or more e.g. ".+" is roughly the same as %0. It will match one or more of any character.
x-y matches x to y e.g. "a-p" matches any one letter from a to p, case-sensitive. Must be inside [] brackets.
{x,y} matches x to y times e.g. "\w{3,10}" will match a run of 3 to 10 word characters.
? makes the item before it optional e.g. "says?" will match "say" and "says". "lol(wut)?" matches lolwut and lol.
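A quick way to convince yourself how the quantifiers behave is to try them on tiny strings. This is a sketch in Python (used only for illustration, JMC's regex may differ in small ways); fullmatch means the whole string has to match.

```python
import re

# * allows zero characters, + needs at least one
star_on_empty = re.fullmatch(r".*", "") is not None
plus_on_empty = re.fullmatch(r".+", "") is not None

# {3,10}: a run of 3 to 10 word characters
too_short = re.fullmatch(r"\w{3,10}", "ab") is not None
just_right = re.fullmatch(r"\w{3,10}", "abc") is not None
too_long = re.fullmatch(r"\w{3,10}", "abcdefghijk") is not None  # 11 characters

# ? makes the item right before it optional
says = [bool(re.fullmatch(r"says?", s)) for s in ("say", "says", "sayss")]
lol = [bool(re.fullmatch(r"lol(wut)?", s)) for s in ("lol", "lolwut", "lolw")]

print(star_on_empty, plus_on_empty)
print(too_short, just_right, too_long)
print(says, lol)
```

In "says?" the ? only applies to the s, while in "lol(wut)?" it applies to the whole (wut) group.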
Anchors:
^ beginning of line
$ end of line
#act {/^example$/} will only match a line that is exactly "example", with nothing before or after it, not even spaces.
Some regex characters also show up as ordinary text in the middle of lines, so sometimes you want the literal character.
\ cancels the regex meaning e.g. "\?" matches "?", "\." matches ".", "\^" matches... I think you got it.
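Here is a small sketch of anchors and escaping, written in Python purely for illustration (not JMC script, and the test strings are invented).

```python
import re

# ^ and $ pin the pattern to the start and end of the line
anchored = re.compile(r"^example$")
tests = ("example", " example", "example ", "an example")
anchored_hits = [bool(anchored.search(s)) for s in tests]

# An unescaped . is a wildcard, an escaped \. is a literal dot
loose_dot = bool(re.search(r"^3.5$", "3x5"))
strict_dot = bool(re.search(r"^3\.5$", "3x5"))
strict_dot_ok = bool(re.search(r"^3\.5$", "3.5"))

# \? matches a literal question mark
literal_q = bool(re.search(r"^Who\?$", "Who?"))

print(anchored_hits)
print(loose_dot, strict_dot, strict_dot_ok, literal_q)
```

Only the first string passes the anchored pattern; a single leading or trailing space is enough to break the match.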
Groups and group commands:
[] matches one character from a set e.g. "[a-z]" will match a lowercase letter. "[A-Z]" will match an uppercase letter.
() captures an expression e.g. "(.*)" will capture everything to a variable. Used for wildcards and for grouping sets of "[]".
^ as the first character inside [] means "cannot contain" e.g. "[^a-z]" will match any one character that is not a lowercase a-z.
| "or" e.g. "(staf|dager)" will match staf or dager.
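The group commands can be tried the same way. This is a Python sketch for illustration only; the sample lines are invented, and keep in mind Python numbers its captures from 1 where JMC starts at $0.

```python
import re

# [] matches one character from the set
lower = re.findall(r"[a-z]", "aBc")

# [^...] matches one character that is NOT in the set
not_lower = re.findall(r"[^a-z]", "aBc 1")

# | picks between alternatives inside the group
weapon = re.search(r"(staf|dager)", "You wield a dager.").group(1)

# () captures whatever the inner expression matched
said = re.search(r"^You say (.*)$", "You say hello there").group(1)

print(lower)
print(not_lower)
print(weapon)
print(said)
```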
Now it's time to learn how to group things to match something a little more specific.
[] brackets are required for any "-" range such as "[a-z]".
A common one I use to capture a player's name in Nodeka is "[\w-']+"
I will break it down in the order that regex reads it:
Everything in [] is an alternative for a single character, so [\w-'] will match 1 character that is a word character (letter, digit, or underscore), a hyphen, or an apostrophe. Note that the hyphen is literal here because \w is not a single character, so there is no x-y range.
If you want only lowercase you can use [a-z-'], but that will not match a capitalized Nodeka name.
The + outside of the brackets sets how many of the matched character we want, and + means 1 or more.
To be extremely strict on a Nodeka player name we could do "([A-Z][a-z-']+)" to say that the name MUST start with 1 capital and the rest of the name MUST be lowercase letters, hyphens, or apostrophes.
We use () brackets to contain the other sets of [].
We would not use * in this case because it also matches 0 characters, and a name always has at least one.
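To see the loose and the strict name patterns side by side, here is a sketch in Python (my choice for illustration, not JMC script). One difference to watch: Python refuses [\w-'] as a bad range, so I moved the hyphen to the end of the brackets, which means the same thing. The names are invented.

```python
import re

# Python rejects [\w-'], so the literal hyphen goes last: [\w'-]
loose_name = re.compile(r"[\w'-]+", re.ASCII)
strict_name = re.compile(r"([A-Z][a-z'-]+)")

names = ["Gandor", "Mor-tis", "D'arak", "gandor", "GANDOR", "G"]  # invented names

loose_hits = [bool(loose_name.fullmatch(n)) for n in names]
strict_hits = [bool(strict_name.fullmatch(n)) for n in names]

for name, loose, strict in zip(names, loose_hits, strict_hits):
    print(f"{name:8} loose={loose} strict={strict}")
```

The loose pattern accepts every name in the list. The strict one rejects "gandor" (no capital), "GANDOR" (capitals after the first letter) and "G" (the + needs at least one more character).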
Capturing and using variables:
Anything inside () is turned into a variable in JMC, starting with $0 and going to $1, $2, etc., up to $9, because $10 is read as variable $1 followed by a 0.
So now let's say we want to capture that player's name and use it in an action. Let's use a tell trigger, which I will also break down.
#action {/^([\w-']+) tells? ([\w-']+), '(.*)'$/}{#output {yellow} [$0 to $1]: '$2'}
Firstly we have our slashes to tell JMC it will be using regex.
Secondly we have our anchors to say that nothing will come before ^ and nothing will come after $.
Now we have ([\w-']+), which is going to capture a variable because of the () and will capture 1 or more "+" of "\w-'": a word character, a hyphen, or an apostrophe.
Then we have tells? which can match either tell or tells. The ? makes the s optional.
Now we can skip ahead to the (.*) which means another variable that can match any character "." 0 or more times "*". Nodeka conveniently gives us the apostrophes that we can use as anchors.
This is a more powerful technique than using '%0', because '%0' will stop at the first apostrophe while '(.*)' will continue until the very last apostrophe, so a message that contains an apostrophe is still captured whole.
On the other side we have our output and our variables $0, $1, and $2. $0 holds the first person's name, $1 holds the second person's name, and $2 holds the message. Simple? :)
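The tell trigger can be checked outside of JMC with a sketch like this one. It is Python, used only for illustration: the hyphen is moved to the end of the brackets because Python rejects [\w-'], groups are numbered from 1 instead of $0, and the sample line is made up rather than real Nodeka output. The lazy pattern at the end is just my way of showing what "stop at the first apostrophe" looks like; I am not claiming that is how %0 works inside JMC.

```python
import re

# Same shape as the JMC action, with the hyphen moved for Python's sake
tell = re.compile(r"^([\w'-]+) tells? ([\w'-]+), '(.*)'$")

line = "Bob tells you, 'it's over here'"  # invented sample line

match = tell.match(line)
speaker, target, message = match.groups()  # JMC's $0, $1 and $2
output = f"[{speaker} to {target}]: '{message}'"

# A lazy capture stops at the first closing apostrophe instead
first_stop = re.search(r"'(.*?)'", line).group(1)

print(output)
print(first_stop)
```

The greedy (.*) keeps the whole message, apostrophe and all, while the lazy version cuts it off at "it".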
Non-variable capturing:
Sometimes you will want to use () brackets to enclose a group of [] brackets but don't want to save them as a variable.
There is a simple way to do that in regex. Just put ?: at the start of the () brackets.
(?:[\w-']+) will not save a variable but will still match everything that ([\w-']+) would.
This is useful if you are using a lot of () brackets in an action that would otherwise go over the 10 variable limit.
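Here is a sketch of the difference, in Python for illustration only. The "tells|asks" alternation and the sample line are made up for the example, and the hyphen sits at the end of the brackets because Python rejects [\w-'].

```python
import re

# Plain () around the verb: it takes up a variable slot
capturing = re.compile(r"^([\w'-]+) (tells|asks) ([\w'-]+)$")
# (?:) around the verb: still matches, but saves nothing
non_capturing = re.compile(r"^([\w'-]+) (?:tells|asks) ([\w'-]+)$")

line = "Bob asks Jim"  # invented sample line

with_verb = capturing.match(line).groups()
without_verb = non_capturing.match(line).groups()

print(with_verb)
print(without_verb)
```

Both patterns match the same lines. The second one just has one less variable to count.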
Some JMC examples I use:
#act {/^([\s!*-.'`^v#@\$\[\]|\(\?\)~\\\/i\$]{36,37})$/}{matches any map line}
#act {/^You get a pile of (\d+),?(\d+)?,?(\d+)? gold\.$/}{matches the gold line}
#act {/- battle locked.*\(00:(\d+)\)$/}{#output {light green} Locked for $0 secs.}
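The gold and battle lock patterns can be tested with a sketch like this. It is Python for illustration, not JMC script. The gold_amount helper is hypothetical (my own name, nothing from JMC), and the sample lines are invented to fit the patterns, so real Nodeka lines may look different.

```python
import re

gold = re.compile(r"^You get a pile of (\d+),?(\d+)?,?(\d+)? gold\.$")

def gold_amount(line):
    """Hypothetical helper: return the gold as an int, or None if no match."""
    match = gold.match(line)
    if match is None:
        return None
    # Optional groups that did not take part come back as None; skip them
    return int("".join(part for part in match.groups() if part))

lock = re.compile(r"- battle locked.*\(00:(\d+)\)$")
lock_line = "Bob - battle locked (00:15)"  # invented sample line
lock_secs = lock.search(lock_line).group(1)

print(gold_amount("You get a pile of 42 gold."))
print(gold_amount("You get a pile of 1,234 gold."))
print(gold_amount("You get a pile of 1,234,567 gold."))
print(f"Locked for {lock_secs} secs.")
```

The three captures in the gold pattern hold the digit groups between the commas, so gluing them together gives the full number.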
Some great links that taught me all of this and much more:
I test everything here first. Just remember that their variables start with $1 and not $0 like JMC.
http://gskinner.com/RegExr/
All the information you could ever need on regex.
http://www.regular-expressions.info/