Syntax
Quantifiers
... of - used to express a specific amount of a pattern. equivalent to regex {5} (assuming 5 of ...)
... to ... of - used to express an amount within a range of a pattern. equivalent to regex {5,9} (assuming 5 to 9 of ...)
over ... of - used to express more than an amount of a pattern. equivalent to regex {6,} (assuming over 5 of ...)
some of - used to express 1 or more of a pattern. equivalent to regex +
any of - used to express 0 or more of a pattern. equivalent to regex *
option of - used to express 0 or 1 of a pattern. equivalent to regex ?
All quantifiers can be preceded by lazy to match the least amount of characters rather than the most characters (greedy). Equivalent to regex +?, *?, etc.
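As a rough illustration of the regex equivalents listed above, here is a minimal sketch using Python's `re` module. Python is only an assumption for demonstration purposes (it is not the engine Melody targets), and the pattern `a` and the sample strings are made up.

```python
import re

# Regex equivalents from the quantifier list; "a" is an arbitrary pattern.
exactly_five = re.compile(r"a{5}")    # 5 of "a"
five_to_nine = re.compile(r"a{5,9}")  # 5 to 9 of "a"
over_five = re.compile(r"a{6,}")      # over 5 of "a"
one_or_more = re.compile(r"a+")       # some of "a"
zero_or_more = re.compile(r"a*")      # any of "a"
zero_or_one = re.compile(r"a?")       # option of "a"

# Greedy vs lazy: the lazy form stops at the first possible end.
greedy = re.compile(r"<.+>")   # greedy: as many characters as possible
lazy = re.compile(r"<.+?>")    # lazy: as few characters as possible

text = "<b>bold</b>"
greedy_match = greedy.match(text).group()  # "<b>bold</b>"
lazy_match = lazy.match(text).group()      # "<b>"
```

Note that `over 5 of` maps to `{6,}`, so exactly five repetitions do not match.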
Symbols
<char> - matches any single character. equivalent to regex .
<space> - matches a space character. equivalent to a literal space in regex
<whitespace> - matches any kind of whitespace character. equivalent to regex \s or [ \t\n\v\f\r]
<newline> - matches a newline character. equivalent to regex \n
<tab> - matches a tab character. equivalent to regex \t
<return> - matches a carriage return character. equivalent to regex \r
<feed> - matches a form feed character. equivalent to regex \f
<null> - matches a null character. equivalent to regex \0
<digit> - matches any single digit. equivalent to regex \d or [0-9]
<vertical> - matches a vertical tab character. equivalent to regex \v
<word> - matches a word character (any latin letter, any digit or an underscore). equivalent to regex \w or [a-zA-Z0-9_]
<alphabetic> - matches any single latin letter. equivalent to regex [a-zA-Z]
<alphanumeric> - matches any single latin letter or any single digit. equivalent to regex [a-zA-Z0-9]
<boundary> - matches the position between a character matched by <word> and a character not matched by <word>, without consuming any characters. equivalent to regex \b
<backspace> - matches a backspace control character. equivalent to regex [\b]
All symbols can be preceded by not to match any character other than the symbol.
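The sketch below tries a few of the symbol equivalents with Python's `re` module, which is an assumption used only for illustration. `re.ASCII` is passed so that `\d` and `\w` behave like `[0-9]` and `[a-zA-Z0-9_]`, as described in the list. Using `\D` as the counterpart of `not <digit>` is also an assumption; the text above does not state which regex Melody emits for negated symbols.

```python
import re

digit = re.compile(r"\d", re.ASCII)            # <digit>
word = re.compile(r"\w", re.ASCII)             # <word>
whole_cat = re.compile(r"\bcat\b", re.ASCII)   # <boundary>; "cat"; <boundary>
not_digit = re.compile(r"\D", re.ASCII)        # assumed form of: not <digit>

digits_found = digit.findall("a1b22")                 # ['1', '2', '2']
word_chars = word.findall("a_1-!")                    # ['a', '_', '1']
whole_words = whole_cat.findall("cat concat cat.")    # ['cat', 'cat']
non_digits = not_digit.findall("a1b22")               # ['a', 'b']
```

The boundary example skips the "cat" inside "concat" because there is no word boundary between "n" and "c".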
Special Symbols
<start> - matches the start of the string. equivalent to regex ^
<end> - matches the end of the string. equivalent to regex $
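A small sketch of anchoring a pattern to the whole string, again assuming Python's `re` module purely for illustration. One caveat: in Python, `$` also matches just before a trailing newline, which may differ from other regex engines.

```python
import re

# <start>; some of <digit>; <end>;
only_digits = re.compile(r"^\d+$")

all_digits = only_digits.search("123") is not None        # True
leading_text = only_digits.search("a123") is not None     # False
trailing_text = only_digits.search("123a") is not None    # False
```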
Unicode Categories
Note: these are not supported when testing in the CLI (-t or -f) as the regex engine used does not support Unicode categories. These require using the u flag.

<category::letter> - any kind of letter from any language
<category::lowercase_letter> - a lowercase letter that has an uppercase variant
<category::uppercase_letter> - an uppercase letter that has a lowercase variant
<category::titlecase_letter> - a letter that appears at the start of a word when only the first letter of the word is capitalized
<category::cased_letter> - a letter that exists in lowercase and uppercase variants
<category::modifier_letter> - a special character that is used like a letter
<category::other_letter> - a letter or ideograph that does not have lowercase and uppercase variants

<category::mark> - a character intended to be combined with another character (e.g. accents, umlauts, enclosing boxes, etc.)
<category::non_spacing_mark> - a character intended to be combined with another character without taking up extra space (e.g. accents, umlauts, etc.)
<category::spacing_combining_mark> - a character intended to be combined with another character that takes up extra space (vowel signs in many Eastern languages)
<category::enclosing_mark> - a character that encloses the character it is combined with (circle, square, keycap, etc.)

<category::separator> - any kind of whitespace or invisible separator
<category::space_separator> - a whitespace character that is invisible, but does take up space
<category::line_separator> - line separator character U+2028
<category::paragraph_separator> - paragraph separator character U+2029

<category::symbol> - math symbols, currency signs, dingbats, box-drawing characters, etc.
<category::math_symbol> - any mathematical symbol
<category::currency_symbol> - any currency sign
<category::modifier_symbol> - a combining character (mark) as a full character on its own
<category::other_symbol> - various symbols that are not math symbols, currency signs, or combining characters

<category::number> - any kind of numeric character in any script
<category::decimal_digit_number> - a digit zero through nine in any script except ideographic scripts
<category::letter_number> - a number that looks like a letter, such as a Roman numeral
<category::other_number> - a superscript or subscript digit, or a number that is not a digit 0–9 (excluding numbers from ideographic scripts)

<category::punctuation> - any kind of punctuation character
<category::dash_punctuation> - any kind of hyphen or dash
<category::open_punctuation> - any kind of opening bracket
<category::close_punctuation> - any kind of closing bracket
<category::initial_punctuation> - any kind of opening quote
<category::final_punctuation> - any kind of closing quote
<category::connector_punctuation> - a punctuation character such as an underscore that connects words
<category::other_punctuation> - any kind of punctuation character that is not a dash, bracket, quote or connector

<category::other> - invisible control characters and unused code points
<category::control> - an ASCII or Latin-1 control character: 0x00–0x1F and 0x7F–0x9F
<category::format> - invisible formatting indicator
<category::private_use> - any code point reserved for private use
<category::surrogate> - one half of a surrogate pair in UTF-16 encoding
<category::unassigned> - any code point to which no character has been assigned
These descriptions are from regular-expressions.info
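To get a feel for which characters fall into which category, the sketch below looks up the Unicode general category of a few sample characters with Python's standard `unicodedata` module. This is only a stand-in for exploration: it does not run Melody or a regex, the sample characters are arbitrary, and pairing each Melody category name with a two-letter Unicode code (for example `lowercase_letter` with `Ll`) is an assumption based on the standard general category names.

```python
import unicodedata

# Arbitrary sample characters, keyed by the category name they illustrate.
samples = {
    "lowercase_letter": "a",
    "uppercase_letter": "A",
    "non_spacing_mark": "\u0301",        # combining acute accent
    "space_separator": " ",
    "line_separator": "\u2028",
    "math_symbol": "+",
    "currency_symbol": "$",
    "decimal_digit_number": "5",
    "letter_number": "\u2163",           # Roman numeral four
    "connector_punctuation": "_",
}

# Two-letter Unicode general category code for each sample.
codes = {name: unicodedata.category(ch) for name, ch in samples.items()}
```

For instance, `codes["letter_number"]` is `"Nl"` and `codes["connector_punctuation"]` is `"Pc"`.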
Character Ranges
... to ... - used with digits or alphabetic characters to express a character range. equivalent to regex [5-9] (assuming 5 to 9) or [a-z] (assuming a to z)
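The two range equivalents above can be tried out as follows. This is a minimal sketch that assumes Python's `re` module; the sample strings are made up.

```python
import re

digit_5_to_9 = re.compile(r"[5-9]")   # 5 to 9
lower_a_to_z = re.compile(r"[a-z]")   # a to z

high_digits = digit_5_to_9.findall("0123456789")   # ['5', '6', '7', '8', '9']
lower_letters = lower_a_to_z.findall("Melody 42")  # ['e', 'l', 'o', 'd', 'y']
```

The uppercase "M" is not returned because `[a-z]` covers lowercase letters only.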
Literals
"..." or '...' - used to mark a literal part of the match. Melody will automatically escape characters as needed. Quotes (of the same kind surrounding the literal) should be escaped.
Raw
`...` - added directly to the output without any escaping
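The difference between an escaped literal and raw output can be sketched as follows. Python's `re.escape` is used here as a stand-in for the escaping step; that is an assumption, and Melody's own escaping rules may differ in which characters they touch.

```python
import re

text = "1+1=2"

# Literal: special characters are escaped, so "+" means a plus sign.
escaped = re.compile(re.escape(text))

# Raw: the text is used as regex source, so "1+" means "one or more 1s".
raw = re.compile(text)

escaped_hit = escaped.fullmatch("1+1=2") is not None   # True
raw_hit = raw.fullmatch("1+1=2") is not None           # False
raw_other = raw.fullmatch("111=2") is not None         # True
```

The raw form is useful when the text is already valid regex source; otherwise a literal is the safer choice.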
Groups
capture - used to open a capture or named capture block. captured patterns are later available in the list of matches (either positional or named). equivalent to regex (...)
match - used to open a match block, matches the contents without capturing. equivalent to regex (?:...)
either - used to open an either block, matches one of the statements within the block. equivalent to regex (?:...|...)
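Here is a minimal sketch of the three group kinds, assuming Python's `re` module for illustration. Note that Python spells named groups as `(?P<name>...)`, which may differ from the syntax used by other regex engines; the date pattern itself is a made-up example.

```python
import re

date = re.compile(
    r"(\d{4})"              # capture: positional group 1
    r"-(?:0[1-9]|1[0-2])-"  # either inside a non-capturing group: month 01..12
    r"(?P<day>\d{2})"       # named capture (Python spelling)
)

m = date.fullmatch("2024-07-15")
year = m.group(1)             # "2024"
day = m.group("day")          # "15"
captured_groups = date.groups  # 2: the month group does not capture
```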
Assertions
ahead - used to open an ahead block. equivalent to regex (?=...). use after an expression
behind - used to open a behind block. equivalent to regex (?<=...). use before an expression
Assertions can be preceded by not to create a negative assertion (equivalent to regex (?!...), (?<!...)).
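The four assertion forms can be compared side by side. This is a minimal sketch assuming Python's `re` module, with made-up sample strings; assertions check the surrounding text without including it in the match.

```python
import re

ahead = re.compile(r"\d+(?=px)")           # digits followed by "px"
not_ahead = re.compile(r"q(?!u)")          # "q" not followed by "u"
behind = re.compile(r"(?<=\$)\d+")         # digits preceded by "$"
not_behind = re.compile(r"(?<!\$)\b\d+")   # whole numbers not preceded by "$"

sizes = ahead.findall("10px 20em")                              # ['10']
q_positions = [m.start() for m in not_ahead.finditer("Iraq quit")]  # [3]
prices = behind.findall("$40 and 12")                           # ['40']
plain_numbers = not_behind.findall("$40 and 12")                # ['12']
```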
Variables
let .variable_name = { ... } - defines a variable from a block of statements. can later be used with .variable_name. Variables must be declared before being used. Variable invocations cannot be quantified directly; use a group if you want to quantify a variable invocation.
example:
    let .a_and_b = {
      "a";
      "b";
    }

    .a_and_b;
    "c";

    // abc
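The advice to wrap a variable in a group before quantifying it can be illustrated with plain string composition. This is only an analogy in Python, not a description of how Melody implements variables: `a_and_b` is a hypothetical stand-in for the `.a_and_b` variable from the example.

```python
import re

# Hypothetical stand-in for: let .a_and_b = { "a"; "b"; }
a_and_b = "ab"

# .a_and_b; "c";  ->  abc
plain = re.compile(a_and_b + "c")

# Appending a quantifier to the bare fragment repeats only its last character.
unwrapped = re.compile(a_and_b + "{2}c")           # ab{2}c

# Wrapping the fragment in a non-capturing group repeats all of it.
grouped = re.compile("(?:" + a_and_b + "){2}c")    # (?:ab){2}c

unwrapped_hit = unwrapped.fullmatch("ababc") is not None   # False
grouped_hit = grouped.fullmatch("ababc") is not None       # True
```

In this analogy the ungrouped pattern matches "abbc" instead of "ababc", which is one plausible reason a group is needed.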
Extras
/* ... */, // ... - used to mark comments (note: // ... comments must be on a separate line)