Syntax

Quantifiers

... of - used to express a specific amount of a pattern. equivalent to regex {5} (assuming 5 of ...)
... to ... of - used to express an amount within a range of a pattern. equivalent to regex {5,9} (assuming 5 to 9 of ...)
over ... of - used to express more than an amount of a pattern. equivalent to regex {6,} (assuming over 5 of ...)
some of - used to express 1 or more of a pattern. equivalent to regex +
any of - used to express 0 or more of a pattern. equivalent to regex *
option of - used to express 0 or 1 of a pattern. equivalent to regex ?

All quantifiers can be preceded by lazy to match as few characters as possible rather than as many as possible (greedy). Equivalent to regex +?, *?, etc.
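
The regex equivalents listed above can be checked in any regex engine. Here is a minimal sketch using Python's re module, chosen purely for illustration: Melody itself is not involved, and the sample strings are made up. It exercises the counted quantifiers and the greedy/lazy distinction.

```python
import re

# Regex equivalents of the counted quantifiers, applied to "a"
exactly_five = re.compile(r"a{5}")    # 5 of "a"
five_to_nine = re.compile(r"a{5,9}")  # 5 to 9 of "a"
over_five = re.compile(r"a{6,}")      # over 5 of "a"

counts = {
    "exactly_five": [n for n in range(12) if exactly_five.fullmatch("a" * n)],
    "five_to_nine": [n for n in range(12) if five_to_nine.fullmatch("a" * n)],
    "over_five": [n for n in range(12) if over_five.fullmatch("a" * n)],
}
print(counts)

# Greedy "some of" (+) versus "lazy some of" (+?)
greedy = re.match(r"<.+>", "<a><b>").group()
lazy = re.match(r"<.+?>", "<a><b>").group()
print(greedy)  # <a><b>
print(lazy)    # <a>
```

The greedy form consumes up to the last ">" in the input, while the lazy form stops at the first one.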

Symbols

<char> - matches any single character. equivalent to regex .
<space> - matches a space character. equivalent to a literal space in regex
<whitespace> - matches any kind of whitespace character. equivalent to regex \s or [ \t\n\v\f\r]
<newline> - matches a newline character. equivalent to regex \n
<tab> - matches a tab character. equivalent to regex \t
<return> - matches a carriage return character. equivalent to regex \r
<feed> - matches a form feed character. equivalent to regex \f
<null> - matches a null character. equivalent to regex \0
<digit> - matches any single digit. equivalent to regex \d or [0-9]
<vertical> - matches a vertical tab character. equivalent to regex \v
<word> - matches a word character (any latin letter, any digit or an underscore). equivalent to regex \w or [a-zA-Z0-9_]
<alphabetic> - matches any single latin letter. equivalent to regex [a-zA-Z]
<alphanumeric> - matches any single latin letter or any single digit. equivalent to regex [a-zA-Z0-9]
<boundary> - matches the position between a character matched by <word> and a character not matched by <word>, without consuming any characters. equivalent to regex \b
<backspace> - matches a backspace control character. equivalent to regex [\b]

All symbols can be preceded by not to match any character other than the symbol
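
The difference between \b and [\b], and the effect of negating a symbol, can be seen in a short sketch. It uses Python's re module on the regex equivalents only, as an illustration rather than Melody output. The re.ASCII flag is an assumption made here so that \D is the complement of [0-9], in line with the table above.

```python
import re

# \b is zero-width: it matches a position, so "cat" inside "concat" is skipped
boundary_hits = re.findall(r"\bcat\b", "cat concat cat.")

# Inside a character class, \b is the backspace control character (0x08)
has_backspace = re.search(r"[\b]", "a\x08b") is not None

# "not <digit>" corresponds to \D: any character that is not a digit
non_digits = re.findall(r"\D", "a1b2", flags=re.ASCII)

print(boundary_hits)  # ['cat', 'cat']
print(has_backspace)  # True
print(non_digits)     # ['a', 'b']
```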

Special Symbols

<start> - matches the start of the string. equivalent to regex ^
<end> - matches the end of the string. equivalent to regex $

Unicode Categories

Note: these are not supported when testing in the CLI (-t or -f), as the regex engine used there does not support Unicode categories. Using them requires the u flag.

<category::letter> - any kind of letter from any language
- <category::lowercase_letter> - a lowercase letter that has an uppercase variant
- <category::uppercase_letter> - an uppercase letter that has a lowercase variant
- <category::titlecase_letter> - a letter that appears at the start of a word when only the first letter of the word is capitalized
- <category::cased_letter> - a letter that exists in lowercase and uppercase variants
- <category::modifier_letter> - a special character that is used like a letter
- <category::other_letter> - a letter or ideograph that does not have lowercase and uppercase variants
<category::mark> - a character intended to be combined with another character (e.g. accents, umlauts, enclosing boxes, etc.)
- <category::non_spacing_mark> - a character intended to be combined with another character without taking up extra space (e.g. accents, umlauts, etc.)
- <category::spacing_combining_mark> - a character intended to be combined with another character that takes up extra space (vowel signs in many Eastern languages)
- <category::enclosing_mark> - a character that encloses the character it is combined with (circle, square, keycap, etc.)
<category::separator> - any kind of whitespace or invisible separator
- <category::space_separator> - a whitespace character that is invisible, but does take up space
- <category::line_separator> - line separator character U+2028
- <category::paragraph_separator> - paragraph separator character U+2029
<category::symbol> - math symbols, currency signs, dingbats, box-drawing characters, etc.
- <category::math_symbol> - any mathematical symbol
- <category::currency_symbol> - any currency sign
- <category::modifier_symbol> - a combining character (mark) as a full character on its own
- <category::other_symbol> - various symbols that are not math symbols, currency signs, or combining characters
<category::number> - any kind of numeric character in any script
- <category::decimal_digit_number> - a digit zero through nine in any script except ideographic scripts
- <category::letter_number> - a number that looks like a letter, such as a Roman numeral
- <category::other_number> - a superscript or subscript digit, or a number that is not a digit 0–9 (excluding numbers from ideographic scripts)
<category::punctuation> - any kind of punctuation character
- <category::dash_punctuation> - any kind of hyphen or dash
- <category::open_punctuation> - any kind of opening bracket
- <category::close_punctuation> - any kind of closing bracket
- <category::initial_punctuation> - any kind of opening quote
- <category::final_punctuation> - any kind of closing quote
- <category::connector_punctuation> - a punctuation character such as an underscore that connects words
- <category::other_punctuation> - any kind of punctuation character that is not a dash, bracket, quote or connector
<category::other> - invisible control characters and unused code points
- <category::control> - an ASCII or Latin-1 control character: 0x00–0x1F and 0x7F–0x9F
- <category::format> - invisible formatting indicator
- <category::private_use> - any code point reserved for private use
- <category::surrogate> - one half of a surrogate pair in UTF-16 encoding
- <category::unassigned> - any code point to which no character has been assigned

These descriptions are from regular-expressions.info
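
To get a feel for which characters fall into which category, the sketch below looks up the Unicode general category of a few sample characters. It uses Python's unicodedata module rather than a regex, since Python's built-in re module has no \p{...} support. Pairing each two-letter code with a Melody category name (in the comments) is an assumption based on the standard Unicode category names.

```python
import unicodedata

# Sample characters and the general category each is expected to have
samples = {
    "a": "Ll",       # lowercase_letter
    "A": "Lu",       # uppercase_letter
    "\u01c5": "Lt",  # titlecase_letter
    "\u0301": "Mn",  # non_spacing_mark (combining acute accent)
    " ": "Zs",       # space_separator
    "\u2028": "Zl",  # line_separator
    "+": "Sm",       # math_symbol
    "$": "Sc",       # currency_symbol
    "5": "Nd",       # decimal_digit_number
    "\u2167": "Nl",  # letter_number (Roman numeral eight)
    "-": "Pd",       # dash_punctuation
    "_": "Pc",       # connector_punctuation
}

categories = {ch: unicodedata.category(ch) for ch in samples}
for ch, code in categories.items():
    print(f"U+{ord(ch):04X} -> {code}")
```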

Character Ranges

... to ... - used with digits or alphabetic characters to express a character range. equivalent to regex [5-9] (assuming 5 to 9) or [a-z] (assuming a to z)
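
As a quick check of the two regex classes mentioned above, here is a small sketch in Python. It is illustrative only, with made-up input strings, and tests the regex equivalents rather than Melody source.

```python
import re

digit_range = re.compile(r"[5-9]")   # 5 to 9
letter_range = re.compile(r"[a-z]")  # a to z

# Keep only the characters that fall inside each range
digits_in_range = [c for c in "3579" if digit_range.fullmatch(c)]
letters_in_range = [c for c in "aZm" if letter_range.fullmatch(c)]

print(digits_in_range)   # ['5', '7', '9']
print(letters_in_range)  # ['a', 'm']
```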

Literals

"..." or '...' - used to mark a literal part of the match. Melody will automatically escape characters as needed. Quotes (of the same kind surrounding the literal) should be escaped

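Why escaping matters for literals can be sketched in Python, using re.escape as a stand-in for the automatic escaping described above. This is an analogy only: the escaped string Melody produces may differ from what re.escape returns, and the sample literal is made up.

```python
import re

literal = "1+1=2 (maybe)"

# Used as a pattern without escaping, "+" and "(...)" act as regex syntax,
# so the pattern does not match its own text
unescaped_matches = re.fullmatch(literal, literal) is not None

# After escaping, every character is matched literally
escaped_matches = re.fullmatch(re.escape(literal), literal) is not None

print(unescaped_matches)  # False
print(escaped_matches)    # True
```
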
Raw

`...` - added directly to the output without any escaping

Groups

capture - used to open a capture or named capture block. captured patterns are later available in the list of matches (either positional or named). equivalent to regex (...)
match - used to open a match block, matches the contents without capturing. equivalent to regex (?:...)
either - used to open an either block, matches one of the statements within the block. equivalent to regex (?:...|...)
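
The three group kinds can be compared in one pattern. The sketch below uses Python's re module on hand-written regex, not Melody output, and the date-like input is made up. Note that Python spells a named capture (?P<name>...), which is a Python-specific syntax.

```python
import re

pattern = re.compile(
    r"(?P<year>\d{4})-"  # named capture
    r"(\d{2})-"          # positional capture
    r"(?:\d{2})"         # match block: grouped but not captured
    r"(?:Z|UTC)"         # either block: one of the alternatives
)

m = pattern.fullmatch("2024-07-15Z")
captured = m.groups()    # only the two capture groups appear here
year = m.group("year")

print(captured)  # ('2024', '07')
print(year)      # 2024
```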

Assertions

ahead - used to open an ahead block. equivalent to regex (?=...). use after an expression
behind - used to open a behind block. equivalent to regex (?<=...). use before an expression

Assertions can be preceded by not to create a negative assertion (equivalent to regex (?!...), (?<!...))
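
The four assertion forms can be illustrated with a short sketch. It applies the regex equivalents through Python's re module, with made-up sample strings, and is not generated by Melody.

```python
import re

prices = "10 USD, 20 EUR, 30 USD"
amounts = "$5 and 7 and $12"

# ahead: digits only when followed by " USD"
ahead = re.findall(r"\d+(?= USD)", prices)

# not ahead: a "q" that is not followed by "u"
not_ahead = re.findall(r"q(?!u)", "quit Iraq")

# behind: digits only when preceded by "$"
behind = re.findall(r"(?<=\$)\d+", amounts)

# not behind: whole numbers that are not preceded by "$"
not_behind = re.findall(r"(?<!\$)\b\d+", amounts)

print(ahead)       # ['10', '30']
print(not_ahead)   # ['q']
print(behind)      # ['5', '12']
print(not_behind)  # ['7']
```

In every case the asserted text (" USD", "$", "u") is checked but never becomes part of the match.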

Variables

let .variable_name = { ... } - defines a variable from a block of statements. can later be used with .variable_name. Variables must be declared before being used. Variable invocations cannot be quantified directly; to quantify a variable invocation, wrap it in a group

example:
```
let .a_and_b = {
  "a";
  "b";
}

.a_and_b;
"c";

// abc
```
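
A plausible way to see why a variable invocation needs a group before it can be quantified is to look at how regex quantifiers bind. The sketch below is in Python and manipulates regex strings by hand. It illustrates regex behavior under the assumption that the variable's output is the text "ab", and it does not describe how Melody is implemented.

```python
import re

a_and_b = "ab"  # stands in for the output of the .a_and_b variable

# Appending a quantifier directly repeats only the last element, "b"
naive = a_and_b + "{2}" + "c"               # ab{2}c
# Wrapping in a non-capturing group repeats the whole sequence
grouped = "(?:" + a_and_b + "){2}" + "c"    # (?:ab){2}c

naive_matches_abbc = re.fullmatch(naive, "abbc") is not None
naive_matches_ababc = re.fullmatch(naive, "ababc") is not None
grouped_matches_ababc = re.fullmatch(grouped, "ababc") is not None

print(naive_matches_abbc)     # True
print(naive_matches_ababc)    # False
print(grouped_matches_ababc)  # True
```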

Extras

/* ... */, // ... - used to mark comments (note: // ... comments must be on a separate line)

Melody