Welcome
Welcome to the Melody documentation!
Melody is a language that compiles to regular expressions and aims to be more readable and maintainable.
Examples
Note: these examples use the currently supported syntax, which may change.
Batman Theme
16 of "na";
2 of match {
<space>;
"batman";
}
// 🦇🦸♂️
Turns into
(?:na){16}(?: batman){2}
Twitter Hashtag
"#";
some of <word>;
// #melody
Turns into
#\w+
Introductory Courses
some of <alphabetic>;
<space>;
"1";
2 of <digit>;
// classname 1xx
Turns into
[a-zA-Z]+ 1\d{2}
Indented Code (2 spaces)
some of match {
2 of <space>;
}
some of <char>;
";";
// let value = 5;
Turns into
(?: {2})+.+;
Semantic Versions
<start>;
option of "v";
capture major {
some of <digit>;
}
".";
capture minor {
some of <digit>;
}
".";
capture patch {
some of <digit>;
}
<end>;
// v1.0.0
Turns into
^v?(?<major>\d+)\.(?<minor>\d+)\.(?<patch>\d+)$
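As a quick sanity check, the compiled pattern above can be exercised with any regex engine. The sketch below uses Python's re module, which is an assumption of this example and not part of Melody. Python spells named groups (?P<name>...) rather than (?<name>...), so the pattern is adapted accordingly. The parse_version helper is hypothetical.

```python
import re

# Melody output from the example above, adapted to Python's named-group
# syntax: (?P<name>...) instead of (?<name>...).
SEMVER = re.compile(r"^v?(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)$")


def parse_version(text):
    """Return (major, minor, patch) as ints, or None if text does not match."""
    match = SEMVER.match(text)
    if match is None:
        return None
    return tuple(int(match.group(name)) for name in ("major", "minor", "patch"))


print(parse_version("v1.0.0"))  # (1, 0, 0)
print(parse_version("2.13.4"))  # (2, 13, 4)
print(parse_version("v1.0"))    # None (the patch component is missing)
```

The leading v is optional because of option of "v";, and <start> / <end> anchor the pattern so that partial versions are rejected.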
Playground
You can try Melody in your browser by visiting the playground
Install
Cargo
cargo install melody_cli
From Source
git clone https://github.com/yoav-lavi/melody.git
cd melody
cargo install --path crates/melody_cli
Binary
- macOS binaries (aarch64 and x86_64) can be downloaded from the release page
Community
- Brew (macOS and Linux)
  Installation instructions:
  brew install melody
- Arch Linux (maintained by @ilai-deutel)
  Installation instructions:
  - Installation with an AUR helper, for instance using paru:
    paru -Syu melody
  - Install manually with makepkg:
    git clone https://aur.archlinux.org/melody.git
    cd melody
    makepkg -si
- Nix (nix-env)
  Installation instructions:
  Should be the following once the registry is updated. If you've successfully installed via this method, please open an issue and let me know. Thanks!
  nix-env -i melody
CLI
USAGE:
    melody [OPTIONS] [INPUT_FILE_PATH]
ARGS:
    <INPUT_FILE_PATH>    Read from a file
                         Use '-' and or pipe input to read from stdin
OPTIONS:
    -f, --test-file <TEST_FILE>
            Test the compiled regex against the contents of a file
        --generate-completions <COMPLETIONS>
            Outputs completions for the selected shell
            To use, write the output to the appropriate location for your shell
    -h, --help
            Print help information
    -n, --no-color
            Print output with no color
    -o, --output <OUTPUT_FILE_PATH>
            Write to a file
    -r, --repl
            Start the Melody REPL
    -t, --test <TEST>
            Test the compiled regex against a string
    -V, --version
            Print version information
Crates
- melody_compiler - The Melody compiler (crates.io, docs.rs)
- melody_cli - A CLI wrapping the Melody compiler (crates.io, docs.rs)
Syntax
Quantifiers
- ... of - used to express a specific amount of a pattern. equivalent to regex {5} (assuming 5 of ...)
- ... to ... of - used to express an amount within a range of a pattern. equivalent to regex {5,9} (assuming 5 to 9 of ...)
- over ... of - used to express more than an amount of a pattern. equivalent to regex {6,} (assuming over 5 of ...)
- some of - used to express 1 or more of a pattern. equivalent to regex +
- any of - used to express 0 or more of a pattern. equivalent to regex *
- option of - used to express 0 or 1 of a pattern. equivalent to regex ?
All quantifiers can be preceded by lazy to match the least amount of characters rather than the most characters (greedy). Equivalent to regex +?, *?, etc.
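To make the quantifier equivalences concrete, here is a minimal sketch that runs the regex forms listed above through Python's re module. The choice of Python and the sample strings are assumptions of this example. The Melody snippets in the comments follow the syntax described in this section and were not run through the compiler here.

```python
import re

# Regex equivalents of the Melody quantifiers listed above.
exactly_five = re.compile(r"a{5}")    # 5 of "a";
five_to_nine = re.compile(r"a{5,9}")  # 5 to 9 of "a";
over_five = re.compile(r"a{6,}")      # over 5 of "a";

print(exactly_five.fullmatch("aaaaa") is not None)  # True
print(five_to_nine.fullmatch("aaaa") is not None)   # False, only 4 characters
print(over_five.fullmatch("aaaaa") is not None)     # False, needs at least 6

# Greedy vs. lazy: "<"; some of <char>; ">";  compared with
# "<"; lazy some of <char>; ">";
text = "<b>bold</b>"
greedy = re.search(r"<.+>", text).group()
lazy = re.search(r"<.+?>", text).group()
print(greedy)  # <b>bold</b>
print(lazy)    # <b>
```

The greedy form consumes as much as possible before the final ">", while the lazy form stops at the first ">" it can reach.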
Symbols
- <char> - matches any single character. equivalent to regex .
- <space> - matches a space character. equivalent to regex " " (a literal space)
- <whitespace> - matches any kind of whitespace character. equivalent to regex \s or [ \t\n\v\f\r]
- <newline> - matches a newline character. equivalent to regex \n
- <tab> - matches a tab character. equivalent to regex \t
- <return> - matches a carriage return character. equivalent to regex \r
- <feed> - matches a form feed character. equivalent to regex \f
- <null> - matches a null character. equivalent to regex \0
- <digit> - matches any single digit. equivalent to regex \d or [0-9]
- <vertical> - matches a vertical tab character. equivalent to regex \v
- <word> - matches a word character (any latin letter, any digit or an underscore). equivalent to regex \w or [a-zA-Z0-9_]
- <alphabetic> - matches any single latin letter. equivalent to regex [a-zA-Z]
- <alphanumeric> - matches any single latin letter or any single digit. equivalent to regex [a-zA-Z0-9]
- <boundary> - matches the position between a character matched by <word> and a character not matched by <word>, without consuming a character. equivalent to regex \b
- <backspace> - matches a backspace control character. equivalent to regex [\b]
All symbols can be preceded by not to match any character other than the symbol.
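The effect of negating a symbol can be sketched with the regex equivalents from the list above. The negated forms used here (\D and [^a-zA-Z]) are the conventional regex negations and are an assumption of this example; the exact output Melody emits for not <digit> or not <alphabetic> was not verified. Python's re module and the sample string are likewise only for illustration.

```python
import re

sample = "room 101_b"

# <digit> is \d; a negated digit is conventionally written \D.
digits = re.findall(r"\d", sample)
non_digits = re.findall(r"\D", sample)

# <alphabetic> is [a-zA-Z]; its negation is the negated class [^a-zA-Z].
non_letters = re.findall(r"[^a-zA-Z]", sample)

print(digits)       # ['1', '0', '1']
print(non_digits)   # ['r', 'o', 'o', 'm', ' ', '_', 'b']
print(non_letters)  # [' ', '1', '0', '1', '_']
```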
Special Symbols
- <start> - matches the start of the string. equivalent to regex ^
- <end> - matches the end of the string. equivalent to regex $
Unicode Categories
Note: these are not supported when testing in the CLI (-t or -f) because the regex engine used there does not support Unicode categories. These require using the u flag.
- <category::letter> - any kind of letter from any language
  - <category::lowercase_letter> - a lowercase letter that has an uppercase variant
  - <category::uppercase_letter> - an uppercase letter that has a lowercase variant
  - <category::titlecase_letter> - a letter that appears at the start of a word when only the first letter of the word is capitalized
  - <category::cased_letter> - a letter that exists in lowercase and uppercase variants
  - <category::modifier_letter> - a special character that is used like a letter
  - <category::other_letter> - a letter or ideograph that does not have lowercase and uppercase variants
- <category::mark> - a character intended to be combined with another character (e.g. accents, umlauts, enclosing boxes, etc.)
  - <category::non_spacing_mark> - a character intended to be combined with another character without taking up extra space (e.g. accents, umlauts, etc.)
  - <category::spacing_combining_mark> - a character intended to be combined with another character that takes up extra space (vowel signs in many Eastern languages)
  - <category::enclosing_mark> - a character that encloses the character it is combined with (circle, square, keycap, etc.)
- <category::separator> - any kind of whitespace or invisible separator
  - <category::space_separator> - a whitespace character that is invisible, but does take up space
  - <category::line_separator> - line separator character U+2028
  - <category::paragraph_separator> - paragraph separator character U+2029
- <category::symbol> - math symbols, currency signs, dingbats, box-drawing characters, etc.
  - <category::math_symbol> - any mathematical symbol
  - <category::currency_symbol> - any currency sign
  - <category::modifier_symbol> - a combining character (mark) as a full character on its own
  - <category::other_symbol> - various symbols that are not math symbols, currency signs, or combining characters
- <category::number> - any kind of numeric character in any script
  - <category::decimal_digit_number> - a digit zero through nine in any script except ideographic scripts
  - <category::letter_number> - a number that looks like a letter, such as a Roman numeral
  - <category::other_number> - a superscript or subscript digit, or a number that is not a digit 0–9 (excluding numbers from ideographic scripts)
- <category::punctuation> - any kind of punctuation character
  - <category::dash_punctuation> - any kind of hyphen or dash
  - <category::open_punctuation> - any kind of opening bracket
  - <category::close_punctuation> - any kind of closing bracket
  - <category::initial_punctuation> - any kind of opening quote
  - <category::final_punctuation> - any kind of closing quote
  - <category::connector_punctuation> - a punctuation character such as an underscore that connects words
  - <category::other_punctuation> - any kind of punctuation character that is not a dash, bracket, quote or connector
- <category::other> - invisible control characters and unused code points
  - <category::control> - an ASCII or Latin-1 control character: 0x00–0x1F and 0x7F–0x9F
  - <category::format> - invisible formatting indicator
  - <category::private_use> - any code point reserved for private use
  - <category::surrogate> - one half of a surrogate pair in UTF-16 encoding
  - <category::unassigned> - any code point to which no character has been assigned
These descriptions are from regular-expressions.info
Character Ranges
- ... to ... - used with digits or alphabetic characters to express a character range. equivalent to regex [5-9] (assuming 5 to 9) or [a-z] (assuming a to z)
Literals
- "..." or '...' - used to mark a literal part of the match. Melody will automatically escape characters as needed. Quotes (of the same kind surrounding the literal) should be escaped
Raw
- `...` - added directly to the output without any escaping
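The difference between an escaped literal and a raw sequence matters whenever the text contains regex operators. The sketch below illustrates the idea with Python's re.escape, used here only as a stand-in for the escaping Melody performs on literals; it is not Melody's implementation, and the sample string is made up for illustration.

```python
import re

literal = "1+1=2 (really?)"

# Inserted without escaping (as a raw sequence would be), "+", "(", ")" and "?"
# act as regex operators, so the pattern does not match its own text.
unescaped_match = re.search(literal, literal)
print(unescaped_match)  # None

# Escaping the special characters makes the text match literally,
# which is what a quoted literal is meant to achieve.
escaped = re.escape(literal)
escaped_match = re.fullmatch(escaped, literal)
print(escaped_match is not None)  # True
```

Raw sequences are therefore best reserved for regex syntax that has no Melody equivalent yet, while ordinary text belongs in quoted literals.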
Groups
- capture - used to open a capture or named capture block. captured patterns are later available in the list of matches (either positional or named). equivalent to regex (...)
- match - used to open a match block, matches the contents without capturing. equivalent to regex (?:...)
- either - used to open an either block, matches one of the statements within the block. equivalent to regex (?:...|...)
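The practical difference between the three group kinds shows up in what ends up in the list of matches. Here is a small sketch using Python's re module (an assumption of this example). Python writes named groups as (?P<name>...) where Melody's output uses (?<name>...), and the Melody snippets in the comments are illustrative rather than compiler-verified.

```python
import re

text = "2022-04"

# capture year { 4 of <digit>; } corresponds to a named capturing group.
named = re.match(r"(?P<year>\d{4})-(\d{2})", text)
print(named.group("year"))  # 2022
print(named.groups())       # ('2022', '04')

# match { 4 of <digit>; } corresponds to (?:...): it groups the contents
# without adding an entry to the list of matches.
non_capturing = re.match(r"(?:\d{4})-(\d{2})", text)
print(non_capturing.groups())  # ('04',)

# either { "cat"; "dog"; } corresponds to the alternation (?:cat|dog).
animals = re.findall(r"(?:cat|dog)s?", "cats and dogs")
print(animals)  # ['cats', 'dogs']
```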
Assertions
- ahead - used to open an ahead block. equivalent to regex (?=...). use after an expression
- behind - used to open a behind block. equivalent to regex (?<=...). use before an expression
Assertions can be preceded by not to create a negative assertion (equivalent to regex (?!...), (?<!...))
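Assertions check the surrounding text without including it in the match. The sketch below runs the four regex forms through Python's re module. Python, the sample strings, and the Melody snippets in the comments are assumptions made for illustration; the exact regex Melody emits for them was not verified here.

```python
import re

# some of <digit>; ahead { "px"; }    roughly  \d+(?=px)
followed_by_px = re.findall(r"\d+(?=px)", "12px 3em 40px")
print(followed_by_px)  # ['12', '40']

# behind { "$"; } some of <digit>;    roughly  (?<=\$)\d+
after_dollar = re.findall(r"(?<=\$)\d+", "$42 and 7 and $9")
print(after_dollar)  # ['42', '9']

# "q"; not ahead { "u"; }             roughly  q(?!u)
q_without_u = [m.start() for m in re.finditer(r"q(?!u)", "qatar quit")]
print(q_without_u)  # [0]

# not behind { "un"; } "happy";       roughly  (?<!un)happy
plain_happy = [m.start() for m in re.finditer(r"(?<!un)happy", "happy unhappy")]
print(plain_happy)  # [0]
```

In each case the asserted text ("px", "$", "u", "un") is inspected but never becomes part of the returned match.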
Variables
- let .variable_name = { ... } - defines a variable from a block of statements. can later be used with .variable_name. Variables must be declared before being used. Variable invocations cannot be quantified directly, use a group if you want to quantify a variable invocation
  example:
  let .a_and_b = {
    "a";
    "b";
  }
  .a_and_b;
  "c";
  // abc
Extras
- /* ... */, // ... - used to mark comments (note: // ... comments must be on a separate line)
Future Feature Status
🐣 - Partially implemented
❌ - Not implemented
❔ - Unclear what the syntax will be
❓ - Unclear whether this will be implemented
| Melody | Regex | Status |
|---|---|---|
| not "A"; | [^A] | 🐣 |
| variables / macros | | 🐣 |
| <...::...> | \p{...} | 🐣 |
| not <...::...> | \P{...} | 🐣 |
| file watcher | | ❌ |
| multiline groups in REPL | | ❌ |
| flags: global, multiline, ... | /.../gm... | ❔ |
| (?) | \# | ❔ |
| (?) | \k<name> | ❔ |
| (?) | \uYYYY | ❔ |
| (?) | \xYY | ❔ |
| (?) | \ddd | ❔ |
| (?) | \cY | ❔ |
| (?) | $1 | ❔ |
| (?) | $` | ❔ |
| (?) | $& | ❔ |
| (?) | x20 | ❔ |
| (?) | x{06fa} | ❔ |
| any of "a", "b", "c" * | [abc] | ❓ |
| multiple ranges * | [a-zA-Z0-9] | ❓ |
| regex optimization | | ❓ |
| standard library / patterns | | ❓ |
| reverse compiler | | ❓ |
* these are expressible in the current syntax using other methods
Performance
Last measured on v0.20.0
Measured on an 8 core 2021 MacBook Pro 14-inch, Apple M1 Pro using criterion:
- 8 lines:
  compiler/normal (8 lines)
      time:   [4.3556 µs 4.3674 µs 4.3751 µs]
      slope   [4.3556 µs 4.3751 µs]    R^2            [0.9996144 0.9996931]
      mean    [4.3377 µs 4.3678 µs]    std. dev.      [16.019 ns 30.154 ns]
      median  [4.3270 µs 4.3777 µs]    med. abs. dev. [3.1402 ns 41.334 ns]
- 1M lines:
  compiler/long input (1M lines)
      time:   [470.04 ms 472.35 ms 474.78 ms]
      mean    [470.04 ms 474.78 ms]    std. dev.      [2.0458 ms 5.3453 ms]
      median  [469.54 ms 475.24 ms]    med. abs. dev. [734.10 µs 6.8144 ms]
- Deeply nested:
  compiler/deeply nested
      time:   [4.2357 µs 4.2561 µs 4.2782 µs]
      slope   [4.2357 µs 4.2782 µs]    R^2            [0.9988854 0.9988087]
      mean    [4.2474 µs 4.2752 µs]    std. dev.      [13.698 ns 29.574 ns]
      median  [4.2426 µs 4.2819 µs]    med. abs. dev. [2.7127 ns 43.193 ns]
To reproduce, run cargo bench or cargo xtask benchmark.
Extensions
Packages
Integrations
Changelog
[v0.20.0] - 2024-11-24
Breaking
- Sets the MSRV to Rust 1.70.0
Fixes
- Removes use of atty as it is unmaintained and has a low CVE
Dependencies
- Updates dependencies
Refactoring
- Clippy fixes
[v0.19.0] - 2023-07-16
Breaking
- Sets the MSRV to Rust 1.65.0
Features
- Adds console.error output for panics on the Wasm version
- Deno no longer requires an init function
Fixes
- Fixes a few edge cases with hyphens and slashes
Dependencies
- Updates dependencies
Refactoring
- Clippy fixes
[v0.18.1] - 2022-06-25
Fixes
- Fixes playground link
Dependencies
- Updates dependencies
Refactoring
- Clippy fixes
[v0.18.0] - 2022-04-24
Features
- Adds support for unicode categories
Misc.
- Update dependencies
[v0.17.0] - 2022-04-23
Features
- Add support for testing matches in a file in the CLI
Refactoring
- Remove anyhow in compiler in favor of emitting specific error variants
[v0.16.0] - 2022-04-13
Features
- Add support for testing matches in CLI
[v0.15.0] - 2022-04-13
Features
- Add shell completions for CLI
- Add Deno support
[v0.14.0] - 2022-04-11
Features
- Support stdin in CLI
- Emit proper exit codes on specific errors
[v0.13.10] - 2022-03-11
Fixes
- Fixes unnecessary grouping in quantifiers
[v0.13.9] - 2022-03-11
Misc.
- Version bump for documentation update
[v0.13.8] - 2022-03-11
Misc.
- Version bump for documentation update
[v0.13.7] - 2022-03-11
Misc.
- Version bump for documentation update
[v0.13.6] - 2022-03-11
Fixes
- Handles a few possible panics
[v0.13.5] - 2022-03-11
Misc.
- Version bump
[v0.13.4] - 2022-03-11
Tooling
- Strips binaries
Dependencies
- Updates dependencies
[v0.13.3] - 2022-03-09
Refactoring
- Replaces lazy_static with once_cell
[v0.13.2] - 2022-03-09
Performance
- Improves literal parse performance
Refactoring
- Reports a few possible panics with a ParseError
[v0.13.1] - 2022-03-08
Fixes
- Fixes an issue with single letter variable identifiers matching a following space
- Fixes a clash between REPL commands and variables
[v0.13.0] - 2022-03-08
Breaking
- <alphabet> is now <alphabetic>
Features
- Support for lazy quantifiers
- All symbols now have negative counterparts
- <alphanumeric> symbol added
- Adds an experimental implementation of variables
[v0.12.4] - 2022-03-06
Misc.
- Version bump
[v0.12.3] - 2022-03-06
Fixes
- Fixes an issue with identifying negative char ranges
[v0.12.2] - 2022-03-05
Refactoring
- Performance improvements
Misc.
- Adds keywords and categories to cargo.toml files
[v0.12.1] - 2022-03-04
Misc.
- CLI documentation update
[v0.12.0] - 2022-03-04
Breaking
- Produces clean output (no // and new newline after output)
Features
- Adds favicons for documentation and playground
- The Melody playground now supports add to homescreen
- Adds #![forbid(unsafe_code)]
Benchmarks
- Adds benchmarks
[v0.11.1] - 2022-03-03
Fixes
- Fixes possible panics
Tests
- Adds tests
- Adds tests for CLI
Refactoring
- Removes duplicated code
[v0.11.0] - 2022-03-02
Breaking
- ParseError now contains only one message field, may be changed in the future
- Line comments (//) may only be used in a separate line
- The REPL currently accepts blocks on a single line but not multiple lines
- Semicolons are no longer optional
Features
- Uses a Pest grammar and an AST to parse Melody
- Adds support for nested groups
- Adds support for negative ranges
- Adds initial support for negative character classes
- Adds support for <backspace>, <boundary>
- Adds support for inline comments
- Enforces group closing
- Supports NO_COLOR in CLI
- -n removes color from REPL as well
[v0.10.3] - 2022-02-26
Fixes
- Removes quantifiers after newlines
[v0.10.2] - 2022-02-26
Fixes
- Fixes the handling of some newline issues in the REPL
- Adds an error message for a read error in the REPL
[v0.10.1] - 2022-02-26
Fixes
- Trims only the end off of REPL input
[v0.10.0] - 2022-02-26
Breaking
- Changes the -f, --file CLI argument to -o, --output
Features
- Adds descriptions to CLI commands
[v0.9.0] - 2022-02-26
Features
- Adds ahead, not ahead, behind and not behind assertions
[v0.8.0] - 2022-02-26
Features
- Changes <space> to <whitespace> (thanks @amirali #34)
- Adds <space> and <alphabet> (thanks @amirali #34)
- Adds long versions for REPL commands
- Adds .s, .source to print the current source in the REPL
- Adds .c, .clear to clear REPL history
- Adds better error reporting to the playground
Fixes
- Fixes some undo / redo issues in the REPL
Refactoring
- Better error handling in the CLI
[v0.7.0] - 2022-02-24
Features
- Adds a REPL for melody_cli
- Adds better error messages for the playground
[v0.6.0] - 2022-02-23
Features
- Adds support for raw sequences (`...`)
- Allows any word character in capture names
- Adds auto escaping for literals
- Adds the Melody version number to the documentation
Syntax Changes
- Changes start, end, and char to symbols (<start>, <end>, <char>)
- either creates a non capturing group
Refactoring
- cargo clippy fixes in melody_wasm
Fixes
- Uses the correct url in the documentation site config