Dust
Dust is a high-level interpreted programming language with static types that focuses on ease of use, performance and correctness.
Feature Progress
- Lexer
- Compiler
- VM
- Formatter
- CLI
- Run source
- Compile to chunk and show disassembly
- Tokenize using the lexer and show token list
- Format using the formatter and display the output
- Compile to and run from intermediate formats
- JSON
- Postcard
- Values
- No `null` or `undefined`
- Booleans
- Bytes
- Characters
- Enums
- Integers
- Floats
- Functions
- Lists
- Maps
- Ranges
- Strings
- Structs
- Tuples
- Runtime-efficient abstract values for lists and maps
- No
- Types
- Basic types for each kind of value
- Generalized types: `num`, `any`
- `struct` types
- `enum` types
- Type arguments
- Type Checking
- Function returns
- If/Else branches
- Instruction arguments
- Variables
- Immutable by default
- Block scope
- Statically typed
- Functions
- First-class value
- Statically typed arguments and returns
- Pure (does not "inherit" local variables - only arguments)
- Type arguments
Implementation
Dust is implemented in Rust and is divided into several parts, primarily the lexer, compiler, and virtual machine. All of Dust's components are designed with performance in mind and the codebase uses as few dependencies as possible.
Lexer
The lexer emits tokens from the source code. Dust makes extensive use of Rust's zero-copy capabilities to avoid unnecessary allocations when creating tokens. A token, depending on its type, may contain a reference to some data from the source code. The data is copied only when an error occurs, because errors that own their data are easier to work with throughout the codebase. In a successfully executed program, no part of the source code is copied unless it is a string literal or identifier.
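The borrowing scheme described above can be illustrated with a minimal sketch. This is not Dust's actual lexer: the `Token`, `LexError`, and `Lexer` types and the tiny token set are hypothetical and chosen only to show tokens borrowing from the source while errors own their data.

```rust
/// A token that borrows its text from the source instead of copying it.
/// (Hypothetical token set, not Dust's real one.)
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Token<'src> {
    Identifier(&'src str),
    Integer(&'src str),
    Plus,
}

/// An error that owns its data, so it can outlive the source buffer.
#[derive(Debug, PartialEq)]
pub struct LexError {
    pub unexpected: String,
    pub position: usize,
}

pub struct Lexer<'src> {
    source: &'src str,
    position: usize,
}

impl<'src> Lexer<'src> {
    pub fn new(source: &'src str) -> Self {
        Lexer { source, position: 0 }
    }

    /// Returns the next token, or `Ok(None)` at the end of the input.
    pub fn next_token(&mut self) -> Result<Option<Token<'src>>, LexError> {
        let source: &'src str = self.source;
        let rest = &source[self.position..];
        let trimmed = rest.trim_start();

        // Skip over any leading whitespace.
        self.position += rest.len() - trimmed.len();

        let Some(first) = trimmed.chars().next() else {
            return Ok(None);
        };

        let (token, length) = if first == '+' {
            (Token::Plus, 1)
        } else if first.is_ascii_digit() {
            let length = trimmed
                .find(|c: char| !c.is_ascii_digit())
                .unwrap_or(trimmed.len());

            // The token is a slice of the source: no allocation, no copy.
            (Token::Integer(&trimmed[..length]), length)
        } else if first.is_alphabetic() || first == '_' {
            let length = trimmed
                .find(|c: char| !(c.is_alphanumeric() || c == '_'))
                .unwrap_or(trimmed.len());

            (Token::Identifier(&trimmed[..length]), length)
        } else {
            // Only on the error path is source data copied into an owned String.
            return Err(LexError {
                unexpected: first.to_string(),
                position: self.position,
            });
        };

        self.position += length;

        Ok(Some(token))
    }
}
```

In this sketch, lexing `foo + 42` yields `Identifier("foo")`, `Plus`, and `Integer("42")`, where each string slice points into the original source buffer.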
Compiler
The compiler creates a chunk, which contains all of the data needed by the virtual machine to run a Dust program. It does so by emitting bytecode instructions, constants and locals while parsing the tokens, which are generated one at a time by the lexer.
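As a rough illustration of a chunk being filled while tokens arrive one at a time, here is a minimal sketch. The `Chunk` layout, the `Instruction` variant, and the `let <name> = <integer>` grammar are assumptions made for this example and do not reflect Dust's real data structures or syntax.

```rust
/// Hypothetical tokens, already produced by some lexer.
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Let,
    Identifier(String),
    Equal,
    Integer(i64),
}

/// Hypothetical instruction: load a constant into a register.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Instruction {
    LoadConstant { register: u8, constant: u8 },
}

#[derive(Debug, Clone, PartialEq)]
pub struct Local {
    pub name: String,
    pub register: u8,
}

/// Everything the virtual machine would need to run the program.
#[derive(Debug, Default, PartialEq)]
pub struct Chunk {
    pub instructions: Vec<Instruction>,
    pub constants: Vec<i64>,
    pub locals: Vec<Local>,
}

/// Compiles a sequence of `let <name> = <integer>` declarations,
/// pulling one token at a time from the iterator.
pub fn compile(mut tokens: impl Iterator<Item = Token>) -> Result<Chunk, String> {
    let mut chunk = Chunk::default();

    while let Some(token) = tokens.next() {
        if token != Token::Let {
            return Err(format!("expected `let`, found {token:?}"));
        }

        let name = match tokens.next() {
            Some(Token::Identifier(name)) => name,
            other => return Err(format!("expected an identifier, found {other:?}")),
        };

        match tokens.next() {
            Some(Token::Equal) => {}
            other => return Err(format!("expected `=`, found {other:?}")),
        }

        let value = match tokens.next() {
            Some(Token::Integer(value)) => value,
            other => return Err(format!("expected an integer, found {other:?}")),
        };

        // Reuse an existing constant when the same value was seen before.
        let constant = match chunk.constants.iter().position(|c| *c == value) {
            Some(index) => index,
            None => {
                chunk.constants.push(value);
                chunk.constants.len() - 1
            }
        };

        // Each local gets the next free register in this simplified model.
        let register = chunk.locals.len() as u8;

        chunk.instructions.push(Instruction::LoadConstant {
            register,
            constant: constant as u8,
        });
        chunk.locals.push(Local { name, register });
    }

    Ok(chunk)
}
```

Because `compile` accepts any iterator, the tokens never need to be collected into a list first; a lexer that implements `Iterator` could be passed in directly.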
Parsing
Dust's compiler uses a custom Pratt parser, a kind of recursive descent parser, to translate a sequence of tokens into a chunk.
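For readers unfamiliar with Pratt parsing, the core loop can be sketched as follows. This is a generic illustration of the technique rather than Dust's parser: the `Token` and `Instruction` types, the precedence table, and the one-register-per-result allocation are all assumptions.

```rust
use std::iter::Peekable;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Token {
    Integer(i64),
    Plus,
    Minus,
    Star,
    Slash,
}

/// Hypothetical register-based instructions.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Instruction {
    Load { destination: u8, value: i64 },
    Binary { operator: Token, destination: u8, left: u8, right: u8 },
}

/// Precedence of an infix operator, or `None` if the token is not one.
fn precedence(token: Token) -> Option<u8> {
    match token {
        Token::Plus | Token::Minus => Some(1),
        Token::Star | Token::Slash => Some(2),
        Token::Integer(_) => None,
    }
}

pub struct Parser<I: Iterator<Item = Token>> {
    tokens: Peekable<I>,
    pub instructions: Vec<Instruction>,
    next_register: u8,
}

impl<I: Iterator<Item = Token>> Parser<I> {
    pub fn new(tokens: I) -> Self {
        Parser {
            tokens: tokens.peekable(),
            instructions: Vec::new(),
            next_register: 0,
        }
    }

    fn allocate_register(&mut self) -> u8 {
        let register = self.next_register;
        self.next_register += 1;
        register
    }

    /// Parses an expression, consuming only operators that bind tighter than
    /// `min_precedence`. Returns the register that holds the result.
    pub fn parse_expression(&mut self, min_precedence: u8) -> Result<u8, String> {
        // Prefix position: only integer literals in this sketch.
        let mut left = match self.tokens.next() {
            Some(Token::Integer(value)) => {
                let destination = self.allocate_register();
                self.instructions.push(Instruction::Load { destination, value });
                destination
            }
            other => return Err(format!("expected an integer, found {other:?}")),
        };

        // Infix position: keep extending `left` while operators bind tightly enough.
        while let Some(&operator) = self.tokens.peek() {
            let Some(power) = precedence(operator) else {
                return Err(format!("expected an operator, found {operator:?}"));
            };

            if power <= min_precedence {
                break;
            }

            self.tokens.next();

            // Recursing with the operator's own precedence makes it left-associative.
            let right = self.parse_expression(power)?;
            let destination = self.allocate_register();

            self.instructions.push(Instruction::Binary {
                operator,
                destination,
                left,
                right,
            });

            left = destination;
        }

        Ok(left)
    }
}
```

Calling `parse_expression(0)` on the tokens for `1 + 2 * 3` emits the multiplication before the addition, because `*` has the higher precedence and is parsed by the recursive call.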
Optimizing
When generating instructions for a register-based virtual machine, there are opportunities to optimize the generated code, usually by consolidating register use or reusing registers within an expression. While it is best to output optimal code in the first place, it is not always possible. Dust's compiler has a simple peephole optimizer that can be used to modify isolated sections of the instruction list through a mutable reference.
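A peephole pass of this kind might look like the sketch below. The instruction set and the `fold_trailing_move` function are hypothetical and not taken from Dust. The rewrite also assumes that the moved-from register is a temporary that nothing reads afterwards, which a real optimizer would have to verify.

```rust
/// Hypothetical register-based instructions.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Instruction {
    Load { destination: u8, value: i64 },
    Add { destination: u8, left: u8, right: u8 },
    Move { from: u8, to: u8 },
}

impl Instruction {
    /// Mutable access to the register this instruction writes to.
    fn destination_mut(&mut self) -> &mut u8 {
        match self {
            Instruction::Load { destination, .. }
            | Instruction::Add { destination, .. } => destination,
            Instruction::Move { to, .. } => to,
        }
    }
}

/// Looks at the last two instructions only. If the final instruction moves
/// the register that the previous instruction just wrote, the previous
/// instruction is retargeted and the move is dropped.
///
/// Assumption: the moved-from register is not read by any later instruction.
/// Returns `true` if the instruction list was modified.
pub fn fold_trailing_move(instructions: &mut Vec<Instruction>) -> bool {
    let [.., previous, Instruction::Move { from, to }] = instructions.as_mut_slice() else {
        return false;
    };
    let (from, to) = (*from, *to);
    let destination = previous.destination_mut();

    if *destination != from {
        return false;
    }

    // Write the result directly into the final register.
    *destination = to;

    // The move is now redundant.
    instructions.pop();

    true
}
```

Because the function takes `&mut Vec<Instruction>`, a compiler could call it right after emitting each instruction, so the optimization happens in place without building a second instruction list.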