Add some more README content
parent
9331ae1a40
commit
3b23136eda
111
README.md
@ -1,10 +1,15 @@
|
|||||||
# Dust
|
# Dust
|
||||||
|
|
||||||
Dust is a high-level interpreted programming language with static types that focuses on ease of use,
|
Dust is a high-level interpreted programming language with static types that focuses on ease of use,
|
||||||
performance and correctness.
|
performance and correctness. The syntax, safety features and evaluation model are inspired by Rust.
|
||||||
|
Due to being interpreted, Dust's total time to execution is much lower than Rust's. Unlike other
|
||||||
|
interpreted languages, Dust is type-safe, with a simple yet powerful type system that enhances the
|
||||||
|
clarity and correctness of a program.
|
||||||
|
|
||||||
## Feature Progress
|
## Feature Progress
|
||||||
|
|
||||||
|
Dust is still in development. This list may change as the language evolves.
|
||||||
|
|
||||||
- [X] Lexer
|
- [X] Lexer
|
||||||
- [X] Compiler
|
- [X] Compiler
|
||||||
- [X] VM
|
- [X] VM
|
||||||
@ -18,18 +23,13 @@ performance and correctness.
|
|||||||
- [ ] JSON
|
- [ ] JSON
|
||||||
- [ ] Postcard
|
- [ ] Postcard
|
||||||
- Values
|
- Values
|
||||||
|
- [X] Basic values: booleans, bytes, characters, integers, floats, UTF-8 strings
|
||||||
- [X] No `null` or `undefined`
|
- [X] No `null` or `undefined`
|
||||||
- [X] Booleans
|
|
||||||
- [X] Bytes
|
|
||||||
- [X] Characters
|
|
||||||
- [ ] Enums
|
- [ ] Enums
|
||||||
- [X] Integers
|
|
||||||
- [X] Floats
|
|
||||||
- [X] Functions
|
- [X] Functions
|
||||||
- [X] Lists
|
- [X] Lists
|
||||||
- [ ] Maps
|
- [ ] Maps
|
||||||
- [X] Ranges
|
- [ ] Ranges
|
||||||
- [X] Strings
|
|
||||||
- [ ] Structs
|
- [ ] Structs
|
||||||
- [ ] Tuples
|
- [ ] Tuples
|
||||||
- [ ] Runtime-efficient abstract values for lists and maps
|
- [ ] Runtime-efficient abstract values for lists and maps
|
||||||
@ -38,25 +38,46 @@ performance and correctness.
|
|||||||
- [X] Generalized types: `num`, `any`
|
- [X] Generalized types: `num`, `any`
|
||||||
- [ ] `struct` types
|
- [ ] `struct` types
|
||||||
- [ ] `enum` types
|
- [ ] `enum` types
|
||||||
|
- [ ] Type aliases
|
||||||
- [ ] Type arguments
|
- [ ] Type arguments
|
||||||
- [ ] Type Checking
|
- [ ] Compile-time type checking
|
||||||
- [ ] Function returns
|
- [ ] Function returns
|
||||||
- [X] If/Else branches
|
- [X] If/Else branches
|
||||||
- [ ] Instruction arguments
|
- [ ] Instruction arguments
|
||||||
|
- [ ] Runtime type checking for debug compilation modes
|
||||||
- Variables
|
- Variables
|
||||||
- [X] Immutable by default
|
- [X] Immutable by default
|
||||||
- [X] Block scope
|
- [X] Block scope
|
||||||
- [X] Statically typed
|
- [X] Statically typed
|
||||||
|
- [X] Copy-free identifiers are stored in the chunk as string constants
|
||||||
- Functions
|
- Functions
|
||||||
- [X] First-class value
|
- [X] First-class value
|
||||||
- [X] Statically typed arguments and returns
|
- [X] Statically typed arguments and returns
|
||||||
- [X] Pure (does not "inherit" local variables - only arguments)
|
- [X] Pure (no "closure" of local variables, arguments are the only input)
|
||||||
- [ ] Type arguments
|
- [ ] Type arguments
|
||||||
|
- Control Flow
|
||||||
|
- [X] If/Else
|
||||||
|
- [ ] Loops
|
||||||
|
- [ ] `for`
|
||||||
|
- [ ] `loop`
|
||||||
|
- [X] `while`
|
||||||
|
- [ ] Match
|
||||||
|
- Instructions
|
||||||
|
- [X] Arithmetic
|
||||||
|
- [X] Boolean
|
||||||
|
- [X] Call
|
||||||
|
- [X] Constant
|
||||||
|
- [X] Control flow
|
||||||
|
- [X] Load
|
||||||
|
- [X] Store
|
||||||
|
- [X] Return
|
||||||
|
- [X] Stack
|
||||||
|
- [X] Unary
|
||||||
|
|
||||||
## Implementation
|
## Implementation
|
||||||
|
|
||||||
Dust is implemented in Rust and is divided into several parts, primarily the lexer, compiler, and
|
Dust is implemented in Rust and is divided into several parts, most importantly the lexer, compiler,
|
||||||
virtual machine. All of Dust's components are designed with performance in mind and the codebase
|
and virtual machine. All of Dust's components are designed with performance in mind and the codebase
|
||||||
uses as few dependencies as possible.
|
uses as few dependencies as possible.
|
||||||
|
|
||||||
### Lexer
|
### Lexer
|
||||||
@ -74,27 +95,77 @@ The compiler creates a chunk, which contains all of the data needed by the virtu
|
|||||||
Dust program. It does so by emitting bytecode instructions, constants and locals while parsing the
|
Dust program. It does so by emitting bytecode instructions, constants and locals while parsing the
|
||||||
tokens, which are generated one at a time by the lexer.
|
tokens, which are generated one at a time by the lexer.
|
||||||
|
|
||||||
|
Types are checked during parsing and each emitted instruction is associated with a type.
|
||||||
|
|
||||||
#### Parsing
|
#### Parsing
|
||||||
|
|
||||||
Dust's compiler uses a custom Pratt parser, a kind of recursive descent parser, to translate a
|
Dust's compiler uses a custom Pratt parser, a kind of recursive descent parser, to translate a
|
||||||
sequence of tokens into a chunk.
|
sequence of tokens into a chunk. Each token is given a precedence and may have a prefix and/or infix
|
||||||
|
parser. The parsers are just functions that modify the compiler and its output. For example, when
|
||||||
|
the compiler encounters a boolean token, its prefix parser is the `parse_boolean` function, which
|
||||||
|
emits a `LoadBoolean` instruction. An integer token's prefix parser is `parse_integer`, which emits
|
||||||
|
a `LoadConstant` instruction and adds the integer to the constant list. Tokens with infix parsers
|
||||||
|
include the math operators, which emit `Add`, `Subtract`, `Multiply`, `Divide`, and `Modulo`
|
||||||
|
instructions.
|
||||||
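The precedence-driven dispatch described above can be sketched roughly as follows. This is a minimal illustration, not Dust's actual compiler: only `parse_boolean`, `parse_integer` and the instruction names come from the README, while `Token`, `Rule`, `Compiler`, `parse_math`, the register allocation and the operand layout are all assumptions made for the example.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Token { Boolean(bool), Integer(i64), Plus, Minus, Star, Slash, Percent }

// Assumed operand layout: destination register first, then the operands.
#[derive(Debug, PartialEq)]
enum Instruction {
    LoadBoolean(u16, bool),
    LoadConstant(u16, u16),
    Add(u16, u16, u16),
    Subtract(u16, u16, u16),
    Multiply(u16, u16, u16),
    Divide(u16, u16, u16),
    Modulo(u16, u16, u16),
}

// Parsers are plain functions that mutate the compiler and return the
// register holding their result.
type PrefixParser = fn(&mut Compiler, Token) -> Option<u16>;
type InfixParser = fn(&mut Compiler, Token, u16) -> Option<u16>;

struct Rule { prefix: Option<PrefixParser>, infix: Option<InfixParser>, precedence: u8 }

#[derive(Default)]
struct Compiler {
    tokens: Vec<Token>,
    position: usize,
    instructions: Vec<Instruction>,
    constants: Vec<i64>,
    next_register: u16,
}

// Hypothetical rule table: every token gets a precedence and optional parsers.
fn rule(token: Token) -> Rule {
    match token {
        Token::Boolean(_) => Rule { prefix: Some(parse_boolean), infix: None, precedence: 0 },
        Token::Integer(_) => Rule { prefix: Some(parse_integer), infix: None, precedence: 0 },
        Token::Plus | Token::Minus => Rule { prefix: None, infix: Some(parse_math), precedence: 1 },
        Token::Star | Token::Slash | Token::Percent => {
            Rule { prefix: None, infix: Some(parse_math), precedence: 2 }
        }
    }
}

impl Compiler {
    fn allocate_register(&mut self) -> u16 {
        self.next_register += 1;
        self.next_register - 1
    }

    fn peek(&self) -> Option<Token> {
        self.tokens.get(self.position).copied()
    }

    /// Parses an expression, consuming only infix operators that bind
    /// tighter than `min_precedence`.
    fn parse_expression(&mut self, min_precedence: u8) -> Option<u16> {
        let token = self.peek()?;
        self.position += 1;
        let prefix = rule(token).prefix?;
        let mut left = prefix(self, token)?;
        while let Some(operator) = self.peek() {
            let Rule { infix, precedence, .. } = rule(operator);
            match infix {
                Some(infix) if precedence > min_precedence => {
                    self.position += 1;
                    left = infix(self, operator, left)?;
                }
                _ => break,
            }
        }
        Some(left)
    }
}

fn parse_boolean(compiler: &mut Compiler, token: Token) -> Option<u16> {
    let Token::Boolean(value) = token else { return None };
    let dest = compiler.allocate_register();
    compiler.instructions.push(Instruction::LoadBoolean(dest, value));
    Some(dest)
}

fn parse_integer(compiler: &mut Compiler, token: Token) -> Option<u16> {
    let Token::Integer(value) = token else { return None };
    // The integer goes into the constant list; the instruction refers to its index.
    let constant = compiler.constants.len() as u16;
    compiler.constants.push(value);
    let dest = compiler.allocate_register();
    compiler.instructions.push(Instruction::LoadConstant(dest, constant));
    Some(dest)
}

fn parse_math(compiler: &mut Compiler, operator: Token, left: u16) -> Option<u16> {
    // Parsing the right side at the operator's own precedence keeps the
    // operators left-associative.
    let right = compiler.parse_expression(rule(operator).precedence)?;
    let dest = compiler.allocate_register();
    compiler.instructions.push(match operator {
        Token::Plus => Instruction::Add(dest, left, right),
        Token::Minus => Instruction::Subtract(dest, left, right),
        Token::Star => Instruction::Multiply(dest, left, right),
        Token::Slash => Instruction::Divide(dest, left, right),
        Token::Percent => Instruction::Modulo(dest, left, right),
        _ => return None,
    });
    Some(dest)
}
```

Under these assumptions, compiling the tokens for `1 + 2 * 3` emits three `LoadConstant` instructions, then `Multiply`, then `Add`, because `*` has the higher precedence and its operands are parsed first.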
|
|
||||||
|
Functions are compiled into their own chunks, which are stored in the constant list. A function's
|
||||||
|
arguments are stored in the locals list. The VM must later bind the arguments to runtime values by
|
||||||
|
assigning each argument a register and associating the register with the local.
|
||||||
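One way to picture the relationship between chunks, constants and locals is the sketch below. The `Chunk`, `Constant` and `Local` types and the `bind_arguments` helper are hypothetical names chosen for illustration, and the example assumes that a function's arguments occupy the first entries of its locals list.

```rust
/// Hypothetical constant: either a plain value or a compiled function.
#[derive(Debug, PartialEq)]
enum Constant {
    Integer(i64),
    Function(Chunk),
}

/// Hypothetical local: an identifier plus the register bound to it, if any.
#[derive(Debug, PartialEq)]
struct Local {
    identifier: String,
    register: Option<u16>,
}

/// Hypothetical chunk holding only the parts relevant to this example.
#[derive(Debug, Default, PartialEq)]
struct Chunk {
    constants: Vec<Constant>,
    locals: Vec<Local>,
}

/// Sketch of what the VM might do when calling a function: give each
/// argument its own register and record that register on the local.
/// Assumes the first `argument_count` locals are the arguments.
fn bind_arguments(chunk: &mut Chunk, argument_count: usize) -> Vec<u16> {
    chunk
        .locals
        .iter_mut()
        .take(argument_count)
        .enumerate()
        .map(|(index, local)| {
            let register = index as u16;
            local.register = Some(register);
            register
        })
        .collect()
}
```

In this model a caller's chunk would hold the function as `Constant::Function(chunk)`, and only the locals that are arguments receive registers at call time.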
|
|
||||||
#### Optimizing
|
#### Optimizing
|
||||||
|
|
||||||
When generating instructions for a register-based virtual machine, there are opportunities to
|
When generating instructions for a register-based virtual machine, there are opportunities to
|
||||||
optimize the generated code, usually by consolidating register use or reusing registers within an
|
optimize the generated code by using fewer instructions or fewer registers. While it is best to
|
||||||
expression. While it is best to output optimal code in the first place, it is not always possible.
|
output optimal code in the first place, it is not always possible. Dust's compiler uses simple
|
||||||
Dust's compiler has a simple peephole optimizer that can be used to modify isolated sections of the
|
functions that modify isolated sections of the instruction list through a mutable reference.
|
||||||
instruction list through a mutable reference.
|
|
||||||
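As an illustration of an optimization function that edits the tail of the instruction list through a mutable reference, consider the sketch below. The `Move` instruction, the field names and the specific rewrite are assumptions made for the example. They are not taken from Dust's compiler, and the rewrite is only valid if the temporary register is not read again later.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Instruction {
    LoadConstant { dest: u16, constant: u16 },
    Move { dest: u16, source: u16 }, // hypothetical instruction
}

/// If the last two instructions load a constant into a temporary register
/// and then move it elsewhere, rewrite them as one load into the final
/// destination. Returns `true` when the list was changed.
fn fold_load_then_move(instructions: &mut Vec<Instruction>) -> bool {
    let len = instructions.len();
    if len < 2 {
        return false;
    }
    match (instructions[len - 2], instructions[len - 1]) {
        (
            Instruction::LoadConstant { dest, constant },
            Instruction::Move { dest: target, source },
        ) if source == dest => {
            // Drop the move and retarget the load.
            instructions.truncate(len - 1);
            instructions[len - 2] = Instruction::LoadConstant { dest: target, constant };
            true
        }
        _ => false,
    }
}
```

Because the function inspects only the last two instructions, it can be called right after an instruction is emitted without rescanning the rest of the list.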
|
|
||||||
### Instructions
|
### Instructions
|
||||||
|
|
||||||
|
Dust's virtual machine is register-based and uses 64-bit instructions, which encode nine pieces of
|
||||||
|
information:
|
||||||
|
|
||||||
|
Bit | Description
|
||||||
|
----- | -----------
|
||||||
|
0-8 | The operation code.
|
||||||
|
9 | Boolean flag indicating whether the B argument is a constant.
|
||||||
|
10 | Boolean flag indicating whether the C argument is a constant.
|
||||||
|
11 | Boolean flag indicating whether the A argument is a local.
|
||||||
|
12 | Boolean flag indicating whether the B argument is a local.
|
||||||
|
13 | Boolean flag indicating whether the C argument is a local.
|
||||||
|
17-32 | The A argument.
|
||||||
|
33-48 | The B argument.
|
||||||
|
49-63 | The C argument.
|
||||||
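The layout in the table can be exercised with a small encode/decode sketch. It assumes that bit 0 is the least significant bit of the `u64`. The `Decoded` struct and the `bits`, `encode` and `decode` functions are hypothetical names for this example and are not Dust's actual API.

```rust
/// Hypothetical decoded view of one 64-bit instruction.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Decoded {
    opcode: u16,         // bits 0-8
    b_is_constant: bool, // bit 9
    c_is_constant: bool, // bit 10
    a_is_local: bool,    // bit 11
    b_is_local: bool,    // bit 12
    c_is_local: bool,    // bit 13
    a: u16,              // bits 17-32
    b: u16,              // bits 33-48
    c: u16,              // bits 49-63 (15 bits)
}

/// Extracts `width` bits starting at bit `start`, counting from the least
/// significant bit. `width` must be less than 64.
fn bits(word: u64, start: u32, width: u32) -> u64 {
    (word >> start) & ((1u64 << width) - 1)
}

fn decode(word: u64) -> Decoded {
    Decoded {
        opcode: bits(word, 0, 9) as u16,
        b_is_constant: bits(word, 9, 1) == 1,
        c_is_constant: bits(word, 10, 1) == 1,
        a_is_local: bits(word, 11, 1) == 1,
        b_is_local: bits(word, 12, 1) == 1,
        c_is_local: bits(word, 13, 1) == 1,
        a: bits(word, 17, 16) as u16,
        b: bits(word, 33, 16) as u16,
        c: bits(word, 49, 15) as u16,
    }
}

fn encode(fields: &Decoded) -> u64 {
    // Values wider than their field are masked so they cannot spill into
    // neighboring fields.
    ((fields.opcode as u64) & 0x1FF)
        | ((fields.b_is_constant as u64) << 9)
        | ((fields.c_is_constant as u64) << 10)
        | ((fields.a_is_local as u64) << 11)
        | ((fields.b_is_local as u64) << 12)
        | ((fields.c_is_local as u64) << 13)
        | ((fields.a as u64) << 17)
        | ((fields.b as u64) << 33)
        | (((fields.c as u64) & 0x7FFF) << 49)
}
```

With this layout, `decode(encode(&fields))` returns the original fields as long as `opcode` fits in 9 bits and `c` fits in 15 bits. Bits 14-16 are not described in the table and are left as zero here.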
|
|
||||||
### Virtual Machine
|
### Virtual Machine
|
||||||
|
|
||||||
## Previous Implementations
|
## Previous Implementations
|
||||||
|
|
||||||
|
Dust has gone through several iterations, each with its own unique features and design choices. It
|
||||||
|
was originally implemented with a syntax tree generated by an external parser, then a parser
|
||||||
|
generator, and finally a custom parser. Eventually the language was rewritten to use bytecode
|
||||||
|
instructions and a virtual machine. The current implementation is by far the most performant and the
|
||||||
|
general design is unlikely to change.
|
||||||
|
|
||||||
|
Dust previously had a more complex type system with type arguments (or "generics") and a simple
|
||||||
|
model for asynchronous execution of statements. Both of these features were removed to simplify the
|
||||||
|
language when it was rewritten to use bytecode instructions. Both features are planned to be
|
||||||
|
reintroduced in the future.
|
||||||
|
|
||||||
## Inspiration
|
## Inspiration
|
||||||
|
|
||||||
- [The Implementation of Lua 5.0](https://www.lua.org/doc/jucs05.pdf)
|
[Crafting Interpreters] by Bob Nystrom was a major inspiration for rewriting Dust to use bytecode
|
||||||
- [A No-Frills Introduction to Lua 5.1 VM Instructions](https://www.mcours.net/cours/pdf/hasclic3/hasssclic818.pdf)
|
instructions. It was also a great resource for writing the compiler, especially the Pratt parser.
|
||||||
- [Crafting Interpreters](https://craftinginterpreters.com/)
|
|
||||||
|
[A No-Frills Introduction to Lua 5.1 VM Instructions] by Kein-Hong Man was a great resource for the
|
||||||
|
design of Dust's instructions and operation codes. The Lua VM is simple and efficient, and Dust's VM
|
||||||
|
attempts to be the same, though it is not as optimized for different platforms. Dust's instructions
|
||||||
|
were originally 32-bit like Lua's, but were changed to 64-bit to allow for more complex information
|
||||||
|
about the instruction's arguments.
|
||||||
|
|
||||||
|
[The Implementation of Lua 5.0] by Roberto Ierusalimschy, Luiz Henrique de Figueiredo, and Waldemar
|
||||||
|
Celes was a great resource for understanding how a compiler and VM tie together. Dust's compiler's
|
||||||
|
optimization functions were inspired by Lua optimizations covered in this paper.
|
||||||
|
|
||||||
|
[Crafting Interpreters]: https://craftinginterpreters.com/
|
||||||
|
[The Implementation of Lua 5.0]: https://www.lua.org/doc/jucs05.pdf
|
||||||
|
[A No-Frills Introduction to Lua 5.1 VM Instructions]: https://www.mcours.net/cours/pdf/hasclic3/hasssclic818.pdf
|
||||||
|
@ -1,6 +1,6 @@
|
|||||||
//! An operation and its arguments for the Dust virtual machine.
|
//! An operation and its arguments for the Dust virtual machine.
|
||||||
//!
|
//!
|
||||||
//! Each instruction is a 64-bit unsigned integer that is divided into five fields:
|
//! Each instruction is a 64-bit unsigned integer that is divided into nine fields:
|
||||||
//! - Bits 0-8: The operation code.
|
//! - Bits 0-8: The operation code.
|
||||||
//! - Bit 9: Boolean flag indicating whether the B argument is a constant.
|
//! - Bit 9: Boolean flag indicating whether the B argument is a constant.
|
||||||
//! - Bit 10: Boolean flag indicating whether the C argument is a constant.
|
//! - Bit 10: Boolean flag indicating whether the C argument is a constant.
|
||||||
|