
Add some more README content

Jeff 2024-11-29 22:43:13 -05:00
parent 9331ae1a40
commit 3b23136eda
2 changed files with 92 additions and 21 deletions

111
README.md

@@ -1,10 +1,15 @@
 # Dust
 Dust is a high-level interpreted programming language with static types that focuses on ease of use,
-performance and correctness.
+performance and correctness. The syntax, safety features and evaluation model are inspired by Rust.
+Due to being interpreted, Dust's total time to execution is much lower than Rust's. Unlike other
+interpreted languages, Dust is type-safe, with a simple yet powerful type system that enhances the
+clarity and correctness of a program.
 ## Feature Progress
+Dust is still in development. This list may change as the language evolves.
 - [X] Lexer
 - [X] Compiler
 - [X] VM
@@ -18,18 +23,13 @@ performance and correctness.
 - [ ] JSON
 - [ ] Postcard
 - Values
+- [X] Basic values: booleans, bytes, characters, integers, floats, UTF-8 strings
 - [X] No `null` or `undefined`
-- [X] Booleans
-- [X] Bytes
-- [X] Characters
 - [ ] Enums
-- [X] Integers
-- [X] Floats
 - [X] Functions
 - [X] Lists
 - [ ] Maps
-- [X] Ranges
+- [ ] Ranges
-- [X] Strings
 - [ ] Structs
 - [ ] Tuples
 - [ ] Runtime-efficient abstract values for lists and maps
@@ -38,25 +38,46 @@ performance and correctness.
 - [X] Generalized types: `num`, `any`
 - [ ] `struct` types
 - [ ] `enum` types
+- [ ] Type aliases
 - [ ] Type arguments
-- [ ] Type Checking
+- [ ] Compile-time type checking
 - [ ] Function returns
 - [X] If/Else branches
 - [ ] Instruction arguments
+- [ ] Runtime type checking for debug compilation modes
 - Variables
 - [X] Immutable by default
 - [X] Block scope
 - [X] Statically typed
+- [X] Copy-free identifiers are stored in the chunk as string constants
 - Functions
 - [X] First-class value
 - [X] Statically typed arguments and returns
-- [X] Pure (does not "inherit" local variables - only arguments)
+- [X] Pure (no "closure" of local variables, arguments are the only input)
 - [ ] Type arguments
+- Control Flow
+- [X] If/Else
+- [ ] Loops
+- [ ] `for`
+- [ ] `loop`
+- [X] `while`
+- [ ] Match
+- Instructions
+- [X] Arithmetic
+- [X] Boolean
+- [X] Call
+- [X] Constant
+- [X] Control flow
+- [X] Load
+- [X] Store
+- [X] Return
+- [X] Stack
+- [X] Unary
 ## Implementation
-Dust is implemented in Rust and is divided into several parts, primarily the lexer, compiler, and
-virtual machine. All of Dust's components are designed with performance in mind and the codebase
+Dust is implemented in Rust and is divided into several parts, most importantly the lexer, compiler,
+and virtual machine. All of Dust's components are designed with performance in mind and the codebase
 uses as few dependencies as possible.
 ### Lexer
@@ -74,27 +95,77 @@ The compiler creates a chunk, which contains all of the data needed by the virtu
 Dust program. It does so by emitting bytecode instructions, constants and locals while parsing the
 tokens, which are generated one at a time by the lexer.
+Types are checked during parsing and each emitted instruction is associated with a type.
 #### Parsing
 Dust's compiler uses a custom Pratt parser, a kind of recursive descent parser, to translate a
-sequence of tokens into a chunk.
+sequence of tokens into a chunk. Each token is given a precedence and may have a prefix and/or infix
+parser. The parsers are just functions that modify the compiler and its output. For example, when
+the compiler encounters a boolean token, its prefix parser is the `parse_boolean` function, which
+emits a `LoadBoolean` instruction. An integer token's prefix parser is `parse_integer`, which emits
+a `LoadConstant` instruction and adds the integer to the constant list. Tokens with infix parsers
+include the math operators, which emit `Add`, `Subtract`, `Multiply`, `Divide`, and `Modulo`
+instructions.
+Functions are compiled into their own chunks, which are stored in the constant list. A function's
+arguments are stored in the locals list. The VM must later bind the arguments to runtime values by
+assigning each argument a register and associating the register with the local.
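The prefix and infix parsers described in this section can be illustrated with a minimal sketch. It is not Dust's actual compiler: the `Token`, `Instruction` and `Compiler` types, the `infix_rule` helper and the precedence numbers are all assumptions made for illustration. Only the names `parse_boolean`, `parse_integer`, `LoadBoolean`, `LoadConstant` and the five math instructions come from the README text, and the sketch ignores registers and types entirely.

```rust
// Hypothetical, simplified types. Dust's real tokens and instructions carry
// more information (registers, types, positions) than this sketch does.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Token {
    Boolean(bool),
    Integer(i64),
    Plus,
    Minus,
    Star,
    Slash,
    Percent,
    Eof,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Instruction {
    LoadBoolean(bool),
    LoadConstant(usize), // index into the constant list
    Add,
    Subtract,
    Multiply,
    Divide,
    Modulo,
}

struct Compiler {
    tokens: Vec<Token>,
    position: usize,
    constants: Vec<i64>,
    instructions: Vec<Instruction>,
}

impl Compiler {
    fn new(tokens: Vec<Token>) -> Self {
        Compiler { tokens, position: 0, constants: Vec::new(), instructions: Vec::new() }
    }

    fn peek(&self) -> Token {
        self.tokens.get(self.position).copied().unwrap_or(Token::Eof)
    }

    fn advance(&mut self) -> Token {
        let token = self.peek();
        self.position += 1;
        token
    }

    // Prefix parser for boolean tokens.
    fn parse_boolean(&mut self, value: bool) {
        self.instructions.push(Instruction::LoadBoolean(value));
    }

    // Prefix parser for integer tokens: store the constant, then load it by index.
    fn parse_integer(&mut self, value: i64) {
        self.constants.push(value);
        let index = self.constants.len() - 1;
        self.instructions.push(Instruction::LoadConstant(index));
    }

    // Infix rule (assumed): an operator's precedence and the instruction it emits.
    fn infix_rule(token: Token) -> Option<(u8, Instruction)> {
        match token {
            Token::Plus => Some((1, Instruction::Add)),
            Token::Minus => Some((1, Instruction::Subtract)),
            Token::Star => Some((2, Instruction::Multiply)),
            Token::Slash => Some((2, Instruction::Divide)),
            Token::Percent => Some((2, Instruction::Modulo)),
            _ => None,
        }
    }

    // Core Pratt loop: run the prefix parser, then fold in infix operators
    // whose precedence is at least `min_precedence`.
    fn parse_expression(&mut self, min_precedence: u8) -> Result<(), String> {
        match self.advance() {
            Token::Boolean(value) => self.parse_boolean(value),
            Token::Integer(value) => self.parse_integer(value),
            other => return Err(format!("no prefix parser for {:?}", other)),
        }
        while let Some((precedence, instruction)) = Self::infix_rule(self.peek()) {
            if precedence < min_precedence {
                break;
            }
            self.advance();
            // Parse the right operand with a higher minimum (left-associative).
            self.parse_expression(precedence + 1)?;
            self.instructions.push(instruction);
        }
        Ok(())
    }
}

fn main() {
    // Tokens for the expression `1 + 2 * 3`.
    let tokens = vec![
        Token::Integer(1),
        Token::Plus,
        Token::Integer(2),
        Token::Star,
        Token::Integer(3),
    ];
    let mut compiler = Compiler::new(tokens);
    compiler.parse_expression(0).expect("expression should parse");
    println!("{:?}", compiler.instructions);
    println!("{:?}", compiler.constants);
}
```

Because `*` has a higher precedence than `+` in this sketch, `Multiply` is emitted before `Add`. A register-based compiler like the one the README describes would also have to assign operands to registers at this point, which is left out here.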
 #### Optimizing
 When generating instructions for a register-based virtual machine, there are opportunities to
-optimize the generated code, usually by consolidating register use or reusing registers within an
-expression. While it is best to output optimal code in the first place, it is not always possible.
-Dust's compiler has a simple peephole optimizer that can be used to modify isolated sections of the
-instruction list through a mutable reference.
+optimize the generated code by using fewer instructions or fewer registers. While it is best to
+output optimal code in the first place, it is not always possible. Dust's compiler uses simple
+functions that modify isolated sections of the instruction list through a mutable reference.
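As a rough illustration of a function that rewrites an isolated section of the instruction list through a mutable reference, here is a sketch. Everything in it is hypothetical: the `Instruction` variants, the `fold_load_then_move` name and the specific rewrite are not taken from Dust's source. It also assumes the temporary register is never read again after the `Move`, which a real optimizer would have to verify.

```rust
// Hypothetical register-based instructions, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Instruction {
    LoadConstant { destination: u16, constant: u16 },
    Move { destination: u16, from: u16 },
    Return { register: u16 },
}

// Looks only at the last two instructions. If a constant is loaded into a
// temporary register and immediately moved elsewhere, load it directly into
// the final register instead. Returns true if the list was changed.
fn fold_load_then_move(instructions: &mut Vec<Instruction>) -> bool {
    let length = instructions.len();
    if length < 2 {
        return false;
    }
    if let (
        Instruction::LoadConstant { destination: temporary, constant },
        Instruction::Move { destination, from },
    ) = (instructions[length - 2], instructions[length - 1])
    {
        if from == temporary {
            // Two instructions and one temporary register become one instruction.
            instructions.truncate(length - 2);
            instructions.push(Instruction::LoadConstant { destination, constant });
            return true;
        }
    }
    false
}

fn main() {
    let mut instructions = vec![
        Instruction::LoadConstant { destination: 1, constant: 0 },
        Instruction::Move { destination: 0, from: 1 },
    ];
    let changed = fold_load_then_move(&mut instructions);
    println!("changed: {}, instructions: {:?}", changed, instructions);

    // A list that does not match the pattern is left alone.
    let mut other = vec![Instruction::Return { register: 0 }];
    println!("changed: {}", fold_load_then_move(&mut other));
}
```

The rewrite saves both an instruction and a register, which matches the two goals named in the paragraph above.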
### Instructions ### Instructions
+Dust's virtual machine is register-based and uses 64-bit instructions, which encode nine pieces of
+information:
+Bit | Description
+----- | -----------
+0-8 | The operation code.
+9 | Boolean flag indicating whether the B argument is a constant.
+10 | Boolean flag indicating whether the C argument is a constant.
+11 | Boolean flag indicating whether the A argument is a local.
+12 | Boolean flag indicating whether the B argument is a local.
+13 | Boolean flag indicating whether the C argument is a local.
+17-32 | The A argument.
+33-48 | The B argument.
+49-63 | The C argument.
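The layout in the table can be sketched as plain bit packing on a `u64`. This is only an illustration under stated assumptions: the `Fields` struct and the `encode` and `decode` functions are hypothetical names, not Dust's API. The bit positions are taken from the table as written, which leaves bits 14-16 unassigned and gives the C argument 15 bits, so the sketch keeps bits 14-16 at zero and masks C accordingly.

```rust
// Masks derived from the bit ranges in the table.
const OPCODE_MASK: u64 = 0x1FF; // bits 0-8 (9 bits)
const A_MASK: u64 = 0xFFFF; // bits 17-32 (16 bits)
const B_MASK: u64 = 0xFFFF; // bits 33-48 (16 bits)
const C_MASK: u64 = 0x7FFF; // bits 49-63 (15 bits)

// Hypothetical unpacked form of the nine pieces of information.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Fields {
    opcode: u16,
    b_is_constant: bool,
    c_is_constant: bool,
    a_is_local: bool,
    b_is_local: bool,
    c_is_local: bool,
    a: u16,
    b: u16,
    c: u16,
}

// Pack the fields into one 64-bit instruction. Values wider than their
// field are truncated by the masks.
fn encode(fields: Fields) -> u64 {
    ((fields.opcode as u64) & OPCODE_MASK)
        | ((fields.b_is_constant as u64) << 9)
        | ((fields.c_is_constant as u64) << 10)
        | ((fields.a_is_local as u64) << 11)
        | ((fields.b_is_local as u64) << 12)
        | ((fields.c_is_local as u64) << 13)
        | (((fields.a as u64) & A_MASK) << 17)
        | (((fields.b as u64) & B_MASK) << 33)
        | (((fields.c as u64) & C_MASK) << 49)
}

// Unpack a 64-bit instruction back into its fields.
fn decode(bits: u64) -> Fields {
    let flag = |position: u32| ((bits >> position) & 1) == 1;
    Fields {
        opcode: (bits & OPCODE_MASK) as u16,
        b_is_constant: flag(9),
        c_is_constant: flag(10),
        a_is_local: flag(11),
        b_is_local: flag(12),
        c_is_local: flag(13),
        a: ((bits >> 17) & A_MASK) as u16,
        b: ((bits >> 33) & B_MASK) as u16,
        c: ((bits >> 49) & C_MASK) as u16,
    }
}

fn main() {
    let fields = Fields {
        opcode: 5,
        b_is_constant: true,
        c_is_constant: false,
        a_is_local: true,
        b_is_local: false,
        c_is_local: false,
        a: 1,
        b: 2,
        c: 3,
    };
    let bits = encode(fields);
    println!("{:#018x}", bits);
    // Decoding recovers the original fields.
    assert_eq!(decode(bits), fields);
}
```

Keeping the flags next to the operation code means a single mask and shift answers whether an argument refers to a constant or a local, without inspecting the argument itself.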
 ### Virtual Machine
 ## Previous Implementations
+Dust has gone through several iterations, each with its own unique features and design choices. It
+was originally implemented with a syntax tree generated by an external parser, then a parser
+generator, and finally a custom parser. Eventually the language was rewritten to use bytecode
+instructions and a virtual machine. The current implementation is by far the most performant and the
+general design is unlikely to change.
+Dust previously had a more complex type system with type arguments (or "generics") and a simple
+model for asynchronous execution of statements. Both of these features were removed to simplify the
+language when it was rewritten to use bytecode instructions. Both features are planned to be
+reintroduced in the future.
 ## Inspiration
-- [The Implementation of Lua 5.0](https://www.lua.org/doc/jucs05.pdf)
-- [A No-Frills Introduction to Lua 5.1 VM Instructions](https://www.mcours.net/cours/pdf/hasclic3/hasssclic818.pdf)
-- [Crafting Interpreters](https://craftinginterpreters.com/)
+[Crafting Interpreters] by Bob Nystrom was a major inspiration for rewriting Dust to use bytecode
+instructions. It was also a great resource for writing the compiler, especially the Pratt parser.
+[A No-Frills Introduction to Lua 5.1 VM Instructions] by Kein-Hong Man was a great resource for the
+design of Dust's instructions and operation codes. The Lua VM is simple and efficient, and Dust's VM
+attempts to be the same, though it is not as optimized for different platforms. Dust's instructions
+were originally 32-bit like Lua's, but were changed to 64-bit to allow for more complex information
+about the instruction's arguments.
+[The Implementation of Lua 5.0] by Roberto Ierusalimschy, Luiz Henrique de Figueiredo, and Waldemar
+Celes was a great resource for understanding how a compiler and VM tie together. Dust's compiler's
+optimization functions were inspired by Lua optimizations covered in this paper.
+[Crafting Interpreters]: https://craftinginterpreters.com/
+[The Implementation of Lua 5.0]: https://www.lua.org/doc/jucs05.pdf
+[A No-Frills Introduction to Lua 5.1 VM Instructions]: https://www.mcours.net/cours/pdf/hasclic3/hasssclic818.pdf


@@ -1,6 +1,6 @@
 //! An operation and its arguments for the Dust virtual machine.
 //!
-//! Each instruction is a 64-bit unsigned integer that is divided into five fields:
+//! Each instruction is a 64-bit unsigned integer that is divided into nine fields:
 //! - Bits 0-8: The operation code.
 //! - Bit 9: Boolean flag indicating whether the B argument is a constant.
 //! - Bit 10: Boolean flag indicating whether the C argument is a constant.