dust/README.md
2024-11-29 19:47:34 -05:00

Dust

Dust is a high-level interpreted programming language with static types that focuses on ease of use, performance and correctness.

Feature Progress

  • Lexer
  • Compiler
  • VM
  • Formatter
  • CLI
    • Run source
    • Compile to chunk and show disassembly
    • Tokenize using the lexer and show token list
    • Format using the formatter and display the output
    • Compile to and run from intermediate formats
      • JSON
      • Postcard
  • Values
    • No null or undefined
    • Booleans
    • Bytes
    • Characters
    • Enums
    • Integers
    • Floats
    • Functions
    • Lists
    • Maps
    • Ranges
    • Strings
    • Structs
    • Tuples
    • Runtime-efficient abstract values for lists and maps
  • Types
    • Basic types for each kind of value
    • Generalized types: num, any
    • struct types
    • enum types
    • Type arguments
    • Type Checking
      • Function returns
      • If/Else branches
      • Instruction arguments
  • Variables
    • Immutable by default
    • Block scope
    • Statically typed
  • Functions
    • First-class value
    • Statically typed arguments and returns
    • Pure (functions do not "inherit" local variables from the enclosing scope; they can access only their arguments)
    • Type arguments

Implementation

Dust is implemented in Rust and is divided into several parts, primarily the lexer, compiler, and virtual machine. All of Dust's components are designed with performance in mind and the codebase uses as few dependencies as possible.

Lexer

The lexer emits tokens from the source code. Dust makes extensive use of Rust's zero-copy capabilities to avoid unnecessary allocations when creating tokens. Depending on its type, a token may contain a reference to a slice of the source code. That data is copied only when an error occurs, because errors that own their data are easier to work with throughout the codebase. In a successfully executed program, no part of the source code is copied unless it is a string literal or identifier.
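
The borrowing scheme described above can be illustrated with a minimal sketch. This is not Dust's actual lexer: the `Token` and `Lexer` types, their variants, and the error format are all hypothetical and exist only to show tokens that borrow from the source while errors own their data.

```rust
/// Hypothetical token type: variants that carry text borrow it from the source.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Token<'src> {
    Identifier(&'src str),
    Integer(&'src str),
    Plus,
    Eof,
}

/// Hypothetical lexer that walks the source by byte offset.
pub struct Lexer<'src> {
    source: &'src str,
    position: usize,
}

impl<'src> Lexer<'src> {
    pub fn new(source: &'src str) -> Self {
        Lexer { source, position: 0 }
    }

    /// Returns the next token, or an error that owns a copy of the offending text.
    pub fn next_token(&mut self) -> Result<Token<'src>, String> {
        let source: &'src str = self.source;
        let bytes = source.as_bytes();

        // Skip ASCII whitespace between tokens.
        while self.position < bytes.len() && bytes[self.position].is_ascii_whitespace() {
            self.position += 1;
        }
        let start = self.position;
        let Some(&first) = bytes.get(start) else {
            return Ok(Token::Eof);
        };

        if first == b'+' {
            self.position += 1;
            return Ok(Token::Plus);
        }
        if first.is_ascii_digit() {
            while self.position < bytes.len() && bytes[self.position].is_ascii_digit() {
                self.position += 1;
            }
            // Zero-copy: the token is a slice of the original source.
            return Ok(Token::Integer(&source[start..self.position]));
        }
        if first.is_ascii_alphabetic() || first == b'_' {
            while self.position < bytes.len()
                && (bytes[self.position].is_ascii_alphanumeric() || bytes[self.position] == b'_')
            {
                self.position += 1;
            }
            return Ok(Token::Identifier(&source[start..self.position]));
        }

        // Error path: the only place in this sketch where source text is copied.
        let unexpected = source[start..].chars().next().unwrap();
        self.position += unexpected.len_utf8();
        Err(format!("unexpected character '{unexpected}' at byte {start}"))
    }
}
```

Because every `Token<'src>` is `Copy` and holds at most a string slice, producing a token allocates nothing. Only the `Err` branch builds an owned `String`.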

Compiler

The compiler creates a chunk, which contains all of the data needed by the virtual machine to run a Dust program. It does so by emitting bytecode instructions, constants and locals while parsing the tokens, which are generated one at a time by the lexer.
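
As a rough illustration of what a chunk might hold, the sketch below groups instructions, constants, and locals in one structure. The README does not specify the chunk's layout, so every name here (`Chunk`, `Constant`, `Instruction`, `Local`, `compile_let_sum`) is an assumption made for the example.

```rust
/// Hypothetical constant values stored in a chunk's constant pool.
#[derive(Debug, Clone, PartialEq)]
pub enum Constant {
    Integer(i64),
    String(String),
}

/// Hypothetical register-based instructions.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Instruction {
    LoadConstant { destination: u8, constant_index: u8 },
    Add { destination: u8, left: u8, right: u8 },
    Return { register: u8 },
}

/// Hypothetical record of a local variable and the register that holds it.
#[derive(Debug, Clone, PartialEq)]
pub struct Local {
    pub name: String,
    pub register: u8,
    pub is_mutable: bool,
}

/// Everything the virtual machine would need to run the program (assumed layout).
#[derive(Debug, Default)]
pub struct Chunk {
    pub instructions: Vec<Instruction>,
    pub constants: Vec<Constant>,
    pub locals: Vec<Local>,
}

impl Chunk {
    /// Adds a constant, reusing an existing slot when the value is already present.
    pub fn add_constant(&mut self, constant: Constant) -> u8 {
        if let Some(index) = self.constants.iter().position(|existing| *existing == constant) {
            return index as u8;
        }
        self.constants.push(constant);
        (self.constants.len() - 1) as u8
    }
}

/// Hand-written stand-in for compiling `let <name> = <left> + <right>`.
pub fn compile_let_sum(name: &str, left: i64, right: i64) -> Chunk {
    let mut chunk = Chunk::default();
    let left_index = chunk.add_constant(Constant::Integer(left));
    let right_index = chunk.add_constant(Constant::Integer(right));

    chunk.instructions.push(Instruction::LoadConstant { destination: 0, constant_index: left_index });
    chunk.instructions.push(Instruction::LoadConstant { destination: 1, constant_index: right_index });
    chunk.instructions.push(Instruction::Add { destination: 2, left: 0, right: 1 });
    chunk.locals.push(Local { name: name.to_string(), register: 2, is_mutable: false });
    chunk.instructions.push(Instruction::Return { register: 2 });
    chunk
}
```

In this sketch the local is immutable, matching the "Immutable by default" entry in the feature list.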

Parsing

Dust's compiler uses a custom Pratt parser, a kind of recursive descent parser, to translate a sequence of tokens into a chunk.
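
For readers unfamiliar with the technique, here is a small Pratt parser that emits register-based instructions directly while parsing, without building a syntax tree. It is a generic sketch, not Dust's parser: the token set, the `Instruction` variants, and the binding powers are assumptions chosen for the example.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Token {
    Integer(i64),
    Plus,
    Minus,
    Star,
    Slash,
    LeftParen,
    RightParen,
    Eof,
}

/// Hypothetical instructions; each one writes its result to `destination`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Instruction {
    LoadInteger { destination: u8, value: i64 },
    Binary { operator: Token, destination: u8, left: u8, right: u8 },
}

pub struct Parser<'a> {
    tokens: &'a [Token],
    position: usize,
    next_register: u8,
    pub instructions: Vec<Instruction>,
}

/// Left and right binding powers; a higher right power makes the operator left-associative.
fn infix_binding_power(token: Token) -> Option<(u8, u8)> {
    match token {
        Token::Plus | Token::Minus => Some((1, 2)),
        Token::Star | Token::Slash => Some((3, 4)),
        _ => None,
    }
}

impl<'a> Parser<'a> {
    pub fn new(tokens: &'a [Token]) -> Self {
        Parser { tokens, position: 0, next_register: 0, instructions: Vec::new() }
    }

    fn peek(&self) -> Token {
        self.tokens.get(self.position).copied().unwrap_or(Token::Eof)
    }

    fn advance(&mut self) -> Token {
        let token = self.peek();
        self.position += 1;
        token
    }

    fn allocate_register(&mut self) -> u8 {
        let register = self.next_register;
        self.next_register += 1;
        register
    }

    /// Parses one expression and returns the register that holds its result.
    pub fn parse_expression(&mut self, minimum_power: u8) -> Result<u8, String> {
        // Prefix position: a literal or a parenthesized expression.
        let mut left = match self.advance() {
            Token::Integer(value) => {
                let destination = self.allocate_register();
                self.instructions.push(Instruction::LoadInteger { destination, value });
                destination
            }
            Token::LeftParen => {
                let inner = self.parse_expression(0)?;
                if self.advance() != Token::RightParen {
                    return Err("expected ')'".to_string());
                }
                inner
            }
            other => return Err(format!("unexpected token {other:?}")),
        };

        // Infix position: keep consuming operators that bind at least as tightly as required.
        while let Some((left_power, right_power)) = infix_binding_power(self.peek()) {
            if left_power < minimum_power {
                break;
            }
            let operator = self.advance();
            let right = self.parse_expression(right_power)?;
            let destination = self.allocate_register();
            self.instructions.push(Instruction::Binary { operator, destination, left, right });
            left = destination;
        }
        Ok(left)
    }
}
```

Calling `parse_expression(0)` on the tokens for `1 + 2 * 3` emits the multiplication before the addition, because `*` has a higher binding power than `+`. Precedence and associativity are both encoded in the one table in `infix_binding_power`.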

Optimizing

When generating instructions for a register-based virtual machine, there are opportunities to optimize the generated code, usually by consolidating register use or reusing registers within an expression. While it is best to output optimal code in the first place, it is not always possible. Dust's compiler has a simple peephole optimizer that can be used to modify isolated sections of the instruction list through a mutable reference.
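
A peephole pass of this kind can be sketched as a function that takes the instruction list by mutable reference and rewrites a small window in place. The instruction set and the specific rewrite below (folding a load into a temporary register followed by a move) are assumptions for illustration, not necessarily optimizations Dust performs.

```rust
/// Hypothetical register-based instructions.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Instruction {
    LoadInteger { destination: u8, value: i64 },
    Move { destination: u8, source: u8 },
    Add { destination: u8, left: u8, right: u8 },
}

/// Rewrites `LoadInteger tmp, value` followed by `Move dst, tmp` into `LoadInteger dst, value`.
///
/// Assumes the temporary register is not read again after the move; a real
/// optimizer would need to verify that before applying the rewrite.
/// Returns the number of rewrites performed.
pub fn fold_load_then_move(instructions: &mut Vec<Instruction>) -> usize {
    let mut rewrites = 0;
    let mut index = 0;

    while index + 1 < instructions.len() {
        // Look at a two-instruction window starting at `index`.
        if let (
            Instruction::LoadInteger { destination: temporary, value },
            Instruction::Move { destination, source },
        ) = (instructions[index], instructions[index + 1])
        {
            if source == temporary {
                // Load straight into the final register and drop the move.
                instructions[index] = Instruction::LoadInteger { destination, value };
                instructions.remove(index + 1);
                rewrites += 1;
                // Re-examine the same position in case another move follows.
                continue;
            }
        }
        index += 1;
    }
    rewrites
}
```

Because the pass only inspects adjacent instructions, it can run over a short, isolated section of the list without any knowledge of the rest of the program.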

Instructions

Virtual Machine

Previous Implementations

Inspiration