# Dust

Dust is a general-purpose programming language that emphasises concurrency and correctness.

A basic Dust program:

```dust
(output "Hello world!")
```

Dust can do two (or more) things at the same time with effortless concurrency:

```dust
async {
    (output 'will this one finish first?')
    (output 'or will this one?')
}
```
You can make *any* block, i.e. `{}`, run its statements in parallel by changing it to `async {}`.

```dust
if (random_boolean) {
    (output "Do something...")
} else async {
    (output "Do something else instead...")
    (output "And another thing at the same time...")
}
```

Dust is an interpreted, strictly typed language with first-class functions. It emphasises concurrency by allowing any group of statements to be executed in parallel. Dust includes built-in tooling to import and export data in a variety of formats, including JSON, TOML, YAML and CSV.
<!--toc:start-->
- [Dust](#dust)
  - [Features](#features)
  - [Usage](#usage)
  - [Installation](#installation)
  - [Benchmarks](#benchmarks)
  - [Implementation](#implementation)
  - [The Dust Programming Language](#the-dust-programming-language)
    - [Declaring Variables](#declaring-variables)
    - [Lists](#lists)
    - [Maps](#maps)
    - [Loops](#loops)
    - [Functions](#functions)
    - [Option](#option)
    - [Concurrency](#concurrency)
    - [Acknowledgements](#acknowledgements)
<!--toc:end-->
## Features

- Simplicity: Dust is designed to be easy to learn.
- Speed: Dust is built on [Tree Sitter] and [Rust] to prioritize performance and correctness. See [Benchmarks](#benchmarks) below.
- Concurrency: Safe, effortless parallel code using thread pools.
- Safety: Written in safe, stable Rust.
- Correctness: Type checking makes it easy to write good code.
## Usage

Dust is an experimental project under active development. At this stage, features come and go and the API is always changing. It should not be considered for serious use yet.

```sh
cargo install dust-lang
dust -c "(output 'Hello world!')"
```
## Installation

You must have the default Rust toolchain installed and up to date. Install [rustup] if it is not already installed. Run `cargo install dust-lang`, then run `dust` to start the interactive shell. Use `dust --help` to see the full command line options.

To build from source, clone the repository and build the parser. To do so, enter the `tree-sitter-dust` directory and run `tree-sitter generate`. In the project root, run `cargo run` to start the shell. To see other command line options, use `cargo run -- --help`.
## Benchmarks

Dust is at a very early development stage but performs strongly in preliminary benchmarks. The examples given were tested using [Hyperfine] on a single-core cloud instance with 1024 MB RAM. Each test was run 1000 times. The test script is shown below. Each test asks the program to read a JSON file and count the objects. Dust is a command line shell, programming language and data manipulation tool, so three appropriate targets were chosen for comparison: Nushell, NodeJS and jq. The programs produced identical output, with the exception that NodeJS printed in color.

For the first test, a file with four entries was used.

| Command | Mean [ms] | Min [ms] | Max [ms] |
|:---|---:|---:|---:|
| Dust | 3.1 ± 0.5 | 2.4 | 8.4 |
| jq | 33.7 ± 2.2 | 30.0 | 61.8 |
| NodeJS | 226.4 ± 13.1 | 197.6 | 346.2 |
| Nushell | 51.6 ± 3.7 | 45.4 | 104.3 |

The second set of data is from the GitHub API. It consists of 100 commits from the jq GitHub repo.

| Command | Mean [ms] | Min [ms] | Max [ms] |
|:---|---:|---:|---:|
| Dust | 6.8 ± 0.6 | 5.7 | 12.0 |
| jq | 43.3 ± 3.6 | 37.6 | 81.6 |
| NodeJS | 224.9 ± 12.3 | 194.8 | 298.5 |
| Nushell | 59.2 ± 5.7 | 49.7 | 125.0 |

The third set of data came from CERN. It is a massive file of 100,000 entries.

| Command | Mean [ms] | Min [ms] | Max [ms] |
|:---|---:|---:|---:|
| Dust | 1080.8 ± 38.7 | 975.3 | 1326.6 |
| jq | 1305.3 ± 64.3 | 1159.7 | 1925.1 |
| NodeJS | 1850.5 ± 72.5 | 1641.9 | 2395.1 |
| Nushell | 1850.5 ± 86.2 | 1625.5 | 2400.7 |

The tests were run after 5 warmup runs and the cache was cleared before each run.
```sh
hyperfine \
--shell none \
--warmup 5 \
--prepare "rm -rf /root/.cache" \
--runs 1000 \
--parameter-list data_path seaCreatures.json,jq_data.json,dielectron.json \
--export-markdown test_output.md \
"dust -c '(length (from_json input))' -p {data_path}" \
"jq 'length' {data_path}" \
"node --eval \"require('node:fs').readFile('{data_path}',(err,data)=>{console.log(JSON.parse(data).length)})\"" \
"nu -c 'open {data_path} | length'"
```
## Implementation

Dust is formally defined as a Tree Sitter grammar in the tree-sitter-dust directory. Tree Sitter generates a parser, written in C, from a set of rules defined in JavaScript. Dust itself is a Rust binary that calls the C parser using FFI.

Tests are written in three places: in the Rust library, in Dust as examples and in the Tree Sitter test format. Generally, features are added by implementing and testing the syntax in the tree-sitter-dust repository, then writing library tests to evaluate the new syntax. Implementation tests run the Dust files in the "examples" directory and should be used to demonstrate and verify that features work together.

Tree Sitter generates a concrete syntax tree (CST), which Dust traverses to create an abstract syntax tree (AST) that can run the Dust code. Generating the CST is an extra step, but it makes the parser easy to test, keeps the language definition in one file and makes the syntax easy to modify and expand. Because it uses Tree Sitter, developer-friendly features like syntax highlighting and code navigation are already available in any text editor that supports Tree Sitter.
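
The CST-to-AST step described above can be sketched in a few lines of Rust. This is only an illustration under stated assumptions: the `Cst` and `Ast` types, the `"math"` node kind and the `build_ast`, `run` and `example_cst` functions are hypothetical names made up for this example. They are not Dust's real types and they do not use the Tree Sitter API.

```rust
// Hypothetical, simplified stand-ins for illustration only. These are not
// the types used by Dust or by the tree-sitter crate.

/// A concrete syntax tree node: it keeps every token, including operators.
#[derive(Debug)]
enum Cst {
    Token(String),
    Node { kind: String, children: Vec<Cst> },
}

/// An abstract syntax tree node: only what is needed to run the code.
#[derive(Debug, PartialEq)]
enum Ast {
    Integer(i64),
    Add(Box<Ast>, Box<Ast>),
}

/// Walk the CST and build an AST, reporting anything that is not understood.
fn build_ast(cst: &Cst) -> Result<Ast, String> {
    match cst {
        Cst::Token(text) => text
            .parse::<i64>()
            .map(Ast::Integer)
            .map_err(|error| format!("expected an integer, got {text:?}: {error}")),
        Cst::Node { kind, children } if kind.as_str() == "math" => {
            // Children are expected to be: left operand, "+", right operand.
            match children.as_slice() {
                [left, Cst::Token(operator), right] if operator.as_str() == "+" => {
                    Ok(Ast::Add(
                        Box::new(build_ast(left)?),
                        Box::new(build_ast(right)?),
                    ))
                }
                _ => Err("malformed math node".to_string()),
            }
        }
        Cst::Node { kind, .. } => Err(format!("unsupported node kind: {kind}")),
    }
}

/// Evaluate the AST directly.
fn run(ast: &Ast) -> i64 {
    match ast {
        Ast::Integer(value) => *value,
        Ast::Add(left, right) => run(left) + run(right),
    }
}

/// Build the CST for `41 + 1` by hand, standing in for the parser's output.
fn example_cst() -> Cst {
    Cst::Node {
        kind: "math".to_string(),
        children: vec![
            Cst::Token("41".to_string()),
            Cst::Token("+".to_string()),
            Cst::Token("1".to_string()),
        ],
    }
}

fn main() {
    let ast = build_ast(&example_cst()).expect("the hand-built CST is valid");
    println!("{}", run(&ast)); // prints 42
}
```

Keeping the two trees separate means parser tests only need to look at the CST, while the interpreter only ever sees the smaller AST.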
## The Dust Programming Language

Dust is easy to learn. Aside from this guide, the best way to learn Dust is to read the examples and tests to get a better idea of what it can do.
### Declaring Variables

Variables have two parts: a key and a value. The key is always a string. The value can be any of the following data types:

- string
- integer
- float
- boolean
- list
- map
- option
- function

Here are some examples of variables in Dust.

```dust
string = "foobar"
integer = 42
float = 42.42
list = [1 2 string integer float] # Commas are optional when writing lists.
map = {
    key = 'value'
}
```
Note that strings can be wrapped with any kind of quote: single, double or backticks. Numbers are integers by default. Floats are declared by adding a decimal point. If you divide integers or do any kind of math with a float, you will create a float value.

Dust enforces strict type checking, but you don't usually need to write the type because Dust can figure it out on its own. The **number** and **any** types are special types that allow you to relax the type bounds.
```dust
string <string> = "foobar"
integer <int> = 42
float <float> = 42.42
numbers <[number]> = [integer float]
stuff <[any]> = [string integer float]
```
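
To make the idea of relaxed type bounds concrete, here is a minimal sketch in Rust, the language Dust is implemented in. It models the rule that **number** accepts integers and floats and **any** accepts everything. The `Type` enum and the `accepts` and `list_accepts` functions are hypothetical names invented for this example. They are not taken from the Dust source code, and the real type checker may work differently.

```rust
// Hypothetical model of the type rules described above. The names are
// illustrative and are not taken from the Dust source code.

#[derive(Debug, Clone, PartialEq)]
enum Type {
    Any,
    Number,
    Integer,
    Float,
    Str,
}

/// Return true if a value of type `actual` may be used where `expected`
/// is required.
fn accepts(expected: &Type, actual: &Type) -> bool {
    match (expected, actual) {
        // `any` relaxes the bound completely.
        (Type::Any, _) => true,
        // `number` accepts both numeric types.
        (Type::Number, Type::Integer | Type::Float | Type::Number) => true,
        // Everything else must match exactly.
        _ => expected == actual,
    }
}

/// Check every item of a list against the declared item type.
fn list_accepts(item_type: &Type, items: &[Type]) -> bool {
    items.iter().all(|item| accepts(item_type, item))
}

fn main() {
    let numbers = [Type::Integer, Type::Float];
    let stuff = [Type::Str, Type::Integer, Type::Float];

    println!("{}", list_accepts(&Type::Number, &numbers)); // true
    println!("{}", list_accepts(&Type::Any, &stuff)); // true
    println!("{}", list_accepts(&Type::Number, &stuff)); // false
}
```

In this model, the `numbers` and `stuff` lists from the Dust example pass the check, while a list containing a string would be rejected by a `number` bound.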
### Lists

Lists are sequential collections. They can be built by grouping values with square brackets. Commas are optional. Values can be indexed by their position using a colon `:` followed by an integer. Dust lists are zero-indexed.

```dust
list = [true 41 "Ok"]

(assert_equal list:0 true)

the_answer = list:1 + 1

(assert_equal the_answer, 42) # You can also use commas when passing values to
                              # a function.
```
### Maps

Maps are flexible collections with arbitrary key-value pairs, similar to JSON objects. A map is created with a pair of curly braces and its entries are variables declared inside those braces. Map contents can be accessed using a colon `:`.

```dust
reminder = {
    message = "Buy milk"
    tags = ["groceries", "home"]
}

(output reminder:message)
```
### Loops

A **while** loop continues until a predicate is false.

```dust
i = 0
while i < 10 {
    (output i)
    i += 1
}
```

A **for** loop operates on a list without mutating it or the items inside. It does not return a value.

```dust
list = [ 1, 2, 3 ]

for number in list {
    (output number + 1)
}
```
### Functions

Functions are first-class values in Dust, so they are assigned to variables like any other value.

```dust
# This simple function has no arguments and no return value.
say_hi = (fn) {
    (output "hi")
}

# This function has one argument and will return a value.
add_one = (fn number <num>) <num> {
    number + 1
}

(say_hi)
(assert_equal (add_one 3), 4)
```

You don't need commas when listing arguments and you don't need to add whitespace inside the function body, but doing so may make your code easier to read.
### Option

The **option** type represents a value that may not be present. It has two variants: **some** and **none**. Dust includes built-in functions to work with option values: `is_none`, `is_some` and `either_or`.

```dust
say_something = (fn message <option(str)>) <str> {
    (either_or message, "hiya")
}

(say_something some("goodbye"))
# goodbye

(say_something none)
# hiya
```
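
For readers coming from Rust, the example above suggests that `either_or` behaves much like `Option::unwrap_or`. The sketch below mirrors the Dust example in Rust. It is an analogy for illustration only and does not describe how Dust implements the option type.

```rust
/// Rough Rust counterpart of the Dust function above: return the message if
/// one is present, otherwise fall back to "hiya".
fn say_something(message: Option<&str>) -> &str {
    message.unwrap_or("hiya")
}

fn main() {
    let present = Some("goodbye");
    let absent: Option<&str> = None;

    // `is_some` and `is_none` report which variant a value holds.
    println!("{} {}", present.is_some(), absent.is_none()); // true true

    println!("{}", say_something(present)); // goodbye
    println!("{}", say_something(absent)); // hiya
}
```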
### Concurrency

Dust features effortless concurrency anywhere in your code. Any block of code can be made to run its contents asynchronously. Dust's concurrency is written in safe Rust and uses a thread pool whose size depends on the number of cores available.

```dust
# An async block will run each statement in its own thread.
async {
    (output (random_integer))
    (output (random_float))
    (output (random_boolean))
}
```

```dust
data = async {
    (output "Reading a file...")
    (read "examples/assets/faithful.csv")
}
```
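
As a rough picture of what "a thread pool whose size depends on the number of cores" can look like, here is a small sketch that uses only the Rust standard library. It is not Dust's implementation, which may rely on a different mechanism or library. The `Statement` type and the `run_async` function are hypothetical names for this example, and each "statement" is modelled as a closure that returns a `String`.

```rust
use std::collections::VecDeque;
use std::sync::Mutex;
use std::thread;

/// One "statement" of an async block, modelled as a boxed closure.
type Statement = Box<dyn FnOnce() -> String + Send>;

/// Run all statements on a small pool of worker threads and return their
/// results in statement order, even if they finish out of order.
fn run_async(statements: Vec<Statement>) -> Vec<String> {
    let count = statements.len();

    // Use one worker per available core, but never more workers than jobs.
    let workers = thread::available_parallelism()
        .map(|cores| cores.get())
        .unwrap_or(1)
        .min(count.max(1));

    let queue: Mutex<VecDeque<(usize, Statement)>> =
        Mutex::new(statements.into_iter().enumerate().collect());
    let results: Mutex<Vec<Option<String>>> = Mutex::new(vec![None; count]);

    // Scoped threads may borrow `queue` and `results`, and are all joined
    // before `thread::scope` returns.
    thread::scope(|scope| {
        for _ in 0..workers {
            scope.spawn(|| loop {
                // Take the next statement; the lock is released before it runs.
                let next = queue.lock().unwrap().pop_front();
                match next {
                    Some((index, statement)) => {
                        let value = statement();
                        results.lock().unwrap()[index] = Some(value);
                    }
                    None => break,
                }
            });
        }
    });

    results
        .into_inner()
        .unwrap()
        .into_iter()
        .map(|value| value.expect("every statement ran"))
        .collect()
}

fn main() {
    let statements: Vec<Statement> = vec![
        Box::new(|| "will this one finish first?".to_string()),
        Box::new(|| "or will this one?".to_string()),
    ];

    for line in run_async(statements) {
        println!("{line}");
    }
}
```

The statements may execute in any order, but because each result is stored at its statement's index, the returned list is always in source order.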
### Acknowledgements

Dust began as a fork of [evalexpr]. Some of the original code is still in place, but the project has changed dramatically and no longer uses any of its parsing or interpreting.

[Tree Sitter]: https://tree-sitter.github.io/tree-sitter/
[Rust]: https://rust-lang.org
[evalexpr]: https://github.com/ISibboI/evalexpr
[rustup]: https://rustup.rs
[Hyperfine]: https://github.com/sharkdp/hyperfine