Async Rust needs Await and 'thread for `Send` `Future`

Years ago, I wrote a client library for Apache BookKeeper and initially chose Send, !Sync and Clone for the client object.

Clone and Send, so the client can be shared among concurrent tasks.
!Sync, so every client can have its own data without cross-thread synchronization.

This way, the client can batch simultaneous requests in a single asynchronous task and serve parallel requests in multiple concurrent asynchronous tasks. But it failed: a future that holds &self across .await is Send only if &Self is Send, which by definition is impossible if Self is !Sync.
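The failure can be reproduced with a minimal sketch. Client, require_send and block_on below are hypothetical names made up for illustration; they are not the BookKeeper client's actual API, and block_on is a toy busy-polling executor built only on std.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for the client: `Send + Clone` but `!Sync`,
/// because `Cell` allows unsynchronized mutation through `&self`.
#[derive(Clone, Default)]
pub struct Client {
    pending: Cell<u32>,
}

impl Client {
    /// Takes `&self`, so the returned future holds a `&Client` across the await.
    pub async fn request(&self) -> u32 {
        self.pending.set(self.pending.get() + 1);
        std::future::ready(()).await;
        self.pending.get()
    }
}

/// Compiles only if `T: Send`.
pub fn require_send<T: Send>(value: T) -> T {
    value
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor (an assumption for this sketch): polls in a loop until ready.
pub fn block_on<F: Future>(future: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut future = pin!(future);
    loop {
        if let Poll::Ready(output) = future.as_mut().poll(&mut cx) {
            return output;
        }
        std::thread::yield_now();
    }
}

pub fn demo() -> u32 {
    // The client itself is `Send`.
    let client = require_send(Client::default());
    // Rejected by the compiler, because `&Client` is not `Send`:
    // require_send(client.request());
    block_on(client.request())
}
```

Uncommenting the rejected line should produce an error saying that `Cell<u32>` cannot be shared between threads safely.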

I complained about it a lot, with quotes from What shall Sync mean across an .await?. Recently, while developing spawns, I found that many async runtimes have spawn_local to spawn !Send tasks. It is boring. I have said before that a future should be Send unless it captures !Send. Currently, some !Send tasks should actually be Send. This time, I want to go further into what makes a future !Send and how Rust could solve this.

Before continuing, I want to state two points.

Code before `.await` happens before code after `.await`.

This is the ground truth of our mental model; otherwise everything falls apart. The same holds for code in threads and processes.

A Rust future is a combination of state and Future::poll. .await is the call site of Future::poll, which advances the state. The statement above then becomes: Future::poll observes the data changes made by the previous run. In a single-threaded executor, Future::poll is invoked sequentially, so the statement holds. In a multi-threaded executor, the thread that acquires the future will observe the changes made by the thread that released it. Multi-threaded executors are considered buggy if they can’t guarantee this.
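As a rough illustration of the multi-thread case, the sketch below polls a Send future once on the calling thread, moves it to a second thread, and polls it to completion there. YieldOnce, NoopWake and migrate_between_polls are hypothetical helpers written for this example; a real executor would hand the future over through a synchronized queue, while here thread::spawn provides the synchronization.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread;

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Returns `Pending` on the first poll and `Ready` on the second.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// First poll on the calling thread, second poll on a new thread.
pub fn migrate_between_polls() -> Vec<u32> {
    let mut future: Pin<Box<dyn Future<Output = Vec<u32>> + Send>> = Box::pin(async {
        let mut log = vec![1]; // written during the first poll
        YieldOnce(false).await;
        log.push(2); // written during the second poll, on another thread
        log
    });

    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    assert!(future.as_mut().poll(&mut cx).is_pending());

    thread::spawn(move || {
        let waker = Waker::from(Arc::new(NoopWake));
        let mut cx = Context::from_waker(&waker);
        match future.as_mut().poll(&mut cx) {
            Poll::Ready(log) => log,
            Poll::Pending => unreachable!("YieldOnce is ready on its second poll"),
        }
    })
    .join()
    .unwrap()
}
```

The second thread sees the element pushed by the first poll, which is the "happens before" guarantee described above.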

Futures are self-contained concurrent execution units, just like threads to multi-cores.

From the above, we know that code in a future is executed sequentially: we fear no contention inside a single future, and we are able to run multiple futures concurrently. Additionally, a Future + Send + 'static is self-contained: it captures nothing !Send and borrows nothing non-'static from the outside. All of this matches what threads provide us: an execution unit that is sequential and self-contained in itself but concurrent with others. If we are able to use !Send after thread::sleep, we should be able to do the same after .await.

Let’s dive into how a future steps.
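The thread side of this analogy already works in today's Rust. In the small sketch below (rc_across_sleep is a name chosen for illustration), the closure passed to thread::spawn is Send + 'static even though it creates and uses an Rc across thread::sleep.

```rust
use std::cell::Cell;
use std::rc::Rc;
use std::thread;
use std::time::Duration;

/// The closure captures nothing `!Send`; the `Rc` is created inside the thread
/// and never leaves it, so using it after `thread::sleep` is fine.
pub fn rc_across_sleep() -> u32 {
    thread::spawn(|| {
        let rc = Rc::new(Cell::new(1));
        let alias = rc.clone();
        thread::sleep(Duration::from_millis(1));
        alias.set(alias.get() + 1);
        rc.get()
    })
    .join()
    .unwrap()
}
```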

Decompose future state machine and poll step by step

use std::cell::{Cell, RefCell};
use std::rc::Rc;
use std::sync::{Arc, Mutex};

thread_local! {
    static THREAD_LOCAL_RC1: RefCell<Option<Rc<Cell<u32>>>> = RefCell::new(None);
    static THREAD_LOCAL_RC2: Rc<Cell<usize>> = Rc::new(Cell::new(0));
}

struct Input {
    // rc0: Rc<Cell<u32>>,
    data: Arc<Mutex<u32>>,
}

struct Output {
    rc3: Rc<Cell<u32>>,
}

async fn task0(input: Input) -> Output {
    let rc1 = Rc::new(Cell::new(1));
    task1().await;
    THREAD_LOCAL_RC1.set(Some(rc1.clone()));
    let rc2 = THREAD_LOCAL_RC2.with(|rc| rc.clone());
    task2().await;
    drop(rc1);
    drop(rc2);
    // The guard is held across the next await.
    let lock_guard = input.data.lock().unwrap();
    task3().await;
    let output = Output {
        rc3: Rc::new(Cell::new(3)),
    };
    output
}

let input = Input { data: Arc::new(Mutex::new(0)) };
// ...
let future0 = task0(input);
spawn(future0);

Given the above code, let’s decompose the future created from task0(input). It should look similar to the following.

enum Task0State {
    Initial(Input),
    Step1(Task0StateStep1),
    Step2(Task0StateStep2),
    Step3(Task0StateStep3),
    Finished(()),
}

struct Task0StateStep1 {
    rc1: Rc<Cell<u32>>,
    task1_state: Task1State,
}

struct Task0StateStep2 {
    rc1: Rc<Cell<u32>>,
    rc2: Rc<Cell<usize>>,
    task2_state: Task2State,
}

struct Task0StateStep3 {
    lock_guard: MutexGuard<..>,
    task3_state: Task3State,
}

struct Task0Output(Output);

With everything set up, let’s inspect the future state machine step by step.

Initially, future0 is constructed as Task0State::Initial(Input) with the input data. If Input is !Send, then future0 must clearly be !Send; otherwise multiple execution units could own clones of the same !Send value. So, a Future should be !Send if it captures !Send.

Let’s assume Input is Send and continue.

Let’s call Future::poll(future0) to step to Task0State::Step1(..). In this step, it creates a !Send Rc for later use. By definition, we are unable to send rc1 to another thread in a safe way, which means the future owns rc1. So there will be no clones of rc1 in other execution units (ignoring TLS for now), neither tasks nor threads. That is, a Future could be Send if it owns !Send.
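For contrast, here is a sketch of what today's compiler accepts and rejects. Both functions own their Rc exclusively, yet only the one that drops it before the await yields a Send future. The function names, require_send and the toy block_on executor are assumptions made for this example.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Compiles only if `T: Send`.
pub fn require_send<T: Send>(value: T) -> T {
    value
}

/// Accepted as `Send` today: `rc1` is dropped before the await,
/// so it never becomes part of the saved state.
pub async fn rc_dropped_before_await() -> u32 {
    let value = {
        let rc1 = Rc::new(Cell::new(1));
        rc1.set(rc1.get() + 1);
        rc1.get()
    };
    std::future::ready(()).await;
    value
}

/// Not `Send` today: `rc1` is owned solely by the future but lives across the await.
pub async fn rc_held_across_await() -> u32 {
    let rc1 = Rc::new(Cell::new(1));
    std::future::ready(()).await;
    rc1.set(rc1.get() + 1);
    rc1.get()
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor (an assumption for this sketch): polls in a loop until ready.
pub fn block_on<F: Future>(future: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut future = pin!(future);
    loop {
        if let Poll::Ready(output) = future.as_mut().poll(&mut cx) {
            return output;
        }
        std::thread::yield_now();
    }
}

pub fn demo() -> (u32, u32) {
    let sendable = require_send(rc_dropped_before_await());
    // Rejected by the compiler, although no other execution unit can reach `rc1`:
    // require_send(rc_held_across_await());
    (block_on(sendable), block_on(rc_held_across_await()))
}
```

Both futures compute the same result; only their Send-ness differs, which is the gap this step of the walkthrough argues could be closed.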

Let’s poll in thread1 to step to Task0State::Step2. This time, future0 stores a clone of rc1 into thread1’s TLS and loads a clone of rc2 from thread1’s TLS. If future0 migrated to thread2 for the next poll, then both thread1 and thread2 would hold clones of rc1 and rc2, which is clearly wrong. But currently, Rust has no way to prevent futures from loading/storing !Send from/to TLS. It simply propagates !Send from Task0StateStep2 to forbid migration of future0.

Let’s ignore the above issue and continue polling to step from Task0State::Step2 to Task0State::Step3. Currently, Rust makes future0 !Send because it captures a MutexGuard, which is !Send. But if another future that needs the same mutex is polled on the same thread, the result is a deadlock.

Now, let’s poll again to step to Task0State::Finished. This time, future0 completes with a !Send value that it owns. Transferring the sole copy of a !Send value from one execution unit to another poses no problem.

Let’s summarize.
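The TLS hazard can be made visible without any executor. In this sketch (SLOT and both function names are hypothetical), storing a clone in a thread local leaves a second owner behind on the current thread, and a different thread sees its own, empty slot.

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;
use std::thread;

thread_local! {
    static SLOT: RefCell<Option<Rc<Cell<u32>>>> = RefCell::new(None);
}

/// Stores a clone of a fresh `Rc` in this thread's TLS and returns the
/// strong count seen by the original handle.
pub fn store_clone_in_tls() -> usize {
    let rc1 = Rc::new(Cell::new(1));
    SLOT.with(|slot| *slot.borrow_mut() = Some(rc1.clone()));
    // The clone stays behind in this thread's TLS after `rc1` goes away.
    Rc::strong_count(&rc1)
}

/// A different thread has its own `SLOT` and never sees that clone.
pub fn fresh_thread_sees_empty_slot() -> bool {
    thread::spawn(|| SLOT.with(|slot| slot.borrow().is_none()))
        .join()
        .unwrap()
}
```

If the owner of rc1 could move to another thread after the store, the two handles would update the non-atomic reference count from different threads, which is exactly what !Send exists to prevent.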
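The deadlock risk can be observed safely with try_lock. In the sketch below, a future takes a std Mutex guard and suspends while holding it; anything else polled on the same thread that called lock() on that mutex would block forever, which try_lock reports as an error instead. YieldOnce, NoopWake and guard_held_across_await are hypothetical helpers for this example.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Returns `Pending` on the first poll and `Ready` on the second.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Returns (lock unavailable while suspended, task output, lock free afterwards).
pub fn guard_held_across_await() -> (bool, u32, bool) {
    let data = Arc::new(Mutex::new(0u32));
    let task_data = data.clone();
    let mut first = Box::pin(async move {
        let mut guard = task_data.lock().unwrap();
        *guard += 1;
        YieldOnce(false).await; // the guard is still held while suspended
        let value = *guard;
        value
    });

    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    assert!(first.as_mut().poll(&mut cx).is_pending());

    // A second task on this thread calling `lock()` here would never return.
    let blocked_while_suspended = data.try_lock().is_err();

    let value = match first.as_mut().poll(&mut cx) {
        Poll::Ready(value) => value,
        Poll::Pending => unreachable!("YieldOnce is ready on its second poll"),
    };
    let free_after_completion = data.try_lock().is_ok();
    (blocked_while_suspended, value, free_after_completion)
}
```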

1. A Future should be !Send if it captures !Send.
2. A Future could be Send if it owns !Send.
3. A Future could be Send if it outputs an owned !Send.
4. Currently, Rust has no way to prevent futures from loading/storing !Send from/to TLS.
5. Currently, a !Send Future is vulnerable to deadlock in a single-threaded executor.

All done, but what if all of the above state machine code were handwritten? I think it is perfectly fine for a handwritten future to be !Send if it contains any !Send. This way, it is really easy to detect the cause.
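A handwritten state machine might look like the sketch below. HandwrittenTask is a hypothetical, much smaller cousin of the Task0State enum above, and block_on is a toy executor assumed for this example. Because the Rc is a visible field of one variant, it is obvious why the type is !Send.

```rust
use std::cell::Cell;
use std::future::Future;
use std::mem;
use std::pin::{pin, Pin};
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Handwritten future: the `Rc` field makes the whole type `!Send`.
pub enum HandwrittenTask {
    Initial,
    Suspended { rc1: Rc<Cell<u32>> },
    Finished,
}

impl Future for HandwrittenTask {
    type Output = u32;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        // `get_mut` is allowed because every field is `Unpin`.
        let this = self.get_mut();
        match mem::replace(this, HandwrittenTask::Finished) {
            HandwrittenTask::Initial => {
                // First poll: create the `Rc` and suspend once.
                *this = HandwrittenTask::Suspended { rc1: Rc::new(Cell::new(1)) };
                cx.waker().wake_by_ref();
                Poll::Pending
            }
            HandwrittenTask::Suspended { rc1 } => {
                // Second poll: observe the state saved by the first poll.
                rc1.set(rc1.get() + 1);
                Poll::Ready(rc1.get())
            }
            HandwrittenTask::Finished => panic!("polled after completion"),
        }
    }
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor (an assumption for this sketch): polls in a loop until ready.
pub fn block_on<F: Future>(future: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut future = pin!(future);
    loop {
        if let Poll::Ready(output) = future.as_mut().poll(&mut cx) {
            return output;
        }
        std::thread::yield_now();
    }
}
```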

How could Rust solve them?

1. Rust behaves this way already.
2. Rust should implement Send for these intermediate states if they contain only owned !Send values and no thread-local !Send values.
3. Same as above. But I haven’t seen much value in this.
4. This is the hard part. How do we know that a !Send is owned by the future? Let’s discuss it separately.
5. I think we could introduce !Await to forbid types from being held across .await. This way we could avoid the deadlock at compile time. All thread-level lock guards should implement !Await.

How could Rust guard futures from accessing thread locals?

Continuing from the above, there are several candidates.

Document access to thread locals in async futures as not safe, just as stackful coroutines document it.
Restrict !Send operations on thread locals. This is quite aggressive and unfriendly, but it is worth trying if we can work it out in a non-breaking way or with a brand new API.
'thread lifetime. This is the once-and-for-all solution. The challenge, from my side, is how to deal with Clone of !Send values. The lifetime should be able to decorate types in addition to references, just like how 'static applies to types. And any reference to 'thread will make the future !Send and also not 'static by definition. I am positive about this, as lifetimes are at heart a compile-time concept and we are dealing with compiler-generated code.

Thoughts in community

While preparing and writing this article, I found that many people have similar thoughts.

How often do you want non-send futures?
What shall Sync mean across an .await?
Future + Send Was (Not) Unavoidable
What If We Pretended That a Task = Thread?
Controversial opinion: keeping Rc across await should not make future !Send by itself
Non-Send Futures When? The author constructed a !Sync type which makes the resulting future !Send, just like I did before.
Non-Send Futures

I am not the first to have this thought, and I will not be the last. I would like to see a Future be !Send if and only if it captures !Send or exchanges !Send with thread locals.
