Async Rust - Zed Blog

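Welcome to the first article in a new series called Zed Decoded. In Zed Decoded I take a close look at Zed: how it's built, which data structures it uses, which technologies and techniques it relies on, which features it has, and which bugs we ran into. The best part? I won't do this alone. I get to interview my colleagues here at Zed and ask them about everything I want to know.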

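Companion Video

Async Rust

This post comes with a 1-hour companion video in which Thorsten and Antonio explore how Zed uses async Rust. It's a relaxed conversation that focuses on the code and digs into some topics that didn't fit into the written post.

Watch the video here →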

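The first topic on my list: async Rust and how it's used in Zed. Over the past few months I've become fascinated with async Rust (Zed is the first codebase I've worked on that uses it), so I decided to sit down with Antonio, one of Zed's co-founders, and ask him how we use async Rust in Zed.

We won't get into the details of async Rust itself (you'll need some familiarity with it to follow the finer points of the code we'll see). Instead, we'll focus on how Zed uses async Rust to build a high-performance, native application: what async code looks like at the application level, which runtime it uses, and why it uses that runtime.

Writing async Rust with GPUI

Let's jump right in. Here's a snippet that's representative of async code in the Zed codebase: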

fn show_cursor_names(&mut self, cx: &mut ViewContext<Self>) {
    self.show_cursor_names = true;
    cx.notify();
    cx.spawn(|this, mut cx| async move {
        cx.background_executor().timer(CURSORS_VISIBLE_FOR).await;
        this.update(&mut cx, |this, cx| {
            this.show_cursor_names = false;
            cx.notify()
        })
        .ok()
    })
    .detach();
}

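It's a function from our Editor. When it's called, Zed shows the name of each cursor's owner: your name or the names of the people you're collaborating with. It's called, for example, when the editor regains focus, so you can quickly see who's doing what and where.

Here's what show_cursor_names does:

It turns on Editor.show_cursor_names and triggers a re-render of the editor. When Editor.show_cursor_names is true, the cursor names are rendered.
It spawns a task that sleeps for CURSORS_VISIBLE_FOR, turns the cursor names off again, and triggers another re-render.

If you've written async Rust before, you can spot some familiar elements in the code: there's a .spawn, there's an async move, there's an await. And if you've used the async_task crate before, this might remind you of code like this: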

let ex = Executor::new();
ex.spawn(async {
    loop {
        Timer::after(Duration::from_secs(1)).await;
    }
})
.detach();

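That's because Zed uses async_task for its Task type. But in this example there's an Executor, so where is that in the Zed code? And what does cx.background_executor() do? Good questions. Let's find the answers.

macOS as our async runtime

One remarkable thing about async Rust is that it lets you choose your own runtime. That's different from many other languages, such as JavaScript, in which you can also write asynchronous code. Runtime isn't a precisely defined term, but for our purposes here we can say that a runtime is the thing that runs your asynchronous code and provides you with utilities such as .spawn and something like an Executor.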
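To make the idea of a runtime less abstract, here is a minimal sketch of about the smallest one possible: a function that polls a single future until it is ready, using only the standard library. The names block_on, ThreadWaker, and add are hypothetical and exist only for this illustration; none of this is taken from Zed, tokio, or smol.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Hypothetical waker: wakes the thread that is blocked inside `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Hypothetical helper: drives one future to completion on the current thread.
pub fn block_on<F: Future>(future: F) -> F::Output {
    // Pin the future to the stack so it can be polled.
    let mut future = pin!(future);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match future.as_mut().poll(&mut cx) {
            Poll::Ready(output) => return output,
            // Not ready yet: sleep until the waker unparks this thread, then poll again.
            Poll::Pending => thread::park(),
        }
    }
}

/// Example future, for illustration only.
pub async fn add(a: u32, b: u32) -> u32 {
    a + b
}
```

Calling block_on(add(2, 3)) polls the future once and returns its output. A real runtime adds what this sketch leaves out: spawning many tasks, timers, and I/O.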

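The most popular of these runtimes is probably tokio, but there are also smol, embassy, and others. Choosing and switching runtimes comes with tradeoffs, and runtimes are only interchangeable to a degree, but it is possible.

It turns out that in Zed for macOS we don't use any of these runtimes. We also don't use async_task's Executor. But something has to execute the asynchronous code, right? Otherwise I wouldn't be typing these lines in Zed.

So what does cx.spawn do, and what is cx.background_executor()? Let's take a look. Here are three relevant methods from GPUI's AppContext: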

// crates/gpui/src/app.rs
 
impl AppContext {
    pub fn background_executor(&self) -> &BackgroundExecutor {
        &self.background_executor
    }
 
    pub fn foreground_executor(&self) -> &ForegroundExecutor {
        &self.foreground_executor
    }
 
    /// Spawns the future returned by the given function on the thread pool. The closure will be invoked
    /// with [AsyncAppContext], which allows the application state to be accessed across await points.
    pub fn spawn<Fut, R>(&self, f: impl FnOnce(AsyncAppContext) -> Fut) -> Task<R>
    where
        Fut: Future<Output = R> + 'static,
        R: 'static,
    {
        self.foreground_executor.spawn(f(self.to_async()))
    }
 
    // [...]
}

Alright, two executors, foreground_executor and background_executor, and both have .spawn methods. We already met background_executor in show_cursor_names, where it supplied the timer, and here, in AppContext.spawn, we see foreground_executor's .spawn at work.

One level deeper, we can see what foreground_executor.spawn does:

// crates/gpui/src/executor.rs
 
impl ForegroundExecutor {
    /// Enqueues the given Task to run on the main thread at some point in the future.
    pub fn spawn<R>(&self, future: impl Future<Output = R> + 'static) -> Task<R>
    where
        R: 'static,
    {
        let dispatcher = self.dispatcher.clone();
        fn inner<R: 'static>(
            dispatcher: Arc<dyn PlatformDispatcher>,
            future: AnyLocalFuture<R>,
        ) -> Task<R> {
            let (runnable, task) = async_task::spawn_local(future, move |runnable| {
                dispatcher.dispatch_on_main_thread(runnable)
            });
            runnable.schedule();
            Task::Spawned(task)
        }
        inner::<R>(dispatcher, Box::pin(future))
    }
 
    // [...]
}

There's a lot of code and syntax here, but it boils down to this: the .spawn method takes a future, turns it into a Runnable and a Task, and asks the dispatcher to run it on the main thread.
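The same flow can be sketched without GPUI or async_task. In the std-only sketch below, a plain channel plays the role of the dispatcher's main queue, and waking a task simply puts its runnable back on that queue. Every name here (Runnable, spawn, run_until_idle, YieldNow, demo) is hypothetical, and unlike async_task this sketch returns no Task handle for the result.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

type BoxFuture = Pin<Box<dyn Future<Output = ()> + Send>>;

/// Hypothetical stand-in for a runnable: a future plus the queue that runs it.
pub struct Runnable {
    future: Mutex<Option<BoxFuture>>,
    queue: Mutex<Sender<Arc<Runnable>>>,
}

impl Wake for Runnable {
    /// Waking a task "dispatches" it: the runnable goes back onto the queue.
    fn wake(self: Arc<Self>) {
        let queue = self.queue.lock().unwrap().clone();
        let _ = queue.send(self);
    }
}

/// Wraps `future` in a runnable and schedules its first poll.
pub fn spawn(queue: &Sender<Arc<Runnable>>, future: impl Future<Output = ()> + Send + 'static) {
    let future: BoxFuture = Box::pin(future);
    let runnable = Arc::new(Runnable {
        future: Mutex::new(Some(future)),
        queue: Mutex::new(queue.clone()),
    });
    let _ = queue.send(runnable);
}

/// Plays the role of the main thread: runs queued runnables until none are left.
pub fn run_until_idle(ready: &Receiver<Arc<Runnable>>) {
    while let Ok(runnable) = ready.try_recv() {
        let waker = Waker::from(runnable.clone());
        let mut cx = Context::from_waker(&waker);
        let mut slot = runnable.future.lock().unwrap();
        if let Some(mut future) = slot.take() {
            if future.as_mut().poll(&mut cx).is_pending() {
                // Not finished: keep the future around for the next wake-up.
                *slot = Some(future);
            }
        }
    }
}

/// Future that wakes itself and returns `Pending` once, then completes.
pub struct YieldNow(pub bool);

impl Future for YieldNow {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            return Poll::Ready(());
        }
        self.0 = true;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// Spawns one task that yields once, runs the queue, and returns what the task logged.
pub fn demo() -> Vec<&'static str> {
    let (queue, ready) = channel();
    let log = Arc::new(Mutex::new(Vec::new()));
    let task_log = log.clone();
    spawn(&queue, async move {
        task_log.lock().unwrap().push("before yield");
        YieldNow(false).await;
        task_log.lock().unwrap().push("after yield");
    });
    run_until_idle(&ready);
    let entries = log.lock().unwrap().clone();
    entries
}
```

The task in demo is polled twice: once when it is spawned and once after it wakes itself. In both cases it reaches the "main thread" loop only by travelling through the queue, which is the property the dispatcher provides in GPUI.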

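The dispatcher in ForegroundExecutor.spawn is a PlatformDispatcher. It's GPUI's equivalent of the Executor from the async_task example above. It has Platform in its name because it has different implementations for macOS, Linux, and Windows. In this post we'll only look at macOS, since it's currently our best-supported platform and the Linux and Windows implementations are still in development.

So what does dispatch_on_main_thread do? Is this where an async runtime finally gets called? No, there's no runtime there either: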

// crates/gpui/src/platform/mac/dispatcher.rs
 
impl PlatformDispatcher for MacDispatcher {
    fn dispatch_on_main_thread(&self, runnable: Runnable) {
        unsafe {
            dispatch_async_f(
                dispatch_get_main_queue(),
                runnable.into_raw().as_ptr() as *mut c_void,
                Some(trampoline),
            );
        }
    }
    // [...]
}
 
extern "C" fn trampoline(runnable: *mut c_void) {
    let task = unsafe { Runnable::<()>::from_raw(NonNull::new_unchecked(runnable as *mut ())) };
    task.run();
}

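dispatch_async_f is where the call leaves the Zed codebase: dispatch_async_f is a binding, generated at compile time, to the dispatch_async_f function in macOS's Grand Central Dispatch (GCD). dispatch_get_main_queue() is such a binding too.

That's right: Zed, as a macOS application, uses macOS's GCD to schedule and execute work.

What happens in the snippet above is that Zed turns the Runnable (think of it as a handle to a Task) into a raw pointer and passes it, together with trampoline, to dispatch_async_f, which puts it on the main_queue.

When GCD decides it's time to run the next item on the main_queue, it pops the item off the queue and calls trampoline, which takes the raw pointer, turns it back into a Runnable, and calls .run() on it to poll the Future behind its Task.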
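The pointer round trip can be tried out without GCD. The sketch below is an assumption-laden stand-in: FakeQueue is a hypothetical in-process queue that stores a context pointer and a C function pointer, the way a C API such as dispatch_async_f would, and a boxed closure takes the place of the Runnable.

```rust
use std::cell::RefCell;
use std::ffi::c_void;
use std::rc::Rc;

/// Stand-in for a runnable: any one-shot piece of work.
type Job = Box<dyn FnOnce()>;

/// C-compatible entry point: turns the raw pointer back into a `Job` and runs it.
extern "C" fn trampoline(context: *mut c_void) {
    // SAFETY: `context` comes from `Box::into_raw` in `FakeQueue::dispatch`,
    // and each pointer is handed to `trampoline` exactly once.
    let boxed: Box<Job> = unsafe { Box::from_raw(context as *mut Job) };
    let job: Job = *boxed;
    job();
}

/// Hypothetical stand-in for a GCD queue: it only sees pointers, never Rust types.
pub struct FakeQueue {
    items: Vec<(*mut c_void, extern "C" fn(*mut c_void))>,
}

impl FakeQueue {
    pub fn new() -> Self {
        Self { items: Vec::new() }
    }

    /// Plays the role of `dispatch_async_f`: remember what to call later, and with what.
    fn dispatch_async_f(&mut self, context: *mut c_void, work: extern "C" fn(*mut c_void)) {
        self.items.push((context, work));
    }

    /// Boxes the job twice so that the outer pointer is thin and fits in a `*mut c_void`.
    pub fn dispatch(&mut self, job: impl FnOnce() + 'static) {
        let job: Job = Box::new(job);
        let context = Box::into_raw(Box::new(job)) as *mut c_void;
        self.dispatch_async_f(context, trampoline);
    }

    /// Runs every queued item in order. Items that are never drained are leaked.
    pub fn drain(&mut self) {
        for (context, work) in self.items.drain(..) {
            work(context);
        }
    }
}

/// Queues two jobs and returns the order in which they ran.
pub fn run_two_jobs() -> Vec<u32> {
    let log: Rc<RefCell<Vec<u32>>> = Rc::new(RefCell::new(Vec::new()));
    let mut queue = FakeQueue::new();
    for id in [1, 2] {
        let log = Rc::clone(&log);
        queue.dispatch(move || log.borrow_mut().push(id));
    }
    // Nothing has run yet; work only happens once the queue is drained.
    assert!(log.borrow().is_empty());
    queue.drain();
    let entries = log.borrow().clone();
    entries
}
```

The important detail is ownership: Box::into_raw gives up ownership of the job when it is queued, and Box::from_raw in trampoline takes it back exactly once, so the job is freed after it runs.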

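And, as I was surprised to find out, that's it. That's essentially all the code it takes to use GCD as a "runtime" for async Rust. Where other applications use tokio or smol, Zed uses thin wrappers around GCD and crates such as async_task.

But wait, what about the BackgroundExecutor? It's very similar to the ForegroundExecutor. The main difference is that the BackgroundExecutor calls this method on PlatformDispatcher: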

impl PlatformDispatcher for MacDispatcher {
    fn dispatch(&self, runnable: Runnable, _: Option<TaskLabel>) {
        unsafe {
            dispatch_async_f(
                dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH.try_into().unwrap(), 0),
                runnable.into_raw().as_ptr() as *mut c_void,
                Some(trampoline),
            );
        }
    }
}

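The only difference between this dispatch method and dispatch_on_main_thread from above is the queue. The BackgroundExecutor doesn't use the main_queue, but a global queue.

Like me when I first read this code, you might now wonder: why?

Why use GCD? Why have a ForegroundExecutor and a BackgroundExecutor? What's so special about the main_queue?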

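Never block the main thread

In a native UI application, the main thread is important. No, the main thread is holy. The main thread is where rendering happens, where user input is handled, and where the operating system communicates with the application. The main thread should never block. The responsiveness of the application lives on the main thread.

That's true for Cocoa applications on macOS too. Rendering, receiving user input, communicating with macOS, and other platform concerns all have to happen on the main thread. Since Zed wants to cooperate closely with macOS to ensure high performance and responsiveness, it does two things.

First, it uses GCD to schedule its work (on and off the main thread) so that macOS can maintain high responsiveness and overall system efficiency.

Second, the importance of the main thread is baked into GPUI, the UI framework, through the explicit distinction between the ForegroundExecutor and the BackgroundExecutor, both of which we saw above.

As a writer of application-level Zed code, you should always be mindful of what happens on the main thread and never put too much blocking work on it. If you were to put a blocking sleep(10ms) on the main thread, rendering the UI would have to wait for that sleep() to finish, which means rendering the next frame would take more than 8 ms, roughly the maximum frame time available if you want to hit 120 FPS. You'd "drop a frame", as they say.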
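The arithmetic behind that frame budget is simple enough to write down. The helper names frame_budget and drops_frame below are hypothetical and are not part of Zed or GPUI.

```rust
use std::time::Duration;

/// Time available to produce one frame at the given refresh rate.
/// Panics if `fps` is zero.
pub fn frame_budget(fps: u32) -> Duration {
    Duration::from_secs(1) / fps
}

/// True if blocking the main thread for `blocking` would by itself exceed one frame's budget.
pub fn drops_frame(blocking: Duration, fps: u32) -> bool {
    blocking > frame_budget(fps)
}
```

At 120 FPS the budget is 1 s / 120, or about 8.33 ms, so a 10 ms sleep exceeds it. At 60 FPS the budget is about 16.67 ms, and the same sleep would still fit, leaving little room for anything else.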

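With that in mind, let's look at another small snippet. This time it's from the built-in terminal in Zed, a function that searches through the contents of the terminal buffer: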

// crates/terminal/src/terminal.rs
 
pub struct Terminal {
    term: Arc<Mutex<alacritty_terminal::Term<ZedListener>>>,
 
    // [... other fields ...]
}
 
pub fn find_matches(
    &mut self,
    mut searcher: RegexSearch,
    cx: &mut ModelContext<Self>,
) -> Task<Vec<RangeInclusive<AlacPoint>>> {
    let term = self.term.clone();
    cx.background_executor().spawn(async move {
        let term = term.lock();
 
        all_search_matches(&term, &mut searcher).collect()
    })
}

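The first line in find_matches, self.term.clone(), happens on the main thread and is quick: self.term is an Arc<Mutex<...>>, so cloning only bumps the reference count on the Arc. The call to .lock() then only happens in the background, since .lock() might block. Contention for this lock is unlikely in this particular code path, but if there were contention, it wouldn't freeze the UI, only a single background thread. That's the pattern: if it's quick, you can do it on the main thread, but if it might take a while or even block, put it on a background thread by using cx.background_executor().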
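Here is the same pattern as a std-only sketch, with a plain thread standing in for GPUI's background executor and a list of strings standing in for the terminal buffer. This Terminal type and its substring search are hypothetical simplifications, not Zed's implementation.

```rust
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// Hypothetical, simplified terminal: the shared state is just a list of lines.
pub struct Terminal {
    lines: Arc<Mutex<Vec<String>>>,
}

impl Terminal {
    pub fn new(lines: Vec<String>) -> Self {
        Self { lines: Arc::new(Mutex::new(lines)) }
    }

    /// Returns a handle to a background search for the indices of lines containing `needle`.
    pub fn find_matches(&self, needle: String) -> JoinHandle<Vec<usize>> {
        // Cheap, done on the calling thread: only bumps the reference count.
        let lines = Arc::clone(&self.lines);
        thread::spawn(move || {
            // Potentially blocking, done on the background thread.
            let lines = lines.lock().unwrap();
            let matches: Vec<usize> = lines
                .iter()
                .enumerate()
                .filter(|(_, line)| line.contains(needle.as_str()))
                .map(|(index, _)| index)
                .collect();
            matches
        })
    }
}
```

The caller gets the JoinHandle back immediately and decides when to wait for it. In GPUI the returned Task plays that role and is awaited rather than joined, so the main thread never blocks on it.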

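Here's another example: project-wide search in Zed (⌘-shift-f). It pushes as much heavy work as possible onto background threads so that Zed stays responsive while searching through tens of thousands of files in your project. Here's a simplified and heavily commented excerpt from Project.search_local that shows the main part of the search: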

// crates/project/src/project.rs
 
// Spawn a Task on the background executor. The Task finds all files on disk
// that contain at least one match for the given `query` and sends them back
// over the `matching_paths_tx` channel.
let (matching_paths_tx, matching_paths_rx) = smol::channel::bounded(1024);
cx.background_executor()
    .spawn(Self::background_search(
        // [... other arguments ... ]
        query.clone(),
        matching_paths_tx,
    ))
    .detach();
 
// Setup a channel on which we stream results to the UI.
let (result_tx, result_rx) = smol::channel::bounded(1024);
 
// On the main thread, spawn a Task that first...
cx.spawn(|this, mut cx| async move {
    // ... waits for the background thread to return the filepaths of
    // the maximum number of files that we want to search...
    let mut matching_paths = matching_paths_rx
        .take(MAX_SEARCH_RESULT_FILES + 1)
        .collect::<Vec<_>>()
        .await;
 
    // ... then loops over the filepaths in chunks of 64...
    for matching_paths_chunk in matching_paths.chunks(64) {
        let mut chunk_results = Vec::new();
 
        for matching_path in matching_paths_chunk {
            // .... opens each file....
            let buffer = this.update(&mut cx, |this, cx| {
                this.open_buffer((*worktree_id, path.clone()), cx)
            })?;
 
            // ... and pushes into `chunk_results` a Task that
            // runs on the main thread and ...
            chunk_results.push(cx.spawn(|cx| async move {
                // ... waits for the file to be opened ...
                let buffer = buffer.await?;
                // ... creates a snapshot of its contents ...
                let snapshot = buffer.read_with(&cx, |buffer, _| buffer.snapshot())?;
                // ... and again starts a Task on the background executor,
                // which searches through the snapshot for all results.
                let ranges = cx
                    .background_executor()
                    .spawn(async move {
                        query
                            .search(&snapshot, None)
                            .await
                            .iter()
                            .collect::<Vec<_>>()
                    })
                    .await;
 
                Ok((buffer, ranges))
            }));
        }
 
        // On the main thread, non-blocking, wait for all buffers to be searched...
        let chunk_results = futures::future::join_all(chunk_results).await;
        for result in chunk_results {
            if let Some((buffer, ranges)) = result.log_err() {
                // send the results over the results channel
                result_tx
                    .send(SearchResult::Buffer { buffer, ranges })
                    .await?;
            }
        }
    }
})
.detach();
 
result_rx

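That's a lot of code (sorry!), but there's not much more going on than the concepts we already talked about. What's noteworthy here, and the reason I wanted to show it, is the ping-pong between the main thread and the background threads:

Main thread: kicks off the search and hands the query to the background thread.
Background thread: finds the files in the project that contain at least one match for the query and sends them back over the channel as they are found.
Main thread: waits until the background thread has found MAX+1 results, then drops the channel, which causes the background thread to exit.
Main thread: spawns multiple other main-thread tasks to open each file and create a snapshot.
Background threads: search through the buffer snapshot to find all results in a buffer and send the results back over a channel.
Main thread: waits for the background threads to find results in all buffers, then sends them back to the caller of the outer search_local method.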
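The first three steps of that ping-pong can be sketched with the standard library alone. Below, a thread stands in for the background executor and std's bounded sync_channel for smol's bounded channel. MAX_RESULTS, search, and the substring check are hypothetical, and the per-file second phase is left out.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Hypothetical limit, standing in for MAX_SEARCH_RESULT_FILES.
const MAX_RESULTS: usize = 3;

/// Returns up to `MAX_RESULTS` matching paths and whether more matches existed.
pub fn search(candidates: Vec<String>, query: String) -> (Vec<String>, bool) {
    // Bounded channel: the producer blocks when it gets too far ahead of the consumer.
    let (matching_tx, matching_rx) = sync_channel::<String>(2);

    // Background side: stream matching paths back as they are found.
    let producer = thread::spawn(move || {
        for path in candidates {
            if path.contains(query.as_str()) {
                // `send` fails once the receiver has been dropped: time to stop.
                if matching_tx.send(path).is_err() {
                    return;
                }
            }
        }
    });

    // Foreground side: take one more than the limit, to learn whether the limit was hit.
    let mut matches: Vec<String> = matching_rx.iter().take(MAX_RESULTS + 1).collect();
    // Dropping the receiver is what tells the producer to exit early.
    drop(matching_rx);
    producer.join().unwrap();

    let limit_reached = matches.len() > MAX_RESULTS;
    matches.truncate(MAX_RESULTS);
    (matches, limit_reached)
}
```

Unlike the Zed code, this sketch blocks the calling thread while it collects results. In search_local the equivalent wait is an .await inside a spawned task, so the main thread stays free in the meantime.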

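Even though search_local can be optimized and the search made faster (we haven't gotten around to that yet), it can already search thousands of files without blocking the main thread, while still using multiple CPU cores.

Async-friendly data structures, testing executors, and more

I'm pretty sure the previous excerpt raised a lot of questions that I haven't answered yet: how exactly is it possible to send a buffer snapshot to a background thread? How efficient is it to do that? What if I want to modify such a snapshot on another thread? How do you test all this?

I'm sorry to say that I couldn't fit all the answers into this post. But there's the companion video, in which Antonio and I dig into these areas and talk about async-friendly data structures, copy-on-write buffer snapshots, and more. Antonio also gave a great talk about how we do property testing of async Rust code in the Zed codebase, which I highly recommend. I also promise that there will be a future post about the data structures underlying the Zed editor.

Until next time!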