
Rust: A General Understanding of References and Their Mutability

by Yanli 盐粒 on 2023-02-18

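0. Overview

The most central concept in Rust is ownership. When a variable is the sole owner of a piece of data, dropping that variable frees the data, and it cannot be recovered. This ownership model makes each piece of data the responsibility of exactly one owner, which helps avoid problems such as memory leaks and double frees.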
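As a minimal sketch of what "sole owner" means in practice, the example below moves a String into a function and lets it be dropped there. The helper name `consume` is hypothetical and only serves the illustration.

```rust
// Hypothetical helper: takes ownership of `s` and reports its length.
fn consume(s: String) -> usize {
    s.len()
} // `s` goes out of scope here, so the String is dropped and its buffer freed

fn main() {
    let a = String::from("hello world");
    let n = consume(a); // ownership of the String moves into `consume`
    // println!("{}", a); // would not compile: `a` has been moved
    println!("{}", n); // prints 11
}
```

After the move, `a` can no longer be used, so there is exactly one place responsible for freeing the data.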

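If someone wants to read or modify data without taking ownership of it, they can use a reference (&). A Rust reference is more like a pointer in C/C++ than like a reference in C++.

By default a reference is immutable: neither the reference itself nor the variable it points to can be modified. But once we talk about a "mutable reference", we hit the same question as with top-level versus low-level const pointers in C/C++: is it the reference itself that is mutable, or the variable it points to?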

1. A `reference` in Rust is more like a `pointer` in C/C++

In an editor with the rust-analyzer plugin, the following code is displayed with its inferred types:

rust

    let a = String::from("hello world");
    let b: &String = &a;
    let c: &&String = &b;

The analyzer shows that b has type &String and c has type &&String. Doesn't that look a lot like int* and int** in C/C++?
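To make the pointer analogy concrete, here is a small sketch that follows both levels of reference explicitly. The helper `len_through_two_refs` is a hypothetical name introduced for this example.

```rust
// Hypothetical helper: dereferences twice to reach the String itself.
fn len_through_two_refs(c: &&String) -> usize {
    (**c).len()
}

fn main() {
    let a = String::from("hello world");
    let b: &String = &a;
    let c: &&String = &b;

    // `*c` is `b`, and `b` holds the address of `a`.
    assert!(std::ptr::eq(*c, &a));
    println!("{}", len_through_two_refs(c)); // prints 11
}
```

Each `*` removes one level of `&`, just as it removes one level of pointer in C/C++.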

2. Top-level and low-level mutable references in Rust

Let's start with C/C++. As C++ Primer describes, a pointer in C/C++ can be const in two ways:

cpp

int a = 0;
const int *b = &a;
int *const c = &a;

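The first, b, is a pointer to a constant. a itself is not declared const, but b treats it as a constant and forbids modifying it through b. However, b itself is not const, so it can be repointed to another constant value, for example: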

cpp

const int a2 = 1;
b = &a2;

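In this case, b is said to have low-level const.

In contrast, top-level const means that the pointer c itself cannot be modified, even though the variable it points to need not be const. We cannot make c point to another variable, but we can still write through it, e.g. `*c = 2;`.

Seen this way, a C++ reference behaves as if it were top-level const by nature: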

cpp

int &d = a;

The location it refers to is fixed when the reference is defined. d is just an alias for a, nothing more.

Back in Rust, we can define analogous cases:

rust

let mut s1 = String::from("hello world");
let s3 = &mut s1;
*s3 = String::from("hello new world");

Here s3 is a low-level mutable reference. The last line changes the value of s1, the variable s3 points to. Afterwards, s1 holds "hello new world" and s3 still points to s1.
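The snippet above can be turned into a small runnable check. This is only a sketch; the helper `overwrite` is a hypothetical name, and it does nothing more than the `*s3 = ...` assignment.

```rust
// Hypothetical helper: writes a new String through a mutable reference.
fn overwrite(target: &mut String, new_value: &str) {
    *target = String::from(new_value); // replaces the pointed-to value
}

fn main() {
    let mut s1 = String::from("hello world");
    let s3 = &mut s1; // `s3` itself is not `mut`, but it allows mutating `s1`
    overwrite(s3, "hello new world");
    println!("{}", s1); // prints hello new world
}
```

Note that `s1` must be declared `mut` for `&mut s1` to be allowed, while `s3` does not need `mut` because the binding itself never changes.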

rust

let s1 = String::from("hello world");
let s2 = String::from("hello another new world");
let mut s4 = &s1;
s4 = &s2;

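Here s4 is a top-level mutable reference. The last line changes the location s4 points to.

Initially s4 points to s1 (whose value is "hello world"); after the last line runs, s4 points to s2 (whose value is "hello another new world"). Neither string is modified.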
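A short sketch of the same idea as a function: the local binding is repointed, and the strings themselves stay untouched. The helper `pick` and its `use_second` flag are hypothetical, introduced only for this illustration.

```rust
// Hypothetical helper: starts at `first` and optionally repoints to `second`.
fn pick<'a>(first: &'a String, second: &'a String, use_second: bool) -> &'a String {
    let mut r = first; // the binding `r` is mutable; the Strings are not
    if use_second {
        r = second; // changes where `r` points, not what it points to
    }
    r
}

fn main() {
    let s1 = String::from("hello world");
    let s2 = String::from("hello another new world");
    println!("{}", pick(&s1, &s2, true)); // prints hello another new world
    println!("{}", s1); // prints hello world
}
```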

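3. Auto-dereferencing / Deref coercion

So far this looks no different from C/C++ pointers, right? But things get more complicated, because these are references, which means they must also behave like aliases.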

rust

let mut s1 = String::from("hello world");
let s3 = &mut s1;
println!("{}", s3.len());
println!("{}", (*s3).len());
println!("{}", s1.len());

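This code runs, and all three lines print the same length. Here s3 can be treated as an alias of s1. (The final s1.len() is accepted only because s3 is not used again after it.)

However: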

rust

let mut s1 = String::from("hello world");
let s3 = &mut s1;
// *s3 = String::from("hello new world");
s3 = String::from("hello new world"); // does not compile: expected `&mut String`, found `String`

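If we drop the dereference operator *, the assignment no longer compiles: the left-hand side has type &mut String while the right-hand side is a String. s3 can no longer act as an alias of s1; it is just a pointer.

In the top-level mutable case, on the other hand: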

rust

let s1 = String::from("hello world");
let s2 = String::from("hello another new world");
let mut s4 = &s1;
s4 = &s2; // This works!

Here no * is needed, and the code works. This time we do not want the compiler to treat s4 as an alias; we want it to point to a different location, from s1 to s2.

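So when does a reference behave like an alias, and when like a pointer? Here are some official descriptions:

From https://doc.rust-lang.org/std/primitive.reference.html
The following traits are implemented for all &T, regardless of the type of its referent:
- Copy
- Clone (Note that this will not defer to T's Clone implementation if it exists!)
- Deref
- Borrow
- fmt::Pointer
From https://doc.rust-lang.org/reference/expressions/field-expr.html#automatic-dereferencing
If the type of the container operand implements Deref or DerefMut depending on whether the operand is mutable, it is automatically dereferenced as many times as necessary to make the field access possible. This process is also called autoderef for short.
From https://doc.rust-lang.org/reference/expressions/method-call-expr.html
The first step is to build a list of candidate receiver types. Obtain these by repeatedly dereferencing the receiver expression's type, adding each type encountered to the list, then finally attempting an unsized coercion at the end, and adding the result type if that is successful. Then, for each candidate T, add &T and &mut T to the list immediately after T.

Every primitive reference type implements the Deref trait. Two mechanisms, auto-dereferencing and Deref coercion, can both use Deref to give references their alias-like behavior.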
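The two mechanisms can be seen side by side in a small sketch. The helper `shout` is a hypothetical function that expects a &str; everything else uses only standard String methods.

```rust
// Hypothetical helper that expects a string slice.
fn shout(s: &str) -> String {
    s.to_uppercase()
}

fn main() {
    let a = String::from("hello world");
    let b: &String = &a;
    let c: &&String = &b;

    // Autoderef: method lookup dereferences `c` until `len` is found.
    println!("{}", c.len()); // prints 11

    // Deref coercion: `&String` is coerced to `&str` at the call site.
    println!("{}", shout(b)); // prints HELLO WORLD

    // The coercion also follows several layers of reference.
    println!("{}", shout(c)); // prints HELLO WORLD
}
```

In both cases no explicit * is written; the compiler inserts the dereferences because the receiver or the parameter type tells it what is expected.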

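In general, a reference A behaves like an alias in the following situations (in argument position this relies on Deref coercion, so it applies when the parameter is itself a reference type):

A.attribute_1
A.method_2(B)
B.method_3(A)
function_4(A)

In the following situations A behaves like a pointer instead: no dereference is inserted for you, so both sides must sit at the same level of reference:

A = B
if A == B {…}
match A { B => … }

Two caveats apply. When both operands are references, == still compares the values they point to rather than their addresses. And match can often see through a reference when the pattern is not itself a reference pattern.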
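The comparison case can be checked with a short sketch. The helper `same_text` is a hypothetical name; `std::ptr::eq` is used only to show that the two references hold different addresses.

```rust
// Hypothetical helper: both operands have the same type, `&String`.
fn same_text(a: &String, b: &String) -> bool {
    a == b // compares the pointed-to Strings, not the addresses
}

fn main() {
    let s1 = String::from("hello world");
    let s2 = String::from("hello world");
    let a = &s1;
    let b = &s2;

    println!("{}", same_text(a, b)); // prints true
    println!("{}", std::ptr::eq(a, b)); // prints false
    // println!("{}", a == s2); // would not compile: `&String` vs `String`
    println!("{}", *a == s2); // prints true, the dereference is explicit
}
```

The commented-out line is the pointer-like part: the compiler does not strip the & for us in a comparison, so the two sides have to be brought to the same level by hand.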
