std::memory_order

Defined in header <atomic>
enum memory_order {

    memory_order_relaxed,
    memory_order_consume,
    memory_order_acquire,
    memory_order_release,
    memory_order_acq_rel,
    memory_order_seq_cst

};
(since C++11)

std::memory_order specifies how regular, non-atomic memory accesses are to be ordered around an atomic operation. Absent any constraints on a multiprocessor system, when multiple threads simultaneously read and write several variables, one thread can observe the values change in an order different from the order in which another thread wrote them. Indeed, the apparent order of changes can even differ among multiple reader threads. Some similar effects can occur even on uniprocessor systems, because the memory model allows compiler transformations.

The default behavior of all atomic operations in the library provides sequentially consistent ordering (see discussion below). That default can hurt performance, but the library's atomic operations can be given an additional std::memory_order argument to specify the exact constraints, beyond atomicity, that the compiler and processor must enforce for that operation.
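
For illustration, here is a minimal sketch of passing the ordering argument; the variable a and the function sketch are assumptions made for this example. Omitting the argument requests the sequentially consistent default.

#include <atomic>
 
std::atomic<int> a = {0};
 
void sketch()
{
    a.store(1);                                // same as a.store(1, std::memory_order_seq_cst)
    int v = a.load();                          // same as a.load(std::memory_order_seq_cst)
    a.store(2, std::memory_order_release);     // explicitly request a weaker ordering
    int w = a.load(std::memory_order_acquire);
    (void)v; (void)w;                          // silence unused-variable warnings
}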

Constants

Defined in header <atomic>
Value Explanation
memory_order_relaxed Relaxed operation: there are no synchronization or ordering constraints; only the atomicity of this operation is required (see Relaxed ordering below).
memory_order_consume A load operation with this memory order performs a consume operation on the affected memory location: no reads or writes in the current thread that depend on the value currently loaded can be reordered before this load. Writes to data-dependent variables in other threads that release the same atomic variable are visible in the current thread. On most platforms, this affects compiler optimizations only (see Release-Consume ordering below).
memory_order_acquire A load operation with this memory order performs the acquire operation on the affected memory location: no reads or writes in the current thread can be reordered before this load. All writes in other threads that release the same atomic variable are visible in the current thread (see Release-Acquire ordering below).
memory_order_release A store operation with this memory order performs the release operation: no reads or writes in the current thread can be reordered after this store. All writes in the current thread are visible in other threads that acquire the same atomic variable (see Release-Acquire ordering below), and writes that carry a dependency into the atomic variable become visible in other threads that consume the same atomic (see Release-Consume ordering below).
memory_order_acq_rel A read-modify-write operation with this memory order is both an acquire operation and a release operation. No memory reads or writes in the current thread can be reordered before or after this store. All writes in other threads that release the same atomic variable are visible before the modification, and the modification is visible in other threads that acquire the same atomic variable.
memory_order_seq_cst Any operation with this memory order is both an acquire operation and a release operation, plus a single total order exists in which all threads observe all modifications in the same order (see Sequentially-consistent ordering below).

Formal description

Inter-thread synchronization and memory ordering determine how evaluations and side effects of expressions are ordered between different threads of execution. They are defined in the following terms:

Sequenced-before

Within the same thread, evaluation A may be sequenced-before evaluation B, as described in evaluation order.

Carries dependency

Within the same thread, evaluation A that is sequenced-before evaluation B may also carry a dependency into B (that is, B depends on A) if any of the following is true:

1) The value of A is used as an operand of B, except
a) if B is a call to std::kill_dependency,
b) if A is the left operand of the built-in &&, ||, ?:, or , (comma) operators.
2) A writes to a scalar object M, and B reads from M.
3) A carries a dependency into another evaluation X, and X carries a dependency into B.
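
A minimal sketch of these rules, assuming an atomic pointer ptr that was published by another thread (the names in this snippet are illustrative only):

#include <atomic>
 
std::atomic<int*> ptr; // assumed to be published by another thread
 
void sketch()
{
    int* p = ptr.load(std::memory_order_consume); // evaluation A
    if (p) {
        int a = *p;                       // rule 1: the value of A is an operand of *p, so A carries a dependency into it
        int b = *std::kill_dependency(p); // rule 1a: the dependency chain is explicitly terminated here
        int c = p ? 1 : 0;                // rule 1b: p is the left operand of ?:, so no dependency is carried into c
        (void)a; (void)b; (void)c;
    }
}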

Modification order

All modifications to any particular atomic variable occur in a total order that is specific to this one atomic variable.

The following four requirements are guaranteed for all atomic operations:

1) Write-write coherence: If evaluation A that modifies some atomic M (a write) happens-before evaluation B that modifies M, then A appears earlier than B in the modification order of M
2) Read-read coherence: if a value computation A of some atomic M (a read) happens-before a value computation B on M, and if the value of A comes from a write X on M, then the value of B is either the value stored by X, or the value stored by a side effect Y on M that appears later than X in the modification order of M.
3) Read-write coherence: if a value computation A of some atomic M (a read) happens-before an operation B on M (a write), then the value of A comes from a side-effect (a write) X that appears earlier than B in the modification order of M
4) Write-read coherence: if a side effect (a write) X on an atomic object M happens-before a value computation (a read) B of M, then the evaluation B shall take its value from X or from a side effect Y that follows X in the modification order of M
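
A minimal sketch of read-read coherence (requirement 2); the variable and thread names here are assumptions made for this example:

#include <atomic>
#include <cassert>
#include <thread>
 
std::atomic<int> m = {0};
 
void writer()
{
    m.store(1, std::memory_order_relaxed); // earlier in the modification order of m
    m.store(2, std::memory_order_relaxed); // later in the modification order of m
}
 
void reader()
{
    int first  = m.load(std::memory_order_relaxed); // value computation A
    int second = m.load(std::memory_order_relaxed); // value computation B, A happens-before B
    // once A has observed 2, B may not observe the earlier value 1,
    // because 1 precedes 2 in the modification order of m
    assert(!(first == 2 && second == 1)); // will never fire
}
 
int main()
{
    std::thread t1(writer), t2(reader);
    t1.join(); t2.join();
}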

Release sequence

After a release operation A is performed on an atomic object M, the longest continuous subsequence of the modification order of M that consists of

1) Writes performed by the same thread that performed A
2) Atomic read-modify-write operations made to M by any thread

is known as the release sequence headed by A.
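
A minimal sketch of a release sequence extended by a read-modify-write from another thread; thread and variable names are assumptions made for this example:

#include <atomic>
#include <cassert>
#include <thread>
 
std::atomic<int> flag = {0};
int data = 0;
 
void thread_1()
{
    data = 42;
    flag.store(1, std::memory_order_release);     // A: heads a release sequence
}
 
void thread_2()
{
    flag.fetch_add(1, std::memory_order_relaxed); // an RMW: joins the release sequence if it follows A
}
 
void thread_3()
{
    // reading 2 means the value came from the RMW, which is then part of the
    // release sequence headed by A, so this acquire load still synchronizes with A
    if (flag.load(std::memory_order_acquire) == 2)
        assert(data == 42); // will never fire
}
 
int main()
{
    std::thread a(thread_1), b(thread_2), c(thread_3);
    a.join(); b.join(); c.join();
}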

Dependency-ordered before

Between threads, evaluation A is dependency-ordered before evaluation B if any of the following is true

1) A performs a release operation on some atomic M, and, in a different thread, B performs a consume operation on the same atomic M, and B reads a value written by any part of the release sequence headed by A.
2) A is dependency-ordered before X and X carries a dependency into B.

Inter-thread happens-before

Between threads, evaluation A inter-thread happens before evaluation B if any of the following is true

1) A synchronizes-with B
2) A is dependency-ordered before B
3) A synchronizes-with some evaluation X, and X is sequenced-before B
4) A is sequenced-before some evaluation X, and X inter-thread happens-before B
5) A inter-thread happens-before some evaluation X, and X inter-thread happens-before B

Happens-before

Regardless of threads, evaluation A happens-before evaluation B if any of the following is true:

1) A is sequenced-before B
2) A inter-thread happens before B

The implementation is required to ensure that the happens-before relation is acyclic, by introducing additional synchronization if necessary (it can only be necessary if a consume operation is involved, see Batty et al)

If one evaluation modifies a memory location, and the other reads or modifies the same memory location, and if at least one of the evaluations is not an atomic operation, the behavior of the program is undefined (the program has a data race) unless there exists a happens-before relationship between these two evaluations.
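
A minimal sketch of this rule: the release/acquire pair below establishes a happens-before relationship, so the accesses to the non-atomic variable do not race (the variable and function names are illustrative only):

#include <atomic>
#include <cassert>
#include <thread>
 
int plain = 0;                     // non-atomic
std::atomic<bool> ready = {false};
 
void writer()
{
    plain = 1;                                    // non-atomic write
    ready.store(true, std::memory_order_release);
}
 
void reader()
{
    while (!ready.load(std::memory_order_acquire))
        ;
    // the store-release happens-before this point, so the non-atomic write to
    // plain is ordered before this read: no data race, and plain == 1 is guaranteed
    assert(plain == 1); // will never fire
}
 
int main()
{
    std::thread t1(writer), t2(reader);
    t1.join(); t2.join();
    // had both atomic accesses used memory_order_relaxed, there would be no
    // happens-before edge between the accesses to plain, and the program would
    // have a data race (undefined behavior)
}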

Visible side-effects

The side-effect A on a scalar M (a write) is visible with respect to value computation B on M (a read) if both of the following are true:

1) A happens-before B
2) There is no other side effect X to M where A happens-before X and X happens-before B

If side-effect A is visible with respect to the value computation B, then the longest contiguous subset of the side-effects to M, in modification order, where B does not happen-before it is known as the visible sequence of side-effects. (the value of M, determined by B, will be the value stored by one of these side effects)

Note: inter-thread synchronization boils down to preventing data races (by establishing happens-before relationships) and defining which side effects become visible under what conditions

Consume operation

Atomic load with memory_order_consume or stronger is a consume operation. Note that std::atomic_thread_fence imposes stronger synchronization requirements than a consume operation.

Acquire operation

Atomic load with memory_order_acquire or stronger is an acquire operation. The lock() operation on a Mutex is also an acquire operation. Note that std::atomic_thread_fence imposes stronger synchronization requirements than an acquire operation.

Release operation

Atomic store with memory_order_release or stronger is a release operation. The unlock() operation on a Mutex is also a release operation. Note that std::atomic_thread_fence imposes stronger synchronization requirements than a release operation.
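
A minimal sketch of the lock()/unlock() case, assuming a std::mutex guarding two plain variables (the names are illustrative only):

#include <cassert>
#include <mutex>
#include <thread>
 
std::mutex mtx;
int shared_value = 0;
bool written = false;
 
void writer()
{
    std::lock_guard<std::mutex> lk(mtx); // lock(): an acquire operation
    shared_value = 42;
    written = true;
}                                        // unlock(): a release operation
 
void reader()
{
    std::lock_guard<std::mutex> lk(mtx); // lock(): an acquire operation
    // if the writer's critical section already ran, its unlock() (a release
    // operation) synchronizes with this lock(), making both writes visible here
    if (written)
        assert(shared_value == 42); // will never fire
}
 
int main()
{
    std::thread t1(writer), t2(reader);
    t1.join(); t2.join();
}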

Explanation

Relaxed ordering

Atomic operations tagged memory_order_relaxed are not synchronization operations; they do not impose an order among concurrent memory accesses. They only guarantee atomicity and modification order consistency.

For example, with x and y initially zero,

// Thread 1:
r1 = y.load(memory_order_relaxed); // A
x.store(r1, memory_order_relaxed); // B
// Thread 2:
r2 = x.load(memory_order_relaxed); // C 
y.store(42, memory_order_relaxed); // D

is allowed to produce r1 == 42 && r2 == 42, because even though A is sequenced-before B within thread 1 and C is sequenced-before D within thread 2, nothing prevents D from appearing before A in the modification order of y, and B from appearing before C in the modification order of x. The side effect of D on y could be visible to the load A in thread 1 while the side effect of B on x could be visible to the load C in thread 2.

Even with the relaxed memory model, out-of-thin-air values are not allowed to depend circularly on their own computations. For example, with x and y initially zero,

// Thread 1:
r1 = x.load(memory_order_relaxed);
if (r1 == 42) y.store(r1, memory_order_relaxed);
// Thread 2:
r2 = y.load(memory_order_relaxed);
if (r2 == 42) x.store(42, memory_order_relaxed);

is not allowed to produce r1 == 42 && r2 == 42, since the store of 42 to y is only possible after the store of 42 to x, which circularly depends on the store of 42 to y. Note that until C++14, this was technically allowed by the specification, but implementors were discouraged from allowing it.

(since C++14)

Typical use for relaxed memory ordering is incrementing counters, such as the reference counters of std::shared_ptr, since this only requires atomicity, but not ordering or synchronization (note that decrementing the std::shared_ptr counters requires acquire-release synchronization with the destructor).

#include <vector>
#include <iostream>
#include <thread>
#include <atomic>
 
std::atomic<int> cnt = {0};
 
void f()
{
    for (int n = 0; n < 1000; ++n) {
        cnt.fetch_add(1, std::memory_order_relaxed);
    }
}
 
int main()
{
    std::vector<std::thread> v;
    for (int n = 0; n < 10; ++n) {
        v.emplace_back(f);
    }
    for (auto& t : v) {
        t.join();
    }
    std::cout << "Final counter value is " << cnt << '\n';
}

Output:

Final counter value is 10000

Release-Acquire ordering

If an atomic store in thread A is tagged memory_order_release and an atomic load in thread B from the same variable is tagged memory_order_acquire, all memory writes (non-atomic and relaxed atomic) that happened-before the atomic store from the point of view of thread A become visible side-effects in thread B; that is, once the atomic load completes, thread B is guaranteed to see everything thread A wrote to memory.

The synchronization is established only between the threads releasing and acquiring the same atomic object. Other threads may see a different order of memory accesses than either or both of the synchronized threads.

On strongly-ordered systems (x86, SPARC TSO, IBM mainframes), release-acquire ordering is automatic for the majority of operations. No additional CPU instructions are issued for this synchronization mode; only certain compiler optimizations are affected (e.g., the compiler is prohibited from moving non-atomic stores past the atomic store-release or from performing non-atomic loads earlier than the atomic load-acquire). On weakly-ordered systems (ARM, Itanium, PowerPC), special CPU load or memory fence instructions have to be used.

Mutual exclusion locks, such as std::mutex or atomic spinlocks, are an example of release-acquire synchronization: when the lock is released by thread A and acquired by thread B, everything that took place in the critical section (before the release) in the context of thread A has to be visible to thread B (after the acquire), which is executing the same critical section.

#include <thread>
#include <atomic>
#include <cassert>
#include <string>
 
std::atomic<std::string*> ptr;
int data;
 
void producer()
{
    std::string* p  = new std::string("Hello");
    data = 42;
    ptr.store(p, std::memory_order_release);
}
 
void consumer()
{
    std::string* p2;
    while (!(p2 = ptr.load(std::memory_order_acquire)))
        ;
    assert(*p2 == "Hello"); // never fires
    assert(data == 42); // never fires
}
 
int main()
{
    std::thread t1(producer);
    std::thread t2(consumer);
    t1.join(); t2.join();
}


The following example demonstrates transitive release-acquire ordering across three threads:

#include <thread>
#include <atomic>
#include <cassert>
#include <vector>
 
std::vector<int> data;
std::atomic<int> flag = {0};
 
void thread_1()
{
    data.push_back(42);
    flag.store(1, std::memory_order_release);
}
 
void thread_2()
{
    int expected=1;
    while (!flag.compare_exchange_strong(expected, 2, std::memory_order_acq_rel)) {
        expected = 1;
    }
}
 
void thread_3()
{
    while (flag.load(std::memory_order_acquire) < 2)
        ;
    assert(data.at(0) == 42); // will never fire
}
 
int main()
{
    std::thread a(thread_1);
    std::thread b(thread_2);
    std::thread c(thread_3);
    a.join(); b.join(); c.join();
}


Release-Consume ordering

If an atomic store in thread A is tagged memory_order_release and an atomic load in thread B from the same object is tagged memory_order_consume, all memory writes (non-atomic and relaxed atomic) that are dependency-ordered-before the atomic store from the point of view of thread A become visible side-effects within those operations in thread B into which the load operation carries a dependency; that is, once the atomic load completes, those operators and functions in thread B that use the value obtained from that load are guaranteed to see what thread A wrote to memory.

The synchronization is established only between the threads releasing and consuming the same atomic object. Other threads can see a different order of memory accesses than either or both of the synchronized threads.

On all mainstream CPUs other than DEC Alpha, dependency ordering is automatic: no additional CPU instructions are issued for this synchronization mode, and only certain compiler optimizations are affected (e.g., the compiler is prohibited from performing speculative loads on the objects that are involved in the dependency chain).

Typical use cases for this ordering involve concurrent read access to rarely written data structures (routing tables, configuration, security policies, firewall rules, etc.) and publisher-subscriber situations with pointer-mediated publication, that is, when the producer publishes a pointer through which the consumer can access information: there is no need to make everything else the producer wrote to memory visible to the consumer (which may be an expensive operation on weakly-ordered architectures). An example of such a scenario is rcu_dereference.

See also std::kill_dependency and [[carries_dependency]] for fine-grained dependency chain control.

Note that currently (February 2015) no known production compilers track dependency chains: consume operations are lifted to acquire operations.

The specification of release-consume ordering is being revised, and the use of memory_order_consume is temporarily discouraged.

(since C++17)

This example demonstrates dependency-ordered synchronization for pointer-mediated publication: the integer data is not related to the pointer to string by a data-dependency relationship, thus its value is undefined in the consumer.

#include <thread>
#include <atomic>
#include <cassert>
#include <string>
 
std::atomic<std::string*> ptr;
int data;
 
void producer()
{
    std::string* p  = new std::string("Hello");
    data = 42;
    ptr.store(p, std::memory_order_release);
}
 
void consumer()
{
    std::string* p2;
    while (!(p2 = ptr.load(std::memory_order_consume)))
        ;
    assert(*p2 == "Hello"); // never fires: *p2 carries a dependency from ptr
    assert(data == 42); // may or may not fire: data does not carry a dependency from ptr
}
 
int main()
{
    std::thread t1(producer);
    std::thread t2(consumer);
    t1.join(); t2.join();
}



Sequentially-consistent ordering

Atomic operations tagged memory_order_seq_cst not only order memory the same way as release/acquire ordering (everything that happened-before a store in one thread becomes a visible side effect in the thread that did the load), they also establish a single total order over all memory operations that carry this tag.

Formally,

Each memory_order_seq_cst operation B that loads from atomic object M observes one of the following:

  • the result of the last operation A that modified M, where A appears before B in the single total order
  • or, if there was such an A, B may observe the result of some modification of M that is not memory_order_seq_cst and does not happen-before A
  • or, if there wasn't such an A, B may observe the result of some unrelated modification of M that is not memory_order_seq_cst

If there was a memory_order_seq_cst thread fence (std::atomic_thread_fence) operation X sequenced-before B, then B observes one of the following:

  • the last memory_order_seq_cst modification of M that appears before X in the single total order
  • some unrelated modification of M that appears later in M's modification order

For a pair of atomic operations on M, called A and B, where A writes and B reads M's value, if there are two memory_order_seq_cst thread fences (std::atomic_thread_fence) X and Y such that A is sequenced-before X, Y is sequenced-before B, and X appears before Y in the single total order, then B observes either of the following (a minimal sketch of this rule follows the list):

  • the effect of A
  • some unrelated modification of M that appears later than A in M's modification order
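
A minimal sketch of this rule (the classic store-buffering pattern; the names are illustrative only): because the two fences are ordered with respect to each other in the single total order, at least one of the relaxed loads is guaranteed to observe the store made before the other fence.

#include <atomic>
#include <cassert>
#include <thread>
 
std::atomic<bool> x = {false};
std::atomic<bool> y = {false};
std::atomic<int> saw_false = {0};
 
void thread_1()
{
    x.store(true, std::memory_order_relaxed);            // A1
    std::atomic_thread_fence(std::memory_order_seq_cst); // fence X
    if (!y.load(std::memory_order_relaxed))              // B1
        ++saw_false;
}
 
void thread_2()
{
    y.store(true, std::memory_order_relaxed);            // A2
    std::atomic_thread_fence(std::memory_order_seq_cst); // fence Y
    if (!x.load(std::memory_order_relaxed))              // B2
        ++saw_false;
}
 
int main()
{
    std::thread a(thread_1), b(thread_2);
    a.join(); b.join();
    // whichever fence appears later in the single total order forces the load
    // sequenced after it to observe the store sequenced before the other fence,
    // so at most one thread can observe "false"
    assert(saw_false.load() < 2); // will never fire
}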

For a pair of atomic modifications of M, called A and B, B occurs later than A in the modification order of M if any of the following is true:

  • there is a memory_order_seq_cst thread fence (std::atomic_thread_fence) X such that A is sequenced-before X and X appears before B in the single total order
  • or, there is a memory_order_seq_cst thread fence (std::atomic_thread_fence) Y such that Y is sequenced-before B and A appears before Y in the single total order
  • or, there are memory_order_seq_cst thread fences (std::atomic_thread_fence) X and Y such that A is sequenced-before X, Y is sequenced-before B, and X appears before Y in the single total order

Note that this means that:

1) as soon as atomic operations that are not tagged memory_order_seq_cst enter the picture, the sequential consistency of the program is lost
2) the sequentially-consistent fences only establish a total order for the fences themselves, not for the atomic operations in the general case (sequenced-before is not a cross-thread relationship, unlike happens-before)

Sequential ordering may be necessary for multiple-producer/multiple-consumer situations where all consumers must observe the actions of all producers occurring in the same order.

Total sequential ordering requires a full memory fence CPU instruction on all multi-core systems. This may become a performance bottleneck since it forces the affected memory accesses to propagate to every core.

This example demonstrates a situation where sequentially-consistent ordering is necessary. Any other ordering may trigger the assert, because it would then be possible for the threads c and d to observe the changes to the atomics x and y in opposite order.

#include <thread>
#include <atomic>
#include <cassert>
 
std::atomic<bool> x = {false};
std::atomic<bool> y = {false};
std::atomic<int> z = {0};
 
void write_x()
{
    x.store(true, std::memory_order_seq_cst);
}
 
void write_y()
{
    y.store(true, std::memory_order_seq_cst);
}
 
void read_x_then_y()
{
    while (!x.load(std::memory_order_seq_cst))
        ;
    if (y.load(std::memory_order_seq_cst)) {
        ++z;
    }
}
 
void read_y_then_x()
{
    while (!y.load(std::memory_order_seq_cst))
        ;
    if (x.load(std::memory_order_seq_cst)) {
        ++z;
    }
}
 
int main()
{
    std::thread a(write_x);
    std::thread b(write_y);
    std::thread c(read_x_then_y);
    std::thread d(read_y_then_x);
    a.join(); b.join(); c.join(); d.join();
    assert(z.load() != 0);  // will never fire
}


Relationship with volatile

Within a thread of execution, accesses (reads and writes) to volatile objects cannot be reordered past observable side effects (including other volatile accesses) that are sequenced-before or sequenced-after them within the same thread, but this order is not guaranteed to be observed by another thread, since volatile access does not establish inter-thread synchronization.

In addition, volatile accesses are not atomic (a concurrent read and write is a data race) and do not order memory (non-volatile memory accesses may be freely reordered around the volatile access).

One notable exception is Visual Studio, where, under its default settings, every volatile write has release semantics and every volatile read has acquire semantics (MSDN), and thus volatiles may be used for inter-thread synchronization. Standard volatile semantics are not applicable to multithreaded programming, although they are sufficient for, e.g., communication with a std::signal handler that runs in the same thread, when applied to sig_atomic_t objects.

See also

C documentation for memory_order

External links