
In C++, I have two threads. Each thread does a store first on one variable, then a load on another variable, but in reversed order:

#include <atomic>
#include <cstdint>

std::atomic<bool> please_wake_me_up{false};
uint32_t cnt{0};

void thread_1() {
    std::atomic_ref atomic_cnt(cnt);

    please_wake_me_up.store(true, std::memory_order_seq_cst);
    atomic_cnt.load(std::memory_order_seq_cst); // <-- Is this line necessary or can it be omitted?
    futex_wait(&cnt, 0); // <-- The performed syscall must read the counter.
                         //     But with which memory ordering?
}

void thread_2() {
    std::atomic_ref atomic_cnt(cnt);

    atomic_cnt.store(1, std::memory_order_seq_cst);
    if (please_wake_me_up.load(std::memory_order_seq_cst)) {
        futex_wake(&cnt);
    }
}

Full code example: Godbolt.

If all four atomic accesses are performed with sequential consistency, it is guaranteed that at least one thread will see the other thread's store when performing its load. This is what I want to achieve.

As the futex syscall must internally load the variable it operates on, I'm wondering if I can omit the (duplicated) load right before the syscall.

  • Every syscall should lead to a compiler memory barrier, right?
  • Do syscalls in general act like full memory barriers?
  • As the futex syscall is guaranteed to read the counter, is it safe to omit the marked line? Is there any guarantee the load inside the syscall occurs with sequential consistency?
  • If the line is necessary, would a std::atomic_thread_fence(std::memory_order_seq_cst) be better, since I don't need the value, just the fence?

If the answer to the question is architecture-specific, I would be interested in x86_64 and arm64.
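For context, the futex_wait/futex_wake helpers used above are not libc functions; glibc provides no futex() wrapper, so on Linux they would go through raw syscall(2). A minimal sketch (the use of the FUTEX_*_PRIVATE variants is my assumption; the Godbolt example may differ):

#include <cerrno>
#include <cstdint>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

long futex_wait(uint32_t* addr, uint32_t expected) {
    // The kernel sleeps only if *addr still equals `expected` when it
    // re-reads it; otherwise the call returns -1 with errno == EAGAIN.
    return syscall(SYS_futex, addr, FUTEX_WAIT_PRIVATE, expected,
                   nullptr, nullptr, 0);
}

long futex_wake(uint32_t* addr) {
    // Wakes at most one thread blocked in futex_wait on addr;
    // returns the number of waiters actually woken.
    return syscall(SYS_futex, addr, FUTEX_WAKE_PRIVATE, 1,
                   nullptr, nullptr, 0);
}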

Asked Mar 9 at 17:48 by sedor

2 Answers

Any syscall is a compiler barrier, like any call to a non-inline function.

They are not necessarily full barriers against runtime reordering, though, although they may well be in practice, especially since they usually take long enough that the store buffer probably has time to drain on its own. (Especially with Spectre and MDS mitigations in place, which on x86 run extra microcode to flush buffers, adding many cycles between reaching the syscall entry point and actually dispatching to a kernel function.)

atomic_thread_fence is probably worse: on x86-64 it would be an extra mfence or dummy locked operation, while an atomic load would be basically free, since the line will normally still be hot in L1d from the xchg used for the seq_cst store.

On AArch64, stlr / ldar is still sufficient: the reload can't happen until the store commits to cache, and the reload is itself an acquire load. So yes, it will keep all later loads and stores (including the futex syscall's read of cnt) after please_wake_me_up.store. It should be no worse than a stand-alone full barrier, which would have to drain all previous stores from the store buffer, not just seq_cst / release stores (stlr). Earlier cache-miss stores could potentially still be in flight... except that stlr is a release store, so all earlier loads and stores must complete before it can commit.
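As a concrete illustration of the stlr/ldar point, here is the store-then-reload pair in isolation (using std::atomic<uint32_t> in place of the question's atomic_ref so it compiles pre-C++20; inspect the codegen on Godbolt):

#include <atomic>
#include <cstdint>

std::atomic<bool> please_wake_me_up{false};
std::atomic<uint32_t> cnt{0};

// On AArch64 (gcc/clang -O2) the seq_cst store compiles to stlr and the
// seq_cst reload to ldar; the ldar cannot return its value until the stlr
// has committed to cache, which keeps all later accesses (including the
// kernel's read of cnt) ordered after the flag store.
uint32_t store_then_reload() {
    please_wake_me_up.store(true, std::memory_order_seq_cst);
    return cnt.load(std::memory_order_seq_cst);
}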

If the kernel's read uses ldar (rather than ARMv8.3 ldapr, which is only acquire, not seq_cst), you'd still be safe, and more work could get into the pipeline while waiting for the please_wake_me_up store to drain from the store buffer. But there's no guarantee of that, unfortunately; the futex man page doesn't say it performs a seq_cst load.

In your C++ code, you can likely remove the explicit load of cnt before the futex_wait call. The futex_wait syscall internally performs a load of the futex word cnt with sequential consistency. This ensures that checking the futex word's value and the blocking action are atomic and properly sequenced, which matches the guarantees of std::memory_order_seq_cst.

Why does this work?

- Futex operations are internally ordered, and the load inside futex_wait performs the necessary memory synchronization.

- This eliminates the need for the explicit preceding load, and your code remains correct and optimal.

So:

- do not use the explicit load,

- assume sequential consistency of futex_wait.
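A self-contained Linux sketch of the simplified protocol this answer proposes, with the explicit reload omitted. It declares cnt as std::atomic<uint32_t> (instead of the question's plain uint32_t plus atomic_ref) so it compiles pre-C++20, and assumes, as on mainstream Linux targets, that the atomic's object representation is the plain 32-bit value whose address can be passed to the futex syscall:

#include <atomic>
#include <cstdint>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <thread>
#include <unistd.h>

std::atomic<bool> please_wake_me_up{false};
std::atomic<uint32_t> cnt{0};

void futex_wait(uint32_t* addr, uint32_t expected) {
    // The kernel re-reads *addr and sleeps only if it still equals
    // `expected`; otherwise the call fails immediately with EAGAIN,
    // so a wakeup that raced ahead of the waiter cannot be lost.
    syscall(SYS_futex, addr, FUTEX_WAIT_PRIVATE, expected,
            nullptr, nullptr, 0);
}

void futex_wake(uint32_t* addr) {
    syscall(SYS_futex, addr, FUTEX_WAKE_PRIVATE, 1, nullptr, nullptr, 0);
}

void thread_1() {
    please_wake_me_up.store(true, std::memory_order_seq_cst);
    // Explicit reload of cnt omitted, per this answer's suggestion.
    futex_wait(reinterpret_cast<uint32_t*>(&cnt), 0);
}

void thread_2() {
    cnt.store(1, std::memory_order_seq_cst);
    if (please_wake_me_up.load(std::memory_order_seq_cst))
        futex_wake(reinterpret_cast<uint32_t*>(&cnt));
}

uint32_t run_demo() {
    std::thread t1(thread_1), t2(thread_2);
    t1.join();
    t2.join();
    return cnt.load();
}

Whatever the interleaving, thread_1 cannot sleep forever: if cnt is already 1 when the kernel checks it, futex_wait returns EAGAIN immediately; otherwise thread_2's later flag load must observe true and issue the wake.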
