admin管理员组文章数量:1396092
There is a scenario where clicking the button each time will send a network request, and only use the result of the recent request. To achieve this logic, I use the atomic object to record the request version; only the version the result corresponds to equals to the current version can the result be used.
atomic<int> version = {0};
// Button Click handler
void OnClick(){
auto v = version.fetch_add(1,memory_order::acq_rel); // #1
std::thread([v](){
int current = v+1;
auto r = network_request(current ); // #2
if(versionpare_exchange_strong(current,current,memory_order::release,memory_order::relaxed)){ // #3
use_req(r);
}
});
}
Now, there are two button-clicking events: A and B. A
happens before B(on time). If #3 in A
is ordered after #1 in B
, the result of the request in A
is ignored. In contrast, if #3 in A
is ordered before #1 in B
, the CAS
operation in A
will succeed, its result will be used, then #1
in B
will read the value written by #3 in A
, which also provides the network request in A
happens-before that in B
. In this case, assuming the function use_req
is thread-safe.
So, does the CAS operation that sets the same value ensure the correct logic? Could the CAS operation just be set to load operation?
Update
I think to check whether the version is updated during the network request, there is a difference between the pure load used in the if
condition and the CAS operation.
When #3
uses CAS operation, before the network request in the first clicking event A
returns, another button clicking event B
triggers, and its fetch_add
reads the modification produced by A's #1
, A's #3
must fail, as per atomics.order p10
Atomic read-modify-write operations shall always read the last value (in the modification order) written before the write associated with the read-modify-write operation.
The modification order must be {A_fetch_add, B_fetch_add}
. If the CAS in event A
succeeded and was an RMW operation, p10 would be violated. So the condition checking in the event A
is guaranteed to detect the version change in this case.
In contrast, if #3
is a pure load, even though #1
in the event B
reads the modification produced by #1
in event A
, as per intro.races p14
If a side effect X on an atomic object M happens before a value computation B of M, then the evaluation B takes its value from X or from a side effect Y that follows X in the modification order of M.
The modification order is still {A_fetch_add, B_fetch_add}
, however, #3
in event A might read the modification X
produced by #1
in A
(due to happens-before) or some side effects Y
after X
(in this case, Y
is B_fetch_add
in event B). In other words, the pure load is not guaranteed to detect the version change in this case.
Is this the difference between pure load and CAS for checking version change?
There is a scenario where clicking the button each time will send a network request, and only use the result of the recent request. To achieve this logic, I use the atomic object to record the request version; only the version the result corresponds to equals to the current version can the result be used.
atomic<int> version = {0};
// Button Click handler
void OnClick(){
auto v = version.fetch_add(1,memory_order::acq_rel); // #1
std::thread([v](){
int current = v+1;
auto r = network_request(current ); // #2
if(versionpare_exchange_strong(current,current,memory_order::release,memory_order::relaxed)){ // #3
use_req(r);
}
});
}
Now, there are two button-clicking events: A and B. A
happens before B(on time). If #3 in A
is ordered after #1 in B
, the result of the request in A
is ignored. In contrast, if #3 in A
is ordered before #1 in B
, the CAS
operation in A
will succeed, its result will be used, then #1
in B
will read the value written by #3 in A
, which also provides the network request in A
happens-before that in B
. In this case, assuming the function use_req
is thread-safe.
So, does the CAS operation that sets the same value ensure the correct logic? Could the CAS operation just be set to load operation?
Update
I think to check whether the version is updated during the network request, there is a difference between the pure load used in the if
condition and the CAS operation.
When #3
uses CAS operation, before the network request in the first clicking event A
returns, another button clicking event B
triggers, and its fetch_add
reads the modification produced by A's #1
, A's #3
must fail, as per atomics.order p10
Atomic read-modify-write operations shall always read the last value (in the modification order) written before the write associated with the read-modify-write operation.
The modification order must be {A_fetch_add, B_fetch_add}
. If the CAS in event A
succeeded and was an RMW operation, p10 would be violated. So the condition checking in the event A
is guaranteed to detect the version change in this case.
In contrast, if #3
is a pure load, even though #1
in the event B
reads the modification produced by #1
in event A
, as per intro.races p14
If a side effect X on an atomic object M happens before a value computation B of M, then the evaluation B takes its value from X or from a side effect Y that follows X in the modification order of M.
The modification order is still {A_fetch_add, B_fetch_add}
, however, #3
in event A might read the modification X
produced by #1
in A
(due to happens-before) or some side effects Y
after X
(in this case, Y
is B_fetch_add
in event B). In other words, the pure load is not guaranteed to detect the version change in this case.
Is this the difference between pure load and CAS for checking version change?
Share Improve this question edited Mar 30 at 1:11 xmh0511 asked Mar 27 at 7:35 xmh0511xmh0511 7,3591 gold badge11 silver badges41 bronze badges 12 | Show 7 more comments1 Answer
Reset to default 0If a pure load isn't sufficient for correctness, neither is a same-value CAS
At least in this case, and I'm not aware of a case where it could be. Perhaps something where release
was significant in creating synchronization, but IDK how you'd distinguish whether you synced with the CAS or not if the value is the same either way.
You haven't clearly established what correctness means, but the most you might get out of a CAS instead of a load is being slower to change typical timing (no guarantees), in terms of practical engineering tweaking.
This algorithm ignores obviously-stale numbers in case of a slow network and spamming clicks very fast compared to network and processing latency(?). Like a button debounce algorithm perhaps?
If the button is clicked after processing of a request has started, this code will still handle that button click. It doesn't give any mutual exclusion between calls to use_req
. If that's all you wanted, then a pure load is also correct.
The critical moment for "when processing starts" is when the load takes a value, or when the store side of the atomic RMW (CAS) happens. After that, another button click will lead to another use_req
call, even if the actual use_req
call in this thread hasn't even been reached yet. (In terms of out-of-order exec and memory reordering, if none of the operations in use_req
have happened yet.)
You tagged this [language-lawyer] so there isn't really anything to say about timing guarantees of pure loads vs. RMWs (since the ISO standard avoids making any, and avoids talking about MESI cache coherency). In practice, atomic RMWs all contend with each other and have to get exclusive ownership of the cache line, so will be slower. This tends to push the critical cutoff time later wrt. to the other operations in this thread, but there's no difference in terms of qualitative or formal guarantees. If races you didn't want could happen with a pure load, they could still happen with CAS, but might I think be less likely.
With release
ordering for the RMW, all preceding loads / stores do have to be visible to threads that run an acquire
op on version
, specifically version.fetch_add(1,memory_order::acq_rel);
in the event-handler thread. (Not CASes in other newly-spawned threads, they're release
which doesn't include acquire
so they don't have to sync-with other CASes. They are on the same object though so the CAS itself is serialized, unlike pure loads which can run in parallel on the same object.)
In practice on real implementations, std::atomic_thread_fence(acquire);
before a pure load will normally work as a LoadLoad barrier, but not StoreLoad (for that you'd need thread_fence(seq_cst)
). So you can still delay a pure load until after operations in network_request(current)
have become globally visible. Formally, if there aren't any atomic operations in network_request(current)
, they probably aren't ordered by a fence + load, but optimizing away the fence or reordering them with it would require whole-program analysis by a compiler to show there aren't any atomic stores.
So if I was trying to hack up something like this, I'd be inclined to write it like this if I was trying to use memory ordering to make the version
check as late as possible and wait for all previous network_request(current)
processing to be done before the check:
auto r = network_request(current); // #2
std::atomic_thread_fence(seq_cst); // make sure earlier work is done before checking
if(version.load(relaxed)){ // #3
use_req(r);
}
I intentionally didn't use acquire
, same as your choice with CAS, because it's fine if the check actually happens after some use_req
loads are done speculatively. We actually want it to be done later.
But most likely I'd just use version.load(relaxed)
without any extra barriers, or at most thread_fence(acquire)
, unless I found that the number of invocations of use_req
was a problem in practice on whatever system.
Again, all of this is pretty much practical engineering tweaking, not stronger formal guarantees.
A pure load will load the latest value visible to this thread when the load can run (which is potentially as early as allowed by other ordering constraints). In real-world systems, hardware makes values visible across cores as quickly as it can, so pure loads are quite useful.
It seems pointless to me to do language-lawyer analysis ignoring real-world implementation details when the entire point of the code is a performance optimization, not correctness of an algorithm.
本文标签: cDoes the CAS operation that writes the same value ensure the correct logicStack Overflow
版权声明:本文标题:c++ - Does the CAS operation that writes the same value ensure the correct logic? - Stack Overflow 内容由网友自发贡献,该文观点仅代表作者本人, 转载请联系作者并注明出处:http://www.betaflare.com/web/1744106580a2591091.html, 本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌抄袭侵权/违法违规的内容,一经查实,本站将立刻删除。
use_req
, and you won't detect that. So you probably need some sort of mutex. – ruakh Commented Mar 27 at 8:24use_req
. I wonder about the possibility of whether the version can be used to detect the recent request. – xmh0511 Commented Mar 27 at 8:47load
method [doc link] – ruakh Commented Mar 27 at 20:41release
ordering doesn't help anything in the abstract; the load side of the RMW is stillrelaxed
and can still run arbitrarily early in this new thread. – Peter Cordes Commented Mar 28 at 17:07load
orCAS
, the result is out of date as soon as you get it. I know you said you want to ignore theuse_req
, but stop that. Nothing you do with this pattern can be correct. The version check and the "use" need to be done in a single CAS, or use locking. – Matt Timmermans Commented Mar 31 at 12:38