admin管理员组

文章数量:1396092

There is a scenario where clicking the button each time will send a network request, and only use the result of the recent request. To achieve this logic, I use the atomic object to record the request version; only the version the result corresponds to equals to the current version can the result be used.

atomic<int> version = {0};

// Button Click handler
void OnClick(){
 auto v = version.fetch_add(1,memory_order​::​acq_rel); // #1
 std::thread([v](){
     int current = v+1;
     auto r = network_request(current ); // #2
     if(versionpare_exchange_strong(current,current,memory_order::release,memory_order::relaxed)){ // #3
       use_req(r);
     }
 });
}

Now, there are two button-clicking events: A and B. A happens before B(on time). If #3 in A is ordered after #1 in B, the result of the request in A is ignored. In contrast, if #3 in A is ordered before #1 in B, the CAS operation in A will succeed, its result will be used, then #1 in B will read the value written by #3 in A, which also provides the network request in A happens-before that in B. In this case, assuming the function use_req is thread-safe.

So, does the CAS operation that sets the same value ensure the correct logic? Could the CAS operation just be set to load operation?

Update

I think to check whether the version is updated during the network request, there is a difference between the pure load used in the if condition and the CAS operation.

When #3 uses CAS operation, before the network request in the first clicking event A returns, another button clicking event B triggers, and its fetch_add reads the modification produced by A's #1, A's #3 must fail, as per atomics.order p10

Atomic read-modify-write operations shall always read the last value (in the modification order) written before the write associated with the read-modify-write operation.

The modification order must be {A_fetch_add, B_fetch_add}. If the CAS in event A succeeded and was an RMW operation, p10 would be violated. So the condition checking in the event A is guaranteed to detect the version change in this case.

In contrast, if #3 is a pure load, even though #1 in the event B reads the modification produced by #1 in event A, as per intro.races p14

If a side effect X on an atomic object M happens before a value computation B of M, then the evaluation B takes its value from X or from a side effect Y that follows X in the modification order of M.

The modification order is still {A_fetch_add, B_fetch_add}, however, #3 in event A might read the modification X produced by #1 in A(due to happens-before) or some side effects Y after X(in this case, Y is B_fetch_add in event B). In other words, the pure load is not guaranteed to detect the version change in this case.

Is this the difference between pure load and CAS for checking version change?

There is a scenario where clicking the button each time will send a network request, and only use the result of the recent request. To achieve this logic, I use the atomic object to record the request version; only the version the result corresponds to equals to the current version can the result be used.

atomic<int> version = {0};

// Button Click handler
void OnClick(){
 auto v = version.fetch_add(1,memory_order​::​acq_rel); // #1
 std::thread([v](){
     int current = v+1;
     auto r = network_request(current ); // #2
     if(versionpare_exchange_strong(current,current,memory_order::release,memory_order::relaxed)){ // #3
       use_req(r);
     }
 });
}

Now, there are two button-clicking events: A and B. A happens before B(on time). If #3 in A is ordered after #1 in B, the result of the request in A is ignored. In contrast, if #3 in A is ordered before #1 in B, the CAS operation in A will succeed, its result will be used, then #1 in B will read the value written by #3 in A, which also provides the network request in A happens-before that in B. In this case, assuming the function use_req is thread-safe.

So, does the CAS operation that sets the same value ensure the correct logic? Could the CAS operation just be set to load operation?

Update

I think to check whether the version is updated during the network request, there is a difference between the pure load used in the if condition and the CAS operation.

When #3 uses CAS operation, before the network request in the first clicking event A returns, another button clicking event B triggers, and its fetch_add reads the modification produced by A's #1, A's #3 must fail, as per atomics.order p10

Atomic read-modify-write operations shall always read the last value (in the modification order) written before the write associated with the read-modify-write operation.

The modification order must be {A_fetch_add, B_fetch_add}. If the CAS in event A succeeded and was an RMW operation, p10 would be violated. So the condition checking in the event A is guaranteed to detect the version change in this case.

In contrast, if #3 is a pure load, even though #1 in the event B reads the modification produced by #1 in event A, as per intro.races p14

If a side effect X on an atomic object M happens before a value computation B of M, then the evaluation B takes its value from X or from a side effect Y that follows X in the modification order of M.

The modification order is still {A_fetch_add, B_fetch_add}, however, #3 in event A might read the modification X produced by #1 in A(due to happens-before) or some side effects Y after X(in this case, Y is B_fetch_add in event B). In other words, the pure load is not guaranteed to detect the version change in this case.

Is this the difference between pure load and CAS for checking version change?

Share Improve this question edited Mar 30 at 1:11 xmh0511 asked Mar 27 at 7:35 xmh0511xmh0511 7,3591 gold badge11 silver badges41 bronze badges 12
  • 6 It's not wrong, but it's slightly meaningless: even if the version happens to be the same when you check the version, it could already be mismatched by the time you call use_req, and you won't detect that. So you probably need some sort of mutex. – ruakh Commented Mar 27 at 8:24
  • @ruakh Maybe, we need to ignore the effect of use_req. I wonder about the possibility of whether the version can be used to detect the recent request. – xmh0511 Commented Mar 27 at 8:47
  • 1 In that case you're looking for the load method [doc link] – ruakh Commented Mar 27 at 20:41
  • 1 I don't see any significant benefit to an RMW over a pure load for re-checking if another button-click has happened since this network request started. It has more contention since it needs exclusive ownership of the cache line, vs. a pure load maybe being able to run slightly earlier, but network and button-event latencies should be large compared to cache-coherency latency. release ordering doesn't help anything in the abstract; the load side of the RMW is still relaxed and can still run arbitrarily early in this new thread. – Peter Cordes Commented Mar 28 at 17:07
  • 2 Whether you use a load or CAS, the result is out of date as soon as you get it. I know you said you want to ignore the use_req, but stop that. Nothing you do with this pattern can be correct. The version check and the "use" need to be done in a single CAS, or use locking. – Matt Timmermans Commented Mar 31 at 12:38
 |  Show 7 more comments

1 Answer 1

Reset to default 0

If a pure load isn't sufficient for correctness, neither is a same-value CAS

At least in this case, and I'm not aware of a case where it could be. Perhaps something where release was significant in creating synchronization, but IDK how you'd distinguish whether you synced with the CAS or not if the value is the same either way.

You haven't clearly established what correctness means, but the most you might get out of a CAS instead of a load is being slower to change typical timing (no guarantees), in terms of practical engineering tweaking.


This algorithm ignores obviously-stale numbers in case of a slow network and spamming clicks very fast compared to network and processing latency(?). Like a button debounce algorithm perhaps?

If the button is clicked after processing of a request has started, this code will still handle that button click. It doesn't give any mutual exclusion between calls to use_req. If that's all you wanted, then a pure load is also correct.

The critical moment for "when processing starts" is when the load takes a value, or when the store side of the atomic RMW (CAS) happens. After that, another button click will lead to another use_req call, even if the actual use_req call in this thread hasn't even been reached yet. (In terms of out-of-order exec and memory reordering, if none of the operations in use_req have happened yet.)

You tagged this [language-lawyer] so there isn't really anything to say about timing guarantees of pure loads vs. RMWs (since the ISO standard avoids making any, and avoids talking about MESI cache coherency). In practice, atomic RMWs all contend with each other and have to get exclusive ownership of the cache line, so will be slower. This tends to push the critical cutoff time later wrt. to the other operations in this thread, but there's no difference in terms of qualitative or formal guarantees. If races you didn't want could happen with a pure load, they could still happen with CAS, but might I think be less likely.

With release ordering for the RMW, all preceding loads / stores do have to be visible to threads that run an acquire op on version, specifically version.fetch_add(1,memory_order​::​acq_rel); in the event-handler thread. (Not CASes in other newly-spawned threads, they're release which doesn't include acquire so they don't have to sync-with other CASes. They are on the same object though so the CAS itself is serialized, unlike pure loads which can run in parallel on the same object.)

In practice on real implementations, std::atomic_thread_fence(acquire); before a pure load will normally work as a LoadLoad barrier, but not StoreLoad (for that you'd need thread_fence(seq_cst)). So you can still delay a pure load until after operations in network_request(current) have become globally visible. Formally, if there aren't any atomic operations in network_request(current), they probably aren't ordered by a fence + load, but optimizing away the fence or reordering them with it would require whole-program analysis by a compiler to show there aren't any atomic stores.

So if I was trying to hack up something like this, I'd be inclined to write it like this if I was trying to use memory ordering to make the version check as late as possible and wait for all previous network_request(current) processing to be done before the check:

     auto r = network_request(current); // #2
     std::atomic_thread_fence(seq_cst);  // make sure earlier work is done before checking
     if(version.load(relaxed)){ // #3
       use_req(r);
     }

I intentionally didn't use acquire, same as your choice with CAS, because it's fine if the check actually happens after some use_req loads are done speculatively. We actually want it to be done later.

But most likely I'd just use version.load(relaxed) without any extra barriers, or at most thread_fence(acquire), unless I found that the number of invocations of use_req was a problem in practice on whatever system.

Again, all of this is pretty much practical engineering tweaking, not stronger formal guarantees.

A pure load will load the latest value visible to this thread when the load can run (which is potentially as early as allowed by other ordering constraints). In real-world systems, hardware makes values visible across cores as quickly as it can, so pure loads are quite useful.
It seems pointless to me to do language-lawyer analysis ignoring real-world implementation details when the entire point of the code is a performance optimization, not correctness of an algorithm.

本文标签: cDoes the CAS operation that writes the same value ensure the correct logicStack Overflow