Assuming we have just 1 CPU core, and given for the sake of example the following CAS loop (in Java, taken from here), although the question is about CAS loops in general, not this code in particular:

private AtomicInteger count = new AtomicInteger(0);

public void increment() {
  int current, next;
  do {
    current = count.get();
    next = current + 1;
  } while (!count.compareAndSet(current, next));
}

theoretically, nothing prevents some particular thread from getting stuck in this loop.

For example, a context switch occurs right after next = current + 1;, and some other thread changes the value of the atomic count. Once this unlucky thread resumes, the expression in the while statement evaluates to true, so the loop starts again. Yet again a context switch occurs after next = current + 1;, and things keep repeating on and on.

So, would it be correct to say that a CAS loop is safe to use because it is just a zero-risk bet that the CPU will not context-switch that often? By safety here I mean that no thread gets stuck in the loop forever; for example, in a mission-critical app (whatever that means), just to emphasize zero tolerance for unhappy scenarios. If so, how do we know the CPU works like that? How does one cultivate an intuition for roughly how many operations usually run before the CPU performs a context switch? In fact, that is the main question I would like answered in this scenario.

However, if it is not because of that low-risk bet, then what makes us believe the loop will eventually terminate, and when? For example, what would be the upper bound on the number of retries of the loop, or are there some underlying (OS-level) guarantees about that?

asked Mar 5 at 22:37 by Turkhan Badalov; edited Mar 6 at 0:35 by Peter Cordes
  • 1 Basically what you have there is a poorly designed method. Compare and swap isn't magic, you still have to define your algorithm so that it does in fact work. I think you should expand your example to something that reproduces your use case. Or otherwise improve your problem description. I think the correct way of looking at this is that one thread must succeed, so no thread loops more times than the number of threads currently trying to update that memory. That's a very high rate of concurrency, not a "low risk bet." The "low risk" case is always free and immediate. – markspace Commented Mar 5 at 23:07
  • @markspace, thanks for the comment. Regarding the "no thread loops more times than the number of threads currently trying to update that memory", I think nothing guarantees how long that "unlucky" thread will wait for its turn unless OS provides some level of fairness I suppose? Meanwhile, some other threads might update the counter many times since the "unlucky" one was put on hold. Then, once it runs it sees the value has changed and again gets on hold by OS until it resumes next time just to realize the value has changed again. Or there are some guarantees after all? – Turkhan Badalov Commented Mar 5 at 23:16
  • @markspace, "I think you should expand your example to something that reproduces your use case". I don't really have a particular case. I was reading about CAS loops and how they are better compared to locks in a concurrent system. So, wanted to know how much can I trust this loop? Well, given many systems use this approach, it must be safe anyway. For example, project reactor alone uses a lot of CAS loops (that is also why I dived in this topic). It is just me failing to see what exactly prevents the worst-case scenario happening or can we just accept it? Basically, where is normal here? :) – Turkhan Badalov Commented Mar 5 at 23:19
  • 2 I think John gave an excellent answer, I want to address one point: again gets on hold by OS until it resumes. A typical thread runs for many thousands of cycles, maybe even 100 million cycles, before the OS suspends it. A CAS does not run once and then the OS stops it, CAS is non-blocking. It just runs as fast as possible continuously. Thus it would require a really insane probability of bad luck for this loop to get stuck. Basically that's why I asked for a real use case, this one is too simple to actually get stuck or anything interesting. – markspace Commented Mar 6 at 0:57
  • 1 @TurkhanBadalov, In some programs, the possibility of the CAS failing is not decided by a strictly random process. In some programs, threads can unintentionally become synchronized with each other in a way that makes the conflict much more likely. The extreme case, where the CAS frequently fails, and always for the same thread, is a kind of starvation. – Solomon Slow Commented Mar 7 at 13:27

2 Answers


Most modern systems are SMP (multi-core and sometimes even multi-socket), so other threads can be running simultaneously with this one. That means CAS can fail even without a context switch by the core running your thread. Actual context-switches happen extremely infrequently compared to how long a CAS retry loop takes, so that's pretty much a non-problem.

See Is incrementing an int effectively atomic in specific cases? for more about how real CPUs handle atomic RMWs, especially on x86.

what makes us believe the loop will eventually terminate and when?

A CAS retry loop is lock-free: at least one thread will make progress every time they all do an iteration. (Of course they don't actually run in lock-step, and a CAS attempt can only happen while holding exclusive ownership of the cache line... See Anything in std::atomic is wait-free? for another take on that.)

Java and other languages don't provide fairness guarantees for their lock-free atomics, but in practice most hardware does try to avoid starving any core of access to a cache line it's waiting for. Its CAS attempt could still fail, but you'd have to be infinitely unlucky for it to fail an infinite number of times in a row with other threads winning the race to do the CAS.

But that's assuming all threads are doing a similar-speed computation between the load and CAS attempt; if many other threads are doing non-stop increments while you're trying to atomically replace x with slow_function(x), you might never succeed.
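To make the point concrete, here is a minimal sketch of that risky pattern (the class and method names are hypothetical): a CAS loop that runs an arbitrary transformation between the load and the CAS attempt. The longer the transformation takes, the wider the window in which another thread can change the value and force a retry.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntUnaryOperator;

class TransformLoop {
    // Atomically replace x with slowFunction(x). With a cheap function and
    // low contention this succeeds on the first try; if other threads do
    // non-stop fast increments, the retry count is unbounded in theory.
    static int update(AtomicInteger x, IntUnaryOperator slowFunction) {
        int current, next;
        do {
            current = x.get();
            next = slowFunction.applyAsInt(current); // wide race window here
        } while (!x.compareAndSet(current, next));
        return next;
    }
}
```

With no contention, update(new AtomicInteger(5), v -> v * 2) performs one load, one CAS, and returns 10; the failure mode only appears when faster writers keep invalidating the loaded value.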

If there's so much contention that your CAS retry loops often retry more than once, that's not good; lock-free works best when contention is low enough that CAS retries aren't common. So, for example, avoid having every thread contend to increment a single shared counter in a tight loop if you can. Break the work up into regions that are each handled by a smaller pool of threads, or have each thread claim a chunk of 16 items by doing an atomic += 16 instead of += 1.
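The chunk-claiming idea can be sketched as follows (names are made up for illustration). Each thread performs one atomic read-modify-write per 16 work items instead of one per item, cutting traffic on the shared cache line by roughly 16x:

```java
import java.util.concurrent.atomic.AtomicInteger;

class ChunkedCounter {
    static final int CHUNK = 16;
    static final AtomicInteger next = new AtomicInteger(0);

    // One atomic RMW hands the caller the half-open range
    // [start, start + CHUNK) of work indices to process privately.
    static int claimChunk() {
        return next.getAndAdd(CHUNK); // returns the previous value
    }
}
```

A worker thread calls claimChunk() once, processes its 16 indices with no further synchronization, and only then touches the shared counter again.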

But even with high contention, a loop like this tends to degrade fairly gracefully, not completely fall on its face as contention increases. Things would have to get very extreme before you'd start seeing hundreds of CAS failures in a row. Once a core gets ownership of a cache line, it only takes a few nanoseconds (tens of clock cycles) to make a CAS attempt, and latency between cores to move cache lines around is like 50 to 100 nanoseconds on typical CPUs. (The more cores, the more hops in the interconnect.)


On LL/SC machines, CAS itself and other atomic RMWs like getAndIncrement require a retry loop to avoid spurious failures (which is why C++11 compare_exchange_weak exists, a version that's allowed to fail spuriously and thus can be used in retry loops). Livelock with no threads making progress is possible in theory; avoiding that is I think up to CPU architects having cores hang on to cache-line ownership a bit longer, perhaps adaptively noticing that they've failed repeatedly.
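Java exposes the same weak/strong distinction: since Java 9, AtomicInteger.weakCompareAndSetPlain is allowed to fail spuriously (as an LL/SC attempt can), so, like C++ compare_exchange_weak, it is only safe inside a retry loop. A minimal sketch:

```java
import java.util.concurrent.atomic.AtomicInteger;

class WeakCasIncrement {
    // weakCompareAndSetPlain may fail spuriously and has plain (unordered)
    // memory semantics, which is fine for a standalone counter; the retry
    // loop absorbs any spurious failure.
    static int increment(AtomicInteger count) {
        int current;
        do {
            current = count.get();
        } while (!count.weakCompareAndSetPlain(current, current + 1));
        return current + 1;
    }
}
```

On a single-instruction-CAS machine like x86 the weak and strong forms compile to the same thing; on LL/SC hardware the weak form lets the JIT avoid nesting one retry loop inside another.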

Or better, architects providing single-instruction atomic RMWs as an alternative to LL/SC, like ARM did with ARMv8.1. And preferably a rich set of atomic RMWs to directly support methods like getAndIncrement without a CAS retry loop. (CAS retry loops are still often needed, like when some data in an object you're publishing needs to be stored before you CAS a reference to it, or to implement a hypothetical getAndRightShift or whatever.) See https://www.anandtech.com/show/15578/cloud-clash-amazon-graviton2-arm-against-intel-and-amd/2 for benchmarks of average core-to-core round-trip latency using LDREX/STREX vs. ARMv8.1 single-instruction CAS on a 64-core system.

For example, what would be the upper bound of the number of retries of the loop or are there some underlying (os-level) guarantees regarding that?

There is none.

So, would it be correct to say that CAS loop is safe to use because it is just a zero-risk bet to expect CPU not to do context switch so fast?

It is important to distinguish between "safe" in the sense that behavior is well defined under the relevant circumstances, and "safe" in the sense that the program can be relied upon to exhibit satisfactory characteristics. Atomic CAS is safe in the former sense, but algorithms using atomic CAS are not necessarily safe in the latter sense. For instance, they don't necessarily guarantee that all threads involved make progress in finite time.

Your example is not safe in the second sense. The use of atomic CAS does not avoid the possibility of one or more threads getting stuck in the loop, unless probabilistically.

In that case,

What exactly makes a compare-and-swap (CAS) loop a better choice in a highly concurrent environment?

Individual atomic operations are typically much faster than taking out a lock, performing several operations, and then releasing the lock again. That can make lock-free approaches based on atomic operations consume less wall time and / or accommodate higher concurrency than lock-based approaches, at the cost of possibly consuming more CPU time. Such lock-free approaches often depend on atomic CAS.
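As a rough sketch of the contrast (both counters below are correct; they differ in what a losing thread does): with a lock, a blocked thread is suspended and rescheduled later; with an atomic RMW, the losing thread just retries immediately in hardware.

```java
import java.util.concurrent.atomic.AtomicInteger;

class Counters {
    static final Object lock = new Object();
    static int lockedCount = 0;
    static final AtomicInteger atomicCount = new AtomicInteger();

    // Lock-based: acquire, mutate, release. Contending threads block,
    // paying for context switches under heavy contention.
    static int lockedIncrement() {
        synchronized (lock) { return ++lockedCount; }
    }

    // Lock-free: a single atomic RMW. A contending thread never blocks;
    // it just retries (burning CPU instead of sleeping).
    static int atomicIncrement() {
        return atomicCount.incrementAndGet();
    }
}
```

For a single increment the atomic version is typically a fraction of the cost of the locked one, but the gap narrows, and can even reverse in CPU time, as contention grows.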

But good, correct lock-free algorithms are tricky to develop. Neither lock-free approaches in general nor atomic CAS in particular is a silver bullet. And although they may provide for shorter completion times when used correctly, they do not magically give you more resources to work with, and they do not automatically use your resources more efficiently than lock-based approaches do.

How to cultivate that intuition of how many operations roughly it usually takes before CPU performs a context switch?

Don't. At least not while writing userspace code. Such intuition is nowhere among the top things you need to bear in mind to write good concurrent code, with or without locks. Relying on such assumptions is likely to make your code brittle.

what makes us believe the loop will eventually terminate and when?

To the limited extent that we do believe that, it is based on an assumption that either the timing will eventually work out, or that the other threads contending for the same resources will eventually run out of work to do. How reliable those assumptions are depends on many factors, among them the nature of the tasks, their number, and the number of concurrent execution units available to run them.

For example, what would be the upper bound of the number of retries of the loop or are there some underlying (os-level) guarantees regarding that?

What makes you think there would be any such thing in the general case?
