admin管理员组

文章数量:1387433

I was reading the paper Repairing Sequential Consistency in C/C++11 by Lahav et al and had a doubt regarding the 2+2W litmus test presented there.

Under the section Semantics of SC Atomics in C11, the authors present the following code for the 2+2W litmus test.

// thread 1
x :=sc 1
y :=sc 2
a :=rlx y

// thread 2
y :=sc 1
x :=sc 2
b :=rlx x

The text says that the conditions imposed by C11 SC semantics prevent the reads from both reading a value of 1. That is (a==1 && b==1) is false.

I looked up the intent of the 2+2W litmus test, and as per my understanding, it checks which of the two writes per thread end up contributing the final values of the shared variables. The test, as described here, just intends to check the values of the two shared variables as a post condition, not necessarily on the two concurrent threads. So, it seems, any specific placement of the reads isn't necessarily an inherent part of the test.

The way this test is encoded by Lahav, I find it queer, why should a read return a value different from the one written by the write just preceding it.

I understand, that the SC semantics enforced by C11, are intended to uphold the promises made to the users of the language. And to achieve this the compiler maintains operation ordering, and/or inserts specific barrier instructions, depending on the requirements of the architecture. And this specifically prevents both the reads from reading a value of 1.

All the same, what I have also read is that, all hardware maintain the semantics of a single thread of execution as per the program order. Then even in the absence of any enforcement made by C11, the compiler and hardware should maintain the semantics of y :=sc 2 followed by b :=rlx x and result in b==2.

My point of confusion is: In the absence of any semantic enforcing done by C11 on the above code, what hardware optimizations or run time situations would make b :=rlx x read any thing other than 2 (and similarly for a :=rlx y).

Assuming my confusion is not misplaced, part two of the question is: Would the code have been better with the two reads swapped across the two threads? Under C11 SC semantics, the modification_ordered_before and sequenced_before cycle across the four writes, that precludes the validity of the execution under question, would still be exemplified, for the sake of the discussion in the paper.

I was reading the paper Repairing Sequential Consistency in C/C++11 by Lahav et al and had a doubt regarding the 2+2W litmus test presented there.

Under the section Semantics of SC Atomics in C11, the authors present the following code for the 2+2W litmus test.

// thread 1
x :=sc 1
y :=sc 2
a :=rlx y

// thread 2
y :=sc 1
x :=sc 2
b :=rlx x

The text says that the conditions imposed by C11 SC semantics prevent the reads from both reading a value of 1. That is (a==1 && b==1) is false.

I looked up the intent of the 2+2W litmus test, and as per my understanding, it checks which of the two writes per thread end up contributing the final values of the shared variables. The test, as described here, just intends to check the values of the two shared variables as a post condition, not necessarily on the two concurrent threads. So, it seems, any specific placement of the reads isn't necessarily an inherent part of the test.

The way this test is encoded by Lahav, I find it queer, why should a read return a value different from the one written by the write just preceding it.

I understand, that the SC semantics enforced by C11, are intended to uphold the promises made to the users of the language. And to achieve this the compiler maintains operation ordering, and/or inserts specific barrier instructions, depending on the requirements of the architecture. And this specifically prevents both the reads from reading a value of 1.

All the same, what I have also read is that, all hardware maintain the semantics of a single thread of execution as per the program order. Then even in the absence of any enforcement made by C11, the compiler and hardware should maintain the semantics of y :=sc 2 followed by b :=rlx x and result in b==2.

My point of confusion is: In the absence of any semantic enforcing done by C11 on the above code, what hardware optimizations or run time situations would make b :=rlx x read any thing other than 2 (and similarly for a :=rlx y).

Assuming my confusion is not misplaced, part two of the question is: Would the code have been better with the two reads swapped across the two threads? Under C11 SC semantics, the modification_ordered_before and sequenced_before cycle across the four writes, that precludes the validity of the execution under question, would still be exemplified, for the sake of the discussion in the paper.

Share Improve this question edited Mar 18 at 15:42 Lundin 217k46 gold badges279 silver badges433 bronze badges asked Mar 18 at 13:20 Dhwani KatagadeDhwani Katagade 1,31215 silver badges28 bronze badges 4
  • 2 "maintain the semantics of a single thread of execution as per the program order": I think you have misinterpreted this statement. It just means that a single thread executes as if in program order if other threads don't "interfere" by accessing the same data. If our thread reads an atomic object that another thread has written, we want to observe that write, even though it's something we would not observe if the other thread didn't exist. That's the whole point of having atomic objects. – Nate Eldredge Commented Mar 18 at 15:56
  • @NateEldredge - Thanks for your comment. You have correctly put your finger on the crux of my confusion. I somewhat knew what you have pointed out but some how in the context of this confusion my understanding got clouded. Almost embarrassing. :-( – Dhwani Katagade Commented Mar 19 at 11:03
  • Worth noting is that it would be legitimate for the compiler to optimize out the second load, and simply set a (likewise b) to 2. Even though it's possible for another thread to write to y in the short time between the store and the load, there's always a chance that it doesn't, and so skipping the reload still gives behavior that's consistent with all the rules. That said, current mainstream compilers are fairly conservative about optimizing atomics, and won't actually do this. – Nate Eldredge Commented Mar 19 at 12:31
  • 1 So in that sense, you're right that the test would be a little more reasonable without the immediate reload. If I were writing it, I'd probably test x and y in a main thread, after threads 1 and 2 have both been joined. – Nate Eldredge Commented Mar 19 at 12:34
Add a comment  | 

1 Answer 1

Reset to default 3

In the absence of any semantic enforcing done by C11 on the above code, what hardware optimizations or run time situations would make b :=rlx x read any thing other than 2

Simple: Thread 2 executes y := 1, then x := 2, then it observes thread 1's execution of x := 1, then it executes b := x. There's nothing inconsistent about that sequence of events. The order of operations within a single thread is preserved, but operations by another thread may or may not interpose — because we're making no guarantees.

This outcome is even possible with the full C++11 memory model; thread 1 may or may not write to x in between thread 2's x := 2 and thread 2's b := x, and thread 2 may or may not observe that write since it's making a relaxed read. But it's not allowed for a and b to both be 1 because that would be incompatible with a globally consistent order of the writes to x and y.

本文标签: