

I have various methods for accumulating quantities in fixed-precision numerics based on the following data type:

struct int95_t {
  long long int x;
  int y;
};

The value counts up in x until x runs out of range and wraps around, at which point y accumulates +1 for overflow and -1 for underflow. (I also have methods for adding floating-point numbers, which may overflow or underflow x several times and thus accumulate several times in y.)

The methods for adding or subtracting in this data type are somewhat involved.

int95_t sum(const int95_t a, const int95_t b) {
  int95_t result = { a.x + b.x, a.y + b.y };
  result.y += (1 - (2 * (b.x < 0LL))) * ((a.x ^ result.x) < 0 && (a.x ^ b.x) >= 0LL) * 2;
  return result;
}

int95_t subtract(const int95_t a, const int95_t b) {
  // LLONG_MIN comes from <climits>
  const int95_t neg_b = { -b.x, -b.y + (2 * (b.x == LLONG_MIN)) };
  return sum(a, neg_b);
}

I have overloaded the operators +, -, +=, and -= for this data type, but that doesn't clean up my code much, for the following reason: in many contexts I store not an array of int95_t values but two separate arrays of long long int and int values. (There are many conditions under which the overflow accumulator y can be safely ignored, and I want to conserve memory bandwidth.) Thus, in a lot of situations, adding one int95_t to another seems to require the following:

long long int primary[8];
int secondary[8];
int95_t add_this = { 81573835283, 3816 };

for (int i = 0; i < 8; i++) {
  int95_t tmp = { primary[i], secondary[i] };
  tmp += add_this;
  primary[i] = tmp.x;
  secondary[i] = tmp.y;
}

Is there any better way? I would hope that I could count on any given C++ compiler to properly interpret something like this in the above case, but I'm not sure:

for (int i = 0; i < 8; i++) {
  { primary[i], secondary[i] } += add_this;
}

Thanks for any help you all can offer.

asked Nov 22, 2024 at 3:14 by dscerutti
  • In sum() you do int95_t result = { a.x + b.x, a.y + b.y };. If the calculation of a.x + b.x overflows a long long, then the behaviour of your program is undefined. I suggest getting the computation of the sum working correctly (i.e. avoiding undefined behaviour) before worrying about how to improve your code, by whatever measure. Essentially, to avoid problems with overflowing signed integers in C++, you have to prevent any calculation that produces undefined behaviour, rather than allow it to occur and then take recovery action. – Peter Commented Nov 22, 2024 at 6:06
  • A reminder that the number of bits in std::intxx_t includes the sign bit, so we get 64 not 63. – Passer By Commented Nov 22, 2024 at 6:20
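
Following up on the undefined-behaviour comment above, here is a minimal sketch of one way to get the wrap-around without signed overflow: do the addition in unsigned long long, where wrapping is well defined, and convert back. The name sum_wrapping and the check function are hypothetical, not from the question. It keeps the question's convention of adjusting y by 2 per wrap. One caveat: the unsigned-to-signed conversion is modular since C++20 and implementation-defined before that, so this assumes a compiler that wraps there.

```cpp
#include <cassert>
#include <climits>

struct int95_t {
  long long int x;
  int y;
};

// Hypothetical name: same arithmetic as the question's sum(), but the
// 64-bit addition happens in unsigned arithmetic, which wraps modulo 2^64.
int95_t sum_wrapping(const int95_t a, const int95_t b) {
  const unsigned long long int ux = static_cast<unsigned long long int>(a.x) +
                                    static_cast<unsigned long long int>(b.x);
  // Modular since C++20; implementation-defined (assumed to wrap) before that.
  const long long int rx = static_cast<long long int>(ux);
  // A wrap happened if the operands share a sign and the result's sign differs.
  const bool wrapped = ((a.x ^ rx) < 0) && ((a.x ^ b.x) >= 0);
  int95_t result = { rx, a.y + b.y };
  result.y += (1 - 2 * (b.x < 0)) * wrapped * 2;
  return result;
}

// Small usage check: one overflow, one underflow.
void check_sum_wrapping() {
  const int95_t up = sum_wrapping({ LLONG_MAX, 0 }, { 1, 0 });
  assert(up.x == LLONG_MIN && up.y == 2);
  const int95_t down = sum_wrapping({ LLONG_MIN, 0 }, { -1, 0 });
  assert(down.x == LLONG_MAX && down.y == -2);
}
```

The negation in subtract() has the same issue for b.x == LLONG_MIN and would need similar treatment.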

1 Answer


One solution is to create a temporary object with overloaded arithmetic operators that holds references to the operand components. For example:

class split
{
public:
    long long int &rx;
    int &ry;

    split(long long int &x, int &y) : rx { x }, ry { y } {}

    split &operator+=(const int95_t &v)
    {
        // implement your += logic here with rx and ry modification
        return *this;
    }
};

for (int i = 0; i < 8; i++) {
  split(primary[i], secondary[i]) += add_this;
}

You can also reuse this class to implement the overloaded operators of int95_t itself, i.e.

int95_t val = { 81573835283, 3816 };

split(val.x, val.y) += add_this;
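
To make the placeholder concrete, here is a minimal sketch of what a filled-in operator+= might look like. The helper add_to_all and the check function are hypothetical names added for illustration. The body mirrors the question's sum() logic (including the adjustment of y by 2 per wrap), except that the 64-bit addition goes through unsigned arithmetic to avoid signed overflow; the conversion back to signed is modular since C++20 and assumed to wrap on older standards.

```cpp
#include <cassert>
#include <climits>

struct int95_t {
  long long int x;
  int y;
};

// Proxy holding references to the two separately stored components.
class split {
public:
  split(long long int &x, int &y) : rx{ x }, ry{ y } {}

  split &operator+=(const int95_t &v) {
    const long long int ax = rx;
    // Add in unsigned arithmetic so that wrap-around is well defined.
    const long long int sx = static_cast<long long int>(
        static_cast<unsigned long long int>(ax) +
        static_cast<unsigned long long int>(v.x));
    // Same wrap test as the question's sum(): equal operand signs, changed result sign.
    const bool wrapped = ((ax ^ sx) < 0) && ((ax ^ v.x) >= 0);
    rx = sx;
    ry += v.y + (1 - 2 * (v.x < 0)) * wrapped * 2;
    return *this;
  }

private:
  long long int &rx;
  int &ry;
};

// Hypothetical helper: add v to every element of a split (two-array) layout.
void add_to_all(long long int *primary, int *secondary, int n, const int95_t &v) {
  for (int i = 0; i < n; i++) {
    split(primary[i], secondary[i]) += v;
  }
}

// Small usage check: the first element wraps, the second does not.
void check_add_to_all() {
  long long int primary[2] = { LLONG_MAX, 10 };
  int secondary[2] = { 0, 5 };
  add_to_all(primary, secondary, 2, { 1, 3 });
  assert(primary[0] == LLONG_MIN && secondary[0] == 5);
  assert(primary[1] == 11 && secondary[1] == 8);
}
```

Since the proxy only holds two references and lives for a single expression, an optimizing compiler will usually inline it away, though that is worth confirming in the generated code for your target.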
