admin管理员组文章数量:1289634
I have two C++ implementations that count pairs ((x, y)) satisfying ((x + y) % 7 == 0). Method 1 skips unnecessary iterations using y += 6
, while Method 2 checks every y
. I performed benchmarks on both small and large ranges, and here are my results.
Code for Both Methods (Smaller Range)
#include <iostream>
#include <chrono>
using namespace std;
using namespace std::chrono;
void method1(long long &comparisons) {
int count = 0;
comparisons = 0;
for (int x = 50; x <= 300; ++x) {
int start = max(70, x + 1);
for (int y = start; y <= 400; y++) {
++comparisons;
if ((x + y) % 7 == 0) {
++count;
y += 6; // Optimization: Skipping unnecessary checks
}
}
}
cout << "Method 1 count: " << count << "\n";
cout << "Method 1 comparisons: " << comparisons << "\n";
}
void method2(long long &comparisons) {
int count = 0;
comparisons = 0;
for (int x = 50; x <= 300; ++x) {
int start = max(70, x + 1);
for (int y = start; y <= 400; ++y) {
++comparisons;
if ((x + y) % 7 == 0) {
++count;
}
}
}
cout << "Method 2 count: " << count << "\n";
cout << "Method 2 comparisons: " << comparisons << "\n";
}
int main() {
long long comparisons1, comparisons2;
auto start1 = high_resolution_clock::now();
method1(comparisons1);
auto end1 = high_resolution_clock::now();
double time1 = duration<double>(end1 - start1).count();
cout << "Method 1 Time: " << time1 << " sec\n\n";
auto start2 = high_resolution_clock::now();
method2(comparisons2);
auto end2 = high_resolution_clock::now();
double time2 = duration<double>(end2 - start2).count();
cout << "Method 2 Time: " << time2 << " sec\n";
double speedup = time2 / time1;
double comparison_reduction = (double)comparisons2 / comparisons1;
cout << "\nMethod 1 is **" << speedup << "×** faster than Method 2.\n";
cout << "Method 1 does **" << comparison_reduction << "×** fewer comparisons.\n";
return 0;
}
Question
- Does this type of loop optimization (
y += 6
) generally lead to performance improvements, or does it depend on specific compiler optimizations? - Are there better ways to approach this kind of problem in C++?
Benchmark Results
Smaller Range (50 → 300, 70 → 400)
Method 1 count: 8040
Method 1 comparisons: 8796
Method 1 Time: 3.114e-05 sec
Method 2 count: 8040
Method 2 comparisons: 56285
Method 2 Time: 1.6753e-05 sec
Method 1 is **0.53799×** faster than Method 2.
Method 1 does **6.39893×** fewer comparisons.
Larger Range (5000 → 30000, 7000 → 40000)
Method 1 count: 80074785
Method 1 comparisons: 80149790
Method 1 Time: 0.0372139 sec
Method 2 count: 80074785
Method 2 comparisons: 560523500
Method 2 Time: 0.127715 sec
Method 1 is **3.43193×** faster than Method 2.
Method 1 does **6.99345×** fewer comparisons.
Analysis
- For a small range, Method 1 does ~6.4× fewer comparisons but isn't significantly faster.
(Why?)
- For a larger range, Method 1 does ~7× fewer comparisons and is ~3.4× faster, proving its efficiency at scale.
- Optimization Impact: Skipping unnecessary checks (
y += 6
) improves performance as input size grows.
I have two C++ implementations that count pairs ((x, y)) satisfying ((x + y) % 7 == 0). Method 1 skips unnecessary iterations using y += 6
, while Method 2 checks every y
. I performed benchmarks on both small and large ranges, and here are my results.
Code for Both Methods (Smaller Range)
#include <iostream>
#include <chrono>
using namespace std;
using namespace std::chrono;
void method1(long long &comparisons) {
int count = 0;
comparisons = 0;
for (int x = 50; x <= 300; ++x) {
int start = max(70, x + 1);
for (int y = start; y <= 400; y++) {
++comparisons;
if ((x + y) % 7 == 0) {
++count;
y += 6; // Optimization: Skipping unnecessary checks
}
}
}
cout << "Method 1 count: " << count << "\n";
cout << "Method 1 comparisons: " << comparisons << "\n";
}
void method2(long long &comparisons) {
int count = 0;
comparisons = 0;
for (int x = 50; x <= 300; ++x) {
int start = max(70, x + 1);
for (int y = start; y <= 400; ++y) {
++comparisons;
if ((x + y) % 7 == 0) {
++count;
}
}
}
cout << "Method 2 count: " << count << "\n";
cout << "Method 2 comparisons: " << comparisons << "\n";
}
int main() {
long long comparisons1, comparisons2;
auto start1 = high_resolution_clock::now();
method1(comparisons1);
auto end1 = high_resolution_clock::now();
double time1 = duration<double>(end1 - start1).count();
cout << "Method 1 Time: " << time1 << " sec\n\n";
auto start2 = high_resolution_clock::now();
method2(comparisons2);
auto end2 = high_resolution_clock::now();
double time2 = duration<double>(end2 - start2).count();
cout << "Method 2 Time: " << time2 << " sec\n";
double speedup = time2 / time1;
double comparison_reduction = (double)comparisons2 / comparisons1;
cout << "\nMethod 1 is **" << speedup << "×** faster than Method 2.\n";
cout << "Method 1 does **" << comparison_reduction << "×** fewer comparisons.\n";
return 0;
}
Question
- Does this type of loop optimization (
y += 6
) generally lead to performance improvements, or does it depend on specific compiler optimizations? - Are there better ways to approach this kind of problem in C++?
Benchmark Results
Smaller Range (50 → 300, 70 → 400)
Method 1 count: 8040
Method 1 comparisons: 8796
Method 1 Time: 3.114e-05 sec
Method 2 count: 8040
Method 2 comparisons: 56285
Method 2 Time: 1.6753e-05 sec
Method 1 is **0.53799×** faster than Method 2.
Method 1 does **6.39893×** fewer comparisons.
Larger Range (5000 → 30000, 7000 → 40000)
Method 1 count: 80074785
Method 1 comparisons: 80149790
Method 1 Time: 0.0372139 sec
Method 2 count: 80074785
Method 2 comparisons: 560523500
Method 2 Time: 0.127715 sec
Method 1 is **3.43193×** faster than Method 2.
Method 1 does **6.99345×** fewer comparisons.
Analysis
- For a small range, Method 1 does ~6.4× fewer comparisons but isn't significantly faster.
(Why?)
- For a larger range, Method 1 does ~7× fewer comparisons and is ~3.4× faster, proving its efficiency at scale.
- Optimization Impact: Skipping unnecessary checks (
y += 6
) improves performance as input size grows.
1 Answer
Reset to default 1Method #1 reduces the number of comparisons necessary with the y += 6;
. But it still checks whether the result is divisible by 7, even though it's already assured that it is.
When you get down to it, you don't really need to check the remainder very often at all. If x + y % 7 == 0, then you know the same will be true for y+7, y+14, y+21, and so on.
But then on the next iteration of the outer loop, you have x = prev_x + 1. You don't really need to search for a value that will give a multiple of 7. since x is one greater than it was the last iteration, the y that gives a multiple of 7 will be one less than it was in the previous iteration. That is, if some_x + some_y % 7 == 0
, then (some_x + 1) + (some_y - 1) % 7 == 0
as well.
On an unrelated note, Howard Hinnant (the author of most of the chrono library) now advises against using high_resolution_clock
. If memory serves, he generally advises using steady_clock
instead.
本文标签: performanceBenchmarking Two C Implementations for Counting Pairs Divisible by 7Stack Overflow
版权声明:本文标题:performance - Benchmarking Two C++ Implementations for Counting Pairs Divisible by 7 - Stack Overflow 内容由网友自发贡献,该文观点仅代表作者本人, 转载请联系作者并注明出处:http://www.betaflare.com/web/1741479909a2381105.html, 本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌抄袭侵权/违法违规的内容,一经查实,本站将立刻删除。
gcc -O3 -march=native
with GCC 14.2, running on i7-6700k Skylake on Arch GNU/Linux would be a possible answer.) If you didn't enable optimization, then latency bottlenecks on most CPUs from store/reload of loop variables might explain some of the effect. Or CPU-frequency warm-up effects for a very short run make the first thing slower: Idiomatic way of performance evaluation? – Peter Cordes Commented Feb 19 at 18:19