How to make C++ be as economical with memory as C, while still using STL?

These C and C++ programs are equivalent (as I'm using -fno-exceptions, below):

"loop.c":

#include <stdlib.h>
#include <stdio.h>
#include <math.h>

const int N = 1024*4;

int main(void) {
    double* v = malloc(N * sizeof(double));
    if(!v) return 1;

    for(int i = 0; i < N; i++)
        v[i] = sin(i);

    double sum = 0;

    for(int i = 0; i < N; i++)
        for(int j = 0; j < N; j++)
            for(int k = 0; k < N; k++)
                sum += v[i] + 2 * v[j] + 3 * v[k];

    printf("sum = %f\n", sum);

    free(v);
}

"loop.cpp":

#include <vector>
#include <stdio.h>
#include <math.h>

const int N = 1024 * 4;

int main() {
    std::vector<double> v(N);

    for(int i = 0; i < N; i++)
        v[i] = sin(i);

    double sum = 0;

    for(int i = 0; i < N; i++)
        for(int j = 0; j < N; j++)
            for(int k = 0; k < N; k++)
                sum += v[i] + 2 * v[j] + 3 * v[k];

    printf("sum = %f\n", sum);
}

But when I run them on my system (compiled with gcc -O2 -lm and g++ -O2 -fno-exceptions), the C version uses 800K while the C++ one uses 1600K, and sometimes this jumps to 3300K halfway through the execution. I haven't noticed the C version do this. (I'm looking at "RES" in top while the program runs.)

I wonder if there is some compiler setting, environment variable, or something else that could make the C++ version as economical with memory as its C equivalent, while still using the STL (std::vector, specifically)?

I hope such a setting exists, because one of the design principles of C++ is:

What you don’t use, you don’t pay for (in time or space). And further: What you do use, you couldn’t hand code any better.

To keep the question focused: I'm not interested in switching to alternative libstdc++ implementations, but I can use Clang instead of GCC, if that helps.


Update: I did some more experimenting:

"run.sh":

for i in `seq 1000`; do
    nice ./a.out &
done

followed by

$ killall a.out

and watched how the overall memory usage changed. It is indeed the case, as some commenters suggested, that the overhead is mostly shared between processes, although some of it remains per process: the per-process memory usage is 200K and 250K for the C and C++ versions, respectively.

On the other hand, replacing vector with new and delete, or with malloc and free, has little or no effect. The overhead here comes from using C++ rather than from vector.
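For concreteness, the new[] variant differs from "loop.cpp" only in the allocation; a sketch (my exact test code may have differed slightly):

    double* v = new double[N];   // default-initialized doubles, i.e. uninitialized, like malloc
    // ... same sin() fill and triple loop as in loop.cpp ...
    delete[] v;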

  • 6 In what PC application exactly would 2MB make a difference? It is likely for loading libstdc++ and allocating internal data structures (onexit handlers and whatnot); it is not recurring, and it doesn't scale with your application size. – Ahmed AEK Commented yesterday
  • 2 Your measurement method is not accurate. – dimich Commented yesterday
  • 2 @AlanBirtles std::vector has a larger interface (set of member functions, etc) than a raw pointer and that, depending on optimisation settings (e.g. if unused symbols in class types exist or not in the executable) can affect total code size - and increased code size can be associated with increased memory usage. That said, the more likely explanation of additional memory usage in C++ over C is objects created before main(), such as standard I/O streams which C doesn't have. As wohlstad noted, larger values of N will make the differences less significant. – Peter Commented yesterday
  • 2 @MWB Why the now added requirement, "using std::vector specifically"? You don't need std::vector here since you don't need to be able to adjust the size and you also don't use any of the vector features. auto v = std::make_unique_for_overwrite<double[]>(N); would be a reasonable micro optimization. – Ted Lyngmo Commented yesterday
  • 4 Re: "For people implying that this is a fixed cost" - Yes, linking with the C++ standard library has an additional fixed cost compared to if you only link with the C standard library like I showed above. Some of RES is shared among processes though so just looking at RES is not a good measurement. – Ted Lyngmo Commented yesterday

3 Answers


You misunderstood this sentence. In "What you do use, you couldn’t hand code any better", the point is that STL objects intrinsically perform multiple checks and controls, compared to raw pointers and C functions that are vulnerable to buffer overflows.

Typically, a scanf in C can be written in a single line... The result will be, AT BEST, vaguely functional, and at worst it’s no longer a security flaw but an abyss. A robust scanf (which, by the way, would no longer use that function but sscanf instead) easily takes around fifty lines of code... Those checks are built into C++'s >> operator.
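To give a flavour of those checks: stream extraction validates the input and reports failure through the stream state, which the caller can test (a minimal sketch):

#include <iostream>

int main() {
    double x;
    if (std::cin >> x)                  // operator>> validates and converts
        std::cout << "read " << x << "\n";
    else
        std::cout << "bad input\n";     // failure is reported via the stream state
}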

By using C++ and the STL, you inevitably face two types of memory overhead:

  • A fixed overhead, related to the library being the C++ one and NOT the C one.
  • A variable overhead, depending on the number of data elements being manipulated.

The fixed overhead is irrelevant unless you are working on embedded code on a platform that is already nearly full, because it mostly lives in shared libraries.

The variable overhead can PARTIALLY be mitigated by using custom allocators (such as a minimal pseudo-allocator standing in for std::allocator), by leveraging std::basic_string_view (if relevant to your case and C++17 is available) or std::array (substituted for std::vector via a using wherever the size is fixed), or by exploiting knowledge of specific aspects of your program (fixed string sizes, fixed vector sizes, data that is never modified at runtime, etc.). A sketch of the allocator and std::array ideas follows.
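For illustration, a minimal sketch of both techniques; the name mallocator is mine, and since the question builds with -fno-exceptions the sketch aborts on allocation failure instead of throwing std::bad_alloc:

#include <array>
#include <cstdlib>
#include <vector>

template <class T>
struct mallocator {                       // minimal allocator over malloc/free
    using value_type = T;
    mallocator() = default;
    template <class U> mallocator(const mallocator<U>&) {}
    T* allocate(std::size_t n) {
        T* p = static_cast<T*>(std::malloc(n * sizeof(T)));
        if (!p) std::abort();             // can't throw under -fno-exceptions
        return p;
    }
    void deallocate(T* p, std::size_t) { std::free(p); }
};
template <class T, class U>
bool operator==(const mallocator<T>&, const mallocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const mallocator<T>&, const mallocator<U>&) { return false; }

using dvec = std::vector<double, mallocator<double>>;  // vector with custom allocation
using dbuf = std::array<double, 1024 * 4>;             // fixed size: no heap allocation at all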

However, at some point, you can’t use C++ and expect C-like results, especially if the comparison is with unprotected C – with no success check on your malloc, for example, and no range checking either.

As a corollary, your triple loop is inefficient (O(N³) complexity) where the computation can be done in O(N), and you are not using STL algorithms or other C++ facilities (std::accumulate, std::generate, etc.), which also affects your final program.
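For this particular sum, the reduction is purely algebraic: each of v[i], 2 * v[j] and 3 * v[k] is added N * N times, so the whole triple loop collapses to one pass over v (a sketch; the result can differ from the loop version only by floating-point rounding):

#include <stdio.h>
#include <math.h>
#include <numeric>
#include <vector>

const int N = 1024 * 4;

int main() {
    std::vector<double> v(N);
    for (int i = 0; i < N; i++)
        v[i] = sin(i);

    double s = std::accumulate(v.begin(), v.end(), 0.0);
    double sum = (1 + 2 + 3) * double(N) * double(N) * s;  // 6 * N^2 * sum(v)

    printf("sum = %f\n", sum);
}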

STL algorithms are the most maintainable AND reliable AND generic algorithms available... But they can easily be beaten by algorithms that are less maintainable, come with accepted vulnerabilities, and/or are more specific.

This is exactly what you’re experiencing: you’re comparing against C code that is less reliable and much more specific – the C++ equivalent of your malloc would typically be a new[] or a std::array, not a std::vector, which is as overpowered for this use case as a std::string would be for storing a constant, fixed-size string... Unsurprisingly, you find an "abnormal" difference between the two.

After some more experimentation, I noticed that using -static-libstdc++ eliminates the differences in memory consumption. I think this answers the question as originally stated.

Just a small nit though: g++ -O2 -fno-exceptions -static-libstdc++ -s produces an executable that's 6x bigger than what gcc -O2 -s -lm produces. I wonder if there is a way to fix that.

These C and C++ programs are not, in fact, equivalent.

  1. C code has malloc with no error-checking. It's statically incorrect, and there's no idiomatic way to stop C++ emitting some exception-based error checking to handle this.
  2. C code does an un-initialized allocation. The closest C++ equivalent would use v.reserve(N) followed by push_back, rather than constructing the vector with N zero-initialized elements as you do now (note that indexing a reserved-but-empty vector is undefined behavior); see the sketch below.
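A sketch of what the reserve-based fill loop looks like:

    std::vector<double> v;
    v.reserve(N);               // capacity N, size 0: no elements are initialized
    for (int i = 0; i < N; i++)
        v.push_back(sin(i));    // construct each element as its value is produced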

Admittedly neither of those should affect dynamic allocation size, just code size (and run time). They're just examples of ways in which your equivalence breaks down.

As an aside, you don't need a busy loop to keep your program alive until you can see it in top. Just run it under time (probably /bin/time rather than the shell built-in).
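For example, with GNU time (assuming it's installed at /bin/time; the -v report, which includes the maximum resident set size, goes to stderr):

$ /bin/time -v ./a.out 2>&1 | grep 'Maximum resident'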


As to your title question, just be selective about which bits of the standard library you use - and learn how to use them correctly. Calling reserve on your vector is a good example.
