Compiling for Dummies: Compilers, Flags, Options, How to Get Started?
Part 1
“It compiles, so it works.” Admit it, we all heard (or said) it in engineering school. But come to think of it, did we really understand what the compilation phase was all about?
For those lucky enough to answer yes, don’t go away. Beyond this back-to-basics look at compilers, we’ll go a little further by deciphering research and studies comparing the performance of different compilers, flag ordering, and different compilation options. For the rest of you, we won’t say a word, we promise.
Compiler? What is it?
Sorry to disappoint, but no, gcc is not THE compiler. We won’t list them all here, but the main goal is to understand what a compiler is for. In fact, it’s a bit like a translator.
As a developer, it’s much easier for you to read code as it’s written every day (and, if it’s well written, to understand it just by reading it) than it is to read assembly or even strings of zeros and ones (machine language). As a result, your compiler will take the code that you, the developer, have written in the language of your choice and convert it to another language.
In fact, the compiler will do more than just translate: it will even be able to detect certain errors in that famous source code you wrote as a developer. Yes, because it happens to all of us! In the end, the compiled program must do exactly the same thing as the original source code.
The compiler will also be able to optimize the code. But then, is WedoLow developing compilers? No, let us explain. The optimizations proposed by the compiler try to find the representation that gives the best performance while remaining equivalent to the original (although this is not always the case! We’ll get to that later). They may also try to match the code as closely as possible to the hardware target that will run it (for example, by specifying the architecture in use with flags such as -mcpu, -march, -target, or -mtune). Simple compiler optimizations include algebraic simplification and constant propagation.
We can “force” the compiler to propose optimizations that target a particular axis, be it execution time or memory. Energy consumption, on the other hand, is not directly targeted by compiler options, as the following study[1] clearly explains (more information in the next article to be published 😉).
Finally, our compiler will work on the trade-off between compilation time, binary/program size and execution speed. These choices will obviously have an impact on the ease of debugging the produced code (you can’t win every time!).
Compilers, user manual
Going back to our famous compilers, are they equivalent? How do you choose between them? Unfortunately, the answer is not that simple. For example, a study[2] was conducted on two well-known compilers for the x86-64 target, gcc and icc. It compared the performance of the two compilers on a set of C++ applications that taxed either the CPU or the I/O system, or both. An application performing a fast Fourier transform was also benchmarked. The elements on which the compilers were challenged are as follows:
- function pointers,
- the presence of invariant calculations within a loop (some compilers still have trouble detecting this pattern and getting it out of the loop),
- iterators,
- constant propagation,
- loop unrolling,
- structures/classes.
The results in terms of execution time obtained for the different benchmarks do not allow us to decide in favor of one compiler or another (it would be too simple otherwise). In fact, one compiler will end up being better than another for a specific problem (like gcc and constant propagation, for example).
One of the keys to improving code performance with your compiler is the optimization level you choose. In other words, ask yourself: am I compiling with -O0, -O2, -O3, or even -Os? And here’s a new pitfall: these levels don’t mean the same thing from one compiler to another. icc and gcc are a case in point: some optimizations are enabled at -O2 by icc but not even at -O3 by gcc!
In addition to optimization levels, another area of fine-tuning is the addition of compiler flags. These allow you to customize the optimization scheme that the compiler explores. For example, they can be used to inline functions (beware of the trade-off between size and execution time!), to unroll loops, or to allow certain optimizations on floating-point operations (beware of the possible loss of precision). So it’s important to be able to determine exactly which flags are interesting (and acceptable) for a particular use case. Then, to spice things up a bit, the combination of flags and their ordering can have an impact on program performance. State-of-the-art machine learning techniques have been proposed to automatically suggest the right combination of flags and their order. For reasons of frugality, we at WedoLow are not currently targeting these techniques. But it’s interesting to know that playing with these seemingly trivial parameters can have a significant impact on performance.
It goes without saying that the flags used not only affect the performance of the application itself, but also the compilation time. The latter has also been studied, but we won’t go into it here.
To sum up
Contact our team of experts to discuss the subject: sales@wedolow.com
[2] Botezatu, Mirela. “A study on compiler flags and performance events.” European Council for Nuclear Research (2012).