Modern CPUs have a reputation for rarely reaching their theoretical peak performance. One reason is that modern instruction set architectures (ISAs), together with new developments in CPU design, place high demands on the compiler. With only the information a high-level language provides, it is hard for the compiler to use the ISA of a modern CPU efficiently. We present examples which show that modern CPUs can be used with high efficiency and that the cause of poor achieved performance is often inefficient compiler-generated code. What can be done to avoid this waste of performance? We want to initiate a discussion about available and possible solutions.