The HPC Challenge (HPCC) benchmark suite was released to analyze the performance of high-performance computing architectures. Its kernels exercise different memory and hardware access patterns: latency-based measurements, memory streaming, inter-process communication, and floating-point computation. HPCC defines a set of benchmarks that augment the High Performance Linpack used in the Top500 list. Supercomputers are evaluated on the basis of their HPCC benchmark results, with a focus on the balance between computational speed, memory bandwidth, and inter-node communication. This balance is an important aspect of the programmability of future supercomputers. Estimates for the first PetaFlop/s systems range from 100,000 to 1,000,000 processors, each system organized as a cluster of SMP nodes. Programmability for a broad range of applications is crucial for the success of such systems. Hybrid (mixed-model) MPI and OpenMP programming seems to be the natural choice for such clusters of SMP nodes. Unfortunately, hybrid programming shows several drawbacks on these supercomputing architectures.