Performance tools have been around for as long as computers have existed. With the advent of highly parallel machines, current tools are reaching their limits: they simply do not scale to the enormous number of processors. In this presentation we provide insight into a possible design for future performance analysis tools for petaflop systems. These tools will search for performance problems automatically, in an online and distributed fashion.