Embedded systems typically have to respect tight constraints, in particular on the energy they consume. Hardware platforms for embedded systems are therefore characterized by their energy efficiency, and the efficiency of embedded processors has been increased to come close to that of bare silicon. Ed Lee showed in 2005 that the characteristics of threads do not match well with the requirements for specification techniques for embedded systems. This observation motivates research on models of computation other than the von Neumann approach. Software generation is already being used for safety-critical systems: for example, SCADE is employed for generating code for Airbus planes, and dSPACE TargetLink is employed for generating control software in the automotive domain. Generating code in this way removes some of the problems of writing parallel programs manually. Many systems are specified in the form of task graphs, and tools exist that map such task graphs onto parallel processors. In particular, we consider Thiele's approach for generating optimized architectures for signal processing and Symtavision's approach of mapping automotive applications onto electronic control units (ECUs). In this way, parallel implementations are generated without the designer having to care about locks, monitors, etc. Using accelerators based on reconfigurable logic attached to a standard processor is another approach to parallel programming of embedded systems. Techniques from high-performance computing should be applicable to this approach.
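
As a rough illustration of what mapping a task graph onto parallel processors can involve, the following minimal Python sketch applies greedy list scheduling to a small graph. It is not Thiele's or Symtavision's method. The function name `list_schedule`, the task names, and the execution times are hypothetical, and the sketch assumes identical processors, no communication cost, and no deadlines.

```python
from collections import defaultdict


def list_schedule(durations, edges, num_procs):
    """Greedy list scheduling of a task graph onto identical processors.

    durations: dict mapping task name -> execution time
    edges:     list of (predecessor, successor) precedence constraints
    Returns a dict mapping task name -> (processor, start, finish).
    """
    preds = defaultdict(list)
    succs = defaultdict(list)
    for u, v in edges:
        preds[v].append(u)
        succs[u].append(v)

    # Number of unscheduled predecessors per task.
    remaining = {t: len(preds[t]) for t in durations}
    ready = sorted(t for t in durations if remaining[t] == 0)
    proc_free = [0] * num_procs  # time at which each processor becomes idle
    schedule = {}

    while ready:
        task = ready.pop(0)
        # A task may start only after all of its predecessors have finished.
        data_ready = max((schedule[p][2] for p in preds[task]), default=0)
        # Choose the processor that allows the earliest start time.
        proc = min(range(num_procs),
                   key=lambda i: max(proc_free[i], data_ready))
        start = max(proc_free[proc], data_ready)
        finish = start + durations[task]
        schedule[task] = (proc, start, finish)
        proc_free[proc] = finish
        # Release successors whose predecessors are now all scheduled.
        for s in succs[task]:
            remaining[s] -= 1
            if remaining[s] == 0:
                ready.append(s)

    if len(schedule) != len(durations):
        raise ValueError("task graph contains a cycle")
    return schedule


# Hypothetical four-task signal-processing graph (times in arbitrary units).
durations = {"read": 2, "filter_a": 3, "filter_b": 4, "merge": 1}
edges = [("read", "filter_a"), ("read", "filter_b"),
         ("filter_a", "merge"), ("filter_b", "merge")]

schedule = list_schedule(durations, edges, num_procs=2)
for task, (proc, start, finish) in schedule.items():
    print(f"{task}: processor {proc}, start {start}, finish {finish}")
```

The parallelism here comes entirely from the precedence edges of the graph: the two filter tasks end up on different processors without the specification ever mentioning locks or monitors. Real mapping tools would additionally account for communication, heterogeneous processors, and timing constraints.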