Moore's law on semiconductors is coming to an end. Scaling the von Neumann architecture over the 40 years of the microprocessor has led to unsustainable circuit complexity, very low compute-density, and high power consumption. On the other hand, parallel computing practices are nowhere close to the portability, accessibility, productivity and reliability levels of single-threaded software engineering. In terms of productivity, the dominant parallel programming models are often worse level than sequential programming ... in assembler. Moreover, these models hide the key decisions about scheduling and resource management on a variety of distributed, multi-level and heterogeneous parallel architectures. This lack of expressiveness may be desirable for a high-level language, but inside a compiler and at the interface with the run-time system and architecture, it leads to unnecessary overheads, lack of scalability and performance portability. In addition, designers of physically constrained embedded systems are used to attaching temporal and resource requirements to the functional semantics of programs, a requirement often contradicting the modularity and abstraction expectations of modern programming models. This talk advocates for an intermediate-language-driven approach to these challenges. We report on work in progress, building on recent advances of polyhedral compilation (affine scheduling and partitioning) on one side, and on new insights from the synchronous data-flow paradigm. We will highlight the main features of a stream-oriented intermediate language to bridge the productivity and efficiency gaps while offering higher levels of control to the compiler and to the expert (library) programmer. This intermediate language is based on a data-flow, stream-oriented generalization of the classical (scalar) SSA form, and also share some commonalities with transactional memory.