https://www.dagstuhl.de/16412

### 09. – 14. Oktober 2016, Dagstuhl-Seminar 16412

# Automated Algorithm Selection and Configuration

## Organisatoren

Holger H. Hoos (University of British Columbia – Vancouver, CA)

Frank Neumann (University of Adelaide, AU)

Heike Trautmann (Universität Münster, DE)

## Auskunft zu diesem Dagstuhl-Seminar erteilt

## Dokumente

Dagstuhl Report, Volume 6, Issue 10

Motivationstext

Teilnehmerliste

Gemeinsame Dokumente

Dagstuhl's Impact: Dokumente verfügbar

## Summary

The importance of high-performance algorithms, in particular for solving NP-hard optimisation and decision problems, cannot be underestimated. Achievements in this area have substantial impact in sectors such as manufacturing, logistics, healthcare, finance, agriculture and energy systems -- all of strategic importance to modern societies.

The development of effective automated algorithm selection and configuration techniques has been one of the major success stories in the area of empirical algorithmics in recent years. Building on a wide range of algorithmic approaches for problems such as propositional satisfiability (SAT) and mixed integer programming (MIP), these methods permit the selection of appropriate algorithms based on efficiently computable characteristic of a problem instance to be solved (algorithm selection) and the automatic determination of performance optimising parameter settings (algorithm configuration). In both cases, statistical models that enable performance predictions for previously unseen problem instances or parameter settings play a key enabling role; additionally, these models have other important uses, e.g., in load scheduling and distribution on large computer clusters.

The reach of those methods is illustrated by the fact that they have defined the state of the art in solving SAT, arguably the most prominent NP-complete decision problem, for a decade (as witnessed by the results from the international SAT solver competitions, and more recently have been demonstrated to have the potential to achieve significant improvements over the long-standing state of the art in solving the TSP, one of the most widely studied NP-hard optimisation problems [1] Further very encouraging results have been achieved in recent years for continuous optimisation, AI planning and mixed integer programming problems.

The goal of the seminar was to foster research on algorithm selection and configuration, as well as on the underlying performance prediction methods, by bringing together researchers from the areas of artificial intelligence, theoretical computer science and machine learning in order to extend current studies to a much broader class of problems and build up the theoretical foundations of this important research area. On the foundational side, the seminar aimed at bridging the gap between experiments and theory in feature-based algorithm (runtime) analysis. In particular, we began investigating how mathematical and theoretical analyses can contribute to the experimentally driven research area of algorithm selection and configuration. We expect that studies following this initial exploration will bring together two of the currently most successful approaches for analysing heuristic search algorithms and ultimately achieve substantial impact in academic and industrial applications of algorithm configuration and selection techniques. Furthermore, we placed an emphasis on investigating automated algorithm selection and configuration approaches for multiobjective optimisation problems -- an important, but largely unexplored area of investigation.

### Background and Challenges: Algorithm Selection and Configuration for Combinatorial Problems

The design of algorithms for combinatorial optimisation and decision problems plays a key role in theoretical computer science as well as in applied algorithmics. These problems are frequently tackled using heuristic methods that perform extremely well on different classes of benchmark instances but usually do not have rigorous performance guarantees. Algorithm selection and configuration techniques have been applied to some of the most prominent NP-hard combinatorial optimisation and decision problems, such as propositional satisfiability (SAT) and the travelling salesman problem (TSP).

Algorithm selection for SAT has first been explored in the seminal work on SATzilla [2,3,4] which was initially based on linear and ridge regression methods for performance prediction, but later moved to more sophisticated models based on cost-sensitive random forest classification [5]. Other successful methods use clustering techniques to identify the algorithm to be run on a given instance [6,7]. As clearly evident from the results of SAT competitions, which are regularly held to assess and document the state of the art in SAT solving, automated algorithm selection procedures effectively leverage the complementary strengths of different high-performance solvers and thus achieve substantial improvements over the best individual solvers [5].

As heuristic search algorithms often have numerous parameters that influence their performance, one of the classical questions is how to set parameters to optimise performance on a given class of instances. This per-set algorithm configuration problems can be solved using stochastic local search and model-based optimisation techniques [8,9,10], as well as racing techniques [11,12], and these configuration methods have been demonstrated to yield substantial performance improvements to state-of-the-art algorithms for SAT, TSP, MIP, AI planning and several other problems (see, e.g, [13,14,15,16]). Algorithm configuration techniques are now routinely used for optimising the empirical performance of solvers for a wide range of problems in artificial intelligence, operations research and many application areas (see, e.g., [17,18])

Initial work on combining algorithm selection and configuration techniques has shown significant promise [19,20]; such combinations allow configuring algorithms on a per-instance basis [6,7] and configuring algorithm selection methods (which themselves make use of many heuristic design choices) [21]. However, we see much room for further work along these lines. Other challenges concern the automated selection and configuration of mechanisms that adapt parameter settings while an algorithm is running and the configuration of algorithms for optimised scaling behaviour. Finally, a better theoretical foundation of algorithm selection and configuration approaches is desired and necessary. Initial steps into this direction were an important goal of this Dagstuhl seminar. In the following, we motivate and outline some of the challenges addressed in the course of the seminar.

### Background and Challenges: Algorithm Selection for Continuous Black-Box Optimisation

Black-box function optimisation is a basic, yet intensely studied model for general optimisation tasks, where all optimisation parameters are real-valued. Work in this area has important practical applications in parameter and design optimisation and has also inspired some of the most successful general-purpose algorithm configuration techniques currently available [9].

Despite many years of research in metaheuristics, especially evolutionary algorithms, aimed at optimising black-box functions effectively, it is currently hardly possible to automatically determine a good optimisation algorithm for a given black-box function, even if some of its features are known. In single-objective (SO) black-box optimisation, it is therefore of considerable interest to derive rules for determining how problem properties influence algorithm performance as well as for grouping test problems into classes for which similar performance of the optimisation algorithms can be observed. Recent benchmarking experiments [22,23] provide at best high-level guidelines for choosing a suitable algorithm type based on basic features that are known a priori, such as the number of dimensions of the given problem. However, the preference rules for algorithm selection thus obtained are very imprecise, and even for slight algorithm or problem variations, the resulting performance-induced ordering of different algorithms can change dramatically.

Exploratory Landscape Analysis (ELA, [24]) aims at improving this situation by deriving cheaply computable problem features based on which models relating features to algorithm performance can be constructed using benchmark experiments. The final goal is an accurate prediction of the best suited algorithm for an arbitrary optimisation problem based on the computed features. The concept is not entirely knew; however, earlier approaches, such as fitness distance correlation (FDC) [25], have not been completely convincing.

A first idea to employ high-level (human expert designed) features, such as separability and modality, to characterize optimisation problems in an ELA context [26] was therefore refined by also integrating low-level features - e.g., based on convexity or the behaviour of local search procedures [27]. These effectively computable low-level features can be chosen from a wide range of easy to measure statistical properties. Suitably determined combinations of such features are expected to provide sufficient information to enable successful algorithm selection. Following recent results [27], this process is not necessarily costly in terms of function evaluations required for feature computation.

Additional, conceptually similar features were introduced in [28,29,30,31]. In [32], a representative portfolio of four optimisation algorithms was constructed from the complete list of BBOB 2009/2010 candidates. Based on the low-level features a sufficiently accurate prediction of the best suited algorithm within the portfolio for each function was achieved. Recently, the feature set was extended based on the cell mapping concept in [33] by which a finite subdivision of the domain in terms of hypercubes is constructed. Most recently, the ELA approach, extended by several specific features, was successfully used to experimentally detect funnel structured landscapes in unknown black-box optimisation problems [34]. As it can be assumed that this information can be efficiently exploited to speed up the optimisation process, we expect ELA to contribute importantly to automated algorithm selection in single-objective black-box optimisation. (See [35] for a survey of related work.)

One major challenge in this area is the construction of a suitable algorithm portfolio together with an algorithm selection mechanism for unknown instances that generalises well to practical applications. For this purpose, suitable benchmark sets have to be derived and the costs of feature computations have to be kept as small as possible. Furthermore, theoretical foundation of the approaches is desired and necessary. The seminar aimed to make first steps in this direction.

### Special focus: Algorithm selection for multiobjective optimisation

Some of the most challenging real-world problems involve the systematic and simultaneous optimisation of multiple conflicting objective functions -- for example, maximising product quality and manufacturing efficiency, while minimising production time and material waste. To solve such problems, a large number of multiobjective optimisation (MOO) algorithms has been reported. Like single-objective (SO) algorithms, new MOO algorithms are claimed to outperform others by comparing the results over a limited set of test problems. Knowles et al. [36] started working on systematically deriving performance measures for EMOA and evaluating EMOA performance. Mersmann et al. [37] recently derived a systematic benchmarking framework according to similar work of [38] on benchmarking classification algorithms.

However, it is unlikely that any algorithm would outperform all others on a broader set of problems, and it is possible that the algorithm fails miserably on some of them. These results go usually unreported, leaving the algorithm’s limitations unknown. This knowledge is crucial to avoid deployment disasters, gain theoretical insights to improve algorithm design, and ensure that algorithm performance is robustly described. Therefore, we see much value in the development of an algorithm selection and configuration framework for multiobjective optimisation. Successfully selecting the proper optimization algorithm for a multi-objective problem depends on detecting different problem characteristics, one of which is the multimodality of the induced landscape. In recent work [39], formal definitions were introduced for multimodality in multi-objective optimization problems in order to generalize the ELA framework to multi-objective optimization.

Significant progress has been made on single-objective (SO) problems of combinatorial and continuous nature as discussed above. However, these ideas are yet to be applied to the important class of MOO problems. We see five major avenues of exploration: (1) analysis on what makes MOO problems difficult; (2) design of features to numerically characterize MOO problems; (3) identification and visualization of strengths and weaknesses of state-of-the-art MOO algorithms; (4) methodology to assist the algorithm selection and configuration on (possibly expensive) real-world problems; (5) methodology to assist the design of tailored algorithms for real-world problems. An important aim of the seminar was to facilitate discussion of these directions.

### Seminar structure and outcomes

The seminar was structured to balance short invited presentations with group breakout sessions and a generous amount of time set aside for informal discussions and spontaneously organised working groups at a ratio of about 2:1:1. Based on feedback obtained during and after the event, this structure worked well in fostering a vibrant atmosphere of intense and fruitful exchange and discussion. Presenters very successfully introduced important ideas, outlined recent results and open challenges, and facilitated lively discussion that provided much additional value. The afternoon group breakout sessions were particularly effective in addressing the challenges previously outlined as well as additional topics of interest that emerged during the seminar -- thanks to the preparation and moderation by the session organisers as well as the lively participation of the attendees.

While it would be unreasonable to expect to exhaustively or conclusively address the substantial research challenges that inspired us to organise this Dagstuhl seminar, we believe that very significant progress has been achieved. As importantly, we feel that through this week-long event, an invaluable sharing of perspective and ideas has taken place, whose beneficial effects on the algorithm selection and configuration community and its work we hope to be felt for years to come. The following presentation abstracts and session summaries provided by the participants reflect the richness and depth of the scientific exchange facilitated by the seminar.

As organisers, we very much enjoyed working with presenters and session organisers, who greatly contributed to the success of the seminar, as did everyone who participated. Our thanks also go to the local team at Schloss Dagstuhl, who provided outstanding organisational support and a uniquely inspiring environment.

### References

- L. Kotthoff, P. Kerschke, H. Hoos, and H. Trautmann. Improving the state of the art in inexact TSP solving using per-instance algorithm selection. In C. Dhaenens, L. Jourdan, and M.-E. Marmion, editors, Learning and Intelligent Optimization, 9th International Conference, pages 202–217, Cham, 2015. Springer International Publishing. Publication status: Published.
- E. Nudelman, K. Leyton-Brown, H. H Hoos, A. Devkar, and Y. Shoham. Understanding random SAT: Beyond the clauses-to-variables ratio. In Principles and Practice of Constraint Programming–CP 2004, pages 438–452. Springer Berlin Heidelberg, 2004.
- E. Nudelman, K. Leyton-Brown, A. Devkar, Y. Shoham, and H. Hoos. Satzilla: An algorithm portfolio for SAT. Solver description, SAT competition, 2004, 2004.
- L. Xu, F. Hutter, H. Hoos, and K. Leyton-Brown. SATzilla: Portfolio-based algorithm selection for SAT. Journal of Artificial Intelligence Research, 32:565–606, 2008.
- L. Xu, F. Hutter, H.H. Hoos, and K. Leyton-Brown. Evaluating component solver contributions to portfolio-based algorithm selectors. In Theory and Applications of SatisfiabilityTesting–SAT 2012, pages 228–241. Springer Berlin Heidelberg, 2012.
- S. Kadioglu, Y. Malitsky, M. Sellmann, and K. Tierney. Isac-instance-specific algorithm configuration. ECAI, 215:751–756, 2010.
- Y. Malitsky, A. Sabharwal, H. Samulowitz, and M. Sellmann. Algorithm portfolios basedon cost-sensitive hierarchical clustering. In Proceedings of the Twenty-Third international joint conference on Artificial Intelligence, pages 608–614. AAAI Press, 2013.
- F. Hutter, H.H. Hoos, K. Leyton-Brown, and K.P. Murphy. An experimental investigation of model-based parameter optimisation: Spo and beyond. In GECCO’09: Proceedings of the 11th Annual conference on Genetic and evolutionary computation, pages 271–278, New York, NY, USA, 2009. ACM.
- F. Hutter, H.H. Hoos, and K. Leyton-Brown. Sequential model-based optimization for general algorithm configuration. In Learning and Intelligent Optimization, pages 507–523. Springer Berlin Heidelberg, 2011.
- C. Ansótegui, M. Sellmann, and K. Tierney. A gender-based genetic algorithm for the automatic configuration of algorithms. Principles and Practice of Constraint Programming-CP 2009, pages 142–157, 2009.
- M. Birattari, T. Stützle, L. Paquete, and K. Varrentrapp. A racing algorithm for configuring metaheuristics. In GECCO ’02: Proc. of the Genetic and Evolutionary Computation Conference, pages 11–18, 2002.
- M. Birattari, Z. Yuan, P. Balaprakash, and T. Stützle. F-race and iterated F-race: An overview. In T. Bartz-Beielstein, M. Chiarandini, L. Paquete, and M. Preuss, editors, Empirical Methods for the Analysis of Optimization Algorithms. Springer, 2010.
- F. Hutter, M.T. Lindauer, A. Balint, S. Bayless, H.H. Hoos, and K. Leyton-Brown. The configurable SAT solver challenge (CSSC). Artificial Intelligence, Accepted for publication., 2015.
- J. Styles and H. Hoos. Ordered racing protocols for automatically configuring algorithms for scaling performance. In Proceedings of the 15th Conference on Genetic and Evolutionary Computation (GECCO-13), pages 551–558. ACM, 2013.
- F. Hutter, H.H. Hoos, and K. Leyton-Brown. Automated configuration of mixed integer programming solvers. In Integration of AI and OR Techniques in Constraint Programming for Combinatorial Optimization Problems, pages 186–202. Springer Berlin Heidelberg, 2010.
- M. Vallati, C. Fawcett, A.E. Gerevini, H.H. Hoos, and A. Saetti. Automatic generation of efficient domain-optimized planners from generic parametrized planners. In Sixth Annual Symposium on Combinatorial Search (SoCS-13), pages 184–192, 2013.
- P. Lengauer and H. Mössenböck. The taming of the shrew: Increasing performance by automatic parameter tuning for java garbage collectors. In Proceedings of the 5th ACM/SPEC International Conference on Performance Engineering, ICPE ’14, pages 111–122, New York, NY, USA, 2014. ACM.
- J.P. Dickerson, A.D. Procaccia, and T. Sandholm. Dynamic matching via weighted myopia with application to kidney exchange. In Proceedings of the 28th National Conference on Artificial Intelligence (AAAI-14), pages 1340–1346, 2012.
- L. Xu, H.H. Hoos, and K. Leyton-Brown. Hydra: Automatically configuring algorithms for portfolio-based selection. Proceedings of the 24th Conference on Artificial Intelligence (AAAI-10), 10:210–216, 2010.
- L. Xu, F. Hutter, H.H. Hoos, and K. Leyton-Brown. Hydra-MIP: Automated algorithm configuration and selection for mixed integer programming. RCRA workshop on experimental evaluation of algorithms for solving problems with combinatorial explosion at the international joint conference on artificial intelligence (IJCAI), pages 16–30, 2011. Holger H. Hoos, Frank Neumann, and Heike Trautmann 39
- M. Lindauer, H. Hoos, F. Hutter and T. Schaub: AutoFolio: An Automatically Configured Algorithm Selector. J. Artif. Intell. Res. (JAIR) 53:745-778, 2015.
- N. Hansen, A. Auger, S. Finck, and R. Ros. Real-parameter black-box optimization benchmarking 2009: Experimental setup. Technical Report RR-6828, INRIA, 2009.
- N. Hansen, A. Auger, S. Finck, and R. Ros. Real-parameter black-box optimization benchmarking 2010: Experimental setup. Technical Report RR-7215, INRIA, 2010.
- O. Mersmann, B. Bischl, H. Trautmann, M. Preuss, C. Weihs, and G. Rudolph. Exploratory landscape analysis. In Proceedings of the 13th annual conference on Genetic and evolutionary computation, GECCO ’11, pages 829–836, New York, NY, USA, 2011. ACM.
- T. Jones and S. Forrest. Fitness distance correlation as a measure of problem difficulty for genetic algorithms. In Proceedings of the Sixth International Conference on Genetic Algorithms, pages 184–192. Morgan Kaufmann, 1995.
- M. Preuss and Th. Bartz-Beielstein. Experimental analysis of optimization algorithms: Tuning and beyond. In Y. Borenstein and A. Moraglio, editors, Theory and Principled Methods for Designing Metaheuristics. Springer, 2011.
- O. Mersmann, M. Preuss, H. Trautmann, B. Bischl, and C. Weihs. Analyzing the BBOB results by means of benchmarking concepts. Evolutionary Computation Journal, 23(1):161–185, 2015.
- M. A. Muñoz, M. Kirley, and S. K. Halgamuge. A meta-learning prediction model of algorithm performance for continuous optimization problems. In C. A. Coello Coello et al., editors, Parallel Problem Solving from Nature – PPSN XII, volume 7491 of Lecture Notes in Computer Science, pages 226–235. Springer, 2012.
- T. Abell, Y. Malitsky, and K. Tierney. Features for exploiting black-box optimization problem structure. In Giuseppe Nicosia and Panos Pardalos, editors, Learning and Intelligent Optimization, Lecture Notes in Computer Science, pages 30–36. Springer, 2013.
- R. Morgan and M. Gallagher. Using landscape topology to compare continuous metaheuristics: A framework and case study on EDAs and ridge structure. Evolutionary Computation, 20(2):277–299, 2012.
- M.A. Munoz, M. Kirley, and S.K. Halgamuge. Exploratory landscape analysis of continuous space optimization problems using information content. Evolutionary Computation, IEEE Transactions on, 19(1):74–87, Feb 2015.
- B. Bischl, O. Mersmann, H. Trautmann, and M. Preuss. Algorithm selection based on exploratory landscape analysis and cost-sensitive learning. In Proceedings of the 14th Annual Conference on Genetic and Evolutionary Computation, GECCO ’12, pages 313–320. ACM, 2012.
- P. Kerschke, M. Preuss, C. Hernández, O. Schütze, J. Sun, C. Grimme, G. Rudolph, B. Bischl, and H. Trautmann. Cell mapping techniques for exploratory landscape analysis. In A Tantar et al., editors, EVOLVE — A Bridge between Probability, Set Oriented Numerics, and Evolutionary Computation V, volume 288 of Advances in Intelligent Systems and Computing, pages 115–131. Springer, 2014.
- P. Kerschke, M. Preuss, S. Wessing, and H. Trautmann. Detecting funnel structures by means of exploratory landscape analysis. In Proceedings of the, pages 265–272, New York, NY, USA, 2015. ACM. Publication status: Published.
- M. A. Muñoz, Y. Sun, M. Kirley, and S. K. Halgamuge. Algorithm selection for blackbox continuous optimization problems: A survey on methods and challenges. Information Sciences, 317:224 – 245, 2015.
- J. Knowles, L. Thiele, and E. Zitzler. A Tutorial on the Performance Assessment of Stochastic Multiobjective Optimizers. TIK Report 214, Computer Engineering and Networks Laboratory, ETH Zurich, 2006.
- O. Mersmann, H. Trautmann, B. Naujoks, and C. Weihs. Benchmarking evolutionary multiobjective optimization algorithms. In IEEE Congress on Evolutionary Computation, pages 1–8. IEEE, 2010.
- K. Hornik and D. Meyer. Deriving consensus rankings from benchmarking experiments. In R. Decker and H.-J. Lenz, editors, Advances in Data Analysis (Proc. of the 30th Ann. Conf. of the Gesellschaft für Klassifikation, pages 163–170. Springer, Berlin, 2007.
- P. Kerschke, H. Wang, M. Preuss, C. Grimme, A. Deutz, H. Trautmann and M. Emmerich. Towards Analyzing Multimodality of Multiobjective Landscapes. In Proceedings of the 14th International Conference on Parallel Problem Solving from Nature (PPSN XIV), pages 962–972. Lecture Notes in Computer Science, Springer, 2016.

**License**

Creative Commons BY 3.0 Unported license

Holger H. Hoos, Frank Neumann, and Heike Trautmann

## Classification

- Artificial Intelligence / Robotics
- Data Structures / Algorithms / Complexity
- Optimization / Scheduling

## Keywords

- Algorithm selection
- Algorithm configuration
- Performance prediction
- Optimization
- Machine learning
- Features