The goal of a retrieval system is to provide users with results of maximum utility. But how can utility be measured? The conventional approach is to equate utility with some measure (e.g. Average Precision, NDCG) derived from expert relevance judgments. Since this approach has several well-known problems (e.g. divergence between user and expert judgments, ignorance of user context, cost, availability), it raises the question of whether utility can be elicited directly from the user. While users provide observable feedback in the form of, for example, clicks and query reformulations, it is unclear how these signals relate to utility (e.g. do more clicks indicate higher utility?). The relationship between clicks and utility becomes clearer, however, when moving from a passive observation model to an interactive experiment setting. Interactive retrieval systems can conduct such experiments themselves, making implicit feedback data interpretable. In this talk, I will discuss interactive experiment designs under which clicks accurately reveal ordinal statements about the relative utility of two retrieval functions. These experiment designs are evaluated in a controlled study on the E-Print ArXiv. The study shows that such interactive experiments can be used to directly compare different retrieval functions based on the users' clicking behavior. I also discuss how similar experiment designs can be used to adapt system behavior during a search session, and for long-term learning of improved retrieval functions.

Further details: F. Radlinski, M. Kurup, T. Joachims, How Does Clickthrough Data Reflect Retrieval Quality?, Proceedings of the ACM Conference on Information and Knowledge Management (CIKM), 2008.
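
The abstract does not name a specific experiment design. As an illustration of how clicks can yield an ordinal comparison of two retrieval functions, the following is a minimal Python sketch of team-draft interleaving, one of the designs examined in the cited CIKM 2008 paper. The function names (`team_draft_interleave`, `compare_by_clicks`), the document identifiers, and the choice to keep drafting from one ranking after the other is exhausted are assumptions made for this sketch, not details taken from the talk.

```python
import random


def team_draft_interleave(ranking_a, ranking_b, rng=None):
    """Merge two rankings; return (interleaved list, doc -> contributing team)."""
    rng = rng or random.Random()
    interleaved, team = [], {}
    picks = {"A": 0, "B": 0}
    rankings = {"A": ranking_a, "B": ranking_b}

    def next_unused(ranking):
        # Highest-ranked document not yet placed in the interleaved list.
        return next((doc for doc in ranking if doc not in team), None)

    while True:
        candidates = {name: next_unused(r) for name, r in rankings.items()}
        available = [name for name, doc in candidates.items() if doc is not None]
        if not available:
            break
        if len(available) == 2:
            if picks["A"] != picks["B"]:
                # The team with fewer picks so far chooses next.
                chooser = min(picks, key=picks.get)
            else:
                # Equal picks: a coin flip decides who chooses first.
                chooser = rng.choice(["A", "B"])
        else:
            # Assumption of this sketch: keep drafting from the remaining ranking.
            chooser = available[0]
        doc = candidates[chooser]
        interleaved.append(doc)
        team[doc] = chooser
        picks[chooser] += 1
    return interleaved, team


def compare_by_clicks(team, clicked_docs):
    """Ordinal outcome for one query: 'A', 'B', or 'tie'."""
    clicks_a = sum(1 for doc in clicked_docs if team.get(doc) == "A")
    clicks_b = sum(1 for doc in clicked_docs if team.get(doc) == "B")
    if clicks_a > clicks_b:
        return "A"
    if clicks_b > clicks_a:
        return "B"
    return "tie"


if __name__ == "__main__":
    # Hypothetical rankings produced by two retrieval functions for one query.
    shown, team = team_draft_interleave(["d1", "d2", "d3"], ["d2", "d4", "d1"])
    print(shown, compare_by_clicks(team, clicked_docs=[shown[0]]))
```

Each query yields only a preference for A, a preference for B, or a tie, so the comparison stays ordinal: aggregating these outcomes over many queries indicates which retrieval function users prefer without assigning an absolute utility value to either one.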