With the advent of deep learning, AI components have achieved unprecedented performance on complex, human competitive tasks, such as image, video, text and audio processing. Hence, they are increasingly integrated into sophisticated software systems, some of which (e.g., autonomous vehicles) are required to deliver certified dependability warranties. In this talk, I will consider the unique features of AI based systems and of the faults possibly affecting them, in order to revise the testing fundamentals and redefine the overall goal of testing, taking a statistical view on the dependability warranties that can be actually delivered. Then, I will consider the key elements of a revised testing process for AI based systems, including the test oracle and the test input generation problems. I will also introduce the notion of runtime supervision, to deal with unexpected error conditions that may occur in the field. Finally, I will identify the future steps that are essential to close the loop from testing to operation, proposing an empirical framework that reconnects the output of testing to its original goals.