In this first part of our series 'AI Rap with Eva Eggeling', Eva Eggeling, Center Manager for Data Driven Design at Fraunhofer Austria and Head of the Innovation Center for Digitalisation and Artificial Intelligence KI4Life, takes a deep dive into the topic of the Turing Test.
Can a computer think like a human or not? In the age of ChatGPT, questions like these are more topical than ever. However, pioneering work in the fields of computer science and machine learning was already being carried out in the 1940s and 1950s. And even back then, researchers were looking for answers to questions like these.
The British logician, mathematician, cryptanalyst and computer scientist Alan Turing presented a test in his 1950 essay "Computing Machinery and Intelligence" that was intended to serve as a research method in the field of artificial intelligence.
What is the Turing test?
The Turing test is designed to determine whether a machine can demonstrate intelligence. The test is passed if the computer can imitate human reactions under certain conditions.
The precursor of the Turing test is the so-called "Imitation Game". Three human participants sit in three separate rooms, each connected via a screen and keyboard. A male participant sits in one room, a female participant in the second room and a male or female "referee" in the third room.
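In Turing's original description, the male participant now tries to mislead the referee about which of the two is the man, while the female participant tries to help the referee reach the right answer. The referee has the task of finding out, from the answers to a series of questions, which participant is the man and which is the woman.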
How does the Turing test work?
Alan Turing's further development of the imitation game into the Turing test brings an AI into play. One of the three terminals is operated by a computer, while humans sit at the other two. The referee must question the participants on a specific topic in a given format and context.
After a certain time and a certain number of questions, the referee must decide which terminal is operated by the AI. This procedure is repeated many times. If the referee correctly identifies the AI in half of the test runs or fewer, the AI can be classified as intelligent, as it is perceived by the referee as being "just as human" as the human interviewee.
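The pass criterion described above can be written down as a small sketch. It assumes that each test run is recorded simply as whether the referee picked the right terminal. The function name `ai_passes` and the sample data are hypothetical and used only for illustration; they do not come from Turing's essay.

```python
def ai_passes(referee_was_correct):
    """Apply the 'half of the runs or fewer' rule to a series of test runs.

    referee_was_correct: list of booleans, one per test run; True means
    the referee correctly identified which terminal the AI operated.
    """
    runs = len(referee_was_correct)
    if runs == 0:
        raise ValueError("at least one test run is required")
    correct = sum(referee_was_correct)  # True counts as 1, False as 0
    # The AI passes if the referee was right in at most half of the runs.
    return 2 * correct <= runs


# Hypothetical record of ten runs: the referee was right four times.
example_runs = [True, False, True, False, False,
                True, False, False, True, False]
print(ai_passes(example_runs))  # True
```

With only a handful of runs, the outcome depends heavily on chance, which is one reason the procedure is repeated many times.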
Criticism of the Turing test
A major problem with the test is that Alan Turing himself did not formulate any requirements for the questioner, which makes the results difficult to interpret. Scientists such as Gary Marcus, Professor of Psychology and Neuroscience at New York University, point out the disadvantages of the test. For example, a test with chatbots says nothing about the intelligence of the people who also take part in it. In addition, critics note that the AI is not able to answer complex questions and does not act intuitively.
Another point of criticism is that in the past, the type of question had to be restricted in order for a computer to demonstrate human-like intelligence. For a long time, a computer only achieved a high score in the Turing test if the questioner asked yes-or-no questions and the topics related to a narrow field of knowledge. If open questions were asked, the probability that the AI could deceive the human was lower.
To this day, it is questionable whether the Turing test can provide credible proof of artificial intelligence at all. Does a deceptively realistic imitation of human communication actually indicate independent artificial intelligence, or is it nothing more than an imitation?
Which AI passes the Turing test?
To mark the 60th anniversary of Alan Turing's death in 2014, the University of Reading organised a competition at the Royal Society in London in which a chatbot named Eugene Goostman simulated a 13-year-old boy. In the eyes of some scientists, the chatbot passed the test because it convinced 33 per cent of the judges that it was human.
Others argue that there were too few judges and that the test was invalid because each conversation lasted only five minutes. Another argument is that programmes such as Eugene Goostman are primarily aimed at misleading the questioners. The AI responds to questions that it does not understand, or to which it does not have an answer ready, with counter-questions and evasive manoeuvres. The more successful this is, the more people believe they are communicating with a human counterpart.
A second example is Google Duplex, a project that enables users to make bookings via smartphone. In 2018, Google Duplex successfully made an appointment with a hairdresser over the phone in front of 7,000 people. The receptionist did not realise that she was not talking to a real person. This experiment is also considered by some scientists to be a successful Turing test, although it is not based on Alan Turing's original format.
ChatGPT and other modern AI systems based on a large language model (LLM) are now reported to pass the Turing test regularly. This is another reason why many researchers consider the test obsolete.
Three areas of application of the Turing test
Although Alan Turing's AI test has been widely criticised, the basic idea - the deceptively realistic imitation of human communication - still plays a role in digitalisation today.
- Human Interaction Proof (HIP): The CAPTCHA query can be seen as a negative Turing test. As a human interaction proof, it has the function of distinguishing humans from machines and effectively keeping bots away from a website through automated text and image queries. The name CAPTCHA stands for "Completely Automated Public Turing test to tell Computers and Humans Apart".
- Bots: Bots are electronic tools that fulfil positive or negative tasks depending on their application. For example, they are used as chatbots to efficiently automate customer service enquiries. On the other hand, they are also used as social bots or spam bots to spread false information or malware. In both cases, variants of the Turing test help to further develop bots and make them more human-like.
- Voice assistants: Of all technologies, voice assistants probably come closest to Alan Turing's original concept. Voice assistants such as Alexa or Siri are based on a question-and-answer system and are designed to automate everyday tasks and user needs. Even if today's applications do not yet reach the level required to pass the Turing test, they are constantly being improved through machine learning and the analysis of user behaviour and are becoming increasingly similar to humans.
Is Alan Turing's test still relevant today?
Eva Eggeling, Center Manager for Data Driven Design at Fraunhofer Austria and Head of the Innovation Center for Digitalisation and Artificial Intelligence KI4Life, says: "I think it is particularly topical now, especially since the great popularity of ChatGPT for the general population."
Despite much debate about the relevance of the Turing test today and the validity of the competitions based on it, the test is still a philosophical starting point for the discussion and exploration of AI. As we continue to make progress in AI and better understand and map how the human brain works, the Turing test remains fundamental to the definition of intelligence and is a basis for the debate about what we should expect from technologies in order to consider them as thinking machines.