From the earliest forms of Mancala, invented in Africa some 6,000 years ago, to the European board games that became modern chess, games have steadily grown in complexity and entertainment value. From 1972 onwards, the power of computers has made it possible to create digital games that are more detailed and arguably more enjoyable.
Computers also let developers create games in which people play against the machine rather than against each other. At first, the computer simply stood in for a human player, as in 1952, when the British computer scientist A.S. Douglas programmed noughts and crosses on a computer at the University of Cambridge. This may be the first appearance of AI as an opponent in gaming, although the “AI” in that game was arguably just a simple algorithm following fixed steps that depended on the input data. Later, from the 1970s, the AI in games began “learning” to become better at the game. That, of course, was not the modern-day concept of machine learning: computers back then were far too slow for that. For example, the famous Space Invaders computed its responses from the player’s input and layered random movement on top, which made the game harder for players. Pac-Man was one of the first games to give each enemy AI its own “personality trait”, making every enemy behave differently and, at the same time, making their movement even harder to predict.
Alongside enemy AI in computer games, projects were also set up to beat humans at more traditional games. When Deep Blue beat Garry Kasparov in 1997, marking the supremacy of computers at chess, the potential of computer processing was once again made clear to the public. However, the fundamental algorithm behind Deep Blue was not “intelligent”: it ran a search, working through vast numbers of possible move sequences on the chessboard to find the best move.
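To make the idea of searching through possible routes concrete, here is a minimal sketch of exhaustive game-tree search (minimax) on noughts and crosses, a game small enough to search completely. It is an illustration only, not Deep Blue's actual program; the board encoding (a 9-character string) and the function names are assumptions made for this example.

```python
from functools import lru_cache

# The eight winning lines on a 3x3 board whose cells are indexed 0..8.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that player has completed a line, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def play(board, index, player):
    """Return a new board string with `player` placed at `index`."""
    return board[:index] + player + board[index + 1:]


@lru_cache(maxsize=None)
def minimax(board, player):
    """Score a position for X: +1 if X can force a win, 0 draw, -1 loss."""
    won = winner(board)
    if won is not None:
        return 1 if won == "X" else -1
    if " " not in board:
        return 0  # board full with no winner: a draw
    other = "O" if player == "X" else "X"
    scores = [minimax(play(board, i, player), other)
              for i, cell in enumerate(board) if cell == " "]
    # X picks the highest score, O picks the lowest.
    return max(scores) if player == "X" else min(scores)


def best_move(board, player):
    """Search every continuation and return the best empty cell to play."""
    other = "O" if player == "X" else "X"
    empty = [i for i, cell in enumerate(board) if cell == " "]

    def score(i):
        return minimax(play(board, i, player), other)

    return max(empty, key=score) if player == "X" else min(empty, key=score)
```

Calling `minimax(" " * 9, "X")` searches the whole game from an empty board. The same approach becomes impractical as the number of possible games grows, which is why larger games need something more than plain search.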
Chess has roughly 10^123 possible games, which made it feasible to search through the moves by brute force. Go, a board game that originated in China, has an incredible 10^360 or so. This meant that the old method of searching a game tree for the optimal path would simply not work any more. Therefore, the concept of a “neural network” is used. The name reflects what the algorithm does internally, by analogy with the neural links in the human brain. Inputs are fed into the network and the data passes through multiple layers of “nodes”; the connections between nodes are given more weight each time the moves they favoured prove successful. In the case of Go, DeepMind developed AlphaGo. The team started by feeding in hundreds of thousands of games of Go played by humans, so that the network could learn from the moves of human players. AlphaGo then played against different versions of itself countless times, learning from its mistakes each time, which led to its victory over world champion Ke Jie in May 2017. AI had taken the crown in what is often considered the hardest board game of all.
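As a rough illustration of data passing through layers of weighted nodes, the sketch below pushes three made-up input features through one hidden layer and turns the result into probabilities over three candidate moves. The weights, the feature values and all names are hypothetical, chosen only for this example; it is not AlphaGo's architecture, and it shows only the forward pass, not the training that adjusts the weights.

```python
import math


def layer(inputs, weights, biases):
    """One layer of nodes: each node sums its weighted inputs plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Keep positive signals and zero out negative ones."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical hand-picked weights: 3 inputs -> 2 hidden nodes -> 3 moves.
HIDDEN_W = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
HIDDEN_B = [0.0, 0.1]
OUTPUT_W = [[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]]
OUTPUT_B = [0.0, 0.0, 0.0]


def move_probabilities(features):
    """Forward pass: features -> hidden layer -> probability per move."""
    hidden = relu(layer(features, HIDDEN_W, HIDDEN_B))
    return softmax(layer(hidden, OUTPUT_W, OUTPUT_B))


probs = move_probabilities([1.0, 0.0, 1.0])
print(probs)
```

Training would nudge the numbers in `HIDDEN_W` and `OUTPUT_W` so that moves which turned out well receive a higher probability the next time a similar position appears.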
Attention then turned back to video games. Given the advancements made in video games, the number of possible “moves” is even more astronomical, since most of these games are played on a 2D plane. Once again, the DeepMind team was one of the first to step into this field. They created AlphaStar, an AI that aimed to reach a high level in the game StarCraft II. The start was difficult: there were so many actions the AI could take near the start of a game that would massively affect the rest of it. This led the team to feed in some data based on what professional players do, to make sure the AI had a decent start. Then another problem arose. This problem, “forgetting”, had not shown up as much in games like Go or chess. The AI started to forget how to beat previous versions of itself, creating a cycle like “chasing its tail”. Take a simple game such as rock, paper, scissors: as self-play proceeds, the AI will decide in turn that each of the three moves is the best tactic, and never break out of this endless cycle.
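The rock, paper, scissors cycle can be reproduced in a few lines. This is a toy sketch under a strong simplifying assumption: each new version of the AI only learns to beat the single version that came before it. The function names are invented for the example.

```python
# Each key beats the move stored as its value.
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def best_response(opponent_move):
    """Return the move that beats `opponent_move`."""
    return next(move for move, loser in BEATS.items()
                if loser == opponent_move)


def naive_self_play(start, generations):
    """Each new version only learns to beat the version just before it."""
    history = [start]
    for _ in range(generations):
        history.append(best_response(history[-1]))
    return history


print(naive_self_play("rock", 6))
# → ['rock', 'paper', 'scissors', 'rock', 'paper', 'scissors', 'rock']
```

After three generations the AI is back where it started, having "forgotten" why it abandoned rock in the first place.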
So, how was it solved? The team found that fictitious self-play, in which the AI plays against a mixture of all the strategies it has previously used, was part of the solution. But that alone was not enough for StarCraft II. Consequently, “the League” was introduced. The main purpose of the League was to expose the weaknesses of the main AI by adding exploiter AIs. The aim of an exploiter AI is not to win against every single opponent, but to find flaws in the tactics of the main AI, which in turn makes the main AI stronger. Through the League, AlphaStar was able to learn the complex strategies of StarCraft II in a largely automated way, beyond the initial data drawn from human play. With these algorithms, AlphaStar reached the Grandmaster league in the game, which placed it among the 700 best players in the world. It is the first AI to reach the top league of a popular esport without any restrictions. However, it still has some way to go before it can beat the best professional players.
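A toy version of fictitious self-play on rock, paper, scissors shows why playing against a mixture of past strategies helps. In this sketch each new version picks the move that scores best against every earlier version, weighted by how often each was used. It is plain fictitious play on a three-move game, offered as an assumption-laden illustration and not as AlphaStar's League; the names are invented for the example.

```python
from collections import Counter

MOVES = ["rock", "paper", "scissors"]
# Each key beats the move stored as its value.
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def payoff(mine, theirs):
    """+1 for a win, 0 for a tie, -1 for a loss."""
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1


def best_response_to_mixture(counts):
    """Pick the move with the best total payoff against all past versions."""
    def total_payoff(move):
        return sum(payoff(move, past) * n for past, n in counts.items())
    return max(MOVES, key=total_payoff)


def fictitious_self_play(start, generations):
    """Return how often each move was chosen across all versions."""
    counts = Counter({start: 1})
    for _ in range(generations):
        counts[best_response_to_mixture(counts)] += 1
    total = sum(counts.values())
    return {move: counts[move] / total for move in MOVES}


print(fictitious_self_play("rock", 3000))
```

Instead of circling forever, the overall mix of moves drifts towards an even split between rock, paper and scissors, which is the strategy that no opponent can exploit in this game.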
Another project, OpenAI Five, set out to tackle the game Dota 2. This time, alongside all the problems that AlphaStar had faced, a brand-new problem appeared: teamwork. The OpenAI team tackled it by introducing a hyperparameter (a value fed into the network that controls its learning process) called “team spirit”. This controls how much each of the five AIs values its individual gain relative to rewards for the whole team. The value was changed over the course of the project based on the results that the AIs produced. This is a prototype of teamwork between different neural networks, for which I believe lots of different strategies will emerge in the future. A lot of progress has been made: OpenAI Five has beaten some semi-pro teams, though still with restrictions on which heroes could be chosen and with certain features of the game taken out.
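One simple way to picture "team spirit" is as a dial between 0 and 1 that blends each AI's own reward with the team average. The linear blend below is an assumption made for illustration, and the function name and reward values are hypothetical; the project's actual reward code may differ.

```python
def blend_rewards(individual_rewards, team_spirit):
    """Mix each agent's own reward with the team's average reward.

    team_spirit = 0.0 -> every agent keeps only its own reward.
    team_spirit = 1.0 -> every agent receives the team average.
    """
    if not 0.0 <= team_spirit <= 1.0:
        raise ValueError("team_spirit must be between 0 and 1")
    team_average = sum(individual_rewards) / len(individual_rewards)
    return [(1.0 - team_spirit) * reward + team_spirit * team_average
            for reward in individual_rewards]


# Hypothetical rewards: one hero earned 10 points, the other four earned none.
print(blend_rewards([10, 0, 0, 0, 0], 0.5))
# → [6.0, 1.0, 1.0, 1.0, 1.0]
```

With a low value each AI chases its own score; raising the value makes a hero's success count for its teammates as well, which rewards cooperative play.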
So, what can these AIs bring to humans? Going back to chess, the top players now use computers to analyse their own games as well as their opponents’, allowing them to spot their mistakes and pushing the boundaries of human chess. After AlphaGo, DeepMind created AlphaZero, into which no games played by humans were ever fed. All it knew were the basic rules of chess, shogi (Japanese chess) and Go, and through self-play it taught itself to beat the world-champion programs in all three games. It has an unconventional style of play, and many of its ideas have been taken up by human players in their own games. Here is a quote from Garry Kasparov: “I can’t disguise my satisfaction that it plays with a very dynamic style, much like my own!” The games tackled by projects like AlphaStar and OpenAI Five more closely resemble real-world situations, where decisions need to be made quickly among countless possibilities, so many new algorithms and problem-solving methods have been developed for these AIs. These new ideas are not restricted to “games”. They could be applied in other real-world fields such as law or healthcare. No one quite knows the bounds!
https://openai.com/blog/openai-five/ OpenAI, June 25, 2018
https://www.history.com/topics/inventions/history-of-video-games History.com editors, September 1, 2017