What is AlphaGo, AlphaGo Zero, and AlphaZero

AlphaGo – developed by the DeepMind team of Google – is an AI program which plays the board game Go.

A Go board (By Donarreiskoffer - Self-photographed, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=43383)

The Go board game is an abstract strategy game, which has been invented in China over 2500 years ago. Despite its simple set of rules, Go is considered to be much more complex than Chess, and is one of the most studied strategy game of all time.

The AlphaGo Logo

The AlphaGo uses a Monte Carlo tree search algorithm to find moves using the trained deep neural network which works as its knowledge core. AlphaGo was initially trained on a training set of over 30 million moves data from human Go matches. It was then further trained by letting it compete against copies of itself using reinforcement learning.

AlphaGo’s first victory was on October 2015. It was against 3-times European Champion, Mr Fan Hui, on a full sized (19x19) board. AlphaGo won with 5-0, and became the first computer Go program to beat a human professional.

On March 2016, AlphaGo competed against Lee Sedol, an 18-time world champion and a 9-dan professional (highest professional rank) Go player. On this five-game match, AlphaGo won 4-1, earning it an honorary 9-dan title.

On January 2017, an improved version of AlphaGo – named AlphaGo Master – has been put to compete in an online series of Go games, without revealing its identity, against some of the top international Go players, and managed to win at 60-0.

On the Future of Go Summit on May 2017, AlphaGo (the improved AlphaGo Master version) competed against Ke Jie, the world No. 1 ranked player at the time. AlphaGo won 3-0 in this three-game match. The Chinese Weiqi Association awarded the professional 9-dan status to AlphaGo after this victory.

Ke Jie has later praised AlphaGo’s unique play style, and has stated “After humanity spent thousands of years improving our tactics, computers tell us that humans are completely wrong... I would go as far as to say not a single human has touched the edge of the truth of Go.” [see Related Links].

After the win with Ke Jie, AlphaGo retired from the Go arena.

On October 2017, DeepMind introduced AlphaGo Zero. While being the latest version of AlphaGo the AlphaGo Zero has been built from scratch. Rather than training it on data from millions of moves from human matches, the AlphaGo Zero was trained to play by competing against copies of itself by starting with random play.

Using this technique, AlphaGo Zero surpassed the level of AlphaGo Master by just 21 days, and became superhuman-level by 40 days.

On December 2017, DeepMind generalized the algorithm of AlphaGo Zero, and introduced AlphaZero, which has achieved superhuman levels of gameplay in Chess, Go, and Shogi, is just 24 hours of training.

Full capabilities of AlphaZero is yet to be witnessed.

Pages

Friday, December 8, 2017

What is AlphaGo, AlphaGo Zero, and AlphaZero

No comments:

Post a Comment