AlphaGo

AlphaGo is a computer program developed by Google DeepMind to play the board game Go. In October 2015, it became the first computer Go program to beat a professional human Go player without handicaps on a full-sized 19×19 board.<ref name="googlego">Template:Cite web</ref><ref name="bbcgo"/> In March 2016, it beat Lee Sedol in the first game of a five-game match, the first time a computer Go program has beaten a 9-dan professional without handicaps.<ref name="leesedolwin">Template:Cite web</ref>

AlphaGo's algorithm combines machine learning and tree search techniques with extensive training, from both human and computer play.

History and competitions

Go is considered much more difficult for computers to master than other games such as chess, because its much larger branching factor makes it prohibitively difficult to use traditional AI methods such as brute-force search.<ref name="googlego" /><ref name="xeps-dnock">Template:Citation</ref>
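
To give a sense of the scale involved, the short sketch below compares rough game-tree sizes for chess and Go using commonly quoted average branching factors and game lengths; the numbers are illustrative approximations only, not figures taken from the AlphaGo publications.

<syntaxhighlight lang="python">
# Back-of-the-envelope comparison of game-tree sizes (illustrative only;
# the branching factors and game lengths are commonly quoted averages).
chess_branching, chess_length = 35, 80    # ~35 legal moves per position, ~80 plies per game
go_branching, go_length = 250, 150        # ~250 legal moves per position, ~150 plies per game

chess_tree = chess_branching ** chess_length   # on the order of 10^123
go_tree = go_branching ** go_length            # on the order of 10^359

print(f"chess: roughly 10^{len(str(chess_tree)) - 1} possible games")
print(f"go:    roughly 10^{len(str(go_tree)) - 1} possible games")
</syntaxhighlight>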

Almost two decades after IBM's computer Deep Blue beat world chess champion Garry Kasparov in their 1997 match, the strongest Go programs using artificial intelligence techniques had only reached about amateur 5-dan level,<ref name="DeepMindnature2016"/> and still could not beat a professional Go player without handicaps.<ref name="googlego" /><ref name="bbcgo" /><ref name="CNN0128">Template:Cite web</ref> In 2012, the software program Zen, running on a four-PC cluster, beat Masaki Takemiya (9p) twice, at five- and four-stone handicaps.<ref>Template:Cite web</ref> In 2013, Crazy Stone beat Yoshio Ishida (9p) at a four-stone handicap.<ref>Template:Cite web</ref>

AlphaGo represents a significant improvement over previous Go programs. In 500 games against other available Go programs, including Crazy Stone and Zen,<ref>*****o</ref> AlphaGo running on a single computer won all but one.<ref>Template:Cite web</ref> In a similar matchup, AlphaGo running on multiple computers won all 500 games played against other Go programs, and 77% of games played against AlphaGo running on a single computer. The distributed version used 1,202 CPUs and 176 GPUs, about 25 times as many as the single-computer version.<ref name="DeepMindnature2016" />

Match against Fan Hui

In October 2015, the distributed version of AlphaGo defeated the European Go champion Fan Hui,<ref name=MetzWired2016>Template:Cite web</ref> a 2-dan professional (out of a possible 9 dan), five to zero.<ref name="bbcgo">*****o</ref><ref>Template:Cite web</ref> This was the first time a computer Go program had beaten a professional human player on a full-sized board without handicap.<ref name="lemondego">*****o</ref> The announcement of the news was delayed until 27 January 2016 to coincide with the publication of a paper in the journal Nature<ref name="DeepMindnature2016">Template:Cite journal</ref> describing the algorithms used.<ref name="bbcgo" />

Match against Lee Se-dol

Template:Main

AlphaGo is scheduled to challenge South Korean professional Go player Lee Se-dol, who is ranked 9 dan,<ref name="CNN0128" /> in a five-game match taking place at the Four Seasons Hotel in Seoul, South Korea on 9, 10, 12, 13, and 15 March 2016,<ref>Template:Cite web</ref><ref>Template:Cite web</ref> which will be video-streamed live.<ref>Template:Cite web</ref> Aja Huang, a DeepMind team member and amateur 6-dan Go player, will place the stones on the board for AlphaGo, which will run on Google's cloud computing infrastructure with its servers located in the United States.<ref name="JoongAng Ilbo">Template:Cite web</ref> The match will use Chinese rules with a 7.5-point komi, and each side will have two hours of main thinking time plus three 60-second byoyomi periods.<ref name="Korea Baduk Association"/>

The winner will receive a $1M prize. If AlphaGo wins, the prize will be donated to charities, including UNICEF.<ref>Template:Cite web</ref> In addition to the prize, Lee Se-dol will receive at least $150,000 for participating in all five games and a further $20,000 for each win.<ref name="Korea Baduk Association">Template:Cite web</ref>

One game of the match has been played so far. AlphaGo won the first game when Lee Se-dol resigned.<ref>*****o</ref>

Algorithm

AlphaGo's algorithm combines machine learning and tree search techniques with extensive training, from both human and computer play. It uses Monte Carlo tree search, guided by a "value network" and a "policy network", both implemented using deep neural networks.<ref name="googlego" /><ref name="DeepMindnature2016" /> A limited amount of game-specific feature-detection pre-processing is used to generate the inputs to the neural networks.<ref name="DeepMindnature2016" />
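
As a rough illustration of how the two networks can steer the search, the sketch below implements a simplified, PUCT-style Monte Carlo tree search in which a policy function supplies move priors and a value function scores leaf positions. The Game interface (copy/play), policy_fn and value_fn are hypothetical stand-ins for exposition; this is not DeepMind's code, and AlphaGo additionally blended the value network's output with fast rollout results.

<syntaxhighlight lang="python">
import math

class Node:
    """One search-tree node; the prior comes from the policy network."""
    def __init__(self, prior):
        self.prior = prior        # move probability suggested by the policy network
        self.visits = 0
        self.value_sum = 0.0
        self.children = {}        # move -> Node

    def value(self):
        return self.value_sum / self.visits if self.visits else 0.0

def select_child(node, c_puct=1.5):
    # Pick the child maximising Q + U, where U favours moves the policy
    # network likes but that have been visited relatively rarely so far.
    def score(child):
        u = c_puct * child.prior * math.sqrt(node.visits) / (1 + child.visits)
        return child.value() + u
    return max(node.children.items(), key=lambda mc: score(mc[1]))

def mcts(root_state, policy_fn, value_fn, n_simulations=800):
    root = Node(prior=1.0)
    for _ in range(n_simulations):
        node, state, path = root, root_state.copy(), [root]
        # 1. Selection: walk down the tree with the PUCT rule.
        while node.children:
            move, node = select_child(node)
            state.play(move)
            path.append(node)
        # 2. Expansion: ask the policy network for move priors at the leaf.
        for move, p in policy_fn(state):
            node.children[move] = Node(prior=p)
        # 3. Evaluation: score the leaf with the value network.
        leaf_value = value_fn(state)
        # 4. Backup: walk back up from the leaf, flipping perspective each ply.
        for n in reversed(path):
            n.visits += 1
            n.value_sum += leaf_value
            leaf_value = -leaf_value
    # Play the most-visited move at the root.
    return max(root.children.items(), key=lambda mc: mc[1].visits)[0]
</syntaxhighlight>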

The system's neural networks were initially bootstrapped from human game-play expertise. AlphaGo was initially trained to mimic human play by attempting to match the moves of expert players from recorded historical games, using a database of around 30 million moves.<ref name=MetzWired2016/> Once it had reached a certain degree of proficiency, it was trained further by being set to play large numbers of games against other instances of itself, using reinforcement learning to improve its play.<ref name="googlego"/>
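
The sketch below shows what such a two-stage pipeline can look like in outline, assuming PyTorch is available: a supervised step that trains a policy network to predict expert moves with a cross-entropy loss, followed by a REINFORCE-style self-play update that reinforces moves from won games. The tiny fully connected network, synthetic tensors and hyperparameters are placeholders for exposition; AlphaGo itself used deep convolutional networks trained on far larger datasets.

<syntaxhighlight lang="python">
import torch
import torch.nn as nn
import torch.nn.functional as F

BOARD_POINTS = 19 * 19  # one output logit per board point

# Placeholder policy network (AlphaGo used deep convolutional networks).
policy_net = nn.Sequential(
    nn.Linear(BOARD_POINTS, 256),
    nn.ReLU(),
    nn.Linear(256, BOARD_POINTS),
)
optimizer = torch.optim.SGD(policy_net.parameters(), lr=0.01)

# --- Stage 1: supervised learning on recorded expert moves -------------------
def supervised_step(positions, expert_moves):
    """positions: (batch, 361) features; expert_moves: (batch,) move indices."""
    logits = policy_net(positions)
    loss = F.cross_entropy(logits, expert_moves)   # imitate the human move
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# --- Stage 2: reinforcement learning from self-play --------------------------
def reinforce_step(positions, chosen_moves, game_outcome):
    """REINFORCE-style update: encourage moves from won self-play games
    (game_outcome = +1) and discourage moves from lost ones (-1)."""
    logits = policy_net(positions)
    log_probs = F.log_softmax(logits, dim=-1)
    chosen = log_probs.gather(1, chosen_moves.unsqueeze(1)).squeeze(1)
    loss = -(game_outcome * chosen).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Synthetic tensors, just to show the expected shapes.
fake_positions = torch.rand(8, BOARD_POINTS)
fake_moves = torch.randint(0, BOARD_POINTS, (8,))
supervised_step(fake_positions, fake_moves)
reinforce_step(fake_positions, fake_moves, torch.ones(8))
</syntaxhighlight>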

Style of play

AlphaGo has been described by the 9-dan player Myungwan Kim as playing "like a human" in its games against Fan Hui.<ref>Template:Cite web</ref> The match referee Toby Manning has described the program's style as "conservative".<ref name=":0">Template:Cite web</ref>

Responses

AlphaGo has been hailed as a landmark development in artificial intelligence research, as Go had previously been regarded as a hard problem in machine learning that was expected to remain out of reach for the technology of the time.<ref>*****o</ref><ref>*****o</ref> Toby Manning, the referee of AlphaGo's match against Fan Hui, and Hajin Lee, secretary general of the International Go Federation, both expect that, in the future, Go players will use computers to learn what they did wrong in games and to improve their skills.<ref>Template:Cite journal</ref>

Similar systems

Facebook has also been working on its own Go-playing system, darkforest, which likewise combines machine learning and tree search.<ref name=":0" /><ref name=facebook-paper>Template:Cite arXiv</ref> Although it is a strong player against other computer Go programs, as of early 2016 it had not yet defeated a professional human player.<ref>*****o</ref>

Example game

AlphaGo (black) v. Fan Hui, Game 4 (8 October 2015), AlphaGo won by resignation.<ref name="DeepMindnature2016"/>

Template:Goban

First 99 moves (96 at 10)

Template:Goban

Moves 100-165.

See also

References

<references group=""></references>

External links