AlphaGo Zero trains itself to be the most powerful Go player in the world

(credit: DeepMind)

DeepMind has just announced AlphaGo Zero, an evolution of AlphaGo, the first computer program to defeat a world champion at the ancient Chinese game of Go. Zero is even more powerful and is now arguably the strongest Go player in history, according to the company.

While previous versions of AlphaGo initially trained on thousands of human amateur and professional games to learn how to play Go, AlphaGo Zero skips this step. It learns to play from scratch, simply by playing games against itself, starting from completely random play.

(credit: DeepMind)

AlphaGo Zero surpassed AlphaGo Lee, the previously published, champion-defeating version of AlphaGo, after just 3 days of training, winning by 100 games to 0. After 40 days it had surpassed all other versions of AlphaGo, according to DeepMind.

The achievement is described in the journal Nature today (Oct. 18, 2017).


DeepMind | AlphaGo Zero: Starting from scratch


Abstract of Mastering the game of Go without human knowledge

A long-standing goal of artificial intelligence is an algorithm that learns, tabula rasa, superhuman proficiency in challenging domains. Recently, AlphaGo became the first program to defeat a world champion in the game of Go. The tree search in AlphaGo evaluated positions and selected moves using deep neural networks. These neural networks were trained by supervised learning from human expert moves, and by reinforcement learning from self-play. Here we introduce an algorithm based solely on reinforcement learning, without human data, guidance or domain knowledge beyond game rules. AlphaGo becomes its own teacher: a neural network is trained to predict AlphaGo’s own move selections and also the winner of AlphaGo’s games. This neural network improves the strength of the tree search, resulting in higher quality move selection and stronger self-play in the next iteration. Starting tabula rasa, our new program AlphaGo Zero achieved superhuman performance, winning 100–0 against the previously published, champion-defeating AlphaGo.
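The self-play loop described in the abstract can be illustrated with a toy sketch. This is not DeepMind's algorithm: it swaps the deep neural network for a lookup table, replaces the tree search with a one-step greedy choice, and uses a simple stone-taking game (remove 1 to 3 stones, taking the last stone wins) instead of Go. The function names, the game, and all parameters are assumptions chosen for illustration. What it does share with the description above is that the program starts with no knowledge beyond the rules and learns only from the outcomes of games it plays against itself.

```python
import random
from collections import defaultdict

MAX_TAKE = 3  # hypothetical toy rule: remove 1..3 stones, taking the last stone wins


def legal_moves(stones):
    """Moves available to the player facing a pile of `stones`."""
    return range(1, min(MAX_TAKE, stones) + 1)


def greedy_move(q, stones):
    """Pick the move with the highest estimated outcome for the player to move."""
    return max(legal_moves(stones), key=lambda m: q[(stones, m)][0])


def self_play_train(max_pile=12, episodes=30000, seed=0):
    """Learn move values purely from games the program plays against itself."""
    rng = random.Random(seed)
    # q[(stones, move)] = [mean final outcome for the mover, number of samples]
    q = defaultdict(lambda: [0.0, 0])
    for _ in range(episodes):
        # Start from a random position and a random first move so that
        # every (position, move) pair keeps being tried.
        stones = rng.randint(1, max_pile)
        move = rng.choice(list(legal_moves(stones)))
        history = []
        while True:
            history.append((stones, move))
            stones -= move
            if stones == 0:
                break
            # Both sides then follow the current estimates: this is the self-play.
            move = greedy_move(q, stones)
        # Whoever made the last move won. Walk backwards through the game,
        # flipping the sign at each ply because the players alternate.
        outcome = 1.0
        for key in reversed(history):
            mean, n = q[key]
            q[key] = [mean + (outcome - mean) / (n + 1), n + 1]
            outcome = -outcome
    return q


if __name__ == "__main__":
    table = self_play_train()
    for pile in range(1, 13):
        print(pile, "stones -> take", greedy_move(table, pile))
```

For this toy game the best strategy is known: leave the opponent a multiple of four stones whenever possible. With enough self-play games the learned table should settle on that strategy, even though it was never told it.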

Leading brain-training game improves memory and attention better than competing method

EEGs taken before and after the training showed that the biggest changes occurred in the brains of the group that trained using the “dual n-back” method (right). (credit: Kara J. Blacker/JHU)

A leading brain-training game called “dual n-back” was significantly better at improving memory and attention than a competing “complex span” game, Johns Hopkins University researchers found in a recent experiment.*

These results, published Monday, Oct. 16, 2017 in an open-access paper in the Journal of Cognitive Enhancement, suggest it’s possible to train the brain like other body parts, with targeted workouts that improve the cognitive skills needed when tasks are new and you can’t just rely on old knowledge and habits, says co-author Susan Courtney, a Johns Hopkins neuroscientist and professor of psychological and brain sciences.


Johns Hopkins University | The Best Way to Train Your Brain: A Game

The dual n-back game is a memory sequence test in which you must remember a constantly updating sequence of visual and auditory stimuli. As shown in a simplified version in the video above, participants saw squares flashing on a grid while hearing letters. In the experiment, the subjects also had to remember whether the square they had just seen and the letter they had just heard were both the same as one round back.

As the test got harder, they had to recall squares and letters two, three, and four rounds back. The subjects also showed significant changes in brain activity in the prefrontal cortex, the critical region responsible for higher learning.
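The matching rule behind the task can be sketched in a few lines of Python. This is a minimal illustration of the rule as described above, not the software used in the study; the function name, the grid-cell numbering, and the sample sequences are assumptions made for the example.

```python
from typing import List, Tuple


def n_back_targets(positions: List[int], letters: List[str], n: int) -> List[Tuple[bool, bool]]:
    """For each round, report (square_matches, letter_matches) versus the round n steps back.

    `positions` holds the grid cell that flashed in each round and `letters`
    holds the letter heard in the same round. The first n rounds have nothing
    to compare against, so they are reported as non-matches.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if len(positions) != len(letters):
        raise ValueError("each round needs one square and one letter")
    results = []
    for i in range(len(positions)):
        if i < n:
            results.append((False, False))
        else:
            results.append((positions[i] == positions[i - n],
                            letters[i] == letters[i - n]))
    return results


if __name__ == "__main__":
    # Hypothetical five-round sequence on a grid with cells numbered 0..8.
    squares = [0, 4, 0, 4, 2]
    heard = ["C", "K", "C", "T", "C"]
    for round_no, match in enumerate(n_back_targets(squares, heard, n=2), start=1):
        print(round_no, match)
```

Raising `n` from 1 to 2, 3, or 4 makes the comparison reach further back, which is how the task gets harder: the player has to keep a longer, constantly shifting window of squares and letters in mind.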

With the easier complex span game, there’s a distraction between items, but participants don’t need to continually update the previous items in their mind.

(You can try an online version of the dual n-back test/game here and of the digit-span test here. The training programs Johns Hopkins compared are tools scientists rely on to test the brain’s working memory, not the commercial products sold to consumers.)

30 percent improvement in working memory

The researchers found that the group that practiced the dual n-back exercise showed a 30 percent improvement in their working memory, nearly double the gains in the group using complex span. “The findings suggest that [the dual n-back] task is changing something about the brain,” Courtney said. “There’s something about sequencing and updating that really taps into the things that only the prefrontal cortex can do, the real-world problem-solving tasks.”

The next step, the researchers say, is to figure out why dual n-back is so good at improving working memory, then figure out how to make it even more effective so that it can become a marketable or even clinically useful brain-training program.

* Scientists trying to determine if brain exercises make people smarter have had mixed results. Johns Hopkins researchers suspected the problem wasn’t the idea of brain training, but the type of exercise researchers chose to test it. They decided to compare directly the leading types of exercises and measure people’s brain activity before and after training; that had never been attempted before, according to lead author Kara J. Blacker, a former Johns Hopkins post-doctoral fellow in psychological and brain sciences, now a researcher at the Henry M. Jackson Foundation for Advancement of Military Medicine, Inc. For the experiment, the team assembled three groups of participants, all young adults. Everyone took an initial battery of cognitive tests to determine baseline working memory, attention, and intelligence. Everyone also got an electroencephalogram, or EEG, to measure brain activity. Then, everyone was sent home to practice a computer task for a month. One group used one leading brain exercise while the second group used the other. The third group practiced on a control task. Everyone trained five days a week for 30 minutes, then returned to the lab for another round of tests to see if anything about their brain or cognitive abilities had changed.


Abstract of N-back Versus Complex Span Working Memory Training

Working memory (WM) is the ability to maintain and manipulate task-relevant information in the absence of sensory input. While its improvement through training is of great interest, the degree to which WM training transfers to untrained WM tasks (near transfer) and other untrained cognitive skills (far transfer) remains debated and the mechanism(s) underlying transfer are unclear. Here we hypothesized that a critical feature of dual n-back training is its reliance on maintaining relational information in WM. In experiment 1, using an individual differences approach, we found evidence that performance on an n-back task was predicted by performance on a measure of relational WM (i.e., WM for vertical spatial relationships independent of absolute spatial locations), whereas the same was not true for a complex span WM task. In experiment 2, we tested the idea that reliance on relational WM is critical to produce transfer from n-back but not complex span task training. Participants completed adaptive training on either a dual n-back task, a symmetry span task, or on a non-WM active control task. We found evidence of near transfer for the dual n-back group; however, far transfer to a measure of fluid intelligence did not emerge. Recording EEG during a separate WM transfer task, we examined group-specific, training-related changes in alpha power, which are proposed to be sensitive to WM demands and top-down modulation of WM. Results indicated that the dual n-back group showed significantly greater frontal alpha power after training compared to before training, more so than both other groups. However, we found no evidence of improvement on measures of relational WM for the dual n-back group, suggesting that near transfer may not be dependent on relational WM. These results suggest that dual n-back and complex span task training may differ in their effectiveness to elicit near transfer as well as in the underlying neural changes they facilitate.