AlphaGo 2.0 Now Learns by Itself, with Science and Medicine as the Target Fields

2017-05-24 10:03:21

Source: Sohu Sports

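On the afternoon of May 23, Ke Jie 9-dan, the current world No. 1 in Go, playing Black, lost to the computer Go program AlphaGo after 289 moves by the narrowest of margins, a quarter of a stone. He now trails 0:1 in the best-of-three "human versus machine" match.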

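After the game, the AlphaGo team took questions from the media and explained the new version of AlphaGo. The new version is stronger than before and has achieved self-learning.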

Q: Is this AlphaGo a "pure" version of AlphaGo? That is, does it learn entirely by itself, without relying on the game records of human masters?

Demis Hassabis: I'm not sure I understand the question correctly, but AlphaGo initially learns from human games, and most of its learning now is from its own play against itself. But of course, to truly test what it knows, we have to play against human experts, because playing the game against itself is not going to expose its weaknesses; it will obviously fix those during the self-play. So we really have to test it against the world's best players.

David Silver: Perhaps I could just add to that. One of the innovations of AlphaGo-Master is that it relies much more on learning from itself. In this version, AlphaGo has actually become its own teacher, learning from moves taken from examples of its own searches, and it relies much less on human data than previous versions. One of our goals in doing so is to make it more and more general, so that its principles can be applied to other domains beyond Go.

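Q: I understand that the Master version is V25. Is the AlphaGo now playing Ke Jie a newer version? Also, is this the last time we will see AlphaGo? Will AlphaGo become a tool that helps professional players keep improving their skills, or will it say goodbye to us from here on?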

David Silver: Maybe I can answer the first part of that question, regarding the technology inside AlphaGo. AlphaGo-Master is a new version of AlphaGo, and we worked very hard to improve the fundamental algorithm that is used in AlphaGo. In fact, it turns out that the algorithm often matters more than the amount of data or the amount of compute that goes into it. If you get the algorithms right and make them general and powerful enough, then they can progress very rapidly. In fact, AlphaGo-Master uses 10 times less computation, and is trained in a matter of weeks rather than months, compared to the version that played against Lee Sedol last year. So it is a different version, and at least in self-play performance it is considerably stronger. We are here to find out whether it is indeed as strong as it seems in self-play, or whether it has weaknesses that can be exposed.

Demis Hassabis: As for the second part of the question, I'll answer that. Later on in the event we will be announcing the next steps for AlphaGo, so I don't want to say anything in advance of that, but we will be talking about it later in the week. One thing I do want to say is that, with the last version of AlphaGo, we published all the technical details and results of the AlphaGo program in an article in the scientific journal Nature. That allowed other companies, such as Tencent and Japanese companies, to make their own versions of AlphaGo, and some of them are very strong now as well, as I'm sure you all know, playing online at probably 9-dan level. We plan to publish more details of the new version of AlphaGo in the next few months. So we will review those technical details, and then again other teams and academic labs will be able to implement their own versions of this AlphaGo-Master architecture.

Q: As more and more top players become unwilling to play against AlphaGo, would you consider having AlphaGo play against AlphaGo?

Demis Hassabis: As I said, we want to use AlphaGo as a tool for the Go community to improve their knowledge about the game. We hope to release some details about the architecture we are using, and maybe also some of the games that AlphaGo plays against itself, so we may make some announcement about this later in the week. But don't forget that, ultimately, the reason we are developing these technologies is also to use them more widely in areas of science and medicine, and to try to help human experts in those areas. So we have a lot of work ahead of us in the coming years.
