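Deep learning

Deep learning (also known as deep structured learning or hierarchical learning) is part of a broader family of machine learning methods based on artificial neural networks. Learning can be supervised, semi-supervised or unsupervised.^[1]^[2]^[3]

Deep learning architectures such as deep neural networks, deep belief networks, recurrent neural networks and convolutional neural networks have been applied to fields including computer vision, speech recognition, natural language processing, audio recognition, social network filtering, machine translation, bioinformatics, drug design, medical image analysis, material inspection and board game programs, where they have produced results comparable to, and in some cases superior to, those of human experts.^[4]^[5]^[6]

Neural networks were inspired by information processing and distributed communication nodes in biological systems. Artificial neural networks differ from biological brains in various ways. Specifically, neural networks tend to be static and symbolic, while the biological brain of most living organisms is dynamic (plastic) and analog.^[7]^[8]^[9]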

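1 Definition

Deep learning is a class of machine learning algorithms that^[10] use multiple layers to progressively extract higher-level features from the raw input. For example, in image processing, lower layers may identify edges, while higher layers may identify parts meaningful to a human, such as digits, letters or faces.

2 Overview

Most modern deep learning models are based on artificial neural networks, specifically convolutional neural networks (CNNs), although they can also include propositional formulas or latent variables organized layer-wise in deep generative models, such as the nodes in deep belief networks and deep Boltzmann machines.^[11]

In deep learning, each level learns to transform its input data into a slightly more abstract and composite representation. In an image recognition application, the raw input may be a matrix of pixels; the first representational layer may abstract the pixels and encode edges; the second layer may compose and encode arrangements of edges; the third layer may encode a nose and eyes; and the fourth layer may recognize that the image contains a face. Importantly, a deep learning process can learn on its own which features to optimally place at which level. (Of course, this does not completely eliminate the need for hand-tuning; for example, varying numbers of layers and layer sizes can provide different degrees of abstraction.)^[1]^[12]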

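The word "deep" in "deep learning" refers to the number of layers through which the data is transformed. More precisely, deep learning systems have a substantial credit assignment path (CAP) depth. The CAP is the chain of transformations from input to output. CAPs describe potentially causal connections between input and output. For a feedforward neural network, the depth of the CAPs is that of the network and is the number of hidden layers plus one (as the output layer is also parameterized). For recurrent neural networks, in which a signal may propagate through a layer more than once, the CAP depth is potentially unlimited.^[2] No universally agreed-upon threshold of depth divides shallow learning from deep learning, but most researchers agree that deep learning involves CAP depth > 2. A CAP of depth 2 has been shown to be a universal approximator in the sense that it can emulate any function. Beyond that, more layers do not add to the function approximator ability of the network. Deep models (CAP > 2) are able to extract better features than shallow models, and hence extra layers help in learning features.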
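The depth rule for feedforward networks can be written down directly. The following is a minimal sketch, not a standard API: the function names `feedforward_cap_depth` and `is_deep` are hypothetical, and the threshold of 2 is only the informal convention mentioned above.

```python
def feedforward_cap_depth(num_hidden_layers: int) -> int:
    """CAP depth of a feedforward network: hidden layers plus the output layer."""
    if num_hidden_layers < 0:
        raise ValueError("num_hidden_layers must be non-negative")
    return num_hidden_layers + 1


def is_deep(cap_depth: int, threshold: int = 2) -> bool:
    """Apply the informal 'CAP depth > 2' convention (an assumption, not a strict rule)."""
    return cap_depth > threshold


if __name__ == "__main__":
    for hidden in (1, 2, 5):
        depth = feedforward_cap_depth(hidden)
        print(f"{hidden} hidden layer(s): CAP depth {depth}, deep: {is_deep(depth)}")
```

Under this convention a network with a single hidden layer has CAP depth 2 and counts as shallow, while two or more hidden layers count as deep.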

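Deep learning architectures are often constructed with a greedy layer-by-layer method. Deep learning helps to disentangle these abstractions and pick out which features improve performance.^[1]

For supervised learning tasks, deep learning methods obviate feature engineering by translating the data into compact intermediate representations akin to principal components, and derive layered structures that remove redundancy in representation.

Deep learning algorithms can be applied to unsupervised learning tasks. This is an important benefit because unlabeled data are more abundant than labeled data. Examples of deep structures that can be trained in an unsupervised manner are neural history compressors^[13] and deep belief networks.^[1]^[14]

3 Interpretations

Deep neural networks are generally interpreted in terms of the universal approximation theorem^[17]^[18]^[19]^[20]^[21]^[22] or probabilistic inference.^[10]^[11]^[1]^[2]^[14]^[15]^[16]

The classic universal approximation theorem concerns the capacity of feedforward neural networks with a single hidden layer of finite size to approximate continuous functions.^[17]^[18]^[19]^[20]^[21] In 1989, George Cybenko published the first proof, for sigmoid activation functions,^[18] and Kurt Hornik generalized it to feedforward multi-layer architectures in 1991.^[19]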
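To give some intuition for single-hidden-layer approximation, the sketch below builds a one-dimensional network by hand rather than by training. It makes several assumptions that are not in the text: it uses ReLU units instead of the sigmoid units of the 1989 proof, it places one hidden unit per evenly spaced knot, and the helper names (`build_relu_interpolant`, `max_error`) are illustrative only.

```python
import math
from typing import Callable, List


def relu(z: float) -> float:
    return z if z > 0.0 else 0.0


def build_relu_interpolant(f: Callable[[float], float], a: float, b: float,
                           n_units: int) -> Callable[[float], float]:
    """Return a one-hidden-layer ReLU network that interpolates f on [a, b].

    Each hidden unit computes relu(x - knot); the output weights are the
    changes in slope of the piecewise-linear interpolant at each knot.
    """
    h = (b - a) / n_units
    knots: List[float] = [a + i * h for i in range(n_units)]
    values = [f(a + i * h) for i in range(n_units + 1)]
    slopes = [(values[i + 1] - values[i]) / h for i in range(n_units)]
    out_weights = [slopes[0]] + [slopes[i] - slopes[i - 1] for i in range(1, n_units)]
    out_bias = values[0]

    def net(x: float) -> float:
        return out_bias + sum(w * relu(x - k) for w, k in zip(out_weights, knots))

    return net


def max_error(f, net, a: float, b: float, samples: int = 1000) -> float:
    """Largest absolute gap between f and the network on a sample grid."""
    grid = (a + (b - a) * i / samples for i in range(samples + 1))
    return max(abs(f(x) - net(x)) for x in grid)


if __name__ == "__main__":
    for units in (4, 32):
        net = build_relu_interpolant(math.sin, 0.0, math.pi, units)
        print(units, "hidden units, max error:", max_error(math.sin, net, 0.0, math.pi))
```

Adding hidden units shrinks the worst-case error on the interval, which is the qualitative behaviour the theorem describes; the construction itself is only one convenient way to exhibit it.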

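The universal approximation theorem for deep neural networks concerns the capacity of networks with bounded width whose depth is allowed to grow. Lu et al.^[22] proved that if the width of a deep neural network with ReLU activation is strictly larger than the input dimension, then the network can approximate any Lebesgue integrable function; if the width is smaller than or equal to the input dimension, then the deep neural network is not a universal approximator.

The probabilistic interpretation^[15] derives from the field of machine learning. It features inference,^[10]^[11]^[1]^[2]^[14]^[15] as well as the optimization concepts of training and testing, related to fitting and generalization, respectively. More specifically, the probabilistic interpretation considers the activation nonlinearity as a cumulative distribution function.^[15] The probabilistic interpretation led to the introduction of dropout as a regularizer in neural networks.^[23] The probabilistic interpretation was introduced by researchers including Hopfield, Widrow and Narendra and popularized in surveys such as the one by Bishop.^[24]

4 History

The term deep learning was introduced to the machine learning community by Rina Dechter in 1986,^[25]^[13] and to artificial neural networks by Igor Aizenberg and colleagues in 2000, in the context of Boolean threshold neurons.^[26]^[27]

Alexey Ivakhnenko and Lapa published the first general, working learning algorithm for supervised, deep, feedforward, multilayer perceptrons in 1965.^[28] A 1971 paper described a deep network with 8 layers trained by the group method of data handling algorithm.^[29]

Other working deep learning architectures, specifically those built for computer vision, began with the Neocognitron introduced by Kunihiko Fukushima in 1980.^[30] In 1989, Yann LeCun et al. applied the standard backpropagation algorithm, which had been around as the reverse mode of automatic differentiation since 1970,^[31]^[32]^[33]^[34] to a deep neural network with the purpose of recognizing handwritten ZIP codes on mail. While the algorithm worked, training required 3 days.^[35]

By 1991 such systems were used for recognizing isolated 2-D handwritten digits, while recognizing 3-D objects was done by matching 2-D images with a handcrafted 3-D object model. Weng et al. suggested that a human brain does not use a monolithic 3-D object model, and in 1992 they published Cresceptron,^[36]^[37]^[38] a method for performing 3-D object recognition in cluttered scenes. Because it directly used natural images, Cresceptron started the beginning of general-purpose visual learning for natural 3-D worlds. Cresceptron is a cascade of layers similar to the Neocognitron. But while the Neocognitron required a human programmer to hand-merge features, Cresceptron learned an open number of features in each layer without supervision, where each feature is represented by a convolution kernel. Cresceptron segmented each learned object from a cluttered scene through back-analysis through the network. Max pooling, now often adopted by deep neural networks (e.g. in ImageNet tests), was first used in Cresceptron to reduce the position resolution by a factor of (2x2) to 1 through the cascade for better generalization.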
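The (2x2)-to-1 reduction performed by max pooling can be illustrated on a small grid. This is a plain-Python sketch under the assumption of non-overlapping 2x2 windows and even image dimensions; it is not taken from the Cresceptron papers, and the function name `max_pool_2x2` is hypothetical.

```python
from typing import List


def max_pool_2x2(image: List[List[float]]) -> List[List[float]]:
    """Replace each non-overlapping 2x2 block with its maximum value."""
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("this sketch assumes even height and width")
    return [
        [
            max(image[r][c], image[r][c + 1], image[r + 1][c], image[r + 1][c + 1])
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


if __name__ == "__main__":
    feature_map = [
        [1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 1, 2, 3],
        [4, 5, 6, 7],
    ]
    # A 4x4 map becomes a 2x2 map: position resolution drops by 2 in each direction.
    print(max_pool_2x2(feature_map))
```

Because only the strongest response in each window survives, a feature that shifts by one pixel inside a window produces the same pooled output, which is one reason pooling is associated with better generalization.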

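In 1994, André de Carvalho, together with Mike Fairhurst and David Bisset, published experimental results of a multi-layer Boolean neural network, also known as a weightless neural network, composed of a three-layer self-organizing feature extraction neural network module (SOFT) followed by a multi-layer classification neural network module (GSN), which were independently trained. Each layer in the feature extraction module extracted features of growing complexity relative to the previous layer.^[39]

In 1995, Brendan Frey demonstrated that it was possible to train (over two days) a network containing six fully connected layers and several hundred hidden units using the wake-sleep algorithm, co-developed with Peter Dayan and Hinton.^[40] Many factors contribute to the slow speed, including the vanishing gradient problem analyzed in 1991 by Sepp Hochreiter.^[41]^[42]

Simpler models that use task-specific handcrafted features such as Gabor filters and support vector machines (SVMs) were a popular choice in the 1990s and 2000s, because of artificial neural networks' computational cost and a lack of understanding of how the brain wires its biological networks.

Both shallow and deep learning (e.g., recurrent nets) of artificial neural networks have been explored for many years.^[43]^[44]^[45] These methods never outperformed non-uniform internal-handcrafting Gaussian mixture model/hidden Markov model (GMM-HMM) technology based on generative models of speech trained discriminatively.^[46] Key difficulties have been analyzed, including gradient diminishing^[41] and weak temporal correlation structure in neural predictive models.^[47]^[48] Additional difficulties were the lack of training data and limited computing power.

Most speech recognition researchers moved away from neural nets to pursue generative modeling. An exception was at SRI International in the late 1990s. Funded by the US National Security Agency and DARPA, SRI studied deep neural networks in speech and speaker recognition. Heck's speaker recognition team achieved the first significant success with deep neural networks in speech processing in the 1998 National Institute of Standards and Technology Speaker Recognition evaluation.^[49] While SRI experienced success with deep neural networks in speaker recognition, they were unsuccessful in demonstrating similar success in speech recognition. The principle of elevating "raw" features over hand-crafted optimization was first explored successfully in the architecture of a deep autoencoder on the "raw" spectrogram or linear filter-bank features in the late 1990s,^[49] showing its superiority over the Mel-Cepstral features that contain stages of fixed transformation from spectrograms. The raw features of speech, waveforms, later produced excellent larger-scale results.^[50]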

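Many aspects of speech recognition were taken over by a deep learning method called long short-term memory (LSTM), a recurrent neural network published by Hochreiter and Schmidhuber in 1997.^[51] LSTM RNNs avoid the vanishing gradient problem and can learn "Very Deep Learning" tasks^[2] that require memories of events that happened thousands of discrete time steps before, which is important for speech. In 2003, LSTM started to become competitive with traditional speech recognizers on certain tasks.^[52] Later it was combined with connectionist temporal classification (CTC)^[53] in stacks of LSTM RNNs.^[54] In 2015, Google's speech recognition reportedly experienced a dramatic performance jump of 49% through CTC-trained LSTM, which was made available through Google Voice Search.^[55]

In 2006, publications by Geoff Hinton, Ruslan Salakhutdinov, Osindero and Teh^[56]^[57]^[58] showed how a many-layered feedforward neural network could be effectively pre-trained one layer at a time, treating each layer in turn as an unsupervised restricted Boltzmann machine, then fine-tuning it using supervised backpropagation.^[59] The papers referred to learning for deep belief nets.

Deep learning is part of state-of-the-art systems in various disciplines, particularly computer vision and automatic speech recognition (ASR). Results on commonly used evaluation sets such as TIMIT (ASR) and MNIST (image classification), as well as a range of large-vocabulary speech recognition tasks, have steadily improved.^[60]^[61]^[62] Convolutional neural networks (CNNs) were superseded for ASR by CTC^[53] for LSTM,^[51]^[55]^[63]^[64]^[65]^[66]^[67] but are more successful in computer vision.

According to Yann LeCun, the impact of deep learning in industry began in the early 2000s, when CNNs already processed an estimated 10% to 20% of all the checks written in the US.^[68] Industrial applications of deep learning to large-scale speech recognition started around 2010.

The 2009 NIPS Workshop on Deep Learning for Speech Recognition^[69] was motivated by the limitations of deep generative models of speech, and the possibility that, given more capable hardware and large-scale data sets, deep neural nets (DNNs) might become practical. It was believed that pre-training DNNs using generative models of deep belief nets (DBNs) would overcome the main difficulties of neural nets.^[70] However, it was discovered that replacing pre-training with large amounts of training data for straightforward backpropagation, when using DNNs with large, context-dependent output layers, produced error rates dramatically lower than the then-state-of-the-art Gaussian mixture model (GMM)/hidden Markov model (HMM) and also lower than more advanced generative model-based systems.^[60]^[71] The nature of the recognition errors produced by the two types of systems was characteristically different,^[72]^[69] offering technical insights into how to integrate deep learning into the existing highly efficient, run-time speech decoding systems deployed by all major speech recognition systems.^[10]^[73]^[74] Analysis around 2009-2010, contrasting the GMM (and other generative speech models) with DNN models, stimulated early industrial investment in deep learning for speech recognition,^[72]^[69] eventually leading to pervasive and dominant use in that industry. That analysis was done with comparable performance (less than 1.5% in error rate) between discriminative DNNs and generative models.^[60]^[72]^[70]^[75]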

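In 2010, researchers extended deep learning from TIMIT to large-vocabulary speech recognition by adopting large output layers of the DNN based on context-dependent HMM states constructed by decision trees.^[76]^[77]^[78]^[73]

Advances in hardware have enabled renewed interest. In 2009, Nvidia was involved in what was called the "big bang" of deep learning, as deep-learning neural networks were trained with Nvidia graphics processing units (GPUs).^[79] That year, Google Brain used Nvidia GPUs to create capable DNNs. While there, Andrew Ng determined that GPUs could increase the speed of deep-learning systems by about 100 times.^[80] In particular, GPUs are well-suited for the matrix/vector math involved in machine learning.^[81]^[82] GPUs speed up training algorithms by orders of magnitude, reducing running times from weeks to days.^[83]^[84] Specialized hardware and algorithm optimizations can be used for efficient processing.^[85]

4.1 Deep learning revolution

Deep learning is a subset of machine learning, which in turn is a subset of artificial intelligence.

In 2012, a team led by Dahl won the "Merck Molecular Activity Challenge" using multi-task deep neural networks to predict the biomolecular target of one drug.^[86]^[87] In 2014, Hochreiter's group used deep learning to detect off-target and toxic effects of environmental chemicals in nutrients, household products and drugs, and won the "Tox21 Data Challenge" of NIH, FDA and NCATS.^[88]^[89]^[90]

Significant additional impacts in image or object recognition were felt from 2011 to 2012. Although CNNs trained by backpropagation had been around for decades, and GPU implementations of neural networks, including CNNs, for years, fast implementations of CNNs with max-pooling on GPUs in the style of Ciresan and colleagues were needed to progress on computer vision.^[81]^[82]^[35]^[91]^[2] In 2011, this approach achieved for the first time superhuman performance in a visual pattern recognition contest. Also in 2011, it won the ICDAR Chinese handwriting contest, and in May 2012, it won the ISBI image segmentation contest.^[92] Until 2011, CNNs did not play a major role at computer vision conferences, but in June 2012, a paper by Ciresan et al. at the leading conference CVPR^[4] showed how max-pooling CNNs on GPU can dramatically improve many vision benchmark records. In October 2012, a similar system by Krizhevsky et al.^[5] won the large-scale ImageNet competition by a significant margin over shallow machine learning methods. In November 2012, the system of Ciresan et al. also won the ICPR contest on analysis of large medical images for cancer detection, and in the following year also the MICCAI Grand Challenge on the same topic.^[93] In 2013 and 2014, the error rate on the ImageNet task using deep learning was further reduced, following a similar trend in large-scale speech recognition. The Wolfram Image Identification project publicized these improvements.^[94]

Image classification was then extended to the more challenging task of generating descriptions (captions) for images, often as a combination of CNNs and LSTMs.^[95]^[96]^[97]^[98]

Some researchers assess that the October 2012 ImageNet victory anchored the start of a "deep learning revolution" that has transformed the AI industry.^[99]

In March 2019, Yoshua Bengio, Geoffrey Hinton and Yann LeCun were awarded the Turing Award for conceptual and engineering breakthroughs that have made deep neural networks a critical component of computing.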

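5 Neural networks

5.1 Artificial neural networks

Artificial neural networks (ANNs) or connectionist systems are computing systems inspired by the biological neural networks that constitute animal brains. Such systems learn (progressively improve their ability) to do tasks by considering examples, generally without task-specific programming. For example, in image recognition, they might learn to identify images that contain cats by analyzing example images that have been manually labeled as "cat" or "no cat" and using the analytic results to identify cats in other images. They have found most use in applications difficult to express with a traditional computer algorithm using rule-based programming.

An ANN is based on a collection of connected units called artificial neurons (analogous to biological neurons in a biological brain). Each connection (synapse) between neurons can transmit a signal to another neuron. The receiving (postsynaptic) neuron can process the signal(s) and then signal downstream neurons connected to it. Neurons may have state, generally represented by real numbers, typically between 0 and 1. Neurons and synapses may also have a weight that varies as learning proceeds, which can increase or decrease the strength of the signal that it sends downstream.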
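A single artificial neuron of this kind can be sketched in a few lines. The choice of a logistic (sigmoid) activation is an assumption made here because it keeps the output between 0 and 1; the text does not prescribe a particular activation, and the names `sigmoid` and `neuron_output` are illustrative.

```python
import math
from typing import Sequence


def sigmoid(z: float) -> float:
    """Logistic function, mapping any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Weighted sum of incoming signals plus a bias, squashed to (0, 1)."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)


if __name__ == "__main__":
    signals = [1.0, 2.0]
    # Raising the weight on a positive input strengthens the signal sent downstream.
    print(neuron_output(signals, [0.5, -0.25], bias=0.1))
    print(neuron_output(signals, [1.5, -0.25], bias=0.1))
```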

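Typically, neurons are organized in layers. Different layers may perform different kinds of transformations on their inputs. Signals travel from the first (input) layer to the last (output) layer, possibly after traversing the layers multiple times.

The original goal of the neural network approach was to solve problems in the same way that a human brain would. Over time, attention focused on matching specific mental abilities, leading to deviations from biology such as backpropagation, or passing information in the reverse direction and adjusting the network to reflect that information.

Neural networks have been used on a variety of tasks, including computer vision, speech recognition, machine translation, social network filtering, playing board and video games, and medical diagnosis.

As of 2017, neural networks typically have a few thousand to a few million units and millions of connections. Despite this number being several orders of magnitude less than the number of neurons in a human brain, these networks can perform many tasks at a level beyond that of humans (e.g., recognizing faces, playing Go^[100]).

5.2 Deep neural networks

A deep neural network (DNN) is an artificial neural network (ANN) with multiple layers between the input and output layers.^[11]^[2] The DNN finds the correct mathematical manipulation to turn the input into the output, whether it be a linear relationship or a non-linear relationship. The network moves through the layers calculating the probability of each output. For example, a DNN that is trained to recognize dog breeds will go over the given image and calculate the probability that the dog in the image is a certain breed. The user can review the results and select which probabilities the network should display (above a certain threshold, etc.) and return the proposed label. Each mathematical manipulation as such is considered a layer, and complex DNNs have many layers, hence the name "deep" networks.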
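The last step of the dog-breed example, turning output scores into probabilities and keeping those above a threshold, might look as follows. This sketch assumes a softmax output layer, which is a common choice but is not named in the text; the breed labels and raw scores are made-up illustrative values.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Convert raw output scores into probabilities that sum to 1."""
    peak = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def labels_above_threshold(labels: Sequence[str], scores: Sequence[float],
                           threshold: float) -> List[Tuple[str, float]]:
    """Return (label, probability) pairs at or above the threshold, most likely first."""
    probs = softmax(scores)
    picked = [(label, p) for label, p in zip(labels, probs) if p >= threshold]
    return sorted(picked, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    breeds = ["beagle", "poodle", "husky"]   # hypothetical classes
    raw_scores = [2.0, 0.5, 0.1]             # hypothetical network outputs
    for label, prob in labels_above_threshold(breeds, raw_scores, threshold=0.5):
        print(f"{label}: {prob:.2f}")
```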

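DNNs can model complex non-linear relationships. DNN architectures generate compositional models where the object is expressed as a layered composition of primitives.^[101] The extra layers enable composition of features from lower layers, potentially modeling complex data with fewer units than a similarly performing shallow network.^[11]

Deep architectures include many variants of a few basic approaches. Each architecture has found success in specific domains. It is not always possible to compare the performance of multiple architectures unless they have been evaluated on the same data sets.

DNNs are typically feedforward networks in which data flows from the input layer to the output layer without looping back. At first, the DNN creates a map of virtual neurons and assigns random numerical values, or "weights", to the connections between them. The weights and inputs are multiplied and return an output between 0 and 1. If the network did not accurately recognize a particular pattern, an algorithm would adjust the weights.^[102] That way the algorithm can make certain parameters more influential, until it determines the correct mathematical manipulation to fully process the data.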
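The cycle of random initial weights, a forward pass, and weight adjustment can be shown on the smallest possible case: one sigmoid neuron learning the logical OR of two inputs. This is a deliberately reduced sketch, not a deep network; the squared-error loss, learning rate, epoch count and all names are assumptions chosen for illustration.

```python
import math
import random
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], float]

OR_DATA: List[Example] = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
                          ((1.0, 0.0), 1.0), ((1.0, 1.0), 1.0)]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights: Sequence[float], bias: float, x: Sequence[float]) -> float:
    """Forward pass: multiply weights and inputs, return a value between 0 and 1."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def mean_squared_error(weights, bias, data: Sequence[Example]) -> float:
    return sum((predict(weights, bias, x) - t) ** 2 for x, t in data) / len(data)


def train(data: Sequence[Example], epochs: int = 2000,
          learning_rate: float = 0.5, seed: int = 0):
    """Start from random weights and nudge them after every example."""
    rng = random.Random(seed)
    weights = [rng.uniform(-1.0, 1.0) for _ in data[0][0]]
    bias = rng.uniform(-1.0, 1.0)
    for _ in range(epochs):
        for x, target in data:
            out = predict(weights, bias, x)
            # Slope of the squared error through the sigmoid (factor 2 folded into the rate).
            delta = (out - target) * out * (1.0 - out)
            weights = [w - learning_rate * delta * xi for w, xi in zip(weights, x)]
            bias -= learning_rate * delta
    return weights, bias


if __name__ == "__main__":
    w, b = train(OR_DATA)
    for x, target in OR_DATA:
        print(x, target, round(predict(w, b, x), 3))
```

A real DNN applies the same idea to many layers at once, with backpropagation supplying the corresponding `delta` for every weight.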

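Recurrent neural networks (RNNs), in which data can flow in any direction, are used for applications such as language modeling.^[103]^[104]^[105]^[106]^[107] Long short-term memory is particularly effective for this use.^[51]^[108]

Convolutional deep neural networks (CNNs) are used in computer vision.^[109] CNNs also have been applied to acoustic modeling for automatic speech recognition (ASR).^[67]

Challenges

As with ANNs, many issues can arise with naively trained DNNs. Two common issues are overfitting and computation time.

DNNs are prone to overfitting because of the added layers of abstraction, which allow them to model rare dependencies in the training data. Regularization methods such as Ivakhnenko's unit pruning^[29] or weight decay (ℓ2-regularization) or sparsity (ℓ1-regularization) can be applied during training to combat overfitting.^[110] Alternatively, dropout regularization randomly omits units from the hidden layers during training. This helps to exclude rare dependencies.^[111] Finally, data can be augmented via methods such as cropping and rotating such that smaller training sets can be increased in size to reduce the chances of overfitting.^[112]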
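Two of the regularizers mentioned above are easy to sketch in isolation. The code below assumes plain gradient descent for the weight-decay step and uses the "inverted dropout" convention (surviving activations are rescaled by the keep probability), which is a common variant rather than something the text specifies; all function names are illustrative.

```python
import random
from typing import List, Sequence


def l2_penalty(weights: Sequence[float], strength: float) -> float:
    """Weight-decay term added to the loss: strength * sum of squared weights."""
    return strength * sum(w * w for w in weights)


def weight_decay_step(weights: Sequence[float], grads: Sequence[float],
                      learning_rate: float, strength: float) -> List[float]:
    """One gradient step on loss + l2_penalty; the penalty pulls weights toward zero."""
    return [w - learning_rate * (g + 2.0 * strength * w) for w, g in zip(weights, grads)]


def dropout(activations: Sequence[float], drop_prob: float,
            rng: random.Random) -> List[float]:
    """Randomly zero hidden activations during training, rescaling the survivors."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    keep = 1.0 - drop_prob
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


if __name__ == "__main__":
    rng = random.Random(0)
    print(weight_decay_step([1.0, -2.0], [0.0, 0.0], learning_rate=0.1, strength=0.5))
    print(dropout([1.0, 1.0, 1.0, 1.0, 1.0, 1.0], drop_prob=0.5, rng=rng))
```

At test time dropout is switched off; with the inverted convention no further rescaling is then needed.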

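DNNs must consider many training parameters, such as the size (number of layers and number of units per layer), the learning rate, and initial weights. Sweeping through the parameter space for optimal parameters may not be feasible due to the cost in time and computational resources. Various tricks, such as batching (computing the gradient on several training examples at once rather than individual examples),^[113] speed up computation. Large processing capabilities of many-core architectures (such as GPUs or the Intel Xeon Phi) have produced significant speedups in training, because of the suitability of such processing architectures for the matrix and vector computations.^[114]^[115]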
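Batching can be illustrated with a one-parameter model. The sketch below fits `y = w * x` by averaging the gradient over each mini-batch before updating; the data, learning rate and function names are assumptions for illustration, and on real hardware the per-batch sum would be a single vectorized operation rather than a Python loop.

```python
from typing import Iterator, List, Sequence, Tuple

Pair = Tuple[float, float]


def batch_gradient(w: float, batch: Sequence[Pair]) -> float:
    """Average gradient of 0.5 * (w * x - y)**2 over all examples in the batch."""
    return sum((w * x - y) * x for x, y in batch) / len(batch)


def minibatches(data: Sequence[Pair], batch_size: int) -> Iterator[Sequence[Pair]]:
    """Yield consecutive slices of the data, batch_size examples at a time."""
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]


def fit(data: Sequence[Pair], batch_size: int,
        learning_rate: float = 0.05, epochs: int = 200) -> float:
    """Gradient descent with one weight update per mini-batch."""
    w = 0.0
    for _ in range(epochs):
        for batch in minibatches(data, batch_size):
            w -= learning_rate * batch_gradient(w, batch)
    return w


if __name__ == "__main__":
    data: List[Pair] = [(float(x), 3.0 * x) for x in range(1, 5)]  # true slope is 3
    print(fit(data, batch_size=2))
```

With `batch_size=1` the same code performs one update per example, and with `batch_size=len(data)` it performs full-batch gradient descent.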

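Alternatively, engineers may look for other types of neural networks with more straightforward and convergent training algorithms. CMAC (cerebellar model articulation controller) is one such kind of neural network. It does not require learning rates or randomized initial weights. The training process can be guaranteed to converge in one step with a new batch of data, and the computational complexity of the training algorithm is linear with respect to the number of neurons involved.^[116]^[117]

6 Applications

6.1 Automatic speech recognition

Large-scale automatic speech recognition is the first and most convincing successful case of deep learning. LSTM RNNs can learn "Very Deep Learning" tasks^[2] that involve multi-second intervals containing speech events separated by thousands of discrete time steps, where one time step corresponds to about 10 ms. LSTM with forget gates^[108] is competitive with traditional speech recognizers on certain tasks.^[52]

The initial success in speech recognition was based on small-scale recognition tasks based on TIMIT. The data set contains 630 speakers from eight major dialects of American English, where each speaker reads 10 sentences.^[118] Its small size lets many configurations be tried. More importantly, the TIMIT task concerns phone-sequence recognition, which, unlike word-sequence recognition, allows weak phone bigram language models. This lets the strength of the acoustic modeling aspects of speech recognition be more easily analyzed. The error rates listed below, including these early results and measured as percent phone error rates (PER), have been summarized since 1991.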

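Method	Phone error rate (PER, %)
Randomly Initialized RNN^[119]	26.1
Bayesian Triphone GMM-HMM	25.6
Hidden Trajectory (Generative) Model	24.8
Monophone Randomly Initialized DNN	23.4
Monophone DBN-DNN	22.4
Triphone GMM-HMM with BMMI Training	21.7
Monophone DBN-DNN on fbank	20.7
Convolutional DNN^[120]	20.0
Convolutional DNN w. Heterogeneous Pooling	18.7
Ensemble DNN/CNN/RNN^[121]	18.3
Bidirectional LSTM	17.9
Hierarchical Convolutional Deep Maxout Network^[122]	16.5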
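The article does not spell out how PER is computed. A minimal sketch, assuming the usual definition (substitutions, deletions and insertions from an edit-distance alignment, divided by the number of reference phones), is given below; the phone sequences in the example are made up and the function names are illustrative.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Minimum number of substitutions, deletions and insertions turning ref into hyp."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def phone_error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """PER in percent, relative to the length of the reference phone sequence."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    reference = ["k", "ae", "t", "s"]    # hypothetical reference phones
    hypothesis = ["k", "ae", "d", "s"]   # one substitution
    print(phone_error_rate(reference, hypothesis))
```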

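The debut of DNNs for speaker recognition in the late 1990s and speech recognition around 2009-2011, and of LSTM around 2003-2007, accelerated progress in eight major areas:^[10]^[75]^[73]

Scale-up/out and accelerated DNN training and decoding
Sequence discriminative training
Feature processing by deep models with solid understanding of the underlying mechanisms
Adaptation of DNNs and related deep models
Multi-task and transfer learning by DNNs and related deep models
CNNs and how to design them to best exploit domain knowledge of speech
RNN and its rich LSTM variants
Other types of deep models including tensor-based models and integrated deep generative/discriminative models.

All major commercial speech recognition systems (e.g., Microsoft Cortana, Xbox, Skype Translator, Amazon Alexa, Google Now, Apple Siri, Baidu and iFlyTek voice search, and a range of Nuance speech products, etc.) are based on deep learning.^[10]^[123]^[124]^[125]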

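6.2 Image recognition

A common evaluation set for image classification is the MNIST database data set. MNIST is composed of handwritten digits and includes 60,000 training examples and 10,000 test examples. As with TIMIT, its small size lets users test multiple configurations. A comprehensive list of results on this set is available.^[126]

Deep learning-based image recognition has become "superhuman", producing more accurate results than human contestants. This first occurred in 2011.^[127]

Deep learning-trained vehicles now interpret 360° camera views.^[128] Another example is Facial Dysmorphology Novel Analysis (FDNA), used to analyze cases of human malformation connected to a large database of genetic syndromes.

6.3 Visual art processing

Closely related to the progress that has been made in image recognition is the increasing application of deep learning techniques to various visual art tasks. DNNs have proven themselves capable, for example, of a) identifying the style period of a given painting, b) neural style transfer, capturing the style of a given artwork and applying it in a visually pleasing manner to an arbitrary photograph or video, and c) generating striking imagery based on random visual input fields.^[129]^[130]

6.4 Natural language processing

Neural networks have been used for implementing language models since the early 2000s.^[103]^[131] LSTM helped to improve machine translation and language modeling.^[104]^[105]^[106]

Other key techniques in this field are negative sampling^[132] and word embedding. Word embedding, such as word2vec, can be thought of as a representational layer in a deep learning architecture that transforms an atomic word into a positional representation of the word relative to other words in the dataset; the position is represented as a point in a vector space. Using word embedding as an RNN input layer allows the network to parse sentences and phrases using an effective compositional vector grammar. A compositional vector grammar can be thought of as a probabilistic context-free grammar (PCFG) implemented by an RNN.^[133] Recursive auto-encoders built atop word embeddings can assess sentence similarity and detect paraphrasing.^[133] Deep neural architectures provide the best results for constituency parsing,^[134] sentiment analysis,^[135] information retrieval,^[136]^[137] spoken language understanding,^[138] machine translation,^[104]^[139] contextual entity linking,^[139] writing style recognition,^[140] text classification^[141] and others.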
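The idea that an embedding places each word at a point in a vector space, so that related words lie close together, can be shown with cosine similarity. The three-dimensional vectors below are invented for illustration and are not real word2vec output; the helper names are likewise hypothetical.

```python
import math
from typing import Dict, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(word: str, embeddings: Dict[str, Sequence[float]]) -> str:
    """Return the other word whose vector is most similar to that of `word`."""
    target = embeddings[word]
    others = (w for w in embeddings if w != word)
    return max(others, key=lambda w: cosine_similarity(target, embeddings[w]))


# Toy, hand-written embeddings (assumed values, for illustration only).
TOY_EMBEDDINGS = {
    "king": [0.90, 0.80, 0.10],
    "queen": [0.85, 0.82, 0.15],
    "apple": [0.10, 0.20, 0.95],
}

if __name__ == "__main__":
    print(nearest("king", TOY_EMBEDDINGS))
    print(round(cosine_similarity(TOY_EMBEDDINGS["king"], TOY_EMBEDDINGS["apple"]), 3))
```

In a trained model the vectors have hundreds of dimensions and are learned from co-occurrence statistics, but nearest-neighbour lookups work the same way.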

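Recent developments generalize word embedding to sentence embedding.

Google Translate (GT) uses a large end-to-end long short-term memory network.^[142]^[143]^[144]^[145]^[146]^[147] Google Neural Machine Translation (GNMT) uses an example-based machine translation method in which the system "learns from millions of examples".^[143] It translates "whole sentences at a time, rather than pieces". Google Translate supports over one hundred languages.^[143] The network encodes the "semantics of the sentence rather than simply memorizing phrase-to-phrase translations".^[143]^[148] GT uses English as an intermediate between most language pairs.^[148]

6.5 Drug discovery and toxicology

A large percentage of candidate drugs fail to win regulatory approval. These failures are caused by insufficient efficacy (on-target effect), undesired interactions (off-target effects), or unanticipated toxic effects.^[149]^[150] Research has explored use of deep learning to predict the biomolecular targets,^[86]^[87] off-targets, and toxic effects of environmental chemicals in nutrients, household products and drugs.^[88]^[89]^[90]

AtomNet is a deep learning system for structure-based rational drug design.^[151] AtomNet was used to predict novel candidate biomolecules for disease targets such as the Ebola virus^[152] and multiple sclerosis.^[153]^[154]

6.6 Customer relationship management

Deep reinforcement learning has been used to approximate the value of possible direct marketing actions, defined in terms of RFM variables. The estimated value function was shown to have a natural interpretation as customer lifetime value.^[155]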

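6.7 Recommendation systems

Recommendation systems have used deep learning to extract meaningful features for a latent factor model for content-based music recommendations.^[156] Multiview deep learning has been applied for learning user preferences from multiple domains.^[157] The model uses a hybrid collaborative and content-based approach and enhances recommendations in multiple tasks.

6.8 Bioinformatics

An autoencoder ANN was used in bioinformatics to predict gene ontology annotations and gene-function relationships.^[158]

In medical informatics, deep learning was used to predict sleep quality based on data from wearables^[159] and to predict health complications from electronic health record data.^[160] Deep learning has also shown efficacy in healthcare.^[161]

6.9 Medical image analysis

Deep learning has been shown to produce competitive results in medical applications such as cancer cell classification, lesion detection, organ segmentation and image enhancement.^[162]^[163]

6.10 Mobile advertising

Finding the appropriate mobile audience for mobile advertising is always challenging, since many data points must be considered and assimilated before a target segment can be created and used in ad serving by any ad server.^[164] Deep learning has been used to interpret large, many-dimensioned advertising datasets. Many data points are collected during the request/serve/click internet advertising cycle. This information can form the basis of machine learning to improve ad selection.

6.11 Image restoration

Deep learning has been successfully applied to inverse problems such as denoising, super-resolution, inpainting, and film colorization. These applications include learning methods such as "Shrinkage Fields for Effective Image Restoration",^[165] which trains on an image dataset, and Deep Image Prior, which trains on the image that needs restoration.

6.12 Financial fraud detection

Deep learning is being successfully applied to financial fraud detection and anti-money laundering. "Deep anti-money laundering detection system can spot and recognize relationships and similarities between data and, further down the road, learn to detect anomalies or classify and predict specific events". The solution leverages both supervised learning techniques, such as the classification of suspicious transactions, and unsupervised learning, e.g. anomaly detection.^[166]

6.13 Military

The United States Department of Defense applied deep learning to train robots in new tasks through observation.^[167]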

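7 Relation to human cognitive and brain development

Deep learning is closely related to a class of theories of brain development (specifically, neocortical development) proposed by cognitive neuroscientists in the early 1990s.^[168]^[169]^[170]^[171] These developmental theories were instantiated in computational models, making them predecessors of deep learning systems. These developmental models share the property that various proposed learning dynamics in the brain (e.g., a wave of nerve growth factor) support the self-organization somewhat analogous to the neural networks utilized in deep learning models. Like the neocortex, neural networks employ a hierarchy of layered filters in which each layer considers information from a prior layer (or the operating environment), and then passes its output (and possibly the original input) to other layers. This process yields a self-organizing stack of transducers, well-tuned to their operating environment. A 1995 description stated, "...the infant's brain seems to organize itself under the influence of waves of so-called trophic-factors ... different regions of the brain become connected sequentially, with one layer of tissue maturing before another and so on until the whole brain is mature."^[172]

A variety of approaches have been used to investigate the plausibility of deep learning models from a neurobiological perspective. On the one hand, several variants of the backpropagation algorithm have been proposed in order to increase its processing realism.^[173]^[174] Other researchers have argued that unsupervised forms of deep learning, such as those based on hierarchical generative models and deep belief networks, may be closer to biological reality.^[175]^[176] In this respect, generative neural network models have been related to neurobiological evidence about sampling-based processing in the cerebral cortex.^[177]

Although a systematic comparison between the human brain organization and the neuronal encoding in deep networks has not yet been established, several analogies have been reported. For example, the computations performed by deep learning units could be similar to those of actual neurons^[178]^[179] and neural populations.^[180] Similarly, the representations developed by deep learning models are similar to those measured in the primate visual system^[181] both at the single-unit^[182] and at the population^[183] levels.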

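8 Commercial activity

Many organizations employ deep learning for particular applications. Facebook's AI lab performs tasks such as automatically tagging uploaded pictures with the names of the people in them.^[184]

Google's DeepMind Technologies developed a system capable of learning how to play Atari video games using only pixels as data input. In 2015 they demonstrated their AlphaGo system, which learned the game of Go well enough to beat a professional Go player.^[185]^[186]^[187] Google Translate uses an LSTM to translate between more than 100 languages.

In 2015, Blippar demonstrated a mobile augmented reality application that uses deep learning to recognize objects in real time.^[188]

As of 2008,^[189] researchers at The University of Texas at Austin (UT) developed a machine learning framework called Training an Agent Manually via Evaluative Reinforcement, or TAMER, which proposed new methods for robots or computer programs to learn how to perform tasks by interacting with a human instructor.^[167]

Building on TAMER, a new algorithm called Deep TAMER was introduced in 2018 during a collaboration between the U.S. Army Research Laboratory (ARL) and UT researchers. Deep TAMER used deep learning to provide a robot the ability to learn new tasks through observation.^[167]

Using Deep TAMER, a robot learned a task with a human trainer, watching video streams or observing a human perform a task in person. The robot later practiced the task with the help of some coaching from the trainer, who provided feedback such as "good job" and "bad job".^[190]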

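9 Criticism and comment

Deep learning has attracted both criticism and comment, in some cases from outside the field of computer science.

9.1 Theory

A main criticism concerns the lack of theory surrounding some methods.^[191] Learning in the most common deep architectures is implemented using well-understood gradient descent. However, the theory surrounding other algorithms, such as contrastive divergence, is less clear (e.g., does it converge? If so, how fast? What is it approximating?). Deep learning methods are often looked at as a black box, with most confirmations done empirically rather than theoretically.^[192]

Others point out that deep learning should be looked at as a step towards realizing strong AI, not as an all-encompassing solution. Despite the power of deep learning methods, they still lack much of the functionality needed for realizing this goal entirely. Research psychologist Gary Marcus noted:

"Realistically, deep learning is only part of the larger challenge of building intelligent machines. Such techniques lack ways of representing causal relationships (...) have no obvious ways of performing logical inferences, and they are also still a long way from integrating abstract knowledge, such as information about what objects are, what they are for, and how they are typically used. The most powerful A.I. systems, like Watson (...) use techniques like deep learning as just one element in a very complicated ensemble of techniques, ranging from the statistical technique of Bayesian inference to deductive reasoning."^[193]

As an alternative to this emphasis on the limits of deep learning, one author speculated that it might be possible to train a machine vision stack to perform the sophisticated task of discriminating between "old master" and amateur figure drawings, and hypothesized that such a sensitivity might represent the rudiments of a non-trivial machine empathy.^[194] This same author proposed that this would be in line with anthropology, which identifies a concern with aesthetics as a key element of behavioral modernity.^[195]

In further reference to the idea that artistic sensitivity might inhere within relatively low levels of the cognitive hierarchy, a published series of graphic representations of the internal states of deep (20-30 layers) neural networks attempting to discern within essentially random data the images on which they were trained^[196] demonstrate a visual appeal: the original research notice received well over 1,000 comments, and was the subject of what was for a time the most frequently accessed article on The Guardian's^[197] website.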

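9.2 Errors

Some deep learning architectures display problematic behaviors,^[198] such as confidently classifying unrecognizable images as belonging to a familiar category of ordinary images^[199] and misclassifying minuscule perturbations of correctly classified images.^[200] Goertzel hypothesized that these behaviors are due to limitations in their internal representations and that these limitations would inhibit integration into heterogeneous multi-component artificial general intelligence (AGI) architectures.^[198] These issues may possibly be addressed by deep learning architectures that internally form states homologous to image-grammar^[201] decompositions of observed entities and events.^[198] Learning a grammar (visual or linguistic) from training data would be equivalent to restricting the system to commonsense reasoning that operates on concepts in terms of grammatical production rules and is a basic goal of both human language acquisition^[202] and artificial intelligence (AI).^[203]

9.3 Cyberthreat

As deep learning moves from the lab into the world, research and experience show that artificial neural networks are vulnerable to hacks and deception. By identifying patterns that these systems use to function, attackers can modify inputs to ANNs in such a way that the ANN finds a match that human observers would not recognize. For example, an attacker can make subtle changes to an image such that the ANN finds a match even though the image looks to a human nothing like the search target. Such a manipulation is termed an "adversarial attack". In 2016, researchers used one ANN to doctor images in trial-and-error fashion, identify another's focal points and thereby generate images that deceived it. The modified images looked no different to human eyes. Another group showed that printouts of doctored images then photographed successfully tricked an image classification system.^[204] One defense is reverse image search, in which a possible fake image is submitted to a site such as TinEye that can then find other instances of it. A refinement is to search using only parts of the image, to identify images from which that piece may have been taken.^[205]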
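How a barely visible change can flip a classifier's decision is easiest to see on a linear model. The sketch below uses a sign-of-gradient perturbation (in the style of the fast gradient sign method), which is a technique substituted here for illustration; it is not the ANN-driven trial-and-error procedure described above, and the weights and input are made-up values.

```python
from typing import List, Sequence


def score(weights: Sequence[float], bias: float, x: Sequence[float]) -> float:
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def classify(weights: Sequence[float], bias: float, x: Sequence[float]) -> int:
    """Toy linear classifier: class 1 if the score is positive, else class 0."""
    return 1 if score(weights, bias, x) > 0.0 else 0


def sign(v: float) -> int:
    return (v > 0) - (v < 0)


def adversarial_example(weights: Sequence[float], x: Sequence[float],
                        true_label: int, epsilon: float) -> List[float]:
    """Shift every feature by at most epsilon in the direction that hurts the true class.

    For a linear score the gradient with respect to the input is the weight vector,
    so the worst-case bounded perturbation follows the sign of each weight.
    """
    direction = -1 if true_label == 1 else 1
    return [xi + direction * epsilon * sign(w) for xi, w in zip(x, weights)]


if __name__ == "__main__":
    weights = [1.0, -1.0] * 8   # hypothetical 16-feature classifier
    bias = 0.0
    clean = [0.55, 0.45] * 8    # hypothetical input, classified as class 1
    doctored = adversarial_example(weights, clean, true_label=1, epsilon=0.06)
    print(classify(weights, bias, clean), classify(weights, bias, doctored))
```

Each feature moves by only 0.06, yet the many small shifts add up across dimensions and change the predicted class, which mirrors why high-dimensional image classifiers are sensitive to such perturbations.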

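Another group showed that certain psychedelic spectacles could fool a facial recognition system into thinking ordinary people were celebrities, potentially allowing one person to impersonate another. In 2017, researchers added stickers to stop signs and caused an ANN to misclassify them.^[204]

ANNs can however be further trained to detect attempts at deception, potentially leading attackers and defenders into an arms race similar to the kind that already defines the malware defense industry. ANNs have been trained to defeat ANN-based anti-malware software by repeatedly attacking a defense with malware that was continually altered by a genetic algorithm until it tricked the anti-malware while retaining its ability to damage the target.^[204]

Another group demonstrated that certain sounds could make the Google Now voice command system open a particular web address that would download malware.^[204]

In "data poisoning", false data is continually smuggled into a machine learning system's training set to prevent it from achieving mastery.^[204]

References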

[1]
^Bengio, Y.; Courville, A.; Vincent, P. (2013). "Representation Learning: A Review and New Perspectives". IEEE Transactions on Pattern Analysis and Machine Intelligence. 35 (8): 1798–1828. arXiv:1206.5538. doi:10.1109/tpami.2013.50. PMID 23787338..
[2]
^Schmidhuber, J. (2015). "Deep Learning in Neural Networks: An Overview". Neural Networks. 61: 85–117. arXiv:1404.7828. doi:10.1016/j.neunet.2014.09.003. PMID 25462637..
[3]
^Bengio, Yoshua; LeCun, Yann; Hinton, Geoffrey (2015). "Deep Learning". Nature. 521 (7553): 436–444. Bibcode:2015Natur.521..436L. doi:10.1038/nature14539. PMID 26017442..
[4]
^Ciresan, Dan; Meier, U.; Schmidhuber, J. (June 2012). "Multi-column deep neural networks for image classification". 2012 IEEE Conference on Computer Vision and Pattern Recognition: 3642–3649. arXiv:1202.2745. doi:10.1109/cvpr.2012.6248110. ISBN 978-1-4673-1228-8..
[5]
^Krizhevsky, Alex; Sutskever, Ilya; Hinton, Geoffrey (2012). "ImageNet Classification with Deep Convolutional Neural Networks" (PDF). NIPS 2012: Neural Information Processing Systems, Lake Tahoe, Nevada.
[6]
^"Google's AlphaGo AI wins three-match series against the world's best Go player". TechCrunch. 25 May 2017..
[7]
^Marblestone, Adam H.; Wayne, Greg; Kording, Konrad P. (2016). "Toward an Integration of Deep Learning and Neuroscience". Frontiers in Computational Neuroscience. 10: 94. doi:10.3389/fncom.2016.00094. PMC 5021692. PMID 27683554..
[8]
^Olshausen, B. A. (1996). "Emergence of simple-cell receptive field properties by learning a sparse code for natural images". Nature. 381 (6583): 607–609. Bibcode:1996Natur.381..607O. doi:10.1038/381607a0. PMID 8637596..
[9]
^Bengio, Yoshua; Lee, Dong-Hyun; Bornschein, Jorg; Mesnard, Thomas; Lin, Zhouhan (2015-02-13). "Towards Biologically Plausible Deep Learning". arXiv:1502.04156 [cs.LG]..
[10]
^Deng, L.; Yu, D. (2014). "Deep Learning: Methods and Applications" (PDF). Foundations and Trends in Signal Processing. 7 (3–4): 1–199. doi:10.1561/2000000039..
[11]
^Bengio, Yoshua (2009). "Learning Deep Architectures for AI" (PDF). Foundations and Trends in Machine Learning. 2 (1): 1–127. CiteSeerX 10.1.1.701.9550. doi:10.1561/2200000006..
[12]
^LeCun, Yann; Bengio, Yoshua; Hinton, Geoffrey (28 May 2015). "Deep learning". Nature. 521 (7553): 436–444. Bibcode:2015Natur.521..436L. doi:10.1038/nature14539. PMID 26017442..
[13]
^Jürgen Schmidhuber (2015). Deep Learning. Scholarpedia, 10(11):32832. Online.
[14]
^Hinton, G.E. (2009). "Deep belief networks". Scholarpedia. 4 (5): 5947. Bibcode:2009SchpJ...4.5947H. doi:10.4249/scholarpedia.5947..
[15]
^Murphy, Kevin P. (24 August 2012). Machine Learning: A Probabilistic Perspective. MIT Press. ISBN 978-0-262-01802-9..
[16]
^Patel, Ankit; Nguyen, Tan; Baraniuk, Richard (2016). "A Probabilistic Framework for Deep Learning" (PDF). Advances in Neural Information Processing Systems..
[17]
^Balázs Csanád Csáji (2001). Approximation with Artificial Neural Networks; Faculty of Sciences; Eötvös Loránd University, Hungary.
[18]
^Cybenko (1989). "Approximations by superpositions of sigmoidal functions" (PDF). Mathematics of Control, Signals, and Systems. 2 (4): 303–314. doi:10.1007/bf02551274. Archived from the original (PDF) on 2015-10-10..
[19]
^Hornik, Kurt (1991). "Approximation Capabilities of Multilayer Feedforward Networks". Neural Networks. 4 (2): 251–257. doi:10.1016/0893-6080(91)90009-t..
[20]
^Haykin, Simon S. (1999). Neural Networks: A Comprehensive Foundation. Prentice Hall. ISBN 978-0-13-273350-2..
[21]
^Hassoun, Mohamad H. (1995). Fundamentals of Artificial Neural Networks. MIT Press. p. 48. ISBN 978-0-262-08239-6..
[22]
^Lu, Z., Pu, H., Wang, F., Hu, Z., & Wang, L. (2017). The Expressive Power of Neural Networks: A View from the Width. Neural Information Processing Systems, 6231-6239..
[23]
^Hinton, G. E.; Srivastava, N.; Krizhevsky, A.; Sutskever, I.; Salakhutdinov, R.R. (2012). "Improving neural networks by preventing co-adaptation of feature detectors". arXiv:1207.0580 [math.LG]..
[24]
^Bishop, Christopher M. (2006). Pattern Recognition and Machine Learning (PDF). Springer. ISBN 978-0-387-31073-2..
[25]
^Rina Dechter (1986). Learning while searching in constraint-satisfaction problems. University of California, Computer Science Department, Cognitive Systems Laboratory.Online.
[26]
^Igor Aizenberg, Naum N. Aizenberg, Joos P.L. Vandewalle (2000). Multi-Valued and Universal Binary Neurons: Theory, Learning and Applications. Springer Science & Business Media..
[27]
^Co-evolving recurrent neurons learn deep memory POMDPs. Proc. GECCO, Washington, D. C., pp. 1795-1802, ACM Press, New York, NY, USA, 2005..
[28]
^Ivakhnenko, A. G. (1973). Cybernetic Predicting Devices. CCM Information Corporation..
[29]
^Ivakhnenko, Alexey (1971). "Polynomial theory of complex systems". IEEE Transactions on Systems, Man and Cybernetics. 1 (4): 364–378. doi:10.1109/TSMC.1971.4308320..
[30]
^Fukushima, K. (1980). "Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position". Biol. Cybern. 36 (4): 193–202. doi:10.1007/bf00344251. PMID 7370364..
[31]
^Seppo Linnainmaa (1970). The representation of the cumulative rounding error of an algorithm as a Taylor expansion of the local rounding errors. Master's Thesis (in Finnish), Univ. Helsinki, 6-7..
[32]
^Griewank, Andreas (2012). "Who Invented the Reverse Mode of Differentiation?" (PDF). Documenta Mathematica (Extra Volume ISMP): 389–400.
[33]
^Werbos, P. (1974). "Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences". Harvard University. Retrieved 12 June 2017..
[34]
^Werbos, Paul (1982). "Applications of advances in nonlinear sensitivity analysis" (PDF). System modeling and optimization. Springer. pp. 762–770..
[35]
^LeCun et al., "Backpropagation Applied to Handwritten Zip Code Recognition," Neural Computation, 1, pp. 541–551, 1989..
[36]
^J. Weng, N. Ahuja and T. S. Huang, "Cresceptron: a self-organizing neural network which grows adaptively," Proc. International Joint Conference on Neural Networks, Baltimore, Maryland, vol I, pp. 576-581, June, 1992..
[37]
^J. Weng, N. Ahuja and T. S. Huang, "Learning recognition and segmentation of 3-D objects from 2-D images," Proc. 4th International Conf. Computer Vision, Berlin, Germany, pp. 121-128, May, 1993..
[38]
^J. Weng, N. Ahuja and T. S. Huang, "Learning recognition and segmentation using the Cresceptron," International Journal of Computer Vision, vol. 25, no. 2, pp. 105-139, Nov. 1997..
[39]
^de Carvalho, Andre C. L. F.; Fairhurst, Mike C.; Bisset, David (1994-08-08). "An integrated Boolean neural network for pattern classification". Pattern Recognition Letters. 15 (8): 807–813. doi:10.1016/0167-8655(94)90009-4..
[40]
^Hinton, Geoffrey E.; Dayan, Peter; Frey, Brendan J.; Neal, Radford (1995-05-26). "The wake-sleep algorithm for unsupervised neural networks". Science. 268 (5214): 1158–1161. Bibcode:1995Sci...268.1158H. doi:10.1126/science.7761831..
[41]
^S. Hochreiter., "Untersuchungen zu dynamischen neuronalen Netzen," Diploma thesis. Institut f. Informatik, Technische Univ. Munich. Advisor: J. Schmidhuber, 1991..
[42]
^Hochreiter, S.; et al. (15 January 2001). "Gradient flow in recurrent nets: the difficulty of learning long-term dependencies". In Kolen, John F.; Kremer, Stefan C. A Field Guide to Dynamical Recurrent Networks. John Wiley & Sons. ISBN 978-0-7803-5369-5..
[43]
^Morgan, Nelson; Bourlard, Hervé; Renals, Steve; Cohen, Michael; Franco, Horacio (1993-08-01). "Hybrid neural network/hidden markov model systems for continuous speech recognition". International Journal of Pattern Recognition and Artificial Intelligence. 07 (4): 899–916. doi:10.1142/s0218001493000455. ISSN 0218-0014..
[44]
^Robinson, T. (1992). "A real-time recurrent error propagation network word recognition system". ICASSP: 617–620..
[45]
^Waibel, A.; Hanazawa, T.; Hinton, G.; Shikano, K.; Lang, K. J. (March 1989). "Phoneme recognition using time-delay neural networks". IEEE Transactions on Acoustics, Speech, and Signal Processing. 37 (3): 328–339. doi:10.1109/29.21701. ISSN 0096-3518..
[46]
^Baker, J.; Deng, Li; Glass, Jim; Khudanpur, S.; Lee, C.-H.; Morgan, N.; O'Shaughnessy, D. (2009). "Research Developments and Directions in Speech Recognition and Understanding, Part 1". IEEE Signal Processing Magazine. 26 (3): 75–80. Bibcode:2009ISPM...26...75B. doi:10.1109/msp.2009.932166..
[47]
^Bengio, Y. (1991). "Artificial Neural Networks and their Application to Speech/Sequence Recognition". McGill University Ph.D. thesis..
[48]
^Deng, L.; Hassanein, K.; Elmasry, M. (1994). "Analysis of correlation structure for a neural predictive model with applications to speech recognition". Neural Networks. 7 (2): 331–339. doi:10.1016/0893-6080(94)90027-2..
[49]
^Heck, L.; Konig, Y.; Sonmez, M.; Weintraub, M. (2000). "Robustness to Telephone Handset Distortion in Speaker Recognition by Discriminative Feature Design". Speech Communication. 31 (2): 181–192. doi:10.1016/s0167-6393(99)00077-1..
[50]
^"Acoustic Modeling with Deep Neural Networks Using Raw Time Signal for LVCSR (PDF Download Available)". ResearchGate. Retrieved 2017-06-14..
[51]
^Hochreiter, Sepp; Schmidhuber, Jürgen (1997-11-01). "Long Short-Term Memory". Neural Computation. 9 (8): 1735–1780. doi:10.1162/neco.1997.9.8.1735. ISSN 0899-7667. PMID 9377276..
[52]
^Graves, Alex; Eck, Douglas; Beringer, Nicole; Schmidhuber, Jürgen (2003). "Biologically Plausible Speech Recognition with LSTM Neural Nets" (PDF). 1st Intl. Workshop on Biologically Inspired Approaches to Advanced Information Technology, Bio-ADIT 2004, Lausanne, Switzerland. pp. 175–184..
[53]
^Graves, Alex; Fernández, Santiago; Gomez, Faustino (2006). "Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks". Proceedings of the International Conference on Machine Learning, ICML 2006: 369–376. CiteSeerX 10.1.1.75.6306..
[54]
^Santiago Fernandez, Alex Graves, and Jürgen Schmidhuber (2007). An application of recurrent neural networks to discriminative keyword spotting. Proceedings of ICANN (2), pp. 220–229..
[55]
^Sak, Haşim; Senior, Andrew; Rao, Kanishka; Beaufays, Françoise; Schalkwyk, Johan (September 2015). "Google voice search: faster and more accurate"..
[56]
^Hinton, Geoffrey E. (2007-10-01). "Learning multiple layers of representation". Trends in Cognitive Sciences. 11 (10): 428–434. doi:10.1016/j.tics.2007.09.004. ISSN 1364-6613. PMID 17921042..
[57]
^Hinton, G. E.; Osindero, S.; Teh, Y. W. (2006). "A Fast Learning Algorithm for Deep Belief Nets" (PDF). Neural Computation. 18 (7): 1527–1554. doi:10.1162/neco.2006.18.7.1527. PMID 16764513..
[58]
^Bengio, Yoshua (2012). "Practical recommendations for gradient-based training of deep architectures". arXiv:1206.5533 [cs.LG]..
[59]
^G. E. Hinton., "Learning multiple layers of representation," Trends in Cognitive Sciences, 11, pp. 428–434, 2007..
[60]
^Hinton, G.; Deng, L.; Yu, D.; Dahl, G.; Mohamed, A.; Jaitly, N.; Senior, A.; Vanhoucke, V.; Nguyen, P.; Sainath, T.; Kingsbury, B. (2012). "Deep Neural Networks for Acoustic Modeling in Speech Recognition --- The shared views of four research groups". IEEE Signal Processing Magazine. 29 (6): 82–97. doi:10.1109/msp.2012.2205597..
[61]
^Deng, Li; Hinton, Geoffrey; Kingsbury, Brian (1 May 2013). "New types of deep neural network learning for speech recognition and related applications: An overview" – via research.microsoft.com..
[62]
^Deng, L.; Li, J.; Huang, J. T.; Yao, K.; Yu, D.; Seide, F.; Seltzer, M.; Zweig, G.; He, X. (May 2013). "Recent advances in deep learning for speech research at Microsoft". 2013 IEEE International Conference on Acoustics, Speech and Signal Processing: 8604–8608. doi:10.1109/icassp.2013.6639345. ISBN 978-1-4799-0356-6..
[63]
^Sak, Hasim; Senior, Andrew; Beaufays, Francoise (2014). "Long Short-Term Memory recurrent neural network architectures for large scale acoustic modeling" (PDF)..
[64]
^Li, Xiangang; Wu, Xihong (2014). "Constructing Long Short-Term Memory based Deep Recurrent Neural Networks for Large Vocabulary Speech Recognition". arXiv:1410.4281 [cs.CL]..
[65]
^Zen, Heiga; Sak, Hasim (2015). "Unidirectional Long Short-Term Memory Recurrent Neural Network with Recurrent Output Layer for Low-Latency Speech Synthesis" (PDF). Google.com. ICASSP. pp. 4470–4474..
[66]
^Deng, L.; Abdel-Hamid, O.; Yu, D. (2013). "A deep convolutional neural network using heterogeneous pooling for trading acoustic invariance with phonetic confusion" (PDF). Google.com. ICASSP..
[67]
^Sainath, T. N.; Mohamed, A. r; Kingsbury, B.; Ramabhadran, B. (May 2013). "Deep convolutional neural networks for LVCSR". 2013 IEEE International Conference on Acoustics, Speech and Signal Processing: 8614–8618. doi:10.1109/icassp.2013.6639347. ISBN 978-1-4799-0356-6..
[68]
^Yann LeCun (2016). Slides on Deep Learning Online.
[69]
^NIPS Workshop: Deep Learning for Speech Recognition and Related Applications, Whistler, BC, Canada, Dec. 2009 (Organizers: Li Deng, Geoff Hinton, D. Yu)..
[70]
^Keynote talk: Recent Developments in Deep Neural Networks. ICASSP, 2013 (by Geoff Hinton)..
[71]
^D. Yu, L. Deng, G. Li, and F. Seide (2011). "Discriminative pretraining of deep neural networks," U.S. Patent Filing..
[72]
^Deng, L.; Hinton, G.; Kingsbury, B. (2013). "New types of deep neural network learning for speech recognition and related applications: An overview (ICASSP)" (PDF)..
[73]
^Yu, D.; Deng, L. (2014). Automatic Speech Recognition: A Deep Learning Approach (Publisher: Springer). ISBN 978-1-4471-5779-3..
[74]
^"Deng receives prestigious IEEE Technical Achievement Award - Microsoft Research". Microsoft Research. 3 December 2015..
[75]
^Li, Deng (September 2014). "Keynote talk: 'Achievements and Challenges of Deep Learning - From Speech Analysis and Recognition To Language and Multimodal Processing'". Interspeech..
[76]
^Yu, D.; Deng, L. (2010). "Roles of Pre-Training and Fine-Tuning in Context-Dependent DBN-HMMs for Real-World Speech Recognition". NIPS Workshop on Deep Learning and Unsupervised Feature Learning..
[77]
^Seide, F.; Li, G.; Yu, D. (2011). "Conversational speech transcription using context-dependent deep neural networks". Interspeech..
[78]
^Deng, Li; Li, Jinyu; Huang, Jui-Ting; Yao, Kaisheng; Yu, Dong; Seide, Frank; Seltzer, Mike; Zweig, Geoff; He, Xiaodong (2013-05-01). "Recent Advances in Deep Learning for Speech Research at Microsoft". Microsoft Research..
[79]
^"Nvidia CEO bets big on deep learning and VR". Venture Beat. April 5, 2016..
[80]
^"From not working to neural networking". The Economist..
[81]
^Oh, K.-S.; Jung, K. (2004). "GPU implementation of neural networks". Pattern Recognition. 37 (6): 1311–1314. doi:10.1016/j.patcog.2004.01.013..
[82]
^Chellapilla, K., Puri, S., and Simard, P. (2006). High performance convolutional neural networks for document processing. International Workshop on Frontiers in Handwriting Recognition..
[83]
^Cireşan, Dan Claudiu; Meier, Ueli; Gambardella, Luca Maria; Schmidhuber, Jürgen (2010-09-21). "Deep, Big, Simple Neural Nets for Handwritten Digit Recognition". Neural Computation. 22 (12): 3207–3220. arXiv:1003.0358. doi:10.1162/neco_a_00052. ISSN 0899-7667. PMID 20858131..
[84]
^Raina, Rajat; Madhavan, Anand; Ng, Andrew Y. (2009). "Large-scale Deep Unsupervised Learning Using Graphics Processors". Proceedings of the 26th Annual International Conference on Machine Learning. ICML '09. New York, NY, USA: ACM: 873–880. CiteSeerX 10.1.1.154.372. doi:10.1145/1553374.1553486. ISBN 9781605585161..
[85]
^Sze, Vivienne; Chen, Yu-Hsin; Yang, Tien-Ju; Emer, Joel (2017). "Efficient Processing of Deep Neural Networks: A Tutorial and Survey". arXiv:1703.09039 [cs.CV]..
[86]
^"Announcement of the winners of the Merck Molecular Activity Challenge"..
[87]
^"Multi-task Neural Networks for QSAR Predictions | Data Science Association". www.datascienceassn.org. Retrieved 2017-06-14..
[88]
^"Toxicology in the 21st century Data Challenge".
[89]
^"NCATS Announces Tox21 Data Challenge Winners"..
[90]
^"Archived copy". Archived from the original on 2015-02-28. Retrieved 2015-03-05.
[91]
^Ciresan, D. C.; Meier, U.; Masci, J.; Gambardella, L. M.; Schmidhuber, J. (2011). "Flexible, High Performance Convolutional Neural Networks for Image Classification" (PDF). International Joint Conference on Artificial Intelligence. doi:10.5591/978-1-57735-516-8/ijcai11-210..
[92]
^Ciresan, Dan; Giusti, Alessandro; Gambardella, Luca M.; Schmidhuber, Juergen (2012). Pereira, F.; Burges, C. J. C.; Bottou, L.; Weinberger, K. Q., eds. Advances in Neural Information Processing Systems 25 (PDF). Curran Associates, Inc. pp. 2843–2851..
[93]
^Ciresan, D.; Giusti, A.; Gambardella, L.M.; Schmidhuber, J. (2013). "Mitosis Detection in Breast Cancer Histology Images using Deep Neural Networks". Proceedings MICCAI. Lecture Notes in Computer Science. 7908: 411–418. doi:10.1007/978-3-642-40763-5_51. ISBN 978-3-642-38708-1..
[94]
^"The Wolfram Language Image Identification Project". www.imageidentify.com. Retrieved 2017-03-22..
[95]
^Vinyals, Oriol; Toshev, Alexander; Bengio, Samy; Erhan, Dumitru (2014). "Show and Tell: A Neural Image Caption Generator". arXiv:1411.4555 [cs.CV]...
[96]
^Fang, Hao; Gupta, Saurabh; Iandola, Forrest; Srivastava, Rupesh; Deng, Li; Dollár, Piotr; Gao, Jianfeng; He, Xiaodong; Mitchell, Margaret; Platt, John C; Lawrence Zitnick, C; Zweig, Geoffrey (2014). "From Captions to Visual Concepts and Back". arXiv:1411.4952 [cs.CV]...
[97]
^Kiros, Ryan; Salakhutdinov, Ruslan; Zemel, Richard S (2014). "Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models". arXiv:1411.2539 [cs.LG].
[98]
^Zhong, Sheng-hua; Liu, Yan; Liu, Yang (2011). "Bilinear Deep Learning for Image Classification". Proceedings of the 19th ACM International Conference on Multimedia. MM '11. New York, NY, USA: ACM: 343–352. doi:10.1145/2072298.2072344. ISBN 9781450306164..
[99]
^"Why Deep Learning Is Suddenly Changing Your Life". Fortune. 2016. Retrieved 13 April 2018..
[100]
^Silver, David; Huang, Aja; Maddison, Chris J.; Guez, Arthur; Sifre, Laurent; Driessche, George van den; Schrittwieser, Julian; Antonoglou, Ioannis; Panneershelvam, Veda (January 2016). "Mastering the game of Go with deep neural networks and tree search". Nature. 529 (7587): 484–489. Bibcode:2016Natur.529..484S. doi:10.1038/nature16961. ISSN 1476-4687. PMID 26819042..
[101]
^Szegedy, Christian; Toshev, Alexander; Erhan, Dumitru (2013). "Deep neural networks for object detection". Advances in Neural Information Processing Systems.
[102]
^Hof, Robert D. "Is Artificial Intelligence Finally Coming into Its Own?". MIT Technology Review. Retrieved 2018-07-10..
[103]
^Gers, Felix A.; Schmidhuber, Jürgen (2001). "LSTM Recurrent Networks Learn Simple Context Free and Context Sensitive Languages". IEEE Trans. Neural Netw. 12 (6): 1333–1340. doi:10.1109/72.963769. PMID 18249962..
[104]
^Sutskever, I.; Vinyals, O.; Le, Q. (2014). "Sequence to Sequence Learning with Neural Networks" (PDF). Proc. NIPS.
[105]
^Jozefowicz, Rafal; Vinyals, Oriol; Schuster, Mike; Shazeer, Noam; Wu, Yonghui (2016). "Exploring the Limits of Language Modeling". arXiv:1602.02410 [cs.CL]..
[106]
^Gillick, Dan; Brunk, Cliff; Vinyals, Oriol; Subramanya, Amarnag (2015). "Multilingual Language Processing from Bytes". arXiv:1512.00103 [cs.CL]..
[107]
^Mikolov, T.; et al. (2010). "Recurrent neural network based language model" (PDF). Interspeech.
[108]
^"Learning Precise Timing with LSTM Recurrent Networks". ResearchGate. Retrieved 2017-06-13.
[109]
^LeCun, Y.; et al. (1998). "Gradient-based learning applied to document recognition". Proceedings of the IEEE. 86 (11): 2278–2324. doi:10.1109/5.726791..
[110]
^Bengio, Y.; Boulanger-Lewandowski, N.; Pascanu, R. (May 2013). "Advances in optimizing recurrent networks". 2013 IEEE International Conference on Acoustics, Speech and Signal Processing: 8624–8628. arXiv:1212.0901. CiteSeerX 10.1.1.752.9151. doi:10.1109/icassp.2013.6639349. ISBN 978-1-4799-0356-6..
[111]
^Dahl, G.; et al. (2013). "Improving DNNs for LVCSR using rectified linear units and dropout" (PDF). ICASSP..
[112]
^"Data Augmentation - deeplearning.ai | Coursera". Coursera. Retrieved 2017-11-30..
[113]
^Hinton, G. E. (2010). "A Practical Guide to Training Restricted Boltzmann Machines". Tech. Rep. UTML TR 2010-003.
[114]
^You, Yang; Buluç, Aydın; Demmel, James (November 2017). "Scaling deep learning on GPU and Knights Landing clusters". SC '17, ACM. Retrieved 5 March 2018.
[115]
^Viebke, André; Memeti, Suejb; Pllana, Sabri; Abraham, Ajith (March 2017). "CHAOS: a parallelization scheme for training convolutional neural networks on Intel Xeon Phi". The Journal of Supercomputing. 75: 197–227. doi:10.1007/s11227-017-1994-x..
[116]
^Qin, Ting; et al. (2004). "A learning algorithm of CMAC based on RLS". Neural Processing Letters. 19 (1): 49–61.
[117]
^Qin, Ting; et al. (2005). "Continuous CMAC-QRLS and its systolic array". Neural Processing Letters. 22 (1): 1–16.
[118]
^TIMIT Acoustic-Phonetic Continuous Speech Corpus. Linguistic Data Consortium, Philadelphia.
[119]
^Robinson, Tony (30 September 1991). "Several Improvements to a Recurrent Error Propagation Network Phone Recognition System". Cambridge University Engineering Department Technical Report. CUED/F-INFENG/TR82. doi:10.13140/RG.2.2.15418.90567..
[120]
^Abdel-Hamid, O.; et al. (2014). "Convolutional Neural Networks for Speech Recognition". IEEE/ACM Transactions on Audio, Speech, and Language Processing. 22 (10): 1533–1545. doi:10.1109/taslp.2014.2339736..
[121]
^Deng, L.; Platt, J. (2014). "Ensemble Deep Learning for Speech Recognition" (PDF). Proc. Interspeech..
[122]
^Tóth, László (2015). "Phone Recognition with Hierarchical Convolutional Deep Maxout Networks" (PDF). EURASIP Journal on Audio, Speech, and Music Processing. 2015. doi:10.1186/s13636-015-0068-3.
[123]
^"How Skype Used AI to Build Its Amazing New Language Translator". Wired. www.wired.com. Retrieved 2017-06-14.
[124]
^Hannun, Awni; Case, Carl; Casper, Jared; Catanzaro, Bryan; Diamos, Greg; Elsen, Erich; Prenger, Ryan; Satheesh, Sanjeev; Sengupta, Shubho; Coates, Adam; Ng, Andrew Y (2014). "Deep Speech: Scaling up end-to-end speech recognition". arXiv:1412.5567 [cs.CL]..
[125]
^"Plenary presentation at ICASSP-2016" (PDF)..
[126]
^"MNIST handwritten digit database, Yann LeCun, Corinna Cortes and Chris Burges". yann.lecun.com..
[127]
^Cireşan, Dan; Meier, Ueli; Masci, Jonathan; Schmidhuber, Jürgen (August 2012). "Multi-column deep neural network for traffic sign classification". Neural Networks. Selected Papers from IJCNN 2011. 32: 333–338. CiteSeerX 10.1.1.226.8219. doi:10.1016/j.neunet.2012.02.023. PMID 22386783..
[128]
^Talbot, David (2015-01-06). "Nvidia Demos a Car Computer Trained with 'Deep Learning'". MIT Technology Review.
[129]
^G. W. Smith; Frederic Fol Leymarie (10 April 2017). "The Machine as Artist: An Introduction". Arts. Retrieved 4 October 2017..
[130]
^Blaise Agüera y Arcas (29 September 2017). "Art in the Age of Machine Intelligence". Arts. Retrieved 4 October 2017..
[131]
^Bengio, Yoshua; Ducharme, Réjean; Vincent, Pascal; Janvin, Christian (March 2003). "A Neural Probabilistic Language Model". J. Mach. Learn. Res. 3: 1137–1155. ISSN 1532-4435..
[132]
^Goldberg, Yoav; Levy, Omar (2014). "word2vec Explained: Deriving Mikolov et al.'s Negative-Sampling Word-Embedding Method". arXiv:1402.3722 [cs.CL]..
[133]
^Socher, Richard; Manning, Christopher. "Deep Learning for NLP" (PDF). Retrieved 26 October 2014..
[134]
^Socher, Richard; Bauer, John; Manning, Christopher; Ng, Andrew (2013). "Parsing With Compositional Vector Grammars" (PDF). Proceedings of the ACL 2013 Conference..
[135]
^Socher, Richard (2013). "Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank" (PDF)..
[136]
^Shen, Yelong; He, Xiaodong; Gao, Jianfeng; Deng, Li; Mesnil, Gregoire (2014-11-01). "A Latent Semantic Model with Convolutional-Pooling Structure for Information Retrieval". Microsoft Research..
[137]
^Huang, Po-Sen; He, Xiaodong; Gao, Jianfeng; Deng, Li; Acero, Alex; Heck, Larry (2013-10-01). "Learning Deep Structured Semantic Models for Web Search using Clickthrough Data". Microsoft Research..
[138]
^Mesnil, G.; Dauphin, Y.; Yao, K.; Bengio, Y.; Deng, L.; Hakkani-Tur, D.; He, X.; Heck, L.; Tur, G.; Yu, D.; Zweig, G. (2015). "Using recurrent neural networks for slot filling in spoken language understanding". IEEE Transactions on Audio, Speech, and Language Processing. 23 (3): 530–539. doi:10.1109/taslp.2014.2383614..
[139]
^Gao, Jianfeng; He, Xiaodong; Yih, Scott Wen-tau; Deng, Li (2014-06-01). "Learning Continuous Phrase Representations for Translation Modeling". Microsoft Research..
[140]
^Brocardo, Marcelo Luiz; Traore, Issa; Woungang, Isaac; Obaidat, Mohammad S. (2017). "Authorship verification using deep belief network systems". International Journal of Communication Systems. 30 (12): e3259. doi:10.1002/dac.3259..
[141]
^"Deep Learning for Natural Language Processing: Theory and Practice (CIKM2014 Tutorial) - Microsoft Research". Microsoft Research. Retrieved 2017-06-14..
[142]
^Turovsky, Barak (November 15, 2016). "Found in translation: More accurate, fluent sentences in Google Translate". The Keyword Google Blog. Retrieved March 23, 2017..
[143]
^Schuster, Mike; Johnson, Melvin; Thorat, Nikhil (November 22, 2016). "Zero-Shot Translation with Google's Multilingual Neural Machine Translation System". Google Research Blog. Retrieved March 23, 2017..
[144]
^Sepp Hochreiter; Jürgen Schmidhuber (1997). "Long short-term memory". Neural Computation. 9 (8): 1735–1780. doi:10.1162/neco.1997.9.8.1735. PMID 9377276..
[145]
^Felix A. Gers; Jürgen Schmidhuber; Fred Cummins (2000). "Learning to Forget: Continual Prediction with LSTM". Neural Computation. 12 (10): 2451–2471. CiteSeerX 10.1.1.55.5709. doi:10.1162/089976600300015015..
[146]
^Wu, Yonghui; Schuster, Mike; Chen, Zhifeng; Le, Quoc V; Norouzi, Mohammad; Macherey, Wolfgang; Krikun, Maxim; Cao, Yuan; Gao, Qin; Macherey, Klaus; Klingner, Jeff; Shah, Apurva; Johnson, Melvin; Liu, Xiaobing; Kaiser, Łukasz; Gouws, Stephan; Kato, Yoshikiyo; Kudo, Taku; Kazawa, Hideto; Stevens, Keith; Kurian, George; Patil, Nishant; Wang, Wei; Young, Cliff; Smith, Jason; Riesa, Jason; Rudnick, Alex; Vinyals, Oriol; Corrado, Greg; et al. (2016). "Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation". arXiv:1609.08144 [cs.CL]..
[147]
^Metz, Cade (September 27, 2016). "An Infusion of AI Makes Google Translate More Powerful Than Ever". Wired. https://www.wired.com/2016/09/google-claims-ai-breakthrough-machine-translation/.
[148]
^Boitet, Christian; Blanchon, Hervé; Seligman, Mark; Bellynck, Valérie (2010). "MT on and for the Web" (PDF). Retrieved December 1, 2016..
[149]
^Arrowsmith, J; Miller, P (2013). "Trial watch: Phase II and phase III attrition rates 2011-2012". Nature Reviews Drug Discovery. 12 (8): 569. doi:10.1038/nrd4090. PMID 23903212..
[150]
^Verbist, B; Klambauer, G; Vervoort, L; Talloen, W; The Qstar, Consortium; Shkedy, Z; Thas, O; Bender, A; Göhlmann, H. W.; Hochreiter, S (2015). "Using transcriptomics to guide lead optimization in drug discovery projects: Lessons learned from the QSTAR project". Drug Discovery Today. 20 (5): 505–513. doi:10.1016/j.drudis.2014.12.014. PMID 25582842..
[151]
^Wallach, Izhar; Dzamba, Michael; Heifets, Abraham (2015-10-09). "AtomNet: A Deep Convolutional Neural Network for Bioactivity Prediction in Structure-based Drug Discovery". arXiv:1510.02855 [cs.LG]..
[152]
^"Toronto startup has a faster way to discover effective medicines". The Globe and Mail. Retrieved 2015-11-09..
[153]
^"Startup Harnesses Supercomputers to Seek Cures". KQED Future of You. Retrieved 2015-11-09..
[154]
^"Toronto startup has a faster way to discover effective medicines".
[155]
^Tkachenko, Yegor (April 8, 2015). "Autonomous CRM Control via CLV Approximation with Deep Reinforcement Learning in Discrete and Continuous Action Space". arXiv:1504.01840 [cs.LG]..
[156]
^van den Oord, Aaron; Dieleman, Sander; Schrauwen, Benjamin (2013). Burges, C. J. C.; Bottou, L.; Welling, M.; Ghahramani, Z.; Weinberger, K. Q., eds. Advances in Neural Information Processing Systems 26 (PDF). Curran Associates, Inc. pp. 2643–2651..
[157]
^Elkahky, Ali Mamdouh; Song, Yang; He, Xiaodong (2015-05-01). "A Multi-View Deep Learning Approach for Cross Domain User Modeling in Recommendation Systems". Microsoft Research..
[158]
^Chicco, Davide; Sadowski, Peter; Baldi, Pierre (1 January 2014). "Deep Autoencoder Neural Networks for Gene Ontology Annotation Predictions". Proceedings of the 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics - BCB '14. ACM. pp. 533–540. doi:10.1145/2649387.2649442. hdl:11311/964622. ISBN 9781450328944.
[159]
^Sathyanarayana, Aarti (2016-01-01). "Sleep Quality Prediction From Wearable Data Using Deep Learning". JMIR mHealth and uHealth. 4 (4): e125. doi:10.2196/mhealth.6562. PMC 5116102. PMID 27815231..
[160]
^Choi, Edward; Schuetz, Andy; Stewart, Walter F.; Sun, Jimeng (2016-08-13). "Using recurrent neural network models for early detection of heart failure onset". Journal of the American Medical Informatics Association. 24 (2): 361–370. doi:10.1093/jamia/ocw112. ISSN 1067-5027. PMC 5391725. PMID 27521897..
[161]
^"Deep Learning in Healthcare: Challenges and Opportunities". Medium. 2016-08-12. Retrieved 2018-04-10..
[162]
^Litjens, Geert; Kooi, Thijs; Bejnordi, Babak Ehteshami; Setio, Arnaud Arindra Adiyoso; Ciompi, Francesco; Ghafoorian, Mohsen; van der Laak, Jeroen A.W.M.; van Ginneken, Bram; Sánchez, Clara I. (December 2017). "A survey on deep learning in medical image analysis". Medical Image Analysis. 42: 60–88. doi:10.1016/j.media.2017.07.005..
[163]
^Forslid, Gustav; Wieslander, Hakan; Bengtsson, Ewert; Wahlby, Carolina; Hirsch, Jan-Michael; Stark, Christina Runow; Sadanandan, Sajith Kecheril (October 2017). "Deep Convolutional Neural Networks for Detecting Cellular Changes Due to Malignancy". 2017 IEEE International Conference on Computer Vision Workshops (ICCVW). Venice: IEEE: 82–89. doi:10.1109/ICCVW.2017.18. ISBN 9781538610343..
[164]
^De, Shaunak; Maity, Abhishek; Goel, Vritti; Shitole, Sanjay; Bhattacharya, Avik (2017). "Predicting the popularity of instagram posts for a lifestyle magazine using deep learning". 2nd IEEE Conference on Communication Systems, Computing and IT Applications: 174–177. doi:10.1109/CSCITA.2017.8066548. ISBN 978-1-5090-4381-1..
[165]
^Schmidt, Uwe; Roth, Stefan (2014). "Shrinkage Fields for Effective Image Restoration" (PDF). 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[166]
^Czech, Tomasz. "Deep learning: the next frontier for money laundering detection". Global Banking and Finance Review..
[167]
^"Army researchers develop new algorithms to train robots". EurekAlert!. Retrieved 2018-08-29..
[168]
^Utgoff, P. E.; Stracuzzi, D. J. (2002). "Many-layered learning". Neural Computation. 14 (10): 2497–2529. doi:10.1162/08997660260293319. PMID 12396572..
[169]
^Elman, Jeffrey L. (1998). Rethinking Innateness: A Connectionist Perspective on Development. MIT Press. ISBN 978-0-262-55030-7..
[170]
^Shrager, J.; Johnson, MH (1996). "Dynamic plasticity influences the emergence of function in a simple cortical array". Neural Networks. 9 (7): 1119–1129. doi:10.1016/0893-6080(96)00033-0. PMID 12662587..
[171]
^Quartz, SR; Sejnowski, TJ (1997). "The neural basis of cognitive development: A constructivist manifesto". Behavioral and Brain Sciences. 20 (4): 537–556. CiteSeerX 10.1.1.41.7854. doi:10.1017/s0140525x97001581..
[172]
^Blakeslee, S. (1995). "In brain's early growth, timetable may be critical". The New York Times, Science Section. pp. B5–B6.
[173]
^Mazzoni, P.; Andersen, R. A.; Jordan, M. I. (1991-05-15). "A more biologically plausible learning rule for neural networks". Proceedings of the National Academy of Sciences. 88 (10): 4433–4437. Bibcode:1991PNAS...88.4433M. doi:10.1073/pnas.88.10.4433. ISSN 0027-8424. PMC 51674. PMID 1903542..
[174]
^O'Reilly, Randall C. (1996-07-01). "Biologically Plausible Error-Driven Learning Using Local Activation Differences: The Generalized Recirculation Algorithm". Neural Computation. 8 (5): 895–938. doi:10.1162/neco.1996.8.5.895. ISSN 0899-7667..
[175]
^Testolin, Alberto; Zorzi, Marco (2016). "Probabilistic Models and Generative Neural Networks: Towards an Unified Framework for Modeling Normal and Impaired Neurocognitive Functions". Frontiers in Computational Neuroscience. 10: 73. doi:10.3389/fncom.2016.00073. ISSN 1662-5188. PMC 4943066. PMID 27468262..
[176]
^Testolin, Alberto; Stoianov, Ivilin; Zorzi, Marco (September 2017). "Letter perception emerges from unsupervised deep learning and recycling of natural image features". Nature Human Behaviour. 1 (9): 657–664. doi:10.1038/s41562-017-0186-2. ISSN 2397-3374..
[177]
^Buesing, Lars; Bill, Johannes; Nessler, Bernhard; Maass, Wolfgang (2011-11-03). "Neural Dynamics as Sampling: A Model for Stochastic Computation in Recurrent Networks of Spiking Neurons". PLOS Computational Biology. 7 (11): e1002211. Bibcode:2011PLSCB...7E2211B. doi:10.1371/journal.pcbi.1002211. ISSN 1553-7358. PMC 3207943. PMID 22096452..
[178]
^Morel, Danielle; Singh, Chandan; Levy, William B. (2018-01-25). "Linearization of excitatory synaptic integration at no extra cost". Journal of Computational Neuroscience. 44 (2): 173–188. doi:10.1007/s10827-017-0673-5. ISSN 0929-5313. PMID 29372434..
[179]
^Cash, S.; Yuste, R. (February 1999). "Linear summation of excitatory inputs by CA1 pyramidal neurons". Neuron. 22 (2): 383–394. doi:10.1016/s0896-6273(00)81098-3. ISSN 0896-6273. PMID 10069343..
[180]
^Olshausen, B; Field, D (2004-08-01). "Sparse coding of sensory inputs". Current Opinion in Neurobiology. 14 (4): 481–487. doi:10.1016/j.conb.2004.07.007. ISSN 0959-4388..
[181]
^Yamins, Daniel L K; DiCarlo, James J (March 2016). "Using goal-driven deep learning models to understand sensory cortex". Nature Neuroscience. 19 (3): 356–365. doi:10.1038/nn.4244. ISSN 1546-1726..
[182]
^Zorzi, Marco; Testolin, Alberto (2018-02-19). "An emergentist perspective on the origin of number sense". Phil. Trans. R. Soc. B. 373 (1740): 20170043. doi:10.1098/rstb.2017.0043. ISSN 0962-8436. PMC 5784047. PMID 29292348..
[183]
^Güçlü, Umut; van Gerven, Marcel A. J. (2015-07-08). "Deep Neural Networks Reveal a Gradient in the Complexity of Neural Representations across the Ventral Stream". Journal of Neuroscience. 35 (27): 10005–10014. arXiv:1411.6422. doi:10.1523/jneurosci.5023-14.2015. PMID 26157000..
[184]
^Metz, C. (12 December 2013). "Facebook's 'Deep Learning' Guru Reveals the Future of AI". Wired..
[185]
^"Google AI algorithm masters ancient game of Go". Nature News & Comment. Retrieved 2016-01-30..
[186]
^Silver, David; Huang, Aja; Maddison, Chris J.; Guez, Arthur; Sifre, Laurent; Driessche, George van den; Schrittwieser, Julian; Antonoglou, Ioannis; Panneershelvam, Veda; Lanctot, Marc; Dieleman, Sander; Grewe, Dominik; Nham, John; Kalchbrenner, Nal; Sutskever, Ilya; Lillicrap, Timothy; Leach, Madeleine; Kavukcuoglu, Koray; Graepel, Thore; Hassabis, Demis (28 January 2016). "Mastering the game of Go with deep neural networks and tree search". Nature. 529 (7587): 484–489. Bibcode:2016Natur.529..484S. doi:10.1038/nature16961. ISSN 0028-0836. PMID 26819042..
[187]
^"A Google DeepMind Algorithm Uses Deep Learning and More to Master the Game of Go". MIT Technology Review. Retrieved 2016-01-30.
[188]
^"Blippar Demonstrates New Real-Time Augmented Reality App". TechCrunch..
[189]
^"TAMER: Training an Agent Manually via Evaluative Reinforcement". ieeexplore.ieee.org. Retrieved 2018-08-29.
[190]
^"Talk to the Algorithms: AI Becomes a Faster Learner". governmentciomedia.com. Retrieved 2018-08-29..
[191]
^Marcus, Gary (2018-01-14). "In defense of skepticism about deep learning". Gary Marcus. Retrieved 2018-10-11..
[192]
^Knight, Will (2017-03-14). "DARPA is funding projects that will try to open up AI's black boxes". MIT Technology Review. Retrieved 2017-11-02..
[193]
^Marcus, Gary (November 25, 2012). "Is "Deep Learning" a Revolution in Artificial Intelligence?". The New Yorker. Retrieved 2017-06-14..
[194]
^Smith, G. W. (March 27, 2015). "Art and Artificial Intelligence". ArtEnt. Archived from the original on June 25, 2017. Retrieved March 27, 2015.
[195]
^Mellars, Paul (February 1, 2005). "The Impossible Coincidence: A Single-Species Model for the Origins of Modern Human Behavior in Europe" (PDF). Evolutionary Anthropology: Issues, News, and Reviews. Retrieved April 5, 2017..
[196]
^Alexander Mordvintsev; Christopher Olah; Mike Tyka (June 17, 2015). "Inceptionism: Going Deeper into Neural Networks". Google Research Blog. Retrieved June 20, 2015..
[197]
^Alex Hern (June 18, 2015). "Yes, androids do dream of electric sheep". The Guardian. Retrieved June 20, 2015..
[198]
^Goertzel, Ben (2015). "Are there Deep Reasons Underlying the Pathologies of Today's Deep Learning Algorithms?" (PDF).
[199]
^Nguyen, Anh; Yosinski, Jason; Clune, Jeff (2014). "Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images". arXiv:1412.1897 [cs.CV]..
[200]
^Szegedy, Christian; Zaremba, Wojciech; Sutskever, Ilya; Bruna, Joan; Erhan, Dumitru; Goodfellow, Ian; Fergus, Rob (2013). "Intriguing properties of neural networks". arXiv:1312.6199 [cs.CV]..
[201]
^Zhu, S.C.; Mumford, D. (2006). "A stochastic grammar of images". Found. Trends Comput. Graph. Vis. 2 (4): 259–362. CiteSeerX 10.1.1.681.2190. doi:10.1561/0600000018..
[202]
^Miller, G. A.; Chomsky, N. (1957). "Pattern conception". Paper for Conference on Pattern Detection, University of Michigan.
[203]
^Eisner, Jason. "Deep Learning of Recursive Structure: Grammar Induction".
[204]
^"AI Is Easy to Fool—Why That Needs to Change". Singularity Hub. 2017-10-10. Retrieved 2017-10-11..
[205]
^Gibney, Elizabeth (2017). "The scientist who spots fake videos". Nature. doi:10.1038/nature.2017.22784.
