Theory of Neural Information Processing Systems
A. C. C. Coolen, R. Kühn, P. Sollich
ISBN: 9780198530244, 0198530242
Table of contents:
Contents
Part I: Introduction to neural networks
1.1 Principles of neural information processing
1.2 Biological neurons and model neurons
1.3 Universality of McCulloch–Pitts neurons
1.4 Exercises
2.1 Linear separability
2.2 Multilayer networks
2.3 The perceptron
2.4 Learning in layered networks: error backpropagation
2.5 Learning dynamics in small learning rate perceptrons
2.6 Numerical simulations
2.7 Exercises
3 Recurrent networks with binary neurons
3.1 Noiseless recurrent networks
3.2 Synaptic symmetry and Lyapunov functions
3.3 Information processing in recurrent networks
3.4 Exercises
4 Notes and suggestions for further reading
Part II: Advanced neural networks
5.1 Vector quantization
5.2 Soft vector quantization
5.3 Time-dependent learning rates
5.4 Self-organizing maps
5.5 Exercises
6.1 Preliminaries and introduction
6.2 Bayesian learning of network weights
6.3 Predictions with error bars: real-valued functions
6.4 Predictions with error bars: binary classification
6.5 Bayesian model selection
6.6 Practicalities: measuring curvature
6.7 Exercises
7.1 The underlying idea
7.2 Examples of networks reducing to Gaussian processes
7.3 The ‘priors over functions’ point of view
7.4 Stationary covariance functions
7.5 Learning and prediction with Gaussian processes
7.6 Exercises
8.1 Optimal separating plane for linearly separable tasks
8.2 Representation in terms of support vectors
8.3 Preprocessing and SVM kernels
8.4 Exercises
9 Notes and suggestions for further reading
Part III: Information theory and neural networks
10.1 Brute force: counting messages
10.2 Exploiting message likelihood differences via coding
10.3 Proposal for a measure of information
11.1 Coding theory and the Kraft inequality
11.2 Entropy and optimal coding
11.3 Shannon’s original proof
12.1 Entropy
12.2 Joint and conditional entropy
12.3 Relative entropy and mutual information
12.4 Information measures for continuous random variables
12.5 Exercises
13.1 Maximum likelihood estimation
13.2 The maximum entropy principle
13.3 Exercises
14.1 Supervised learning: Boltzmann machines
14.2 Maximum information preservation
14.3 Neuronal specialization
14.4 Detection of coherent features
14.5 The effect of non-linearities
14.6 Introduction to Amari’s information geometry
14.7 Simple applications of information geometry
14.8 Exercises
15 Notes and suggestions for further reading
Part IV: Macroscopic analysis of dynamics
16 Network operation: macroscopic dynamics
16.1 Microscopic dynamics in probabilistic form
16.2 Sequential dynamics
16.3 Parallel dynamics
16.4 Exercises
17.1 Probabilistic definitions, performance measures
17.2 Explicit learning rules
17.3 Optimized learning rules
17.4 Exercises
18.1 Online gradient descent
18.2 Learning from noisy examples
18.3 Exercises
19 Notes and suggestions for further reading
Part V: Equilibrium statistical mechanics of neural networks
20.1 Stationary distributions and ergodicity
20.2 Detailed balance and interaction symmetry
20.3 Equilibrium statistical mechanics: concepts, definitions
20.4 A simple example: storing a single pattern
20.5 Phase transitions and ergodicity breaking
20.6 Exercises
21 Network operation: equilibrium analysis
21.1 Hopfield model with finite number of patterns
21.2 Introduction to replica theory: the SK model
21.3 Hopfield model with an extensive number of patterns
21.4 Exercises
22.1 The space of interactions
22.2 Capacity of perceptrons—definition and toy example
22.3 Capacity of perceptrons—random inputs
23 Notes and suggestions for further reading
A.1 Discrete event sets
A.2 Continuous event sets
A.3 Averages of specific random variables
B.1 Moment condition
B.2 Lindeberg’s theorem
Appendix C: Some simple summation identities
D.1 General properties of Gaussian integrals
D.2 Gaussian probability distributions
D.3 A list of specific Gaussian integrals
E.1 Block inverses
E.2 The Woodbury formula
F.1 Definition
F.2 δ(x) as solution of the Liouville equation
F.3 Representations, relations, and generalizations
Appendix G: Inequalities based on convexity
H.1 Local distance definitions
H.2 The triangular inequality
H.3 Global distance definitions
Appendix I: Saddle-point integration
References
Index