Elzbieta Pekalska, Robert P. W. Duin9789812565303, 981-256-530-2
Table of contents :
Contents……Page 22
Preface……Page 8
Notation and basic terminology……Page 12
Abbreviations……Page 20
1.1 Recognizing the pattern……Page 28
1.2 Dissimilarities for representation……Page 29
1.3 Learning from examples……Page 31
1.4 Motivation of the use of dissimilarity representations……Page 35
1.5 Relation to kernels……Page 40
1.6 Outline of the book……Page 41
1.7 In summary……Page 43
PART 1 Concepts and theory……Page 48
2. Spaces……Page 50
2.1 Preliminaries……Page 52
2.2 A brief look at spaces……Page 55
2.3 Generalized topological spaces……Page 59
2.4 Generalized metric spaces……Page 73
2.5 Vector spaces……Page 83
2.6 Normed and inner product spaces……Page 89
2.6.1 Reproducing kernel Halbert spaces……Page 96
2.7 Indefinite inner product spaces……Page 98
2.7.1 Reproducing kernel Krez’n spaces……Page 112
2.8 Discussion……Page 114
3. Characterization of dissimilarities……Page 116
3.1.1 Embeddings……Page 117
3.2 Tree models for dissimilarities……Page 122
3.3.1 Transformations in semimetric spaces……Page 126
3.3.2 Direct product spaces……Page 129
3.3.3 Invariance and robustness……Page 130
3.4.1 Dissimilarity matrices……Page 132
3.4.2 Square distances and inner products……Page 143
3.5.1 Euclidean embedding……Page 145
3.5.2 Correction of non-Euclidean dissimilarities……Page 147
3.5.3 Pseudo-Euclidean embedding……Page 149
3.5.4 Generalized average variance……Page 151
3.5.5 Projecting new vectors t o an embedded space……Page 152
3.5.6 Reduction of dimension……Page 154
3.5.7 Reduction of complexity……Page 155
3.5.8 A general embedding……Page 156
3.5.9 Spherical embeddings……Page 157
3.6 Spatial representation of dissimilarities……Page 159
3.6.1 FastMap……Page 160
3.6.2 Multidimensional scaling……Page 162
3.6.3 Reduction of complexity……Page 170
3.7 Summary……Page 171
4. Learning approaches……Page 174
4.1.1 Data bias and model bias……Page 175
4.1.2 Statistical learning……Page 178
4.1.3 Inductive principles……Page 181
4.1.3.1 Empirical risk minimization (ERM)……Page 183
Structural Risk Minimization (SRM)…….Page 187
Regularization principle…….Page 188
Bayesian inference…….Page 189
4.1.4 Why is the statistical approach not good enough f o r learning from objects?……Page 190
4.2 The role of dissimilarity representations……Page 193
4.2.1 Learned proximity representations……Page 198
4.2.2 Dissimilarity representations: learning……Page 199
4.3 Classification in generalized topological spaces……Page 202
4.4.1 Characterization of dissimilarity spaces……Page 207
4.4.2 Classifiers……Page 212
4.5 Classification in pseudo-Euclidean spaces……Page 223
4.6 On generalized kernels and dissimilarity spaces……Page 232
4.6.1 Connection between dissimilarity spaces and pseudo- Euclidean spaces……Page 236
4.7 Discussion……Page 238
5. Dissimilarity measures……Page 242
5.1 Measures depending on feature types……Page 243
5.2.1 Normal distributions……Page 255
5.2.2 Divergence measures……Page 256
5.2.3 Discrete probability distributions……Page 260
5.3 Dissimilarity measures between sequences……Page 261
5.4 Information-theoretic measures……Page 264
5.5 Dissimilarity measures between sets……Page 265
5.6.2 Example measures……Page 269
5.7 Discussion and conclusions……Page 277
PART 2 Practice……Page 280
6. Visualization……Page 282
6.1 Multidimensional scaling……Page 284
6.1.1 First examples……Page 286
6.1.2 Linear and nonlinear methods: ezarnples……Page 288
6.1.3 Implementation……Page 294
6.2 Other mappings……Page 295
6.3 Examples: getting insight into the data……Page 301
6.4 Tree models……Page 308
6.5 Summary……Page 314
7. Further data exploration……Page 316
7.1.1 Standard approaches……Page 317
7.1.2 Clustering o n dissimilarity representations……Page 322
7.1.3 Clustering examples for dissimilarity representations……Page 330
7.2 Intrinsic dimension……Page 336
7.3 Sampling density……Page 346
7.3.1 Proposed criteria……Page 347
7.3.2 Experiments with the NIST digits……Page 352
7.4 Summary……Page 358
8. One-class classifiers……Page 360
8.1 General issues……Page 363
8.1.1 Construction of one-class classifiers……Page 364
8.1.2 One-class classijiers in feature spaces……Page 368
8.2 Domain descriptors for dissimilarity representations……Page 373
8.2.1 Neighborhood-based OCCs……Page 375
8.2.2 Generalized mean class descriptor……Page 377
8.2.3 Linear programming dissimilarit3 data description……Page 380
8.2.4 More issues on class descriptors……Page 386
8.3.1 Experiment I: Condition monitoring……Page 393
8.3.2 Experiment 11: Diseased mucosa in the oral cavity……Page 401
8.3.3 Experiment 111: Heart disease data……Page 404
8.4 Conclusions……Page 406
9. Classification……Page 410
9.1.1 NN rule us alternative dissimilarity-based classifiers……Page 411
9.1.2 Experiment I: square dissimilarity representations……Page 415
9.1.3 Experiment 11: the dissimilarity space approach……Page 416
9.1.4 Discussion……Page 422
9.2 Selection of the representation set: the dissimilarity space approach……Page 423
9.2.1 Prototype selection methods……Page 425
9.2.2 Experimental setup……Page 428
9.2.3 Results and discussion……Page 431
9.2.4 Conclusions……Page 443
9.3 Selection of the representation set: the embedding ap- proach……Page 444
9.3.1 Prototype selection methods……Page 445
9.3.2 Experiments and results……Page 448
9.4 On corrections of dissimilarity measures……Page 455
9.4.1 Going more Euclidean……Page 456
9.4.2 Experimental setup……Page 457
9.4.3 Results and conclusions……Page 459
9.5 A few remarks on a simulated missing value problem……Page 466
9.6 Existence of zero-error dissimilarity-based classifiers……Page 470
9.6.1 Asymptotic separability of classes……Page 471
9.7 Final discussion……Page 478
10. Combining……Page 480
10.1 Combining for one-class classification……Page 482
10.1.1 Combining strategies……Page 483
10.1.2 Data and experimental setup……Page 486
10.1.3 Results and discussion……Page 489
10.1.4 Summary and conclusions……Page 492
10.2.1 Combining strategies……Page 493
10.2.2 Experiments on the handwritten digit set……Page 495
10.2.3 Results……Page 497
10.2.4 Conclusions……Page 500
10.3 Classifier projection space……Page 501
10.3.1 Construction and the use of CPS……Page 502
10.4 Summary……Page 510
11.1 Representation review……Page 512
11.1.1 Three generalization ways……Page 513
11.I .2 Representation formation……Page 516
11.1.3 Generalization capabilities……Page 519
11.2 Practical considerations……Page 520
11.2.1 Clustering……Page 522
11.2.2 One-class classification……Page 523
11.2.3 Classification……Page 524
12. Conclusions and open problems……Page 530
12.1 Summary and contributions……Page 532
12.2 Extensions of dissimilarity representations……Page 535
12.3 Open questions……Page 537
Appendix A On convex and concave functions……Page 542
B.l Some facts on matrices in a Euclidean space……Page 546
B.2 Some facts on matrices in a pseudo-Euclidean space……Page 550
Appendix C Measure and probability……Page 554
D.l Likelihood and parameter estimation……Page 560
D.2 Expectation-maximization (EM) algorithm……Page 562
D.3 Model selection……Page 563
D.4.1 Gaussian model……Page 565
D.4.2 A Gaussian mixture model……Page 566
D.4.3 PCA……Page 568
D.4.4 Probabilistic PCA……Page 569
D.4.5 A mixture of probabilistic PCA……Page 570
E.l Artificial data sets……Page 572
E.2 Real-world data sets……Page 576
Bibliography……Page 588
Index……Page 626
Reviews
There are no reviews yet.