Toby Segaran
ISBN: 0596529325, 9780596529321
Table of contents:
Programming Collective Intelligence
Table of Contents
Prerequisites
Why Python?
Significant Whitespace
Open APIs
Overview of the Chapters
Conventions
How to Contact Us
Acknowledgments
Introduction to Collective Intelligence
What Is Collective Intelligence?
What Is Machine Learning?
Limits of Machine Learning
Other Uses for Learning Algorithms
Collaborative Filtering
Collecting Preferences
Finding Similar Users
Euclidean Distance Score
Pearson Correlation Score
Ranking the Critics
Recommending Items
Matching Products
Building a del.icio.us Link Recommender
Building the Dataset
Item-Based Filtering
Building the Item Comparison Dataset
Getting Recommendations
Using the MovieLens Dataset
User-Based or Item-Based Filtering?
Exercises
Supervised versus Unsupervised Learning
Pigeonholing the Bloggers
Counting the Words in a Feed
Hierarchical Clustering
Drawing the Dendrogram
Column Clustering
K-Means Clustering
Clusters of Preferences
Scraping the Zebo Results
Clustering Results
Viewing Data in Two Dimensions
Exercises
What’s in a Search Engine?
Using urllib2
Crawler Code
Building the Index
Setting Up the Schema
Finding the Words on a Page
Adding to the Index
Querying
Content-Based Ranking
Word Frequency
Document Location
Word Distance
Simple Count
The PageRank Algorithm
Using the Link Text
Design of a Click-Tracking Network
Setting Up the Database
Feeding Forward
Training with Backpropagation
Connecting to the Search Engine
Exercises
Optimization
Group Travel
Representing Solutions
The Cost Function
Random Searching
Hill Climbing
Simulated Annealing
Genetic Algorithms
The Kayak API
Flight Searches
Student Dorm Optimization
Running the Optimization
The Layout Problem
Counting Crossed Lines
Drawing the Network
Other Possibilities
Exercises
Filtering Spam
Documents and Words
Training the Classifier
Calculating Probabilities
Starting with a Reasonable Guess
A Naïve Classifier
Probability of a Whole Document
A Quick Introduction to Bayes’ Theorem
Choosing a Category
The Fisher Method
Category Probabilities for Features
Combining the Probabilities
Classifying Items
Using SQLite
Filtering Blog Feeds
Improving Feature Detection
Using Akismet
Alternative Methods
Exercises
Predicting Signups
Introducing Decision Trees
Training the Tree
Gini Impurity
Entropy
Recursive Tree Building
Displaying the Tree
Graphical Display
Classifying New Observations
Pruning the Tree
Dealing with Missing Data
Modeling Home Prices
The Zillow API
Modeling “Hotness”
When to Use Decision Trees
Exercises
Building a Sample Dataset
Number of Neighbors
Code for k-Nearest Neighbors
Inverse Function
Subtraction Function
Gaussian Function
Weighted kNN
Cross-Validation
Adding to the Dataset
Scaling Dimensions
Optimizing the Scale
Estimating the Probability Density
Graphing the Probabilities
Getting a Developer Key
Setting Up a Connection
Performing a Search
Getting Details for an Item
Building a Price Predictor
When to Use k-Nearest Neighbors
Exercises
Matchmaker Dataset
Decision Tree Classifier
Basic Linear Classification
Categorical Features
Lists of Interests
Using the Geocoding API
Calculating the Distance
Scaling the Data
Understanding Kernel Methods
The Kernel Trick
Support-Vector Machines
A Sample Session
Applying SVM to the Matchmaker Dataset
Getting a Developer Key
Creating a Session
Download Friend Data
Building a Match Dataset
Creating an SVM Model
Exercises
Finding Independent Features
Selecting Sources
Downloading Sources
Converting to a Matrix
Bayesian Classification
A Quick Introduction to Matrix Math
What Does This Have to Do with the Articles Matrix?
Using NumPy
The Algorithm
Displaying the Results
Displaying by Article
What Is Trading Volume?
Downloading Data from Yahoo! Finance
Preparing a Matrix
Displaying the Results
Exercises
What Is Genetic Programming?
Genetic Programming Versus Genetic Algorithms
Programs As Trees
Representing Trees in Python
Building and Evaluating Trees
Displaying the Program
Creating the Initial Population
A Simple Mathematical Test
Mutating Programs
Crossover
Building the Environment
A Simple Game
A Round-Robin Tournament
Playing Against Real People
More Numerical Functions
Different Datatypes
Exercises
Bayesian Classifier
Training
Using Your Code
Strengths and Weaknesses
Training
Using Your Decision Tree Classifier
Strengths and Weaknesses
Neural Networks
Using Your Neural Network Code
Strengths and Weaknesses
Support-Vector Machines
The Kernel Trick
Using LIBSVM
Strengths and Weaknesses
k-Nearest Neighbors
Scaling and Superfluous Variables
Using Your kNN Code
Clustering
K-Means Clustering
Using Your Clustering Code
Multidimensional Scaling
Using Your Multidimensional Scaling Code
Non-Negative Matrix Factorization
Optimization
The Cost Function
Using Your Optimization Code
Python Imaging Library
Beautiful Soup
Installation on Other Platforms
Installation on Windows
Installation
Simple Usage Example
Euclidean Distance
Pearson Correlation Coefficient
Tanimoto Coefficient
Gini Impurity
Entropy
Gaussian Function
Dot-Products
Index