David Cross
ISBN: 1930110006, 9781930110007
Table of contents:
Contents
Foreword
Preface
About this book
Typographical conventions
Author Online
Acknowledgments
About the cover illustration
Foundations
Data, data munging, and Perl
1.1.1 Data munging processes
1.1.2 Data recognition
1.1.5 Data transformation
1.2.2 Transferring data between multiple systems
1.2.3 Real-world data munging examples
1.3.1 Data files
1.3.2 Databases
1.3.4 Other sources/sinks
1.4.1 Unstructured data
1.4.4 Binary data
1.5 What is Perl?
1.5.1 Getting Perl
1.6 Why is Perl good for data munging?
1.8 Summary
General munging practices
2.1 Decouple input, munging, and output processes
2.2.1 Example: the CD file revisited
2.3 Encapsulate business rules
2.3.2 Ways to encapsulate business rules
2.3.3 Simple module
2.3.4 Object class
2.4.1 Overview of the filter model
2.4.2 Advantages of the filter model
2.5.1 What to write to an audit trail
2.5.3 Using the UNIX system logs
2.7 Summary
Useful Perl idioms
3.1.1 Simple sorts
3.1.2 Complex sorts
3.1.3 The Orcish Manoeuvre
3.1.4 Schwartzian transform
3.1.6 Choosing a sort technique
3.2.1 Sample DBI program
3.3 Data::Dumper
3.4 Benchmarking
3.5 Command line scripts
3.6 Further information
3.7 Summary
Pattern matching
4.1.1 Substrings
4.1.2 Finding strings within strings (index and rindex)
4.2.1 What are regular expressions?
4.2.2 Regular expression syntax
4.2.3 Using regular expressions
4.2.4 Example: translating from English to American
4.2.5 More examples: /etc/passwd
4.2.6 Taking it to extremes
4.3 Further information
4.4 Summary
Data munging
Unstructured data
5.1.1 Reading the file
5.1.2 Text transformations
5.1.3 Text statistics
5.2.1 Converting the character set
5.2.2 Converting line endings
5.2.3 Converting number formats
5.3 Further information
5.4 Summary
Record-oriented data
6.1.1 Reading simple record-oriented data
6.1.2 Processing simple record-oriented data
6.1.3 Writing simple record-oriented data
6.1.4 Caching data
6.2.1 Anatomy of CSV data
6.2.2 Text::CSV_XS
6.3 Complex records
6.3.1 Example: a different CD file
6.3.2 Special values for $/
6.4.1 Built-in Perl date functions
6.4.2 Date::Calc
6.4.3 Date::Manip
6.4.4 Choosing between date modules
6.5 Extended example: web access logs
6.7 Summary
Fixed-width and binary data
7.1.1 Reading fixed-width data
7.1.2 Writing fixed-width data
7.2 Binary data
7.2.1 Reading PNG files
7.2.2 Reading and writing MP3 files
7.3 Further information
7.4 Summary
Simple data parsing
Complex data formats
8.1.1 Example: metadata in the CD file
8.1.2 Example: reading the expanded CD file
8.2.1 Removing tags from HTML
8.2.2 Limitations of regular expressions
8.3.1 An introduction to parsers
8.3.2 Parsers in Perl
8.5 Summary
HTML
9.1 Extracting HTML data from the World Wide Web
9.2.1 Example: simple HTML parsing
9.3.1 HTML::LinkExtor
9.3.2 HTML::TokeParser
9.3.3 HTML::TreeBuilder and HTML::Element
9.4 Extended example: getting weather forecasts
9.6 Summary
XML
10.1.2 What is XML?
10.2.1 Example: parsing weather.xml
10.2.2 Using XML::Parser
10.2.3 Other XML::Parser styles
10.2.4 XML::Parser handlers
10.3.1 Example: parsing XML using XML::DOM
10.4.2 A sample RSS file
10.4.3 Example: creating an RSS file with XML::RSS
10.4.4 Example: parsing an RSS file with XML::RSS
10.5.1 Sample XML input file
10.5.2 XML document transformation script
10.5.3 Using the XML document transformation script
10.7 Summary
Building your own parsers
11.1.1 Example: parsing simple English sentences
11.2.1 Example: parsing a Windows INI file
11.2.2 Understanding the INI file grammar
11.2.4 Example: displaying the contents of @item
11.2.5 Returning a data structure
11.3 Another example: the CD data file
11.3.1 Understanding the CD grammar
11.3.2 Testing the CD file grammar
11.3.3 Adding parser actions
11.4 Other features of Parse::RecDescent
11.6 Summary
The big picture
Looking back—and ahead
12.1.2 The usefulness of Perl
12.2.1 Know your data
12.2.3 Know where to go for more information
Modules reference
A.1.1 Functions called on the DBI class
A.1.4 Attributes of any DBI handle
A.1.5 Functions called on a database handle
A.1.7 Functions called on a statement handle
A.2.1 Attributes
A.2.2 Methods
A.3 Date::Calc
A.4 Date::Manip
A.5 LWP::Simple
A.6 HTML::Parser
A.6.1 Handlers
A.7 HTML::LinkExtor
A.8 HTML::TokeParser
A.9 HTML::TreeBuilder
A.10 XML::Parser
Essential Perl
B.1 Running Perl
B.2.1 Scalars
B.2.2 Arrays
B.2.3 Hashes
B.3.1 Mathematical operators
B.3.2 Logical operators
B.4.1 Conditional execution
B.4.2 Loops
B.5 Subroutines
B.6.1 Creating references
B.6.4 Complex data structures using references
B.7 More information on Perl
Index