Analyzing Linguistic Data

Authors: R. H. Baayen

Edition: 1

ISBN: 0521709180, 9780521709187, 0521882591

Size: 5 MB (5228651 bytes)

Pages: 369

R. H. Baayen

Statistical analysis is a useful skill for linguists and psycholinguists, allowing them to understand the quantitative structure of their data. This textbook provides a straightforward introduction to the statistical analysis of language. Designed for linguists with a non-mathematical background, it clearly introduces the basic principles and methods of statistical analysis, using ‘R’, the leading computational statistics programme. The reader is guided step-by-step through a range of real data sets, allowing them to analyse acoustic data, construct grammatical trees for a variety of languages, quantify register variation in corpus linguistics, and measure experimental data using state-of-the-art models. The visualization of data plays a key role, both in the initial stages of data exploration and later on when the reader is encouraged to criticize various models. Containing over 40 exercises with model answers, this book will be welcomed by all linguists wishing to learn more about working with and presenting quantitative data.

Table of contents:
Cover
Half-title
Title
Copyright
Dedication
Contents
Preface
1 An introduction to R
1.1 R as a calculator
1.2 Getting data into and out of R
1.3 Accessing information in data frames
1.4.1 Sorting a data frame by one or more columns
1.4.2 Changing information in a data frame
1.4.3 Extracting contingency tables from data frames
1.4.4 Calculations on data frames
1.5 Session management
2.1 Random variables
2.2 Visualizing single random variables
2.3 Visualizing two or more variables
2.4 Trellis graphics
3.2 Discrete distributions
3.3 Continuous distributions
3.3.1 The normal distribution
3.3.2 The t, F, and χ² distributions
4 Basic statistical methods
4.1.1 Distribution tests
4.1.2 Tests for the mean
4.2 Tests for two independent vectors
4.2.1 Are the distributions the same?
4.2.2 Are the means the same?
4.2.3 Are the variances the same?
4.3.1 Are the means or medians the same?
4.3.2 Functional relations: linear regression
4.3.2.1 Slope and intercept
4.3.2.2 Estimating slope and intercept
4.3.2.3 Correlation
4.3.2.4 Summarizing a linear model object
4.3.2.5 Problems and pitfalls of linear regression
4.3.3 What does the joint density look like?
4.4 A numerical vector and a factor: analysis of variance
4.4.1 Two numerical vectors and a factor: analysis of covariance
4.5 Two vectors with counts
4.6 A note on statistical significance
5.1.1 Tables with measurements: principal components analysis
5.1.2 Tables with measurements: factor analysis
5.1.3 Tables with counts: correspondence analysis
5.1.4 Tables with distances: multidimensional scaling
5.1.5 Tables with distances: hierarchical cluster analysis
5.2.1 Classification trees
5.2.2 Discriminant analysis
5.2.3 Support vector machines
6.1 Introduction
6.2 Ordinary least squares regression
6.2.1 Nonlinearities
6.2.2 Collinearity
6.2.3 Model criticism
6.2.4 Validation
6.3.1 Logistic regression
6.3.2 Ordinal logistic regression
6.4 Regression with breakpoints
6.5 Models for lexical richness
6.6 General considerations
7 Mixed models
7.1 Modeling data with fixed and random effects
7.2 A comparison with traditional analyses
7.2.1 Mixed-effects models and quasi-F
7.2.2 Mixed-effects models and Latin Square designs
7.2.3 Regression with subjects and items
7.3 Shrinkage in mixed-effects models
7.4 Generalized linear mixed models
7.5.1 Primed lexical decision latencies for Dutch neologisms
7.5.2 Self-paced reading latencies for Dutch neologisms
7.5.3 Visual lexical decision latencies of Dutch eight-year-olds
7.5.4 Mixed-effects models in corpus linguistics
Appendix A Solutions to the exercises
Appendix B Overview of R functions
References
R
Topic index
Author index