Distributional Methods

#### SOC 590C


##### Location

Tuesday       9:00am-10:20am

Lecture

Thursday      9:00am-10:20am

Lecture/Laboratory

Professor

Mark S. Handcock, C14B Padelford Hall, 221-6930

Office Hours

 Mondays 11:30am - 12:30pm

Other times by arrangement. Clearly composed questions sent to handcock@u will receive written replies.

Laboratories

Some class sessions may be held in the Center for Social Science Computation and Research (CSSCR).

## Motivation and Synopsis

This course provides an introduction to modern statistical methods for comparing distributions. Social science research relies on these methods whenever comparisons are made between groups. When the attribute of interest is continuous, for example racial differences in life expectancy or earnings differences between men and women, the traditional methods make comparisons in terms of means, medians and standard deviations. Traditional methods, however, provide a weak and unnecessarily restrictive framework for comparison. Consider the earnings distribution in the United States. Over the past 30 years, median real earnings have declined by about 10% and the variance in earnings has risen dramatically. Hidden behind these summary statistics is a range of important questions. Have the upper and lower tails of the earnings distribution grown at the same rate? Can we determine the role played by the decade-long freeze in the minimum wage? Is there anything more to the narrowing of the gender wage gap than the convergence in median earnings between the two groups? The information we need to answer these questions is there in the data, but it is inaccessible using traditional statistical methods such as regression and Gini index summaries.
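As a rough illustration of the point above, the following Python sketch compares two small earnings samples decile by decile. The numbers are invented for illustration and are not real earnings data, and the helper name `decile_gaps` is hypothetical. The two samples share the same median, so a median comparison reports no change, while the decile gaps show the lower tail falling and the upper tail rising.

```python
from statistics import median, quantiles


def decile_gaps(reference, comparison):
    """Return comparison-minus-reference differences at the nine deciles."""
    ref_deciles = quantiles(reference, n=10, method="inclusive")
    cmp_deciles = quantiles(comparison, n=10, method="inclusive")
    return [c - r for r, c in zip(ref_deciles, cmp_deciles)]


# Hypothetical earnings in arbitrary units (assumed values, not real data).
earlier = list(range(30, 91, 6))  # 30, 36, ..., 90 with median 60

# Same median, but every value is moved 1.5 times as far from the median,
# so both tails are stretched outward.
later = [60 + 1.5 * (x - 60) for x in earlier]

gaps = decile_gaps(earlier, later)

print("median earlier:", median(earlier))
print("median later:  ", median(later))
for i, gap in enumerate(gaps, start=1):
    print(f"decile {i}: gap = {gap:+.1f}")
```

The medians agree, yet the gaps are negative below the median and positive above it. This is the kind of information a single summary statistic hides.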

With the emergence of Exploratory Data Analysis (EDA; Chambers et al. 1983; Tukey 1977) and the development of high-speed computing and graphical user interfaces, there has been a movement towards more nonparametric and distribution-oriented analytic methods. A prominent feature of these methods is the use of graphical displays. Graphics exploit the power of our visual senses to convey information in a direct way.

## Objectives of the Course

In this course we will start from scratch and introduce practical nonparametric, distribution-oriented and graphical analytic methodological tools to aid social science research.

We will follow the topics of traditional methods courses: univariate and multivariate summaries; simple and multivariate regression. These will be supplemented by quantile regression, methods for categorical data and an overall emphasis on distributional comparisons.

These methods aim to bridge the gap between exploratory tools and parametric restrictions. The goal is to present the concepts, theory and practical aspects of the methods in a coherent fashion, with a minimum of statistical prerequisites.

The course will have an applied focus on the development of tools for research in the social sciences. The course will involve the practical application of the ideas and their implementation through statistical software to make them accessible to social scientists.

This course is part of the curriculum of the new Center for Statistics and the Social Sciences (CSSS), with funding from the University Initiatives Fund. The CSSS includes faculty members from the Department of Statistics and a broad range of social science disciplines including Anthropology, Economics, Geography, Political Science, and Sociology. This curriculum is being developed to complement and strengthen the quantitative methods course offerings for social science students at both the undergraduate and graduate levels.

## Structure of the Course

There will be two lectures per week. The lecture on Thursday will sometimes be a laboratory session.

There will be weekly homework assignments and exercises relating to computing and programming. Each homework will be graded on a scale of 1 to 10.

Discussion of homework problems is encouraged. However, each student is required to prepare and submit solutions (including computer work) to the assignments and project on their own; solutions prepared “in committee” are not acceptable. Duplication of homework solutions and computer output prepared in whole or in part by someone else is not acceptable and is considered plagiarism.  If you receive assistance from anyone, you must give due credit in your report.  (Example: “Since the data are all positive, and skewed to the right, a logarithmic transformation is clearly appropriate as a next step.  I thank David Cox for pointing this out to me.”)

I welcome comments or suggestions about the course at any time, either in person, by letter, or by anonymous email. Please feel free to use any of these means to comment on any aspect of the course.