General guidelines for choosing a statistical analysis based on the number of dependent variables, the nature of your independent variables, whether the dependent variable is interval, ordinal, or categorical, and whether it is normally distributed. From UCLA's Institute for Digital Research and Education.
This site provides online seminars designed to improve skills in statistical computing packages and statistical techniques. Topics covered include: Stata, R, SAS, SPSS, Mplus and Latent Variable Analysis, and more.
"The tidyverse is an opinionated collection of R packages designed for data science. All packages share an underlying design philosophy, grammar, and data structures."
Covers all areas of statistics. Regularly updated with revisions and new articles.
Covers all areas of statistics including probability theory, biostatistics, quality control, and applications of statistical methods. Includes the full text of the second print edition, the entire original edition, plus supplements and updates. Regularly updated with new articles and revisions to previously published articles.
GIF offers workshops, office hours, online trainings, geospatial data, networking and funding opportunities, and more. They also facilitate free UCB access to ArcGIS.
The Department of Statistics operates a free consulting service for members of the campus community. Advanced graduate students, under faculty supervision, consult by appointment in the fall and spring semesters. We do not run the consulting service during the summer.
Books
Data Science for Infectious Disease Data Analytics by Lily Wang
Data Science for Infectious Disease Data Analytics: An Introduction with R provides an overview of modern data science tools and methods that have been developed specifically to analyze infectious disease data. With a quick start guide to epidemiological data visualization and analysis in R, this book spans the gulf between academia and practice, providing many lively, instructive data analysis examples using the most up-to-date data, such as data on the newly discovered coronavirus disease (COVID-19). The primary emphasis of this book is the data science procedures in epidemiological studies, including data wrangling, visualization, interpretation, predictive modeling, and inference, which are of immense importance due to increasingly diverse and nonexperimental data across a wide range of fields. The knowledge and skills readers gain from this book are also transferable to other areas, such as public health, business analytics, environmental studies, or spatio-temporal data visualization and analysis in general. Aimed at readers with an undergraduate knowledge of mathematics and statistics, this book is an ideal introduction to the development and implementation of data science in epidemiology. Features: describes the entire data science procedure of how infectious disease data are collected, curated, visualized, and fed to predictive models, which facilitates effective communication between data sources, scientists, and decision-makers; explains practical concepts of infectious disease data and provides particular data science perspectives; gives an overview of the unique features and issues of infectious disease data and how they impact epidemic modeling and projection; introduces various classes of models and state-of-the-art learning methods to analyze infectious disease data, with valuable insights on how different models and methods could be connected.
Publication Date: 2022
Guide to Social Science Data Preparation and Archiving Best Practice Throughout the Data Life Cycle: 6th Edition by ICPSR
This publication provides information on how to prepare data for deposit and how researchers can ensure access to their data by others in the future. The Guide to Social Science Data Preparation and Archiving is aimed at those engaged in the cycle of research, from applying for a research grant, through the data collection phase, and ultimately to preparation of the data for deposit in a public archive.
Publication Date: 2012
A Guide to Tactical Data Engagement
Tactical Data Engagement is a four-step method for City Hall to help residents make an impact in their community using open data. This guide puts forth a vision for Tactical Data Engagement (TDE). We created this method based on the core concepts of human-centered design and tactical urbanism, with steps that guide city officials in carrying out interventions that facilitate the community use of open data for local impact. This approach goes beyond basic resident engagement. It seeks to make open data programs more transparent, accountable, and participatory by challenging city halls to actively help residents use open data to better their communities. The four steps outlined in this guide will help readers complete a resident-informed project, product, or tool that supports the community use of data.
Call Number: Online
Publication Date: 2017
Indigenous Statistics: A Quantitative Research Methodology by Maggie Walter; Chris Andersen
In the first book ever published on Indigenous quantitative methodologies, Maggie Walter and Chris Andersen open up a major new approach to research across the disciplines and applied fields. While qualitative methods have been rigorously critiqued and reformulated, the population statistics relied on by virtually all research on Indigenous peoples continue to be taken for granted as straightforward, transparent numbers. This book dismantles that persistent positivism with a forceful critique, then fills the void with a new paradigm for Indigenous quantitative methods, using concrete examples of research projects from First World Indigenous peoples in the United States, Australia, and Canada. Concise and accessible, it is an ideal supplementary text as well as a core component of the methodological toolkit for anyone conducting Indigenous research or using Indigenous population statistics.
Publication Date: 2013
Innovative Statistical Methods for Public Health Data
The book brings together experts working in public health and multi-disciplinary areas to present recent issues in statistical methodological development and their applications. This timely book will impact model development and data analyses of public health research across a wide spectrum of analysis. Data and software used in the studies are available for the reader to replicate the models and outcomes. The fifteen chapters range in focus from techniques for dealing with missing data with Bayesian estimation, health surveillance and population definition, and implications in applied latent class analysis, to multiple comparison and meta-analysis in public health data. Researchers in biomedical and public health research will find this book to be a useful reference, and it can be used in graduate-level classes.
Publication Date: 2015
Introduction to Biostatistical Applications in Health Research with Microsoft Office Excel and R by Robert P. Hirsch
The second edition of Introduction to Biostatistical Applications in Health Research delivers a thorough examination of the basic techniques and most commonly used statistical methods in health research. Retaining much of what was popular with the well-received first edition, the thoroughly revised second edition includes a new chapter on testing assumptions: how to evaluate whether those assumptions are satisfied and what to do if they are not. The newest edition contains brand-new code examples for using the popular computer language R to perform the statistical analyses described in the chapters within. You'll learn how to use Excel to generate datasets for R, which can then be used to conduct statistical calculations on your data. The book also includes a companion website with a new version of BAHR add-in programs for Excel. This new version contains new programs for nonparametric analyses, Student-Newman-Keuls tests, and stratified analyses. Readers will also benefit from coverage of topics like: extensive discussions of basic and foundational concepts in statistical methods, including Bayes' Theorem, populations, and samples; a treatment of univariable analysis, covering topics like continuous dependent variables and ordinal dependent variables; an examination of bivariable analysis, including regression analysis and correlation analysis; and an analysis of multivariate calculations in statistics and how testing assumptions, like assuming Gaussian distributions or equal variances, affect statistical outcomes. Perfect for health researchers of all kinds, Introduction to Biostatistical Applications in Health Research also belongs on the bookshelves of anyone who wishes to better understand health research literature. Even those without a great deal of mathematical background will benefit greatly from this text.
Publication Date: 2021
R for Data Science: import, tidy, transform, visualize, and model data
Learn how to use R to turn raw data into insight, knowledge, and understanding. This book introduces you to R, RStudio, and the tidyverse, a collection of R packages designed to work together to make data science fast, fluent, and fun. Suitable for readers with no previous programming experience, R for Data Science is designed to get you doing data science as quickly as possible. Authors Hadley Wickham and Garrett Grolemund guide you through the steps of importing, wrangling, exploring, and modeling your data and communicating the results. You'll get a complete, big-picture understanding of the data science cycle, along with basic tools you need to manage the details. Each section of the book is paired with exercises to help you practice what you've learned along the way. You'll learn how to: Wrangle: transform your datasets into a form convenient for analysis. Program: learn powerful R tools for solving data problems with greater clarity and ease. Explore: examine your data, generate hypotheses, and quickly test them. Model: provide a low-dimensional summary that captures true "signals" in your dataset. Communicate: learn R Markdown for integrating prose, code, and results.
Publication Date: 2023
Smart Use of State Public Health Data for Health Disparity Assessment
Health services are often fragmented along organizational lines with limited communication among the public health-related programs or organizations, such as mental health, social services, and public health services. This can result in disjointed decision making without necessary data and knowledge, organizational fragmentation, and disparate knowledge development across the full array of public health needs. When new questions or challenges arise that require collaboration, individual public health practitioners (e.g., surveillance specialists and epidemiologists) often do not have the time and energy to spend on them. Smart Use of State Public Health Data for Health Disparity Assessment promotes data integration to aid crosscutting program collaboration. It explains how to maximize the use of various datasets from state health departments for assessing health disparity and for disease prevention. The authors offer practical advice on state public health data use, their strengths and weaknesses, data management insight, and lessons learned. They propose a bottom-up approach for building an integrated public health data warehouse that includes localized public health data. The book is divided into three sections: Section I has seven chapters devoted to knowledge and skill preparations for recognizing disparity issues and integrating and analyzing local public health data. Section II provides a systematic surveillance effort by linking census tract poverty to other health disparity dimensions. Section III provides in-depth studies related to Sections I and II. All data used in the book have been geocoded to the census tract level, making it possible to go more local, even down to the neighborhood level.