Much of the data available today is unstructured and text-heavy, making it challenging for analysts to apply their usual data wrangling and visualization tools. With this practical book, you’ll explore text-mining techniques with tidytext, a package that authors Julia Silge and David Robinson developed using the tidy principles behind R packages like ggraph and dplyr. You’ll learn how tidytext and other tidy tools in R can make text analysis easier and more effective.
Real-world data sets are messy and complicated. Written for students in social science and public management, this authoritative but approachable guide describes all the tools needed to collect data and prepare it for analysis. Offering detailed, step-by-step instructions, it covers collection of many different types of data including web files, APIs, and maps; data cleaning; data formatting; the integration of different sources into a comprehensive data set; and storage using third-party tools to facilitate access and shareability, from Google Docs to GitHub. Assuming no prior knowledge of R and Python, the author introduces programming concepts gradually, using real data sets that provide the reader with practical, functional experience.
A tidyverse edition of the acclaimed textbook on data analysis and statistics for the social sciences and allied fields. Quantitative analysis is an essential skill for social science research, yet students in the social sciences and related areas typically receive little training in it. Quantitative Social Science is a practical introduction to data analysis and statistics written especially for undergraduates and beginning graduate students in the social sciences and allied fields, including business, economics, education, political science, psychology, sociology, public policy, and data science. Proven in classrooms around the world, this one-of-a-kind textbook engages directly with empirical analysis, showing students how to analyze and interpret data using the tidyverse family of R packages. Data sets taken directly from leading quantitative social science research illustrate how to use data analysis to answer important questions about society and human behavior. The book emphasizes hands-on learning, not paper-and-pencil statistics; includes data sets from actual research for students to test their skills on; covers data analysis concepts such as causality, measurement, and prediction, as well as probability and statistical tools; features a wealth of supplementary exercises, including additional data analysis exercises and programming exercises; offers a solid foundation for further study; and comes with additional course materials online, including notes, sample code, exercises and problem sets with solutions, and lecture slides.
All social and policy researchers need to synthesize data into a visual representation. Producing good visualizations combines creativity and technique. This book teaches the techniques and basics to produce a variety of visualizations, allowing readers to communicate data and analyses in a creative and effective way. Visuals for tables, time series, maps, text, and networks are carefully explained and organized, showing how to choose the right plot for the type of data being analysed and displayed. Examples are drawn from public policy, public safety, education, political tweets, and public health. The presentation proceeds step by step, starting from the basics, in the programming languages R and Python so that readers learn the coding skills while simultaneously becoming familiar with the advantages and disadvantages of each visualization. No prior knowledge of either Python or R is required. Code for all the visualizations is available from the book's website.
This book takes the reader through real-world examples of how to characterize and measure the productivity and performance of NFPs and education institutions--that is, organisations that produce value for society, which cannot be measured accurately in financial KPIs. It focuses on how best to frame non-profit performance and productivity, and provides a suite of tools for measurement and benchmarking. It further challenges the reader to consider alternative and appropriate uses of quantitative measures, which are fit-for-purpose in individual contexts. It is true that the risk of misusing quantitative measures is ever-present. But does that risk outweigh the benefits of forming a more precise and shared understanding of what could generate better outcomes? There will always be concerns about policy and performance management. Goodhart's Law states that once a measure becomes a target, it is no longer a good measure. This book helps to strike a meaningful balance between what can be measured, what cannot, and how best to use quantitative information in sectors that are often averse to being held up to the light and put on a scale by outsiders.
This book provides a narrative of how R can be useful in the analysis of public administration, public policy, and political science data specifically, in addition to the social sciences more broadly. It can serve as a textbook and reference manual for students and independent researchers who wish to use R for the first time or broaden their skill set with the program. While the book uses data drawn from political science, public administration, and policy analyses, it is written so that students and researchers in other fields should find it accessible and useful as well. By the end of the first seven chapters, an entry-level user should be well acquainted with how to use R as a traditional econometric software program. The remaining four chapters introduce the user to advanced techniques that R offers but many other programs do not make available, such as how to use contributed libraries or write programs in R. The book details how to perform nearly every task routinely associated with statistical modeling: descriptive statistics, basic inferences, estimating common models, and conducting regression diagnostics. For the intermediate or advanced reader, the book aims to open up the wide array of sophisticated methods that R makes freely available. It illustrates how user-created libraries can be installed and used in real data analysis, focusing on a handful of libraries that have been particularly prominent in political science. The last two chapters illustrate how the user can conduct linear algebra in R and create simple programs. A key point in these chapters is that such actions are substantially easier in R than in many other programs, so advanced techniques are more accessible in R, which will appeal to scholars and policy researchers who already conduct extensive data analysis. Additionally, the book should draw the attention of students and teachers of quantitative methods in the political disciplines.