Reproducibility & Replication

Here’s the abstract of a presentation I gave to the 10th European Conference for Social Work Research (ECSWR) on the 6th May 2021:

Abstract

Quantitative research, reproducibility and replication: a guide for social work researchers

Professional journals share important knowledge (Gambrill, 2019). However, inappropriate use of quantitative methods can lead to claims which may not be warranted by the evidence. Where such methods are used inappropriately, the contribution of social work research to practice, policy and social development may be compromised with unintended consequences for the specific research project and the broader social work research community. This paper therefore seeks to sensitise researchers to the challenges of conducting quantitative social work research whilst arguing that these challenges are not insurmountable. Advice is also given on how to do research which is reproducible and replicable by describing the process from reception of the data to delivery of the report using methods derived from computational research (Gandrud, 2015). The paper therefore provides practical advice on the content, evaluation and reporting of statistics based on the guidelines of the American Psychological Association. The paper also showcases ways of working with quantitative data which are methodologically innovative. This approach will appeal both to those who have a broad overview of quantitative methods and to those who are new to this research tradition. The paper concludes with the recommendation that further research training for the social work community in quantitative methods is needed in order to ensure that findings are robust and ipso facto more likely to be impactful. Working within the post-positivist tradition, the author argues that quantitative and qualitative traditions can be mutually reinforcing.

Webb, P. (2021) Quantitative research, reproducibility and replication: a guide for social work researchers. In 10th European Conference for Social Work Research (ECSWR), University of Bucharest, Virtual Conference, 6th May 2021.

Limits of Social Science

Hammersley, M. (2014) The limits of social science. Causal explanation and value relevance

In this short book, Hammersley argues for a social science which eschews grand theorising in favour of the explanation of social phenomena. Drawing inspiration from Max Weber and referring to a range of social theorists and philosophers, Hammersley encourages social scientists to re-think what they are actually doing as researchers in order to create a social science which generates knowledge which is both reliable and valid. Some readers might, of course, reply that there are no problems with social research as an intellectual endeavour, but Hammersley’s purpose seems to be to awake us from our slumbers. This is a task in which he partially succeeds. Hammersley is not, for example, opposed to causal analysis in the social sciences, but argues that we should raise our game by adopting ‘within-case and cross-case analysis’. He also prioritises explanation over theorising with the proviso that ‘all purpose’ explanations are not possible because explanations are ‘always answers to particular questions’. He also argues that value conclusions cannot be derived from evidence, and offers convincing arguments why this might be the case. The consequence of Hammersley’s position is that social research should be limited to making ‘factual’ statements rather than ‘value’ claims. Although much of the book is theoretical, the author grounds his views by referring to social mobility research and to work on the English riots of 2011.

What I most enjoyed about this book is that Hammersley encourages the reader to think hard about social research practice. He is, for example, unconvinced by the view that there is a direct relationship between research and policy outcomes. On the contrary, he says that the relationship is ‘highly mediated and contingent’. Moreover, he recognises that different social science disciplines employ different methods of explanation. One has only to think of the very different approaches of the experimental psychologist and of the historian to appreciate that he has a point. But such explanatory pluralism in the social sciences has a disturbing consequence. If there is no agreed threshold which all social scientists have to meet in order to generate valid and reliable knowledge, then how do these disciplines differ from vocations like investigative or data journalism? In addition, Hammersley draws a sharp distinction between ‘facts’ which are of interest to the social scientist and ‘value claims’ which should be of interest to policymakers and think tanks. If true, it is very hard to see how social researchers can make the case for funding their work in a cultural environment which does not recognise that knowledge has value in itself. Hammersley recognises this point but does not offer any solutions.

This book is not a paean to social science as it is currently practised and will be, to use Hammersley’s own word, a ‘deflationary’ read for some. If, however, you want to read something which may question your preconceptions, this book is a good place to begin.

Review originally published in Research Matters, December 2015

Managing & Sharing Data

Corti, L., Van den Eynden, V., Bishop, L., Woollard, M. (2014) Managing and sharing research data: a guide to good practice

This is a guide to best practice for researchers who want to supplement existing data management skills and those who want to develop data management skills for the first time.

Written by members of the UK’s Data Archive, the authors describe those skills which will be needed to ensure that data is open and reusable, and collected, stored and shared in ways which respect ethical practice and relevant legislation. The authors also make a convincing case for why data sharing is beneficial, and present counter arguments to some of the more common reasons which are given for not sharing data.

The authors introduce the reader to the research data life cycle and approaches to research data management planning as well as referring to specific skills and software which the researcher could usefully acquire. There are, for example, very clear introductions to version control systems and to the encryption of sensitive data using open source software. I particularly enjoyed the chapter about formatting and organising data, which contains a section on how to organise data files logically. The book is written in very clear prose making the more technical topics accessible to the non-specialist. Moreover, the text is supplemented by case studies, exercises and useful references as well as a website.

The authors manage to successfully combine a discussion of abstract topics such as metadata with grounded examples of how these topics could be applied in practice. For the purposes of this review, I read the text sequentially but I think that one could usefully refer to particular chapters or sections in order to fill specific knowledge gaps. Indeed, I found myself repeatedly returning to particular sections of the text to reinforce my understanding of key concepts.

To conclude, this book fills a gap in the market and will, I’m sure, be read by researchers in any discipline where data management skills are needed. I would recommend this book without hesitation. Well written, informative and, with its commitment to transparency and data sharing, commendable.

Review originally published in Research Matters, March 2015

Social Media & Survey Research

Hill, C.A., Dean E., Murphy J. (2014) Social media, sociality and survey research

This book has been written because of the writers’ awareness that declining response rates and inadequate sampling frames present a challenge to all social researchers who wish to collect survey data which is ‘accurate, timely and accessible’. Primarily written by researchers from RTI International, the book is a compendium of chapters which describe how the researchers have incorporated social media data into their research projects. The authors suggest that the book is intended for survey and market researchers, as well as students in survey methodology and market research and I agree that this book will be useful for this constituency.

The writers don’t argue for the replacement of the more familiar survey modes but suggest that postal, web-based and telephone surveys can be supplemented by the imaginative use of social media. Indeed, they recognise that social media data has its own limitations and does not fit easily into designs where precise estimates are needed.

The writers define social media as ‘a collection of websites and web-based systems that allow for mass interaction, conversation, and sharing among members of a network’ and refer to web 2.0 with its user generated content. The book covers a diverse range of topics which include how to predict sentiments and emotions using consistent methods, how to pre-test questionnaires using Skype and Second Life and how to develop innovative research by using social media to collect ideas from large groups of people. There is also a chapter on how to apply the principles of the games designer to market research so that participation in research is more enjoyable.

Although very wide ranging, the book retains its coherence because it is organised around the idea of a ‘sociality hierarchy’ which can be broken down into broadcast, conversational and community levels. The authors also consistently avoid the use of technical language and include a useful set of references – many of which are downloadable – at the end of each chapter.

This book is a must read for any researcher who wants to make use of social media data; it is incisive, instructive, easy to read and, above all, fascinating.

Review originally published in Research Matters, June 2014

Social Network Analysis

Borgatti, S.P., Everett, M.G. and Johnson J.C. (2013) Analyzing social networks

This book takes the reader on a tour of key theoretical concepts in social network analysis. It is divided into four sections: introduction, research methods, core concepts and measures and a final section which deals with what the writers describe as ‘three cross-cutting chapters’ on ‘affiliation type data’, ‘large networks’ and ‘ego network data’. Although primarily theoretical, the book refers to interesting empirical work across the social sciences and health care in order to illustrate core concepts. It introduces readers to software – UCINET and NetDraw – which they can use to analyse and visualise network data but refers to a dedicated website for readers who require a software tutorial.

There is much to commend in this book. The authors provide a clear introduction to graph theory and matrix algebra for non-mathematicians. There is also an interesting introduction to core concepts like ‘centrality’, ‘sub-group’ and ‘equivalence’ and a fascinating discussion of how hypothesis testing is possible with network data when the assumptions of standard inferential tests are violated. The authors also provide invaluable advice on how best to lay out network diagrams in order to make interpretation easier.

However, I think that how information is presented may need to be reviewed. The authors assume that readers are familiar with research terminology without necessarily defining their terms. Although this is a reasonable assumption if the book is for established researchers, beginners may need to refer to an introductory research methods textbook in order to take full advantage of the material. Borgatti et al. also state that a sequential reading of each chapter isn’t needed although this suggestion doesn’t work for readers who assume that a book will begin with straightforward material before moving to advanced topics. A glossary would be useful.

This is an informative book for established social researchers with some prior exposure to social network analysis. Aspirant social network analysts may find the book a little too advanced.

Review originally published in Research Matters, March 2014

Discovering statistics using R

Field, A., Miles J., Field, Z. (2012) Discovering statistics using R

This book teaches statistics by using R – the free statistical environment and programming language. It will be of use to undergraduate and postgraduate students and professional researchers across the social sciences, and includes material which ranges from the introductory to the advanced. Divided into four levels of difficulty with ‘Level 1’ representing introductory material and ‘Level 4’ the most advanced material, it may be read from beginning to end or with reference to particular techniques. An understanding of the advanced material may require knowing the material in earlier chapters. There is a comprehensive glossary of specialised terms and a selection of statistical tables in the appendix. There is also material on the publisher’s companion website and on the principal author’s own web pages.

The main strength of this book is that it presents a lot of information in an accessible, engaging and irreverent way. The style is informal with interesting excursions into the history of statistics and psychology. There are references to research papers which illustrate the methods explained and are also very entertaining. The authors manage to pull off the Herculean task of teaching statistics through the medium of R. This is an achievement when one considers that R can be difficult to use for researchers who have never manipulated data from the command line. Another plus point is that the authors describe how to ‘extend’ R’s capabilities with ‘packages’. This is a massive time saver for any researcher who does not know which package is required in order to extend R’s base system to conduct a particular test. Field et al. also succeed in placing many of the statistical procedures to which they allude within the framework of the ‘general linear model’ giving the book a sense of theoretical coherence.

But I think that the book would have benefited from an explanation of how R fits into the wider ‘tool chain’ of public domain programs which can be used to produce a publication-ready paper. Moreover, some of the exemplars of R code may not work or may be illustrative of deprecated techniques but the principal author is maintaining an errata file on his own website. Nevertheless, I would recommend this book to students, academics and applied researchers. Although heavily weighted towards the interests of psychological researchers, it would not be too difficult to transfer the techniques to a different area of expertise. All in all, an invaluable resource.

Review originally published in Research Matters, December 2013

Hard-to-Survey Populations

Tourangeau R, Edwards B, Johnson T.P., Wolter K.M. & Bates, N (Eds.) Hard to Survey Populations

This is an excellent book that fills a gap in the methodological literature. With contributions from some of the most notable practitioners of survey methodology in the world, this collection is exceptionally comprehensive. The book contains discussions of how to survey groups as diverse as people with intellectual disabilities, the homeless, political extremists and stigmatised groups, as well as a fascinating chapter on the challenges of surveying linguistically diverse populations. One should not therefore assume that this is a dry statistical tome; there is much here for the student, applied researcher and clinician who need a jargon-free introduction to this topic.

There are also discussions of sampling methods for the more methodologically inclined, including explanations of location sampling, which has been used to sample the homeless, nomads and immigrants. Some of the explanations of sampling strategies may, however, be difficult for readers who are not comfortable with mathematics, with Part IV on sampling strategies being particularly challenging in this regard.

Each chapter is, however, self-contained with useful references for the reader who wishes to investigate any topic in more depth. A chapter-by-chapter reading of the book isn’t therefore necessary. The book may profitably be read either as a comprehensive introduction to hard-to-survey populations or as a reference text for those who are thinking about surveying a particular group.

In short, an indispensable resource for any psychologist – irrespective of specialism or level of expertise – who wishes to collect robust data about the lives of people who aren’t always given a voice.

Review originally published in The Psychologist, March 2015

Social Physics: A New Science

Pentland, A. (2014) Social Physics: How Good Ideas Spread – the Lessons from a New Science

Alex Pentland’s book is a hugely readable introduction to “social physics”, which the author defines “as a quantitative social science that describes reliable, mathematical connections between information and idea flow on the one hand and people’s behaviour on the other”. In contradistinction to what the author defines as conventional “individual-centric economic and policy thinking”, Pentland suggests that the primary drivers of cultural evolution in our wired world are “social learning” and “social pressure”.

Pentland entertainingly describes a range of studies which he and colleagues have conducted that are both interesting and counterintuitive. He shows, for example, how equal “conversational turn-taking” is the most important factor in predicting “group intelligence”. Other studies focus on trading and the determinants of political opinion. Indeed, there seems to be nothing which is outside of the purview of social physics.

But Pentland’s enthusiasm for his subject carries an overtone of hubris. For Pentland, constructs like “market”, “class” and “capital” should be replaced by the concepts he outlines in the book. Moreover, he gives a very partial interpretation of history since the Enlightenment, which is puzzling because he simultaneously extols the virtues of Adam Smith and John Locke while suggesting that conventional economic concepts are redundant.

In order to gain a more nuanced view of what drives cultural, social and economic evolution, my advice would be to imagine Pentland in a dialogue with economists, historians, sociologists and philosophers and then to form your own view of the truth of the claims made in this book.

Review originally published in Reviews. Significance, 12:6 45. doi: 10.1111/j.1740-9713.2015.00871.x

SSD for R and Single-Subject Data

Auerbach, C., Zeitlin, W. (2014) SSD for R: An R Package for Analyzing Single-Subject Data

Despite its brevity, Charles Auerbach and Wendy Zeitlin’s book describes how to analyse single-subject data using their own package, SSD for R. They introduce its functions as well as providing advice on how to analyse baseline and intervention phase data.

I thought that their discussion of serial dependency was particularly well done, as was their emphasis on how to use SSD for R to visualise data. Other chapters provide introductions to statistical testing and to the analysis of group data.

Readers should note that the book does not deal with single-subject methodology in any depth, so additional resources will be needed in order to make best use of the package. Fortunately, the authors include useful references for those who need information on specific research designs.

R newbies may need to read an introductory R text as the book’s scope is understandably restricted to providing information about the package. But Auerbach and Zeitlin write well and the content does not demand much in the way of prior statistical knowledge or IT skills.

Statisticians may not need to avail themselves of this book, but practitioners who are working in applied disciplines such as social work, psychology and medicine will find it very appealing.

Review originally published in Reviews. Significance, 12:4 45. doi: 10.1111/j.1740-9713.2015.00846.x

Using R for Introductory Statistics

Verzani, J (2013) Using R for Introductory Statistics (Second Edition)

This book has a laudable aim: to introduce R and topics from an introductory statistics curriculum to students “outside of a classroom environment”. Now in its second edition, the book introduces the reader to exploratory data analysis and manipulation, statistical inference and statistical models. Particular attention is given to thoroughly learning base R before extending R’s capabilities with packages.

Author John Verzani includes information on computationally intensive approaches and manages to explain these topics with interesting, topical and challenging examples. The text includes a plethora of exercises which encourage the reader to test their understanding of the material as well as a useful appendix on R programming and a valuable bibliography.

Although the text is informative, I don’t think it will be useful for readers without any previous exposure to either statistical computing or statistics. The text does begin simply enough, but my impression is that the reader will need to refer to additional resources. I’m therefore not convinced by claims that the book may be used without a teacher. Indeed, the fact that the solutions to exercises are only available to those who adopt the book as a course text suggests that the book is intended for use by university teachers rather than autodidacts.

In short, a stimulating read for the classroom-based student, but too challenging for a neophyte learner studying at home.

Review originally published in Reviews. Significance, 12:2 44–45. doi: 10.1111/j.1740-9713.2015.00818.x