Elevator Dings: A Severance Analysis

rstats
severance
data visualization
This data consists of all elevator dings in the Severance episodes along with the episode number, time stamp, pitch of the ding, and the action associated.
Mar 20, 2025
Lucy D’Agostino McGowan

How are we feeling: A Severance Analysis

rstats
severance
data visualization
We create a little sentiment profile for each episode, binning them in three-minute increments and calculating the average AFINN sentiment score in each.
Feb 20, 2025
Lucy D’Agostino McGowan

Who’s talking: A Severance Analysis

rstats
severance
data visualization
For each episode we count the number of words each of the four main characters (Mark, Helly, Dylan, and Irving) speaks in each minute and visualize them.
Feb 5, 2025
Lucy D’Agostino McGowan

Spooky Seasons Greetings

rstats
normal
paranormal
Let’s do some paranormal plotting!
Oct 30, 2024
Lucy D’Agostino McGowan

Visual Diagnostic Tools for Causal Inference

rstats
causal inference
mirrored histograms
balance
propensity scores
treatment effect heterogeneity
Here we are going to look at several diagnostic plots that are helpful when attempting to answer a causal question. They can be used to visualize the target population, balance, and treatment effect heterogeneity.
Aug 4, 2023
Lucy D’Agostino McGowan

Visual Diagnostic Tool for Causal Inference: Heterogeneous Treatment Effects

rstats
causal inference
treatment heterogeneity
A simple diagnostic plot to examine potential treatment heterogeneity – what’s old is new!
Jul 26, 2023
Lucy D’Agostino McGowan

Are two wrong models better than one? Missing data style

rstats
simulations
missing data
Would imputing + fitting an outcome model using the wrong variables be better than just fitting the wrong outcome model? Let’s investigate!
Apr 29, 2023
Lucy D’Agostino McGowan

When is complete case analysis unbiased?

rstats
simulations
missing data
I have been thinking about scenarios under which it makes sense to use imputation for prediction models and am struggling to come up with a case. Yikes! Even for inference, as long as you do some doubly robust approach, I’m not sure I see the value (other than for precision, but then it is no longer a question of bias and thus is a question for a different day!)
Apr 28, 2023
Lucy D’Agostino McGowan

It’s just a linear model: neural networks edition

statistical communication
data science pedagogy
neural-networks
I created a little Shiny application to demonstrate that Neural Networks are just souped up linear models: https://lucy.shinyapps.io/neural-net-linear/
Apr 27, 2023
Lucy D’Agostino McGowan

Causal Quartets

causal inference
statistical communication
data science pedagogy
confounding
On this week’s episode of Casual Inference we talk about a “Causal Quartet”, a set of four datasets generated under different mechanisms, all with the same statistical summaries (including visualizations!) but different true causal effects.
Apr 24, 2023
Lucy D’Agostino McGowan

Transparency in Public Health

transparency
statistical communication
Transparency in public health messaging matters. Hannah Mendoza and I looked at how providing transparent information about why a public health recommendation is being made can increase uptake in a randomized trial published today in PLOS ONE.
Dec 14, 2022
Lucy D’Agostino McGowan

Migrating from Hugo to Quarto

rstats
quarto
blog
We have migrated our blog from Hugo to Quarto! Here are a few quick tips that made the transition a bit smoother.
Sep 19, 2022
Lucy D’Agostino McGowan

tipr: An R package for sensitivity analyses for unmeasured confounding

rstats
The tipr R package has new updates!
Sep 8, 2022
Lucy D’Agostino McGowan

The Peril of Power when Prioritizing a Point Estimate

COVID-19
non-inferiority trials
clinical trials
power
I recently noticed that the Pfizer immunobridging trials, presumably set up to demonstrate that their COVID-19 vaccines elicit the same antibody response in children as was seen in 16-25 year olds, for whom efficacy has previously been demonstrated, have strange criteria for “success”.
Feb 21, 2022
Lucy D’Agostino McGowan

Exploring the impacts of noninferiority trial thresholds

This post explores the impact of setting particular criteria for ‘success’ in clinical trial designs, with a specific example from the recent vaccine immunobridging trials.
Feb 20, 2022
Lucy D’Agostino McGowan

Would seeing Spider-Man: No Way Home decrease COVID-19 Cases?

covid-19
statistics
causal inference
In SNL’s cold open last night, President Joe Biden suggested that the COVID-19 surge we are seeing in the US is due to people seeing Spider-Man: No Way Home. If people would just stop seeing this film, he argues, cases will go back down! Interesting hypothesis, let’s take a looksy at the data, shall we?
Jan 16, 2022
Lucy D’Agostino McGowan

Vaccine effectiveness and breakthrough cases

covid-19
statistics
I’m seeing lots of confusion around the frequency of breakthrough cases and the effectiveness of vaccines (in fact, a recent interview I did resulted in a confusing headline on this topic!) so let’s dive in!
Aug 17, 2021
Lucy D’Agostino McGowan

Denominators Matter

covid-19
statistics
I’ve seen a lot today about how effective the vaccines are; mistakes aside, lots of folks seem to be mixing up which denominators matter - good thing statisticians LOVE denominators
Jul 21, 2021
Lucy D’Agostino McGowan

NYTimes Map How-to

rstats
nytimes
covid-19
A quick how-to for a neat New York Times visualization, inspired by an IsoStat listserv conversation.
Apr 7, 2021
Lucy D’Agostino McGowan

Grided: a web-app for building css-grid layouts

CSS
website
Grided is an app that lets you define CSS-Grid layouts in a simple GUI allowing you to see how your app will look while you define it.
Mar 1, 2021
Nick Strayer

How to build a DIY Lightboard

diy
lightboard
Learn how to build a lightboard with 5 quick supplies for less than $100!
Oct 2, 2020
Lucy D’Agostino McGowan

So you want to learn R…

rstats
There have been several twitter threads circulating with (free!) resources for learning R - I wanted to collect some of my favorites here.
Jul 2, 2020
Lucy D’Agostino McGowan

Survival Model Detective: Part 2

rstats
covid-19
survival analysis
competing risks
A paper by Grein et al. was recently published in the New England Journal of Medicine examining a cohort of patients with COVID-19 who were treated with compassionate-use remdesivir. This paper had a flaw in its main statistical analysis. Let’s learn a bit about competing risks!
May 22, 2020
Lucy D’Agostino McGowan

Survival Model Detective: Part 1

rstats
covid-19
survival analysis
competing risks
A paper by Grein et al. was recently published in the New England Journal of Medicine examining a cohort of patients with COVID-19 who were treated with compassionate-use remdesivir. This paper had a very cool figure - here’s how to recreate it in R!
May 21, 2020
Lucy D’Agostino McGowan

Graph detective

rstats
covid-19
data visualizations
A plot has been floating around on Twitter from Georgia where the x-axis is all scrambled. Let’s look into it and see if we can fix it!
May 17, 2020
Lucy D’Agostino McGowan

This one cool hack will…help you categorize Harry Potter characters!

rstats
Inspired by the amazing Not So Standard Deviations, as usual, here is a fun way to categorize data using a left join instead of case_when or if/else statements!!!
May 16, 2020
Lucy D’Agostino McGowan

First year as faculty

higher education
thoughts
work-life harmony
I just completed my first year as a faculty member - here is what I’ve learned! I’ll start by giving some context for where I am, what my university is like, etc. Then I’ll describe four recommendations, summarized by: discover your institution’s culture, become a peer, find harmony, and build community!
May 15, 2020
Lucy D’Agostino McGowan

May the Fourth Be With You (#rstats style)

rstats
star wars
ggplot2
gganimate
Rafael Irizarry made a fabulous TIE fighter plot in R, Jake Thompson recreated it using ggplot2 and gganimate, I added some stars.
May 4, 2020
Lucy D’Agostino McGowan

Prevalence of a disease plays an important role in your probability of having COVID-19 given you tested positive

statistics
uncertainty
coronavirus
casual inference
bayes theorem
The prevalence of a disease plays an important role in your probability of having it given you test positive.
Apr 13, 2020
Lucy D’Agostino McGowan

Bayes Theorem and the Probability of Having COVID-19

statistics
uncertainty
coronavirus
casual inference
bayes theorem
I’ve seen a few papers describing the characteristics of people who tested positive for SARS-CoV-2 and this is sometimes being interpreted as describing the probability of infection for people with certain characteristics. Let’s talk about why that’s likely not true.
Apr 9, 2020
Lucy D’Agostino McGowan

IHME Model Uncertainty: A quick explainer

statistics
uncertainty
coronavirus
casual inference
There has been a lot of talk about the IHME Covid-19 projection model. Ellie Murray & I have a chat about it on Episode 10 of Casual Inference; here is a quick description of what is going on here with a focus on the uncertainty.
Apr 8, 2020
Lucy D’Agostino McGowan

Pulling co-authors for grant docs

rstats
I just submitted my first grant. It turns out you need tons of little things when you submit a grant (who knew!) and one of the little things is a list of all of the coauthors you’ve published with in the past four years. Instead of tracking that down, I automated the process using R and then stuck the code here so I have it for next time!
Nov 14, 2019
Lucy D’Agostino McGowan

PheWAS-ME, an app for exploration of multimorbidity patterns in PheWAS

R
shiny
javascript
d3
EHR
biobank
This post is a longer-form and less-formal accompaniment to the manuscript “PheWAS-ME: A web-app for interactive exploration of multimorbidity patterns in PheWAS” and accompanying application. As the first of three papers that make up my PhD dissertation, the project represents a significant collaborative effort bringing together Electronic Health Records (EHR) and Biobank data using R and Shiny.
Oct 16, 2019
Nick Strayer

Building a data-driven CV with R

career
R
RMarkdown
CSS
Updating your CV or Resume can be a pain. It usually involves lots of copying and pasting, and then if you decide to tweak some style you may need to repeat the whole process. I decided to switch things up and design my CV so the format is just a wrapper around the underlying data. This post will help you do the same.
Sep 4, 2019
Nick Strayer

Extending the analogy: The boy who cried wolf was p-hacking!

During my postdoc with Jeff Leek, we worked on a few p-value, study design, and p-hacking “explainers”. Two of these were incorporated into TED-Ed cartoons (The totally ironically named (NOT BY ME) This one weird trick will help you spot clickbait and the less ironic Can you spot the problem with these headlines?), but the analogy written about here was never used, so here it is!
Aug 24, 2019
Lucy D’Agostino McGowan

Using AWK and R to parse 25tb

big data
awk
data cleaning
Recently I was tasked with parsing 25tb of raw genotype data. This is the story of how I brought the query time and cost down from 8 minutes and $20 to a tenth of a second and less than a penny, plus the lessons learned along the way.
Jun 4, 2019
Nick Strayer

Understanding propensity score weighting

propensity scores
causal inference
Come enjoy a graphical exploration of various propensity score weighting schemes.
Jan 17, 2019
Lucy D’Agostino McGowan

One year to dissertate

phd
one year dissertation
I’ve compiled some resources that I used when completing my dissertation and I wanted to share them with YOU! Throughout this post, I link to a bunch of different templates that I used throughout my process. You can find them all in a GitHub repo. This how-to has gotten a biiiiiit long. This post contains the whole kit-and-kaboodle, but I will also be releasing these in a series of smaller posts over the next couple of weeks.
Sep 14, 2018
Lucy D’Agostino McGowan

p-value thoughts: A twitter follow up

rstats
p-values
A conversation about how “convincing” various studies were based on sample size and p-values led me to post a poll on twitter. Here I discuss some thoughts that came up based on these results. tl;dr: p-values are hard, twitter is a fun way to spur stats convos!
Aug 21, 2018
Lucy D’Agostino McGowan

Shinyviewr: camera input for shiny

shiny
images
deeplearning
A brief intro to, and tutorial for, the new function in the shinysense package: shinyviewr. This function allows you to take photos using the camera on your computer or phone and directly send them into your shiny applications.
Jul 22, 2018
Nick Strayer

R release names (Updated)

rstats
I always love discussions about R release names and their origin. I have been working on this list for a while – with the release of “Short Summer” today, I thought it’d be a good time to post!
Apr 23, 2018
Lucy D’Agostino McGowan

network3d - a 3D network visualization and layout library

visualization
networks
Recently, I have found myself needing to visualize networks. There are plenty of lovely options in R for visualizing networks in 2d, but I have found that many of the networks I want to visualize work much better when done in 3d, and here the options are far fewer. This has prompted me to build the package network3d. This post will be a brief intro to using it.
Apr 9, 2018
Nick Strayer

The United States of Seasons

visualization
maps
climate
How different is the warmest day from the coldest day all around the country? Using readings from 7,000+ NOAA weather stations across the country we can find out.
Feb 12, 2018
Nick Strayer

Wrangling Data Day Texas Slides

rstats
Since twitter threads are excessively cumbersome to navigate, Maëlle asked me to relocate the list of #rstats Data Day Texas slides to a blog post, so here we are!
Jan 28, 2018
Lucy D’Agostino McGowan

A set.seed() + ggplot2 adventure

rstats
ggplot2
Recently I tweeted a small piece of advice re: when to set a seed in your script. Jenny pointed out that this may be blog post-worthy, so here we are!
Jan 22, 2018
Lucy D’Agostino McGowan

A year as told by fitbit

visualization
wearables
time series
Of all of the important things that happened in 2017, probably the most impactful on the world is that I managed to wear a fitbit the entire year. Here I download my entire year’s worth of heart rate and step data to see what my 2017 looked like, in terms of heart beats and steps.
Dec 27, 2017
Nick Strayer

Leveraging uncertainty information from deep neural networks for disease detection - a summary

deep learning
algorithms
uncertainty
bayesian
I was recently sent this fantastic paper on using uncertainty in deep neural networks. In it the authors demonstrate a practical use of approximate Bayesian inference by dropout in the context of massively complicated computer vision models for diagnosing disease. The paper, while well written, is very long. Here I summarize its main points and comment on their impact.
Dec 24, 2017
Nick Strayer

Secret Sampling

rstats
holiday cheer
’Tis the season for white elephant / גמד וענק / Yankee swap / secret santa-ing! We thought it’d be particularly fun to do it #rstats style.
Nov 15, 2017
Sarah Lotspeich and Lucy D’Agostino McGowan

Thanksgiving Gantt Chart

rstats
thankyou
Thanksgiving 🦃 is right around the corner 🎉 – this year we are hosting 17 people 😱. If you too are hosting way more than your kitchen normally cooks for, perhaps this will be of interest!
Nov 12, 2017
Lucy D’Agostino McGowan

LSTM neural nets as told by baseball

deep learning
algorithms
baseball
One thing I always found confusing when learning what an LSTM does is understanding intuitively why it’s doing what it does. Here I attempt to give an example of how an LSTM hidden layer can be thought of through baseball.
Nov 8, 2017
Nick Strayer

MCMC and the case of the spilled seeds

interactive
algorithms
visualization
For a long time I was confused by MCMC. I didn’t understand what it was, how it worked, and why we needed to do it. In this post I attempt to clear up those questions and allow you to play with the Metropolis-Hastings algorithm as it attempts to find a posterior to help solve a mystery of two messy birds.
Oct 14, 2017
Nick Strayer

R release names

rstats
I always love discussions about R release names and their origin. I have been working on this list for a while – with the release of “Short Summer” today, I thought it’d be a good time to post!
Sep 28, 2017
Lucy D’Agostino McGowan

Commentary and follow up to p<0.005 suggestion

A recent paper, Redefine Statistical Significance by 72 co-authors, has caused quite a stir in the statistical community. Our student-run journal club at Vanderbilt will be discussing this contribution at our meeting led by Nathan James this week, so I’ve attempted to create a list of significant responses/commentary that have come out since this paper was posted on PsyArXiv.
Sep 25, 2017
Lucy D’Agostino McGowan

The traveling metallurgist

interactive
algorithms
visualization
Here I attempt to explain the concepts behind the optimization technique simulated annealing and the combinatorial optimization problem of the traveling salesman. First in words, and then more excitingly in an interactive visualization.
Sep 25, 2017
Nick Strayer

A Simple Slack Bot With Plumber

catslaps
gifs
slack
plumber
apis
I’ve been excited about the R package Plumber ever since hearing about it for the first time at useR!2017. So when I finally found an application that would allow me to use it, sending cat and dog photos over Slack, I jumped at the opportunity.
Sep 3, 2017
Nick Strayer

The Exponential Power Series

statistics
visualization
interactive
I find series expansions fascinating. I also find any math involving e to be fascinating. Here I explain some of the facets of the exponential power series and its connection to my favorite distribution, the Poisson.
Aug 14, 2017
Nick Strayer

Why you maybe shouldn’t care about that p-value

p-values
statistics
sushi-cat
taco-tuesday
Recently, there seems to have been an uptick in citations of studies or statistics about this or that in the news and on the internet. Often these studies claim validity on the basis of a p-value. Through a small contrived example I make the point that in some situations we may want to ignore the forest and focus on the trees.
Aug 14, 2017
Nick Strayer

How to make an R Markdown website (with RStudio!)

rstats
website
tutorial
Interested in creating your personal website with R Markdown? We’ve updated our R Markdown website tutorial to depend on RStudio for simplicity, making website building easy as 🍰!
Aug 8, 2017
Nick Strayer & Lucy D’Agostino McGowan

Twitter trees

rstats
twitter
A little over a week ago, Hilary Parker tweeted out a poll about sending calendar invites that generated quite the repartee. It was quite popular – so much so that I couldn’t possibly keep up with all of the replies! I personally am quite dependent on my calendar, but I was intrigued to see what others had to say. This inspired me to try out some swanky R packages for visualizing trees.
Jul 24, 2017
Lucy D’Agostino McGowan

The making of “We R-Ladies”

rstats
rladies
Maëlle and I created a mosaic of R-Ladies for the JSM Data Art Show. Here is a quick tutorial if you are interested in trying something similar!
Jul 18, 2017
Lucy D’Agostino McGowan and Maëlle Salmon

Happy World Emoji Day: an analysis of rOpenSci’s Slack emojis

rOpenSci
rstats
emojis
HAPPY world emoji day! In honor of this momentous occasion, I have decided to analyze the emojis used on rOpenSci’s Slack.
Jul 17, 2017
Lucy D’Agostino McGowan

Introducing the tuftesque blogdown theme

rstats
If you like the way our blog looks, you too can have your own blogdown driven site just like it! In this post I walk through how to set up an RMarkdown driven blog from scratch using blogdown and the tuftesque theme constructed for Live Free Or Dichotomize.
Jul 13, 2017
Nick Strayer

useR!2017 digressions

rstats
conferences
travel
We both recently attended useR!2017 in Brussels. It was a blast to say the least. Here we will cover our favorite things about the conference and the lessons we learned.
Jul 12, 2017
Nick Strayer & Lucy D’Agostino McGowan

runconf17, an analysis of emoji use

rOpenSci
rstats
conferences
emojis
I had such a delightful time at rOpenSci’s unconference. Not only was it extremely productive (21 packages were produced!), but in between the crazy productivity was some epic community building.
Jun 4, 2017
Lucy D’Agostino McGowan

ENAR in words

ENAR
tidytext
conferences
rstats
I had an absolutely delightful time at ENAR this year. Lots of talk about the intersection between data science & statistics, diversity, and great advancements in statistical methods. Since there was quite a bit of twitter action, I thought I’d do a quick tutorial in scraping twitter data in R.
Mar 16, 2017
Lucy D’Agostino McGowan

Introducing shinyswipr: swipe your way to a great Shiny UI

javascript
visualization
Recently we have been working on a shiny app that mimics tinder for preprints. One of the more exciting things we’ve done in this app is implemented a swiping input. Now you can too with the package shinyswipr.
Mar 12, 2017
Nick Strayer

Intro to GMD

collaboration
rstats
Google Docs
Lucy and I have made a simple package that allows you to pull down a collaborative google doc directly into an RMD file on your computer, hopefully speeding up the process of writing collaborative statistical documents.
Feb 24, 2017
Nick Strayer

The dire consequences of tests for linearity

rstats
rms
type 1 error
nonlinearity
This is a tale of the dire (type 1 error) consequences that occur when you test for linearity 😱
Feb 18, 2017
Lucy D’Agostino McGowan

The prevalence of drunk podcasts

NSSD
rstats
emojis
For today’s rendition of I am curious about everything, in Hilary Parker & Roger Peng’s Not So Standard Deviations Episode 32, Roger suggested the prevalence of drunk podcasting has dramatically increased - so I thought I’d dig into it.
Feb 9, 2017
Lucy D’Agostino McGowan

Yoga for modeling

A New Year’s resolution for all of our models: get more flexible! By flexible, we mean let’s be more intentional about fitting nonlinear parametric models.
Jan 27, 2017
Lucy D’Agostino McGowan

CatterPlot thank you note

thankyou
rstats
emojis
Lara Harmon has put in countless hours to build and uplift the ASA Student community. We are SO grateful.
Jan 25, 2017
Lucy D’Agostino McGowan

Custom JavaScript visualizations in RMarkdown

javascript
visualization
Recently RStudio added JavaScript chunks to RMarkdown. This makes many exciting things possible. Among these things is making your own custom JavaScript visualizations of data managed in R, all without leaving the .Rmd document. This is a quick walkthrough of doing just that.
Jan 24, 2017
Nick Strayer

Regression modeling strategies: a student’s perspective

rms
rstats
Nick and I are starting a series following Frank Harrell’s Regression Modeling Strategies course. Get ready for some crazy fun.
Jan 18, 2017
Lucy D’Agostino McGowan

dplyr thank you note

thankyou
rstats
It’s that post-holiday time of year to write some thank yous! I’m getting excited to attend rstudio::conf next week, so in that spirit, I have put together a little thank you using dplyr
Jan 7, 2017
Lucy D’Agostino McGowan

Wait, what are P-values?

P-Values are annoying, let’s understand them so we don’t get beaten by them.
Dec 24, 2016
Nick Strayer

Hill for the data scientist: an xkcd story

data-science
epidemiology
xkcd
NSSD
This was inspired by Hilary Parker & Roger Peng’s Not So Standard Deviations Episode 28. It was suggested that it would be useful to lay out Hill’s criteria for data scientists, and I agree!
Dec 15, 2016
Lucy D’Agostino McGowan