Good vibes only

rstats

vibe code

shiny app

user conference

Inspired by the practical problem of chairing a useR conference session, this talk walks through building a Slido replacement in pure “vibe coding” mode — no specs, no tests, just a vision and a stubborn refusal to stop until it worked (sort of). We’ll explore the chaotic joy, the accidental brilliance, and the inevitable questionable code that got us there.

Lucy D’Agostino McGowan

Elevator Dings: A Severance Analysis

rstats

severance

data visualization

This data consists of all elevator dings in the Severance episodes along with the episode number, time stamp, pitch of the ding, and the action associated.

Lucy D’Agostino McGowan

How are we feeling: A Severance Analysis

rstats

severance

data visualization

We create a little sentiment profile for each episode, binning them in three minute increments and calculating the AFINN average sentiment score in each.

Lucy D’Agostino McGowan

Who’s talking: A Severance Analysis

rstats

severance

data visualization

For each episode, we count the number of words each of the four main characters (Mark, Helly, Dylan, and Irving) speaks in each minute and visualize them.

Lucy D’Agostino McGowan

Spooky Seasons Greetings

rstats

normal

paranormal

Let’s do some paranormal plotting!

Lucy D’Agostino McGowan

Visual Diagnostic Tools for Causal Inference

rstats

causal inference

mirrored histograms

balance

propensity scores

treatment effect heterogeneity

Here we are going to look at several diagnostic plots that are helpful when attempting to answer a causal question. They can be used to visualize the target population, balance, and treatment effect heterogeneity.

Lucy D’Agostino McGowan

Visual Diagnostic Tool for Causal Inference: Heterogeneous Treatment Effects

rstats

causal inference

treatment heterogeneity

A simple diagnostic plot to examine potential treatment heterogeneity – what’s old is new!

Lucy D’Agostino McGowan

Are two wrong models better than one? Missing data style

rstats

simulations

missing data

Would imputing + fitting an outcome model using the wrong variables be better than just fitting the wrong outcome model? Let’s investigate!

Lucy D’Agostino McGowan

When is complete case analysis unbiased?

rstats

simulations

missing data

I have been thinking about scenarios under which it makes sense to use imputation for prediction models and am struggling to come up with a case. Yikes! Even for inference, as long as you do some doubly robust approach, I’m not sure I see the value (other than for precision, but then it is no longer a question of bias and thus is a question for a different day!)

Lucy D’Agostino McGowan

It’s just a linear model: neural networks edition

statistical communication

data science pedagogy

neural-networks

I created a little Shiny application to demonstrate that Neural Networks are just souped up linear models: https://lucy.shinyapps.io/neural-net-linear/

Lucy D’Agostino McGowan

Causal Quartets

causal inference

statistical communication

data science pedagogy

confounding

On this week’s episode of Casual Inference, we talk about a “Causal Quartet”: a set of four datasets generated under different mechanisms, all with the same statistical summaries (including visualizations!) but different true causal effects.

Lucy D’Agostino McGowan

Transparency in Public Health

transparency

statistical communication

Transparency in public health messaging matters. Hannah Mendoza and I looked at how providing transparent information about why a public health recommendation is being made can increase uptake in a randomized trial published today in PLOS ONE.

Lucy D’Agostino McGowan

Migrating from Hugo to Quarto

rstats

quarto

blog

We have migrated our blog from Hugo to Quarto! Here are a few quick tips that made the transition a bit smoother.

Lucy D’Agostino McGowan

tipr: An R package for sensitivity analyses for unmeasured confounding

rstats

The tipr R package has new updates!

Lucy D’Agostino McGowan

The Peril of Power when Prioritizing a Point Estimate

COVID-19

non-inferiority trials

clinical trials

power

I recently noticed that the Pfizer immunobridging trials, presumably set up to demonstrate that their COVID-19 vaccines elicit the same antibody response in children as was seen in 16-25 year olds, for whom efficacy has previously been demonstrated, have strange criteria for “success”.

Lucy D’Agostino McGowan

Exploring the impacts of noninferiority trial thresholds

This post explores the impact of setting particular criteria for ‘success’ in clinical trial designs, with a specific example from the recent vaccine immunobridging trials.

Lucy D’Agostino McGowan

Would seeing Spider-Man: No Way Home decrease COVID-19 Cases?

covid-19

statistics

causal inference

In SNL’s cold open last night, President Joe Biden suggested that the COVID-19 surge we are seeing in the US is due to people seeing Spider-Man: No Way Home. If people would just stop seeing this film, he argues, cases will go back down! Interesting hypothesis, let’s take a looksy at the data, shall we?

Lucy D’Agostino McGowan

Vaccine effectiveness and breakthrough cases

covid-19

statistics

I’m seeing lots of confusion around the frequency of breakthrough cases and the effectiveness of vaccines (in fact, a recent interview I did resulted in a confusing headline on this topic!) so let’s dive in!

Lucy D’Agostino McGowan

Denominators Matter

covid-19

statistics

I’ve seen a lot today about how effective the vaccines are; mistakes aside, lots of folks seem to be mixing up which denominators matter - good thing statisticians LOVE denominators

Lucy D’Agostino McGowan

NYTimes Map How-to

rstats

nytimes

covid-19

A quick how-to for a neat New York Times visualization, inspired by an IsoStat listserv conversation.

Lucy D’Agostino McGowan

Grided: a web-app for building css-grid layouts

CSS

website

Grided is an app that lets you define CSS-Grid layouts in a simple GUI allowing you to see how your app will look while you define it.

How to build a DIY Lightboard

diy

lightboard

Learn how to build a lightboard with 5 quick supplies for less than $100!

Lucy D’Agostino McGowan

So you want to learn R…

rstats

There have been several twitter threads circulating with (free!) resources for learning R - I wanted to collect some of my favorites here.

Lucy D’Agostino McGowan

Survival Model Detective: Part 2

rstats

covid-19

survival analysis

competing risks

A paper by Grein et al. was recently published in the New England Journal of Medicine examining a cohort of patients with COVID-19 who were treated with compassionate-use remdesivir. This paper had a flaw in its main statistical analysis. Let’s learn a bit about competing risks!

Lucy D’Agostino McGowan

Survival Model Detective: Part 1

rstats

covid-19

survival analysis

competing risks

A paper by Grein et al. was recently published in the New England Journal of Medicine examining a cohort of patients with COVID-19 who were treated with compassionate-use remdesivir. This paper had a very cool figure - here’s how to recreate it in R!

Lucy D’Agostino McGowan

Graph detective

rstats

covid-19

data visualizations

A plot has been floating around on Twitter from Georgia where the x-axis is all scrambled. Let’s look into it and see if we can fix it!

Lucy D’Agostino McGowan

This one cool hack will…help you categorize Harry Potter characters!

rstats

Inspired by the amazing Not So Standard Deviations, as usual, here is a fun way to categorize data using a left join instead of case_when or if/else statements!!!

Lucy D’Agostino McGowan

First year as faculty

higher education

thoughts

work-life harmony

I just completed my first year as a faculty member - here is what I’ve learned! I’ll start by giving some context for where I am, what my university is like, etc. Then I’ll describe four recommendations, summarized by: discover your institution’s culture, become a peer, find harmony, and build community!

Lucy D’Agostino McGowan

May the Fourth Be With You (#rstats style)

rstats

star wars

ggplot2

gganimate

Rafael Irizarry made a fabulous TIE fighter plot in R, Jake Thompson recreated it using ggplot2 and gganimate, I added some stars.

Lucy D’Agostino McGowan

Prevalence of a disease plays an important role in your probability of having COVID-19 given you tested positive

statistics

uncertainty

coronavirus

casual inference

bayes theorem

The prevalence of a disease plays an important role in your probability of having it given you test positive.

Lucy D’Agostino McGowan

Bayes Theorem and the Probability of Having COVID-19

statistics

uncertainty

coronavirus

casual inference

bayes theorem

I’ve seen a few papers describing the characteristics of people who tested positive for SARS-CoV-2, and this is sometimes being interpreted as describing the probability of infection for people with certain characteristics. Let’s talk about why that’s likely not true.

Lucy D’Agostino McGowan

IHME Model Uncertainty: A quick explainer

statistics

uncertainty

coronavirus

casual inference

There has been a lot of talk about the IHME Covid-19 projection model. Ellie Murray & I have a chat about it on Episode 10 of Casual Inference; here is a quick description of what is going on here with a focus on the uncertainty.

Lucy D’Agostino McGowan

Pulling co-authors for grant docs

rstats

I just submitted my first grant. It turns out you need tons of little things when you submit a grant (who knew!) and one of the little things is a list of all of the coauthors you’ve published with in the past four years. Instead of tracking that down, I automated the process using R and then stuck the code here so I have it for next time!

Lucy D’Agostino McGowan

PheWAS-ME, an app for exploration of multimorbidity patterns in PheWAS

R

shiny

javascript

d3

EHR

biobank

This post is a longer-form and less-formal accompaniment to the manuscript “PheWAS-ME: A web-app for interactive exploration of multimorbidity patterns in PheWAS” and accompanying application. As the first of three papers that make up my PhD dissertation, the project represents a significant collaborative effort bringing together Electronic Health Records (EHR) and Biobank data using R and Shiny.

Building a data-driven CV with R

career

R

RMarkdown

CSS

Updating your CV or Resume can be a pain. It usually involves lots of copying and pasting, and then if you decide to tweak some style you may need to repeat the whole process. I decided to switch things up and design my CV so the format is just a wrapper around the underlying data. This post will help you do the same.

Extending the analogy: The boy who cried wolf was p-hacking!

During my postdoc with Jeff Leek, we worked on a few p-value, study design, and p-hacking “explainers”. Two of these were incorporated into TED-Ed cartoons (The totally ironically named (NOT BY ME) This one weird trick will help you spot clickbait and the less ironic Can you spot the problem with these headlines?), but the analogy written about here was never used, so here it is!

Lucy D’Agostino McGowan

Using AWK and R to parse 25tb

big data

awk

data cleaning

Recently I was tasked with parsing 25tb of raw genotype data. This is the story of how I brought the query time and cost down from 8 minutes and $20 to a tenth of a second and less than a penny, plus the lessons learned along the way.

Understanding propensity score weighting

propensity scores

causal inference

Come enjoy a graphical exploration of various propensity score weighting schemes.

Lucy D’Agostino McGowan

One year to dissertate

phd

one year dissertation

I’ve compiled some resources that I used when completing my dissertation and I wanted to share them with YOU! Throughout this post, I link to a bunch of different templates that I used throughout my process. You can find them all in a GitHub repo. This how-to has gotten a biiiiiit long. This post contains the whole kit-and-kaboodle, but I will also be releasing these in a series of smaller posts over the next couple of weeks.

Lucy D’Agostino McGowan

p-value thoughts: A twitter follow up

rstats

p-values

A conversation about how “convincing” various studies were based on sample size and p-values led me to post a poll on twitter. Here I discuss some thoughts that came up based on these results. tl;dr: p-values are hard, twitter is a fun way to spur stats convos!

Lucy D’Agostino McGowan

Shinyviewr: camera input for shiny

shiny

images

deeplearning

A brief intro to, and tutorial for, the new function in the shinysense package: shinyviewr. This function allows you to take photos using the camera on your computer or phone and directly send them into your shiny applications.

R release names (Updated)

rstats

I always love discussions about R release names and their origin. I have been working on this list for a while – with the release of “Short Summer” today, I thought it’d be a good time to post!

Lucy D’Agostino McGowan

network3d - a 3D network visualization and layout library

visualization

networks

Recently, I have found myself needing to visualize networks. There are plenty of lovely options in R for visualizing networks in 2d, but I have found that many of the networks I want to visualize work much better when done in 3d, and here the options are far fewer. This has prompted me to build the package network3d. This post will be a brief intro to using it.

The United States of Seasons

visualization

maps

climate

How different is the warmest day from the coldest day all around the country? Using readings from 7,000+ NOAA weather stations across the country we can find out.

Wrangling Data Day Texas Slides

rstats

Since twitter threads are excessively cumbersome to navigate, Maëlle asked me to relocate the list of #rstats Data Day Texas slides to a blog post, so here we are!

Lucy D’Agostino McGowan

A set.seed() + ggplot2 adventure

rstats

ggplot2

Recently I tweeted a small piece of advice re: when to set a seed in your script. Jenny pointed out that this may be blog post-worthy, so here we are!

Lucy D’Agostino McGowan

A year as told by fitbit

visualization

wearables

time series

Of all of the important things that happened in 2017, probably the most impactful on the world is that I managed to wear a fitbit the entire year. Here I download my entire year’s worth of heart rate and step data to see what my 2017 looked like, in terms of heart beats and steps.

Leveraging uncertainty information from deep neural networks for disease detection - a summary

deep learning

algorithms

uncertainty

bayesian

I was recently sent this fantastic paper on using uncertainty in deep neural networks. In it the authors demonstrate a practical use of approximate Bayesian inference by dropout in the context of massively complicated computer vision models for diagnosing disease. The paper, while well written, is very long. Here I summarize its main points and comment on their impact.

Secret Sampling

rstats

holiday cheer

’Tis the season for white elephant / גמד וענק / Yankee swap / secret santa-ing! We thought it’d be particularly fun to do it #rstats style.

Sarah Lotspeich and Lucy D’Agostino McGowan

Thanksgiving Gantt Chart

rstats

thankyou

Thanksgiving 🦃 is right around the corner 🎉 – this year we are hosting 17 people 😱. If you too are hosting way more than your kitchen normally cooks for, perhaps this will be of interest!

Lucy D’Agostino McGowan

LSTM neural nets as told by baseball

deep learning

algorithms

baseball

One thing I always found confusing when learning what an LSTM does is understanding intuitively why it’s doing what it does. Here I attempt to give an example of how an LSTM hidden layer can be thought of through baseball.

MCMC and the case of the spilled seeds

interactive

algorithms

visualization

For a long time I was confused by MCMC. I didn’t understand what it was, how it worked, and why we needed to do it. In this post I attempt to clear up those questions and allow you to play with the Metropolis-Hastings algorithm as it attempts to find a posterior to help solve a mystery of two messy birds.

R release names

rstats

I always love discussions about R release names and their origin. I have been working on this list for a while – with the release of “Short Summer” today, I thought it’d be a good time to post!

Lucy D’Agostino McGowan

The traveling metallurgist

interactive

algorithms

visualization

Here I attempt to explain the concepts behind the optimization technique simulated annealing and the combinatorial optimization problem of the traveling salesman. First in words, and then more excitingly in an interactive visualization.

Commentary and follow up to p<0.005 suggestion

A recent paper, Redefine Statistical Significance by 72 co-authors, has caused quite a stir in the statistical community. Our student-run journal club at Vanderbilt will be discussing this contribution at our meeting led by Nathan James this week, so I’ve attempted to create a list of significant responses/commentary that have come out since this paper was posted on PsyArXiv.

Lucy D’Agostino McGowan

A Simple Slack Bot With Plumber

catslaps

gifs

slack

plumber

apis

I’ve been excited about the R package Plumber ever since hearing about it for the first time at useR!2017. So when I finally found an application that would allow me to use it, sending cat and dog photos over slack, I jumped at the opportunity.

Why you maybe shouldn’t care about that p-value

p-values

statistics

sushi-cat

taco-tuesday

Recently, there seems to have been an uptick in citations of studies or statistics about this or that in the news and on the internet. Often these studies claim validity on the basis of a p-value. Through a small contrived example I make the point that in some situations we may want to ignore the forest and focus on the trees.

The Exponential Power Series

statistics

visualization

interactive

I find series expansions fascinating. I also find any math involving e to be fascinating. Here I explain some of the facets of the exponential power series and its connection to my favorite distribution, the Poisson.

How to make an R Markdown website (with RStudio!)

rstats

website

tutorial

Interested in creating your personal website with R Markdown? We’ve updated our R Markdown website tutorial to depend on RStudio for simplicity, making website building easy as 🍰!

Nick Strayer & Lucy D’Agostino McGowan

Twitter trees

rstats

twitter

A little over a week ago, Hilary Parker tweeted out a poll about sending calendar invites that generated quite the repartee. It was quite popular – so much so that I couldn’t possibly keep up with all of the replies! I personally am quite dependent on my calendar, but I was intrigued to see what others had to say. This inspired me to try out some swanky R packages for visualizing trees.

Lucy D’Agostino McGowan

The making of “We R-Ladies”

rstats

rladies

Maëlle and I created a mosaic of R-Ladies for the JSM Data Art Show. Here is a quick tutorial if you are interested in trying something similar!

Lucy D’Agostino McGowan and Maëlle Salmon

Happy World Emoji Day: an analysis of rOpenSci’s Slack emojis

rOpenSci

rstats

emojis

HAPPY world emoji day! In honor of this momentous occasion, I have decided to analyze the emojis used on rOpenSci’s Slack.

Lucy D’Agostino McGowan

Introducing the tuftesque blogdown theme

rstats

If you like the way our blog looks, you too can have your own blogdown driven site just like it! In this post I walk through how to set up an RMarkdown driven blog from scratch using blogdown and the tuftesque theme constructed for Live Free Or Dichotomize.

useR!2017 digressions

rstats

conferences

travel

We both recently attended useR!2017 in Brussels. It was a blast to say the least. Here we will cover our favorite things about the conference and the lessons we learned.

Nick Strayer & Lucy D’Agostino McGowan

runconf17, an analysis of emoji use

rOpenSci

rstats

conferences

emojis

I had such a delightful time at rOpenSci’s unconference. Not only was it extremely productive (21 packages were produced!), but in between the crazy productivity was some epic community building.

Lucy D’Agostino McGowan

ENAR in words

ENAR

tidytext

conferences

rstats

I had an absolutely delightful time at ENAR this year. Lots of talk about the intersection between data science & statistics, diversity, and great advancements in statistical methods. Since there was quite a bit of twitter action, I thought I’d do a quick tutorial in scraping twitter data in R.

Lucy D’Agostino McGowan

Introducing shinyswipr: swipe your way to a great Shiny UI

javascript

visualization

Recently we have been working on a shiny app that mimics tinder for preprints. One of the more exciting things we’ve done in this app is implement a swiping input. Now you can too with the package shinyswipr.

Intro to GMD

collaboration

rstats

Google Docs

Lucy and I have made a simple package that allows you to pull down a collaborative Google Doc directly into an RMD file on your computer, hopefully speeding up the process of writing collaborative statistical documents.

The dire consequences of tests for linearity

rstats

rms

type 1 error

nonlinearity

This is a tale of the dire (type 1 error) consequences that occur when you test for linearity 😱

Lucy D’Agostino McGowan

The prevalence of drunk podcasts

NSSD

rstats

emojis

For today’s rendition of I am curious about everything, in Hilary Parker & Roger Peng’s Not So Standard Deviations Episode 32, Roger suggested the prevalence of drunk podcasting has dramatically increased - so I thought I’d dig into it.

Lucy D’Agostino McGowan

Yoga for modeling

A New Year’s resolution for all of our models: get more flexible! By flexible, we mean let’s be more intentional about fitting nonlinear parametric models.

Lucy D’Agostino McGowan

CatterPlot thank you note

thankyou

rstats

emojis

Lara Harmon has put in countless hours to build and uplift the ASA Student community. We are SO grateful.

Lucy D’Agostino McGowan

Custom JavaScript visualizations in RMarkdown

javascript

visualization

Recently RStudio added JavaScript chunks to RMarkdown. This makes many exciting things possible. Among these things is making your own custom JavaScript visualizations of data managed in R, all without leaving the .Rmd document. This is a quick walkthrough of doing just that.

Regression modeling strategies: a student’s perspective

rms

rstats

Nick and I are starting a series following Frank Harrell’s Regression Modeling Strategies course. Get ready for some crazy fun.

Lucy D’Agostino McGowan

dplyr thank you note

thankyou

rstats

It’s that post-holiday time of year to write some thank yous! I’m getting excited to attend rstudio::conf next week, so in that spirit, I have put together a little thank you using dplyr.

Lucy D’Agostino McGowan

Wait, what are P-values?

P-values are annoying; let’s understand them so we don’t get beaten by them.

Hill for the data scientist: an xkcd story

data-science

epidemiology

xkcd

NSSD

This was inspired by Hilary Parker & Roger Peng’s Not So Standard Deviations Episode 28. It was suggested that it would be useful to lay out Hill’s criteria for data scientists; I agree!

Lucy D’Agostino McGowan