Punctuation in literature
This morning I was scrolling through Twitter and noticed Alberto Cairo share this lovely data visualization piece by Adam J. Calhoun about the varying prevalence of punctuation in literature. I...continue reading.
Note: Cross-posted with the Stack Overflow blog. Starting today, you can access the public data release for Stack Overflow’s 2018 Developer Survey. Over 100,000 developers from around the world shared...continue reading.
This year, I have given some talks about understanding principal component analysis using what I spend day in and day out with, Stack Overflow data. You can see a recording...continue reading.
I am so lucky to work with so many generous, knowledgeable, and amazing people at Stack Overflow, including Ian Allen and Kirti Thorat. Both Ian and Kirti are part of...continue reading.
In a recent release of tidytext, we added tidiers and support for building Structural Topic Models from the stm package. This is my current favorite implementation of topic modeling in...continue reading.
I am pleased to announce that tidytext 0.1.6 is now on CRAN! Most of this release, as well as the 0.1.5 release which I did not blog about, was for...continue reading.
I recently passed my one-year anniversary of working at Stack Overflow as a data scientist. I have some very exciting news! I am joining the data team at @StackOverflow. ✨✨✨...continue reading.
A few weeks ago, I wrote a post about finding word vectors using tidy data principles, based on an approach outlined by Chris Moody on the StitchFix tech blog. I’ve...continue reading.
I love emoji ❤️ and I love xkcd, so this recent comic from Randall Munroe was quite a delight for me. I sat there, enjoying the thought of these new...continue reading.
Last week I saw Chris Moody’s post on the Stitch Fix blog about calculating word vectors from a corpus of text using word counts and matrix factorization, and I was...continue reading.
Note: cross-posted with the Stack Overflow blog. If you hang out on Meta Stack Overflow, you may have noticed news from time to time about A/B tests of various features...continue reading.
I have a new post on the Stack Overflow blog today about the complex, interrelated ecosystems of software development. On the data team at Stack Overflow, we spend a lot...continue reading.
I am pleased to announce that tidytext 0.1.4 is now on CRAN! This release of our package for text mining using tidy data principles has an excellent collection of delightfulness...continue reading.
I’ve been developing a course at DataCamp over the past several months, and I am happy to announce that it is now launched! The course is Sentiment Analysis in R:...continue reading.
I have a new visual essay up at The Pudding today, using text mining to explore how women are portrayed in film. The R code behind this analysis is publicly...continue reading.
At useR!2017 in Brussels last month, I contributed to an organized session focused on navigating the 11,000+ packages on CRAN. My collaborators on this session and I recently put together...continue reading.
Earlier this month, I, along with John Nash, Spencer Graves, and Ludovic Vannoorenberghe, organized a session at useR!2017 focused on discovering, learning about, and evaluating R packages. You can check...continue reading.
Note: Cross-posted with the Stack Overflow blog. This week, my fellow Stack Overflow data scientist David Robinson and I are happy to announce the publication of our book Text Mining...continue reading.
Recently, I have been following the development and release of Kyle Walker’s tidycensus package. I have been filled with amazement, delight, and well, perhaps another feeling… There should be a...continue reading.
I am pleased to announce that tidytext 0.1.3 is now on CRAN! In this release, my collaborator David Robinson and I have fixed a handful of bugs, added tidiers for...continue reading.