Blog

Machine learning, text analysis, and more

Water World

Exploring and Predicting Water Use in Salt Lake City

Health Care Indicators in Utah Counties

Correlation Coefficients, a Shiny App, Principal Component Analysis, and Clustering

This Is the Place, Apparently

Demographics and Choropleth Maps of My Home State

Joy to the World, and also Anticipation, Disgust, Surprise…

In my previous blog post, I analyzed my Twitter archive and explored some aspects of my tweeting behavior. When do I tweet, how much do I retweet people, do I use hashtags? These are examples of one kind of question, but what about the actual verbal content of my tweets, the text itself? What kinds of questions can we ask and answer about the text in some programmatic way? This is what is called natural language processing, and I’ll take a first shot at it here.

Ten Thousand Tweets

I started learning the statistical programming language R this past summer, and discovering Hadley Wickham’s data visualization package ggplot2 has been a joy and a revelation. When I think back to how I made all the plots for my astronomy dissertation in the early 2000s (COUGH SUPERMONGO COUGH), I feel a bit in awe of what ggplot2 can do and how easy and, might I even say, delightful it is to use.
