Even then, you have to make a variety of (sometimes shaky) assumptions:
1. The data set is a representative sample
2. Sentiment of social media posts is a good proxy for happiness
3. Machine analysis of sentiment is reliable (e.g. doesn't fail when it encounters sarcasm, which is common on Twitter)
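To make the third assumption concrete, here's a minimal sketch of how a naive scorer gets fooled by sarcasm. This is a toy word-counting approach with a made-up word list (`POSITIVE`, `NEGATIVE`, and `toy_sentiment` are hypothetical names for illustration), not the model actually used here, and the two posts are invented examples.

```python
# Toy lexicon-based scorer: an illustration only, NOT the model used in this project.
# The word lists below are made up for the example.
POSITIVE = {"love", "great", "awesome", "happy"}
NEGATIVE = {"hate", "awful", "terrible", "sad"}


def toy_sentiment(text):
    """Return (# positive words) - (# negative words) for a post."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return pos - neg


# Invented example posts: one sincere, one sarcastic.
sincere = "I love this campus, the library is great"
sarcastic = "Oh great, another 8am exam. I just love finals week!"

print(toy_sentiment(sincere))    # 2
print(toy_sentiment(sarcastic))  # 2, same score even though the post is a complaint
```

Word counting can't tell the two apart, since both contain "love" and "great". Fancier models do better, but sarcasm is still a known weak spot, so it's worth spot-checking some posts by hand.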
Gonna be tweaking the approach over time, trying out other models and frameworks.
E.g. look at the average Rate My Professors rating per school. There's of course room for bias, but higher professor ratings at a school may correlate with overall student happiness.
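A rough sketch of what that comparison could look like: average the ratings per school, then check how well those averages line up with the per-school sentiment scores using a Pearson correlation. Everything here is an assumption for illustration. The school names and numbers are placeholders (not real data), and `mean_by_school` and `pearson` are hypothetical helpers, not part of any existing code.

```python
from collections import defaultdict
from math import sqrt


def mean_by_school(rows):
    """Average the values per school from (school, value) pairs."""
    totals = defaultdict(lambda: [0.0, 0])  # school -> [running sum, count]
    for school, value in rows:
        totals[school][0] += value
        totals[school][1] += 1
    return {school: total / count for school, (total, count) in totals.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one series is constant")
    return cov / (spread_x * spread_y)


# Placeholder data, NOT real ratings or real sentiment scores.
ratings = [
    ("School A", 4.0), ("School A", 3.0),
    ("School B", 2.0), ("School B", 3.0),
    ("School C", 5.0), ("School C", 4.0),
]
sentiment = {"School A": 0.1, "School B": -0.2, "School C": 0.3}

avg_rating = mean_by_school(ratings)
schools = sorted(avg_rating)  # fixed order so both lists line up
r = pearson([avg_rating[s] for s in schools], [sentiment[s] for s in schools])
print(avg_rating)
print(round(r, 3))
```

With only a handful of schools the correlation will be noisy, and a high value wouldn't show that good professors cause happier students. It's just a sanity check on whether the two signals move together.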