Okay, so today I just wrote a simple poem reviewing random number generation because, well, I wanted a change of pace. That and I still haven’t thought of a specific topic for today. So anyway, here goes:
With just one generation of a number.
You could be in Houston saying Howdy partner!
Or racing down a street in Indianapolis.
Or taking in the river sights of St. Louis.
So stats are fun, kids, if you do them right.
And doesn’t the Atlanta sun look bright?
Or are we in Philadelphia next to the bell?
Or maybe in Detroit, just as well.
So stay tuned next time for some more numbers fun.
But you didn’t know I just named the justified realms under the sun.
Until I just told you and so my subtleness still needs more work.
But hope this poem was at least full of quirk.
So related to our last post, we are going to talk about what’s the best outcome to take if we have multiple imputations, each giving us slightly different data. Or, going back to our trilogy example, which dimension Tina is most likely to be in. So say she could be in Indianapolis or Miami or Atlanta, where the average temperatures in January are 31, 75, and 47 degrees Fahrenheit, respectively. And again, why did she not just stay in the Miami realm in that case? Well, again, you’ll just have to read the books and see, won’t you? Now, let’s say that we get 100 temperature readings but misplace 40 of them … but replace them via multiple imputation and take the average of the average temperatures from each of the imputed data sets. So with the missing values left out, the average temperature calculated from the 60 observed values is 49.91911. But if we impute the data 5 times, we get averages of 49.67150, 50.53665, 50.37694, 50.14105, and 50.35803. And the average of those averages is 50.2168. Anyway, hope that’s enough averages for ya! And on the average of the average of the … you get the picture … it looks like Tina was most likely in the Atlanta realm that January, since Atlanta’s 47 is the closest of the three to 50.2168. Maybe not as nice as the 75 she could have had in Miami, but hey, at least she, um, well uh, could always have access to some delicious peach cobbler.
I mean, just look at that deliciousness! But even being in the Indianapolis realm wouldn’t be that bad, as discussed before. But anyway, till next time, hope you dig in!
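For anyone who wants to check the averages in the temperature example, here is a minimal Python sketch. The five averages and the three January temperatures are the ones quoted above; the variable names and the "pick the closest realm" rule are my own assumptions about how to turn the pooled number into a guess, not a formal method.

```python
from statistics import mean

# Averages from the five imputed data sets quoted in the post.
imputed_means = [49.67150, 50.53665, 50.37694, 50.14105, 50.35803]

# Pooled estimate: the plain average of the per-imputation averages.
pooled_mean = mean(imputed_means)

# January averages (degrees Fahrenheit) for the three candidate realms.
realm_temps = {"Indianapolis": 31, "Miami": 75, "Atlanta": 47}

# Assumed decision rule: pick the realm whose average is closest
# to the pooled estimate.
closest_realm = min(realm_temps, key=lambda r: abs(realm_temps[r] - pooled_mean))

print(round(pooled_mean, 4))  # 50.2168
print(closest_realm)          # Atlanta
```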
So today’s topic is fairly simple and I hope provides an effective argument on why multiple imputation is preferred to, say, single imputation, an example of which is hot deck imputation. And that is that we may want to impute the data several times so that we don’t underestimate the variation in the data. And why is this important? Well, just take Tina in our trilogy. We may want to know where she is at one time and if we just impute the data once, we may determine she’s in Philadelphia. But she may not be in Philadelphia; she may be in Atlanta … or St. Louis … or Miami … or you get the picture. Although she might not want to be in Miami. And why not, you say? Because she’s crazy! No, no, that’s not it. Truth is you might want to read my third installment, Final Orders, to find out why. So how can we determine which dimension Tina is most likely to be in? Well, one way is to take the average estimate of the probability that she is in each dimension. But we can cover that next time too. Now, I’m not saying being single is a bad thing. In fact, I’m being followed by this cool Twitter account, live_singer. In fact, Tina also might want to stay single in the dimensions that she does. Why, you ask? Because she’s craz … again you might want to read my third installment, Final Orders, to find out why. But it might also be fun to imagine yourself in different places, like Indianapolis or Paris or some other place that ends with -is too. And although Tina might not want to go back to Miami, we might want to. So here you go, Bienvenidos a Miami.
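To make the "don’t underestimate the variation" point a bit more concrete, here is a rough Python sketch. Everything in it is made up for illustration: the readings, the seed, and the helper name impute_once are assumptions, and filling gaps with random draws from the observed values is only a simplified stand-in for a real multiple imputation procedure, which would normally draw from a fitted model.

```python
import random
from statistics import mean, stdev

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Made-up temperature readings; None marks a misplaced value.
readings = [47, 52, None, 49, None, 51, 48, None, 53, 50]
observed = [x for x in readings if x is not None]

def impute_once(values, donors, rng):
    """Fill each missing value with a randomly drawn observed value."""
    return [x if x is not None else rng.choice(donors) for x in values]

# Single imputation: one filled-in data set and one estimate,
# with nothing to tell us how much that estimate could move.
single_estimate = mean(impute_once(readings, observed, rng))

# Multiple imputation: several filled-in data sets, one estimate each.
estimates = [mean(impute_once(readings, observed, rng)) for _ in range(5)]
pooled = mean(estimates)
spread = stdev(estimates)  # the variation a single imputation hides

print(single_estimate, pooled, spread)
```

The single estimate is just one of many values we could have landed on; the spread across the five estimates is the extra uncertainty that imputing only once would sweep under the rug.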
So today we will briefly cover two of the many types of imputation there are. There is more than one, you ask? Why yes, and hot deck imputation and cold deck imputation are just two of them! So what are they? Well, basically with hot deck imputation, you fill in missing values with the current data you have at hand and with cold deck imputation, you fill them in with values from another source. So like when Randy (oh come on! You know my schtick by now!) is looking for Anton in the current dimension he is in, I guess he could look up where his target is supposed to work, live, eat, sleep, vacation, fret about book sales (oh wait – that’s me), etc., in that particular dimension and try to find him in one of those places. But if he wants to catch Anton in another dimension, he would want to look up where Anton would work, live, eat, sleep, vacation, fret about book sales (oh wait – that’s me) in the other dimension. So say Anton works at a computer shop in a justified realm; Randy might want to check out the computer shops in that area if he plans to catch him there. But say Randy is planning to catch Anton in another dimension where Anton just so happens to own an art gallery. Then Randy would check out the art galleries of that dimension before actually going there. So how does Randy choose when and where to catch Anton and in which dimension? Well, again it depends on the dangers of the dimension and how likely he is to actually get the place right and all that jazz. And are there other fun types of imputation he could try? Well, yeah, but I’m on another deadline today and need more material for another time so we’ll just cover it then. But since we covered hot n cold deck imputation, what more perfect way to end today’s post than with this. Take it away, Katy!
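And for the code-minded, here is a small Python sketch of the hot deck versus cold deck distinction described above. The function names, the toy lists, and the random-donor rule are all assumptions made for illustration; real hot deck methods usually pick donors that resemble the record with the gap.

```python
import random

def hot_deck(values, rng):
    """Fill gaps with donors drawn from the same (current) data set."""
    donors = [v for v in values if v is not None]
    return [v if v is not None else rng.choice(donors) for v in values]

def cold_deck(values, external_donors, rng):
    """Fill gaps with donors drawn from a different source."""
    return [v if v is not None else rng.choice(external_donors) for v in values]

rng = random.Random(1)

# Hypothetical sightings of Anton in the current dimension; None is a gap.
current = ["computer shop", None, "computer shop", None]
# Hypothetical records from another dimension.
other_source = ["art gallery"]

hot = hot_deck(current, rng)
cold = cold_deck(current, other_source, rng)

print(hot)   # every gap filled from the current data
print(cold)  # every gap filled from the other source
```

With these toy inputs the hot deck result is all computer shops, while the cold deck result puts an art gallery in each gap, which is the whole difference in a nutshell: same gaps, different place to look for the filler.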