Imputing new worlds with multiversal appeal

Monthly Archives: May 2015

Hoping I won’t get too lazy in the summer, though I am taking a break this week.  But I do have some suggestions for summer beach reading!!


Just think about it!  Until next time …

So I went to this seminar a while back that talked about how variance, the measure of how different things are from each other, can also change over time.  Kind of like in my story (of course you know I was going there!): when Randy is looking for Anton, he looks into a few dimensions that could be similar to each other to search for him there.  But(!!) if Anton gets any wind of what Randy is looking into, he might start hopping into other dimensions that are more different from each other.  So the variation between dimensions changes, namely increases, over time!  And now we have models that can look into that!! Isn’t that cool?  Well, I thought that was cool anyway.  But until next time, let’s pretend we’re in a dimension in Tahiti, because, hello! It’s Tahiti!!!
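If you want to see the idea in numbers, here is a tiny Python sketch. The dimension numbers are completely made up for illustration (they are not from the seminar or the story); the only point is that the spread of Randy's early searches is smaller than the spread of the later ones.

```python
from statistics import pvariance

# Hypothetical dimension numbers, invented purely for illustration.
early_search = [10, 11, 12, 11, 10]  # Randy checks dimensions that are close together
late_search = [2, 25, 9, 40, 17]     # Anton starts hopping to very different ones

# Population variance of each period: how spread out the dimensions are.
early_var = pvariance(early_search)
late_var = pvariance(late_search)

print(f"early variance: {early_var:.2f}")
print(f"late variance:  {late_var:.2f}")
print("variance increased over time:", late_var > early_var)
```

Models for changing variance do something much fancier than comparing two chunks, of course, but this is the basic picture.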






So how can we convert an ordinal variable to a numerical variable and vice versa? Well, we can work with these things called percentiles, which tell us where a value is within the range of a continuous variable! Yeah, I’ll explain it better – don’t worry.  Say we have 10 numbers (1.405906  2.506874  3.752974  4.818329  4.956841  5.978801  6.924259  7.711760  7.859507 10.218921).  Now, say the first level of our ordinal variable corresponds to any number less than the 30th percentile.  That would be the third smallest number in our list, or 3.752974.  So any number less than 3.752974 would correspond to the first level.  Or if the second level corresponds to numbers between the 30th and 50th percentiles, we would assign any number between 3.752974 and 4.956841 (the fifth smallest number) to that level. And any number above 4.956841, the 50th percentile, would correspond to the third level.

Now, say MGM is more likely to do an Order movie than Universal, which is more likely to do an Order movie than Warner Brothers.  So we put MGM in the first level, Universal in the second level, and Warner Brothers in the third level.  If I then come up with a number of 3.665538, then we can say MGM is going to make my movie (YAY!! I get the lion!!).  But if my number is 5.965685 then Warner Brothers will make the movie. (Not bad … I like Bugs Bunny too).  And if our number is 4.308145, then Universal is most likely to make my movie (about time Universal made something with multiversal appeal!).  Of course, I’d be open to all three studios coming up with franchises for the trilogy and … yeah, okay, first I’ll focus on getting a few more sales this month.  But until next time …
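For the code-minded, here is a minimal Python sketch of the percentile trick. It assumes the nearest-rank convention, where the k-th smallest of our ten numbers counts as the (10 × k)-th percentile, which is how the 30th percentile lands on the third smallest number. The names `nearest_rank_percentile` and `studio_for` are made up for this illustration, and real statistics packages often interpolate between values instead.

```python
import math
from bisect import bisect_right

# The ten numbers from the example.
values = [1.405906, 2.506874, 3.752974, 4.818329, 4.956841,
          5.978801, 6.924259, 7.711760, 7.859507, 10.218921]

# Ordinal levels in order: first, second, third.
studios = ["MGM", "Universal", "Warner Brothers"]


def nearest_rank_percentile(data, pct):
    """Return the value at percentile pct using the nearest-rank rule (an assumption)."""
    ordered = sorted(data)
    rank = max(1, math.ceil(pct * len(ordered) / 100))  # 1-based rank
    return ordered[rank - 1]


# Cutoffs between levels: the 30th and 50th percentiles.
cutoffs = [nearest_rank_percentile(values, 30),
           nearest_rank_percentile(values, 50)]


def studio_for(x):
    """Numbers below the first cutoff get level 1, between the cutoffs level 2, above level 3."""
    return studios[bisect_right(cutoffs, x)]


for number in (3.665538, 4.308145, 5.965685):
    print(number, "->", studio_for(number))
```

Running it sends 3.665538 to MGM, 4.308145 to Universal, and 5.965685 to Warner Brothers, same as in the example.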


MGM … Universal … Warner Brothers — just think about it 😉

So we have categorical data that can describe things not necessarily associated with a number, like cities or colors: red, blue, pink, etc.  But there are also data which might look categorical but which you could still put in order, like weight class, where underweight < normal < overweight, or smoking status, where never < ex-smoker < current smoker.  Which is stuff I see in the everyday data I analyze.  Anyway, we can call these data ordinal.  Now, can we make categorical variables into ordinal variables for, say, imputing?  What? Did someone say … imputing?  Well, I guess we could by relating it to a continuous variable, say, the average annual temperature of a city.  So if we looked at the cities of Miami, Atlanta, and Philadelphia and ranked them by the average annual temperature, we would get Philadelphia < Atlanta < Miami.  Now, what’s even more fun than that?  Well, we can transform our ordinal values to numeric values, do whatever we want to do with them, say, impute them, and then convert them back.  But we’ll get more to that next time.  But anyway, have a great weekend, peeps, and remember that Sunday is Mother’s Day!
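Here is a small Python sketch of the city example, in case it helps. It only uses the ordering Philadelphia < Atlanta < Miami (no actual temperatures), and the rank numbers 1, 2, 3 are just one hypothetical way to code the levels.

```python
# Ordering by average annual temperature: Philadelphia < Atlanta < Miami.
city_order = ["Philadelphia", "Atlanta", "Miami"]

# Assign each city its rank (1 = coolest). The ranks are an illustrative coding only.
city_to_rank = {city: rank for rank, city in enumerate(city_order, start=1)}
rank_to_city = {rank: city for city, rank in city_to_rank.items()}

# A hypothetical column of city values, turned into ordered numbers and back again.
cities = ["Miami", "Philadelphia", "Atlanta", "Miami"]
encoded = [city_to_rank[c] for c in cities]
decoded = [rank_to_city[r] for r in encoded]

print(encoded)
print(decoded)
```

The round trip gives back the original cities, which is what we need if we want to work with numbers in the middle.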


And you know what would be a great gift for mom?  Yeah, okay, I gotchya …

Well, now is as good a time as any!  So I have to do a Venn diagram for one urology project that I’m working on and thought “Holy Moly! This would be a good topic for my blog!” And then I realized it was Thursday and thought to myself “Holy Moly! I need a topic for tomorrow!”  So then I thought that this was as good a time as any to cover it.  So here we go …

So let’s say out of 100 dimensions Randy is monitoring, he finds Anton is in 25 good dimensions and 40 bad dimensions.  Well, that’s not right, you say. 25 + 40 = 65, not 100!  To which I say, aha! But there could be 35 intersection points, dimensions that are both good and bad at the same time, and 25 + 40 + 35 = 100!!  Which is something else the Order wanted to implement! But do they succeed?  Well …  But anyway, until next time, here’s my masterpiece with a demonstration of today’s example.
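And for anyone who would rather check the arithmetic than admire the drawing, here is a quick Python sketch with sets. It assumes the 25 and the 40 mean "only good" and "only bad", with the 35 being both; the dimension ids 0 to 99 are arbitrary labels made up for the example.

```python
# Arbitrary ids for the 100 dimensions (an assumption for illustration):
#   0-24  only good (25), 25-59 both good and bad (35), 60-99 only bad (40)
good = set(range(0, 60))
bad = set(range(25, 100))

only_good = good - bad
only_bad = bad - good
both = good & bad
everything = good | bad

print("only good:", len(only_good))
print("only bad:", len(only_bad))
print("both:", len(both))
print("total:", len(everything))

# Inclusion-exclusion: |good or bad| = |good| + |bad| - |good and bad|
assert len(everything) == len(good) + len(bad) - len(both)
```

So the full good circle holds 60 dimensions and the full bad circle holds 75, and the overlap is what keeps the total at 100.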


Eh? Now, if this doesn’t get my trilogy to sell, I don’t know what … yeah, I know, back to the drawing board.