optionsScalper

verbose=on, snakeOil=off, pontification=on, humanIntelligence=off

Thursday, January 19, 2006 - Posts

Estimates in errors

I ran into an ironic situation tonight.

One of my activities at JJB Research is identifying and estimating the errors in models.  There are a number of approaches to this problem.  Some involve statistics, others involve simulations, and still others involve empirical study sans statistics (does everything fit those silly distributions?).  Once an estimate is made, the process by which the error (or error terms) was discovered is itself evaluated.  The models are then reconsidered and sometimes modified in light of this new information (do not say Bayes' Theorem).  This can result in overfitting or ((shhhhh)) data-mining.  A favorite quote is always at arm's length:

All models are wrong.  Some are useful.  ---unattributed.

Contrast this with experiments and the scientific process.  Loosely, a question is asked, a hypothesis is constructed, an experiment is designed and executed with data collection, and the collected data is evaluated to determine whether, in fact, the experiment represents a true test of the hypothesis and whether the results of that test can provide a conclusion to the hypothesis, positive or negative.  The results that are used to form the conclusion should provide information about the validity of the test of the hypothesis.  This is not an easy task.  But again, the goal is the result, i.e. to answer the hypothesis, thumbs up or thumbs down.  If there are improvements to this process, they occur in many forms, including improved measurement of collected data, better quality of data collection, catching observations that were missed when comparing a control group to the experimental group, etc.  Of course there are varied ways to think of these tasks, but in the end, what matters is the discipline to conduct this process with control over each aspect of the test and the hypothesis.  That evil bias beast lurks around every corner.

So with models, I estimate something and attempt to reduce error and improve the model.  With experiments, I attempt to improve the process and provide some level of certainty that bias will not be introduced.

((not a topic switch; bear with me))

I'm a single Dad and, like any parent, have a bunch of simple tasks to do.  Make meals, do the dishes, clean the toilets, etc.  Doing the laundry is one task that I actually don't mind.  I have a nice laundry area and quite frankly, it is well organized.  While it is in the basement, it is well lit and has a utility shelf for all laundry products within arm's reach of the washer/dryer/stationary tub.  Across from these machines are two three-bin clothes sorters for all dirty clothes.  From left to right: darks, lights, reds, whites, delicates and presoaks, and finally towels and sheets.  Just behind these sorters is a 3'x8' buffet table that is used for folding and other clean clothes activities.  I could provide more details, but it wouldn't improve the discussion.

Teh Liz (improved spelling taken from reading Teh Reducer) has two jobs in the laundry workload:  1) differentiate her dirty clothes from her clean clothes by putting them in her hamper, thus relieving me of the burden of deciding which clothes on the floor are actually in need of laundering, and 2) provide, in said hamper, all (or nearly all) empty hangers from her closet.

This evening, I gathered her clothes from the hamper (to be taken down to be sorted, etc.; process, process) and asked why her empty hanger count was so low.  "You can check my closet if you like".  Gee thanks, I'll rifle through all of your clothes looking for the random empty hangers.  "I looked already and you'll be lucky to find another 5 or 6".  Badabing.  Error estimation in action.  Let's check out how Teh Liz fared.

The closet is a standard in-wall closet with a single hanging rail and is approximately 9 feet in length.  This accommodates approximately 14 garments per foot for 7 feet of the closet for a total of 98 garments.  This means that her estimate of her error in the location of empty hangers during my search should yield about one hanger every 15" of hanging rail (don't make fun of cumulative distribution functions here).  Starting from the right, I looked through each hanger to determine if it was indeed empty.  Ummmmmmmmmm, after the first 12" of search, I had an empty hanger count of 7.  I had exceeded her estimate within the first 14% of the data collection and while there isn't much order in this closet, I couldn't believe that this distribution was random normal (chant: "stop the introduction of bias, think pure thoughts").  The total number of empty hangers harvested during this data collection was 27 (draw your own conclusions about the distribution with this incomplete information; the right heptile (7-quantiles?) of the closet yielded the highest empty hanger count).
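
For anyone who wants to check the closet arithmetic, here is a back-of-the-envelope sketch in Python.  The rail length, garment density, and hanger counts come straight from the numbers above; the 5.5 midpoint for "5 or 6" and the function name `closet_stats` are my own assumptions for illustration, nothing more.

```python
def closet_stats(feet_used=7, garments_per_foot=14,
                 estimated_hangers=5.5,   # assumed midpoint of "5 or 6"
                 inches_searched=12, total_found=27):
    """Reproduce the rough numbers from the closet search."""
    rail_inches = feet_used * 12                  # usable rail, in inches
    garments = feet_used * garments_per_foot      # capacity of the usable rail
    # Expected spacing between empty hangers if the estimate were right
    # and the hangers were spread evenly along the rail.
    inches_per_hanger = rail_inches / estimated_hangers
    # Share of the rail searched when the estimate was already exceeded.
    fraction_searched = inches_searched / rail_inches
    # How many times larger the harvest was than the estimate.
    underestimate_factor = total_found / estimated_hangers
    return {
        "garments": garments,
        "inches_per_hanger": inches_per_hanger,
        "fraction_searched": fraction_searched,
        "underestimate_factor": underestimate_factor,
    }


if __name__ == "__main__":
    stats = closet_stats()
    print(f"garments: {stats['garments']}")
    print(f"one empty hanger every {stats['inches_per_hanger']:.1f} inches")
    print(f"estimate exceeded after {stats['fraction_searched']:.1%} of the rail")
    print(f"harvest was {stats['underestimate_factor']:.1f}x the estimate")
```

With the defaults this gives 98 garments, roughly one hanger every 15.3 inches, the estimate blown after about 14.3% of the rail, and a final harvest about 4.9 times the estimate.  The even-spacing assumption is exactly the part the closet refused to honor.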

So I got to thinking: is this a model where steps to reduce errors could be introduced?  (The most obvious would be to put empty hangers in the hamper upon changing their status from in-use to empty, which reduces the likelihood that a hanger with status "empty" exists in the closet.  Work with me here, I'm parenting, too.)  Or was this an experiment with a bad hypothesis and no conclusion?

My conclusions:  1) Teh Liz must not be my child as her estimate of error is excessive, 2) 98 garments of any type is too many for a 16-year-old and finally 3) I had last done laundry four days earlier, so why, oh why, were there 27 empty hangers (with a corresponding 27 articles of clothing in need of laundering) for this interval?  Must . . . reduce . . . errors . . . in . . . parenting.  Either that or just lighten up.

 

posted Thursday, January 19, 2006 7:55 PM by optionsScalper with 0 Comments
