optionsScalper

verbose=on, snakeOil=off, pontification=on, humanIntelligence=off

Subscriptions

<January 2009>
SuMoTuWeThFrSa
28293031123
45678910
11121314151617
18192021222324
25262728293031
1234567

News

I have been having problems with comments. If you need to comment, please see the contact button at the top of the page.

Navigation

Post Categories

About Me

JJBR

Articles

Milwaukee Bloggers

"Gentlemen" bloggers

GA/GP/EC/ML

Sensible People

F#

Math, NT, GT, TOC

Security Blogs

DirectX/Game Development

Bananas for the Honey Monkeys

If you live in an ant-hill or 5,000 feet below the surface of the ocean and you have not heard of the Honey Monkeys, I'll forgive you.  Otherwise, I'll expect that you know of these creatures.  They are socially responsible bots that live on the internet looking for spyware and viruses.  As a user of the internet, I truly appreciate what these guys are doing on my behalf.  Let's introduce the monkeys and then let's have a discussion on how to feed them bananas, i.e. simple EC methods for distracting the monkeys from doing their jobs.  Disclaimer:  I'm not a distributor nor author of spyware or viruses.  I'm not a proponent of spyware and viruses.  I hate this stuff.  I'm a socially responsible citizen (or at least that's where I try to keep my moral compass) in the physical world and on the internet.  Let's further be clear that this information is not a comprehensive document on the work that was done, nor was this information put under any scrutiny by any group other than JJBR (public and termites).

Static Nature of the Monkeys

the crazy one and I actually have a few ECs that measure many things about networks.  We are both TOC guys, i.e. networks have complexities and can be classified.  We thought it would be fun to put some monkeys on the simulators and see what happened.  While we remained unbiased for the experiments, we have agents that act like the monkeys (with evolvable strategies), so we had a pretty good idea of how the monkeys would respond.

A quick review of the monkeys reveals that they perform very simple tasks (observable facts):

  • Acquire URLs from HTML pages.
  • Navigate to the URLs
  • Determine if the URL planted a virus or spyware.  If so, perform some reporting activity of the URL.

Unknowns:

  • Do the monkeys always originate from the same IPs/domain?
  • Do the monkeys always present the same browser capabilities to the host of the URL?
  • Do the monkeys have any limitations on the number of URLs (pages) that they will scan for a particular domain?
  • Do the monkeys have any limitations on the consecutive number of URLs (pages) that they will visit for a particular domain?
  • Do the monkeys have any adaptive behaviors (observable or through induction)?

Since we don't have answers on the unknowns, we'll assume for the moment that the answer is no to each question.

A note on Framing:  A frame is temporal window in a run of an environment of an EC experiment.  A run contains a collection of frames, ordered in time.  A frame contains a record of all activities of the EC for that frame's frame index.  Those unfamiliar with the term decile, it means to break into 10 groups (quantile is the general term).  So when referring to a decile of frames, take the current number of frames, divide it by ten and then collect the frames in each decile.  For an experiment in progress, the only observable fact is the current frame, so the process of decilization can only occur for studies on this fact, e.g. if an experiment is on frame index 820, then the size of the decile is 82; frames 1-82 are considered the first decile, frames 83-164 are the second decile and so on.

Still in the City: No Monkeys, No Banana Trees

What we wanted to observe was whether the static monkeys would provide any advantage in the desired OEB.  The OEB we thought would be useful would be:  the monkeys must sustain a maximum infection rate below 5% (of all vertices) for an indefinite period of time (at least the last four deciles of framing of any experimental run).  As my OEB entry already has some of the basic concepts, let's copy it and expand on it.

Terms

  • Vertex (Vertices) - Each vertex of the graph in the EC represents a computer, enumerated as workstations, servers and bigservers.
  • Infection - A program that runs on a vertex.  The existence on the vertex of one or more infections marks the vertex as infected.  Note:  For these experiments, we use isolated infections, an infection cannot contact another vertex on its own and infect that vertex.
  • Disinfection - A program that identifies a single infection and provides a strategy for removal and protection of the infection.  Used by the disinfector.
  • Infections - Property of a vertex.  Implemented as a collection, infections are placed here.  If the count of the infection property of the vertex is greater than 0, the vertex is infected.
  • Occupation - A vertex must have at least one occupation (work to do and reason for doing it).  Occupations are from the enumeration: Infection Author, Infection Distributor, Disinfection Author, Disinfection Distributor, Disinfector, Page Distributor, End User.
  • Programs - Property of a vertex.  Implemented as a collection, programs are executed by users, i.e. Authors, Distributors, Disinfectors, End Users.
  • Infection Author - An agent that runs a program that occupies a vertex and is run for the purpose of the construction of a new infection.  An infection author occupies only one vertex at a time.  Infection authors carry their programs and may use any vertex marked as workstation.
  • Disinfection Author - An agent that runs a program that occupies a vertex and is run for the purpose of the construction of a disinfection.  A disinfection author occupies only one vertex at a time.  Disinfection authors carry their programs and may use any vertex marked as workstation.
  • User - An agent that runs programs on a vertex, demands disinfection from disinfection distributors and demands pages from page distributors.
  • Page - A unit of data requested by a user delivered by a page distributor.
  • Page Distributor - An agent on a vertex that provides on-demand pages to other vertices.
  • Disinfection Distributor - An agent on a vertex that contains a large number of disinfections and is responsible for distributing the disinfections to other computers.
  • Infection Distributor - An agent on a vertex that acts as a page distributor and contains a large number of infections.  It is responsible for distributing the infections with pages to other computers.
  • Disinfector - A program that runs on a vertex.  This program is responsible for infection identification, proactive infection protection and removal of infections.

A control environment that contains these items is set up and will have parameters set such that the network will be 100% saturated with infections (every vertex has an infection count > 1) at or before frame 10,000.  The control environment is run 100 times.  The mean frame for saturation is frame 9,994.  The last four decile infection mean is 81%.

Note that the population of agents remains constant.  All strategy for this control group is held constant.

Hey, hey we're the Monkeys

Another environment is set up with the same control set in the same experimental EC.  The following is added:

  • Monkey - An agent that acts like a user and spends 100% of it's vertex time demanding pages.  If a fulfilled page delivery results in an infection count increase on the monkey's vertex, the monkey notifies the police station.  One note: the monkey receives its vertex and page(s) information initially from a pages from "probabilistic infection distributors" and then from the last demanded unvisited page as determined by the current page.  If a monkey has no more pages to search, it either sits idle, or is awoken with a new list from another monkey.  Monkey communication is enabled.
  • Police Station - Message manager that receives notification from monkeys and keeps state of a collection of police units.
  • Police Unit - An agent that when notified visits a node and if present, removes and destroys an infection author or an infection distributor.  A police unit has statuses: idle, in transit, destroying.  Exactly one frame is required for "Notification to destroy" message to be sent from the station to a police unit.  Exactly one frame is required upon receipt of "Notification to destroy" to travel (in transit) to a vertex.

Everything else is held constant from the control group and this group is labeled the monkey group.  This group is run 100 times.  The mean frame for saturation is not measurable (through frame 100,000).  The last four decile infection mean is 61%.  No saturation and lowered four decile infection mean, i.e. the monkeys reduce the infections considerably.  We are not near the 5% OEB that we had hoped for, but we could try to improve the monkeys.  Without a full discussion, the monkeys were not able to improve the four decile infection mean to a value below 50% without significant bias in the model.  At this point, we have a failed hypothesis for the OEB.  Let's see if we can't make matters worse.

Knock knock; who's there; banana banana

Another environment is set up with the monkey group in the same experimental EC.  The following is added:

  • Banana Tree - A program on a vertex that is a page distributor.  The banana is a vertex page pair that points to a banana program.
  • Banana Program - A program on a vertex that generates fake vertex page pairs and allows for delivery of fake vertex page pairs.

Everything else is held constant from the monkey group and this group is labeled the banana group.  This group is run 100 times.  The mean frame for saturation is 66,442, i.e. the monkeys reduce the infections considerably, but the network still becomes saturated.  Also notice that the police are incapable of recognizing or shutting down a banana tree.

Toxic banana trees

Another environment is set up with the banana group in the same experimental EC.  The following is added:

  • Authors are allowed to spawn, i.e. evolvable strategy for authoring of infections and disinfections (in equal amounts) that "jumps" from agent to agent.  An author can "infect" another agent on the network with authoring skills.
  • Distributors are allowed to spawn (asexually reproduce) and move from vertex to vertex.
  • Vertex creation is allowed at probabilistic rates of growth with probabilistic programs and agents at initialization.  A vertex will always be created without infections and with disinfections.
  • Infection distributors are allowed to spawn vertices with banana trees that share vertex page pairs for page delivery, i.e. a banana tree can reference another banana tree for a contained page destination.
  • Monkeys are held constant, because they are static.

Everything else is held constant from the banana group and this group is labeled the Toxic group.  This group is run 100 times.  The mean frame for saturation is 3,019.  The last four decile infection mean is 86%.

Conclusions

The honey monkeys are a great idea.  When adaptive behaviors are introduced, the static monkeys become less effective, as expected.  This likely means either:

  1. All other experimental data must be held constant for effective honey monkeys.
  2. The honey monkeys must become adaptive.

Let's hope we hear some goodness from the researchers that brought us the monkeys.

posted on Wednesday, May 25, 2005 7:45 PM by optionsScalper

# Honey Monkeys and Bananas @ Thursday, May 26, 2005 4:59 AM

scalp posted up our (actually, mostly his)&amp;nbsp;work on the simulation of the&amp;nbsp;honey monkeys.&amp;nbsp;...

Anonymous

# Down the Rabbit Hole We Go @ Thursday, June 02, 2005 8:55 PM

(warning:&amp;nbsp; long, essay post)
Today I feel centered.&amp;nbsp; Not balanced, not inverted, but centered.&amp;nbsp;...

Anonymous

# Multidisciplinary Perspectives: Self-NonSelf Discrimination @ Monday, June 06, 2005 12:51 PM

In my recent rabbit hole post, I challenge the reader (after the challenge to actually read it), to become...

Anonymous

# AI: Sithsense.com and fun with neural nets @ Tuesday, June 07, 2005 10:37 PM

I love the guys at Microsoft Research.&amp;nbsp; I think that they have some brilliant minds.&amp;nbsp; I see...

Anonymous

# Security: RSS Feed Heisting @ Thursday, June 16, 2005 12:49 AM

I noticed that Bill had a post here on a blog that he found that happens to have the exact same content...

Anonymous

# Showing Some Skin and Hiding Some Drugs @ Saturday, June 25, 2005 8:45 AM

I'm looking forward to the weekend.&amp;nbsp; The forecast here is for hot weather and I'm looking forward...

Anonymous

# re: Maybe I like the CAPTCHA After all - The lesser of Two Evils. @ Sunday, July 31, 2005 1:17 AM

Anonymous

Powered by Community Server, by Telligent Systems