Wednesday
Jan182017

A self hacking evolver notebook for Mathematica

The attached mathematica notebook seems to self hack itself (or mathematica 10.02 / 10.3.1 functions) on a mac or Raspberry Pi in usually just 6 generations - about 20seconds of computer time on a 2017 computer, or about 330 seconds on a Raspberry.  Those machine percision numbers in mathematica seem to have subtle issues.  The evolver finds a way to exploit the fitness evaluation with machine numbers rather than trying to solve the otherwise near impossible task of pattern matching a model to random noise.  This code is not intended to be malicious in any way, but as it features self directed self writing code features - even if in a very small contained way - its future behavior can not be predicted.

I post because it is remarkable in its ability to expoit a function in very few generations. 

Here's example output:  in ~24 seconds it runs 6 generations at the end of which all models calculate to have 100% correlation to initially random noise.  Y axis is fitness/correlation, X is evolver population, one chart per generation.

 

       (*Coded by Dustan Doud 2016:  An example of a self hacking evolver that targets and exploits weaknesses in a fitness function*)

(*Data in table, columns are related data species, rows are individual data records.   *)

Quiet[LaunchKernels[] ];  (*maximum speed with more cores!*)
AbsoluteTiming[importedData = {Range[100],Range[100]*RandomReal[2,100],RandomReal[2,100]};(*a real numeric table data should go here*)
importedRowNumber=Length[importedData];

targetData =RandomReal[2,Length[importedData [[1]]  ]   ];  (*real numeric data target should be here.  I used randomness that can't be realistically matched, and that seems to be part of the reason why correlation hacking evolvers arise.  Forgive some code oddness, I am amature enthusiast and code cliped and cleaned from another project of mine *)

dataColumnLength = Length [importedData  ];

pollData[array_]:=RandomChoice[Flatten[array] ,1][[1]]  ;

population = 10000;  (*population of genetic agents*)

reproducerRatio =  1000/population; (*how many of elite are the reproducers?*)

lifespan = 6; (*how many seasons can a single formula replicate before it expires.  *)

(*here are formulae for the modeling.  I've avoided division because of high rates of division by zero.  Instead I have wavelets and pythagorian inspired features.  I've avoided power functions because of high rates of imaginary numbers.  Keeping it simple.*)

aDiv = Compile [{{x, _Real},{y,_Real}}, 2((x+y)*(x-y))/((x*x+y*y)^2+1), RuntimeAttributes->{Listable}];
bDiv = Compile [{{x, _Real},{y,_Real}}, 4((x+y)^2)/(1+x*x+y*y)^2, RuntimeAttributes->{Listable}];
cPlus = Compile [{{x, _Real},{y,_Real}}, (x*x+y*y)^0.5, RuntimeAttributes->{Listable}];
dPart = Compile [{{x, _Real},{y,_Real}}, x*x/(x*x+y*y), RuntimeAttributes->{Listable}];
compute = Compile[{{x,_Real},{y,_Real},{op, _Integer}},
If[op==1,aDiv[x,y] ,
If[op==2,bDiv [x,y],
If[op==3,cPlus[x,y],
If[op==4,dPart [x,y],
If[op==5, x+y,
If[op==6, x-y,
If[op==7,x*y, x] ] ] ] ] ] ] ,
RuntimeAttributes-> {Listable}
];

functionOperations = 8;  (*operation 8 is skip*)

geneticModelComplexity = 6;  (*how many data elements can the model contain?  3 = A - B * C     *)
    geneticModelOperations = geneticModelComplexity -1;  (*see above, one less operation not including equals than elements*)
    
(*make genes for data selector and function selector*)
 operatorgene[number_] :=RandomInteger[{1,functionOperations},geneticModelOperations]  &/@ Range[number]  ;
selectorgene[number_]:= RandomInteger[{1,importedRowNumber}, geneticModelComplexity]  &/@ Range [number]  ;
    
(*sexualCombination is a function that shuffles two lists in a way similar to sexual chromosomal selection.*)
sexualCombination[AnyX_,AnyY_] :=If[Length[AnyX]>1,RandomChoice[{#[[1]],#[[2]]}]&/@ Partition[Riffle[AnyX,AnyY],2],RandomChoice[{AnyX,AnyY}]];

(*sexualCrossover is a function that mixes two arrays of gene elements in way similar to chromosomal crossover - this is akin to watching the Lost Boys movie but switching to Jaws if the screen is filled with an ocean scene.*)
sexualCrossover[anyx_,anyy_]:= Module[{length=Min[Length[anyx] ,Length[anyy]],cut=0},If [length>1,
(cut=RandomInteger[{1,length}];Join[anyx[[;;cut-1]],anyy[[cut;;]]  ])
,RandomChoice[{anyx,anyy}  ]  ] (*endif*)  ]; (*module*)

(*This function takes two genes - selector and operator - and outputs a data matrix based on the model described by selector and operator genes *)
computeLife[s_,o_]:=Module[{start=importedData[[s[[1]] ]]},
Do[start=compute[ start,importedData[[s[[n+1]] ]], o[[n]] ],{n,1, Length[o]} ];start];

(*So how to start? with a random population of evolvers.  First the function to make evolvers, then a variable to hold a population*)

randomIndividual := Flatten[{selectorgene[1],operatorgene[1]} ,1];
randomPopulation :=Partition[Riffle[selectorgene[population],operatorgene[population] ] ,2];

livingPopulation=randomPopulation;

(*And next compute the models described by the evolvers.  Each model trys to be predictive toware target data, the most predictive models will reproduce*)
livingData = Parallelize[computeLife[#[[1]],#[[2]]  ] &/@ livingPopulation];

(*And compute the predictive fitness of each model, note Quiet is used because some models produce snaggledy mathematics*)
Quiet[Parallelize[livingFitnessValues=HoeffdingD[#,targetData]&/@ livingData]];      ;Quiet[Parallelize[livingFitnessValues=If[MachineNumberQ[#], Abs[#],0]&/@ livingFitnessValues   ]  ];

Winter:=((*Oh no!  A bleak winter sets in killing many of the evolvers*)
cut=Quantile[livingFitnessValues,1-reproducerRatio];

survivors=Flatten[Position[livingFitnessValues, xx_ /; 1>= xx>= cut]  ]  ; livingData=livingData[[survivors]] ;  (*Note:  evolvers are very good at finding solutions that create correlation coefficients that are greater than 1.  Not exactly how I believe the correlation functions were intended to work, but something that can be accomplished by hacking machine precision numbers.*)
livingPopulation=livingPopulation[[survivors]];livingFitnessValues=livingFitnessValues[[survivors]];

(*on a reasonably fast 4 core/Kernel machine everthing above should take about 2.5 seconds.  On to groovy love next *)

needChildren=population-Length[livingPopulation];);

makeChild[n_]:=Module[{chance=RandomChoice[{90,8,2}->{1,2,3}]&/@ Range[n]},
Piecewise[{   (*this is a mix of three kinds of sexual reproduction, with approximate odds of 90% 8% and 2% for sexual recombination, crossover and random reseeding*)
{sexualCombination[RandomChoice[livingPopulation],RandomChoice[livingPopulation]],#1==1},{sexualCrossover[RandomChoice[livingPopulation],RandomChoice[livingPopulation]],#1==2},{sexualCrossover[RandomChoice[livingPopulation],randomIndividual],#1==3} }   ]&/@chance];


Spring:=( (*Spring is when baby evolvers are born*)
children=makeChild[needChildren];

childData=Parallelize[computeLife[#[[1]],#[[2]]  ] &/@  children];
Quiet[Parallelize[childFitnessValues=HoeffdingD[#,targetData]&/@ childData]];   Quiet[Parallelize[childFitnessValues=If[MachineNumberQ[#], Abs[#],0]&/@ childFitnessValues   ]  ]; );

Autum:=( (*Autum is when little evolvers reach adulthood*)

livingFitnessValues=Join[livingFitnessValues, childFitnessValues];
livingData=Join[livingData,childData];
livingPopulation=Join[livingPopulation, children];);

Seasons:= (Winter;Spring;Autum;);  (*A generation of evolver activity*)

(*So we are set up for the evolvers to grow, season by season.  What can they do with random noise data targets in just 6 years?*)

Print["Charts of correlation to random noise (evolver fitness value) by evolver generation and across evolver population"];
Flatten[{  ListPlot[Sort[livingFitnessValues]],Table[(Seasons;ListPlot[{Sort[livingFitnessValues],Sort[childFitnessValues]}   ]),{lifespan}]  }]

]

PrintView Printer Friendly Version

EmailEmail Article to Friend

Reader Comments (1)

If one does not want self hacking, a verify machine numbers function can be added that seems to reduce (but not eliminate) unpredictable model behavior.
verifyMachineNumbers[list_] := (* this function may keep unusual self hacking with machine numbers at bay *)
If[ Abs[Max[list]] > $MaxMachineNumber / 1024 || ( (Max[list] - Min[list]) < 1024*$MachineEpsilon), 0.,
1.];

and redifine the fitness function to read:
Quiet[Parallelize[
livingFitnessValues = verifyMachineNumbers[#] * HoeffdingD[#, targetData] & /@ livingData]] ;

.... or just feed it problems it can solve.

January 25, 2017 | Registered CommenterDustan Doud

PostPost a New Comment

Enter your information below to add a new comment.

My response is on my own website »
Author Email (optional):
Author URL (optional):
Post:
 
Some HTML allowed: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <code> <em> <i> <strike> <strong>
« Taxation Without Representation | Main | Jesus Toast »