Here is the ovarian cancer data split randomly into a training set and a test set. The training set has 47 normal and 81 cancer cases; the test set has 44 normal and 81 cancer cases. Files: ovtrain, ovtest.
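In case you want to make further random splits of your own, here is a minimal sketch of one way to do it. It is not necessarily how ovtrain and ovtest were produced: the function name `random_split`, the seed, and the choice to shuffle patient indices are all assumptions made for illustration. The sizes 253 and 128 come from the counts above (47 + 81 training patients out of 253).

```python
import random
from typing import List, Tuple


def random_split(n_items: int, n_train: int, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Randomly partition the indices 0..n_items-1 into a training and a test part.

    Hypothetical helper: the seed and the index-shuffling approach are
    assumptions, not the procedure used to create ovtrain/ovtest.
    """
    if not 0 <= n_train <= n_items:
        raise ValueError("n_train must be between 0 and n_items")
    indices = list(range(n_items))
    # A private Random instance keeps the split reproducible for a given seed.
    random.Random(seed).shuffle(indices)
    return sorted(indices[:n_train]), sorted(indices[n_train:])


# 253 patients in total, 128 of them (47 normal + 81 cancer) in the training set.
train_idx, test_idx = random_split(253, 128)
```

Note that a plain shuffle like this does not control how many normal and cancer cases land in each part; a different seed gives a different class balance.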
19th October 06
Here is the ovarian cancer data! The full data set is here -- it is very large, 34 MB. There are 253 lines, one per patient. Each line is a list of 15,154 values, followed by a final field (no. 15,155) that is either the word "normal" or the word "cancer". In case it helps, I have split the file into several smaller files as follows. E.g. ov1-20 contains the first 20 patients, ov21-40 contains the next 20, and so on.
ov1-20, ov21-40, ov41-60, ov61-80, ov81-100, ov101-120, ov121-140, ov141-160, ov161-180, ov181-200, ov201-220, ov221-240, ov241-253
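A minimal sketch of reading this format, one patient per line with 15,154 values followed by the class label. The field separator is an assumption (a comma here; pass a different `sep` if the files turn out to use tabs or something else), and the names `parse_patient_line` and `load_patients` are hypothetical helpers, not part of any supplied code.

```python
from typing import Iterable, List, Tuple

N_FEATURES = 15154  # values per patient, as described above
LABELS = {"normal", "cancer"}


def parse_patient_line(line: str, n_features: int = N_FEATURES,
                       sep: str = ",") -> Tuple[List[float], str]:
    """Split one line into its numeric values and its class label.

    Assumption: fields are separated by `sep` (comma by default).
    """
    fields = [field.strip() for field in line.strip().split(sep)]
    if len(fields) != n_features + 1:
        raise ValueError(f"expected {n_features + 1} fields, got {len(fields)}")
    label = fields[-1].lower()
    if label not in LABELS:
        raise ValueError(f"unexpected label: {fields[-1]!r}")
    return [float(field) for field in fields[:-1]], label


def load_patients(lines: Iterable[str], n_features: int = N_FEATURES,
                  sep: str = ",") -> Tuple[List[List[float]], List[str]]:
    """Parse every non-blank line; return (list of value lists, list of labels)."""
    values: List[List[float]] = []
    labels: List[str] = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines, e.g. a trailing newline at end of file
        row, label = parse_patient_line(line, n_features, sep)
        values.append(row)
        labels.append(label)
    return values, labels
```

For example, assuming the first chunk has been saved locally under the name ov1-20, `with open("ov1-20") as f: X, y = load_patients(f)` should give 20 rows in `X` and 20 labels in `y`. The field-count check is there so that a wrong separator fails loudly rather than silently producing garbage.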
5th October 06
Robert, you might also find this useful -- the main slides about Evolutionary Algorithms from a course I used to do at Exeter.
Here is the thesis skeleton.
Robert, see these introductions to EAs: here, and here.
Here is the main original paper on Ovarian cancer: here.
This survey on data mining to find rules will also be good to read.