What does Datamining Have to do with Intuition?


Datamining is the ability to winkle more information out of a database than its original authors thought they had put into it. Intuition, on the other hand, is the way the brain makes sure it has completed its analysis/synthesis steps before it takes a new element and puts it into Declarative Memory.
If we were to datamine intuition, we would be winkling out the information the brain hides while it waits for its analysis/synthesis to complete, so that we can use that information before the analysis/synthesis steps have finished.
So far there are essentially two main ways of datamining that we can use in this case:

  1. Oversampling, a technique where you sample the same information repeatedly and recalculate after each sample, in order to sense minute changes in the analysis, and
  2. Acceleration, a technique where you increase the amount of incoming information related to your subject, in order to increase the amount of synthesis going on, in the hope of finding a key that unlocks the intuition.

Oversampling


Oversampling is a datamining technique used to find statistical relationships even though the sample size is too small. Statistical techniques do not deal well with small samples, but you often want to winkle out a statistical measure anyway, and oversampling was designed with this in mind. Essentially, the idea is that random selection has no significance in a statistical evaluation, and therefore picking a sub-population at random from the sample population will not affect the statistical outcome. By sampling the same population randomly a number of times, a larger sample size can be simulated, even though the actual sample size is too small for significant statistical evaluation. Recombining the random samplings into one field of data, as if each were an individual observation, gives a sample size large enough for a valid statistical evaluation. This does not validate the results; it just makes the statistical requirements moot.
This means we can come up with a statistical evaluation even if the sample is too small for the evaluation to be valid on its own. It also means we can raise the apparent assurance of a normal statistical sample, although any statistical measure taken by oversampling is less trustworthy than one taken by normal means. If your statistical work is only meant to indicate a direction for further research, for instance, you might not care that the sample size is too small; just getting the direction will be good enough.
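Read mechanically, this is close to bootstrap-style resampling: draw random sub-samples with replacement from the small sample, recalculate the measure after each draw, and pool the results. The sketch below is only a minimal Python illustration of that reading, not a procedure from this book; the sample values and the choice of the mean as the statistic are assumptions made for demonstration.

  import random
  import statistics

  def oversample_estimate(observations, statistic=statistics.mean,
                          n_resamples=1000, seed=0):
      """Simulate a larger sample by repeatedly drawing random
      sub-samples (with replacement) from a small observation set,
      recalculating the statistic after each draw, and pooling the
      results. This does not add information or validate anything;
      it only smooths what the small sample already implies."""
      rng = random.Random(seed)
      estimates = []
      for _ in range(n_resamples):
          # Pick a sub-population at random from the same small sample.
          resample = [rng.choice(observations) for _ in observations]
          estimates.append(statistic(resample))
      # Recombine the recalculated values as if they were observations.
      return statistics.mean(estimates), statistics.stdev(estimates)

  # Hypothetical small sample, far below a comfortable size on its own.
  small_sample = [4.1, 3.8, 5.0, 4.6, 3.9]
  estimate, spread = oversample_estimate(small_sample)
  print(f"pooled estimate: {estimate:.2f} (spread {spread:.2f})")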
We can get a similar effect in our brains by reviewing the data repeatedly, looking for artifacts that suggest hidden complexity within it. The more often we review the data, the more familiar it becomes, and the more likely we are either to gloss over its detail or to catch statistical aberrations, places where the data is more complex in one area than in others. It is the statistical aberrations that lead us to deeper knowledge, by indicating where to dig deeper with our analysis.
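As a loose illustration of what catching such an aberration might look like when the data is numeric, the sketch below scans a series in fixed windows and flags windows whose variance stands well above the typical level; those are the places worth digging into. The window size, threshold factor, and sample series are arbitrary assumptions, not anything specified here.

  import statistics

  def flag_complex_regions(series, window=10, factor=2.0):
      """Scan a numeric series in fixed windows and return the start
      index of every window whose variance stands well above the
      typical window variance - a crude marker of hidden complexity."""
      windows = [series[i:i + window]
                 for i in range(0, len(series) - window + 1, window)]
      variances = [statistics.pvariance(w) for w in windows]
      typical = statistics.mean(variances)
      return [i * window for i, v in enumerate(variances)
              if v > factor * typical]

  # Hypothetical series: mostly flat, with one noisier stretch.
  data = [1.0] * 30 + [1.0, 3.5, -2.0, 4.0, -1.5, 2.5, 0.0, 3.0, -2.5, 1.5] + [1.0] * 30
  print("dig deeper around index:", flag_complex_regions(data))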

Acceleration


Once we have found all the knots in the data that indicate hidden complexity, and dug down to recover what complexity we can, the next step is to try to synthesize the system we were analyzing from the mechanisms we have discovered. Right away we find that the first thing we need is a framework of data conversion mechanisms that lets us convert the information we found by digging into the data's complexity back into forms that can explain the mechanisms of the system the data belongs to. Without this framework, fitting the data back together is a little like doing a jigsaw puzzle whose pieces don't necessarily fit in the locations they seem to belong in.
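One possible shape for such a framework, sketched only to make the idea concrete, is a registry of converter functions, each one turning a particular kind of low-level finding back into a statement about the system it came from. The finding kinds and converters below are invented for illustration and are not part of this book's model.

  # Hypothetical converter registry: each entry turns one kind of
  # low-level finding back into a statement about the system.
  converters = {
      "variance_spike": lambda f: f"region {f['where']} hides extra structure",
      "correlation": lambda f: f"{f['a']} and {f['b']} seem to move together",
  }

  def explain(findings):
      """Run each finding through its matching converter, skipping any
      piece we do not yet know how to fit back into the picture."""
      return [converters[f["kind"]](f) for f in findings
              if f["kind"] in converters]

  findings = [
      {"kind": "variance_spike", "where": "samples 30-39"},
      {"kind": "correlation", "a": "input rate", "b": "error rate"},
  ]
  print("\n".join(explain(findings)))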
The next problem is making sure we have all the pieces. Nothing annoys a jigsaw puzzle enthusiast more than getting near the end of the puzzle and finding too few pieces to complete it. Sometimes intuitions get hung up on a specific piece of the puzzle and can't resolve themselves enough to be advanced to Declarative Memory. In that case, having some sort of acceleration technique that lets the system find the missing piece faster is a plus.
This book explores two types of techniques for speeding up intuition, or at least the learning it represents: Sampling Techniques and Acceleration Techniques.