Understanding Optimization: Difference between revisions
From MultiCharts
→Understanding Genetic Algorithm Optimization
No edit summary |
(7 intermediate revisions by 2 users not shown) | |||
Line 5: | Line 5: | ||
Different traders use different criteria to define strategy performance. Some traders use the highest net profit, while other traders use the lowest drawdown. MultiCharts lets the trader define his own criteria. | Different traders use different criteria to define strategy performance. Some traders use the highest net profit, while other traders use the lowest drawdown. MultiCharts lets the trader define his own criteria. | ||
Optimization can have detrimental effects if the user searches for the combination of inputs based solely on the best performance over a period of historical data and focuses | Optimization can have detrimental effects if the user searches for the combination of inputs based solely on the best performance over a period of historical data and focuses too much on market conditions that may never occur again. This approach is known as over-optimization or curve-fitting. Performance will not be the same in real trading, since historical patterns are highly unlikely to be repeated. | ||
<br> | <br> | ||
Line 29: | Line 29: | ||
The drawback of the GA approach is that the solution found will be a solution approaching the absolute optimum solution, but not necessarily the absolute optimum solution itself. This drawback, however, is handsomely offset by the processing power and time savings in cases with a large number of possible solutions. | The drawback of the GA approach is that the solution found will be a solution approaching the absolute optimum solution, but not necessarily the absolute optimum solution itself. This drawback, however, is handsomely offset by the processing power and time savings in cases with a large number of possible solutions. | ||
In general, a GA works with two abstractions: an Individual (or Genome) and an Algorithm (i.e. the Genetic Algorithm itself). Each Genome instance represents a single unique combination of inputs, while the GA itself defines how the evolution should take place. The GA uses a given trading strategy to determine how 'fit' a genome is for survival, e.g. how much Net Profit | In general, a GA works with two abstractions: an Individual (or Genome) and an Algorithm (i.e. the Genetic Algorithm itself). Each Genome instance represents a single unique combination of inputs, while the GA itself defines how the evolution should take place. The GA uses a given trading strategy to determine how 'fit' a genome is for survival, e.g. how much Net Profit a combination of inputs generates if Net Profit was selected as the optimization criterion. | ||
Here are some GA definitions that help in understanding the process: | <br>Here are some GA definitions that help in understanding the process: | ||
<br>'''Fitness''' - the overall performance of an individual (e.g. Net Profit). | <br>'''Fitness''' - the overall performance of an individual (e.g. Net Profit). | ||
Line 57: | Line 57: | ||
# After a number of all possible combinations is determined, an optimal number of individuals is selected. | # After a number of all possible combinations is determined, an optimal number of individuals is selected. | ||
# Each individual is selected at random. These individuals form the first Generation. The optimal number of individuals is automatically placed | # Each individual is selected at random. These individuals form the first Generation. The optimal number of individuals is automatically placed next to the '''Set Population Size''' box and can be changed manually.<br><div style="background-color: #E5F6FF;">Tip: An excessively large Population Size value will result in an increase in calculation time, while an overly small Population Size value will result in a decrease in calculation accuracy.</div><br><div style="background-color: #E3FBE5;">Note: MultiCharts' GA supports an artificially exclusive population. This means that identical individuals cannot exist inside the same population, and thus the population size cannot exceed the total number of input combinations. The population size is constant for each generation.</div> | ||
# The fitness of each individual is evaluated and the least fit individuals discarded. | |||
<br><div style="background-color: #E5F6FF;">Tip: An excessively large Population Size value will result in an increase in calculation time, while an overly small Population Size value will result in a decrease in calculation accuracy.</div> | # A new population of individuals is generated from the remaining members of the previous population by applying the crossover and mutation operations, as well as selection and/or replacement strategies that depend on the GA subtype: <br> | ||
#: '''Crossover and Mutation''' | |||
<br><div style="background-color: #E3FBE5;">Note: MultiCharts' GA supports an artificially exclusive population. This means that identical individuals cannot exist inside the same population, and thus the population size cannot exceed the total number of input combinations. The population size is constant for each generation.</div> | #: MultiCharts uses the so-called Array Uniform Crossover. With this Crossover type, each of the child’s genes can come from either parent with equal probability. | ||
#: In the '''Crossover Probability''' field, the probability of a crossover for each individual is specified; the usual value range is 0.95-0.99, with the default value of 0.95. | |||
#: MultiCharts uses the so-called Random Flip Mutation. With this Mutation type, each gene can be replaced at random with any other possible value of that gene. | |||
#: In the '''Mutation Probability''' field, the probability of a mutation for each individual is specified; the usual value range is 0.01-0.05, with the default value of 0.05. | |||
#: <div style="background-color: #E5F6FF;">Tip: An excessively large Mutation Probability value will cause the search to become a primitive random search.</div><br> | |||
#: '''GA Subtypes and Replacement Schemas''' | |||
#: The GA subtype defines how the GA creates new individuals and replaces old individuals when creating the next generation. | |||
#: The GA subtype can be set in the '''Genetic Algorithm Subtype''' section. | |||
#: Two GA subtypes are available: '''Basic''' and '''Incremental'''. | |||
#: '''Basic''' subtype is the standard so-called “simple genetic algorithm”. This algorithm uses non-overlapping generations and Elitism mode (optional). For each generation, the algorithm creates an entirely new population of individuals (if the '''Elitism''' option is selected, the most fit individuals move on to the next generation).<br> | |||
#: '''Elitism''' | |||
#:: Elitism mode, available for the Basic GA subtype only, allows the fittest individuals to survive and produce "children" over a span of multiple generations. | |||
#: '''Incremental''' subtype does not create an entirely new population for each generation. Instead, it adds only one or two children to the population each time the next generation is created. These children replace one or two individuals from the previous generation. The individuals to be replaced are chosen according to the Replacement Schema used.<br> | |||
#: '''Replacement Schemas''' | |||
#: Replacement Schemas are available for the Incremental subtype only. A schema defines how new children are integrated into the population. There are three schemas available: '''Worst''', '''Parent''', and '''Random'''. | |||
#: '''Worst''' – least fit individuals are replaced | |||
#: '''Parent''' – parent individuals are replaced | |||
#: '''Random''' – individuals are replaced randomly<br> | |||
# The fitness of each individual is evaluated and the least fit individuals discarded. | # The fitness of each individual is evaluated and the least fit individuals discarded. | ||
# | # The process is repeated, until the specified degree of convergence or generation number is reached (depends on GA setting selected).<br> | ||
#: '''GA Convergence Type''' | |||
<br> | #: A Genetic Algorithm optimization process has no inherent final result and can therefore run indefinitely. An "ending-point" must be specified to indicate when the optimization process should stop. | ||
''' | #: Two GA optimization "ending-point" criteria types can be selected: '''Terminate-Upon-Generation''' and '''Terminate-Upon-Convergence'''. | ||
#: '''Terminate-Upon-Generation''' will stop the optimization process once the specified '''Maximum Number of Generations''' is reached. | |||
#: '''Terminate-Upon-Convergence''' will stop the optimization process once the defined '''Convergence Rate''' is reached, or once the defined '''Maximum Number of Generations''' is reached.<br> | |||
#: The GA optimization "ending-point" criterion is selected in the '''Convergence Type''' section. | |||
#: The desired Maximum Number of Generations, Minimum Number of Generations, and Convergence Rate can be set in the corresponding text boxes.<br> | |||
#: '''Convergence Rate''' | |||
#: The Convergence Rate is the ratio C[x – N] / C[x], which compares the convergence value from N generations ago with the convergence value at the current generation. | |||
#: The GA calculation stops once the condition C[x – N] / C[x] >= P is met, where: | |||
#: x – ordinal number of the current generation; | |||
''' | #: C[x] – convergence value of the two most recent generations; | ||
#: N – the defined minimum number of generations; | |||
#: P – the defined convergence rate; values used are usually close to 1, with the default value of 0.99. | |||
#: <div style="background-color: #E3FBE5;">Note: Convergence Rate is not calculated for generations that have an ordinal number less than the defined minimal number of the generations.</div><br> | |||
''' | |||
<br> | |||
<br> | |||
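The crossover, mutation, and convergence steps described above can be illustrated with a minimal Python sketch. It is not MultiCharts code: the function names (<code>uniform_crossover</code>, <code>random_flip_mutation</code>, <code>should_stop</code>) are hypothetical, the mutation probability is assumed to apply to each gene separately, and the convergence value C[x] is assumed to be one positive number per generation (for example, the best fitness so far).

```python
import random


def uniform_crossover(parent_a, parent_b, rng):
    """Uniform crossover: each child gene is copied from either parent
    with equal probability."""
    return [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]


def random_flip_mutation(genome, allowed_values, p_mutation, rng):
    """Random flip mutation (assumption: probability applied per gene).
    A mutated gene is redrawn from the values allowed for that input,
    so it may occasionally be redrawn as the same value."""
    return [rng.choice(values) if rng.random() < p_mutation else gene
            for gene, values in zip(genome, allowed_values)]


def should_stop(conv_values, min_generations, convergence_rate, max_generations):
    """Terminate-Upon-Convergence check: stop when C[x - N] / C[x] >= P,
    or when the maximum number of generations is reached.
    conv_values[i] is the assumed convergence value of generation i."""
    x = len(conv_values) - 1          # ordinal number of the current generation
    if x + 1 >= max_generations:      # generation limit reached
        return True
    if x < min_generations:           # rate is not calculated this early
        return False
    current = conv_values[x]
    if current == 0:                  # avoid division by zero
        return False
    return conv_values[x - min_generations] / current >= convergence_rate


if __name__ == "__main__":
    rng = random.Random(1)
    # Two hypothetical strategy inputs, each with its own list of allowed values.
    allowed = [[5, 10, 15, 20], [1, 2, 3]]
    child = uniform_crossover([5, 3], [20, 1], rng)
    child = random_flip_mutation(child, allowed, 0.05, rng)
    print(child)
    print(should_stop([100.0, 150.0, 180.0, 181.0, 181.0], 2, 0.99, 50))
```

With N = 2 and P = 0.99, the check compares the value from two generations back with the current one, so the search stops only after the convergence value has been nearly flat across that window.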
'''Further Reading''' | '''Further Reading''' | ||