With identical criteria (same signal, data settings and optimization parameters in the same workspace), walk forward optimization can produce different IS and OOS durations, and consequently different optimized parameters, for the same symbol when the data length differs. For example, set two different end points for CL with the same start date: 1/1/2010 to 12/31/2018 and 1/1/2010 to 12/20/2019. The optimized results should be the same for the period from 1/1/2010 to 12/31/2018; at the very least, the OOS and IS durations should be identical in unanchored walk forward optimization. This incoherent result occurs in both MC64 12 R9 and MC64OpenBeta 3, and it is more severe in MC64OpenBeta 3.
The following are screenshots of the walk forward results in MC64 12 R9:
(The OOS skipped the period from 6/5/2010 to 12/12/2011 compared with the above; otherwise the OOS/IS durations and the parameter results are identical.)
The following are screenshots of the walk forward results in MC64 OpenBeta 3. (The OOS and IS durations are so incoherent that the two runs start on different dates, which leads to large differences in the results.)
MC team, please fix this issue. Walk forward optimization is run repeatedly for the same strategy as time passes, so it can currently produce incoherent results for the same period. The issue is more severe with the MC64 14 Beta version.
Incoherent walk forward optimization results
- Svetlana MultiCharts
Re: Incoherent walk forward optimization results
Hello, Applecabi
This behaviour is connected with the fact that IS and OOS intervals are counted from the end of the data series, therefore all intervals depend on the data series end point.
More information about the Walk-Forward Optimization splitting algorithm is below.
Walk-forward optimization (WFO) is a sequential run of optimization and backtesting with the optimized inputs, intended to give a better result.
One optimization process followed by backtesting is called a sample.
WFO samples are built from the end of the data series. If there are not enough bars to build the next sample, building stops. Thus, the offset between the first bar in the data series and the first bar of the first sample may be more than the MaxBarsBack value.
There are two segment types in the WFO: by number of runs and by time span.
Let us consider them more thoroughly.
1. Time Span.
a) If OOS and IS are set in bars, then OOS and IS bars are counted in sequence from the end of the data series. When there are not enough bars to build the next sample, building stops.
For example:
IS = 21 bars
OOS = 10 bars
MaxBarsBack = 10 bars
Data series length (Len) = 72 bars
We have 72 – 10 = 62 bars for splitting, as 10 bars are required for initial calculation (MaxBarsBack value).
OOS 10 bars + IS 21 bars are counted from the end of the data series to build a sample.
Then the next sample is counted (OOS 10 bars + IS 21 bars) from the end of the previous IS.
In this example we get 4 full samples. There are not enough bars for the 5th sample: 31 bars are required, but only 22 bars are left. Consequently, the first IS calculation starts from the 12th bar of the data series.
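The bar-based splitting described above can be sketched in Python. This is only an illustration of the description in this post, not MultiCharts code: the function name `split_by_bars`, the `Sample` tuple and the 1-based bar numbering are assumptions, and so is the choice to allow a sample that fits exactly.

```python
from typing import List, NamedTuple


class Sample(NamedTuple):
    # All bar numbers are 1-based and inclusive (an assumption for readability).
    is_start: int
    is_end: int
    oos_start: int
    oos_end: int


def split_by_bars(length: int, is_bars: int, oos_bars: int,
                  max_bars_back: int) -> List[Sample]:
    """Build WFO samples from the end of the data series (hypothetical helper)."""
    if is_bars <= 0 or oos_bars <= 0:
        raise ValueError("IS and OOS lengths must be positive")
    samples = []
    end = length
    # Stop as soon as the bars left after MaxBarsBack cannot hold IS + OOS.
    while end - max_bars_back >= is_bars + oos_bars:
        oos_start = end - oos_bars + 1
        is_end = oos_start - 1
        is_start = is_end - is_bars + 1
        samples.append(Sample(is_start, is_end, oos_start, end))
        # The next sample is counted from the end of the previous IS.
        end = is_end
    samples.reverse()  # earliest sample first
    return samples


samples = split_by_bars(length=72, is_bars=21, oos_bars=10, max_bars_back=10)
print(len(samples), samples[0].is_start)  # 4 12

# Same settings, data series extended by 3 bars: every boundary moves by 3.
print(split_by_bars(75, 21, 10, 10)[0].is_start)  # 15
```

With 72 bars the sketch reproduces the 4 samples and the first IS starting at bar 12. Extending the series to 75 bars moves the first IS to bar 15, which is the dependence on the end point described at the top of this reply.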
b) If IS is set in bars, and OOS in % of first Run, then OOS length in bars shall be calculated first according to the formula:
OOS = IS * OOS% / (100% - OOS%).
Then, when the IS and OOS lengths in bars are known, the data series is split into samples from the end, in the same way as in case a), where OOS and IS are set in bars.
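As a quick check of the formula in case b), here is a small Python sketch. The function name is hypothetical, and the result is left as a fraction because this post does not say how the value is rounded to whole bars.

```python
def oos_bars_from_percent(is_bars: float, oos_percent: float) -> float:
    """OOS = IS * OOS% / (100% - OOS%), with OOS% given as a number in (0, 100)."""
    if not 0 < oos_percent < 100:
        raise ValueError("OOS% must be between 0 and 100 (exclusive)")
    # Rounding to whole bars is not specified in the post, so none is applied.
    return is_bars * oos_percent / (100.0 - oos_percent)


oos = oos_bars_from_percent(is_bars=21, oos_percent=20)
print(oos)               # 5.25
print(oos / (21 + oos))  # 0.2, i.e. OOS is 20% of the whole run
```

The second print shows why the denominator is (100% - OOS%): the percentage is taken against the whole run (IS + OOS), not against IS alone.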
c) If IS and OOS are set in days, they are counted from the end of the data series by going through all bars.
The new day is determined by date, so a new day comes after midnight. Consequently, the samples are built according to midnight boundaries rather than sessions, and only the dates actually present on the bars are counted.
For example, suppose it is required to count 2 days back from 06.01.2020. If the data series is missing some days, the resulting interval may include the dates 06.01.2020 and 03.01.2020. Its length is 2 days, not 4 days as it may appear in the WFO report. Further splitting is the same as in case a), where OOS and IS are set in bars. The last OOS (at the end of the data series) may contain an incomplete last day.
d) If IS is set in days, and OOS in % of first Run, then splitting by days is done in the same way as splitting by bars.
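The day counting in case c) can be illustrated with a short Python sketch. It assumes the bars are given as timestamps sorted in ascending order and that a "day" is any distinct calendar date that appears on at least one bar; the function name and the sample timestamps are made up for the illustration.

```python
from datetime import date, datetime
from typing import List, Sequence


def last_n_days(bar_times: Sequence[datetime], n_days: int) -> List[date]:
    """Walk the bars from the end and collect the last n distinct dates."""
    days: List[date] = []
    for ts in reversed(bar_times):
        d = ts.date()  # the day changes at midnight, not at session boundaries
        if not days or d != days[-1]:
            if len(days) == n_days:
                break
            days.append(d)
    return days


# Bars exist on 02.01.2020, 03.01.2020 and 06.01.2020; the weekend has no bars.
bars = [
    datetime(2020, 1, 2, 10), datetime(2020, 1, 2, 15),
    datetime(2020, 1, 3, 10), datetime(2020, 1, 3, 15),
    datetime(2020, 1, 6, 10), datetime(2020, 1, 6, 15),
]
days = last_n_days(bars, 2)
print(days)                            # [datetime.date(2020, 1, 6), datetime.date(2020, 1, 3)]
print((days[0] - days[-1]).days + 1)   # 4 calendar days spanned by the 2 counted days
```

The interval holds 2 counted days but spans 4 calendar days, which is the difference between the configured length and what the dates in the report seem to show.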
2. Number of Runs
It starts with the calculation of the IS and OOS lengths in bars:
Rlen: run (sample) length
OOSPcnt: OOS percent against Rlen (0...1)
RunCnt: number of runs
Len: data series length
Rlen = (Len - MaxBarsBack) / ((1 - OOSPcnt) + RunCnt * OOSPcnt)
OOS = Rlen * OOSPcnt
IS = Rlen - OOS
Further splitting is the same as in case a), where OOS and IS are set in bars.
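The "Number of Runs" formulas can be checked numerically with the sketch below. The function name is hypothetical and the lengths are kept as fractions, since rounding to whole bars is not described here. The formula follows from requiring that one IS plus RunCnt consecutive OOS segments exactly fill the bars available after MaxBarsBack.

```python
import math
from typing import Tuple


def lengths_by_runs(length: int, max_bars_back: int,
                    oos_pcnt: float, run_cnt: int) -> Tuple[float, float, float]:
    """Return (run length, IS length, OOS length) in bars; oos_pcnt is in (0, 1)."""
    if not 0 < oos_pcnt < 1 or run_cnt < 1:
        raise ValueError("need 0 < oos_pcnt < 1 and run_cnt >= 1")
    available = length - max_bars_back
    rlen = available / ((1.0 - oos_pcnt) + run_cnt * oos_pcnt)
    oos = rlen * oos_pcnt
    is_len = rlen - oos
    return rlen, is_len, oos


# 71 bars, MaxBarsBack = 10, OOS share 10/31, 4 runs (values chosen for illustration).
rlen, is_len, oos = lengths_by_runs(71, 10, 10 / 31, 4)
print(round(rlen, 6), round(is_len, 6), round(oos, 6))  # 31.0 21.0 10.0
# One IS plus four OOS segments cover all 61 available bars.
print(math.isclose(is_len + 4 * oos, 71 - 10))          # True
```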
Re: Incoherent walk forward optimization results
I performed a Walk Forward Optimization in which I specified that the In-Sample [IS] periods should be 182 (trading) days and the Out-of-Sample [OOS] periods should be 57 (trading) days. I calculated the Julian day length (which I believe is the same as the calendar day length) of the last several IS and OOS periods so that I could see whether they were comparable and whether the last (49th) OOS period was complete. I did the calculations in TS using the DateToJulian and JulianToDate functions in a simple program I wrote. The 46th run's OOS period is 82 (calendar) days, the 47th is 79 (calendar) days, and the 48th is 81 (calendar) days. However, the last (49th) OOS period, which the report shows as ending on the date the WFO was created (3/5/20), does not appear to me to be complete yet. If it ends on 3/5/20, as shown in the report, then its calendar length is only 70 days, which is noticeably less than the 79 to 82 days of the prior three runs' OOS periods.
Also, I have calculated that March 5th 2020 is only the 50th Trading day since 12/26/2019, the start of the last OOS period. The OOS periods are supposed to be 57 Trading days in duration according to my specifications. Thus, the last OOS period should not be shown as complete on 3/5/2020. It should be complete on 3/16/2020, a date in the future.
It appears that either there is a "bug" in the WFO regarding the last OOS time period or the user has to calculate when the last OOS time period should end so that he or she can then re-optimize. When it is time to re-optimize, it appears that the user will need to calculate what the start date of the chart should be so that he can optimize over the correct data. I don't see that the WFO provides this information.
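For anyone who wants to recheck the date arithmetic in this post without TS, here is a rough Python equivalent. It treats every Monday to Friday as a trading day and ignores exchange holidays, which is an assumption, and it takes the count of 50 elapsed trading days from the post rather than deriving it. The helper name `add_weekdays` is made up.

```python
from datetime import date, timedelta


def add_weekdays(start: date, n: int) -> date:
    """Move forward n weekdays (Mon-Fri) from start; holidays are ignored."""
    d = start
    while n > 0:
        d += timedelta(days=1)
        if d.weekday() < 5:  # 0..4 are Monday..Friday
            n -= 1
    return d


oos_start = date(2019, 12, 26)   # start of the last OOS period
report_end = date(2020, 3, 5)    # end date shown in the WFO report

print((report_end - oos_start).days)      # 70 calendar days
# 57 configured trading days minus the 50 counted so far leaves 7 to go.
print(add_weekdays(report_end, 57 - 50))  # 2020-03-16
```

Both numbers agree with the figures above: the reported last OOS period spans 70 calendar days, and a full 57 trading days would end on 3/16/2020.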