Walk-Forward Optimization: การทดสอบกลยุทธ์ที่ซื่อสัตย์เพียงวิธีเดียว

คุณ optimize กลยุทธ์เสร็จแล้ว 12 separation parameters, 9 meta-parameters — รวม 21 ตัว Backtest 25 เดือนบน pair เดียวแสดง PnL +3342% ที่ MaxLev เส้น equity curve พุ่งขึ้นแทบไม่มี drawdown เลย Sharpe สูงกว่า 3 ทุกอย่างดูสมบูรณ์แบบ

คุณปล่อย bot ทำงาน สองสัปดาห์ผ่านไป กลยุทธ์ขาดทุน 18% ของทุน หนึ่งเดือนต่อมา — 34% พารามิเตอร์ที่ "ทำงานได้" กับข้อมูลในอดีต กลับกลายเป็นการ fit กับลำดับเหตุการณ์ตลาดเฉพาะเจาะจง คุณไม่ได้ค้นพบรูปแบบ — คุณแค่จำ noise

นี่คือ overfitting แบบคลาสสิก และวิธีเดียวที่เป็นระบบในการตรวจสอบ ก่อน นำไปใช้จริงคือ Walk-Forward Optimization (WFO)

กับดักของการแบ่ง Train/Test ครั้งเดียว

Train/test split trap visualization

วิธีมาตรฐาน: แบ่งข้อมูลเป็น 70% train และ 30% test Optimize บน train แล้วตรวจสอบบน test ถ้าผลเป็นบวก — ปล่อยใช้งาน

ปัญหา: นี่คือการทดสอบ หนึ่งครั้ง บนการแบ่ง หนึ่งแบบ ผลลัพธ์ขึ้นอยู่กับว่าคุณขีดเส้นแบ่งไว้ที่ไหน เลื่อนเส้นแบ่งไปหนึ่งเดือน — และ PnL out-of-sample อาจเปลี่ยนจาก +40% เป็น -15%

Data:    |===== Train (70%) =====|== Test (30%) ==|
Split 1: |===2024-01..2025-09====|==2025-10..26-01==|  → OOS PnL: +38%
Split 2: |===2024-01..2025-06====|==2025-07..26-01==|  → OOS PnL: -12%
Split 3: |===2024-04..2025-12====|==2026-01..26-04==|  → OOS PnL: +7%

การแบ่งสามแบบ — สามข้อสรุปที่ต่างกัน จะเชื่ออันไหน? ไม่มีอันไหนเลย การแบ่ง train/test ครั้งเดียวคือการประมาณค่าจุดเดียวที่มีปัญหาเหมือนที่เราอธิบายไว้ใน Monte Carlo Bootstrap คุณต้องการไม่ใช่แค่การตรวจสอบหนึ่งครั้ง แต่คือ ชุดการตรวจสอบอย่างเป็นระบบ บน segment ข้อมูลที่ต่อเนื่องกัน

นี่คือสิ่งที่ Walk-Forward Optimization มีไว้เพื่อ

Walk-Forward Optimization คืออะไร

Walk-forward rolling windows diagram

WFO คือกระบวนการ optimize และตรวจสอบกลยุทธ์แบบต่อเนื่องบน window ข้อมูลที่เลื่อน (หรือขยาย) แนวคิด: จำลองกระบวนการซื้อขายจริงที่คุณ re-optimize พารามิเตอร์บนข้อมูลที่มีอยู่เป็นระยะ แล้วซื้อขายจนถึงการ re-optimization ครั้งถัดไป

แต่ละ "window" ประกอบด้วยสองส่วน:

In-Sample (IS) — ช่วงเวลาที่ใช้ optimize พารามิเตอร์
Out-of-Sample (OOS) — ช่วงเวลาที่ใช้ทดสอบพารามิเตอร์ที่ค้นพบ โดยไม่มีการ fitting

คุณสมบัติหลัก: ช่วง OOS ไม่ทับซ้อนกัน และโดยรวมครอบคลุมส่วนสำคัญของข้อมูล เส้น equity curve ที่ได้สร้างจาก เฉพาะ segment OOS เท่านั้น — นี่คือการประเมินกลยุทธ์ที่ซื่อสัตย์

Anchored WFO (Expanding Window)

Anchored WFO expanding window visualization

ใน anchored WFO จุดเริ่มต้นของ train period ถูกกำหนดไว้ และท้ายของมันขยายออกในแต่ละ window:

Window 1: Train [2024-01]        → Test [2024-04]
Window 2: Train [2024-01..04]    → Test [2024-07]    (growing train)
Window 3: Train [2024-01..07]    → Test [2024-10]
Window 4: Train [2024-01..10]    → Test [2025-01]
Window 5: Train [2024-01..2025-01] → Test [2025-04]

ข้อดี:

แต่ละ train period ถัดไปมีข้อมูลมากขึ้น — การ optimization มีเสถียรภาพมากขึ้น
รูปแบบในช่วงแรกไม่สูญหาย — อยู่ใน training set เสมอ
ง่ายต่อการ implement

ข้อเสีย:

ข้อมูลเก่าอาจ "เจือจาง" รูปแบบปัจจุบัน
หากตลาดเปลี่ยนแปลงเชิงโครงสร้าง — ข้อมูลเก่าเป็นภัย
Train period โตขึ้นไม่สิ้นสุด เพิ่มเวลา optimization

Rolling WFO (Sliding Window)

ใน rolling WFO train period ที่มีความยาวคงที่ "เลื่อน" ผ่านข้อมูล:

Window 1: Train [2024-01..06] → Test [2024-07..09]
Window 2: Train [2024-04..09] → Test [2024-10..12]
Window 3: Train [2024-07..12] → Test [2025-01..03]
Window 4: Train [2024-10..2025-03] → Test [2025-04..06]
Window 5: Train [2025-01..06] → Test [2025-07..09]

ข้อดี:

ปรับตัวกับ market regime ปัจจุบัน
เวลา optimization คงที่
ข้อมูลเก่าที่ไม่เกี่ยวข้องไม่กระทบผลลัพธ์

ข้อเสีย:

ข้อมูลน้อยลงสำหรับการ training — variance ของพารามิเตอร์ที่เหมาะสมสูงขึ้น
ไวต่อการเลือกความยาว window
อาจ "ลืม" เหตุการณ์หายากแต่สำคัญ (flash crashes)

Combinatorial Purged Cross-Validation (CPCV)

Combinatorial purged cross-validation visualization

วิธีขั้นสูงที่เสนอโดย Marcos Lopez de Prado ข้อมูลถูกแบ่งออกเป็น $N$ กลุ่ม โดยเลือก $k$ กลุ่มสำหรับการทดสอบ ความแตกต่างหลักจาก cross-validation มาตรฐานคือ purging (การลบข้อมูลที่ขอบ train/test) และ embargo (ช่องว่างเพิ่มเติมเพื่อป้องกัน data leakage):

$\text{Number of combinations} = \binom{N}{k}$

ด้วย $N = 10, k = 2$ : 45 combinations train/test แต่ละ combination ให้ผล OOS และค่าประมาณสุดท้ายคือค่าเฉลี่ยของทุก combination

from itertools import combinations
import numpy as np

def cpcv_splits(n_groups: int, k_test: int, purge_pct: float = 0.01):
    """
    Generate CPCV splits with purging.

    Args:
        n_groups: number of groups
        k_test: number of test groups in each split
        purge_pct: fraction of data for purging (at the train/test boundary)
    """
    groups = list(range(n_groups))
    splits = []

    for test_groups in combinations(groups, k_test):
        train_groups = [g for g in groups if g not in test_groups]
        splits.append({
            "train": train_groups,
            "test": list(test_groups),
            "purge_groups": _get_purge_groups(train_groups, test_groups),
        })

    return splits

def _get_purge_groups(train, test):
    """Groups at the train/test boundary for purging."""
    purge = set()
    for t in test:
        if t - 1 in train:
            purge.add(t - 1)
        if t + 1 in train:
            purge.add(t + 1)
    return list(purge)

CPCV ดีกว่า rolling WFO เมื่อข้อมูลมีน้อย แต่ใช้ทรัพยากรคำนวณมากกว่า สำหรับกลยุทธ์ที่มี 21 พารามิเตอร์และข้อมูล 25 เดือน เราแนะนำให้เริ่มด้วย rolling WFO และใช้ CPCV เป็นการตรวจสอบเพิ่มเติม

พารามิเตอร์หลักของ WFO

Key WFO parameters visualization

ความยาว Train Period

Train สั้นเกินไป — ข้อมูลไม่เพียงพอสำหรับการ optimization ที่เชื่อถือได้ ยาวเกินไป — ข้อมูลเก่าเจือจางรูปแบบปัจจุบัน

กฎง่ายๆ: train ควรมีอย่างน้อย 200-300 trades หากกลยุทธ์ทำ 2 trades ต่อวัน:

$T_{min} = \frac{300\ \text{trades}}{2\ \text{trades/day}} = 150\ \text{days} \approx 5\ \text{months}$

สำหรับ crypto ที่มีการสลับ regime เราแนะนำไม่เกิน 6-12 เดือนสำหรับ rolling window

ความยาว Test Period

Test period ต้องยาวพอสำหรับการประเมินที่มีนัยสำคัญทางสถิติ แต่ไม่ยาวเกินไป — ไม่งั้นพารามิเตอร์จะเสื่อมสภาพระหว่างนั้น

กฎ: test = 20-33% ของ train ถ้า train = 6 เดือน, test = 1.5-2 เดือน

การซ้อนทับ (Overlap)

ใน rolling WFO window สามารถซ้อนทับกันได้ การซ้อนทับเพิ่มจำนวนจุดข้อมูล OOS แต่ทำให้เกิด correlation ระหว่างค่าประมาณ:

Without overlap:
  Train [01..06] → Test [07..09]
  Train [07..12] → Test [01..03]

With 50% overlap:
  Train [01..06] → Test [07..09]
  Train [04..09] → Test [10..12]
  Train [07..12] → Test [01..03]

คำแนะนำ: 50% overlap บน train period — ความสมดุลที่ดีระหว่างจำนวน window และความเป็นอิสระของค่าประมาณ

ความถี่ในการ Re-optimization

กำหนดว่าคุณคำนวณพารามิเตอร์ใหม่บ่อยแค่ไหน ในตลาด crypto ความถี่ที่เหมาะสมคือทุก 1-3 เดือน Re-optimization บ่อยเกินไปเพิ่มความเสี่ยง overfitting กับ noise; น้อยเกินไป — เสี่ยงที่พารามิเตอร์จะล้าสมัย

Walk-Forward Efficiency Ratio และ Degradation Rate

Walk-forward efficiency ratio and degradation visualization

Walk-Forward Efficiency Ratio (WFER)

ตัวชี้วัดหลักของ WFO — อัตราส่วนของผลตอบแทน OOS ต่อผลตอบแทน IS:

$\text{WFER} = \frac{\text{PnL}_{OOS}}{\text{PnL}_{IS}}$

การตีความ:

WFER	การตีความ
> 0.8	ความแข็งแกร่งดีเยี่ยม พารามิเตอร์ถ่ายโอนไปข้อมูลใหม่ได้
0.5 — 0.8	ความแข็งแกร่งยอมรับได้ กลยุทธ์ทำงานแต่มีการเสื่อม
0.3 — 0.5	กรณีขอบ มี overfitting บางส่วนน่าจะเกิดขึ้น
< 0.3	Overfitting พารามิเตอร์ถูก fit กับข้อมูล IS
< 0	กลยุทธ์ขาดทุน OOS Overfitting เต็มรูปแบบหรือ logic ผิดพลาด

หาก WFER < 0.5 — กลยุทธ์น่าจะ overfit นี่คือตัวกรองหลักของเรา

Degradation Rate

แสดงให้เห็นว่าพารามิเตอร์ที่เหมาะสมที่สุดสูญเสียประสิทธิภาพเร็วแค่ไหนตามเวลา:

$\text{Degradation rate} = \frac{d(\text{OOS PnL})}{dt}$

ในทางปฏิบัติ: แบ่ง test period ออกเป็น sub-interval และติดตามพลวัต PnL:

def degradation_rate(oos_returns: np.ndarray, n_subperiods: int = 4) -> float:
    """
    Estimate parameter degradation rate.

    Splits the OOS period into sub-intervals and computes the slope
    of linear regression of PnL against sub-interval number.

    Returns:
        slope: negative = degradation, positive = improvement
    """
    chunk_size = len(oos_returns) // n_subperiods
    subperiod_pnls = []

    for i in range(n_subperiods):
        start = i * chunk_size
        end = start + chunk_size
        sub_pnl = np.sum(oos_returns[start:end])
        subperiod_pnls.append(sub_pnl)

    x = np.arange(n_subperiods)
    slope = np.polyfit(x, subperiod_pnls, 1)[0]

    return slope

หาก degradation rate ติดลบมากๆ — พารามิเตอร์เสื่อมสภาพเร็ว และคุณต้องการ re-optimization บ่อยขึ้นหรือ train period ที่สั้นลง

การ implement WFO Pipeline แบบสมบูรณ์ใน Python

WFO pipeline architecture visualization

import numpy as np
import pandas as pd
from dataclasses import dataclass, field
from typing import Callable, List, Optional
import warnings

@dataclass
class WFOWindow:
    """A single walk-forward window."""
    window_id: int
    train_start: int         # train start index
    train_end: int           # train end index (exclusive)
    test_start: int          # test start index
    test_end: int            # test end index (exclusive)
    best_params: dict = field(default_factory=dict)
    is_pnl: float = 0.0     # in-sample PnL
    oos_pnl: float = 0.0    # out-of-sample PnL
    oos_returns: np.ndarray = field(default_factory=lambda: np.array([]))
    wfer: float = 0.0       # walk-forward efficiency ratio

@dataclass
class WFOResult:
    """Result of the entire WFO."""
    windows: List[WFOWindow]
    aggregate_oos_pnl: float
    aggregate_is_pnl: float
    wfer: float
    degradation_rate: float
    oos_equity: np.ndarray
    oos_sharpe: float
    oos_max_dd: float
    n_windows: int
    passed: bool             # whether the strategy passed the filter

class WalkForwardOptimizer:
    """
    Walk-Forward Optimization pipeline.

    Supports anchored (expanding) and rolling (sliding) modes.
    """

    def __init__(
        self,
        data: np.ndarray,
        optimize_fn: Callable,
        evaluate_fn: Callable,
        mode: str = "rolling",         # "rolling" or "anchored"
        train_size: int = 180,         # days
        test_size: int = 60,           # days
        step_size: int = 60,           # window step size, days
        min_trades: int = 30,          # min number of trades in OOS
        wfer_threshold: float = 0.5,   # WFER threshold for accept/reject
    ):
        self.data = data
        self.optimize_fn = optimize_fn
        self.evaluate_fn = evaluate_fn
        self.mode = mode
        self.train_size = train_size
        self.test_size = test_size
        self.step_size = step_size
        self.min_trades = min_trades
        self.wfer_threshold = wfer_threshold

    def generate_windows(self) -> List[WFOWindow]:
        """Generate walk-forward windows."""
        n = len(self.data)
        windows = []
        window_id = 0

        if self.mode == "rolling":
            start = 0
            while start + self.train_size + self.test_size <= n:
                w = WFOWindow(
                    window_id=window_id,
                    train_start=start,
                    train_end=start + self.train_size,
                    test_start=start + self.train_size,
                    test_end=min(start + self.train_size + self.test_size, n),
                )
                windows.append(w)
                start += self.step_size
                window_id += 1

        elif self.mode == "anchored":
            train_end = self.train_size
            while train_end + self.test_size <= n:
                w = WFOWindow(
                    window_id=window_id,
                    train_start=0,
                    train_end=train_end,
                    test_start=train_end,
                    test_end=min(train_end + self.test_size, n),
                )
                windows.append(w)
                train_end += self.step_size
                window_id += 1

        return windows

    def run(self) -> WFOResult:
        """Run the full WFO pipeline."""
        windows = self.generate_windows()
        all_oos_returns = []

        for w in windows:
            train_data = self.data[w.train_start:w.train_end]
            test_data = self.data[w.test_start:w.test_end]

            best_params, is_pnl = self.optimize_fn(train_data)
            w.best_params = best_params
            w.is_pnl = is_pnl

            oos_pnl, oos_returns = self.evaluate_fn(test_data, best_params)
            w.oos_pnl = oos_pnl
            w.oos_returns = oos_returns

            if is_pnl != 0:
                w.wfer = oos_pnl / is_pnl
            else:
                w.wfer = 0.0

            all_oos_returns.extend(oos_returns)

        all_oos = np.array(all_oos_returns)
        oos_equity = np.cumprod(1 + all_oos)
        peak = np.maximum.accumulate(oos_equity)
        max_dd = ((oos_equity - peak) / peak).min()

        aggregate_is = sum(w.is_pnl for w in windows)
        aggregate_oos = sum(w.oos_pnl for w in windows)
        wfer = aggregate_oos / aggregate_is if aggregate_is != 0 else 0

        if np.std(all_oos) > 0:
            oos_sharpe = np.mean(all_oos) / np.std(all_oos) * np.sqrt(252)
        else:
            oos_sharpe = 0

        deg_rate = self._degradation_rate(windows)

        passed = wfer >= self.wfer_threshold and aggregate_oos > 0

        return WFOResult(
            windows=windows,
            aggregate_oos_pnl=aggregate_oos,
            aggregate_is_pnl=aggregate_is,
            wfer=wfer,
            degradation_rate=deg_rate,
            oos_equity=oos_equity,
            oos_sharpe=oos_sharpe,
            oos_max_dd=max_dd,
            n_windows=len(windows),
            passed=passed,
        )

    def _degradation_rate(self, windows: List[WFOWindow]) -> float:
        """Slope of OOS PnL across window numbers."""
        if len(windows) < 3:
            return 0.0
        pnls = [w.oos_pnl for w in windows]
        x = np.arange(len(pnls))
        slope = np.polyfit(x, pnls, 1)[0]
        return slope

ตัวอย่างการใช้งาน

import numpy as np

np.random.seed(42)
prices = 100 * np.cumprod(1 + np.random.normal(0.0002, 0.02, 750))

def my_optimize(train_data):
    """
    Optimize strategy on train data.
    Returns (best_params, is_pnl).
    """
    best_pnl = -np.inf
    best_params = {}

    for fast in range(5, 30, 5):
        for slow in range(20, 100, 10):
            if fast >= slow:
                continue
            pnl, _ = _run_strategy(train_data, fast, slow)
            if pnl > best_pnl:
                best_pnl = pnl
                best_params = {"fast": fast, "slow": slow}

    return best_params, best_pnl

def my_evaluate(test_data, params):
    """
    Evaluate strategy on test data with fixed parameters.
    Returns (oos_pnl, oos_returns).
    """
    pnl, returns = _run_strategy(test_data, params["fast"], params["slow"])
    return pnl, returns

def _run_strategy(data, fast_period, slow_period):
    """Simple MA crossover strategy."""
    fast_ma = pd.Series(data).rolling(fast_period).mean().values
    slow_ma = pd.Series(data).rolling(slow_period).mean().values

    position = 0
    returns = []

    for i in range(slow_period, len(data) - 1):
        if fast_ma[i] > slow_ma[i] and position <= 0:
            position = 1
        elif fast_ma[i] < slow_ma[i] and position >= 0:
            position = -1

        daily_ret = (data[i + 1] - data[i]) / data[i]
        returns.append(position * daily_ret)

    total_pnl = np.sum(returns)
    return total_pnl, np.array(returns)

wfo = WalkForwardOptimizer(
    data=prices,
    optimize_fn=my_optimize,
    evaluate_fn=my_evaluate,
    mode="rolling",
    train_size=180,    # 6 months
    test_size=60,      # 2 months
    step_size=60,      # step = test
)

result = wfo.run()

print(f"Windows: {result.n_windows}")
print(f"OOS PnL: {result.aggregate_oos_pnl:.4f}")
print(f"IS PnL:  {result.aggregate_is_pnl:.4f}")
print(f"WFER:    {result.wfer:.3f}")
print(f"OOS Sharpe: {result.oos_sharpe:.2f}")
print(f"OOS MaxDD:  {result.oos_max_dd:.2%}")
print(f"Degradation: {result.degradation_rate:.5f}")
print(f"Passed:  {result.passed}")

for w in result.windows:
    print(f"  Window {w.window_id}: IS={w.is_pnl:.4f} OOS={w.oos_pnl:.4f} "
          f"WFER={w.wfer:.2f} params={w.best_params}")

การตีความผลลัพธ์: เมื่อไหรควรเชื่อ เมื่อไหรควรปฏิเสธ

กลยุทธ์ผ่าน WFO

หาก WFER >= 0.5 ในทุก window, OOS PnL เป็นบวกและมีเสถียรภาพ:

Window 0: IS=0.0812  OOS=0.0531  WFER=0.65  params={'fast': 10, 'slow': 50}
Window 1: IS=0.0744  OOS=0.0489  WFER=0.66  params={'fast': 10, 'slow': 50}
Window 2: IS=0.0698  OOS=0.0401  WFER=0.57  params={'fast': 15, 'slow': 50}
Window 3: IS=0.0823  OOS=0.0512  WFER=0.62  params={'fast': 10, 'slow': 60}
Window 4: IS=0.0756  OOS=0.0478  WFER=0.63  params={'fast': 10, 'slow': 50}
→ Aggregate WFER: 0.63, all windows > 0.5, parameters are stable

สัญญาณที่ดี:

WFER มีเสถียรภาพ ใน window ต่างๆ (ไม่มีการกระโดดอย่างรวดเร็ว)
พารามิเตอร์คล้ายกัน ระหว่าง window (fast = 10-15, slow = 50-60)
OOS PnL เป็นบวก ในส่วนใหญ่ของ window
Degradation rate ใกล้ศูนย์

กลยุทธ์ล้มเหลว WFO

Window 0: IS=0.2341  OOS=-0.0312  WFER=-0.13  params={'fast': 5, 'slow': 95}
Window 1: IS=0.1987  OOS=0.0089   WFER=0.04   params={'fast': 25, 'slow': 30}
Window 2: IS=0.2156  OOS=-0.0567  WFER=-0.26  params={'fast': 10, 'slow': 90}
Window 3: IS=0.1834  OOS=0.0234   WFER=0.13   params={'fast': 20, 'slow': 40}
→ Aggregate WFER: -0.07, IS is high, OOS is near zero → overfitting

สัญญาณของ overfitting:

IS PnL สูง, OOS PnL ต่ำ/ติดลบ — overfitting แบบคลาสสิก
พารามิเตอร์เปลี่ยนแปลงมาก ระหว่าง window — ไม่มี optimum ที่เสถียร
WFER < 0.3 ใน window ส่วนใหญ่ — พารามิเตอร์ไม่ถ่ายโอน
Degradation rate ติดลบมาก — เสื่อมสภาพอย่างรวดเร็ว

ข้อมูลเพิ่มเติมเกี่ยวกับการวิเคราะห์ความเสถียรของพารามิเตอร์ — ในบทความ Plateau analysis หาก optimum "แหลม" (ลดลงอย่างรวดเร็วเมื่อพารามิเตอร์เปลี่ยนเล็กน้อย) — นี่คือสัญญาณ overfitting เพิ่มเติม

ข้อเฉพาะของ WFO สำหรับ Cryptocurrency

Cryptocurrency WFO specifics visualization

Cryptocurrency สร้างปัญหาเฉพาะตัวสำหรับ WFO ที่ไม่มีในตลาดดั้งเดิม

การสลับ Regime

ตลาด crypto สลับระหว่าง regime ที่ต่างกันอย่างสิ้นเชิง: bull trend, bear trend, sideways ที่มี volatility สูง/ต่ำ พารามิเตอร์ที่เหมาะสมใน regime หนึ่งอาจขาดทุนใน regime อื่น

วิธีแก้: ใช้ rolling WFO (ไม่ใช่ anchored) ที่มี window 4-6 เดือน ซึ่งช่วย "ลืม" regime เก่า นอกจากนี้ — จัดกลุ่มข้อมูลตาม volatility และรัน WFO แยกสำหรับแต่ละกลุ่ม

ประวัติสั้น

altcoin ส่วนใหญ่มีประวัติการซื้อขายน้อยกว่า 3 ปี ด้วย train = 6 เดือน และ test = 2 เดือน คุณจะได้เพียง 4-5 window — ค่าประมาณทางสถิติที่อ่อนแอ

วิธีแก้: ใช้ CPCV แทน rolling WFO หรือควบคู่กัน CPCV สร้าง combinations มากกว่าจากข้อมูลเดียวกัน สำหรับ 10 กลุ่มและ k=2: 45 combinations แทน 4-5 window

การเปลี่ยนแปลง Liquidity เชิงโครงสร้าง

Liquidity ของ crypto pair ไม่คงที่: pair อาจ liquid 6 เดือน แล้ว volume ลดลง 10 เท่า พารามิเตอร์ที่ optimize บนตลาด liquid ไม่ทำงานบนตลาด illiquid

วิธีแก้: เพิ่ม liquidity filter ใน WFO pipeline ยกเว้น window ที่ average daily volume ต่ำกว่าเกณฑ์ ตรวจสอบว่า liquidity ใน test period เทียบเคียงได้กับ train period

ผลกระทบของ Funding Rate

สำหรับกลยุทธ์ leveraged futures funding rates อาจเปลี่ยนผล OOS อย่างสิ้นเชิง กลยุทธ์แสดง +5% OOS ใน 2 เดือน แต่ที่ leverage 10x funding กิน 3.6%

การวิเคราะห์โดยละเอียดเกี่ยวกับผลกระทบของ funding — ในบทความของเรา Funding rates kill your leverage อย่าลืมคำนึงถึงค่าใช้จ่าย funding เมื่อประเมิน OOS PnL ใน WFO

กลยุทธ์หลายพารามิเตอร์: ทำไม WFO ถึงสำคัญกับ 12+ พารามิเตอร์

Curse of dimensionality in multi-parameter optimization

กลยุทธ์ที่มี 21 พารามิเตอร์ (12 separation + 9 meta) บนข้อมูล 25 เดือนจาก pair เดียวคือโมเดลที่มี search space มหาศาล

The Curse of Dimensionality

จำนวน combinations ของพารามิเตอร์โตแบบ exponential ตามจำนวนพารามิเตอร์:

$\text{Combinations} = \prod_{i=1}^{n} |P_i|$

หากแต่ละใน 21 พารามิเตอร์มีค่าอย่างน้อย 10 ค่า:

$10^{21} = 10\ \text{sextillion combinations}$

แม้ด้วย Bayesian optimization (รายละเอียดใน Coordinate Descent vs Bayesian) คุณสำรวจเพียงส่วนเล็กน้อยที่ไม่มีนัยสำคัญของ space โอกาสที่ optimum ที่ค้นพบเป็น noise artifact แทนที่จะเป็นรูปแบบจริง เพิ่มขึ้นตามจำนวนพารามิเตอร์

สูตร Bonferroni สำหรับการเปรียบเทียบหลายครั้ง

หากคุณทดสอบ $M$ combinations ของพารามิเตอร์ ความน่าจะเป็นของ "การค้นพบ" เท็จ (การพบผลดีโดยบังเอิญ):

$P(\text{false discovery}) = 1 - (1 - \alpha)^M \approx 1 - e^{-\alpha M}$

ที่ $\alpha = 0.05$ และ $M = 10000$ combinations ที่ลอง:

$P \approx 1 - e^{-500} \approx 1.0$

คุณรับประกันได้ว่าจะพบพารามิเตอร์ "ที่ทำงาน" — ซึ่งจริงๆ แล้ว fit กับ noise หากไม่มี WFO คุณไม่มีทางแยกแยะ edge จริงจาก artifact ทางสถิติได้

กฎ: จำนวนจุดข้อมูล OOS เทียบกับจำนวนพารามิเตอร์

กฎง่ายๆ สำหรับการเชื่อถือผล WFO:

$\frac{\text{OOS trades}}{\text{Parameters}} > 10$

สำหรับ 21 พารามิเตอร์ คุณต้องการอย่างน้อย 210 OOS trades หาก WFO ของคุณสร้างน้อยกว่า — ผลลัพธ์ไม่สามารถเชื่อถือได้

กลยุทธ์ที่มี PnL@ML +3342%: 21 พารามิเตอร์, ข้อมูล 25 เดือน สมมติ 5 OOS window ของ 60 วัน, 2 trades/วัน — รวม $5 \times 60 \times 2 = 600$ OOS trades อัตราส่วน $600/21 = 28.6$ — ยอมรับได้ แต่เฉพาะถ้า WFER > 0.5 เท่านั้น

การรวม WFO กับ Optuna

Bayesian optimization with Optuna integration

ในแต่ละ WFO window คุณต้องการ optimize พารามิเตอร์ สำหรับ 21 พารามิเตอร์ grid search เป็นไปไม่ได้ coordinate descent ไม่มีประสิทธิภาพ ทางเลือกที่เหมาะสมคือ Bayesian optimization ผ่าน Optuna

import optuna
from optuna.samplers import TPESampler

def optuna_optimize(train_data: np.ndarray, n_trials: int = 500) -> tuple:
    """
    Optimize strategy parameters using Optuna.
    Used inside each WFO window.
    """

    def objective(trial):
        fast = trial.suggest_int("fast_period", 3, 50)
        slow = trial.suggest_int("slow_period", 20, 200)
        atr_period = trial.suggest_int("atr_period", 5, 50)
        atr_mult = trial.suggest_float("atr_multiplier", 0.5, 4.0)
        rsi_period = trial.suggest_int("rsi_period", 5, 30)
        rsi_upper = trial.suggest_int("rsi_upper", 60, 85)
        rsi_lower = trial.suggest_int("rsi_lower", 15, 40)
        vol_window = trial.suggest_int("vol_window", 10, 100)
        position_size = trial.suggest_float("position_size", 0.1, 1.0)
        take_profit = trial.suggest_float("take_profit", 0.005, 0.05)
        stop_loss = trial.suggest_float("stop_loss", 0.003, 0.03)
        trailing_pct = trial.suggest_float("trailing_pct", 0.002, 0.02)

        if fast >= slow:
            return -1e6  # invalid combination

        params = {
            "fast_period": fast, "slow_period": slow,
            "atr_period": atr_period, "atr_multiplier": atr_mult,
            "rsi_period": rsi_period, "rsi_upper": rsi_upper,
            "rsi_lower": rsi_lower, "vol_window": vol_window,
            "position_size": position_size,
            "take_profit": take_profit, "stop_loss": stop_loss,
            "trailing_pct": trailing_pct,
        }

        pnl, _ = run_strategy(train_data, params)

        _, returns = run_strategy(train_data, params)
        if len(returns) < 30 or np.std(returns) == 0:
            return -1e6
        sharpe = np.mean(returns) / np.std(returns) * np.sqrt(252)
        return sharpe

    optuna.logging.set_verbosity(optuna.logging.WARNING)

    study = optuna.create_study(
        direction="maximize",
        sampler=TPESampler(seed=42),
    )
    study.optimize(objective, n_trials=n_trials, show_progress_bar=False)

    best_params = study.best_params
    best_pnl, _ = run_strategy(train_data, best_params)

    return best_params, best_pnl

wfo = WalkForwardOptimizer(
    data=prices,
    optimize_fn=optuna_optimize,     # Optuna instead of grid search
    evaluate_fn=my_evaluate,
    mode="rolling",
    train_size=180,
    test_size=60,
    step_size=60,
)

result = wfo.run()

สำคัญ: ภายใน WFO ให้ optimize Sharpe ไม่ใช่ PnL การ optimize PnL ค้นหาพารามิเตอร์ที่ maximize กำไรบนลำดับ trade เฉพาะ การ optimize Sharpe ค้นหาพารามิเตอร์ที่มีอัตราส่วน return/risk ที่ดีที่สุด — ซึ่ง robust กว่า OOS

การเปรียบเทียบโดยละเอียดของ Optuna TPE กับ coordinate descent — ในบทความ Coordinate Descent vs Bayesian

การ visualize ผลลัพธ์ WFO

import matplotlib.pyplot as plt
import matplotlib.patches as mpatches

def plot_wfo_results(result: WFOResult, data: np.ndarray):
    """Visualize Walk-Forward Optimization results."""
    fig, axes = plt.subplots(3, 1, figsize=(16, 14))

    ax = axes[0]
    ax.plot(result.oos_equity, color='#4FC3F7', linewidth=1.5)
    ax.axhline(1.0, color='#FF5252', linestyle='--', alpha=0.5, label='Break-even')
    ax.set_title(f'OOS Equity Curve (WFER={result.wfer:.2f}, Sharpe={result.oos_sharpe:.2f})')
    ax.set_ylabel('Equity')
    ax.legend()
    ax.grid(True, alpha=0.3)

    ax = axes[1]
    wfers = [w.wfer for w in result.windows]
    colors = ['#69F0AE' if w >= 0.5 else '#FFAB40' if w >= 0.3 else '#FF5252'
              for w in wfers]
    ax.bar(range(len(wfers)), wfers, color=colors, edgecolor='#1A237E', alpha=0.8)
    ax.axhline(0.5, color='#E040FB', linestyle='--', label='Threshold (0.5)')
    ax.axhline(0, color='gray', linestyle='-', alpha=0.3)
    ax.set_title('Walk-Forward Efficiency Ratio by Window')
    ax.set_xlabel('Window')
    ax.set_ylabel('WFER')
    ax.legend()

    ax = axes[2]
    x = np.arange(len(result.windows))
    width = 0.35
    ax.bar(x - width/2, [w.is_pnl for w in result.windows],
           width, label='IS PnL', color='#7C4DFF', alpha=0.7)
    ax.bar(x + width/2, [w.oos_pnl for w in result.windows],
           width, label='OOS PnL', color='#4FC3F7', alpha=0.7)
    ax.set_title('In-Sample vs Out-of-Sample PnL')
    ax.set_xlabel('Window')
    ax.set_ylabel('PnL')
    ax.legend()
    ax.grid(True, alpha=0.3)

    plt.tight_layout()
    plt.savefig('wfo_results.png', dpi=150)
    plt.show()

คำแนะนำเชิงปฏิบัติ

Checklist ก่อนปล่อยกลยุทธ์ใช้งานจริง

1. รัน WFO (rolling + anchored)

เปรียบเทียบผลของทั้งสองโหมด หาก rolling WFO ล้มเหลวแต่ anchored ผ่าน — น่าจะแสดงว่ากลยุทธ์ทำงานได้เฉพาะกับข้อมูลช่วงแรก

2. ตรวจสอบ WFER สำหรับแต่ละ window

ไม่ใช่แค่ aggregate WFER แต่แต่ละ window ทีละตัว หาก 2 ใน 6 window มี WFER < 0 — นั่นคือปัญหา แม้ aggregate > 0.5

3. เปรียบเทียบพารามิเตอร์ระหว่าง window

หากพารามิเตอร์ที่เหมาะสม "กระโดด" จาก window สู่ window — ไม่มี edge ที่เสถียร ใช้ Plateau analysis เพื่อตรวจสอบความเสถียรของ optimum

4. ตรวจสอบ degradation rate

Degradation rate ที่ติดลบมาก = พารามิเตอร์สูญเสียประสิทธิภาพอย่างรวดเร็ว คุณต้องการ re-optimization บ่อยขึ้นหรือการปรับปรุงกลยุทธ์

5. ใช้ Monte Carlo bootstrap กับผล OOS

Aggregate OOS PnL ก็ยังเป็นค่าประมาณจุดเดียว ใช้ Monte Carlo bootstrap กับ array ของ OOS returns เพื่อได้ confidence interval

6. คำนึงถึงค่าใช้จ่าย

OOS PnL ต้องรวมค่าคอมมิชชั่น slippage และ funding rates OOS PnL ที่ดูดีโดยไม่มีค่าใช้จ่ายเป็นภาพลวงตา รายละเอียดเพิ่มเติม — Funding rates kill your leverage

ข้อกำหนดข้อมูลขั้นต่ำ

จำนวนพารามิเตอร์	OOS trades ขั้นต่ำ	WFO windows ขั้นต่ำ	ข้อมูลขั้นต่ำ (2 trades/วัน)
2-5	50	3	~6 เดือน
6-10	100	4	~12 เดือน
11-15	150	5	~18 เดือน
16-21	210	6	~24 เดือน
22+	300+	8+	~36+ เดือน

กลยุทธ์ที่มี 21 พารามิเตอร์และข้อมูล 25 เดือน

กลับมาที่คำถามจากต้นบทความ: 21 พารามิเตอร์ที่ optimize บนข้อมูล 25 เดือนจาก pair เดียว PnL@ML = +3342% จะตรวจสอบอย่างไร?

ขั้นตอนที่ 1. Rolling WFO: train = 8 เดือน, test = 2 เดือน, step = 2 เดือน เราจะได้ ~8 window

ขั้นตอนที่ 2. Anchored WFO: train แรก = 8 เดือน, test = 2 เดือน เราจะได้ ~8 window

ขั้นตอนที่ 3. CPCV: 10 กลุ่มของ ~2.5 เดือน, k = 2 เราจะได้ 45 combinations

ขั้นตอนที่ 4. สำหรับแต่ละวิธี ตรวจสอบ:

WFER >= 0.5?
พารามิเตอร์มีเสถียรภาพระหว่าง window?
Degradation rate ยอมรับได้?
OOS trades / Parameters >= 10?

ขั้นตอนที่ 5. Monte Carlo bootstrap บน aggregate OOS returns PnL percentile ที่ 5 > 0?

หากการทดสอบเหล่านี้ล้มเหลว — กลยุทธ์ที่มี +3342% น่าจะ overfit 21 พารามิเตอร์บนข้อมูล 25 เดือนของ pair เดียว — นี่คืออัตราส่วนพารามิเตอร์ต่อข้อมูลที่สูงมาก หากไม่ผ่าน WFO จะไม่มีความน่าเชื่อถือ

เราแนะนำให้ตรวจสอบประสิทธิภาพของกลยุทธ์เพิ่มเติมโดยคำนึงถึง PnL by active time — ซึ่งจะเผยให้เห็นว่าส่วนไหนของ +3342% มาจากเวลาที่ถือ position เทียบกับ edge จริง

สรุป

Walk-Forward Optimization ไม่ใช่ตัวเลือก — มันคือความจำเป็น มันเป็นวิธีเดียวที่ตรวจสอบการถ่ายโอนพารามิเตอร์ไปยังข้อมูลใหม่อย่างเป็นระบบ การแบ่ง train/test ครั้งเดียวคือการเล่นล็อตเตอรี Backtest เต็มรูปแบบบนข้อมูลทั้งหมดคือการหลอกตัวเอง

ข้อสรุปหลัก:

WFER < 0.5 = overfitting หาก OOS PnL น้อยกว่าครึ่งหนึ่งของ IS — พารามิเตอร์ถูก fit
ความเสถียรของพารามิเตอร์สำคัญกว่าค่าสูงสุด พารามิเตอร์ที่ให้ +15% ในทุก window ดีกว่าพารามิเตอร์ที่ให้ +40% ใน window หนึ่งและ -10% ในอีก window
Rolling WFO สำหรับ crypto การสลับ regime ทำให้ anchored WFO น่าเชื่อถือน้อยกว่า rolling window 4-6 เดือนคือความสมดุลที่เหมาะสม
พารามิเตอร์มากขึ้น — ข้อกำหนดเข้มงวดขึ้น 21 พารามิเตอร์ต้องการอย่างน้อย 210 OOS trades และ 6+ WFO window หากไม่มีสิ่งนี้ ผลลัพธ์ไม่สามารถตรวจสอบได้
WFO + Monte Carlo bootstrap + Plateau analysis — สามชั้นของการป้องกัน overfitting แต่ละชั้นจับสิ่งที่ชั้นอื่นพลาดไป

กลยุทธ์ที่ผ่าน WFO ด้วย WFER > 0.5 ในทุก window พารามิเตอร์ที่เสถียร และ bootstrap percentile ที่ 5 เป็นบวก — นั่นคือกลยุทธ์ที่คุณสามารถไว้วางใจกับเงินจริง ทุกอย่างอื่นคือ curve fitting ที่มีเส้น equity curve สวยงาม

ลิงก์ที่เป็นประโยชน์

การอ้างอิง

@article{soloviov2026walkforwardoptimization,
  author = {Soloviov, Eugen},
  title = {Walk-Forward Optimization: การทดสอบกลยุทธ์ที่ซื่อสัตย์เพียงวิธีเดียว},
  year = {2026},
  url = {https://marketmaker.cc/th/blog/post/walk-forward-optimization},
  version = {0.1.0},
  description = {ทำไมการแบ่งข้อมูล train/test ครั้งเดียวถึงไม่สามารถป้องกัน overfitting ได้ WFO ช่วยตรวจสอบความแข็งแกร่งของพารามิเตอร์อย่างเป็นระบบได้อย่างไร และทำไมกลยุทธ์ที่มี PnL@ML +3342\% บน 21 พารามิเตอร์จึงเป็น time bomb ที่รอระเบิดหากไม่ผ่าน WFO}
}