Chapter 7 · Direct Methods · Merged Responsive Edition

Algorithms for Optimization · Chapter 7 · 第七章

Derivative-Free Optimisation
无导数优化方法 · Six Methods, One Logic

当目标函数是黑盒（由仿真、实验或不可微代码定义），只能输入点、得到函数值时，所有基于梯度的方法完全失效。本章六种方法从最朴素的逐坐标搜索出发，每种方法回应前一种的局限，最终到达唯一的全局方法 DIRECT。
Six methods, each one a direct response to the limitation of its predecessor — from axis-by-axis search to global domain partitioning.

Black-box objective 黑盒目标 Function evaluations only 只看函数值 Local → Global 局部到全局 Each method answers the previous one's limitation

Method Chain · 方法演进链

§ 7.1 · Local · 局部

Cyclic Coordinate Search

循环坐标搜索 · Taxicab search · Coordinate descent

The simplest starting point: optimise one coordinate at a time, cycling through all n axes. Think of it as a hiker who can only walk east-west or north-south — never diagonally.

最简单的起点：每次只动一个坐标，轮流遍历。就像只能横向或纵向移动的登山者——永远不走斜线。

Per step

1 line search along one axis
沿一个坐标轴做线搜索

Watch out

Diagonal valleys → gets stuck permanently
对角谷问题，会永久卡死

localaxis-aligned

Explore →

↓

Limitation → Motivation

Fixed coordinate axes can't descend diagonal valleys. → Let search directions evolve with each cycle. 让方向自适应函数几何结构。

§ 7.2 · Local · 局部

Powell's Method

Powell 方法 · Conjugate direction method · 共轭方向

Keep doing line searches, but replace the oldest direction with the overall direction of progress each cycle. Directions gradually align with the valley geometry. Exact on quadratics in n iterations.

保留线搜索，但每轮用实际进展方向替换最旧方向。方向组逐渐对齐函数山谷。对二次函数 n 轮精确求解。

Quadratic guarantee

Exact in n full cycles
n 轮精确解二次函数

Watch out

Directions may become linearly dependent → reset every n iters
方向退化，需定期重置

localconjugate dirs

Explore →

↓

Limitation → Motivation

Full line searches are expensive; n(n+1) searches per cycle. → Just take cheap ±α probing steps instead. 用小步探测代替完整线搜索。

§ 7.3 · Local · 局部

Hooke-Jeeves Method

Hooke-Jeeves 方法 · Direct search · 坐标探步

Replace expensive line searches with cheap probing: test ±α steps in every coordinate direction, accept the best improvement, shrink α when stuck. 2n evaluations per step.

用廉价探测替代完整线搜索：在每个坐标方向走 ±α 小步，接受最优改善，卡住时缩小步长。每步 2n 次求值。

Cost per step

2n function evaluations
每步 2n 次函数求值

Watch out

2n is wasteful in high dimensions
高维时求值代价大

local2n evals/step

Explore →

↓

Limitation → Motivation

2n evaluations per step is more than necessary. → Any positive spanning set needs only n+1 directions and still guarantees descent. 正生成集只需 n+1 个方向。

§ 7.4 · Local · 局部

Generalized Pattern Search

广义模式搜索 · Mesh adaptive · 正生成集

Use any positive spanning set of directions (minimum n+1 instead of 2n). Adds opportunistic acceptance (grab first improvement) and dynamic ordering (try winning directions first).

使用任意正生成集（最少 n+1 个方向）。加入机会性接受和动态排序两个增强机制。

Minimum directions

n+1 vs 2n in Hooke-Jeeves
最少 n+1 个方向

Key requirement

Positive spanning set
正生成集

local≥n+1 dirs

Explore →

↓

Paradigm shift · 范式转变

All above search from a single point — always at risk of local minima. → Maintain a whole geometric shape (n+1 points = simplex) that deforms like an amoeba. 维护 n+1 个点构成的几何体，像变形虫一样爬向极小值。

§ 7.5 · Local · 局部

Nelder-Mead Simplex

Nelder-Mead 单纯形法 · Amoeba method

Maintain a simplex of n+1 points (triangle in 2D). Four operations — reflect, expand, contract, shrink — deform the shape toward the minimum. Converges when std of function values < ε.

维护 n+1 个点的单纯形（2D 中是三角形）。通过反射、扩展、收缩、缩减四种操作变形。收敛判据：顶点函数值标准差 < ε。

4 operations

Reflect · Expand · Contract · Shrink

Convergence

std(f-values at vertices) < ε

localn+1 vertices

Explore →

↓

Shift to global · 转向全局

All 5 methods are local — no guarantee of finding the global minimum. → Divide the entire domain into rectangles; use Lipschitz theory to systematically rule out regions. 将整个域划分为矩形，用 Lipschitz 理论系统性地排除不可能含全局最小值的区域。

§ 7.6 · Global · 全局

DIRECT — Divided Rectangles

分矩形算法 · DIvided RECTangles · Lipschitzian global optimisation

Trisect the search space into rectangles, sampling each centre. A convex hull rule selects "potentially optimal" intervals — covering all possible Lipschitz constants simultaneously, without specifying any.

将搜索空间三等分为矩形，在每个中心采样。凸壳规则选取潜在最优区间——同时覆盖所有可能的 Lipschitz 常数，无需指定任何一个。

Global guarantee

No ℓ needed · 无需 Lipschitz 常数

Selection rule

Convex hull in (r, f(c)) space · 凸壳规则覆盖所有 ℓ

globalno ℓ needed

Explore →

Quick Jump · 快速跳转

★

Summary

全章总结

§7.1

Cyclic

循环坐标

§7.2

Powell

共轭方向

§7.3

Hooke-Jeeves

坐标探步

§7.4

Pattern

广义模式

§7.5

Nelder-Mead

单纯形法

§7.6

DIRECT

分矩形

✎

Quiz

随机自测

★ Chapter Summary · 全章总结

Direct Methods — The Complete Picture

把第七章六种方法放进同一条逻辑链里：为什么出现、解决什么、又留下什么

One-Line Backbone · 一条主线

Fixed coordinate directions → adaptive directions → cheap local probes → positive spanning sets → simplex deformation → global domain partitioning.

固定坐标方向 → 自适应方向 → 廉价局部探测 → 正生成集 → 单纯形变形 → 全局区域划分。整章不是孤立算法堆砌，而是一条连续的“回应局限”的演化链。

Question 1

When gradients are unavailable, what geometric information can still guide search?
没有梯度时，还能利用哪些几何直觉来找下降方向？

Question 2

How do we reduce function evaluations without losing the ability to descend?
怎样在减少函数求值的同时仍保留下降能力？

Question 3

How do we move from local search to a global-search mindset?
如何从局部搜索过渡到全局搜索？

Method-by-Method Logic · 每种方法在整章中的位置

1. Cyclic Coordinate Search · 循环坐标搜索

Core idea: optimize one coordinate at a time. Why it matters: it is the cleanest possible derivative-free baseline. Main weakness: axis directions are fixed, so diagonal valleys can trap it badly.
核心思想是逐坐标优化。它的重要性在于提供了最原始的无导数起点；但方向固定在坐标轴上，所以一旦谷底斜着走，它就会很笨。

2. Powell’s Method · Powell 方法

Core idea: still use line searches, but let directions evolve from actual progress. Improvement over cyclic: directions can gradually align with the valley. New issue: full line searches remain expensive, especially in high dimensions.
它保留线搜索的精度，但方向不再死板，而是逐步学习函数地形；问题是线搜索本身仍然贵。

3. Hooke–Jeeves · Hooke-Jeeves 方法

Core idea: replace full line searches with cheap ±α probes. Improvement over Powell: much cheaper per move. New issue: checking all 2n coordinate directions can itself become wasteful.
它用“试一小步看看有没有变好”代替完整线搜索，更像试探式走路；但在高维时，2n 个方向依然不少。

4. Generalized Pattern Search · 广义模式搜索

Core idea: use any positive spanning set, not necessarily all ± coordinate directions. Improvement over Hooke–Jeeves: only enough directions to guarantee a descent opportunity. New issue: it is still a single-point local method.
它把“哪些方向足够用”这件事抽象成正生成集，因此比单纯坐标探测更一般也更省；但本质上还是局部搜索。

5. Nelder–Mead Simplex · Nelder–Mead 单纯形法

Core idea: carry a whole simplex of n+1 points and deform its shape. Improvement over point-based methods: geometry now comes from a moving body, not one point plus test directions. New issue: still local, and theoretical guarantees are weaker than the neat positive-spanning framework.
它不再只盯住一个点，而是让一个会变形的几何体往低处滑，直观性很强；但它仍然可能困在局部极小值。

6. DIRECT · 划分矩形法

Core idea: partition the whole domain and select potentially optimal rectangles using lower-bound logic. Why it is different: this is the chapter’s global method. Tradeoff: broader exploration, slower fine local polishing.
DIRECT 不再问“当前点附近怎么走”，而是问“整个定义域里哪些区域最值得继续分裂”。这是本章从局部思维转向全局思维的关键一步。

Full Comparison · 完整对比

Method	Main move	Typical cost	Local / Global	Best for
Cyclic 循环坐标	Axis-by-axis line search	1 line search each coordinate move	Local	Teaching intuition, very simple structure
Powell Powell	Adaptive direction line searches	Many line searches per cycle	Local	Smooth valley-like problems without derivatives
Hooke–Jeeves	±α local probes	About 2n evaluations per exploratory step	Local	Cheap function evaluations, simple implementation
Pattern Search 模式搜索	Positive spanning directions	At least n+1 directions	Local	More principled direct search design
Nelder–Mead	Simplex deformation	n+1 vertices updated geometrically	Local	Low-dimensional black-box tuning
DIRECT	Rectangle partition + lower bound	Global partition budget grows over time	Global	Unknown landscape, many local minima

How to Choose · 实际怎么选

If the dimension is low and you want a very intuitive method, start from Nelder–Mead or Powell.
If line searches are too expensive, think Hooke–Jeeves or Pattern Search.
If the geometry is strongly axis-misaligned, avoid relying on Cyclic Coordinate Search alone.
If you worry the function has many local minima and the search box matters, DIRECT is the natural chapter-7 answer.
If you need a method that looks elegant in theory, Pattern Search gives the cleanest bridge from coordinate probing to abstract direction sets.

简单说：低维调参看 Nelder–Mead；想保留线搜索精度看 Powell；想省函数值看 Hooke–Jeeves / Pattern Search；怕掉进局部极小值就看 DIRECT。

Common Confusions · 常见易混点

Powell 不是“换坐标轴”，而是“更新一组搜索方向”。
Hooke–Jeeves 与 Pattern Search 都是探测步思想，但后者强调正生成集这一更一般框架。
Nelder–Mead simplex 不是线性规划里的 simplex algorithm，它们完全不是一回事。
DIRECT 也不是随机全局搜索；它是确定性的区域划分方法。
前五个方法本质上都偏局部；本章真正代表“全局搜索视角”的是 DIRECT。

What Examiners Usually Want · 复习时最该会什么

Understand the motivation

You should be able to say not only what each method does, but exactly which weakness of the previous method it is trying to fix.
复习时最值钱的不是背步骤，而是会说“它为什么会出现”。

Distinguish local and global

Know clearly where the chapter changes mindset: before DIRECT, search is centered around improving from current point(s); DIRECT evaluates entire regions.
一定分清局部搜索和全局区域划分的分水岭。

Compare evaluation budgets

A core practical theme is how many function evaluations are needed: line search costs, 2n probes, n+1 directions, simplex updates, domain partition growth.

Keep the geometry in mind

Axes, adaptive directions, positive spanning sets, simplexes, and rectangles are the geometric language of this chapter.
这一章本质上是在学不同的“几何搜索载体”。