Ch 7 Direct Methods · 直接方法
Algorithms for Optimization · Chapter 7 · 第七章
Derivative-Free Optimisation
无导数优化方法 · Six Methods, One Logic
当目标函数是黑盒(由仿真、实验或不可微代码定义),只能输入点、得到函数值时,所有基于梯度的方法完全失效。本章六种方法从最朴素的逐坐标搜索出发,每种方法回应前一种的局限,最终到达唯一的全局方法 DIRECT。
Six methods, each one a direct response to the limitation of its predecessor — from axis-by-axis search to global domain partitioning.
Black-box objective 黑盒目标 Function evaluations only 只看函数值 Local → Global 局部到全局 Each method answers the previous one's limitation
Method Chain · 方法演进链
1
§ 7.1 · Local · 局部
Cyclic Coordinate Search
循环坐标搜索 · Taxicab search · Coordinate descent
The simplest starting point: optimise one coordinate at a time, cycling through all n axes. Think of it as a hiker who can only walk east-west or north-south — never diagonally.
最简单的起点:每次只动一个坐标,轮流遍历。就像只能横向或纵向移动的登山者——永远不走斜线。
Per step
1 line search along one axis
沿一个坐标轴做线搜索
Watch out
Diagonal valleys → gets stuck permanently
对角谷问题,会永久卡死
localaxis-aligned
Explore →
Limitation → Motivation
Fixed coordinate axes can't descend diagonal valleys. → Let search directions evolve with each cycle. 让方向自适应函数几何结构。
2
§ 7.2 · Local · 局部
Powell's Method
Powell 方法 · Conjugate direction method · 共轭方向
Keep doing line searches, but replace the oldest direction with the overall direction of progress each cycle. Directions gradually align with the valley geometry. Exact on quadratics in n iterations.
保留线搜索,但每轮用实际进展方向替换最旧方向。方向组逐渐对齐函数山谷。对二次函数 n 轮精确求解。
Quadratic guarantee
Exact in n full cycles
n 轮精确解二次函数
Watch out
Directions may become linearly dependent → reset every n iters
方向退化,需定期重置
localconjugate dirs
Explore →
Limitation → Motivation
Full line searches are expensive; n(n+1) searches per cycle. → Just take cheap ±α probing steps instead. 用小步探测代替完整线搜索。
3
§ 7.3 · Local · 局部
Hooke-Jeeves Method
Hooke-Jeeves 方法 · Direct search · 坐标探步
Replace expensive line searches with cheap probing: test ±α steps in every coordinate direction, accept the best improvement, shrink α when stuck. 2n evaluations per step.
用廉价探测替代完整线搜索:在每个坐标方向走 ±α 小步,接受最优改善,卡住时缩小步长。每步 2n 次求值。
Cost per step
2n function evaluations
每步 2n 次函数求值
Watch out
2n is wasteful in high dimensions
高维时求值代价大
local2n evals/step
Explore →
Limitation → Motivation
2n evaluations per step is more than necessary. → Any positive spanning set needs only n+1 directions and still guarantees descent. 正生成集只需 n+1 个方向。
4
§ 7.4 · Local · 局部
Generalized Pattern Search
广义模式搜索 · Mesh adaptive · 正生成集
Use any positive spanning set of directions (minimum n+1 instead of 2n). Adds opportunistic acceptance (grab first improvement) and dynamic ordering (try winning directions first).
使用任意正生成集(最少 n+1 个方向)。加入机会性接受和动态排序两个增强机制。
Minimum directions
n+1 vs 2n in Hooke-Jeeves
最少 n+1 个方向
Key requirement
Positive spanning set
正生成集
local≥n+1 dirs
Explore →
Paradigm shift · 范式转变
All above search from a single point — always at risk of local minima. → Maintain a whole geometric shape (n+1 points = simplex) that deforms like an amoeba. 维护 n+1 个点构成的几何体,像变形虫一样爬向极小值。
5
§ 7.5 · Local · 局部
Nelder-Mead Simplex
Nelder-Mead 单纯形法 · Amoeba method
Maintain a simplex of n+1 points (triangle in 2D). Four operations — reflect, expand, contract, shrink — deform the shape toward the minimum. Converges when std of function values < ε.
维护 n+1 个点的单纯形(2D 中是三角形)。通过反射、扩展、收缩、缩减四种操作变形。收敛判据:顶点函数值标准差 < ε。
4 operations
Reflect · Expand · Contract · Shrink
Convergence
std(f-values at vertices) < ε
localn+1 vertices
Explore →
Shift to global · 转向全局
All 5 methods are local — no guarantee of finding the global minimum. → Divide the entire domain into rectangles; use Lipschitz theory to systematically rule out regions. 将整个域划分为矩形,用 Lipschitz 理论系统性地排除不可能含全局最小值的区域。
6
§ 7.6 · Global · 全局
DIRECT — Divided Rectangles
分矩形算法 · DIvided RECTangles · Lipschitzian global optimisation
Trisect the search space into rectangles, sampling each centre. A convex hull rule selects "potentially optimal" intervals — covering all possible Lipschitz constants simultaneously, without specifying any.
将搜索空间三等分为矩形,在每个中心采样。凸壳规则选取潜在最优区间——同时覆盖所有可能的 Lipschitz 常数,无需指定任何一个。
Global guarantee
No ℓ needed · 无需 Lipschitz 常数
Selection rule
Convex hull in (r, f(c)) space · 凸壳规则覆盖所有 ℓ
globalno ℓ needed
Explore →
Quick Jump · 快速跳转
Summary
全章总结
§7.1
Cyclic
循环坐标
§7.2
Powell
共轭方向
§7.3
Hooke-Jeeves
坐标探步
§7.4
Pattern
广义模式
§7.5
Nelder-Mead
单纯形法
§7.6
DIRECT
分矩形
Quiz
随机自测
★ Chapter Summary · 全章总结
Direct Methods — The Complete Picture
把第七章六种方法放进同一条逻辑链里:为什么出现、解决什么、又留下什么
One-Line Backbone · 一条主线
Fixed coordinate directions → adaptive directions → cheap local probes → positive spanning sets → simplex deformation → global domain partitioning.
固定坐标方向 → 自适应方向 → 廉价局部探测 → 正生成集 → 单纯形变形 → 全局区域划分。整章不是孤立算法堆砌,而是一条连续的“回应局限”的演化链。
Question 1
When gradients are unavailable, what geometric information can still guide search?
没有梯度时,还能利用哪些几何直觉来找下降方向?
Question 2
How do we reduce function evaluations without losing the ability to descend?
怎样在减少函数求值的同时仍保留下降能力?
Question 3
How do we move from local search to a global-search mindset?
如何从局部搜索过渡到全局搜索?
Method-by-Method Logic · 每种方法在整章中的位置
1. Cyclic Coordinate Search · 循环坐标搜索

Core idea: optimize one coordinate at a time. Why it matters: it is the cleanest possible derivative-free baseline. Main weakness: axis directions are fixed, so diagonal valleys can trap it badly.
核心思想是逐坐标优化。它的重要性在于提供了最原始的无导数起点;但方向固定在坐标轴上,所以一旦谷底斜着走,它就会很笨。

2. Powell’s Method · Powell 方法

Core idea: still use line searches, but let directions evolve from actual progress. Improvement over cyclic: directions can gradually align with the valley. New issue: full line searches remain expensive, especially in high dimensions.
它保留线搜索的精度,但方向不再死板,而是逐步学习函数地形;问题是线搜索本身仍然贵。

3. Hooke–Jeeves · Hooke-Jeeves 方法

Core idea: replace full line searches with cheap ±α probes. Improvement over Powell: much cheaper per move. New issue: checking all 2n coordinate directions can itself become wasteful.
它用“试一小步看看有没有变好”代替完整线搜索,更像试探式走路;但在高维时,2n 个方向依然不少。

4. Generalized Pattern Search · 广义模式搜索

Core idea: use any positive spanning set, not necessarily all ± coordinate directions. Improvement over Hooke–Jeeves: only enough directions to guarantee a descent opportunity. New issue: it is still a single-point local method.
它把“哪些方向足够用”这件事抽象成正生成集,因此比单纯坐标探测更一般也更省;但本质上还是局部搜索。

5. Nelder–Mead Simplex · Nelder–Mead 单纯形法

Core idea: carry a whole simplex of n+1 points and deform its shape. Improvement over point-based methods: geometry now comes from a moving body, not one point plus test directions. New issue: still local, and theoretical guarantees are weaker than the neat positive-spanning framework.
它不再只盯住一个点,而是让一个会变形的几何体往低处滑,直观性很强;但它仍然可能困在局部极小值。

6. DIRECT · 划分矩形法

Core idea: partition the whole domain and select potentially optimal rectangles using lower-bound logic. Why it is different: this is the chapter’s global method. Tradeoff: broader exploration, slower fine local polishing.
DIRECT 不再问“当前点附近怎么走”,而是问“整个定义域里哪些区域最值得继续分裂”。这是本章从局部思维转向全局思维的关键一步。

Full Comparison · 完整对比
MethodMain moveTypical costLocal / GlobalBest for
Cyclic 循环坐标Axis-by-axis line search1 line search each coordinate moveLocalTeaching intuition, very simple structure
Powell PowellAdaptive direction line searchesMany line searches per cycleLocalSmooth valley-like problems without derivatives
Hooke–Jeeves±α local probesAbout 2n evaluations per exploratory stepLocalCheap function evaluations, simple implementation
Pattern Search 模式搜索Positive spanning directionsAt least n+1 directionsLocalMore principled direct search design
Nelder–MeadSimplex deformationn+1 vertices updated geometricallyLocalLow-dimensional black-box tuning
DIRECTRectangle partition + lower boundGlobal partition budget grows over timeGlobalUnknown landscape, many local minima

How to Choose · 实际怎么选

  • If the dimension is low and you want a very intuitive method, start from Nelder–Mead or Powell.
  • If line searches are too expensive, think Hooke–Jeeves or Pattern Search.
  • If the geometry is strongly axis-misaligned, avoid relying on Cyclic Coordinate Search alone.
  • If you worry the function has many local minima and the search box matters, DIRECT is the natural chapter-7 answer.
  • If you need a method that looks elegant in theory, Pattern Search gives the cleanest bridge from coordinate probing to abstract direction sets.
简单说:低维调参看 Nelder–Mead;想保留线搜索精度看 Powell;想省函数值看 Hooke–Jeeves / Pattern Search;怕掉进局部极小值就看 DIRECT。

Common Confusions · 常见易混点

  • Powell 不是“换坐标轴”,而是“更新一组搜索方向”。
  • Hooke–JeevesPattern Search 都是探测步思想,但后者强调正生成集这一更一般框架。
  • Nelder–Mead simplex 不是线性规划里的 simplex algorithm,它们完全不是一回事。
  • DIRECT 也不是随机全局搜索;它是确定性的区域划分方法。
  • 前五个方法本质上都偏局部;本章真正代表“全局搜索视角”的是 DIRECT。
What Examiners Usually Want · 复习时最该会什么
Understand the motivation
You should be able to say not only what each method does, but exactly which weakness of the previous method it is trying to fix.
复习时最值钱的不是背步骤,而是会说“它为什么会出现”。
Distinguish local and global
Know clearly where the chapter changes mindset: before DIRECT, search is centered around improving from current point(s); DIRECT evaluates entire regions.
一定分清局部搜索和全局区域划分的分水岭。
Compare evaluation budgets
A core practical theme is how many function evaluations are needed: line search costs, 2n probes, n+1 directions, simplex updates, domain partition growth.
Keep the geometry in mind
Axes, adaptive directions, positive spanning sets, simplexes, and rectangles are the geometric language of this chapter.
这一章本质上是在学不同的“几何搜索载体”。
Self-Test · 自测系统
Quiz Bank with Scoring
7题随机抽取 · 可查翻译 · 提交后自动评分并显示解析
Your Score · 得分
答题后点击 Submit & Score。