后端Pass简介——BranchFolding
BranchFolding
BranchFolderLegacy 是 LLVM 后端的一个 跳转优化(Branch Folding)Pass,主要用于优化 汇编级别的控制流图(CFG),提升性能和可读性。
该 Pass 依赖如下的分析数据:
- 基本块频率预测分析:MachineBlockFrequencyInfoWrapperPass
- 分支频率预测分析:MachineBranchProbabilityInfoWrapperPass
- PGO 也可以作为输入: ProfileSummaryInfoWrapperPass
当然还有其他的,例如 Subtarget、InstrInfo、RegisterInfo,这就不用说了。
核心入口为:
bool EnableTailMerge = !MF.getTarget().requiresStructuredCFG() &&
PassConfig->getEnableTailMerge();
MBFIWrapper MBBFreqInfo(
getAnalysis<MachineBlockFrequencyInfoWrapperPass>().getMBFI());
BranchFolder Folder(
EnableTailMerge, /*CommonHoist=*/true, MBBFreqInfo,
getAnalysis<MachineBranchProbabilityInfoWrapperPass>().getMBPI(),
&getAnalysis<ProfileSummaryInfoWrapperPass>().getPSI());
return Folder.OptimizeFunction(MF, MF.getSubtarget().getInstrInfo(),
MF.getSubtarget().getRegisterInfo());
和很多迭代式优化一样,其通常是 while 循环,一直合并基本块直到不能合并为止:
bool MadeChangeThisIteration = true;
while (MadeChangeThisIteration) {
MadeChangeThisIteration = TailMergeBlocks(MF);
// No need to clean up if tail merging does not change anything after the
// block placement. if (!AfterBlockPlacement || MadeChangeThisIteration)
MadeChangeThisIteration |= OptimizeBranches(MF);
if (EnableHoistCommonCode)
MadeChangeThisIteration |= HoistCommonCode(MF);
MadeChange |= MadeChangeThisIteration;
}
那么 TailMergeBlocks 究竟是什么?
举个例子:
假设我们有以下两个基本块,在汇编级别上:
BB1:
cmp r1, #0
bne BB3
mov r0, #1
bx lr
BB2:
cmp r2, #0
beq BB4
mov r0, #1
bx lr
注意:
- BB1 和 BB2 的尾部指令完全相同:
BB1:
cmp r1, #0
bne BB3
b Tail
BB2:
cmp r2, #0
beq BB4
b Tail
Tail:
mov r0, #1
bx lr
这样:
- 重复的尾部指令只保留一份。
- 原先两个块跳转到公共的 Tail 块。
- 减少了代码大小,提高了 cache 局部性。
相信到这儿,你一定能看出,这可能是一个无底洞(复杂度较高的 Pass),索性开发者提供了两个选项:
// Throttle for huge numbers of predecessors (compile speed problems)
static cl::opt<unsigned>
TailMergeThreshold("tail-merge-threshold",
cl::desc("Max number of predecessors to consider tail merging"),
cl::init(150), cl::Hidden);
// Heuristic for tail merging (and, inversely, tail duplication).
static cl::opt<unsigned>
TailMergeSize("tail-merge-size",
cl::desc("Min number of instructions to consider tail merging"),
cl::init(3), cl::Hidden);
用来启发式指定合并的最大最小指令数量。
也可以通过 -pass-remarks-analysis=branch-folder
打印统计信息:
STATISTIC(NumDeadBlocks, "Number of dead blocks removed");
STATISTIC(NumBranchOpts, "Number of branches optimized");
STATISTIC(NumTailMerge , "Number of block tails merged");
STATISTIC(NumHoist , "Number of times common instructions are hoisted");
STATISTIC(NumTailCalls, "Number of tail calls optimized");
还有一些其他的知识,如这个 Pass 进行了如下四个步骤:
- 相同代码提升
- 基本块尾合并
- 分支优化(带权重的,如 FallThrough)
附录 A PassManager
该 Pass 目前支持新旧两个版本的 Pass 管理器,均可以使用(前者是新版):
PreservedAnalyses BranchFolderPass::run(MachineFunction &MF,
MachineFunctionAnalysisManager &MFAM) {
MFPropsModifier _(*this, MF);
bool EnableTailMerge =
!MF.getTarget().requiresStructuredCFG() && this->EnableTailMerge;
auto &MBPI = MFAM.getResult<MachineBranchProbabilityAnalysis>(MF);
auto *PSI = MFAM.getResult<ModuleAnalysisManagerMachineFunctionProxy>(MF)
.getCachedResult<ProfileSummaryAnalysis>(
*MF.getFunction().getParent());
if (!PSI)
report_fatal_error(
"ProfileSummaryAnalysis is required for BranchFoldingPass", false);
auto &MBFI = MFAM.getResult<MachineBlockFrequencyAnalysis>(MF);
MBFIWrapper MBBFreqInfo(MBFI);
BranchFolder Folder(EnableTailMerge, /*CommonHoist=*/true, MBBFreqInfo, MBPI,
PSI);
if (!Folder.OptimizeFunction(MF, MF.getSubtarget().getInstrInfo(),
MF.getSubtarget().getRegisterInfo()))
return PreservedAnalyses::all();
return getMachineFunctionPassPreservedAnalyses();
}
bool BranchFolderLegacy::runOnMachineFunction(MachineFunction &MF) {
if (skipFunction(MF.getFunction()))
return false;
TargetPassConfig *PassConfig = &getAnalysis<TargetPassConfig>();
// TailMerge can create jump into if branches that make CFG irreducible for
// HW that requires structurized CFG. bool EnableTailMerge = !MF.getTarget().requiresStructuredCFG() &&
PassConfig->getEnableTailMerge();
MBFIWrapper MBBFreqInfo(
getAnalysis<MachineBlockFrequencyInfoWrapperPass>().getMBFI());
BranchFolder Folder(
EnableTailMerge, /*CommonHoist=*/true, MBBFreqInfo,
getAnalysis<MachineBranchProbabilityInfoWrapperPass>().getMBPI(),
&getAnalysis<ProfileSummaryInfoWrapperPass>().getPSI());
return Folder.OptimizeFunction(MF, MF.getSubtarget().getInstrInfo(),
MF.getSubtarget().getRegisterInfo());
}
评论