BranchFolding

BranchFolderLegacy 是 LLVM 后端的一个 跳转优化(Branch Folding)Pass,主要用于优化 汇编级别的控制流图(CFG),提升性能和可读性。
该 Pass 依赖如下的分析数据:

  • 基本块频率预测分析:MachineBlockFrequencyInfoWrapperPass
  • 分支频率预测分析:MachineBranchProbabilityInfoWrapperPass
  • PGO 也可以作为输入: ProfileSummaryInfoWrapperPass

当然还有其他的,例如 Subtarget、InstrInfo、RegisterInfo,这就不用说了。

核心入口为:

bool EnableTailMerge = !MF.getTarget().requiresStructuredCFG() &&  
                       PassConfig->getEnableTailMerge();  
MBFIWrapper MBBFreqInfo(  
    getAnalysis<MachineBlockFrequencyInfoWrapperPass>().getMBFI());  
BranchFolder Folder(  
    EnableTailMerge, /*CommonHoist=*/true, MBBFreqInfo,  
    getAnalysis<MachineBranchProbabilityInfoWrapperPass>().getMBPI(),  
    &getAnalysis<ProfileSummaryInfoWrapperPass>().getPSI());  
return Folder.OptimizeFunction(MF, MF.getSubtarget().getInstrInfo(),  
                               MF.getSubtarget().getRegisterInfo());

和很多迭代式优化一样,其通常是 while 循环,一直合并基本块直到不能合并为止:

bool MadeChangeThisIteration = true;  
while (MadeChangeThisIteration) {  
  MadeChangeThisIteration    = TailMergeBlocks(MF);  
  // No need to clean up if tail merging does not change anything after the  
  // block placement.  if (!AfterBlockPlacement || MadeChangeThisIteration)  
    MadeChangeThisIteration |= OptimizeBranches(MF);  
  if (EnableHoistCommonCode)  
    MadeChangeThisIteration |= HoistCommonCode(MF);  
  MadeChange |= MadeChangeThisIteration;  
}

那么 TailMergeBlocks 究竟是什么?
举个例子:
假设我们有以下两个基本块,在汇编级别上:

BB1:
  cmp r1, #0
  bne BB3
  mov r0, #1
  bx lr

BB2:
  cmp r2, #0
  beq BB4
  mov r0, #1
  bx lr

注意:

  • BB1 和 BB2 的尾部指令完全相同
BB1:
  cmp r1, #0
  bne BB3
  b Tail

BB2:
  cmp r2, #0
  beq BB4
  b Tail

Tail:
  mov r0, #1
  bx lr

这样:

  • 重复的尾部指令只保留一份。
  • 原先两个块跳转到公共的 Tail 块。
  • 减少了代码大小,提高了 cache 局部性。

相信到这儿,你一定能看出,这可能是一个无底洞(复杂度较高的 Pass),索性开发者提供了两个选项:

// Throttle for huge numbers of predecessors (compile speed problems)  
static cl::opt<unsigned>  
TailMergeThreshold("tail-merge-threshold",  
          cl::desc("Max number of predecessors to consider tail merging"),  
          cl::init(150), cl::Hidden);  
  
// Heuristic for tail merging (and, inversely, tail duplication).  
static cl::opt<unsigned>  
TailMergeSize("tail-merge-size",  
              cl::desc("Min number of instructions to consider tail merging"),  
              cl::init(3), cl::Hidden);

用来启发式指定合并的最大最小指令数量。
也可以通过 -pass-remarks-analysis=branch-folder 打印统计信息:

STATISTIC(NumDeadBlocks, "Number of dead blocks removed");  
STATISTIC(NumBranchOpts, "Number of branches optimized");  
STATISTIC(NumTailMerge , "Number of block tails merged");  
STATISTIC(NumHoist     , "Number of times common instructions are hoisted");  
STATISTIC(NumTailCalls,  "Number of tail calls optimized");

还有一些其他的知识,如这个 Pass 进行了如下四个步骤:

  • 相同代码提升
  • 基本块尾合并
  • 分支优化(带权重的,如 FallThrough)

附录 A PassManager

该 Pass 目前支持新旧两个版本的 Pass 管理器,均可以使用(前者是新版):

PreservedAnalyses BranchFolderPass::run(MachineFunction &MF,  
                                        MachineFunctionAnalysisManager &MFAM) {  
  MFPropsModifier _(*this, MF);  
  bool EnableTailMerge =  
      !MF.getTarget().requiresStructuredCFG() && this->EnableTailMerge;  
  
  auto &MBPI = MFAM.getResult<MachineBranchProbabilityAnalysis>(MF);  
  auto *PSI = MFAM.getResult<ModuleAnalysisManagerMachineFunctionProxy>(MF)  
                  .getCachedResult<ProfileSummaryAnalysis>(  
                      *MF.getFunction().getParent());  
  if (!PSI)  
    report_fatal_error(  
        "ProfileSummaryAnalysis is required for BranchFoldingPass", false);  
  
  auto &MBFI = MFAM.getResult<MachineBlockFrequencyAnalysis>(MF);  
  MBFIWrapper MBBFreqInfo(MBFI);  
  BranchFolder Folder(EnableTailMerge, /*CommonHoist=*/true, MBBFreqInfo, MBPI,  
                      PSI);  
  if (!Folder.OptimizeFunction(MF, MF.getSubtarget().getInstrInfo(),  
                               MF.getSubtarget().getRegisterInfo()))  
    return PreservedAnalyses::all();  
  return getMachineFunctionPassPreservedAnalyses();  
}  
  
bool BranchFolderLegacy::runOnMachineFunction(MachineFunction &MF) {  
  if (skipFunction(MF.getFunction()))  
    return false;  
  
  TargetPassConfig *PassConfig = &getAnalysis<TargetPassConfig>();  
  // TailMerge can create jump into if branches that make CFG irreducible for  
  // HW that requires structurized CFG.  bool EnableTailMerge = !MF.getTarget().requiresStructuredCFG() &&  
                         PassConfig->getEnableTailMerge();  
  MBFIWrapper MBBFreqInfo(  
      getAnalysis<MachineBlockFrequencyInfoWrapperPass>().getMBFI());  
  BranchFolder Folder(  
      EnableTailMerge, /*CommonHoist=*/true, MBBFreqInfo,  
      getAnalysis<MachineBranchProbabilityInfoWrapperPass>().getMBPI(),  
      &getAnalysis<ProfileSummaryInfoWrapperPass>().getPSI());  
  return Folder.OptimizeFunction(MF, MF.getSubtarget().getInstrInfo(),  
                                 MF.getSubtarget().getRegisterInfo());  
}