Daily Pass Introduction: MachineFunctionSplitter
MachineFunctionSplitter
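MachineFunctionSplitter is LLVM's Machine Function Splitting optimization pass. Guided by configurable thresholds and by profile data (instrumentation-based or sampled), it moves cold (rarely executed) basic blocks into a separate cold code section to improve instruction cache and TLB utilization.
The pass is mostly about the rationale for splitting, that is, deciding which blocks should be split and why:
- ProfileSummaryInfo (PSI) and MachineBlockFrequencyInfo (MBFI): these provide the function/basic-block profile summary and the concrete execution frequencies, respectively, and are used to decide which basic blocks are "cold".
- BasicBlockSections: LLVM can group basic blocks into different sections, and this pass tags cold blocks for a dedicated cold section (for example .text.unlikely.*, or .text.split.* depending on the LLVM version).
- Threshold checks: cold blocks are chosen by a percentile (PercentileCutoff) or by a minimum execution count (ColdCountThreshold); the two profile types (instrumentation vs. sample) follow different coldness rules.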
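To make the per-profile rules concrete, here is a minimal, self-contained sketch of how such a coldness check could be structured. It is a model, not LLVM's code: `ProfileKind`, `ColdPolicy`, and `isColdBlockModel` are hypothetical names, and the percentile lookup is reduced to a precomputed count. The handling of a missing count (cold under instrumentation, inconclusive under sampling) is an assumption based on one reading of the upstream pass and may differ between versions.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-ins for this sketch only; none of these are LLVM types.
enum class ProfileKind { Instrumentation, Sample };

struct ColdPolicy {
  // Models "-mfs-psi-cutoff != 0": whether the percentile check is enabled.
  bool UsePercentile;
  // Assumed: the count at or below which the profile summary calls a block
  // cold for the configured percentile (precomputed elsewhere).
  std::uint64_t PercentileColdCount;
  // Models -mfs-count-threshold: minimum count for a block to be retained.
  std::uint64_t ColdCountThreshold;
};

// Returns true when a block with the given (possibly missing) profile count
// should be treated as cold under the modeled rules.
bool isColdBlockModel(ProfileKind Kind, std::optional<std::uint64_t> Count,
                      const ColdPolicy &P) {
  if (Kind == ProfileKind::Instrumentation) {
    // Instrumentation counts are treated as accurate: no count means cold.
    if (!Count)
      return true;
    if (P.UsePercentile)
      return *Count <= P.PercentileColdCount;
  } else {
    // Sampled profiles can miss blocks: no count means "do not judge".
    if (!Count)
      return false;
  }
  // Fallback for both kinds: compare against the raw count threshold.
  return *Count < P.ColdCountThreshold;
}
```

In this model the percentile rule only applies to instrumentation profiles, while sampled profiles fall back to the raw count threshold, which is one way to read "different coldness rules" above.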
So this is clearly a heuristic-driven pass, and it exposes the following options:
// FIXME: This cutoff value is CPU dependent and should be moved to
// TargetTransformInfo once we consider enabling this on other platforms.
// The value is expressed as a ProfileSummaryInfo integer percentile cutoff.
// Defaults to 999950, i.e. all blocks colder than 99.995 percentile are split.
// The default was empirically determined to be optimal when considering cutoff
// values between 99%-ile to 100%-ile with respect to iTLB and icache metrics on
// Intel CPUs.
static cl::opt<unsigned>
    PercentileCutoff("mfs-psi-cutoff",
                     cl::desc("Percentile profile summary cutoff used to "
                              "determine cold blocks. Unused if set to zero."),
                     cl::init(999950), cl::Hidden);

static cl::opt<unsigned> ColdCountThreshold(
    "mfs-count-threshold",
    cl::desc(
        "Minimum number of times a block must be executed to be retained."),
    cl::init(1), cl::Hidden);

static cl::opt<bool> SplitAllEHCode(
    "mfs-split-ehcode",
    cl::desc("Splits all EH code and it's descendants by default."),
    cl::init(false), cl::Hidden);
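- -mfs-psi-cutoff=<N>: cut by ProfileSummaryInfo percentile (default 999950, i.e. 99.995%).
- -mfs-count-threshold=<N>: minimum execution count threshold (default 1).
- -mfs-split-ehcode: whether to unconditionally split all exception-handling code and its descendants (boolean switch, off by default).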
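The relationship between the integer cutoff and the percentile is plain arithmetic, sketched below. The helper `cutoffToPercent` and the constant `kCutoffScale` are hypothetical names; the parts-per-million scale is an assumption inferred from the source comment, which equates 999950 with the 99.995th percentile.

```cpp
#include <cstdint>

// Assumed scale: cutoffs are integers out of one million.
constexpr std::uint64_t kCutoffScale = 1000000;

// Converts an integer cutoff such as 999950 into a percentile in percent.
constexpr double cutoffToPercent(std::uint64_t Cutoff) {
  return 100.0 * static_cast<double>(Cutoff) /
         static_cast<double>(kCutoffScale);
}
```

For instance, `cutoffToPercent(999950)` evaluates to approximately 99.995, and a cutoff of 990000 would correspond to the 99th percentile.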
Even so, the pass is only about 200 lines long.
The pass conflicts with -basic-block-sections=all, and it depends on BasicBlockSectionsProfileReaderWrapperPass, MachineBlockFrequencyInfoWrapperPass, and ProfileSummaryInfoWrapperPass (in other words, it relies on PGO for its profiling input).
The core logic then walks the basic blocks and marks the "cold" ones:
for (auto &MBB : MF) {
  if (MBB.isEntryBlock())
    continue;
  if (MBB.isEHPad())
    LandingPads.push_back(&MBB);
  else if (UseProfileData && isColdBlock(MBB, MBFI, PSI) &&
           TII.isMBBSafeToSplitToCold(MBB) && !SplitAllEHCode)
    MBB.setSectionID(MBBSectionID::ColdSectionID);
}
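The conditions in this loop are easy to misread, so the following standalone model replays them on plain structs. Everything here is hypothetical scaffolding: `BlockModel` and `markColdBlocks` are not LLVM names, and the `IsCold` and `SafeToSplit` fields merely stand in for the results of isColdBlock and TII.isMBBSafeToSplitToCold.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified block record; not LLVM's MachineBasicBlock.
struct BlockModel {
  bool IsEntry = false;
  bool IsEHPad = false;
  bool IsCold = false;         // stand-in for the profile-based check
  bool SafeToSplit = true;     // stand-in for the target's safety hook
  bool InColdSection = false;  // set when the block is marked cold
};

// Replays the loop's conditions and returns the indices of the landing pads
// that were set aside instead of being marked here.
std::vector<std::size_t> markColdBlocks(std::vector<BlockModel> &Blocks,
                                        bool UseProfileData,
                                        bool SplitAllEHCode) {
  std::vector<std::size_t> LandingPads;
  for (std::size_t I = 0; I < Blocks.size(); ++I) {
    BlockModel &B = Blocks[I];
    if (B.IsEntry)
      continue;  // the entry block always stays in place
    if (B.IsEHPad)
      LandingPads.push_back(I);  // decided later, not in this loop
    else if (UseProfileData && B.IsCold && B.SafeToSplit && !SplitAllEHCode)
      B.InColdSection = true;
  }
  return LandingPads;
}
```

Two details stand out in the model: a cold entry block is never moved, and when `SplitAllEHCode` is true the loop performs no profile-based marking at all, because the last condition short-circuits it.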
The final step reorders the basic blocks and updates the branches:
finishAdjustingBasicBlocksAndLandingPads(MF);
This step simply reuses existing APIs:
static void finishAdjustingBasicBlocksAndLandingPads(MachineFunction &MF) {
  auto Comparator = [](const MachineBasicBlock &X, const MachineBasicBlock &Y) {
    return X.getSectionID().Type < Y.getSectionID().Type;
  };
  llvm::sortBasicBlocksAndUpdateBranches(MF, Comparator);
  llvm::avoidZeroOffsetLandingPad(MF);
}
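The effect of a comparator that looks only at the section type can be illustrated with a small standalone model. `LayoutBlock`, `SectionType`, and `sortBySection` are hypothetical names for this sketch. It assumes that the sort is stable and that cold sections order after default ones, and it leaves out the branch fix-ups that the real helper also performs.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical ordering for this sketch: lower values are laid out first.
enum class SectionType { Default = 0, Exception = 1, Cold = 2 };

// Hypothetical, simplified block record; not LLVM's MachineBasicBlock.
struct LayoutBlock {
  std::string Name;
  SectionType Type;
};

// Groups blocks by section type. A stable sort keeps the relative order of
// blocks inside each section, so earlier layout decisions are preserved.
void sortBySection(std::vector<LayoutBlock> &Blocks) {
  std::stable_sort(Blocks.begin(), Blocks.end(),
                   [](const LayoutBlock &X, const LayoutBlock &Y) {
                     return X.Type < Y.Type;
                   });
}
```

Given the layout entry, a (cold), b, c (cold), the model produces entry, b, a, c: the hot blocks stay in their original order at the front and the cold blocks follow in their original order.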