SimplifyCFG, part 1
Introduction
Code optimization is one of the most important steps in the compiler’s code generation pipeline. Optimization is performed in multiple passes, which are categorized into analysis passes and transformation passes. Analysis passes help the compiler gather information and understand the code structure. Examples of such passes include dominance tree construction and alias analysis, which provide insights into code. Transformation passes, on the other hand, are responsible for modifying the target program to improve its performance while ensuring correctness. These passes rely on the information obtained from analysis passes to make safe and effective optimizations.
A prominent example of a transformation pass is SimplifyCFG. This pass is applied repeatedly throughout the optimization pipeline and performs various tasks, including dead code elimination, speculative execution, and eliminate duplicate phi nodes. In fact, it handles so many operations that its source code spans approximately 8,500 lines. Here we will focus on explaining the default actions performed by this pass.
Let’s start
The SimplifyCFGPass class
The SimplifyCFG.h file defines the following class:
class SimplifyCFGPass : public PassInfoMixin<SimplifyCFGPass> {
SimplifyCFGOptions Options;
public:
/// The default constructor configures the pass
/// to prioritize canonical IR over optimal IR
SimplifyCFGPass();
/// Construct a pass with optional optimizations.
SimplifyCFGPass(const SimplifyCFGOptions &PassOptions);
/// Run the pass over the function.
PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM);
The SimplifyCFGPass
class inherits from PassInfoMixin<SimplifyCFGPass>
, which is designed for use with the New Pass Manager. Additionally, the class contains an attribute named Options
, which allows us to specify the optimizations we want the pass to perform, in addition to the default ones.
Invoking the pass
The pass manager invokes the pass by calling its run
member function, which is defined in SimplifyCFG.h. This function takes two primary arguments:
- A reference to
F
, the function we want to optimize. - The
AM
(AnalysisManager), a utility class that helps the pass invoke various analysis passes to gather information about the code.
In the code snippet, we can see that AM
is used to call TargetIRAnalysis and AssumptionAnalysis. While we won’t delve deeply into these passes, it’s worth noting their roles:
- TargetIRAnalysis: This is an LLVM analysis pass that provides target-specific information. The returned
TTI
(TargetTransformInfo) object exposes APIs for querying such information. - AssumptionAnalysis: This pass identifies and collects assumptions within the function
F
, enabling the pass to use theseassumptions
(llvm.assume’ Intrinsic) to guide its transformations.
PreservedAnalyses SimplifyCFGPass::run(Function &F,
FunctionAnalysisManager &AM) {
auto &TTI = AM.getResult<TargetIRAnalysis>(F);
Options.AC = &AM.getResult<AssumptionAnalysis>(F);
DominatorTree *DT = nullptr;
if (RequireAndPreserveDomTree)
DT = &AM.getResult<DominatorTreeAnalysis>(F);
if (!simplifyFunctionCFG(F, TTI, DT, Options))
return PreservedAnalyses::all();
PreservedAnalyses PA;
if (RequireAndPreserveDomTree)
PA.preserve<DominatorTreeAnalysis>();
return PA;
}
The dominator tree (DT
) is initialized as nullptr
. As previously mentioned, this pass offers a variety of options, one of which is maintaining the dominator tree. This behavior can be controlled using the simplifycfg-require-and-preserve-domtree
flag. By default, the RequireAndPreserveDomTree
option returns false
unless explicitly set to true
. When this option is not enabled, the DT
will not be initialized.
DominatorTree *DT = nullptr;
if (RequireAndPreserveDomTree)
DT = &AM.getResult<DominatorTreeAnalysis>(F);
}
We can verify that the DT
is not initialized using gdb
.
379 if (RequireAndPreserveDomTree)
(gdb) source commandLineOtionsPrettyPrinter.py ####load pretty printer i made :)####
(gdb) p RequireAndPreserveDomTree
$1 = Hidden: Hidden
NumOcurr: 0
Ocurrences: Optional
Formatting: Normal
Misc: None
FullyInitialized: true
last position at args 0
Additional vals 0
Argument name: 0x555556314818 "simplifycfg-require-and-preserve-domtree"
Type of arg<0x0>
Help description 0x5555563147b8 "Temporary development switch used to gradually uplift SimplifyCFG into preserving DomTree,"
#### call to operator bool() ####
(gdb) call RequireAndPreserveDomTree.operator bool()
$2 = false
(gdb) n #### jump line 380 DT was not initialized ####
381 if (!simplifyFunctionCFG(F, TTI, DT, Options))
}
The RequireAndPreserveDomTree
flag is implemented as a command-line option of type bool
, allowing us to inspect its value directly in gdb
. This can be done by invoking the bool operator or by observing that the code skips the associated if
statement during debugging.
Finally, we are ready to invoke the optimizations. The function simplifyFunctionCFG(F, TTI, DT, Options)
is called to perform the optimizations. It returns a bool
indicating whether any changes were made to the IR. If no changes were made, the pass will preserve all the analysis results.
if (!simplifyFunctionCFG(F, TTI, DT, Options))
return PreservedAnalyses::all(); //Preserve analysis if no changes were made.
}
Inside the simplifyFunctionCFG
function, we can identify three primary calls for optimizations. These are the key functions that we will explain in detail:
removeUnreachableBlocks
:
This function traverses the control flow graph (CFG) and removes all basic blocks that are unreachable within the graph.tailMergeBlocksWithSimilarFunctionTerminators
iterativelySimplifyCFG
:
This function is closely related to theSimplifyCFGOptions
class mentioned earlier. It performs the majority of the various optimizations provided by the SimplifyCFG pass. Examples of these optimizations include:- Simplifying
switch
statements. - Deleting dead basic blocks.
- Sinking common code.
- Simplifying
static bool simplifyFunctionCFGImpl(Function &F, const TargetTransformInfo &TTI,
DominatorTree *DT,
const SimplifyCFGOptions &Options) {
DomTreeUpdater DTU(DT, DomTreeUpdater::UpdateStrategy::Eager);
bool EverChanged = removeUnreachableBlocks(F, DT ? &DTU : nullptr);
EverChanged |=
tailMergeBlocksWithSimilarFunctionTerminators(F, DT ? &DTU : nullptr);
EverChanged |= iterativelySimplifyCFG(F, TTI, DT ? &DTU : nullptr, Options);
}
removeUnreachableBlocks function
This is the first function called by the pass. The parameters passed to the function are as follows:
- A reference to
F
, the function that we want to optimize. DTU
(DominatorTreeUpdater): This parameter is set tonullptr
because we do not intend to preserve the dominator tree.MSSAU
(MemorySSAUpdater): This parameter is initialized to its default value,nullptr
.
bool llvm::removeUnreachableBlocks(Function &F, domtreeupdater *DTU,
MemorySSAUpdater *MSSAU) {
SmallPtrSet<BasicBlock *, 16> Reachable;
bool Changed = markAliveBlocks(F, Reachable, DTU);
}
279 bool EverChanged = removeUnreachableBlocks(F, DT ? &DTU : nullptr);
(gdb) s
llvm::removeUnreachableBlocks (F=..., DTU=0x0, MSSAU=0x0)
## DTU and MSSAU are nullptr
}
The core of this pass lies in the markAliveBlocks function, which receives a set of basic blocks named Reachable
. This function populates the Reachable
set with the basic blocks that can be reached within the control flow graph.
Explanation of the markAliveBlocks
Function
The markAliveBlocks
function relies on three key variables:
worklist
: (vector of basic blocks) – Acts like a stack, storing blocks that need to be processed.SmallVector<BasicBlock*, 128>
.BB
: (basic block) – Represents the current basic block being analyzed.BasicBlock *BB
.Reachable
: (set of basic blocks) – Keeps track of the blocks that can be reached in the control flow graph (CFG).
The function traverses the CFG, adding the successors of the current basic block (BB
) to the worklist
and marking them as reachable. If a successor is already in the Reachable
set, it will not be inserted again.
Apart from flagging reachable blocks, the function also performs checks on each basic block. These checks can modify the CFG by marking certain blocks as unreachable. Below are two scenarios where the CFG can change:
- Storing to a
null
or an undefined memory location – This can lead to the block being considered unreachable. - An
llvm.assume
intrinsic evaluates tofalse
– If an assumption is explicitly marked asfalse
, the block becomes unreachable. The following example demonstrates this using the C++[[assume]]
attribute:
Visualization of How markAliveBlocks
Works
Below are three examples demonstrating how the markAliveBlocks
function operates in different scenarios:
Example 1: Standard
markAliveBlocks
Execution- Shows how the function traverses the control flow graph and marks reachable blocks.
Example 2: Marking a Block as Unreachable Due to an Undefined Store
- Demonstrates a case where a block is removed because it stores a value to an undefined memory address.
Example 3: Marking a Block as Unreachable Using
llvm.assume
- Illustrates how LLVM assumptions (
llvm.assume
) can result in a block being marked as unreachable when an assumption is known to be false.
- Illustrates how LLVM assumptions (
In examples 2 and 3, we saw cases where markAliveBlocks
returns Changed = true
. This happens when blocks are marked as unreachable due to:
- Storing to an undefined address.
- An
llvm.assume
intrinsic evaluating tofalse
.
Checking for Unreachable Blocks
Once we have the reachable set, the next step is verifying whether all blocks in the function are reachable:
- If all blocks are reachable (
Reachable.size() == F.size()
), the function simply returnsChanged
. - If some blocks are unreachable, the function iterates over all blocks in
F
, checking whether they are in theReachable
set using:Reachable.count(&BB)
- If
count == 1
, the block is reachable, and we continue. - If
count == 0
, the block is unreachable and is added to a vectorBlocksToRemove
. - If
DTU
(Dominator Tree Updater) is enabled, it also skips blocks pending deletion.
- If
Removing Dead Blocks
After iterating through the function:
- If
BlocksToRemove
is empty, the function returnsChanged
. - Otherwise,
Changed
is set totrue
, andDeleteDeadBlocks
is called to eliminate the unreachable blocks.
bool Changed = markAliveBlocks(F, Reachable, DTU);
// If there are unreachable blocks in the CFG...
if (Reachable.size() == F.size())
return Changed;
SmallSetVector<BasicBlock *, 8> BlocksToRemove;
for (BasicBlock &BB : F) {
if (Reachable.count(&BB))
continue;
// Skip already-deleted blocks
if (DTU && DTU->isBBPendingDeletion(&BB))
continue;
BlocksToRemove.insert(&BB);
}
if (BlocksToRemove.empty())
return Changed;
Changed = true;
DeleteDeadBlocks(BlocksToRemove.takeVector(), DTU);
return Changed;
}
Conclusion
In this discussion, we explored how the SimplifyCFGPass is invoked and examined the command-line options class in LLVM. We saw that by default, the dominator tree is not preserved in this pass. However, it can be preserved by enabling the --simplifycfg-require-and-preserve-domtree
command-line option.
We then delved into the removeUnreachableBlocks function, which starts with the markAliveBlocks function. This function identifies reachable basic blocks by traversing the control flow graph (CFG). Any blocks that cannot be reached are flagged for removal. Furthermore, the function accounts for obvious undefined behavior (UB) cases that also make certain blocks unreachable.
The process continues with removeUnreachableBlocks, which compares the number of reachable blocks with the total number of blocks in the function. If any blocks are found to be unreachable, they are collected in a set called BlocksToRemove
. These unreachable blocks are then removed using the DeleteDeadBlocks function.
In the next part, we will explore the tailMergeBlocksWithSimilarFunctionTerminators function and its role in further optimizing the control flow graph.
References
Reference | Description | Link |
---|---|---|
1 | Dominator | Dominator |
2 | Alias Analysis | Alias Analysis |
3 | Simplify the CFG | Simplify the CFG |
4 | SimplifyCFG.h | SimplifyCFG.h |
5 | New Pass Manager | New Pass Manager |
6 | llvm::AnalysisManager | llvm::AnalysisManager |
7 | llvm::TargetTransformInfo | llvm::TargetTransformInfo |
8 | llvm::AssumptionAnalysis | llvm::AssumptionAnalysis |
9 | llvm.assume | llvm.assume |
10 | llvm::Function | llvm::Function |
11 | Remove Unreachable | Remove Unreachable |
12 | llvm::DomTreeUpdater | llvm::DomTreeUpdater |
13 | llvm::MemorySSAUpdater | llvm::MemorySSAUpdater |
14 | markAliveBlocks | markAliveBlocks |
15 | llvm::SmallPtrSet | llvm::SmallPtrSet |
16 | BasicBlock | BasicBlock |
17 | markAliveBlocks | markAliveBlocks |
18 | assume | assume |