Preface
I had the honour of authoring this challenge as part of Traboda for the jeopardy CTF in ICC 2024. Huge thanks to everyone involved!
You can find the challenge source here
Challenge Description
An experimental calculator that uses JIT to speed up calculations, or does it? what harm could some math expression really do?
Handout has the binary, source and the Dockerfile, needed to recreate the environment.
Initial Analysis
The binary has all mitigations that are expected:
[k1r4@enderman handout]$ pwn checksec calc
[*] '/home/k1r4/work/projects/calc/handout/calc'
Arch: amd64-64-little
RELRO: Full RELRO
Stack: Canary found
NX: NX enabled
PIE: PIE enabled
The sources provided are in C++. The program evaluates mathematical expressions interactively.
Implementation
The program consists of a Parser
and an Evaluator
, which are two separate classes. The Parser
parses the input expression and converts it to a binary tree. The Evaluator
takes the binary tree as input and evaluates the expression to produce the result.
Parser
class Parser {
private:
bool isOperator(char token);
int precedence(char op);
void processOperator(std::stack<char>& operators, std::stack<std::unique_ptr<Node>>& operands);
public:
std::unique_ptr<Node> parseExpression(const std::string& expression);
};
The parser uses the parseExpression()
method to parse the input expression. It does so iteratively by maintaining two stacks to keep track of operators and operands separately. The processOperator()
method is for processing the current operator on top of the operators stack.
Evaluator
class Evaluator {
private:
char *jitCode, *jitCodeBase;
long *jitConsts, *jitConstsBase;
long jitConstCount, jitOpCount;
long jitConstsSize, jitCodeSize;
long jitPrevConst;
std::unordered_map<long, short> constMap;
void *createJitRegion(long size);
void updateJitPerm(void);
void jitEmitCode(const std::unique_ptr<Node>& root);
void jitCountNodes(const std::unique_ptr<Node> & root);
public:
long evaluate(const std::unique_ptr<Node>& root);
long jitEvaluate(const std::unique_ptr<Node>& root);
~Evaluator(void);
};
The evaluator uses the evaluate()
method for evaluating binary expression trees when JIT is disabled. If JIT is enabled, then the jitEvaluate()
method is used. jitEvalute()
approximates the size of JIT memory required using jitCountNodes()
. It then creates the JIT memory map using createJitRegion()
. JIT code is then emitted by jitEmitCode()
. updateJitPerm()
is then used to set the permissions for the JIT memory appropriately. Finally the JIT-ed code is called and the result is returned.
All constants (numbers) used in the expression are stored in a read-only region. The JIT-ed code fetches the constants from the read-only memory through push qword ptr [rdi+rdx*8]
where rdx
is the index of the constant.
The constants are then popped into rax
and rsi
and the respective operation is done such as xor rax, rsi
. Finally push rax
is done to save the result. This is done until the epilogue of the JIT-ed code is reached, which performs pop rax ; ret
.
Vulnerability
The mul instruction
There are two intended vulnerabilities in this program. The first involves the mul
instruction which is emitted for the *
operator. The single operand 64 bit mul
instruction specification states that the result will be stored in rdx:rax
. This means that the overflow of the multiplication will be stored in the rdx
register. However rdx
is used to index the memory used to fetch constants.
There is an “optimization” in the jitEmitCode()
which doesn’t update rdx
(technically edx
) when the same constant is used back to back. Consider the example of 1+2
, the emitted code would look something like:
mov edx, 0
push qword ptr [rdi+rdx*8]
mov edx, 1
push qword ptr [rdi+rdx*8]
pop rsi
pop rax
add rax, rsi
push rax
If 1*1
is inputted instead, the emitted code would look something like:
mov edx, 0
push qword ptr [rdi+rdx*8]
nop
nop
nop
nop
push qword ptr [rdi+rdx*8]
pop rsi
pop rax
add rax, rsi
push rax
This is because of the following code:
if(jitPrevConst == root->value) {
*(int *)(jitCode-7) = 0x90909090;
}
else {
jitPrevConst = root->value;
}
This replaces the mov edx, xxxx
with nop
instructions, when the previous constant that needed to be fetched, is the same as the current one.
Consider the case when an overflowing mul
operation is followed by fetching a constant that is the exact same one as fetched previously. The emitted code following the mul rsi
would be, a few nop
instructions and finally a push qword ptr [rdi+rdx*8]
. Hence controlling the overflow value stored in rdx
will allow for fetching arbitrary constants from memory.
The JIT memory size
jitCountConsts()
is used to calculate the number of constants and operators in the expression. Based on these numbers, the JIT memory required is approximated and page aligned.
jitCountNodes(root);
jitConstsSize = PAGE_ALIGN((jitConstCount*sizeof(long)));
jitCodeSize = PAGE_ALIGN(((jitConstCount*FETCH_SZ)+(jitOpCount*EXEC_SZ)));
FETCH_SZ
and EXEC_SZ
is the number of bytes of code required for a constant fetch and operation execution respectively. Although the size calculation is not incorrect by itself, it fails to account for the epilogue that is included at the end of the JIT-ed code.
std::memset(jitCodeBase, 0x90, jitCodeSize);
*(jitCode+jitCodeSize-2) = '\x58';
*(jitCode+jitCodeSize-1) = '\xc3';
jitEmitCode(root);
If the expression causes code to emit such that the pop rax
in the epilogue is overwritten by push rax
of the last operation, then this would allow for hijacking control flow and code execution.
Typically an expression contains N
operands (constants) and N-1
operators, like in the case of 5+4-3*2/1
, which has 5 constants and 4 operators. In order to get push rax
to overwrite pop rax
, the push rax
must be present at one byte before the page boundary.
Consider the following equation where x
is the number of operations in the expression and y
is any natural number:
x * EXEC_SZ + (x+1) * FETCH_SZ = 0xyfff
x * 6 + (x+1) * 7 = 0xyfff
6x + 7x + 7 = 0xyfff
13x = 0xyfff - 7
Solving this equation for values of y
from 0, until we get x
as a whole number, we get x
as 2520 and y
as 7.
So the JIT-ed code size will be 0x7fff and the number of operations will be 2520 with 2521 constants.
Exploit Strategy
The JIT memory regions are mmapped addresses which would mean that they are at constant offsets from libc, other loaded libraries and other mmapped regions. This allows fetching addresses from nearby regions and adding offset to get address of any library function in memory. Furthermore chaining this with the second bug allows execution of any function. The goal is to execute system("sh")
for code execution.
The exploit expression would do the following:
- Pad a lot of dummy operations (around 2520) at the start, to reach the target size for the second bug
- Perform a mul operation with an overflow, which corresponds to the index of the out-of-bound fetch
- Fetch the last operand used in the mul operation so that, the
mov edx, xxxx
is optimized out - The fetch now gets a memory address, preferably in libc
- Add an offset to get the
system
function - Perform a dummy operation that uses the numerical represnation of the “sh” string
- Ensure that the total number of operations in the expression are 2520
- Profit!
Since rdi
contains the address of the constants, it will point to the last constant used in the expression, which would be “sh”. This is because the binary expression tree contains the last operations and constants near the root of the tree.
Conclusion
I love authoring challenges and this challenge was no exception. The development process for this challenge was a lot of fun. All the lectures in university on data structures finally paid off!
You can find the exploit here