-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathroot.tex~
376 lines (331 loc) · 19.3 KB
/
root.tex~
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
\documentclass[12pt]{book}
%\documentclass[12pt]{article}
% This work is about rootkits
%\usepackage[T2A]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{multicol}
\usepackage{gensymb}
\usepackage{graphicx}
\usepackage{listings}
\usepackage{verbatim}
\usepackage{hyperref}
\hypersetup{colorlinks,
citecolor=black,
filecolor=black,
linkcolor=black,
urlcolor=black,
bookmarksopen=true,
pdftex}
\hfuzz = .6pt % avoid black boxes
% TODO List
\usepackage{color}
\usepackage{index} % use index package to create indices
\newindex{todo}{tod}{tnd}{TODO List} % start todo list
\newindex{fixme}{fix}{fnd}{FIXME List} % start fixme list
\newcommand{\todo}[1]{\textcolor{blue}{TODO: #1}\index[todo]{#1}}
\newcommand{\fixme}[1]{\textcolor{red}{FIXME: #1}\index[fixme]{#1}}
\title{Kernel rootkits in Linux: low-level approach and prevention}
\date{}
\author{Ivan Galinskiy}
\begin{document}
\maketitle
\newpage
\emph{
Copyright (C) 2010 Ivan Galinskiy.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3
or any later version published by the Free Software Foundation;
with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
A copy of the license is included in the section entitled "GNU
Free Documentation License".}
\newpage
\section{Kinds of rootkits}
\subsection{Classification}
What exactly are the rootkits? In the malware classification, rootkits are
programs designed to hide the fact of system intrusion by hiding processes,
users, files etc. This is the base classification, which is true for all
kinds of rootkits. But if we look at real samples, deviations appear. For
example, in some cases the rootkit is not just ``standalone'', but a part of
another piece of malware which is being hidden. A very good example is
Rustock.C designed for Windows.
\subsection{Basic principles of work}
Obviously, the process of hiding something is based on modifying system
``internals'', requiring thus some way to gain administrative (root)
privileges. This can be done in very different ways, and besides, it is not
part of rootkit's job, so we will skip that. But there are basically two
ways the rootkit ``holds'' itself on the main system:
\begin{enumerate}
\item Modifying files on the filesystem. When a program has administrative
rights on the target machine, it can (almost always) do whatever it
``wants''. For example, modifying the passwd or sudo utilities will
probably get users' passwords. The disadvantages are obvious. To detect
the rootkit, the user needs to check main utilities' checksums from a
trusted operating system (either by loading with LiveCD or by taking the
harddisk to another machine).
\item Modifying only the RAM. Of course, at first sight it may look a bit
strange, as with a reboot anything will return to normal. But just
imagine a server with, lets say, 2 years uptime? Now it looks better,
and this kind of rootkits is much tougher (and more interesting to
research).
\end{enumerate}
\chapter{A brief look at DR Linux rootkit}
Well, finding this one was not a difficult task. Besides, it's one of the
most up-to-date open-source rootkits available. Others are either very old,
or don't match our context, so we will not look at them. The source code
indicates that this rootkit is based on debug registers. According to Intel
documentation, the debugging registers are DR0 - DR7. DR0 - DR3 registers
hold four linear addresses. DR4 and DR5 are reserved for extended debugging
and we are not going to look at them now. DR6 is the ``Debug State
Register'' and DR7 is the ``Debug Control Register''. What is their purpose?
The below scheme from Intel documentation explains some things.
\begin{figure}[h]
\includegraphics[width=\linewidth, keepaspectratio]{dregs}
\end{figure}
The one interesting is the DR7, which controls the debugging behaviour. For
the rootkit, the usefullness is in the ability to control read, write or
execute operations (or their combinations) on the breakpoints (note: the
settings are individual for each breakpoint). Obviously, the breakpoints are
not useful by themselves. When a breakpoint is reached, after executing it,
the processor emits a \verb!#DB! exception, which is catched by the kernel
handler in normal cases. But the rootkit changes the handler in Interrupt
Descriptor Table to its own or either modifies the system handler (in this
case the IDT remains untouched).
\chapter{Interception techniques}
Wait, is this the only way to control system internals? Actually, more
methods were created through the time, but all of them are based on
modifying well-known system structures, the quantity of which is not so
big. What structures? Some of them are IDT, MSR, DR registers (as seen
above), syscall tables... What are all these abbreviatures? They may look
scary at first sight, but the things are simpler. So what all those things
do?
\begin{itemize}
\item IDT means Interrupt Descriptor Table. In simple words, it contains
addresses of handlers for interrupts. As we have seen in the DR rootkit,
it may be very useful. There is also another detail, before Pentium II
was introduced, system calls (i.e. calls from user applications to
kernel) were performed using the \verb!0x80! interrupt (loading the
system call number into EAX register before invoking interrupt). And
guess what? The pointer to the handler for that interrupt was stored in
IDT too.
\item MSR stands for Model Specific Registers. Before Pentium II,
interrupts were used to make system calls. It's a simple way, but
unfortunately, slow. That's why \verb!SYSENTER/SYSCALL! and
\verb!SYSEXIT/SYSRET! (for Intel/AMD respectively) commands were
introduced, providing a faster way to make system calls. Now the pointer
to that handler of the call was not in IDT, but in a set of MSR
registers. They store the target instruction, stack segment pointer
etc. But the most interesting and useful for us is the
\verb!IA_32_SYSENTER_EIP! which stores the target instruction. Changing
it to something else will redirect all the system calls into the new
procedure.
\item So what is the difference between the above two methods? Only the
way to \emph{call} system procedures! Even if we examine the source of
\verb!system_call! and \verb!ia32_sysenter_target! (there is where
\verb!IA_32_SYSENTER_EIP! points by default), in both we find
``\verb!call *sys_call_table(,%eax,4)!''. This means that those
procedures are the same in both cases (otherwise, it would be
strange). And, of course, modifying pointers in this table can be very
funny for the rootkit (and more for the machine owner).
\end{itemize}
\subsection{Modifying IDT}
There are some other ways, of course, but now we will concentrate on the
ones listed above and try to perform these ``tricks''. So, the first in the
list is interception of interrupts via IDT. Ok, let's begin.
\begin{itemize}
\item First of all, I am going to be as kernel version independent as I
can. It means that I am going to use resources available in the
processor rather that predefined macros or whatever.
\item OK, before we make any changes to IDT, we obviously need to know
where exactly it resides. An assembly command \verb!SIDT! can help us
with this, getting the IDTR register contents. In 32-bit systems, IDTR
contains two fields: the 16-bit limit, specifying the size of the table,
and 32-bit address which is the location of IDT. But there is a detail
that led me to mistakes: the address is stored with low-order bytes
first! We can define a function to get the register contents in
easily-readable form: \verbatiminput{sidt.h}
\item Half of the job is done now. However, we still need to get a
particular entry in the IDT (to store the original interrupt handler,
for example). In Linux on 32-bit systems, each entry in IDT is 8-bytes
long and consists of an offset to the handler and some attributes. The
funny thing is the offset is not continuous in the entry! The first 16
bits begin at bit 0 of the IDT entry, and the last 16 are at the end of
the entry. Tricky, right? The following code can handle with this:
\verbatiminput{idt.h}
\item Very good! Now that we have the entries, many things can be
done. But let's stop with IDT hooking and continue to the next method.
\end{itemize}
\subsection{SYSENTER/SYSCALL interception}
Well, this one is a juicy one! As I already told, beginning with Pentium II,
Intel processors implemented the new \verb!SYSENTER!/\verb!SYSEXIT!
instructions (and AMD used \verb!SYSCALL!/\verb!SYSRET!). They decreased the
overhead of switching from user mode and vice-versa (the interrupts are
slow). These instructions used special MSR registers to know where the
target procedure was located. There is one that is specially relevant to us:
\verb!SYSENTER_EIP_MSR!. Its contents are loaded to the EIP register
(basically a jump) at the end of \verb!SYSENTER! execution. Initially it
points to a kernel procedure, but we can change it to our procedure. How can
it be done?
\begin{itemize}
\item First, of course, we need a way to access the
\verb!SYSENTER_EIP_MSR! register. It's not accessed like the, let's say,
EAX register. There is a special instruction, \verb!RDMSR!, that does
it. The only requirement is that the number of MSR register should be
loaded into ECX (the number of \verb!SYSENTER_EIP_MSR! is 0x176). The
contents are stored in EDX and EAX registers (EDX is 0 on 32-bit
systems).
\item Now we can put a new value to the register. This process is
basically an inverted version of the previous. We load the new value
into EDX:EAX (EDX is zero, EAX is the new procedure pointer), the MSR
register number into ECX, and perform the instruction \verb!WRMSR!.
\item I thought that an example would be more helpful than a dry
description, so here is a minimal sample kernel module for this task
(the includes were (duh) excluded). It's a kernel module because
\verb!WRMSR! can only be performed at ring 0: \verbatiminput{msr.h}
\item Obviously, this module doesn't do anything special, it's more like a
``proof-of-concept''. But the payload will come later.
\end{itemize}
\chapter{Designing a rootkit}
Now we have the most popular methods of intercepting system internals, which
are also pretty easy to detect. Usually the check consists of retrieving the
system structures, registers etc. and comparing them to the original ones
found in an uncompressed kernel (or the System.map file). Can the results of
such a check be trusted? No! The rootkits now prefer to modify the system
handlers themselves instead, as it's more difficult to discover. For
example, let's see the method for debugging registers (it's simpler than
other methods), but in a new way.
\subsection{Modifying the original debug handler}
\begin{enumerate}
\item What happens when a breakpoint is reached? The 0x1 interrupt. Now we
need to see what procedure is called to handle that interrupt, so let's
see the IDT.
\item On my system, it reported the address 0xc125af80. This doesn't tell
a lot, right? To discover what it is, I used the System.map file. The
result was a function ``\verb!debug!''. Now this is interesting! Let's
see what this function does in kernel sources. Actually, the debug entry
is located (kernel 2.6.33) in the \verb!arch/x86/kernel/entry_32.S! (was
tricky to find). And this is the code. \verbatiminput{deb_entry.S}
Good! The ``\verb!call do_debug!'' seems to be the call to the
``official'' debug handler. And if we change it with our own handler,
which then gives control to \verb!do_debug!? Lots of fun! The only
problem here is that we actually need to find this call and replace the
original address to our own handler.
\item Yes, in theory it's simple, but the practice is a bit more
complicated. It should be good to see part of disassembled listing of
``\verb!debug!'':
\verbatiminput{debug.s} Interesting, right? But here is a problem. As the
hex code of ``\verb!call!'' in this case is \verb!0xe8!, it's a relative
near call. Obviously, it's not acceptable for the hooking function (the
addresses will be different), so first we need to calculate the absolute
offset of ``\verb!do_debug!''. Yes, and just for clarity: the 4-byte
value after ``\verb!0xe8! is a signed integer. The offset is added to the
address of the next instruction, in my case \verb!0xc125afd0!, and
(voilà!) we obtain the linear address of ``\verb!do_debug!''. But first,
we need to find this call. According to the objdump listing provided
above, the 4-byte pattern we are looking for is
\verb!0xd289e0e8!. Digging in the kernel is hard for a human, so let's
define another function. \emph{Important: when we get a value from
memory and use it as an integer, it's inverted (because of the
endianness). So if we need to find code, we need to invert the pattern
again}: \verbatiminput{search.h} \emph{WARNING: It's not the best or
the fastest code, but considering that it will usually be called only
once, it's not critical}
\end{enumerate}
Well, this is a good technique, but I will not use it because of the
relatively easy way to access DR0-DR7 registers. Instead, I will use a
little bit more complicated, but more reliable (in terms of ease of
discovery) method of hijacking system calls directly in the
\verb!sys_call_table!.
\begin{itemize}
\item The responsible code, found both in \verb!system_call! and
\verb!ia32_sysenter_target! (what additionally proves that system calls
are located in that table), is \verb!call *sys_call_table(,%eax,4)!. This
is the disassembled fragment of \verb!ia32_sysenter_target!:
\verbatiminput{syscall_dis.S} A negative value? It cannot be, since the
opcode \verb!0xff! always means an absolute offset. It's a mistake in
\verb!objdump!, so I sent a bug report. However, it's not that critical,
and we may continue.
\item Now the function ``\verb!search!'', defined above, can be used to
search the pattern \verb!0x00ff1485! (taken from the disassembly listing),
and that is how we obtain the address of \verb!sys_call_table!!
\item Well, the table is here, but we have no idea on what entry is
interesting for us. But there is a very useful file in kernel sources,
\verb!arch/x86/kernel/syscall_table_32.S!. I will provide a little
fragment of that file: \verbatiminput{syscall_table.S}
\item Why waiting?! Let's have some fun and modify the \verb!sys_open!!
First, I would like to define a function to find \verb!sys_call_table! and
a inline function to read a particular entry. Here they go:
\verbatiminput{syscall.h}
\item Now that we have all the nessesary addresses, \verb!sys_open! can be
``patched''. How? I found it easier to read the disassembled listing of
\verb!sys_open! (again, right?) than searching through the kernel
source. Also the function is not so big, so you may see the complete
listing of it: \verbatiminput{sys_open.S} Why so tiny? Looks more like a
wrapper or something similar. And it is! Look at the
``\verb!call 0xc10b0307!'' (\verb!0xe8! opcode, another relative
offset). In my system this address represents function
``\verb!do_sys_open!''. Feeling the power? Oh yes.
\item So, now it's only a question of technique to hook
\verb!sys_open!. After that the function \verb!search!, a pointer to the
beggining of the address (or better said, relative offset) is
obtained. This offset is stored as a signed integer, after that we obtain
the address of the next instruction by adding 4 (4 bytes) to the
pointer. Then the offset is added to that pointer and that is how we
obtain the absolute offset of ``\verb!do_sys_open!''. Well, it's not that
simple. Why? The pages that contain this code are write-protected, so an
attempt to write there will cause an exception and nothing more (it took
me some time to figure it out). But there is the WP bit in CR0 register
which enables/disables write protection, so we can use it in the following
helper function:\verbatiminput{rw_protect.h}
Problem solved! The complete module code will look like this:
\verbatiminput{open_hook.c}
\item Let's now modify the hook in such a way that it will block
access to, for example, all the filenames ending with ``st''. It's
not pretty useful, but it shows some principles. But as we are
replacing the \verb!do_sys_open! call, we will take the
\verb!do_sys_open! definition in kernel sources as a base for our
hook. So the modified hook looks like
this:\verbatiminput{open_hook_inter.c} Don't pay attention to the
strange way of checking filename ending. The reason for this is
that the name is provided to \verb!open! in different ways, and
that function then calls \verb!getname! to determine full
filename, but I am not going to work with it right now, because I
was only showing the technique itself.
\end{itemize}
\chapter{Detection}
Now that the basic principles are known, we can finally develop an
application that will detect hijacking attempts \emph{before} any
damage can be done to the OS and possibly discover existing
rootkits. The functions of that ``IDS'' will be the following:
\begin{enumerate}
\item At startup, read the \verb!GDTR!, \verb!IDTR!, \verb!SYSENTER_EIP_MSR!
registers.
\item Retrieve the values of debugging registers and, if
a ``suspicious'' value is found, do something.
\item A good technique of hiding something is by clearing the P flag
of that ``something''. I mean, the page will look like it's not
present in the memory and the #PF exception will be raised, which
is then caught by the handler. The handler itself can be
malicious, permitting it to intercept things.
\item Retrieve \verb!ia32_sysenter_target! and \verb!system_call! in
order to compare them to the version found in vmlinux.
\item Retrieve GDT. The rootkit may add descriptors with base not
equal to zero and use them in order to make the disassembly much
more complicated, but becoming more detectable.
\item Retrieve \verb!sys_call_table!, all the system calls, the IDT
and the interrupt handlers (especially the #PF handler).
\item Clear the P flag on some ``attractive'' system structures. Why
the P flag instead of debugging registers? Well, the debugging
registers can be modified very easily, that's the reason. It's
more complicated, of course, but gives more reliability. Here I am
going to use a predefined function to ensure compatibility and
portability.
\end{enumerate}
Well, it's not such a large list, so let's begin.I think it might be
better to first make the IDT comparison function:
\verbatiminput{idt_compare.h}
\printindex[todo]
%\newpage
%\appendix
%\chapter{GNU Free Documentation License}
%\input{license.tex}
\end{document}