Sunday, June 10, 2012

Malware Analysis Tutorial 29: Stealthy Library Loading II (Using Self-Modifying APC)

Learning Goals:
  1. Practice WinDbg for Intercepting Driver Loading
  2. Trace and Modify Control Flow Using IMM
  3. Understand the techniques employed by Max++ for hiding library loading
  4. Understand image load notifiers, asynchronous procedure calls, and kernel and normal routines
Applicable to:
  1. Operating Systems
  2. Assembly Language
  3. Operating System Security
1. Introduction

This tutorial analyzes the malicious driver B48DADF8.sys. We assume that you have retrieved the driver from the hidden drive, following the instructions of Tutorial 28.

2. Lab Configuration

We need two Windows images: one for taking notes and one for actually running the malware. A kernel-mode WinDbg instance is also needed on the host.

To set up the Notes image, follow the instructions of Tutorial 20. The basic idea is to start two instances of IMM, using one to debug the other. Then set a breakpoint at 0x004E6095 and skip the instruction there, which would otherwise cause an illegal memory write. Once B48DADF8.sys is loaded in the second IMM (as shown in Figure 1), checking the executable modules (View -> Executable Modules) shows that the entry of the driver is at offset +1259 (hexadecimal). Jumping to that address, we can see the driver entry, which makes a series of calls to hook image loading and to create the driver's device.

Figure 1. Identify B48DADF8.sys Entry

The setup of the Windows image for debugging and of WinDbg should follow the instructions of Tutorial 28. We need to stop at the entry of the driver. This can be achieved by first finding the starting address of the module in WinDbg and then adding the offset 0x1259.

kd> g
80506d3e cc              int     3
kd> g
80506d3e cc              int     3
kd> lm
start    end        module name
804d7000 806ed680   nt         (pdb symbols)          c:\windows\symbols\ntoskrnl.pdb\47A5AC97343A4A7ABF14EFD9E99337722\ntoskrnl.pdb
faf0c000 faf11000   B48DADF8   (deferred)            
faf54000 faf5c000   _          (deferred)          

First, we learn that the B48DADF8 module starts at faf0c000. Given that the offset of the entry is 0x1259, we can set a breakpoint at faf0c000 + 0x1259 = faf0d259, as shown below.

kd> bp faf0d259
*** ERROR: Module load completed but symbols could not be loaded for B48DADF8.sys
kd> g
Wed May 30 09:55:24.281 2012 (UTC - 4:00): Breakpoint 0 hit
faf0d259 55              push    ebp
kd> u
faf0d259 55              push    ebp
faf0d25a 8bec            mov     ebp,esp
faf0d25c 56              push    esi
faf0d25d 8b7508          mov     esi,dword ptr [ebp+8]
faf0d260 893518f0f0fa    mov     dword ptr [B48DADF8+0x3018 (faf0f018)],esi
faf0d266 33c0            xor     eax,eax
faf0d268 6882d0f0fa      push    offset B48DADF8+0x1082 (faf0d082)
faf0d26d c746384cd1f0fa  mov     dword ptr [esi+38h],offset B48DADF8+0x114c (faf0d14c)

You can verify that the code starting at faf0d259 (shown in the WinDbg dump above) matches the instructions in the IMM window in Figure 1. From now on, we can start the analysis. The basic approach is to execute the driver in the WinDbg instance and annotate the code in the WinNotes image.
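The breakpoint address used above is just the module base plus the entry offset. The following minimal Python sketch repeats that arithmetic; the helper name module_address is ours, and the numbers are the ones from the lm output and from IMM.

```python
def module_address(base: int, offset: int) -> int:
    """Return the absolute address of a module-relative offset."""
    return base + offset


# Base of B48DADF8 from the `lm` output; 0x1259 is the entry offset seen in IMM.
B48DADF8_BASE = 0xFAF0C000
ENTRY_OFFSET = 0x1259

entry = module_address(B48DADF8_BASE, ENTRY_OFFSET)
print(f"bp {entry:08x}")
```

Because the driver is loaded at a different base in each debugging session, the same helper can be reused later with the new base reported by lm.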

3. Hook Up Driver with New Device and Set Image Load Notifier

We now observe the first section of the code at the beginning of the driver loading. Figure 3 shows the annotated code.

Figure 3. First Part of B48DADF8.sys

The first interesting part is that the driver saves a pointer to its own DRIVER_OBJECT in a global variable. It reads EBP+8, i.e., the first parameter passed to the driver entry (a PDRIVER_OBJECT), into ESI and stores it at B48DADF8+0x3018, as shown in the first highlighted part. We will see that the malware needs this value later.

We can verify that the value ffb81268 is really a driver object as shown below. It is clear that the driver object is not fully set up yet, e.g., the DeviceObject is null.

kd> dd esp
f7c88920  faf4d121 ffb81268 00000000 02002000
kd> dd _DRIVER_OBJECT ffb81268
Couldn't resolve error at '_DRIVER_OBJECT ffb81268'
kd> dt _DRIVER_OBJECT ffb81268
   +0x000 Type             : 0n4
   +0x002 Size             : 0n168
   +0x004 DeviceObject     : (null)
   +0x008 Flags            : 4
   +0x00c DriverStart      : (null)
   +0x010 DriverSize       : 0
   +0x014 DriverSection    : 0xffb8c8f0 Void
   +0x018 DriverExtension  : 0xffb81310 _DRIVER_EXTENSION
   +0x01c DriverName       : _UNICODE_STRING "\driver\4157114776"

Next, B48DADF8.sys calls PsSetLoadImageNotifyRoutine to register the function at +1082 as an image load notification callback. This is clearly an operation that tries to hide the loading of modules. We are not getting into the details yet, but we can set a breakpoint on +1082. We can see that the breakpoint is hit multiple times, and that DebugService + 2bde is no longer hit.

Then, B48DADF8.sys creates an I/O device and hooks itself up as the driver for that device. According to MSDN, IoCreateDevice() has 7 parameters: DriverObject, DeviceExtensionSize, DeviceName, DeviceType, DeviceCharacteristics, Exclusive, and DeviceObject (an output pointer).

From the WinDbg dump below, we can infer that the name of the new device is \??\EBB02C33#910D#415d#BB61#FBD3CC1D0CFE and that the device type is 0x22 (FILE_DEVICE_UNKNOWN):

kd> dd esp
f7c888fc  ffb81268 00000000 faed6114 00000022
kd> dt _UNICODE_STRING faed6114
   +0x000 Length           : 0x50
   +0x002 MaximumLength    : 0x52
   +0x004 Buffer           : 0xfaed60c0  "\??\EBB02C33#910D#415d#BB61#FBD3CC1D0CFE"
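As a sanity check on the dump above, the Length and MaximumLength fields of a _UNICODE_STRING are byte counts of UTF-16 text, not character counts. The short Python sketch below recomputes them for the device name. It assumes the buffer holds the name followed by one terminating NUL character, which is what the difference of 2 between the two fields suggests.

```python
# Device name exactly as printed by `dt _UNICODE_STRING faed6114`.
device_name = r"\??\EBB02C33#910D#415d#BB61#FBD3CC1D0CFE"

# UNICODE_STRING.Length counts bytes of UTF-16LE text, without a terminator.
length = len(device_name.encode("utf-16-le"))

# Assumption: MaximumLength covers the text plus one 2-byte NUL terminator.
maximum_length = length + 2

# Fourth stack word passed to IoCreateDevice (the DeviceType argument).
FILE_DEVICE_UNKNOWN = 0x22

print(hex(length), hex(maximum_length))
```

The name has 40 characters, so the computed values agree with the 0x50 and 0x52 shown by WinDbg.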

4. Hide Driver Module
We now discuss the efforts of B48DADF8.sys to hide itself. This part contains no more than 20 instructions, as shown in Figure 4.
Figure 4. Hide Driver Module B48DADF8.sys

At the beginning of the code, ESI points to the _DRIVER_OBJECT of B48DADF8. The code then reads the pointer at offset 0x14 of the _DRIVER_OBJECT, so that EDX points to DriverSection (whose data type is _LDR_DATA_TABLE_ENTRY). Using WinDbg, we can easily verify its contents, as shown below. You can see that its full DLL name is "\??\... C2CAD...B48DADF8.sys".

kd> dt _LDR_DATA_TABLE_ENTRY ffbd61b0 -r1
   +0x000 InLoadOrderLinks : _LIST_ENTRY [ 0x8055b1c0 - 0xffbacea8 ]
      +0x000 Flink            : 0x8055b1c0 _LIST_ENTRY [ 0x8131db20 - 0xffbd61b0 ]
      +0x004 Blink            : 0xffbacea8 _LIST_ENTRY [ 0xffbd61b0 - 0x811974c8 ]
   +0x008 InMemoryOrderLinks : _LIST_ENTRY [ 0x0 - 0x0 ]
      +0x000 Flink            : (null)
      +0x004 Blink            : (null)
   +0x010 InInitializationOrderLinks : _LIST_ENTRY [ 0x630069 - 0x0 ]
      +0x000 Flink            : 0x00630069 _LIST_ENTRY
      +0x004 Blink            : (null)
   +0x018 DllBase          : 0xfaedc000 Void
   +0x01c EntryPoint       : 0xfaedd259 Void
   +0x020 SizeOfImage      : 0x5000
   +0x024 FullDllName      : _UNICODE_STRING "\??\C2CAD972#4079#4fd3#A68D#AD34CC121074\B48DADF8.sys"
      +0x000 Length           : 0x6a
      +0x002 MaximumLength    : 0x6a
      +0x004 Buffer           : 0xe1a678c8  "\??\C2CAD972#4079#4fd3#A68D#AD34CC121074\B48DADF8.sys"
   +0x02c BaseDllName      : _UNICODE_STRING "B48DADF8.sys"
      +0x000 Length           : 0x18
      +0x002 MaximumLength    : 0x18
      +0x004 Buffer           : 0xffbd61fc  "B48DADF8.sys"
   +0x034 Flags            : 0x1104000
   +0x038 LoadCount        : 1
   +0x03a TlsIndex         : 0x49
   +0x03c HashLinks        : _LIST_ENTRY [ 0xffffffff - 0x4f51 ]
      +0x000 Flink            : 0xffffffff _LIST_ENTRY
      +0x004 Blink            : 0x00004f51 _LIST_ENTRY
   +0x03c SectionPointer   : 0xffffffff Void
   +0x040 CheckSum         : 0x4f51
   +0x044 TimeDateStamp    : 0xfffffffe
   +0x044 LoadedImports    : 0xfffffffe Void
   +0x048 EntryPointActivationContext : (null)
   +0x04c PatchInformation : 0x00340042 Void

The next couple of instructions (from 0x100012CE to 0x100012D2 in Figure 4) clear the FullDllName. After 0x100012D2, if you display the same DriverSection again, you will notice that the FullDllName is gone, as shown below. However, the BaseDllName is still there; I guess the malware author forgot to clear it as well.

kd> dt _LDR_DATA_TABLE_ENTRY ffbd61b0
   +0x024 FullDllName      : _UNICODE_STRING ""
   +0x02c BaseDllName      : _UNICODE_STRING "B48DADF8.sys"

Next, B48DADF8.sys removes itself from the module list. As shown in Figure 4, at 0x100012D6, EAX and ECX hold the Flink and Blink of the driver's InLoadOrderLinks entry (the first field of its _LDR_DATA_TABLE_ENTRY). The next four instructions constitute a typical REMOVE_NODE operation on a doubly linked list, which unlinks the B48DADF8 module from the list.
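To make the unlink operation concrete, here is a minimal Python model of a circular doubly linked list and the generic remove-node step. It is only a sketch of the general technique, not a transcription of the instructions in Figure 4. The class, the helper names, and the module names other than B48DADF8.sys are hypothetical.

```python
class ListEntry:
    """Toy stand-in for a _LIST_ENTRY embedded in a module record."""

    def __init__(self, name):
        self.name = name
        self.flink = self  # forward link
        self.blink = self  # backward link


def insert_tail(head, entry):
    """Append entry just before the list head (i.e., at the tail)."""
    last = head.blink
    entry.flink, entry.blink = head, last
    last.flink = entry
    head.blink = entry


def remove_entry(entry):
    """Generic unlink: make the two neighbors point at each other."""
    flink, blink = entry.flink, entry.blink
    blink.flink = flink  # predecessor now skips the entry
    flink.blink = blink  # successor now skips the entry


def walk(head):
    """Return the names visible to anyone who walks the list."""
    names, cur = [], head.flink
    while cur is not head:
        names.append(cur.name)
        cur = cur.flink
    return names


head = ListEntry("<list head>")
entries = {n: ListEntry(n) for n in ("ntoskrnl.exe", "hal.dll", "B48DADF8.sys")}
for e in entries.values():
    insert_tail(head, e)

remove_entry(entries["B48DADF8.sys"])
print(walk(head))
```

Note that the removed entry itself is left untouched: its own links still point into the list, but nothing in the list points back to it, so a tool that enumerates modules by walking the list no longer sees the driver.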

Challenge 2. Explain the logic of code from 100012D6 to 100012E3 in Figure 4.

5. Hook Up on PCI Device
The next step (function 0x100011D0) is to hook up on the PCI Device by copying from the original PCI driver. This is shown in Figure 5.
Figure 5. Copy from PCI Driver
As shown in Figure 5, the first step of function 0x100011D0 is to retrieve the PCI driver object by the name "driver\PCI". It then copies several attributes of the PCI driver to the current driver, such as DriverStart, DriverInit, DriverSize, and DriverName. However, the driver's MajorFunction array (which contains the IRP handler entry addresses) is not changed. Here all entries are redirected to 0xfaed514c, as shown below in the dumps of the B48DADF8.sys driver object before and after the call of 0x100011D0.

kd> dt _DRIVER_OBJECT ffb80b90
   +0x000 Type             : 0n0
   +0x002 Size             : 0n168
   +0x004 DeviceObject     : 0xffbb4af8 _DEVICE_OBJECT
   +0x008 Flags            : 4
   +0x00c DriverStart      : (null)
   +0x010 DriverSize       : 0
   +0x014 DriverSection    : 0xffb8c8f0 Void
   +0x018 DriverExtension  : 0xffb80c38 _DRIVER_EXTENSION
   +0x01c DriverName       : _UNICODE_STRING "\driver\4157114776"
   +0x024 HardwareDatabase : (null)
   +0x028 FastIoDispatch   : (null)
   +0x02c DriverInit       : 0xfaf572c5     long  +0
   +0x030 DriverStartIo    : (null)
   +0x034 DriverUnload     : (null)
   +0x038 MajorFunction    : [28] 0xfaed514c     long  +0
kd> p
kd> dt _DRIVER_OBJECT ffb80b90
   +0x000 Type             : 0n0
   +0x002 Size             : 0n168
   +0x004 DeviceObject     : 0xffbb4af8 _DEVICE_OBJECT
   +0x008 Flags            : 4
   +0x00c DriverStart      : 0xfaafc000 Void
   +0x010 DriverSize       : 0x10a80
   +0x014 DriverSection    : 0x8131d8a0 Void
   +0x018 DriverExtension  : 0xffb80c38 _DRIVER_EXTENSION
   +0x01c DriverName       : _UNICODE_STRING "\Driver\PCI"
   +0x024 HardwareDatabase : (null)
   +0x028 FastIoDispatch   : (null)
   +0x02c DriverInit       : 0xfab0a004     long  +fffffffffab0a004
   +0x030 DriverStartIo    : (null)
   +0x034 DriverUnload     : (null)
   +0x038 MajorFunction    : [28] 0xfaed514c     long  +0
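The two dumps are easier to compare side by side. The following Python sketch copies the relevant fields from the dumps above into two dictionaries and lists which ones changed across the call. The dictionary layout and the key "MajorFunction[0]" (the first array element, which is what dt prints) are our own notation.

```python
# Field values copied from the two `dt _DRIVER_OBJECT ffb80b90` dumps.
before = {
    "DriverStart": 0x00000000,
    "DriverSize": 0x0,
    "DriverSection": 0xFFB8C8F0,
    "DriverName": r"\driver\4157114776",
    "DriverInit": 0xFAF572C5,
    "MajorFunction[0]": 0xFAED514C,
}
after = {
    "DriverStart": 0xFAAFC000,
    "DriverSize": 0x10A80,
    "DriverSection": 0x8131D8A0,
    "DriverName": r"\Driver\PCI",
    "DriverInit": 0xFAB0A004,
    "MajorFunction[0]": 0xFAED514C,
}

changed = sorted(k for k in before if before[k] != after[k])
unchanged = sorted(k for k in before if before[k] == after[k])

print("changed:  ", changed)
print("unchanged:", unchanged)
```

The identity fields now describe the PCI driver, while the dispatch entry still points into B48DADF8.sys.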

Challenge 3. Note that at this moment, the PCI device has not been completely hooked up to the new driver. Find a way to find out where the new driver is eventually set as the handling driver for the PCI device.

6. Image Loading
We now go back to study the image load callback at +1082, which was discussed earlier in Section 3. Max++ registers +1082 as the callback function that is invoked whenever an image is loaded. First of all, this disrupts WinDbg in monitoring image loading. But the code itself also does a lot of malicious work. Let's set a breakpoint at +1082 and watch its behavior. The setup is shown below:

kd> lm
start    end        module name
804d7000 806ed680   nt         (pdb symbols)          c:\windows\symbols\ntoskrnl.pdb\47A5AC97343A4A7ABF14EFD9E99337722\ntoskrnl.pdb
faee4000 faee9000   B48DADF8   (deferred)            
faf64000 faf6c000   _          (deferred)            

Unloaded modules:
kd> bp faee4000 + 1082
*** ERROR: Module load completed but symbols could not be loaded for B48DADF8.sys
kd> g
Mon Jun  4 08:34:46.187 2012 (UTC - 4:00): Breakpoint 0 hit
faee5082 55              push    ebp

Figure 6 displays the major function of +1082 (also written as 0x10001082).
Figure 6. Function Body of 0x10001082

As shown in Figure 6, the major part of +1082 sets up and queues an APC (Asynchronous Procedure Call) object. APCs are frequently used in I/O operations; an APC represents a routine that will be executed later, asynchronously, in the context of a particular thread.

Looking at the highlighted parts in Figure 6, the control flow is very clear: Max++ first calls ExAllocatePool to reserve 30 bytes of kernel memory for the APC object, then it calls KeInitializeApc and KeInsertQueueApc to queue the APC. We need to look at the details of KeInitializeApc. According to the ReactOS documentation, the prototype of KeInitializeApc is shown below:

VOID NTAPI KeInitializeApc(
        IN PKAPC pApc,
        IN PKTHREAD thread,
        IN KAPC_ENVIRONMENT targetEnvironment,
        IN PKKERNEL_ROUTINE kernelRoutine,
        IN PKRUNDOWN_ROUTINE rundownRoutine,
        IN PKNORMAL_ROUTINE normalRoutine,
        IN KPROCESSOR_MODE mode,
        IN PVOID context);

The dump from WinDbg can be found in the following:
kd> dd esp
f7c88bb4  ffbb2e48 81176320 00000000 faee530c
f7c88bc4  faee52f0 71a50000 00000001 00000000

Here, the third word (00000000) is the APC environment argument of KeInitializeApc, the kernelRoutine is faee530c (+130c), the rundownRoutine is faee52f0 (+12f0), the normalRoutine is 71a50000 (your job: find out which module it belongs to), and the mode is 1. According to the MSDN documentation, if mode is 1 and normalRoutine is not 0, this is a user-mode APC, which will call the normalRoutine later. However, to be safe, we want to set breakpoints on all of the routines: +130c, +12f0, and 71a50000.
Pay special attention: at this moment, the normal routine is 71a50000!
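The mapping from stack words to arguments can be checked mechanically. The Python sketch below pairs the eight words from the dd esp dump with parameter names, assuming the eight-parameter ReactOS signature of KeInitializeApc (which has a target-environment argument between the thread and the kernel routine), and then reports which routine addresses fall inside B48DADF8.sys using the module range from the lm output of this session. The helper describe and the constant names are ours.

```python
# Module range of B48DADF8 in this session, from the `lm` output.
B48DADF8_BASE, B48DADF8_END = 0xFAEE4000, 0xFAEE9000
USER_MODE = 1  # KPROCESSOR_MODE value for user mode

# Assumed parameter order (eight-parameter ReactOS signature).
PARAMS = ["Apc", "Thread", "TargetEnvironment", "KernelRoutine",
          "RundownRoutine", "NormalRoutine", "Mode", "Context"]

# The eight words shown by `dd esp` at the call to KeInitializeApc.
STACK = [0xFFBB2E48, 0x81176320, 0x00000000, 0xFAEE530C,
         0xFAEE52F0, 0x71A50000, 0x00000001, 0x00000000]

args = dict(zip(PARAMS, STACK))


def describe(addr):
    """Render an address as module+offset when it lies inside the driver."""
    if B48DADF8_BASE <= addr < B48DADF8_END:
        return f"B48DADF8+{addr - B48DADF8_BASE:x}"
    return f"{addr:08x} (outside the driver)"


for name in ("KernelRoutine", "RundownRoutine", "NormalRoutine"):
    print(f"{name:15s} {describe(args[name])}")

# A user-mode APC needs Mode == 1 and a non-null normal routine.
is_user_apc = args["Mode"] == USER_MODE and args["NormalRoutine"] != 0
print("user-mode APC:", is_user_apc)
```

The kernel and rundown routines resolve to offsets inside the driver, while the normal routine points into user space, which is consistent with a user-mode APC.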

Interestingly, it is the kernel routine +130c that is hit first. The following is the call stack:
kd> kv
ChildEBP RetAddr  Args to Child             
WARNING: Stack unwind information not available. Following frames may be wrong.
f7c88d4c 804de855 00000001 00000000 f7c88d64 B48DADF8+0x130c
f7c88d4c 7c90e4f4 00000001 00000000 f7c88d64 nt!KiServiceExit+0x58 (FPO: [0,0] TrapFrame @ f7c88d64)
0012c8f8 7c91624a 0017c0e0 0012c984 0012ceac 0x7c90e4f4
0012cbb8 7c9164b3 00000000 0017c0e0 0012ceac 0x7c91624a
0012ce60 7c801bbd 0017c0e0 0012ceac 0012ce8c 0x7c9164b3
0012cec8 7c801d72 7ffdfc00 00000000 00000000 0x7c801bbd
0012cedc 7c801da8 0012d15c 00000000 00000000 0x7c801d72
0012cef8 71ab78f1 0012d15c 00155218 0017f410 0x7c801da8
0012d27c 71ab496d 0017f694 0017f420 00000000 0x71ab78f1
0012d29c 71ab49cc 0017f410 c000010e 00000000 0x71ab496d
0012d2b8 71ab40a3 00000002 00000001 00000006 0x71ab49cc
0012d310 003c2315 00000002 00000001 00000006 0x71ab40a3
0012d604 003c24ef 0012d634 7c906786 7c903400 0x3c2315
0012d648 003c2507 00401166 003c0000 fffffffe 0x3c24ef
0012ffd0 8054b6b8 0012ffc8 81176320 ffffffff 0x3c2507
00413a40 ec81ec8b 0000030c d98b5653 f4b58d57 nt!ExFreePoolWithTag+0x676 (FPO: [Non-Fpo])
00413a4c f4b58d57 8bfffffd f45589c3 0000c0e8 0xec81ec8b
00413a50 8bfffffd f45589c3 0000c0e8 10c38300 0xf4b58d57
00413a54 f45589c3 0000c0e8 10c38300 0cf85d89 0x8bfffffd
00413a58 00000000 10c38300 0cf85d89 f3b58dff 0xf45589c3

Challenge 4. Analyze who triggers +130c. (Hint: the 0x71 range is mswsock.dll, and the 0x7c range is ntdll.dll.)

Now let's study what the function +130c (0x1000130c in IMM) is doing. Figure 7 shows its function body.

Figure 7. Function Body of +130c
The first part of +130c is pretty interesting. It is a sequence of exchange instructions, which essentially rotates the 6 words on top of the stack. In the following, we show the contents of the stack before the rotation.

kd> dd esp
f7c88d00  804e60f1 ffb9b638 f7c88d48 f7c88d3c
f7c88d10  f7c88d40 f7c88d44 f7c88d64 0012c834

The stack contents after the rotation are shown below. You can see that 0x804e60f1 has moved to the right (it is now the sixth word on the stack).
kd> dd esp
f7c88d00  ffb9b638 f7c88d48 f7c88d3c f7c88d40
f7c88d10  f7c88d44 804e60f1 f7c88d64 0012c834

Why does Max++ have to do this? The reason is that the function copyMaliciousCodeToNewVM (the call located at 0x10001327 in Figure 7) consumes 5 words from the stack. In the following, we display the stack contents after the call of copyMaliciousCodeToNewVM is completed.
kd> dd esp
f7c88d14  804e60f1 f7c88d64 0012c834 f7c88d64
f7c88d24  ffffffff 804e2490 804f2001 7c90e4f4

You can notice that 804e60f1 is now at the top of the stack. At this moment, the control flow (at 0x10001332) is about to jump to ntoskrnl.ObfDereferenceObject, which, when it finishes, will return to 0x804e60f1 (originally the return address of +130c). By manipulating the stack this way, Max++ can confuse the control flow analysis performed by static analysis tools.
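The stack manipulation can be replayed on the dumped words. The Python sketch below models the stack as a list whose first element is the top of the stack. It rotates the top six words and then drops five of them, which is our reading of "consumes 5 words" (the callee popping five words on return). The function name rotate_top is ours.

```python
# Top eight stack words on entry to +130c (index 0 is the top of the stack).
before = [0x804E60F1, 0xFFB9B638, 0xF7C88D48, 0xF7C88D3C,
          0xF7C88D40, 0xF7C88D44, 0xF7C88D64, 0x0012C834]


def rotate_top(stack, n):
    """Move the top word below the next n-1 words; leave the rest untouched."""
    return stack[1:n] + stack[:1] + stack[n:]


after_rotation = rotate_top(before, 6)

# Assumption: copyMaliciousCodeToNewVM removes five words when it returns,
# so ESP advances by 5 * 4 = 0x14 bytes (f7c88d00 -> f7c88d14).
after_call = after_rotation[5:]

print([f"{w:08x}" for w in after_rotation])
print([f"{w:08x}" for w in after_call])
```

After the five words are gone, the original return address 0x804e60f1 is back on top of the stack, ready to be used as the return address by whatever function the driver jumps to next.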

Now let's observe the logic of copyMaliciousCodeToNewVM, which is located at +100f. Figure 8 shows its function body.

Figure 8. Function Body of +100f

The logic of copyMaliciousCodeToNewVM (+100f) is very simple: it first lowers the IRQL, then it allocates a small region of memory and copies the contents of +1338 to the target address (0x00380000). Recall the "Stealthy 0x00380000 memory segment" in Tutorial 27; it is now your job to figure out what is copied into the region at 0x00380000.

Challenge 5. Figure out what is copied by the copyMaliciousCodeToNewVM.

Then the JMP ntoskrnl.ObfDereferenceObject dereferences the new driver object and returns to the system call that triggered the kernel routine +130c.

Figure 9 shows the contents of copyMaliciousCodeToNewVM (+100f). If you look at its logic, it basically matches the description above. However, there is one thing we'd like you to pay special attention to:

Challenge 6. See Figure 9: where is the new VM address (allocated by ZwAllocateVirtualMemory, i.e., 0x00380000) stored?
Figure 9. Function Body of copyMaliciousCodeToNewVM (0x1000100f)

If you look at the two highlighted instructions in Figure 9, you might notice that the 3rd parameter of 0x1000100F (copyMaliciousCodeToNewVM) is used to store 0x00380000. But why?

Challenge 7. Figure out what is the motivation for storing 0x00380000 in the 3rd parameter of 0x1000100F.

To solve the above challenge, we need to go back to Figure 8, where we notice that it is f7c88d3c which is passed to the call of 0x1000100F (copyMaliciousCodeToNewVM).
Looking at the ReactOS information about the kernel routine (search for PKKERNEL_ROUTINE on ReactOS), you will find that the kernel routine (+130c) accepts 5 parameters: APC, pNormal_Routine, Normal_Context, System_Arg, and System_Arg2. The last four are passed by pointer, so the kernel routine can overwrite them. So f7c88d3c is actually the NORMAL_CONTEXT parameter of the KERNEL_ROUTINE! Similarly, you will find that f7c88d48 (the second parameter) is the NORMAL_ROUTINE.

Now, look at the highlighted part in Figure 9 (two yellow underlines and three thicker ones): you will find that copyMaliciousCodeToNewVM writes the value 0x00380000 into the placeholders for both NORMAL_CONTEXT and NORMAL_ROUTINE!
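This is the "self-modifying" part of the APC: the kernel routine receives pointers to the normal routine and the normal context, so it can replace both before the user-mode part is dispatched. The Python sketch below is a toy model of that idea only. It is not how the Windows dispatcher is implemented; the dictionary standing in for memory, the function names, and the initial context value of 0 (taken from the last argument to KeInitializeApc) are our assumptions.

```python
NEW_VM = 0x00380000  # region allocated and filled by copyMaliciousCodeToNewVM

# Toy "memory": the two kernel-stack slots whose addresses are passed
# to the kernel routine as its 2nd and 3rd parameters.
P_NORMAL_ROUTINE = 0xF7C88D48
P_NORMAL_CONTEXT = 0xF7C88D3C
memory = {
    P_NORMAL_ROUTINE: 0x71A50000,  # normal routine set at KeInitializeApc time
    P_NORMAL_CONTEXT: 0x00000000,  # assumed initial context
}


def kernel_routine(apc, p_normal_routine, p_normal_context, p_arg1, p_arg2):
    """Model of +130c: overwrite the routine and context through the pointers."""
    memory[p_normal_routine] = NEW_VM
    memory[p_normal_context] = NEW_VM


def deliver_user_apc(apc):
    """Model of delivery: run the kernel routine, then read the slots back."""
    kernel_routine(apc, P_NORMAL_ROUTINE, P_NORMAL_CONTEXT, None, None)
    return memory[P_NORMAL_ROUTINE], memory[P_NORMAL_CONTEXT]


routine, context = deliver_user_apc(apc=0xFFB9B638)
print(f"user-mode APC will run at {routine:08x} with context {context:08x}")
```

In this model, a breakpoint placed on the originally registered normal routine (71a50000) would never see the malicious code, because by the time the user-mode part runs the target has already been replaced by 0x00380000.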

Lab Configuration for Analyzing  Code 0x00380000: 
Clearly, the next step we would like to pursue is to debug the code located at 0x00380000 (originally copied from 0x10001338). Interestingly, if you set a hardware BP in WinDbg, the debugger never stops on 0x00380000. We suspect that somehow hardware BP is cleared at some point. Only the software BP works in this scenario ("bp 0x00380000"), and in some scenarios you might find that your IMM actually gets the INT3 (software BP) interrupt and you have to now debug it in IMM. Figure 10 shows you the screen of IMM when the software BP is intercepted. You can see that the "instruction" at 0x00380000 is "INT3".

Figure 10. Interesting Debugging Behavior of WinDbg/IMM
Open Challenge: Figure out why the software BP set by WinDbg is intercepted by Immunity Debugger. (conjecture: the debug port of the operating system is reset by Max++).

Figure 11. Code at 0x00380000

Figure 11 displays the logic of the code at 0x00380000 (originally copied from 0x10001338). It mainly consists of three steps: (1) it searches for the module "kernel32.dll", (2) it searches for the function LoadLibraryW in the PE header, and (3) it invokes function +1404 (we named it LoadMax++00x86).

The main part of function LoadMax++00x86 (located at +1404) is shown in Figure 12.

Figure 12. Load Max++.00.x86

The bulk of function +1404 is the call of LoadLibraryW("\.\C2CAD...max++.00.x86").

Challenge 8. Prove the above statement is true (especially, why is its parameter "\.\C2CAD...max++.00.x86"?).

Up to now, combined with Tutorial 27, we have shown you the complete picture of the stealthy remote DLL loading technique. Max++ loads max++.x86.dll via three steps: (1) load B48DADF8.sys; (2) load max++.00.x86; and (3) load max++.x86.dll. During each step, a variety of techniques are employed to hide the trace, e.g., by modifying the kernel data structures of libraries. There are also techniques that we have not completely understood; see the open challenge in this tutorial.
