Authored by: Anandeshwar Unnikrishnan
Stage 1: GULoader Shellcode Deployment
In recent GULoader campaigns, we are seeing an increase in NSIS-based installers delivered via email as malspam that use plugin libraries to execute the GULoader shellcode on the victim system. The NSIS scriptable installer is a highly efficient software packaging utility. The installer behavior is dictated by an NSIS script, and users can extend the functionality of the packager by adding custom libraries (DLLs) known as NSIS plugins. Since its inception, adversaries have abused the utility to deliver malware.
NSIS stands for Nullsoft Scriptable Install System. NSIS installer files are self-contained archives, which lets malware authors include malicious assets along with junk data. The junk data is used as an anti-AV / AV evasion technique. The image below shows the structure of an NSIS GULoader staging executable archive.
The NSIS script, which is a file found in the archive, has the file extension “.nsi”, as shown in the image above. The deployment strategy employed by the threat actor can be studied by analyzing the NSIS script commands provided in the script file. The image shown below is an oversimplified view of the whole shellcode staging process.
The file that holds the encoded GULoader shellcode is dropped onto the victim’s disk, based on the script configuration, along with other files. Junk data is placed before the encoded shellcode. The encoding style varies from sample to sample, but in almost all cases it is a simple XOR encoding. Because the shellcode is appended to junk data, an offset is used to retrieve the encoded GULoader shellcode. In the image, the FileSeek NSIS command is used to seek to the proper offset. Some samples have unprotected GULoader shellcode appended to the junk data.
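For analysts who want to recover the shellcode from a dropped file, the offset-then-XOR idea described above can be sketched as follows. This is a minimal illustration, not the installer's actual logic: the function names, the 16-byte junk prefix, and the one-byte key are all hypothetical, and a real sample's offset and key have to be read from its NSIS script and shellcode.

```python
def extract_encoded_shellcode(blob: bytes, offset: int) -> bytes:
    """Skip the junk prefix, mirroring the seek that FileSeek performs."""
    if not 0 <= offset <= len(blob):
        raise ValueError("offset lies outside the dropped file")
    return blob[offset:]


def xor_decode(data: bytes, key: bytes) -> bytes:
    """Decode with a repeating XOR key (a one-byte key also works)."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


# Hypothetical dropped file: 16 junk bytes followed by XOR-encoded bytes.
junk = bytes(range(16))
plain = b"\x90\x90\xcc\xc3"   # stand-in bytes, not real shellcode
key = b"\x5a"                 # assumed key for illustration only
dropped = junk + xor_decode(plain, key)

recovered = xor_decode(extract_encoded_shellcode(dropped, len(junk)), key)
```

Because XOR is its own inverse, the same `xor_decode` helper is used both to build the test input and to recover it.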
A plugin used by the NSIS installer is nothing but a DLL that is loaded by the installer program at runtime, which then invokes functions exported by the library. Two DLL files are dropped in the user’s TEMP directory. In all analyzed samples, one DLL has the consistent name system.dll, and the name of the other one varies.
The system.dll is responsible for allocating memory for the shellcode and executing it. The following image shows how the NSIS script calls functions in plugin libraries.
The system.dll has the exports shown in the image below. The function named “Call” is used to deploy the shellcode on the victim’s system.
- The Call function exported by system.dll resolves the following functions dynamically and executes them to deploy the shellcode.
- CreateFile – To read the shellcode dumped onto disk by the installer. As part of the installer setup, all the files seen in the installer archive earlier are dumped onto disk in a new directory created on the C: drive.
- VirtualAlloc – To hold the shellcode in RWX memory.
- SetFilePointer – To seek to the exact position of the shellcode in the dumped file.
- ReadFile – To read the shellcode.
- EnumResourceTypesA – Execution via a callback mechanism. The second parameter is of the type ENUMRESTYPEPROCA, which is simply a pointer to a callback routine. The address where the shellcode is allocated in memory is passed as the second argument to this API, leading to execution of the shellcode. Callback function parameters are a convenient vehicle for indirect code execution.
Vectored Exception Handling in GULoader
The implementation of exception handling by the operating system provides an opportunity for the adversary to take over the execution flow. Vectored Exception Handling on Windows gives the user the ability to register a custom exception handler, which is simply code that gets executed when an exception occurs. The interesting thing about exception handling is the way in which the system resumes the normal execution flow of the program after the exception. Adversaries exploit this mechanism to take ownership of the execution flow. Malware can divert the flow to code under its control when the exception occurs. Typically it is employed by malware to achieve the following goals:
- Covert code execution and anti-analysis
GULoader employs VEH mainly to obfuscate the execution flow and to slow down analysis. This section covers the internals of vectored exception handling on Windows and investigates how GULoader abuses the VEH mechanism to thwart analysis efforts.
- Vectored Exception Handling (VEH) is an extension of Structured Exception Handling (SEH) that lets us add a vectored exception handler, which is called regardless of our position in a call frame. Simply put, VEH is not frame-based.
- VEH is abused by malware either to manipulate the control flow or to covertly execute user functions.
- Windows provides the AddVectoredExceptionHandler Win32 API to add custom exception handlers. The function signature is shown below.
The Handler routine is of the type PVECTORED_EXCEPTION_HANDLER. Checking the documentation further, we can see that the handler function takes a pointer to the _EXCEPTION_POINTERS type as its input, as shown in the image below.
The _EXCEPTION_POINTERS type holds two important members: ExceptionRecord (of type PEXCEPTION_RECORD) and ContextRecord (of type PCONTEXT). The exception record contains all the information related to the exception raised by the system, such as the exception code. The context record holds the CPU register values (such as RIP/EIP and the debug registers), that is, the state of the thread captured when the exception occurred.
- This means the exception handler can access both the ExceptionRecord and the ContextRecord. From within the handler, one can tamper with the data stored in the ContextRecord, thus manipulating EIP/RIP to control the execution flow when the user application resumes from exception handling.
- One interesting thing about exception handling is that execution is handed back to the application via the NtContinue native routine. Exception dispatch routines call the handler, and when the handler returns to the dispatcher, the dispatcher passes the ContextRecord to NtContinue and execution resumes from the EIP/RIP in the record. Note that this is an oversimplified explanation of the whole exception handling process.
Vectored Handler in GULoader
- GULoader registers a vectored exception handler via the RtlAddVectoredExceptionHandler native routine. The image below shows the control flow of the handler code. Interestingly, most of the code blocks present here are junk added to thwart analysis efforts.
- GULoader’s handler implementation is as follows (disregarding the junk code).
- Reads the ExceptionInfo passed to the handler by the system.
- Reads the ExceptionCode from the ExceptionRecord structure.
- Checks the value of the ExceptionCode field against the computed exception codes for STATUS_ACCESS_VIOLATION, STATUS_BREAKPOINT and STATUS_SINGLESTEP.
- Based on the exception code, the malware takes a branch and executes code that modifies the EIP.
GULoader sets the trap flag to trigger single stepping intentionally in order to detect analysis. The handler code gets executed as discussed before, and a block of code is executed based on the exception code. If the exception is a single step (status code 0x80000004), the following actions take place:
- GULoader reads the ContextRecord and retrieves the EIP value of the thread.
- It increments the current EIP by 2 and reads one byte from that location.
- It XORs the byte fetched in the previous step with a static value. The static value changes between samples. In our sample the value is 0x1A.
- The XORed value is then added to the EIP fetched from the ContextRecord.
- Finally, the modified EIP value from the prior step is stored in the ContextRecord, and the handler returns control to the system (dispatcher).
- The malware has the same logic for the access violation exception.
- When the shellcode is executed without a debugger, the INT3 instruction invokes the vectored exception handler routine with an EXCEPTION_BREAKPOINT exception. The handler computes the new EIP by reading the byte at EIP + 1, XORing it with a constant (0x1A in our case), and adding the result to the current EIP value. The logic implemented for handling INT3 exceptions also scans the program code for 0xCC instructions placed by researchers. If such software breakpoints are found, the EIP is not calculated properly.
EIP Calculation Logic Abstract
|Trigger via interrupt instruction (INT3)|eip=((ReadByte(eip+1)^0x1A)+eip)|
|Trigger via single stepping (PUSHFD/POPFD)|eip=((ReadByte(eip+2)^0x1A)+eip)|
*The value 0x1A changes between samples.
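The two formulas above can be replayed offline, for example when following the control flow in a memory dump. The sketch below is an illustration under stated assumptions: `next_eip`, the displacement constants, the base address, and the four-byte buffer are all hypothetical, and the result is masked to 32 bits on the assumption that the arithmetic wraps as it would in a 32-bit register.

```python
XOR_KEY = 0x1A               # value seen in the analyzed sample; varies per sample
INT3_DISPLACEMENT = 1        # byte read at eip + 1 for breakpoint exceptions
SINGLE_STEP_DISPLACEMENT = 2 # byte read at eip + 2 for single-step exceptions


def next_eip(memory: bytes, base: int, eip: int, displacement: int,
             key: int = XOR_KEY) -> int:
    """Return eip + (ReadByte(eip + displacement) ^ key), wrapped to 32 bits.

    `memory` is a dump of the shellcode region and `base` is the virtual
    address at which that dump was loaded.
    """
    encoded = memory[(eip + displacement) - base]
    return (eip + (encoded ^ key)) & 0xFFFFFFFF


# Hypothetical four-byte region loaded at 0x401000.
region = bytes([0xCC, 0x1F, 0x12, 0x90])
base = 0x401000

after_int3 = next_eip(region, base, base, INT3_DISPLACEMENT)
after_single_step = next_eip(region, base, base, SINGLE_STEP_DISPLACEMENT)
```

With these stand-in bytes, the INT3 path reads 0x1F, XORs it to 0x05 and lands at 0x401005, while the single-step path reads 0x12, XORs it to 0x08 and lands at 0x401008.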
Detecting Abnormal Execution Flow via VEH
- The shellcode is structured in such a way that the malware can detect abnormal execution flow from the order in which exceptions occur at runtime. The pushfd/popfd instructions are followed by code that throws STATUS_ACCESS_VIOLATION when executed. When the program runs normally, execution does not reach the code that follows the pushfd/popfd instruction block, so only STATUS_SINGLESTEP is raised. When an analyst accidentally steps over the pushfd/popfd block in a debugger, STATUS_SINGLESTEP is not delivered to the program, because the debugger, which is already single stepping through the code, suppresses it. Execution then reaches the code that follows the pushfd/popfd block, which throws STATUS_ACCESS_VIOLATION, and the handler logic detects this. The program has now run into a nested exception situation (the access violation following the suppressed single-step exception raised via the trap flag). Because of this, whenever an access violation occurs, the handler routine checks for nested exception information in the _EXCEPTION_POINTERS structure discussed at the beginning.
The image below shows the carefully laid out code used to detect analysis.
Egg Hunting: VEH Assisted Runtime Padding
One interesting feature seen in GULoader shellcode in the wild is runtime padding. Runtime padding is an evasive behavior meant to beat automated scanners and other security checks employed at runtime. It delays the malicious activities performed by the malware on the target system.
- The egg value in the analyzed sample is 0xAE74B61.
- The loader initiates a search for this value in the data segment of its own shellcode.
- Keep in mind that this is done via the VEH handler. The search itself adds about 0.3 million VEH iterations on top of the regular VEH control flow manipulation employed in the code.
- The loader ends this search when it retrieves the address of the egg value. To make sure the value has not been manipulated in any way by a researcher, it performs two more checks to validate the egg location.
- If the check fails, the search continues. The process of retrieving the location of the egg is shown in the image below.
- As mentioned above, the validity of the egg location is checked by retrieving byte values from two offsets: the byte 4 bytes away from the egg location must be 0xB8, and the byte 9 bytes away from the egg location must be 0xC3. This check must pass for the loader to proceed to the next stage of infection. Core malware activities are performed after this runtime padding loop.
The following images show the egg location validity checks performed by GULoader. The values 0xB8 and 0xC3 are checked by using the proper offsets from the egg location.
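The search-and-validate logic described above can be approximated on a shellcode dump as shown below. This is a sketch under assumptions, not GULoader's code: the helper names are hypothetical, the egg is assumed to be stored as a little-endian 32-bit value, and both offsets are assumed to be measured from the first byte of the egg. The real loader drives the loop through its exception handler, which this linear scan does not reproduce.

```python
import struct

EGG = 0x0AE74B61   # egg value from the analyzed sample
MARKER_AT_4 = 0xB8 # expected byte 4 bytes from the egg location
MARKER_AT_9 = 0xC3 # expected byte 9 bytes from the egg location


def is_valid_egg(buf: bytes, pos: int) -> bool:
    """Apply the two marker checks relative to the egg location."""
    if pos + 9 >= len(buf):
        return False
    return buf[pos + 4] == MARKER_AT_4 and buf[pos + 9] == MARKER_AT_9


def find_egg(buf: bytes, egg: int = EGG) -> int:
    """Return the offset of the first validated egg, or -1 if none exists."""
    needle = struct.pack("<I", egg)  # assumption: little-endian storage
    pos = buf.find(needle)
    while pos != -1:
        if is_valid_egg(buf, pos):
            return pos
        pos = buf.find(needle, pos + 1)  # check failed, keep searching
    return -1


# Hypothetical buffer: a decoy egg without markers, then a valid one.
needle = struct.pack("<I", EGG)
decoy = needle + b"\x00" * 10
valid = needle + b"\xb8" + b"\x11\x22\x33\x44" + b"\xc3"
sample = b"\x90" * 8 + decoy + b"\x90" * 3 + valid

egg_offset = find_egg(sample)
```

The decoy shows why the extra checks matter: a bare occurrence of the four egg bytes is skipped, and the search only stops at the occurrence that also carries both marker bytes.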
Stage 2: Environment Check and Code Injection
In the second stage of the infection chain, GULoader performs anti-analysis and code injection. The major anti-analysis vectors are listed below. After making sure that the shellcode is not running in a sandbox, it proceeds to inject code into a newly spawned process, where stage 3 is initiated to download and deploy the actual payload. This payload can be either a commodity stealer or a RAT.
- Employs runtime padding as discussed before.
- Scans the whole process memory for analysis-tool-specific strings.
- Uses DJB2 hashing for string checks and dynamic API address resolution.
- Strings are decoded at runtime.
- Checks if qemu is installed on the system by checking the installation path:
- C:Program Informationqqaqqa.exe
- Patches the following APIs:
- The function’s prologue is patched with an ExitProcess call
- The initial bytes are patched with the instruction “mov edi, edi”
- Patches with the nop instruction
- Clears hooks placed in ntdll.dll by security products or researchers for analysis.
- Window enumeration via EnumWindows
- Hides the shellcode thread from the debugger via ZwSetInformationThread by passing 0x11 (ThreadHideFromDebugger)
- Device driver enumeration via EnumDeviceDrivers and GetDeviceDriverBaseNameA
- Installed software enumeration via MsiEnumProductsA and MsiGetProductInfoA
- System service enumeration via OpenSCManagerA and EnumServiceStatusA
- Checks for the use of debugging ports by passing the ProcessDebugPort (0x7) class to NtQueryInformationProcess
- Uses the CPUID and RDTSC instructions to detect virtual environments and instrumentation.
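The DJB2-based string checks and API resolution mentioned in the list above can be sketched as follows, which is useful when building a lookup table of hashes found in a sample. The code assumes the classic DJB2 form (seed 5381, multiply by 33, add the byte) truncated to 32 bits; individual GULoader samples may use a modified variant, and the `resolve_by_hash` helper and candidate list are hypothetical.

```python
from typing import Iterable, Optional


def djb2(data: bytes, seed: int = 5381) -> int:
    """Classic DJB2: h = h * 33 + byte, kept to 32 bits."""
    h = seed
    for byte in data:
        h = (h * 33 + byte) & 0xFFFFFFFF
    return h


def resolve_by_hash(target: int, candidates: Iterable[str]) -> Optional[str]:
    """Return the candidate name whose DJB2 hash equals `target`, if any."""
    for name in candidates:
        if djb2(name.encode("ascii")) == target:
            return name
    return None


# Hypothetical candidate list, e.g. export names parsed from a DLL.
exports = ["VirtualAlloc", "EnumWindows", "NtResumeThread"]
wanted = djb2(b"EnumWindows")
resolved = resolve_by_hash(wanted, exports)
```

Comparing hashes instead of strings means that neither API names nor analysis tool names need to appear in the shellcode in clear text.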
Each time GULoader invokes a Win32 API, the call is sandwiched between two XOR loops, as shown in the image below. The loop before the call encodes the active shellcode region where the call takes place, to prevent the memory from being dumped by security products that rely on event monitoring or API calls. After the call, the shellcode region is decoded back to normal and execution resumes. The XOR key used is a word present in the shellcode itself.
This section covers the process undertaken by GULoader to decode strings at runtime.
- NtAllocateVirtualMemory is called to allocate a buffer to hold the encoded bytes.
- The encoded bytes are computed by performing various arithmetic and logical operations on static values embedded as operands of assembly instructions. The image below shows the recovery of encoded bytes via various mathematical and logical operations. EAX points to the memory buffer where the computed encoded values are stored.
The first byte/word is reserved to hold the size of the encoded bytes. The image below shows 12 bytes of encoded data being written to memory.
Later, the first word is replaced by the first word of the actual encoded data. The image below shows the buffer after the first word has been replaced.
The encoded data is now fully recovered, and the malware proceeds to decode it. A simple XOR is employed for decoding, and the key is present in the shellcode. The assembly routine that does the decoding is shown in the image below. Each byte in the buffer is XORed with the key.
The result of the XOR operation is written to the same memory buffer that holds the encoded data. A final view of the memory buffer with the decoded data is shown below.
The image shows the decoding of the string “psapi.dll”. This string is later used to fetch the addresses of various functions used for anti-analysis.
Stage 2 culminates in code injection. To be specific, GULoader employs a variation of the process hollowing technique. In process hollowing, the malware stager process spawns a benign process in a suspended state and overwrites the original content of the suspended process with malicious content. The state of the thread in the suspended process is then changed by modifying processor register values such as EIP, and finally the process resumes execution. By controlling EIP, the malware can direct the control flow in the spawned process to a desired code location. After a successful hollowing, the malware code runs under the cover of a legitimate application.
The variation of the hollowing technique employed by GULoader does not replace the file contents, but instead injects the same shellcode and maps the memory in the suspended process. Interestingly, GULoader employs an additional technique if the hollowing attempt fails. More details are covered in the following section.
The Win32 native APIs listed below are dynamically resolved at runtime to perform the code injection.
Overview of Code Injection
- Initially, the image “%windir%\Microsoft.NET\Framework\<version>\CasPol.exe” (the .NET Framework directory on 32-bit systems) is spawned in suspended mode via the CreateProcessInternalW native API.
- GULoader retrieves a handle to the file “C:\Windows\SysWOW64\iertutil.dll”, which is used in section creation. The section object created via NtCreateSection is backed by iertutil.dll.
- This behavior is mainly meant to avoid suspicion, since a section object that is not backed by any file may draw unwanted attention from security systems.
- The next phase of the code injection is mapping a view of the section backed by iertutil.dll into the spawned CasPol.exe process. Once the view is successfully mapped into the process, the malware can inject the shellcode into the mapped memory and resume the process, thus initiating stage 3. The native API ZwMapViewOfSection is used to perform this task. After the call, the malware checks the result of the function against the status codes listed below.
- C0000018 (STATUS_CONFLICTING_ADDRESSES)
- C0000220 (STATUS_MAPPED_ALIGNMENT)
- 40000003 (STATUS_IMAGE_NOT_AT_BASE).
- If the mapping is unsuccessful and the status code returned by ZwMapViewOfSection matches any of the codes mentioned above, the loader has a backup plan.
- GULoader calls NtAllocateVirtualMemory by directly invoking the system call stub, which is normally found in the ntdll.dll library, to bypass EDR/AV hooks. The memory is allocated in the remote CasPol.exe process with RWX memory protection. The following image shows the direct use of the NtAllocateVirtualMemory system call.
After memory allocation, it writes itself into the remote process via NtWriteVirtualMemory as discussed above. GULoader shellcodes taken from the field are large; the samples taken for this analysis are all greater than 20 MB. In the samples analyzed, the buffer size allocated to hold the shellcode is 2950000 bytes. The image below shows the GULoader shellcode in memory.
Misleading Entry Point
- GULoader is highly evasive in nature. If abnormal execution flow is detected with the help of the employed anti-analysis vectors, the EIP and EBX fields of the thread context structure (of the CasPol.exe process), which are required for stage 3 of malware execution, are overwritten with a decoy address. The location ebp+4 is used to hold the entry point regardless of whether the program is being debugged or not.
- GULoader uses the ZwGetContextThread and NtSetContextThread routines to modify the thread state. The CONTEXT structure is retrieved via ZwGetContextThread, and the value [ebp+14C] is used as the entry point address. The current EIP value held in the EIP field of the thread’s context structure is changed to a recalculated address based on the value at ebp+4. The image below shows the RVA calculation. The base address of the executing shellcode (stage 2) is subtracted from the virtual address [ebp+4] to obtain the RVA.
The RVA is added to the base address of the newly allocated memory in the CasPol.exe process to obtain the new VA that can be used in the remote process. The new VA is written into the EIP and EBX fields of the thread context structure of the CasPol.exe process retrieved via ZwGetContextThread. The image below shows the modified context structure and the value of EIP.
Finally, by calling ZwSetContextThread, the changes made to the CONTEXT structure are committed to the target thread of the CasPol.exe process. The thread is resumed by calling NtResumeThread. CasPol.exe resumes execution and performs stage 3 of the infection chain.
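The address arithmetic described above is a plain rebase: subtract the local base to get the RVA, then add the remote base. A small sketch is shown below to make the calculation concrete; the function name and all three addresses are hypothetical values chosen for illustration, not values taken from a sample.

```python
def rebase_entry_point(local_va: int, local_base: int, remote_base: int) -> int:
    """Translate a VA in the stage 2 shellcode into the remote allocation."""
    rva = local_va - local_base
    if rva < 0:
        raise ValueError("entry point lies below the local shellcode base")
    # Wrap to 32 bits, since the value ends up in the 32-bit EIP/EBX fields.
    return (remote_base + rva) & 0xFFFFFFFF


# Hypothetical addresses for illustration only.
local_base = 0x02A40000   # base of the executing stage 2 shellcode
local_entry = 0x02A4153C  # entry point VA read from [ebp+4]
remote_base = 0x00D30000  # base of the memory allocated in CasPol.exe

new_eip = rebase_entry_point(local_entry, local_base, remote_base)
```

In this example the RVA is 0x153C, so the value written to EIP and EBX would be 0x00D3153C.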
Stage 3: Payload Deployment
The GULoader shellcode resumes execution from within a new host process. In this report, the analyzed samples inject the shellcode either into the same process spawned as a child process or into caspol.exe. Stage 3 performs all the anti-analysis checks once again to make sure this stage is not being analyzed. After all checks, GULoader proceeds to perform its stage 3 activities by decoding the encoded C2 string in memory, as shown in the image below. The decoding method is the same as discussed before.
Later, the addresses of the following functions are resolved dynamically by loading wininet.dll:
The image below shows the response from the content delivery network (CDN) server where the final payload is stored. In this analysis, a payload of size 0x2E640 bytes is sent to the loader. Interestingly, the first 40 bytes are ignored by the loader. The actual payload starts at offset 40, which is highlighted in the image.
The CDN server is well protected: it only serves clients that send the proper headers and cookies. If these are not present in the HTTP request, the following message is shown to the user.
Quasi Key Generation
The first step in decoding the downloaded final payload is generating a quasi key, which is later used to decode the actual key embedded in the GULoader shellcode. The encoded embedded key is 371 bytes long in the analyzed sample. The process of quasi key generation is as follows:
- The 40th and 41st bytes (a word) are retrieved from the download buffer in memory.
- This word is XORed with the first word of the encoded embedded key and with a counter value.
- The process is repeated until the word taken from the downloaded data fully decodes to the value 0x4D5A (“MZ”).
- The value present in the counter when 4D5A is decoded is taken as the quasi key. This key is shown as “key-1” in the image below. In the analyzed sample the value of this key is “0x5448”.
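The counter search described above can be sketched as follows. This is an illustration under assumptions rather than the loader's routine: the function name is hypothetical, words are assumed to be read as little-endian 16-bit values (so the bytes “MZ” compare as 0x5A4D), and the payload word is assumed to sit at offset 40 of the download buffer.

```python
MZ_WORD = int.from_bytes(b"MZ", "little")  # bytes 4D 5A read as one word


def find_quasi_key(download: bytes, encoded_key: bytes,
                   payload_offset: int = 40) -> int:
    """Increment a counter until payload_word ^ key_word ^ counter is 'MZ'."""
    payload_word = int.from_bytes(
        download[payload_offset:payload_offset + 2], "little")
    key_word = int.from_bytes(encoded_key[:2], "little")
    for counter in range(0x10000):
        if payload_word ^ key_word ^ counter == MZ_WORD:
            return counter
    raise ValueError("no 16-bit counter decodes the header word")


# Hypothetical test data built from an assumed real key word and quasi key.
real_key_word = 0x2211
quasi = 0x5448
encoded_key = (real_key_word ^ quasi).to_bytes(2, "little")
download = bytes(40) + (MZ_WORD ^ real_key_word).to_bytes(2, "little")

found = find_quasi_key(download, encoded_key)
```

Since all three values are 16 bits wide, the loop is equivalent to computing `payload_word ^ key_word ^ MZ_WORD` directly; the loop form is kept because it mirrors the iteration the text describes.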
Decoding Actual Key
The embedded key in the GULoader shellcode is 371 bytes long, as discussed before. The quasi key is used to decode the embedded key, as shown in the image below.
- Each word in the embedded key is XORed with the quasi key key-1.
- When the iteration counter exceeds the size value of 371 bytes, the loop stops and the loader proceeds to decode the downloaded payload with this new key.
The decoded 371 bytes of the embedded key are shown in the image below.
Byte-level decoding happens after the embedded key is decoded in memory. Each byte of the downloaded data is XORed with the key to obtain the actual data, which is a PE file. The decoded data is written over the same buffer that was used to download the encoded data.
The final decoded PE file residing in memory is shown in the image below:
Finally, the loader loads the PE file by allocating memory with RWX permission in the stage 3 process. Based on the analysis of multiple samples, this is either the same process as in stage 2 (spawned as a child process) or CasPol.exe. The loading involves code relocation and IAT correction, as expected in such a scenario. The final payload resumes execution from within the hollowed stage 3 process. The malware families below are commonly seen deployed by GULoader:
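The two decoding passes described above can be sketched as follows. Several details are assumptions made for illustration: the helper names are hypothetical, the quasi key is applied as a little-endian word (so its two bytes alternate over the key), the decoded key is assumed to repeat cyclically over the payload, and the first 40 bytes of the download are skipped. The short key used in the example stands in for the 371-byte key.

```python
def decode_embedded_key(encoded_key: bytes, quasi_key: int) -> bytes:
    """XOR every word of the embedded key with the 16-bit quasi key."""
    q = quasi_key.to_bytes(2, "little")
    # Byte-wise XOR with alternating key bytes equals a word-wise XOR and
    # also covers a trailing odd byte (371 is odd).
    return bytes(b ^ q[i % 2] for i, b in enumerate(encoded_key))


def decode_payload(download: bytes, key: bytes, skip: int = 40) -> bytes:
    """XOR the payload after the skipped prefix with the repeating key."""
    if not key:
        raise ValueError("key must not be empty")
    body = download[skip:]
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(body))


# Hypothetical round trip with a short stand-in key.
quasi = 0x5448
real_key = bytes(range(1, 8))
encoded_key = decode_embedded_key(real_key, quasi)  # XOR is symmetric
pe_image = b"MZ\x90\x00" + b"stand-in PE body"
download = bytes(40) + decode_payload(bytes(40) + pe_image, real_key)

key = decode_embedded_key(encoded_key, quasi)
decoded = decode_payload(download, key)
```

The round trip only demonstrates that the two helpers invert the encoding under the stated assumptions; a real sample's key layout should be confirmed against the decoding routine in the shellcode.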
- Vidar (Stealer)
- Raccoon (Stealer)
- Remcos RAT
The image below shows the injected memory regions in the stage 3 process, caspol.exe in this report.
The role played by malware loaders, popularly known as “crypters”, is significant in the deployment of Remote Administration Tools and stealer malware that target consumer data. The Personally Identifiable Information (PII) exfiltrated from compromised endpoints is largely collected and funneled to various underground data-selling marketplaces. This also impacts businesses, because critical information used for authentication is leaked from users’ personal systems, leading to initial access to company networks. GULoader is heavily used in mass malware campaigns to infect users with popular stealer malware like Raccoon, Vidar, and Redline. Commodity RATs like Remcos are also seen delivered in such campaigns. On the bright side, it is not difficult to fingerprint malware specimens used in mass campaigns because of their volume and relevance, and detection rules and systems can be built around this fact.
The following table summarizes all the dynamically resolved Win32 APIs.