On the 7th of September 2021, Microsoft announced CVE-2021-40444, a vulnerability in MSHTML that leads to remote code execution. According to the advisory, attackers can embed a specially crafted ActiveX control in a Microsoft Office document, which is executed when the document is opened.
MSHTML is an engine used by multiple Microsoft products including Internet Explorer to render HTML files. At the time of writing, no details have been released that explain the specifics of the vulnerability itself. However, one of the researchers credited with the find stated on Twitter that it stems from logic flaws and is therefore highly reliable.
When CVE-2021-40444 was first announced, there was no public information or PoC exploit code available. Microsoft did note, however, that the vulnerability was being exploited in the wild. After stitching together clues from numerous sources discussing the vulnerability, we were able to locate a document on VirusTotal that contained the exploit.
Examination of the document reveals no macro code, which is unusual, as one would often expect to find at least one macro in a malicious document. Instead, the exploit is downloaded and executed using a Relationship element, which is part of the Office Open XML (OOXML) specification. The element is defined in the document.xml.rels file, which can be found by extracting the contents of the docx file.
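As a minimal sketch of how such an element can be surfaced during triage, the Python below opens a docx file as a ZIP archive and lists every Relationship marked as external. The function name external_relationships is our own for illustration, and this is not the tooling used in the original analysis.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespace used by relationship parts (.rels files) inside the package.
RELS_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def external_relationships(docx_file):
    """Return (part name, Id, Type, Target) for each external relationship.

    `docx_file` may be a filesystem path or a binary file-like object.
    """
    found = []
    with zipfile.ZipFile(docx_file) as archive:
        for name in archive.namelist():
            if not name.endswith(".rels"):
                continue
            root = ET.fromstring(archive.read(name))
            for rel in root.iter(RELS_NS + "Relationship"):
                # Relationships that point outside the package carry
                # TargetMode="External"; internal parts omit the attribute.
                if rel.get("TargetMode") == "External":
                    found.append(
                        (name, rel.get("Id"), rel.get("Type"), rel.get("Target"))
                    )
    return found
```

Calling external_relationships on a suspicious document returns a list that can be reviewed for unexpected URL schemes or hosts. An external target is not malicious in itself (ordinary hyperlinks are stored the same way), so the output is a starting point for review rather than a verdict.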
The URL loaded by the Target attribute uses the mhtml: prefix. MHTML stands for ‘MIME Encapsulation of Aggregate HTML Documents’; essentially, it’s a way of storing a web page and all of its associated resources in a single file. The !x-usc directive after the first appearance of the hidusi URL instructs the MSHTML engine to treat the URL that follows it as an external link. In this case, this actually results in MSHTML making two requests to the same URL. The response to the second request is the one that is rendered.
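To make the structure of such a Target value concrete, here is a small Python sketch that splits it into the URL following mhtml: and the URL following !x-usc:. The helper name split_mhtml_target is hypothetical, and the parsing is deliberately simple string handling rather than a reimplementation of how MSHTML interprets the value.

```python
def split_mhtml_target(target):
    """Split 'mhtml:<url>!x-usc:<url>' into (container_url, external_url).

    external_url is None when no '!x-usc:' directive is present.
    """
    prefix = "mhtml:"
    marker = "!x-usc:"
    if not target.lower().startswith(prefix):
        raise ValueError("target does not use the mhtml: prefix")
    body = target[len(prefix):]
    # partition() splits on the first occurrence of the marker only.
    container, separator, external = body.partition(marker)
    return container, (external if separator else None)


def repeats_same_url(target):
    """True when both halves of the target point at the same URL."""
    container, external = split_mhtml_target(target)
    return external is not None and container == external
```

For a value shaped like the one in the sample, where the same URL appears on both sides of the directive, repeats_same_url returns True. Whether that repetition is a useful detection signal on its own is an assumption that would need testing against benign documents.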
The exploit code extracted from the PCAP was obfuscated by the creator to make it harder to analyze. The deobfuscation process for this particular piece of code was relatively simple and consisted of several steps.
A cursory review of the code was conducted and several functions and variables were renamed to clearly indicate their purpose. This initial review identified that the exploit contained an array of string values named a0_0x127f (which we renamed ‘strings’), and that a function called a0_0x15ec is used by the code to extract strings from the array on demand. This is a fairly common technique used to try to prevent analysts from quickly identifying file paths, URLs, and values being passed to function calls.
The exploit begins by executing a function that shuffles the order of the strings stored in the a0_0x127f array. Values are taken from specific offsets in the array and converted into integer values. Those values are then used as part of a calculation. The shuffle function returns when the result of the calculation matches a specified value, denoting that the strings in the a0_0x127f array are in the correct order. In this case, that value is 0x5df71 (384881 in decimal).
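The rotate-until-the-checksum-matches idea can be illustrated with a short Python sketch. Everything in it is a stand-in: the sample strings, the index offset, the checksum formula, and the target value of 140 are invented for the example and are not taken from the exploit, whose real calculation and target (0x5df71) differ.

```python
import re


def leading_int(text):
    """Rough analogue of JavaScript's parseInt: leading digits or None."""
    match = re.match(r"\s*[+-]?\d+", text)
    return int(match.group()) if match else None


def toy_checksum(strings):
    """Hypothetical calculation over fixed offsets in the array."""
    first = leading_int(strings[0])
    third = leading_int(strings[2])
    if first is None or third is None:
        return None  # plays the role of NaN in the JavaScript original
    return first + 2 * third


def unshuffle(strings, checksum, target):
    """Rotate `strings` in place until checksum(strings) equals target.

    Returns the number of rotations that were needed.
    """
    for rotations in range(len(strings)):
        if checksum(strings) == target:
            return rotations
        strings.append(strings.pop(0))  # move the first element to the end
    raise ValueError("no rotation of the array produces the target value")


def make_lookup(strings, base):
    """Mimic a lookup helper that subtracts a fixed base from the index."""
    def lookup(index):
        return strings[index - base]
    return lookup
```

With the toy array ["iframe", "ActiveXObject", "100aa", "codebase", "20bb"] and a target of 140, unshuffle rotates the array twice before the checksum matches, after which a lookup helper built with make_lookup resolves indexes against the corrected order.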
By this stage in the exploit code, there was only one outer function left to analyze. The anonymous function starts by immediately assigning a reference to the a0_0x15ec function to a variable called _0x2ee207.
The rest of the function makes regular use of _0x2ee207 to retrieve string values. To ease analysis, we renamed the variable and wrote a small Python script to find and replace all the calls to that function with the string value they would normally return. For instance, here’s the first set of variable definitions before and after the conversion.
To insert the correct strings, the a0_0x127f array and the shuffle function were fed into a Node.js console and executed.
The resulting value of the a0_0x127f array was then added to the Python script. After a further manual review and renaming of symbols, we were left with exploit code that was far easier to read and analyze. We have made the deobfuscated code available on GitHub.
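The find-and-replace step described above can be sketched roughly as follows. This is an illustrative reconstruction rather than the script we used: the function name inline_lookups, the base offset, and the sample strings are assumptions, and it only handles calls whose single argument is a numeric literal.

```python
import json
import re


def inline_lookups(source, func_name, strings, base):
    """Replace calls like func_name(0x1a0) with the string they return.

    `strings` is the array in its corrected order and `base` is the
    offset the lookup helper subtracts from its argument.
    """
    pattern = re.compile(
        r"(?<![\w$])" + re.escape(func_name) + r"\(\s*(0x[0-9a-fA-F]+|\d+)\s*\)"
    )

    def replace(match):
        literal = match.group(1)
        if literal.lower().startswith("0x"):
            value = int(literal, 16)
        else:
            value = int(literal)
        index = value - base
        if not 0 <= index < len(strings):
            return match.group(0)  # leave unknown indexes untouched
        # json.dumps yields a double-quoted, escaped literal valid in JavaScript.
        return json.dumps(strings[index])

    return pattern.sub(replace, source)
```

For example, with strings set to ["iframe", "ActiveXObject"] and a base of 0x1a0, the input `var a = _0x2ee207(0x1a0), b = _0x2ee207(0x1a1);` becomes `var a = "iframe", b = "ActiveXObject";`. Calls with out-of-range indexes are left as they are so that they stand out during manual review.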
Reviewing the deobfuscated code, we were able to identify that the exploit begins by creating a new iframe element, adding it to the DOM, and then instantiating a new ActiveXObject object within the iframe. The iframe’s Document is then opened and closed without any changes being written to it before the iframe element is removed from the page.
Three additional ActiveXObject objects are then instantiated, each nested within the previously created object. Each ActiveXObject object is opened and closed without any changes being written to it before the next object is instantiated.
The exploit uses XMLHttpRequest to make a GET request to a malicious cabinet file (ministry.cab) hosted on the same domain as the exploit code. This is presumably an availability check as the response body is never actually used by the remaining code. Following the request, a new object HTML element is created and added into the document of the final nested ActiveXObject. The URL of the cabinet file is set as the element’s codebase attribute, which results in MSHTML downloading the file and unpacking it into the current user’s temp directory.
Several additional ActiveXObject objects are instantiated. This time, the objects are standalone and not nested within each other or any of the previous objects. These objects are then used to launch multiple instances of the Windows Control Panel (control.exe), which the exploit uses to load and execute DLL files.
DLLs are loaded in a semi-brute-force fashion, using several path variations to ensure that the championship.inf file extracted from ministry.cab is executed.
In the sample analyzed, the championship.inf file contained a Cobalt Strike agent.
At the time of writing, there is no patch available from Microsoft to resolve CVE-2021-40444. Instead, Microsoft offers a workaround by advising that ActiveX be disabled entirely where possible by modifying some registry keys, as detailed on the CVE announcement page.
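As an illustration of what that workaround looks like in practice, the Python sketch below generates the text of a .reg file that sets URL actions 1001 and 1004 (downloading signed and unsigned ActiveX controls) to 3 (disabled) for Internet Explorer security zones 0 to 3. The key path and values reflect our recollection of the advisory as published and should be treated as assumptions: check them against the CVE announcement page before applying anything.

```python
# Assumed policy key from the advisory's workaround; verify before use.
ZONES_KEY = (
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\Windows"
    r"\CurrentVersion\Internet Settings\Zones"
)
ACTIVEX_ACTIONS = ("1001", "1004")  # download signed / unsigned ActiveX controls
DISABLE = 3                         # URL action policy value meaning "disable"


def build_reg_file(zones=range(4)):
    """Return .reg file text disabling ActiveX installation per zone."""
    lines = ["Windows Registry Editor Version 5.00", ""]
    for zone in zones:
        lines.append(f"[{ZONES_KEY}\\{zone}]")
        for action in ACTIVEX_ACTIONS:
            lines.append(f'"{action}"=dword:{DISABLE:08x}')
        lines.append("")
    # .reg files conventionally use Windows line endings.
    return "\r\n".join(lines)
```

The function only builds text; it does not touch the registry. Writing the result to a file and importing it is left as a manual step so that the contents can be reviewed first.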
Want to know more?
We’ve released a new lab covering CVE-2021-40444 in which you can get hands-on experience analyzing this exploit, alongside its network traffic, and learn how to identify important IoCs to help with detection in your own environments.
To see the lab in action, or to find out more about the Immersive Labs platform, book a demo.