Fuzzing Image Parsing in Windows, Part Two: Uninitialized Memory
Continuing our discussion of image parsing vulnerabilities in Windows, we take a look at a comparatively less popular vulnerability class: uninitialized memory. In this post, we examine Windows’ built-in image parsers, looking specifically for vulnerabilities that involve the use of uninitialized memory.
The Vulnerability: Uninitialized Memory
In unmanaged languages such as C and C++, local variables and heap allocations are not initialized by default. Using uninitialized variables causes undefined behavior and may cause a crash. There are roughly two variants of uninitialized memory:
- Direct uninitialized memory usage: An uninitialized pointer or an index is used in read or write. This may cause a crash.
- Information leakage (info leak) through usage of uninitialized memory: Uninitialized memory content is accessible across a security boundary. An example: an uninitialized kernel buffer accessible from user mode, leading to information disclosure.
In this post we will be looking closely at the second variant in Windows image parsers, which will lead to information disclosure in situations such as web browsers where an attacker can read the decoded image back using JavaScript.
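The second variant can be illustrated with a small simulation. The sketch below is an assumption-laden model, not code from any Windows component: a static buffer stands in for a recycled heap allocation so that the behavior is deterministic, and the names `store_secret` and `build_reply` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define REPLY_SIZE 16

/* Stand-in for a recycled allocation: whatever the previous user wrote
   is still present when the next user receives the buffer. */
static unsigned char recycled[REPLY_SIZE];

/* The previous user stores sensitive data and releases the memory
   without clearing it. */
void store_secret(const char *secret) {
    size_t n = strlen(secret);
    assert(n <= REPLY_SIZE);
    memcpy(recycled, secret, n);
}

/* The next user writes only a 2-byte status but sends the whole buffer
   across the boundary. Bytes 2..REPLY_SIZE-1 are never written by this
   function, so the receiver sees the stale data. */
size_t build_reply(unsigned char *out) {
    recycled[0] = 'O';
    recycled[1] = 'K';
    memcpy(out, recycled, REPLY_SIZE);
    return REPLY_SIZE;
}
```

Nothing here crashes or reads out of bounds, which is why this class of bug is easy to miss: the only symptom is that the receiver gets bytes the sender never meant to share.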
Detecting Uninitialized Memory Vulnerabilities
Compared to memory corruption vulnerabilities such as heap overflows and use-after-frees, uninitialized memory vulnerabilities on their own do not access memory out of bounds or out of scope. This makes them slightly more complicated to detect than memory corruption vulnerabilities. While direct uninitialized memory usage can cause a crash and can be detected, information leakage doesn’t usually cause any crashes. Detecting it requires compiler instrumentation such as MemorySanitizer or binary instrumentation/recompilation tools such as Valgrind.
Detour: Detecting Uninitialized Memory in Linux
Let's take a little detour and look at detecting uninitialized memory in Linux, then compare it with Windows’ built-in capabilities. Even though compilers warn about some uninitialized variables, most of the complicated cases of uninitialized memory usage are not detected at compile time. For these, we can use a run-time detection mechanism. MemorySanitizer is a Clang compiler instrumentation that detects uninitialized memory reads. A sample of how it works is given in Figure 1.
$ cat sample.cc
int main()
$ clang++ -fsanitize=memory -fno-omit-frame-pointer -g sample.cc
$ ./a.out
SUMMARY: MemorySanitizer: use-of-uninitialized-value (/home/dan/uni/a.out+0x496db8)
Figure 1: MemorySanitizer detection of uninitialized memory
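For readers who want to reproduce this kind of report, here is a minimal sketch of code that branches on an uninitialized heap value. It is an illustration only and not necessarily the program used in Figure 1; the function name `fourth_is_zero` is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Returns 1 if arr[3] is zero. When `initialize` is 0 the array is left
   as malloc returned it, so the comparison below depends on
   indeterminate data. */
int fourth_is_zero(int initialize) {
    int *arr = malloc(10 * sizeof *arr);
    assert(arr != NULL);
    if (initialize) {
        for (int i = 0; i < 10; i++) {
            arr[i] = 0;
        }
    }
    int result = (arr[3] == 0);
    if (result) {            /* branch on a possibly uninitialized value */
        puts("zero");
    }
    free(arr);
    return result;
}
```

Calling `fourth_is_zero(0)` from a program built with `clang -fsanitize=memory -fno-omit-frame-pointer -g` is expected to produce a use-of-uninitialized-value report at the branch, while `fourth_is_zero(1)` should run cleanly.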
Similarly, Valgrind can also be used to detect uninitialized memory at run time.
Detecting Uninitialized Memory in Windows
Compared to Linux, Windows lacks any built-in mechanism for detecting uninitialized memory usage. While Visual Studio and Clang-cl recently introduced AddressSanitizer support, MemorySanitizer and other sanitizers are not implemented as of this writing.
Some tools that are useful for detecting memory corruption vulnerabilities in Windows, such as PageHeap, do not help in detecting uninitialized memory. On the contrary, PageHeap fills memory allocations with patterns, which essentially makes them initialized.
There are a few third-party tools, including Dr. Memory, that use binary instrumentation to detect memory safety issues such as heap overflows, uninitialized memory usage, use-after-frees, and others.
Detecting Uninitialized Memory in Image Decoding
Detecting uninitialized memory in Windows usually requires binary instrumentation, especially when we do not have access to source code. One of the indicators we can use to detect uninitialized memory usage, specifically in the case of image decoding, is the resulting pixels after the image is decoded.
When an image is decoded, the result is a set of raw pixels. If image decoding uses any uninitialized memory, some or all of the pixels may end up random. In simpler words, decoding an image multiple times may produce different output each time if uninitialized memory is used. This difference in output can be used to detect uninitialized memory and to aid in writing a fuzzing harness targeting Windows image decoders. An example fuzzing harness is presented in Figure 2.
#define ROUNDS 20

unsigned char* DecodeImage(char *imagePath)
{
    // use GDI or WIC to decode image and get the resulting pixels
    ...
    return pixels;
}

void Fuzz(char *imagePath)
{
    ...
    if(refPixels != NULL)
    ...
}
Figure 2: Diff harness
The idea behind this fuzzing harness is not entirely new; previously, lcamtuf used a similar idea to detect uninitialized memory in open-source image parsers and used a web page to display the pixel differences.
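The diffing idea can be sketched in more detail as follows. This is a minimal illustration under stated assumptions, not the harness used in this research: the `decode_fn` callback signature, `decodes_differ`, and the two stand-in decoders are hypothetical, and a real harness would wrap GDI or WIC behind the callback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define ROUNDS 20

/* Hypothetical decoder signature: returns a malloc'd pixel buffer and
   stores its length in *size, or returns NULL on failure. */
typedef unsigned char *(*decode_fn)(const char *path, size_t *size);

/* Returns 1 if repeated decodes of the same file disagree, 0 otherwise. */
int decodes_differ(decode_fn decode, const char *path) {
    assert(decode != NULL);
    size_t ref_size = 0;
    unsigned char *ref = decode(path, &ref_size);
    if (ref == NULL) {
        return 0;   /* undecodable file: nothing to compare */
    }
    int differ = 0;
    for (int i = 0; i < ROUNDS && !differ; i++) {
        size_t cur_size = 0;
        unsigned char *cur = decode(path, &cur_size);
        if (cur == NULL || cur_size != ref_size ||
            memcmp(ref, cur, ref_size) != 0) {
            differ = 1;
        }
        free(cur);
    }
    free(ref);
    return differ;
}

/* Fuzzer entry point: crash on unstable output so the fuzzer keeps the file. */
void Fuzz(decode_fn decode, const char *path) {
    if (decodes_differ(decode, path)) {
        abort();
    }
}

/* Stand-in decoder: always returns the same four pixel bytes. */
unsigned char *stable_decoder(const char *path, size_t *size) {
    (void)path;
    unsigned char *p = malloc(4);
    if (p == NULL) return NULL;
    memset(p, 0x7F, 4);
    *size = 4;
    return p;
}

/* Stand-in decoder: the last byte changes on every call, imitating a
   pixel that the decoder never wrote. */
unsigned char *unstable_decoder(const char *path, size_t *size) {
    static unsigned char stale = 0;
    (void)path;
    unsigned char *p = malloc(4);
    if (p == NULL) return NULL;
    memset(p, 0x7F, 3);
    p[3] = stale++;
    *size = 4;
    return p;
}
```

Crashing deliberately on a mismatch is a design choice: most fuzzers only save inputs that crash or hang, so turning "output differs" into a crash lets an unmodified fuzzer collect the interesting files.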
Fuzzing
With the diffing harness ready, one can proceed to identify the supported image formats and gather corpora. Gathering image files is relatively easy given their near-unlimited availability on the internet, but it is harder to pick out, from millions of files, a good corpus in which each file contributes unique code coverage. Code coverage information for Windows image parsing is tracked from WindowsCodecs.dll.
Note that unlike regular Windows fuzzing, we will not be enabling PageHeap this time as PageHeap “initializes” the heap allocations with patterns.
Results
During my research, I found three cases of uninitialized memory usage while fuzzing Windows built-in image parsers. Two of them are explained in detail in the next sections. Root cause analysis of uninitialized memory usage is non-trivial: there is no crash location to trace back from, so we have to trace back from the resulting pixel buffer to find the root cause, or use clever tricks to find the deviation.
CVE-2020-0853
Let’s look at the rendering of the proof of concept (PoC) file before going into the root cause of this vulnerability. For this we will use lcamtuf’s HTML page, which loads the PoC image multiple times and compares the pixels with reference pixels.
Figure 3: CVE-2020-0853
As we can see from the resulting images (Figure 3), the output varies drastically in each decoding and we can assume this PoC leaks a lot of uninitialized memory.
To identify the root cause of these vulnerabilities, I used Time Travel Debugging (TTD) extensively. Tracing back the execution and keeping track of the memory address is a tedious task, but TTD makes it only slightly less painful by keeping the addresses and values constant and providing unlimited forward and backward executions.
After spending quite a bit of time debugging the trace, I found the source of uninitialized memory in windowscodecs!CFormatConverter::Initialize. Even though the source was found, it was not initially clear why this memory ends up in the calculation of pixels without getting overwritten at all. To solve this mystery, I did additional debugging, comparing the PoC execution trace against a normal TIFF file decoding. The following section shows the allocation, the copying of the uninitialized value into the pixel calculation, and the actual root cause of the vulnerability.
Allocation and Use of Uninitialized Memory
windowscodecs!CFormatConverter::Initialize allocates 0x40 bytes of memory, as shown in Figure 4.
0:000> r
//Uninitialized memory after allocation:
Figure 4: Allocation of memory
The memory never gets written, and the uninitialized values are inverted in windowscodecs!CLibTiffDecoderBase::HrProcessCopy and further processed in windowscodecs!GammaConvert_16bppGrayInt_128bppRGBA and in scaling functions called later.
As there is no read or write into the uninitialized memory before HrProcessCopy, I traced the execution back from HrProcessCopy and compared the execution traces with a normal TIFF decoding trace. A difference was found in the way windowscodecs!CLibTiffDecoderBase::UnpackLine behaved with the PoC file compared to a normal TIFF file, and one of the function parameters in UnpackLine was a pointer to the uninitialized buffer.
The UnpackLine function has a series of switch-case statements working with bits per sample (BPS) of TIFF images. In our PoC TIFF file, the BPS value is 0x09—which is not supported by UnpackLine—and the control flow never reaches a code path that writes to the buffer. This is the root cause of the uninitialized memory, which gets processed further down the pipeline and finally shown as pixel data.
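The pattern described above can be sketched with a small stand-in function. This is not the windowscodecs code: `unpack_line`, its parameters, and the set of supported BPS values (4 and 8) are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical unpacker. Expands `n` source bytes into `dst` according to
   bits per sample and returns the number of bytes written. For an
   unsupported `bps`, no case matches and `dst` is left untouched. */
size_t unpack_line(unsigned bps, const unsigned char *src, size_t n,
                   unsigned char *dst) {
    assert(src != NULL && dst != NULL);
    switch (bps) {
    case 4:
        /* Two samples per byte, each scaled from 0..15 to 0..255. */
        for (size_t i = 0; i < n; i++) {
            dst[2 * i]     = (unsigned char)((src[i] >> 4) * 17);
            dst[2 * i + 1] = (unsigned char)((src[i] & 0x0F) * 17);
        }
        return 2 * n;
    case 8:
        memcpy(dst, src, n);
        return n;
    }
    return 0;   /* unsupported BPS: nothing was written to dst */
}
```

A caller that ignores the return value and keeps processing `dst` after an unsupported BPS ends up consuming whatever the buffer held before the call, which mirrors the behavior observed with the PoC file.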
Patch
After I presented my analysis to Microsoft, they decided to patch the vulnerability by treating files with unsupported BPS values as invalid. This avoids all decoding and rejects the file at a very early phase of loading.
CVE-2020-1397
Figure 5: Rendering of CVE-2020-1397
Unlike the previous vulnerability, the difference in the output is quite limited in this one, as seen in Figure 5. One of the simpler root cause analysis techniques that can be used to figure out a specific type of uninitialized memory usage is comparing execution traces of runs that produce two different outputs. This technique is helpful when an uninitialized variable causes a control flow change in the program and that change causes a difference in the outputs. For this, a binary instrumentation script was written, which logged every instruction executed along with its register and accessed memory values.
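Once two such logs exist, the comparison itself is simple. The sketch below assumes each trace has already been reduced to an array of instruction pointer values; the function name `first_divergence` and that in-memory representation are hypothetical and say nothing about the format of the actual instrumentation log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the index of the first position where the two traces hold
   different instruction pointers. If one trace is a prefix of the other
   (or both are identical), returns the length of the shorter trace. */
size_t first_divergence(const uint64_t *a, size_t a_len,
                        const uint64_t *b, size_t b_len) {
    assert(a != NULL && b != NULL);
    size_t n = a_len < b_len ? a_len : b_len;
    for (size_t i = 0; i < n; i++) {
        if (a[i] != b[i]) {
            return i;
        }
    }
    return n;
}
```

The entry just before the returned index is the last instruction both runs executed in common, which is typically the conditional branch whose outcome depended on the uninitialized value.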
Diffing two distinct execution traces by comparing the instruction pointer (RIP) value, I found a control flow change in windowscodecs!CCCITT::Expand2DLine due to the use of an uninitialized value. Tracing the uninitialized value back using the TTD trace was exceptionally useful for finding the root cause. The following section shows the allocation, population and use of the uninitialized value, which leads to the control flow change and deviation in the pixel outputs.
Allocation
windowscodecs!TIFFReadBufferSetup allocates 0x400 bytes of memory, as shown in Figure 6.
windowscodecs!TIFFReadBufferSetup:
0:000> k
After allocation:
//Uninitialized memory after allocation
Figure 6: Allocation of memory
Partially Populating the Buffer
0x10 bytes are copied from the input file to this allocated buffer by TIFFReadRawStrip1. The rest of the buffer remains uninitialized with random values, as shown in Figure 7.
if ( !TIFFReadBufferSetup(v2, a2, stripCount) ) {
0:000> r
0:000> db 00000297`44382140
Figure 7: Partial population of memory
Use of Uninitialized Memory
0:000> r
0:000> db 00000297`44382140
0:000> k
Figure 8: Reading of uninitialized value
Depending on the uninitialized value (Figure 8), different code paths are taken in Expand2DLine, which will change the output pixels, as shown in Figure 9.
{
  {
    if ( v11 != 1 || a2 )
    {
      unintValue = *++allocBuffer | (unintValue << 8); // uninit mem read
    }
    else
    {
      unintValue <<= 8;
      ++allocBuffer;
    }
    --v11;
    v16 += 8;
  }
  v29 = unintValue >> (v16 - 8);
  dependentUninitValue = *(l + 2i64 * v29);
  v16 -= *(l + 2i64 * v29 + 1);
  if ( dependentUninitValue >= 0 ) // path 1
    break;
  if ( dependentUninitValue < '\xC0' )
    return 0xFFFFFFFFi64; // path 2
}
if ( dependentUninitValue <= 0x3F ) // path xx
  break;
Figure 9: Use of uninitialized memory in if conditions
Patch
Microsoft decided to patch this vulnerability by replacing malloc with calloc, which initializes the allocated memory with zeros.
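The effect of that change can be sketched as follows. This is an illustration of the general technique, not the patched windowscodecs code: the function name `read_strip_zeroed` is hypothetical, and the sizes simply reuse the 0x400-byte buffer and 0x10-byte copy described above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BUF_SIZE 0x400

/* Allocates BUF_SIZE zeroed bytes and copies `n` input bytes to the
   front. With malloc in place of calloc, the bytes from offset `n`
   onward would hold indeterminate values instead of zeros. */
unsigned char *read_strip_zeroed(const unsigned char *input, size_t n) {
    assert(input != NULL && n <= BUF_SIZE);
    unsigned char *buf = calloc(BUF_SIZE, 1);
    if (buf == NULL) {
        return NULL;
    }
    memcpy(buf, input, n);
    return buf;
}
```

Zeroing does not make a short read correct, but it makes the result deterministic: every decode of the same file now sees the same bytes, so no stale heap content can reach the output pixels.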
Conclusion
Part Two of this blog series presents multiple vulnerabilities in Windows’ built-in image parsers. In the next post, we will explore newer supported image formats in Windows such as RAW, HEIF and more.