Two more bugs in VAC module family 20150609-214236

Sometimes it feels like the stream of bugs will never end…

The first problem is again with the way VAC tries to get the overlay of a file and add it as a faux section (presumably so it participates in the section hashing etc. without any special handling). You may remember a different bug from the same function which I blogged about earlier. This time the problem is in the way files with zero sections are handled. If the number of sections in a file is S, then S+1 section headers are allocated and VAC accesses section S-1 to work out where the previous last section ends, so it can add the overlay as the new last section. For a file with zero sections there is no last section, so the index is -1 and the read lands before the start of the buffer (a read buffer underflow).

.text:10006D16 014 2B DD sub ebx, ebp
; Increment NumberOfSections
.text:10006D18 014 66 FF 86 26 22 00 00 inc word ptr [esi+2226h]
.text:10006D1F 014 0F B7 86 26 22 00 00 movzx eax, word ptr [esi+2226h]
.text:10006D26 014 6B C0 28 imul eax, 28h
.text:10006D29 014 50 push eax ; dwBytes
.text:10006D2A 018 FF B6 08 24 00 00 push dword ptr [esi+2408h] ; lpMem
; Allocate memory for NumberOfSections * sizeof(IMAGE_SECTION_HEADER)
.text:10006D30 01C E8 B5 D9 FF FF call HeapAllocOrRealloc
.text:10006D35 01C 0F B7 96 26 22 00 00 movzx edx, word ptr [esi+2226h]
.text:10006D3C 01C 59 pop ecx
.text:10006D3D 018 89 86 08 24 00 00 mov [esi+2408h], eax
.text:10006D43 018 59 pop ecx
.text:10006D44 014 8D 4A FE lea ecx, [edx-2] 
.text:10006D47 014 6B F1 28 imul esi, ecx, 28h
.text:10006D4A 014 6B CA 28 imul ecx, edx, 28h
; ESI now holds the pointer to the previous last section, which in the case of a zero 
; section file lies before the start of the buffer (remember that NumberOfSections 
; is now 1, and we're accessing index NumberOfSections-2)
.text:10006D4D 014 03 F0 add esi, eax
.text:10006D4F 014 83 C1 D8 add ecx, 0FFFFFFD8h
.text:10006D52 014 03 C1 add eax, ecx
.text:10006D54 014 89 44 24 10 mov [esp+14h+overlay_section_header], eax
; Read from the potentially invalid pointer
.text:10006D58 014 8B 46 08 mov eax, [esi+8]
.text:10006D5B 014 3B 46 10 cmp eax, [esi+10h]
.text:10006D5E 014 73 0B jnb short loc_10006D6B
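
To make the index arithmetic concrete, here is a small Python model of the offset calculation in the listing above. It is only a sketch, not VAC's code: the function names are made up for illustration, and only the 0x28 header size and the NumberOfSections-2 index are taken from the disassembly.

```python
SECTION_HEADER_SIZE = 0x28  # sizeof(IMAGE_SECTION_HEADER), matches the imul by 28h


def prev_last_section_offset(original_section_count):
    """Byte offset, relative to the header array, of the 'previous last section'.

    Mirrors the listing: NumberOfSections is incremented first, then index
    NumberOfSections-2 is used. The helper name is hypothetical.
    """
    new_count = original_section_count + 1
    return (new_count - 2) * SECTION_HEADER_SIZE


def prev_last_section_offset_checked(original_section_count):
    """Same calculation with one possible guard for zero-section files."""
    if original_section_count == 0:
        return None  # there is no previous section to measure the overlay against
    return prev_last_section_offset(original_section_count)
```

With zero sections the offset comes out as -0x28, i.e. one header before the start of the allocation, which is the region the reads at [esi+8] and [esi+10h] then touch.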

The second bug occurs in the string scanning routine. This one I haven’t finished investigating yet, so I could be wrong, but basically the problem appears to be due to a signed/unsigned integer mismatch. The outer part of the algorithm does some signed math to calculate the length of data to process, and whilst it does a check for negative values, it then does another subtraction afterwards and doesn’t check again. The inner part of the algorithm then treats the length as unsigned and clamps it to a maximum length of 0x40, so if the string being processed is near the end of the buffer, a read overflow will occur because 0x40 bytes are read regardless of the remaining buffer length. I may write more about this in the future once I put some time aside to understand it better, but right now I can’t be bothered because it’s not particularly interesting.

.text:10002FFA 010 85 C9 test ecx, ecx
.text:10002FFC 010 7E 46 jle short loc_10003044
.text:10002FFE 010 8D 04 0A lea eax, [edx+ecx]
.text:10003001 010 89 45 00 mov [ebp+0], eax
.text:10003004 010 2B 44 24 18 sub eax, [esp+10h+pe_buffer_len]
.text:10003008 010 85 C0 test eax, eax
.text:1000300A 010 7E 02 jle short loc_1000300E
; ECX is used as the length in the function which actually reads out the string
; from the PE file buffer and writes it to the output buffer, but it could be 
; negative after this instruction, due to insufficient validation.
.text:1000300C 010 2B C8 sub ecx, eax
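
A rough Python model of that signed/unsigned mismatch follows. Treat it as a sketch under assumptions: reading edx as an offset into the PE buffer is a guess, the function names are invented, and the 0x40 clamp and unsigned treatment come from the description of the inner routine rather than from this listing.

```python
MASK32 = 0xFFFFFFFF
MAX_STRING_LEN = 0x40  # clamp applied by the inner routine


def outer_length(offset, length, buffer_len):
    """Signed length adjustment as in the listing (offset stands in for edx)."""
    if length <= 0:              # test ecx, ecx / jle
        return 0
    overshoot = (offset + length) - buffer_len
    if overshoot > 0:            # test eax, eax / jle
        length -= overshoot      # sub ecx, eax: no re-check, can go negative
    return length


def inner_length(length):
    """The inner routine reinterprets the length as unsigned, then clamps it."""
    return min(length & MASK32, MAX_STRING_LEN)
```

With offset 0x110, length 0x20 and a 0x100-byte buffer, the adjusted length is -0x10, which the unsigned clamp turns into a full 0x40-byte read.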

Bug #6 in VAC module family 20150609-214236

Found another bug in VAC over the weekend. Once again, it can result in an access violation. This time it’s during the parsing of resource data entries. For each entry, VAC attempts to get the size and read its entire contents into a buffer for later processing. However, for extremely large size values the attempt to allocate memory for the buffer will fail, and since no null check is performed the result is an attempted write to a null pointer.

.text:10003CF7 020 57 push edi ; dwBytes
.text:10003CF8 024 E8 BF 09 00 00 call HeapAllocWrapper
.text:10003CFD 024 8B 74 24 30 mov esi, [esp+24h+resource_buffer_size]
.text:10003D01 024 59 pop ecx
.text:10003D02 020 89 03 mov [ebx], eax ; No null check
.text:10003D04 020 8B CD mov ecx, ebp ; this
.text:10003D06 020 57 push edi ; size
.text:10003D07 024 89 3E mov [esi], edi
.text:10003D09 024 FF 33 push dword ptr [ebx] ; buffer
.text:10003D0B 028 E8 11 E5 FF FF call PeFile__ReadData ; Crash in memcpy in here
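
For contrast, this is roughly what the missing check looks like, modelled in Python. The allocator here is a stand-in that returns None for oversized requests (the heap_limit parameter and both function names are hypothetical), so it only illustrates the control flow, not VAC's actual wrappers.

```python
def alloc_buffer(size, heap_limit):
    """Stand-in for a heap allocator: returns None when the request cannot be met."""
    if size > heap_limit:
        return None
    return bytearray(size)


def read_resource(file_data, offset, size, heap_limit=1 << 20):
    """Allocate a buffer for a resource entry and copy its contents into it."""
    buf = alloc_buffer(size, heap_limit)
    if buf is None:
        return None  # the check that is absent from the listing
    chunk = file_data[offset:offset + size]
    buf[:len(chunk)] = chunk
    return bytes(buf)
```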

SteamService hangs and VAC family 20151001-202837 (plus a bonus rant because I’m sleep deprived)

A few friends and I have been having problems recently with SteamService not quitting properly, which results in Steam being in a semi-broken state if you restart it (to change accounts or whatever). One very annoying observable side-effect of this is if you change accounts and then try to play a CSGO game, you will get an error message something along the lines of “not connected to matchmaking servers”. The workaround is to close Steam, force terminate SteamService (and any other lingering Steam processes), then restart it.

At the time, I noticed in the debug tracing (I have DebugView running pretty much 24/7) when shutting down Steam that there were mentions of some of the client module manager threads not shutting down cleanly. That led me to believe it may have been VAC related, but I didn’t really think much of it because while it was pretty annoying I couldn’t be bothered looking into it. Plus, given it was happening to so many of my friends I figured Valve would notice it soon enough and push out an update to fix it, so I went on with my life.

Fast forward to yesterday and I’m doing some testing/fuzzing of the latest VAC update, when lo and behold I stumble upon a way to make it hang. It turns out that the way VAC is using NtQueryObject is susceptible to hangs… See this SysInternals thread for more information. Unfortunately I haven’t had SteamService hang since realizing this (and I forgot to take dumps at the time) so I can’t confirm 100% that this is what’s causing it, but it would be a pretty big coincidence if it isn’t… If someone does encounter this please grab a dump using ProcDump and send it to me.
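
One general way to stop a call that may block forever from wedging shutdown is to run it on a worker thread and give up after a timeout. The sketch below shows that pattern in Python with a stand-in for the blocking call. It does not call NtQueryObject, and the helper name is made up.

```python
import threading


def call_with_timeout(func, timeout_s, *args):
    """Run func(*args) on a daemon thread; return (completed, result)."""
    result = {}

    def worker():
        result["value"] = func(*args)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)
    if thread.is_alive():
        # The call is still blocked; abandon it rather than hang the caller.
        return False, None
    return True, result.get("value")
```

Note that the abandoned thread stays blocked in the background, so this only keeps the caller responsive. It does not unstick the underlying call.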

<rant>

Literally all I needed to do to find this bug was to test the first scan in the module against every process in my system. Nothing fancy, just literally testing against regular processes running on a pretty ordinarily configured machine… Valve, please, I’m begging you… DO SOME DAMN TESTING (also, doing some basic research of the APIs you’re calling would help too, that forum thread is one of the top Google results for NtQueryObject). I’m sick of getting unnecessary VAC Authentication Errors because of stupid bugs which can be found in an afternoon of testing by somebody who is working purely with binaries (it would be 10x easier with source code – hint hint).

By the way, your Australian CSGO servers suck ass, it’s not fun for them to be marked as “High Load” every night and get 5ms var every second game.

Oh, and why is Process Hacker on the list of “Common Conflicts” for VAC? I don’t know what the specific reason for the conflict is, but as developers you guys should know how useful tools like that are in general, and that they’re not cheat-specific at all. I should be able to run Process Hacker and not get kicked from my game.

</rant>

EDIT:

Finally got a repro, and I’ve been able to confirm that I’m right.

[5764] Thread "ClientModuleManagerPool:0" (ID 1608) failed to shut down
 3 Id: 1684.648 Suspend: 0 Teb: 7e8c6000 Unfrozen
 # Memory ChildEBP RetAddr 
00 00cfac8c 02b828b2 ntdll!NtQueryObject(void)+0xc [d:\th.obj.x86fre\minkernel\ntdll\wow6432\objfre\i386\usrstubs.asm @ 221]
WARNING: Frame IP not in any known module. Following frames may be wrong.
01 e4 00cfad70 02b84a89 0x2b828b2
02 17bc 00cfc52c 00000000 0x2b84a89

Sigh…

VAC module family 20151001-202837 overview

Pretty boring update overall. Three scans in this one.

The first scan is the main one, it returns information about a process like path, command line, version info, current directory, hashes, and more interestingly, the PIDs of any process handles it has open. One other thing that makes this scan slightly interesting is the fact that the class used to grab things like the command line, current directory, etc. now supports x64 processes (via the WoW64 memory reading APIs). However, the PE file class still hasn’t been updated with x64 support, so the information returned will still be limited in this case.

The second scan I haven’t seen called yet, but appears to simply try and return the raw contents of a PE file. I say ‘try’ because at first glance it appears to be broken. I doubt this will be called/used, because if it is it will cause a crash (at least as far as I can tell from static analysis). I need to wait for an input packet (or spoof one, which is what I’ll probably end up doing) in order to confirm though.

The third scan I also haven’t seen called yet. It returns information on the contents of a directory: the number of files, the number of directories, and the names of all files/dirs (plus a count of how many names were able to fit in the buffer). It also returns the file sizes, but in a way that appears to be targeted at a specific cheat, because it doesn’t just return the raw file size like you’d expect. Instead it contains some logic to return a special value if the file size is greater than or equal to 0x3F400, otherwise it returns the file size divided by 0x400 plus 1, or a different special value in the case of a directory. This will be another one where I’ll probably have to spoof an input packet to confirm.
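
The size encoding described for the third scan can be sketched like this in Python. The two marker values and the order of the checks are placeholders chosen for illustration (the actual special values aren't given above), and the function name is hypothetical.

```python
SIZE_THRESHOLD = 0x3F400
BLOCK_SIZE = 0x400


def encode_entry_size(size, is_directory, large_marker=0xFF, dir_marker=0x00):
    """Encode a directory entry's size the way the third scan appears to.

    large_marker and dir_marker are hypothetical placeholder values.
    """
    if is_directory:
        return dir_marker
    if size >= SIZE_THRESHOLD:
        return large_marker
    return size // BLOCK_SIZE + 1
```

Under this scheme ordinary files map to values 1 through 253, since 0x3F3FF // 0x400 + 1 is 253. That range would fit in a single byte, although the width of the field isn't something I've confirmed.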

 

Yet another bug in VAC module family 20150609-214236

Found yet another bug in VAC module family 20150609-214236, and again it can be used to prevent VAC from successfully scanning a file.[1] This time it’s a divide-by-zero exception (EXCEPTION_INT_DIVIDE_BY_ZERO). The problem lies in the function used by VAC to read the headers and overlay (to initialize the PE file class). Basically, VAC is attempting to add the file overlay (if any) as a fake section in its PE class (presumably to make the hashing/scanning/etc logic simpler as no special handling for overlays is needed after doing this), and to do so it needs to align the new section correctly. However the alignment calculation function involves a division and it doesn’t check whether IMAGE_OPTIONAL_HEADER::SectionAlignment is non-zero before using it as the divisor.

Alignment function here:

.text:10002782 mov eax, [esp+value]
.text:10002786 xor edx, edx
.text:10002788 mov ecx, [esp+alignment]
.text:1000278C dec eax
.text:1000278D add eax, ecx
.text:1000278F div ecx
.text:10002791 imul eax, ecx
.text:10002794 retn
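
In higher-level terms the function computes ((value - 1 + alignment) / alignment) * alignment. A small Python model, with Python's ZeroDivisionError standing in for EXCEPTION_INT_DIVIDE_BY_ZERO, and one possible guarded variant (the function names are mine, not VAC's):

```python
MASK32 = 0xFFFFFFFF


def align_up(value, alignment):
    """Model of the listing using 32-bit unsigned arithmetic."""
    numerator = (value - 1 + alignment) & MASK32      # dec eax / add eax, ecx
    return ((numerator // alignment) * alignment) & MASK32  # div ecx / imul eax, ecx


def align_up_checked(value, alignment):
    """Same calculation, but rejects a zero alignment instead of faulting."""
    if alignment == 0:
        return None
    return align_up(value, alignment)
```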

[1] Just a note that I haven’t actually tested what the backend reaction is to this. I was thinking about it and I’m guessing it will probably result in a kick (i.e. “VAC Authentication Error”), but I’d have to test it to be sure.

More bugs in VAC module family 20150609-214236

Found some more bugs in the VAC module I’m currently looking into deeper. Valve, please hire a tester, this is pretty basic stuff.

Bug 1:

Off-by-one error in code parsing. There’s actually more than one instance of this, but I’ll just reference one because the others are essentially identical. The remaining length of the file buffer is checked, and then an extra byte is (potentially) read off the end.

See here:

.text:100041CE cmp [esp+9Ch+var_24], 4
.text:100041D3 jl loc_1000408A
.text:100041D9 mov eax, [esp+9Ch+arg_0]
.text:100041E0 cmp dword ptr [ebx+eax+1], 80000h

First a check is done to see whether there are at least 4 bytes remaining in the buffer, but then 5 bytes are accessed (a dword read starting at offset +1).
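
The mismatch between the check and the read is easy to see in a Python model. This is a sketch with invented names, where only the "at least 4" check and the dword read at +1 come from the listing.

```python
import struct


def buggy_check(remaining):
    """The check from the listing: at least 4 bytes left."""
    return remaining >= 4


def bytes_needed(field_offset=1, field_size=4):
    """Bytes actually touched by a dword read starting at field_offset."""
    return field_offset + field_size


def read_dword_at_plus_one(buf, pos):
    """Read the dword at pos+1 only if all of its bytes are inside buf."""
    if len(buf) - pos < bytes_needed():
        return None
    return struct.unpack_from("<I", buf, pos + 1)[0]
```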

Bug 2:

Improper checking of debug directory data. This one is particularly interesting because it could be abused to prevent VAC from successfully scanning a file in some cases. Basically the problem is that when attempting to get the PDB path, VAC does not correctly check that IMAGE_DEBUG_DIRECTORY::AddressOfRawData is non-zero. This results in an integer underflow, and the attempt to move the file pointer will fail, which then results in cleaning up the PE file (strange design, but whatever). This breaks the preconditions of the type, and so a later attempt to read data from the file results in VAC attempting to use the image base of the DLL directly as the start of the PE buffer, which depending on the current memory layout will result in either garbage being read or an access violation.

Bug caused here:

.text:10007054 cmp dword ptr [ebx+2444h], 2
.text:1000705B jnz loc_100070E6
.text:10007061 mov eax, [ebx+244Ch]
.text:10007067 mov ecx, ebx
.text:10007069 push 0
.text:1000706B sub eax, ebp
.text:1000706D push eax
.text:1000706E call sub_10002CF2

Access violation caused here (inside the function call):

.text:10002C0E mov eax, [esi+2120h]
.text:10002C14 add eax, edx
.text:10002C16 add eax, ebx
.text:10002C18 push edi
.text:10002C19 push eax
.text:10002C1A lea eax, [esi+104h]
.text:10002C20 add eax, ebx
.text:10002C22 push eax
.text:10002C23 call sub_10004699
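
To illustrate the underflow in Bug 2: the `sub eax, ebp` subtracts some value from the field loaded from [ebx+244Ch] without first checking that the field is non-zero. In the Python sketch below, `delta` is a hypothetical stand-in for whatever is in ebp (I haven't pinned down what it is here), and both function names are invented.

```python
MASK32 = 0xFFFFFFFF


def seek_target(address_of_raw_data, delta):
    """Unchecked 32-bit subtraction, as in the listing."""
    return (address_of_raw_data - delta) & MASK32


def seek_target_checked(address_of_raw_data, delta):
    """One possible guard: reject a zero field or one that would wrap around."""
    if address_of_raw_data == 0 or address_of_raw_data < delta:
        return None
    return address_of_raw_data - delta
```

A zero field with a delta of 0x1000 wraps to 0xFFFFF000, which is the kind of value that would make a file pointer move fail.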

Bug 3:

0x200 bytes of heap memory are leaked on every invocation of runfunc (i.e. every time a scan is executed). It appears to be intended for use as a global function pointer array, but it’s only referenced in the initialization and cleanup functions called by runfunc, and the cleanup func doesn’t actually free the memory, it only calls the callbacks (if present).

Leak is here:

.text:10004752 push 200h ; dwBytes
.text:10004757 push 8 ; dwFlags
.text:10004759 mov dword_104145A4, 80h
.text:10004763 call ds:GetProcessHeap
.text:10004769 push eax ; hHeap
.text:1000476A call ds:HeapAlloc
.text:10004770 mov dword_104145A0, eax
.text:10004775 retn

This particular bug appears to be present in all active VAC modules (I didn’t look comprehensively, but I took a very quick look and I don’t remember noticing any notable differences when doing my analysis of individual modules either).

Bug in VAC module family 20150609-214236

I just discovered a funny bug while reversing VAC. Not the first I’ve found, and I’m sure it won’t be the last, but this one was more interesting than most as it causes the results of the scan to be non-deterministic across variants within the same family.

The problem lies in part of the string scanning routines. Specifically, there is a global array which is used to classify bytes into different categories, and this is used in part to detect when to terminate strings. The bug is that when accessing this array (in one instance, but not the other), VAC is sign-extending the byte instead of zero-extending it, causing an out-of-bounds read. Compounding this is the fact that the data that lies before the global array being accessed is actually part of the variant-specific volatile data (scan ids, encryption keys, etc), which means that the byte being read and treated as the classification/category for the current byte in the string is actually different for each variant, causing strings to sometimes not be terminated when they should be.

The offending code can be seen here:

.text:10002EDD loc_10002EDD: ; CODE XREF: sub_10002EA1:loc_10002F35j
.text:10002EDD movsx eax, [esp+edi+50h+var_40]
.text:10002EE2 movzx eax, byte_1000E0D8[eax]
.text:10002EE9 cmp eax, 8
.text:10002EEC jb short loc_10002EF4
.text:10002EEE cmp ecx, 0FFFFFFFFh
.text:10002EF1 cmovz ecx, edi

The correct usage can be seen here:

.text:10003DF2 loc_10003DF2: ; CODE XREF: sub_10003D32+596j
.text:10003DF2 mov ecx, [esp+9Ch+arg_0]
.text:10003DF9 imul ebp, 21h
.text:10003DFC movzx ecx, byte ptr [ebx+ecx]
.text:10003E00 mov [esp+9Ch+var_8], ecx
.text:10003E07 add ebp, ecx
.text:10003E09 mov [esp+9Ch+var_20], ebp
.text:10003E0D movzx ebp, byte_1000E0D8[ecx]
.text:10003E14 mov [esp+9Ch+var_4], ebp
.text:10003E1B cmp ebp, 9
.text:10003E1E jbe loc_10003EC2

This explains why you will sometimes see differing results for scans of the same file/region when all that has changed is the VAC module variant.
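
The difference between the two listings can be modelled in a few lines of Python. This is a toy layout, assuming some variant-specific bytes sit directly in front of a 256-entry classification table. The contents of both are invented, and only the movsx versus movzx behaviour comes from the disassembly.

```python
def movsx_byte(b):
    """Sign-extend a byte: 0x80..0xFF become -128..-1."""
    return b - 0x100 if b & 0x80 else b


def movzx_byte(b):
    """Zero-extend a byte: the value is unchanged."""
    return b


def classify(memory, table_base, byte, sign_extend):
    """Look up the category for byte in the table located at table_base."""
    index = movsx_byte(byte) if sign_extend else movzx_byte(byte)
    return memory[table_base + index]


def build_memory(variant_data, table):
    """Lay out variant-specific data immediately before the table."""
    return bytes(variant_data) + bytes(table), len(variant_data)
```

For bytes below 0x80 both lookups agree. For 0x80 and above the sign-extended lookup reads from the variant data in front of the table, so two variants with different data give different categories for the same input byte.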

This also serves as an example of why tools like Hex-Rays should be treated as an aid and not a crutch. Problems like this tend to jump out at you when you’re looking at the disassembly, but are far more subtle when you’re reading pseudo-code.

VAC Module Overview (2/2)

The time has finally arrived to round out my VAC3 module overview (previous installments here and here). Same conditions as before still apply.

I’ve been a bit lazier with this one than in previous installments because I’ve been so busy recently and I haven’t really had any good chunks of time on the weekend to work on this. I haven’t finished reversing every single corner case of the modules before posting like I did for the last two posts, however given that this is only an overview anyway it doesn’t matter. So, my apologies if this post reads like it was rushed, but that’s because it was. :)

20150326-201829: Three scans. The first simply checks whether a file specified in the scan parameters exists and is accessible. The second retrieves events from the Steam service process monitor (which uses ETW to track process starts/stops). The third retrieves information on a specific PID from the process monitor (it gets the path, which is then used to look up the VSN and file index), along with some PE header metadata of steamservice.dll, and the full data and vtable of the process monitor provider. I may go into further detail about the Steam service process monitor component in the future, but in the meantime curious parties can get started by looking for the string “Steam_{E9FD3C51-9B58-4DA0-962C-734882B19273}_Pid:%000008X” in steamservice.dll.

20150422-164633: Two scans. The first returns information on processes which have an open handle to a process (i.e. the game). This is similar in concept to the system profiler component which the 20140422-222301 module returns the data for, however it is different in implementation. Along with the actual data being returned being different, the key differences are the fact that it utilizes manual syscalls (including using native x64 code on x64 processors instead of using the OS thunking), presumably to bypass usermode API hooks, and also that it runs in Steam instead of the game which means that in most cases it is running at a higher privilege level. The second scan appears to be detecting modified cvars for the purposes of cheating. Interestingly, while this scan appears to be specific to Dota 2 (it’s the only game I found with an engine.dll which contained the necessary code for the scan to actually work) I was unable to trigger it even during a Dota 2 game, so it may be inactive at the moment.

20150609-214236: Four scans. The first returns information about the modules loaded in a process (path, name, flags indicating various properties of the file, VSN and file index, hash, base address, name and path hash, mapping size, etc.). The second returns information about a specific module or file (or memory region in the case of a manually mapped module), including path, various PE file characteristics (headers, timestamp, checksum, PDB path, icon hash, section hashes, etc.), VSN and file index, etc. It also returns strings found in the file/region, and finally it also does some “sig scanning” via function and/or string hashing and flags any matches, as well as containing some additional even more targeted checks for specific cheats. This is definitely the scan to look into if you want to know how VAC operates, because my guess is that this is how they catch the majority of cheats. Please note that I’ve been intentionally vague about the details of this particular scan so as not to allow my work to be leveraged by cheaters without them doing their own research. Contact me directly if you have a question or want more details and can prove to me you’re not just after copy-pasta for your payhack. Scan three is very similar to scan one; however, instead of returning information about modules it returns information about all the currently running processes. It appears to be inactive at the moment though because I have never seen it called and can’t think of any reasonable way to trigger it. Scan four is one I am not yet sure about. I’ve never seen it called, and reversing it statically is not making a whole lot of sense as it is doing a bunch of PE file format manipulations that don’t appear to be generally applicable, so my current suspicion is that it is targeting a specific cheat, but I still need to look deeper to confirm this.

What’s next? Finishing reversing the final corner cases and minor details/nuances of the modules so I can round off my private notes and counterpart test implementations (I like to re-implement all the scans so I can ensure my understanding is correct), especially the ones that I haven’t been able to trigger because those are obviously harder to verify (though I think I can simply spoof input data for all of them). Following that will be another long break (likely a very long one this time because I am in the process of moving overseas), and then probably some further analysis of the ‘supplementary’ VAC components (specifically the process monitor in SteamService, and the stack tracing in GameOverlayRenderer), and then finally an analysis of VAC2. I’ve also become more interested in DRM recently so who knows, maybe I’ll decide to look at CEG instead.

VAC Module Overview (1.5/2)

Time for another round of VAC info. Same conditions as last time still apply…

I was originally planning on having only two parts, but because I’ve been taking another break recently I figured I’d get what I’ve done so far out there and then follow up with the rest later on. Thankfully it shouldn’t take too much longer once I get some time again as there are only 3 left now, and two of them I have at 80%+ completion, so I only really have one which I need to start from scratch on. Anyway, on to the reason you’re actually reading this…

20140422-222301: Reports back the contents of the process map in the Steam client system profiler, which basically consists of data about processes which have/had an open handle to the game. I may go into more detail about this Steam client component in the future, but in the meantime there’s a decent explanation available here.

20140609-230714: Checks whether DEP is enabled by attempting to run code marked as NX (which should raise an AV). Also includes a debugger check using the PEB and a timing check in the event that an exception is not raised. In the event that an exception is raised it will be caught by the SEH handler inside steamservice.dll which wraps the runfunc calls and a packet is sent back containing basic information about the exception.

20141024-211529: Two scans. The first returns information about the attributes of a memory region (or regions) specified in the parameters. The second returns information about the contents of a memory region specified in the parameters; specifically it returns the top 15 entries of a histogram generated from parsing the memory as a dword array, followed by as many raw dword entries as it can fit in the output buffer. I may go more into what this is actually targeting at a later date.
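
The histogram step of that second scan can be sketched in Python as follows. Little-endian dwords, dropping any trailing partial dword, and the tie-breaking order are all assumptions here, and the function name is made up. The real output layout isn't covered.

```python
import struct
from collections import Counter


def dword_histogram_top(data, top_n=15):
    """Parse data as little-endian dwords and return the top_n (value, count) pairs."""
    usable = len(data) - (len(data) % 4)  # ignore a trailing partial dword
    values = [v for (v,) in struct.iter_unpack("<I", data[:usable])]
    return Counter(values).most_common(top_n)
```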

20150211-205214: In the ‘normal’ case it enumerates system event log entries and reports back information on service installations for kernel mode drivers (event id, record timestamp, service name, and service path). However there is also a secondary implementation which does something unknown as the APIs and data used are encrypted with a 64-bit key which I have never been sent despite my best efforts to trigger it. I believe it’s related to unsigned driver load events based off some evidence which is circumstantial at best, but more investigation needs to be done to confirm this. If you know anything please don’t hesitate to contact me as I’m very interested.

EDIT (20150905-1514):

Forgot to edit this in earlier when I originally responded to the comment… Big thanks to how02 for providing the encryption key used by the event log module (0x47140E79, 0xB8D87D93). I may revisit that module in a later post with a bit of extra detail about what’s being done, but for now I can confirm that the keys are indeed correct and, as how02 already confirmed, the code is related to unsigned driver loads.