We know how to track processes with the standard Windows audit policy option “Audit process tracking”, but Sysmon messages contain much more information to evaluate. By running Sysmon on many systems within the network and collecting all the logs in a central location, you get a database full of interesting attributes and metadata that can be statistically analyzed to identify anomalies.
Carlos Perez wrote a really good article on Sysmon, which you should check out if you’re new to Sysmon and its capabilities.
In recent years “anomaly detection” has often been used as a marketing buzzword and has lost some of its shine as a result. I am still a strong believer and often say things like “anomaly detection is the only method to detect as yet unknown threats”. In security monitoring we call it anomaly detection, antivirus vendors call it heuristics, and spam appliances express it as an “X-Spam-Score”.
Anomaly detection requires the ability to describe what is normal and exclude it from the evaluation.
With the data collected from the different Sysmon sources, this is an easy task. Sysmon includes the hash of the executable (MD5, SHA1 or SHA256) in its log entries, which enables an analyst to identify the few different versions of a certain system executable. The hash of a system program like “cmd.exe” should always be the same on all systems in your domain that run the same version of Windows. But let me give you some examples.
In a sane environment, the analysis for “cmd.exe” would look like this:
Hash - Image - Count
3C77C39347A6FA560A74587B0498FE84 - C:\WINDOWS\system32\cmd.exe - 56
AD7B9C14083B52BC532FBA5948342B98 - C:\Windows\System32\cmd.exe - 34
The following analysis includes an anomaly that is worth investigating:
Hash - Image - Count
3C77C39347A6FA560A74587B0498FE84 - C:\WINDOWS\system32\cmd.exe - 56
AD7B9C14083B52BC532FBA5948342B98 - C:\Windows\System32\cmd.exe - 34
D8B7B276710127D233ABCDB7313AAC36 - C:\WINDOWS\system32\cmd.exe - 1
Let’s take a look at two analysis examples in which I use this method to identify different anomalies.
Anomaly 1: “StickyKeys” backdoor and the like
I use my favorite log analysis system, Splunk, for the analysis. Getting the Sysmon data into Splunk is easy, as there is already a Sysmon Add-on available in the App Store. Just use the deployment manager to push the Add-on to the Splunk Forwarders and install Sysmon (see my other blog post on Sysmon for more appropriate configuration options).
Then you can run searches like this:
source="WinEventLog:Microsoft-Windows-Sysmon/Operational" NOT Image=*Sysmon.exe | dedup host,Image | stats distinct_count(Image) AS different_names,values(Image),values(host) by Hash | sort -different_names
It gives you an overview of files with the same hash but different names. It is pretty easy to spot the manipulation.
We detected a so-called “StickyKeys” backdoor: the system’s own “cmd.exe” copied over “sethc.exe”, which is located in the same folder and provides the Sticky Keys functionality right on the login screen. Replacing it with the system command line yields a shell running as SYSTEM that pops up when you RDP to a server and press Shift five times in a row. (see this blog post for more information on this backdoor)
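If you prefer to run the same check outside of Splunk, the idea behind the search can be sketched in a few lines of Python. This is only a rough sketch under assumptions: the `(host, image_path, file_hash)` tuple shape, the host names and the placeholder hash for notepad.exe are made up for illustration, and unlike the search above it compares lower-cased file names rather than full paths.

```python
import ntpath
from collections import defaultdict


def hashes_with_multiple_names(events):
    """Return hashes that were seen under more than one file name.

    `events` is an iterable of (host, image_path, file_hash) tuples,
    a hypothetical shape for exported Sysmon process-creation records.
    """
    names_by_hash = defaultdict(set)
    hosts_by_hash = defaultdict(set)
    for host, image_path, file_hash in events:
        # Use only the file name, lower-cased, so that path casing such as
        # C:\WINDOWS vs. C:\Windows does not count as a different name.
        names_by_hash[file_hash].add(ntpath.basename(image_path).lower())
        hosts_by_hash[file_hash].add(host)
    return {
        file_hash: {
            "names": sorted(names),
            "hosts": sorted(hosts_by_hash[file_hash]),
        }
        for file_hash, names in names_by_hash.items()
        if len(names) > 1
    }


# Illustrative records only: hosts are invented, the second hash is a placeholder.
events = [
    ("srv01", r"C:\Windows\System32\cmd.exe", "AD7B9C14083B52BC532FBA5948342B98"),
    ("srv02", r"C:\WINDOWS\system32\cmd.exe", "AD7B9C14083B52BC532FBA5948342B98"),
    ("srv02", r"C:\Windows\System32\sethc.exe", "AD7B9C14083B52BC532FBA5948342B98"),
    ("srv01", r"C:\Windows\System32\notepad.exe", "F" * 32),
]

suspicious = hashes_with_multiple_names(events)
for file_hash, info in suspicious.items():
    print(file_hash, info["names"], info["hosts"])
```

A hash that shows up as both cmd.exe and sethc.exe is exactly the pattern described above. Legitimate duplicates (for example renamed installers) will show up as well, so treat the output as a starting point for review.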
Anomaly 2: The Black Sheep
If you create the statistics by “Image” instead of “Hash” you’ll get an overview of the different versions of system files in use and are able to identify system file versions that are unique.
Look at the following example to get an impression of what can be done with this method.
source="WinEventLog:Microsoft-Windows-Sysmon/Operational" NOT Image=*Sysmon.exe | dedup host,Image | rex field=Image "(?<Executable>[^\\\]+)$" | eval Executable=lower(Executable) | stats count by Executable,Hash | sort +count
I am sorry, but I can’t give you a nice screenshot of what it would look like in a big environment. These are the results from only 3 different demo systems (Win2003, Win7 and Win8); to see what it would look like in an environment with hundreds or thousands of systems, see the listing below.
The result would look like this:
Hash - Image - Count
AD7B9C14083B52BC532FBA5948342B98 - cmd.exe - 1480
3C77C39347A6FA560A74587B0498FE84 - cmd.exe - 256
D8B7B276710127D233ABCDB7313AAC36 - cmd.exe - 2
Consider the image files with a low count as anomalies and try to figure out why the hash of the system executable differs from the variants on the other systems.
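What counts as a “low count” is a judgment call. One possible way to make it explicit is to flag every hash that accounts for only a tiny share of all sightings of an executable. The following Python sketch does that with the numbers from the listing above; the `rare_hashes` helper and the 1% threshold are my own assumptions for illustration, not part of Sysmon or Splunk.

```python
from collections import defaultdict


def rare_hashes(rows, max_share=0.01):
    """Return (executable, hash, count) rows whose share of all sightings
    of that executable is at most `max_share` (an assumed cut-off).
    """
    totals = defaultdict(int)
    for executable, _file_hash, count in rows:
        totals[executable.lower()] += count
    return [
        (executable.lower(), file_hash, count)
        for executable, file_hash, count in rows
        if count / totals[executable.lower()] <= max_share
    ]


# Counts taken from the listing above.
rows = [
    ("cmd.exe", "AD7B9C14083B52BC532FBA5948342B98", 1480),
    ("cmd.exe", "3C77C39347A6FA560A74587B0498FE84", 256),
    ("cmd.exe", "D8B7B276710127D233ABCDB7313AAC36", 2),
]

black_sheep = rare_hashes(rows)
print(black_sheep)
# [('cmd.exe', 'D8B7B276710127D233ABCDB7313AAC36', 2)]
```

Note that an executable seen with only a single hash always has a share of 100% and is never flagged by this sketch, so it only helps for programs that run on many systems.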
I would google the hash of the black sheep, which is “D8B7B276710127D233ABCDB7313AAC36”, and see if I can get more details. Contrary to what some may be inclined to believe, an empty Google result is NOT a good sign. If the Google results are ambiguous, you should try to figure out if these systems are somehow special, e.g. certain readout systems on embedded OS versions or systems that do not receive patches. If the findings are still suspicious, you should drop the samples in a sandbox and see how they behave.
Hope you liked it. Please give me feedback if you have actually tested this method in your environment, so that I can improve the search statements or handle false positive conditions.