New Feature in THOR v10.1 – Remote Scanning

THOR v10.1 features a mode of operation that is especially helpful in incident response or compromise assessment scenarios – remote scanning. 

Imagine that you’re in a firefighting scenario – a breach has been confirmed and management wants to have quick results on the extent of the compromise. 

The new remote scanning feature called “THOR Remote” allows you to perform triage scans on hundreds of remote systems from a single admin workstation. You can think of it as an integrated PsExec. 

No scripting, no agents, no hustle.  

Benefits

  • No agent
  • No scripting
  • Painless scans of many remote systems

Requirements

  • Available on Windows only
  • Accessible remote ports (135/tcp, 445/tcp)
  • Account with local admin rights

All you need is the new version 10.1 of THOR and a command line of an admin user with the required privileges and open Windows ports (135/tcp, 445/tcp) on the remote systems.

THOR will then switch into a new mode of operation and present a command line interface showing scan information and a scrollable pane for each log file. (see screenshot)

THOR writes the log files to a local folder on the admin workstation or sends them via SYSLOG to your SIEM system.  

You can also define a number of concurrent scans (workers) and delay the scan starts to distribute the load evenly among the target systems. This is beneficial when you scan numerous virtual machines running on a few host systems. 

A complete triage scan of your internal domain can’t be more comfortable. 

Write YARA Rules to Detect Embedded EXE Files in OLE Objects

This is the first blog post published on our new website. If you followed my blog on www.bsk-consulting.de you should consider subscribing to the RSS feed of this blog or the “Nextron Systems Newsletter”.

This is one of the YARA related blog posts showcasing a special use case. Last year I noticed that I wrote many rules for hex encoded strings found in OLE objects embedded in MS Office documents and RTF files.

I did most of the encoding and decoding work on the command line or with the help of CyberChef, an online tool provided by GCHQ. I also thought about a new YARA keyword that would allow us to write rules without encoding the strings.

Today, rules contain strings in a hex encoded form. I usually add the decoded string as a comment.

$s1 = "68007400740070003a002f002f00" /* http:// */

Rules with the new keyword would look like this:

$s1 = "http://" wide hex

Neat, isn’t it? I already forwarded that feature request to Wesley Shields (@wxs) but it seems to be no low hanging fruit. I’ll keep you informed about this feature via Twitter.

A tweet by Kevin Beaumont reminded me of the work that I’ve done and while looking at the tool by Rich Warren. I thought that I should create a illustrative example of a more generic YARA rule that explains why the “hex” keyword would be very useful.

The tool creates weaponized RTF files with hex encoded payloads.

I derived some strings for a new rule from the decoded object.

/* Hex encoded strings */
/* This program cannot be run in DOS mode */
$a1 = "546869732070726f6772616d2063616e6e6f742062652072756e20696e20444f53206d6f6465" ascii
/* C:fakepath */
$a2 = "433a5c66616b65706174685c" ascii

To further improve the rule I went to my goodware directory and ran the following command to generate a list of the most frequent PE file headers in a hex encoded form.

neo$ find ./ -type f -name "*.exe" -exec xxd -ps -l 14 {} ; | sort | uniq -c | sort -k 1 | tail -10
4 4d5a87000300000020000000ffff
4 4d5aae010300000020000000ffff
4 4d5abf000300000020000000ffff
4 4d5add000300000020000000ffff
4 4d5aeb000300000020000000ffff
6 213c73796d6c696e6b3e2f757372
8 4d5a72010200000020001700ffff
88 4d5a40000100000006000000ffff
116 4d5a50000200000004000f00ffff
5852 4d5a90000300000004000000ffff

Then I used these hex encoded strings in a YARA rule that looks for these strings in the OLE objects of an RTF file.

rule MAL_RTF_Embedded_OLE_PE {
   meta:
      description = "Detects a suspicious string often used in PE files in a hex encoded object stream"
      author = "Florian Roth"
      reference = "https://github.com/rxwx/CVE-2018-0802/blob/master/packager_exec_CVE-2018-0802.py"
      date = "2018-01-22"
   strings:
      /* Hex encoded strings */
      /* This program cannot be run in DOS mode */
      $a1 = "546869732070726f6772616d2063616e6e6f742062652072756e20696e20444f53206d6f6465" ascii
      /* KERNEL32.dll */
      $a2 = "4b45524e454c33322e646c6c" ascii
      /* C:fakepath */
      $a3 = "433a5c66616b65706174685c" ascii
      /* DOS Magic Header */
      $m3 = "4d5a40000100000006000000ffff"
      $m2 = "4d5a50000200000004000f00ffff"
      $m1 = "4d5a90000300000004000000ffff"
   condition:
      uint32be(0) == 0x7B5C7274 /* RTF */
      and 1 of them
}

The first analysis of the coverage looks pretty good. I see only clear matches in munin‘s output.

The few questionable matches look fishy enough to release my rule.

If you have further ideas to improve the rule, ping me via Twitter.

YARA Rules to Detect Uncommon System File Sizes

YARA is an awesome tool especially for incident responders and forensic investigators. In my scanners I use YARA for anomaly detection on files. I already created some articles on “Detecting System File Anomalies with YARA” which focus on the expected contents of system files but today I would like to focus on the size of certain system files.
I did a statistical analysis in order to rate a suspicious “csrss.exe” file and noticed that the size of the malicious file was way beyond the typical file size. I thought that I should do this for other typically abused file names based on this blog post by @hexacorn.
I used my VT Intelligence access and burned some searches to create this list.
System Files and Sizes

System Files and Sizes


You can find a spread sheet of this list here. It can be edited by everyone.
I created some YARA rules that use the external variable “filename” to work. LOKI and THOR use the “filename” and other external variables by default.
UPDATE 23.12.15 4:50pm:
I’ll update the list on the LOKI github page. For a current version of the YARA signatures visit this page.

rule Suspicious_Size_explorer_exe {
    meta:
        description = "Detects uncommon file size of explorer.exe"
        author = "Florian Roth"
        score = 60
        date = "2015-12-21"
    condition:
        uint16(0) == 0x5a4d
        and filename == "explorer.exe"
        and ( filesize < 1000KB or filesize > 3000KB )
}
rule Suspicious_Size_chrome_exe {
    meta:
        description = "Detects uncommon file size of chrome.exe"
        author = "Florian Roth"
        score = 60
        date = "2015-12-21"
    condition:
        uint16(0) == 0x5a4d
        and filename == "chrome.exe"
        and ( filesize < 500KB or filesize > 1300KB )
}
rule Suspicious_Size_csrss_exe {
    meta:
        description = "Detects uncommon file size of csrss.exe"
        author = "Florian Roth"
        score = 60
        date = "2015-12-21"
    condition:
        uint16(0) == 0x5a4d
        and filename == "csrss.exe"
        and ( filesize > 18KB )
}
rule Suspicious_Size_iexplore_exe {
    meta:
        description = "Detects uncommon file size of iexplore.exe"
        author = "Florian Roth"
        score = 60
        date = "2015-12-21"
    condition:
        uint16(0) == 0x5a4d
        and filename == "iexplore.exe"
        and ( filesize < 75KB or filesize > 910KB )
}
rule Suspicious_Size_firefox_exe {
    meta:
        description = "Detects uncommon file size of firefox.exe"
        author = "Florian Roth"
        score = 60
        date = "2015-12-21"
    condition:
        uint16(0) == 0x5a4d
        and filename == "firefox.exe"
        and ( filesize < 265KB or filesize > 910KB )
}
rule Suspicious_Size_java_exe {
    meta:
        description = "Detects uncommon file size of java.exe"
        author = "Florian Roth"
        score = 60
        date = "2015-12-21"
    condition:
        uint16(0) == 0x5a4d
        and filename == "java.exe"
        and ( filesize < 140KB or filesize > 900KB )
}
rule Suspicious_Size_lsass_exe {
    meta:
        description = "Detects uncommon file size of lsass.exe"
        author = "Florian Roth"
        score = 60
        date = "2015-12-21"
    condition:
        uint16(0) == 0x5a4d
        and filename == "lsass.exe"
        and ( filesize < 13KB or filesize > 45KB )
}
rule Suspicious_Size_svchost_exe {
    meta:
        description = "Detects uncommon file size of svchost.exe"
        author = "Florian Roth"
        score = 60
        date = "2015-12-21"
    condition:
        uint16(0) == 0x5a4d
        and filename == "svchost.exe"
        and ( filesize < 14KB or filesize > 40KB )
}
rule Suspicious_Size_winlogon_exe {
    meta:
        description = "Detects uncommon file size of winlogon.exe"
        author = "Florian Roth"
        score = 60
        date = "2015-12-21"
    condition:
        uint16(0) == 0x5a4d
        and filename == "winlogon.exe"
        and ( filesize < 279KB or filesize > 510KB )
}
rule Suspicious_Size_igfxhk_exe {
    meta:
        description = "Detects uncommon file size of igfxhk.exe"
        author = "Florian Roth"
        score = 60
        date = "2015-12-21"
    condition:
        uint16(0) == 0x5a4d
        and filename == "igfxhk.exe"
        and ( filesize < 200KB or filesize > 265KB )
}
rule Suspicious_Size_servicehost_dll {
    meta:
        description = "Detects uncommon file size of servicehost.dll"
        author = "Florian Roth"
        score = 60
        date = "2015-12-23"
    condition:
        uint16(0) == 0x5a4d
        and filename == "servicehost.dll"
        and filesize > 150KB
}
rule Suspicious_Size_rundll32_exe {
    meta:
        description = "Detects uncommon file size of rundll32.exe"
        author = "Florian Roth"
        score = 60
        date = "2015-12-23"
    condition:
        uint16(0) == 0x5a4d
        and filename == "rundll32.exe"
        and ( filesize < 30KB or filesize > 60KB )
}
rule Suspicious_Size_taskhost_exe {
    meta:
        description = "Detects uncommon file size of taskhost.exe"
        author = "Florian Roth"
        score = 60
        date = "2015-12-23"
    condition:
        uint16(0) == 0x5a4d
        and filename == "taskhost.exe"
        and ( filesize < 45KB or filesize > 85KB )
}
rule Suspicious_Size_spoolsv_exe {
    meta:
        description = "Detects uncommon file size of spoolsv.exe"
        author = "Florian Roth"
        score = 60
        date = "2015-12-23"
    condition:
        uint16(0) == 0x5a4d
        and filename == "spoolsv.exe"
        and ( filesize < 50KB or filesize > 800KB )
}
rule Suspicious_Size_smss_exe {
    meta:
        description = "Detects uncommon file size of smss.exe"
        author = "Florian Roth"
        score = 60
        date = "2015-12-23"
    condition:
        uint16(0) == 0x5a4d
        and filename == "smss.exe"
        and ( filesize < 40KB or filesize > 140KB )
}
rule Suspicious_Size_wininit_exe {
    meta:
        description = "Detects uncommon file size of wininit.exe"
        author = "Florian Roth"
        score = 60
        date = "2015-12-23"
    condition:
        uint16(0) == 0x5a4d
        and filename == "wininit.exe"
        and ( filesize < 90KB or filesize > 250KB )
}

I ran this rule set over my goodware database and got only a few false positives. Feel free to use these rules wherever you like but please share new rules or statistical analyses on other system files.

Yara System File Checks - False Positives

False Positives

How to Write Simple but Sound Yara Rules

How to Write Simple but Sound Yara Rules

During the last 2 years I wrote approximately 2000 Yara rules based on samples found during our incident response investigations. A lot of security professionals noticed that Yara provides an easy and effective way to write custom rules based on strings or byte sequences found in their samples and allows them as end user to create their own detection tools.
However it makes me sad to see that there are mainly two types of rules published by the researchers:

  1. rules that generate many false positives and
  2. rules that match only the specific sample and are not much better than a hash value.

I therefore decided to write an article on how to build optimal Yara rules, which can be used to scan single samples uploaded to a sandbox and whole file systems with a minimal chance of false positives.
These rules are based on contained strings and easy to comprehend. You do not need to understand the reverse engineering of executables and I decided to avoid the new Yara modules like “pe” which I still consider as “testing” features that may lead to memory leaks or other errors when used in practice.

Automatic Rule Generation

First I believed that automatically generated rules can never be as good as manually created ones. During my work for out IOC scanners THOR and LOKI I had to create hundreds of Yara rules manually and it became clear that there is an obvious disadvantage. What I used to do was to extract UNICODE and ASCII strings from my samples by the following commands:

strings -el samples.exe
strings -a sample.exe

I prefer the UNICODE strings as they are often overlooked and less frequently changed within a certain malware/tool family. Make sure that you use UNICODE strings with the “wide” keyword and ASCII strings with the “ascii” keyword in your rules and use “fullword” if there is a word boundary before and after the string. The problem with this method is that you cannot decide if the string that is returned by the commands is unique for this malware or often used in goodware samples as well.
Look at the extracted strings in the following example:

NTLMSSP
%d.%d.%d.%d
%s\IPC$
\\%s
NT LM 0.12
%s%s%s
%s.exe %s
%s\Admin$\%s.exe
RtlUpcaseUnicodeStringToOemString
LoadLibrary( NTDLL.DLL ) Error:%d

Could you be sure that the string “NT LM 0.12” is a unique one, which is not used by legitimate software?
To accomplish this task for me I developed “yarGen“, a Yara rule generator that ships with a huge string database of common and benign software. I used the Windows system folder files of Windows 2003, Windows 7 and Windows 2008 R2 server, typical software like Microsoft Office, 7zip, Firefox, Chrome, Cygwin and various Antivirus solution program folders to generate the database. yarGen allows you to generate your own database or add folders with more goodware to the existing database.
yarGen extracts all ASCII and UNICODE strings from a sample and removes all strings that do also appear in the goodware string database. Then it evaluates and scores every string by using fuzzy regular expressions and the “Gibberish Detector” that allows yarGen to detect and prefer real language over character chains without meaning. The top 20 of the strings will be integrated in the resulting rule.
Let’s look at two examples from my work. A sample of the Enfal Trojan and a SMB Worm sample.
yarGen generates the following rule for the Enfal Trojan sample:

rule Enfal_Generic {
meta:
description = "Auto-generated rule - from 3 different files"
author = "YarGen Rule Generator"
reference = "not set"
date = "2015/02/15"
super_rule = 1
hash0 = "6d484daba3927fc0744b1bbd7981a56ebef95790"
hash1 = "d4071272cc1bf944e3867db299b3f5dce126f82b"
hash2 = "6c7c8b804cc76e2c208c6e3b6453cb134d01fa41"
strings:
$s0 = "urlmon" fullword
$s1 = "Registered trademarks and service marks are the property of their respec" wide
$s2 = "Micorsoft Corportation" fullword wide
$s3 = "IM Monnitor Service" fullword wide
$s4 = "imemonsvc.dll" fullword wide
$s5 = "iphlpsvc.tmp" fullword
$s6 = "XpsUnregisterServer" fullword
$s7 = "XpsRegisterServer" fullword
$s8 = "{53A4988C-F91F-4054-9076-220AC5EC03F3}" fullword
$s9 = "tEHt;HuD" fullword
$s10 = "6.0.4.1624" fullword wide
$s11 = "#*8;-&gt;)" fullword
$s12 = "%/&gt;#?#*8" fullword
$s13 = "\\%04x%04x\" fullword
$s14 = "
3,8,18" fullword
$s15 = "
3,4,15" fullword
$s16 = "
3,7,12" fullword
$s17 = "
3,4,13" fullword
$s18 = "
3,8,12" fullword
$s19 = "
3,8,15" fullword
$s20 = "
3,6,12" fullword
condition:
all of them
}

The resulting string set contains many useful strings but also random ASCII characters ($s9, $s11, $s12) that do match on the given sample but are less likely to produce the same result on other samples of the family.
yarGen generates the following rule for the SMB Worm sample:

rule sig_smb {
meta:
description = "Auto-generated rule - file smb.exe"
author = "YarGen Rule Generator"
reference = "not set"
date = "2015/02/15"
hash = "db6cae5734e433b195d8fc3252cbe58469e42bf3"
strings:
$s0 = "LoadLibrary( NTDLL.DLL ) Error:%d" fullword ascii
$s1 = "SetServiceStatus failed, error code = %d" fullword ascii
$s2 = "%s\\Admin$\\%s.exe" fullword ascii
$s3 = "%s.exe %s" fullword ascii
$s4 = "iloveyou" fullword ascii
$s5 = "Microsoft@ Windows@ Operating System" fullword wide
$s6 = "\\svchost.exe" fullword ascii
$s7 = "secret" fullword ascii
$s8 = "SVCH0ST.EXE" fullword wide
$s9 = "msvcrt.bat" fullword ascii
$s10 = "Hello123" fullword ascii
$s11 = "princess" fullword ascii
$s12 = "Password123" fullword ascii
$s13 = "Password1" fullword ascii
$s14 = "config.dat" fullword ascii
$s15 = "sunshine" fullword ascii
$s16 = "password &lt;=14" fullword ascii
$s17 = "del /a %1" fullword ascii
$s18 = "del /a %0" fullword ascii
$s19 = "result.dat" fullword ascii
$s20 = "training" fullword ascii
condition:
all of them
}

The resulting rules are good enough to use them as they are, but they are far from an optimal solution. However it is good that so many strings have been found, which do not appear in the analyzed goodware samples.
If you don’t want to use or download yarGen, you could also use the online tool Yara Rule Generator provided by Joe Security, which was inspired by/based on yarGen.
It is not necessary to use a generator if your eye is trained and experienced. In this case just read the next section and select the strings to match the requirements of the (what I call) sufficiently generic Yara rules.

Sufficiently Generic Yara Rules

As I said in the introduction rules that generate false positives are pretty annoying. However the real tragedy is that most of the rules are far too specific to match on more than one sample and are therefore almost as useful as a file hash.
What I tend to do with the rules is to check all the strings and put them into at least 2 different categories:

  • Very specific strings = hard indicators for a malicious sample
  • Rare strings = likely that they do not appear in goodware samples, but possible
  • Strings that look common = (Optional) e.g. yarGen output strings that do not seem to be specific but didn’t appear in the goodware string database

Check out the modified rules in order to understand this splitting. Ignore the definition named $mz, I’ll explain it later and look at the string definitions below.
The definitions starting with $s contain the very specific strings, which I regard as so special that they would not appear in legitimate software. Note the typos in both strings: “Micorsoft Corportation” instead of “Microsoft Corporation” and “Monnitor” instead of “Monitor”.
The strings starting with $x seem to be special (I tend to google the strings) but I cannot say if they also appear in legitimate software. The definitions starting with $z seem to be ordinary but have not been part of the goodware string database so they have to be special in some way.

rule Enfal_Malware_Backdoor {
meta:
description = "Generic Rule to detect the Enfal Malware"
author = "Florian Roth"
date = "2015/02/10"
super_rule = 1
hash0 = "6d484daba3927fc0744b1bbd7981a56ebef95790"
hash1 = "d4071272cc1bf944e3867db299b3f5dce126f82b"
hash2 = "6c7c8b804cc76e2c208c6e3b6453cb134d01fa41"
strings:
$mz = { 4d 5a }
$s1 = "Micorsoft Corportation" fullword wide
$s2 = "IM Monnitor Service" fullword wide
$x1 = "imemonsvc.dll" fullword wide
$x2 = "iphlpsvc.tmp" fullword
$x3 = "{53A4988C-F91F-4054-9076-220AC5EC03F3}" fullword
$z1 = "urlmon" fullword
$z2 = "Registered trademarks and service marks are the property of their" wide
$z3 = "XpsUnregisterServer" fullword
$z4 = "XpsRegisterServer" fullword
condition:
( $mz at 0 ) and
(
( 1 of ($s*) ) or
( 2 of ($x*) and all of ($z*) )
)
and filesize &lt; 40000
}

Now check the condition statement and notice that I combine the rules with a magic header of an executable defined by $mz and a file size to exclude typical false positives like Antivirus signature files, browser cache or dictionary files. Set an ample file size value to avoid false negatives. (e.g. samples between 100K and 200K => set file size < 300K)
You can see that I decided that a single occurrence of one of the very specific strings would trigger that rule. ( 1 of $s* )
Than I combine a bunch of less unique strings with most or all of the ordinary looking strings. ( 2 of $x* and all of $z* )
Let’s look at second example. (see below)
$s1 is a very special string with string formatting placeholders “%s” in combination with an Admin$ share. $s2 seems to be the typical “svchost.exe” but contains the number “0” instead of an “O”, which is very uncommon and a clear indicator for something malicious.
All the definitions starting with $a are special but I cannot say for sure if they won’t appear in legitimate software. The strings defined by $x seem ordinary but were produced by yarGen, which means that they did not appear in the goodware string database.
This special example contains a list of typical passwords which is defined by $z1..z8.

rule SMB_Worm_Tool_Generic {
meta:
description = "Generic SMB Worm/Malware Signature"
author = "Florian Roth"
reference = "http://goo.gl/N3zx1m"
date = "2015/02/08"
hash = "db6cae5734e433b195d8fc3252cbe58469e42bf3"
strings:
$mz = { 4d 5a }
$s1 = "%s\\Admin$\\%s.exe" fullword ascii
$s2 = "SVCH0ST.EXE" fullword wide
$a1 = "LoadLibrary( NTDLL.DLL ) Error:%d" fullword ascii
$a2 = "\\svchost.exe" fullword ascii
$a3 = "msvcrt.bat" fullword ascii
$a4 = "Microsoft@ Windows@ Operating System" fullword wide
$x1 = "%s.exe %s" fullword ascii
$x2 = "password &lt;=14" fullword ascii
$x3 = "del /a %1" fullword ascii
$x4 = "del /a %0" fullword ascii
$x5 = "SetServiceStatus failed, error code = %d" fullword ascii
$z1 = "secret" fullword ascii
$z2 = "Hello123" fullword ascii
$z3 = "princess" fullword ascii
$z4 = "Password123" fullword ascii
$z5 = "Password1" fullword ascii
$z6 = "sunshine" fullword ascii
$z7 = "training" fullword ascii
$z8 = "iloveyou" fullword ascii
condition:
$mz at 0 and
( 1 of ($s*) and 1 of ($x*) ) or
( all of ($a*) and 2 of ($x*) ) or
( 5 of ($z*) and 2 of ($x*) ) and
filesize &lt; 200000
}

You see that I combined the string definitions in a similar way as before. This method in combination with the magic header and the file size should be a good starting point for the final stage – testing.

Testing

Testing the rules is very important. It seems that most authors decide that the rules are good enough if they match on the given samples.
You should definitely do the following checks:

  1. Scan the malware samples
  2. Scan a big goodware archive

To carry out the tests download the Yara scanner and run it from the command line. The goodware directory should include system files from various Windows versions, typical software and possible false positive sources (e.g. typical CMS software if you wrote Yara rules that match on malicious web shells)

Yara Rule Testing

Yara Rule Testing on Samples and Goodware


If the rule matched on the malicious samples and did not generate a match on the goodware archive your rule is good enough to test the rule in practice.

Update

Make sure to check Part 2 of “How to Write Simple and Sound YARA Rules”.