Coffee Space


Malware in Reversible Hash

This article is in response to an interesting discussion I had the other day about how compiled Rust confuses modern anti-virus software, and how this flaw is quite fundamental.

Objective

Malware is big business these days. It gets bundled into lots of “free” [1] and “cracked” [2] software, with the goal of compromising computers, getting information, controlling resources, etc.

Of course, in response, virus checkers will either scan binaries for common patterns found in other malware, or simulate the software and check the resources it uses.
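
To make the pattern-scanning half of that concrete, here is a toy sketch of signature matching. The signature names and byte strings are made up for illustration (they are not real malware signatures), and real scanners are far more sophisticated, but the basic limitation is the same: only bytes that have been seen before can match.

```python
# Toy signature scanner. KNOWN_SIGNATURES is a hypothetical database;
# the names and byte patterns are invented purely for illustration.
KNOWN_SIGNATURES = {
    "example-family-a": b"\xde\xad\xbe\xef\x13\x37",
    "example-family-b": b"HYPOTHETICAL_MARKER",
}


def scan(data: bytes) -> list:
    """Return the names of all known signatures found in `data`."""
    return [name for name, sig in KNOWN_SIGNATURES.items() if sig in data]


if __name__ == "__main__":
    # A blob containing a known pattern is flagged...
    print(scan(b"header" + b"HYPOTHETICAL_MARKER" + b"footer"))
    # ...while anything not already in the database comes back clean.
    print(scan(b"bytes nobody has catalogued yet"))
```

A clean result here only means "no known pattern matched", not "this is safe".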

In theory, we should be able to make a malware payload completely invisible to any scan that uses a reasonable amount of computation, so that it cannot be seen up front.

Reversible Hash

One method to hide a payload could be to have it hide in some hashing function, a really useful type of function that can have many, many implementations. One approach is to ensure your hash is reversible, such that:

$$m \equiv H'(k', H(k, m))$$

In this case $m$ is our payload, $H()$ is the hash function, $H'()$ is some reverse hash function and $k$ is the key ($k$ may be the same as $k'$).
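
As a harmless illustration of the reversible property, here is a minimal sketch of an invertible 64-bit mixing function operating on plain integers. The names `H` and `H_inv`, the constant `MULT` and the choice of $k' = k$ are my assumptions for the example, not a specific scheme from the discussion. It works because each step (XOR with the key, multiplication by an odd constant modulo $2^{64}$, and a 32-bit xor-shift) can be undone.

```python
# Minimal invertible mixer on 64-bit integers (illustrative assumptions only).
MASK = (1 << 64) - 1
MULT = 0x9E3779B97F4A7C15           # odd, so it has an inverse modulo 2**64
MULT_INV = pow(MULT, -1, 1 << 64)   # modular inverse (Python 3.8+)


def H(k: int, m: int) -> int:
    """Forward mix: XOR with key, multiply, then xor-shift."""
    x = (m ^ k) & MASK
    x = (x * MULT) & MASK
    x ^= x >> 32                    # shift by half the width is self-inverse
    return x


def H_inv(k: int, h: int) -> int:
    """Reverse mix: undo each step of H in the opposite order."""
    x = h ^ (h >> 32)
    x = (x * MULT_INV) & MASK
    return (x ^ k) & MASK


if __name__ == "__main__":
    k, m = 0x0123456789ABCDEF, 42
    assert H_inv(k, H(k, m)) == m   # round trip recovers the input
    print(hex(H(k, m)))
```

The output of `H` looks like noise, yet anyone holding `k` and `H_inv` gets the input back exactly.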

The idea is that you would create some payload, reverse compute the hash back some reasonable distance, and then have the hash forward computed on the target’s machine until some condition is reached. A nice implementation could be that it is the input to some small VM that is able to utilize more powerful functions.

When scanning, you would not see what the dangerous input is to these more powerful functions, and when simulating you would not be able to put in the computing resources to automatically detect something dangerous.

Another approach is to set up $H(k, m)$ such that the key $k$ is based on the date, so that the payload deploys on a given date. This of course relies on the machine actually being on during this period, although it does mean that all target machines are infected at the same time, minimising the chances that a response is created. That said, not having your payload run all at the same time may also reduce the probability it is detected.

The point is, until a time of your choosing, the malicious payload is not revealed.

Hiding

You might say “well, this is super obvious”. But you are not quite right.

  1. A weird hashing algorithm is also just a pseudo random number generator. Your reversible hash having high entropy is actually desirable to prevent accidental triggering.
  2. A VM can essentially be a state machine, and state machines are super popular. It is also not super uncommon for there to be some small VM in code to make certain operations easier.
  3. Powerful functions don’t have to be referenced by name. A reference could simply be an offset address that is calculated in the payload at the same time as the parameters.
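
On the first point, one way to put a number on "looks random" is the Shannon entropy of a byte string, which is something an analyst might compute over a suspicious region. This is a minimal sketch with a hypothetical helper name (`shannon_entropy`), and it is not a claim about how any particular anti-virus product works. Compressed, encrypted and hashed data all score high, which is why high entropy alone is a weak signal.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # Sum of p * log2(1/p) over the observed byte values.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


if __name__ == "__main__":
    print(shannon_entropy(b"aaaaaaaa"))        # a single repeated byte
    print(shannon_entropy(bytes(range(256))))  # every byte value exactly once
```

A single repeated byte sits at the bottom of the scale and a uniform spread of all 256 values sits at the top.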

Proof?

Yeah, I’m not going to write your malware for you…

The hope of this article is that we start considering binaries as something that is potentially really dangerous, and that anti-virus based on patterns or simulation just won’t cut it. It’s better than nothing for sure, but we still have a way to go to secure systems.

I had somebody say to me the other day “I ran the binary from the internet, but I virus scanned it first!”. My response was something like “yes, that just means that it didn’t find anything it had detected in the wild before”.


  1. There is always a cost for everything.

  2. Note that they don’t even need to deliver on this.