Escaping Phishermen Nets: Cryptographic Methods Unveiled in the Fight Against Reverse Proxy Attacks


Back in the summer of 2022, Ksandros Apostoli from the SEC Consult DFIR team started noticing a considerable number of cases in which the initial compromise of user credentials was the hardest question to answer.


In most of these cases, victims had MFA enabled in their infrastructure, yet audit logs did not immediately shed light on any anomalous activity. Empirically, when running into a wall of unexplained initial compromise, the main culprit is a phishing campaign. However, traditional phishing detection methods were not yielding any results in this new wave of incidents. For example, in an old-school phishing attempt against an MFA-protected account, attackers would very often resort to MFA fatigue to bypass the second layer of authentication, which leaves sufficient IoCs behind. Nothing similar was to be found in the logs any longer!

After doing some internal research, our team came to the same conclusion that Microsoft would announce about a month later: reverse-proxy phishing frameworks such as EvilginX, Modlishka and Muraena were to blame!

Roughly another year later, at the end of May 2023, Ksandros Apostoli from the SEC Defence team traveled to Poland to participate in x33fcon, where he attended a talk delivered by none other than the mastermind behind EvilginX, Kuba Gretzky aka @mrgretzky. x33fcon is a purple-team-themed infosec conference, and @mrgretzky's talk about EvilginX was titled "How much is the Phish?". He has also published a video showing how Evilginx catches a phish and completely bypasses MFA on Google.

Being part of a purple-team conference, the second half of the talk addressed potential defense mechanisms for reverse-proxy phishing tools such as EvilginX, with a focus on measures that can be implemented by web developers rather than end users. Listening to the proposed techniques in @mrgretzky's presentation, Ksandros came up with an idea that he shared with him after his talk. From the brief interaction they had, Kuba appeared to agree with the presented concept, even at a very high level of abstraction. Ksandros decided to put this idea down on paper and make this article about it for anyone who might want to entertain themselves with the challenging notion of phishing prevention, and the even more challenging ways of implementing it.

 


How much is the Phish?

No matter how secure your website implementation might be, or how many anti-phishing tools you might try to employ, there will always be some Phish! What makes a difference is the price of the Phish (referencing @mrgretzky), i.e. the ratio between the adversarial effort required and the quality/credibility/success rate of the phishing attempt.

To understand what is meant by this, imagine a classic phishing page written from scratch in HTML by the attacker, aiming to impersonate the target page:

diagram on phishing

This type of Phish can be cheap enough. However, the quality in most cases is poor. It can sometimes be spotted just by looking at the fonts being used. The advantage of such a phish, however, is that a web developer cannot really protect against it. As long as victims are willing to type their credentials into whatever form is put in front of them, this will always be the case.

MFA has helped mitigate this type of phishing campaign. In particular, techniques such as mobile push notification MFA with number matching can increase the effort required for classic phishing attempts dramatically, making them almost only theoretically feasible to carry out.

So, this type of old-school phish stands somewhere in the phish price chart marked with X.

modern phishing suites

Shopping list for a classic phishing page

  1. Phishing web-site implementation
  2. Creativity in choosing a domain to host your page on (e.g. micr0soft.com)
  3. Additional effort depending on the MFA implementation employed by the target (sometimes you need to perform MFA fatigue, sometimes you need to manually forward MFA challenge data to the user so that they tap the right number on their phone, etc.)

What this translates to is lower success rates, due to mostly poorly executed phishing attempts.

Fast forward to modern-day phishing, namely reverse-proxy phishing tools such as EvilginX. The difference between these tools and the classic phishing methods is simple, but still enough to have a great impact. Instead of implementing your own phishing website, you only need to register a convincing enough domain name now. EvilginX will do the rest for you. Namely, reverse-proxy phishing frameworks establish two separate TLS sessions: one with the victim, and one with the target domain. These two sessions enable these modern phishing suites to become a true malicious-in-the-middle actor, on a network level, by performing the following steps:

  1. Decrypt legitimate traffic from the target website, re-encrypt it, and forward it to the victim
  2. Decrypt legitimate traffic from the victim, re-encrypt it, and forward it to the target website

The ultimate outcome of these capabilities is that as a victim, now you receive the original website from the legitimate source, down to a network packet level. The only difference is that the MitM phishing tool has access to all this decrypted traffic as well. Such access not only enables attackers to get cleartext access to your credentials, but also enables them to flawlessly carry you through the MFA process, ultimately providing them with a valid session token.

Therefore, once you have registered your phishing domain and set up an open source reverse-proxy phishing framework, you are ready to Phish. Best of all, since you forward the original network packets to the victim, the quality of the phish is at a maximum: there is virtually no difference between your phishing page and the original page, other than the domain hosting them.

diagram on phishing quality

Shopping list for a reverse-proxy phishing campaign

  1. Creativity in choosing a domain to host your page on (e.g. micr0soft.com)
  2. 15 minutes of your time to set up your framework of choice (Evilginx2, Modlishka or Muraena)

This means that the reverse-proxy phishing has shifted the phishing price considerably to the right:

diagram on phishing price quality

 

So, let's look at the YTD price for the phish:

This is very bad! If it is not clear yet, our goal would be to always stand above the phish-price equilibrium line, yet reverse-proxy phishing toolkits have brought that price back down beneath that threshold.

How to Stand Above the Phish-Price Equilibrium?

Let's start with what @mrgretzky proposed during his x33fcon 2023 talk, which served as the main foundation behind the ideas of this blogpost.  

At a high level, the proposal was to use obfuscation techniques to deliver scripts to the client that validate the current location of the DOM at runtime. This way, if the script detects that the victim is accessing a login page from attacker.com rather than bob.com, the web application will learn about it and either suspend the user's account or simply refuse the login.

But why obfuscated? - you might ask.

The reason behind the need for obfuscation of this "domain-validation logic" is simple: the attacker has full access to the packets that get transmitted between the client and the target domain. This means that the attacker can implement logic in their malicious reverse proxy, which replaces this logic, to bypass such validations. For instance, if the legitimate validating script would do something like:

function is_this_phishing() {
    // Legitimate hostname, split into fragments as a light form of obfuscation
    if (window.location.host === 'target' + '.domain' + '.com') {
        return false;
    }
    return true;
}

the reverse proxy could search each response at runtime, through packet inspection, for the target hostname (here the fragments that make up "target.domain.com") and replace it with "attacker.domain.com":

function is_this_phishing() {
    // Same check after the proxy has rewritten the hostname in transit
    if (window.location.host === 'attacker' + '.domain' + '.com') {
        return false;
    }
    return true;
}

This trivially bypasses the measure, at a relatively low price for the attacker. The argument for obfuscation, therefore, is that it makes reverse engineering complex enough to demotivate the Phisherman from continuing their attempt.

This logic is OK. Nevertheless: "No security by obscurity…". No matter how complex the obfuscation is, it is still obfuscation, and once it is reversed it is game over, unless web applications choose to upgrade their obfuscation methods every other week.
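To make the cat-and-mouse argument concrete, here is a minimal sketch of the kind of substitution rule described above. The function name rewriteHost and both check strings are hypothetical and only serve the illustration; real frameworks have their own rule syntax.

```javascript
// Naive proxy-side rule: replace every literal occurrence of one string.
function rewriteHost(body, from, to) {
  return body.split(from).join(to);
}

// Two hypothetical variants of the same client-side check.
const plainCheck = "window.location.host === 'target.domain.com'";
const splitCheck = "window.location.host === 'target' + '.domain' + '.com'";

// The literal hostname is caught by the naive rule...
const rewrittenPlain = rewriteHost(plainCheck, "target.domain.com", "attacker.domain.com");

// ...the split variant is missed, until the rule targets the fragment instead.
const missedSplit = rewriteHost(splitCheck, "target.domain.com", "attacker.domain.com");
const rewrittenSplit = rewriteHost(splitCheck, "'target'", "'attacker'");

console.log(rewrittenPlain);
console.log(missedSplit === splitCheck);
console.log(rewrittenSplit);
```

Splitting the string only changes which pattern the attacker has to look for, which is why obfuscation alone buys time rather than protection.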


How about encryption?

Cryptographically speaking, the victim and target web application have an advantage over the Phisherman: they have shared knowledge of the user’s password (hash), as well as the seed for generating OTPs, while the attacker does not. The next question then becomes: how to use this advantage to enhance client-side domain validation?

Consider the following authentication flow:

Authentication protocol
  1. User sends only the username, in a multi-step authentication fashion
  2. The server looks up the user’s password hash HPW based on the username provided, and derives a new key KU = KDF(HPW) using a cryptographic Key Derivation Function.
  3. The server encrypts the domain validation script using the generated key KU, and sends the encrypted script to the user.
  4. The user inputs their password, and their client computes the same key KU = KDF(HPW).
  5. The user decrypts and executes the domain validation script, and only proceeds with authentication if the script succeeds.

Note: Even if the attacker sits between the user and the server at the network packet level, they can neither read the encrypted script nor modify it in any meaningful way, unless they already have the user’s password. This solves the obfuscation issue once and for all.
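The flow above could be sketched roughly as follows in Node.js. Everything specific here is an assumption made for illustration: hashPassword uses a plain SHA-256 as a stand-in for whatever password hashing the application really uses, PBKDF2 plays the role of the KDF, AES-256-GCM plays the role of the encryption, and the validation script is a hypothetical one-liner.

```javascript
const crypto = require("crypto");

// Hypothetical domain-validation script the server wants to protect.
const VALIDATION_SCRIPT =
  "if (window.location.host !== 'bob.com') { throw new Error('phishing'); }";

// Stand-in for the application's real password hashing (assumption).
function hashPassword(password) {
  return crypto.createHash("sha256").update(password, "utf8").digest();
}

// KU = KDF(HPW). Salt and iteration count are illustrative values only.
function deriveKey(hpw, salt) {
  return crypto.pbkdf2Sync(hpw, salt, 100000, 32, "sha256");
}

// Server side (steps 2 and 3): encrypt the script under KU.
function encryptScript(hpw, script) {
  const salt = crypto.randomBytes(16);
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv("aes-256-gcm", deriveKey(hpw, salt), iv);
  const ciphertext = Buffer.concat([cipher.update(script, "utf8"), cipher.final()]);
  return { salt, iv, ciphertext, tag: cipher.getAuthTag() };
}

// Client side (steps 4 and 5): recompute KU from the password and decrypt.
// decipher.final() throws if the key is wrong or the ciphertext was altered.
function decryptScript(hpw, { salt, iv, ciphertext, tag }) {
  const decipher = crypto.createDecipheriv("aes-256-gcm", deriveKey(hpw, salt), iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}

// Example round trip with a made-up password.
const delivered = encryptScript(hashPassword("correct horse"), VALIDATION_SCRIPT);
console.log(decryptScript(hashPassword("correct horse"), delivered) === VALIDATION_SCRIPT);
```

An authenticated mode is used in this sketch so that a proxy flipping bits in the ciphertext causes decryption to fail rather than yielding a silently altered script.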

 

The approach presented above effectively forces the attacker to partially fall back to traditional phishing methods, since they now need the user’s password to proceed with authentication. However, getting the user’s password would involve substantially modifying the website in real time, potentially adding their own malicious scripts, which increases the phishing price substantially.

The proposal seems to work at least from a conceptual level; however, even I was not convinced about solely relying on an encrypted script.

From Encryption to Challenge-Response Protocols

After putting some further thought into the advantages and shortcomings of encrypting domain validation logic, it became clear that this part can be made much more efficient, through the use of a Challenge-Response protocol in authentication, which embeds domain validation by design.

Consider the following authentication protocol:

1. The server generates a cryptographically random challenge C, and sends it to the user.
2. The user inputs their password, and their client computes the response to the received challenge, embedding the visited domain:

R = HMAC(HPW, C, window.location.host)

3. The client sends the response R to the server.
4. The server verifies the response, using the user’s stored password hash:

R =?= HMAC(HPW, C, bob.com)

5. If the verification above succeeds, the authentication proceeds with the MFA step. Otherwise, the server stops the authentication and suspends the user account.
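A minimal sketch of this challenge-response exchange in Node.js is shown below. The way the challenge and the host are joined into one HMAC input (a newline separator) is an assumption, as are the function names; in a browser the host argument would come from window.location.host.

```javascript
const crypto = require("crypto");

const EXPECTED_HOST = "bob.com"; // the legitimate domain used in the text

// Step 1 (server): a fresh random challenge C for every login attempt.
function newChallenge() {
  return crypto.randomBytes(32).toString("hex");
}

// Step 2 (client) and step 4 (server): R = HMAC(HPW, C, host).
// The hex challenge never contains a newline, so the joined input is unambiguous.
function computeResponse(hpw, challenge, host) {
  return crypto.createHmac("sha256", hpw).update(`${challenge}\n${host}`).digest();
}

// Step 4 (server): recompute with the expected host, compare in constant time.
function verifyResponse(hpw, challenge, response) {
  const expected = computeResponse(hpw, challenge, EXPECTED_HOST);
  return response.length === expected.length && crypto.timingSafeEqual(response, expected);
}

// Example: the same password hash yields a valid response only on bob.com.
const hpw = crypto.createHash("sha256").update("correct horse").digest();
const challenge = newChallenge();
console.log(verifyResponse(hpw, challenge, computeResponse(hpw, challenge, "bob.com")));
console.log(verifyResponse(hpw, challenge, computeResponse(hpw, challenge, "attacker.com")));
```

Because the host is part of the keyed input, a response computed on the phishing domain is useless to the proxy: it cannot be converted into the response for bob.com without knowing HPW.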


What Could Go Wrong?

The MitM Phisher can still carry out an attack here by actively modifying packets on the fly, and by completely altering the cryptographic logic delivered to the client (i.e. modify the HMAC computation to statically use bob.com).

To overcome this limitation, we either need to send the cryptographic logic encrypted with a key derived from the user’s password, as shown in the first example, or use a script stored at an encrypted location that is dynamically deciphered using the user’s password. This method is shown in more detail in the following section.

Using Reverse-Proxies’ Weak Points Against Them

At this point, the proposed solution is good enough to make EvilginX Phishermen struggle a bit more again, but the most dedicated ones will still succeed. So, let’s reconsider what else we are missing.

Reading about EvilginX on its project page, you might notice an interesting "Under the Hood" section: EvilginX needs to dynamically modify all redirect URLs on the fly.

But why are they putting so much emphasis on this? Well, to succeed, a good reverse-proxy phishing framework wants, and needs, to get its hands on as much of the traffic between the victim and the target website as possible.

Wait a minute… didn’t we say that the victim and target website have a cryptographic advantage over the Phisherman already? What if instead of encrypting the script itself, we encrypt the location of the script on the legitimate server?

This is how we end up here:

The authentication flow now looks like this:

  1. The client requests a login by first sending their username, in a multi-step authentication manner.
  2. The server performs the following steps:
    1. Fetches the user’s password hash HPW from the database.
    2. Generates a cryptographically random GUID.
    3. Encrypts the string https://bob.com/GUID.js using the user’s password hash, obtaining enc_location = ENC(HPW, https://bob.com/GUID.js).
    4. Stores the tuple (GUID, Origin IP) in a hash table, where Origin IP is the IP address of the connection requesting the login.
    5. Sends the encrypted location to the user, as part of the client-side logic.
  3. The user, upon receiving the encrypted location, enters their password. The client computes HPW, and obtains location = DEC(HPW, enc_location). This location can then be used to send login material and obtain additional domain-validation logic.
  4. The server, upon receiving the request on bob.com/GUID.js, first verifies that the origin IP address of this request matches the Origin IP stored in the tuple (GUID, Origin IP). Behind a reverse proxy, the login request arrives from the proxy’s IP address, whereas the decrypted location sends the victim’s browser directly to bob.com, so the two addresses differ. If there is a mismatch, the server stops immediately, suspending the user’s account (or at least blocking the current authentication process).
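The server and client halves of this flow might look roughly like the sketch below. It rests on several assumptions: HPW is taken to be a 32-byte value that can be used directly as an AES-256-GCM key, the in-memory Map stands in for the hash table, and the function names and IP addresses are hypothetical.

```javascript
const crypto = require("crypto");

// Hash table of step 2.4: GUID -> IP address that requested the login.
const pendingLogins = new Map();

// Server, step 2: create the GUID, remember the caller's IP, encrypt the URL.
function startLogin(hpw, originIp) {
  const guid = crypto.randomUUID();
  pendingLogins.set(guid, originIp);
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv("aes-256-gcm", hpw, iv);
  const ciphertext = Buffer.concat([
    cipher.update(`https://bob.com/${guid}.js`, "utf8"),
    cipher.final(),
  ]);
  return { iv, ciphertext, tag: cipher.getAuthTag() };
}

// Client, step 3: location = DEC(HPW, enc_location).
function decryptLocation(hpw, { iv, ciphertext, tag }) {
  const decipher = crypto.createDecipheriv("aes-256-gcm", hpw, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}

// Server, step 4: the request for /GUID.js must come from the stored IP.
// Each GUID is accepted at most once.
function handleScriptRequest(guid, requestIp) {
  const originIp = pendingLogins.get(guid);
  pendingLogins.delete(guid);
  return originIp !== undefined && originIp === requestIp;
}

// Example: a direct login, where both requests share the same client IP.
const hpw = crypto.createHash("sha256").update("correct horse").digest();
const location = decryptLocation(hpw, startLogin(hpw, "203.0.113.7"));
const guid = location.slice("https://bob.com/".length, -".js".length);
console.log(handleScriptRequest(guid, "203.0.113.7"));
```

In a proxied session, startLogin would record the proxy's address, so the later request from the victim's own address fails the comparison. Deployments where legitimate users change IP between the two requests (for example some carrier NAT setups) would need extra care, which this sketch does not address.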

What Can Still Go Wrong?

You might be thinking, well, what if the attacker modified the application logic that deals with decrypting the location in the first place, and instead instructed it to send the user’s password to the phishing domain?

Fair enough, that is still a legitimate attack path, but the effort (and thus cost) now becomes orders of magnitude higher, especially given that all of this needs to be performed at runtime. At this point, we should ask ourselves: what is the advantage of doing this in a reverse-proxy manner over simply creating a convincing copy of the target website? The second option might even be more straightforward, considering the amount of debugging and reversing required to properly modify the legitimate web page dynamically.

Phishing Price Chart

Bottom Line

This blogpost pushes the idea that cryptography can be used to mitigate against reverse-proxy phishing toolkits.

One thing needs to be made clear: Phishing cannot be prevented altogether. As long as users are convinced to send their credentials over the phone, or submit them in any form over the internet, phishing will always be around in one form or the other. However, when compared to reverse-proxy phishing, these methods are much more expensive for Phishermen.

The methods presented here aim to at least take the phishing price chart back to the green side.

That being said, there is still space for proxy-based phishing kits to improve, and add additional processing to the packets they transmit to potentially bypass the proposed mitigations. However, at present, this would require significantly higher effort.

The feasibility-to-efficiency ratio of the presented methods is left as an exercise to any reader with a web-development background.

 

This research was conducted by Ksandros Apostoli, security consultant at SEC Consult Group, and published on behalf of SEC Defence.
