Detecting Change Addresses
In Bitcoin and other UTXO-based cryptocurrencies, a transaction needs a change output whenever its inputs do not exactly match the desired payment plus the transaction fee. A central task in chain analytics is therefore to determine which address belongs to the (external) receiver of the payment and which is the change address that belongs to the sender.
In this scenario, we want to figure out which address is the change address. Remember that a user can spend only complete UTXOs. If a user has a UTXO of 5 BTC available but only needs to pay 2 BTC, the remaining 3 BTC (less the transaction fee) need to be returned to the user. This is done by “sending” the 3 BTC to a change address. With most wallets, the change address is different from the original address, which makes detecting it harder.
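The arithmetic behind the 5 BTC example can be sketched as follows. This is only an illustration: the function name `change_amount` and the fee of 10,000 satoshis are assumptions chosen for the example, and amounts are kept as integer satoshis to avoid floating-point rounding.

```python
SATS_PER_BTC = 100_000_000  # 1 BTC = 100,000,000 satoshis


def change_amount(input_sats: int, payment_sats: int, fee_sats: int) -> int:
    """Return the change (in satoshis) left over after payment and fee."""
    change = input_sats - payment_sats - fee_sats
    if change < 0:
        raise ValueError("input does not cover payment plus fee")
    return change


# A 5 BTC UTXO pays 2 BTC; the fee of 10,000 satoshis is an assumed value.
change = change_amount(5 * SATS_PER_BTC, 2 * SATS_PER_BTC, 10_000)
print(change)  # 299990000 satoshis, i.e. 2.9999 BTC
```

Note how the fee turns a round 3 BTC remainder into a change amount with several decimal places.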
One Input – Two Outputs
If we have two outputs and one input, at least one of the receiving addresses likely belongs to the sender. Several heuristics help to detect the change address.
In a situation with one input and two outputs, an address is likely to be a change address if:
- It has not been used before, but the other address has been used before. The reason is that modern wallets create new change addresses with every transaction automatically.
- The amount sent to the address has more than four digits after the decimal point. The reason is that payment amounts are mostly round values, whereas transaction fees have many digits, so the change amount has many digits too. Some wallets adjust the transaction fee to round off the change amount, but this works only to some extent.
- It is the second output in the transaction (wallets tend to create the receiver output first and the change output second).
- The change output goes to the same script type as the input address and the other output to a different script type.
- The other address belongs to a known service (such as an exchange) and the original address does not belong to this service.
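The heuristics listed above can be combined into a simple score per output. The following is a minimal sketch, not a production classifier: the `Output` record, its field names, and the equal weighting of one point per heuristic are all assumptions made for illustration. In practice, fields such as `seen_before` and `service` would have to come from an address index and a labeling database.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Output:
    # Hypothetical record; field names are assumptions for this sketch.
    address: str
    amount_sats: int
    script_type: str                # e.g. "p2pkh", "p2sh", "p2wpkh"
    seen_before: bool               # address appeared in an earlier transaction
    service: Optional[str] = None   # label of a known service, if any


def has_many_decimals(amount_sats: int) -> bool:
    """True if the BTC amount needs more than four decimal places."""
    return amount_sats % 10_000 != 0  # 0.0001 BTC = 10,000 satoshis


def change_scores(input_script_type: str,
                  input_service: Optional[str],
                  outputs: List[Output]) -> List[int]:
    """Score both outputs of a one-input, two-output transaction.

    Each heuristic that points to an output as change adds one point.
    """
    if len(outputs) != 2:
        raise ValueError("expected exactly two outputs")
    scores = []
    for i, out in enumerate(outputs):
        other = outputs[1 - i]
        score = 0
        if not out.seen_before and other.seen_before:
            score += 1  # fresh address vs. previously used address
        if has_many_decimals(out.amount_sats):
            score += 1  # non-round amount
        if i == 1:
            score += 1  # second output position
        if (out.script_type == input_script_type
                and other.script_type != input_script_type):
            score += 1  # same script type as the input
        if other.service is not None and other.service != input_service:
            score += 1  # the other output pays a known external service
        scores.append(score)
    return scores


outputs = [
    Output("addr_payee", 200_000_000, "p2sh", seen_before=True, service="exchange"),
    Output("addr_new", 299_987_654, "p2wpkh", seen_before=False),
]
scores = change_scores("p2wpkh", None, outputs)
print(scores)  # [0, 5]
```

A higher score marks the more likely change output. Since every heuristic can fail on its own, a real analysis would probably weight them differently and refuse to decide when the scores are close.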
Two Inputs – Two Outputs
In this situation, we assume that the same user controls address A and address B. Again, we want to figure out which address is the change address.
It is reasonable to assume that a transaction uses as few inputs as possible to keep the transaction small and cheap. Therefore, multiple inputs are only used if no single available UTXO covers the required payment amount. In our example, the user wants to pay 4 BTC but has only two UTXOs with 2 BTC and 3 BTC, respectively. The user needs to combine both and pay the excess input of 1 BTC to a change address. Hence, one can conclude that the smaller output amount belongs to the user who created the transaction if at least two UTXOs were necessary to cover the larger output.
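This reasoning can be expressed as a small check. The sketch below is an assumption-laden simplification: the function name `change_by_necessary_inputs` is made up for this example, and the test "no single input covers the larger output" ignores how the fee is split, so borderline cases may be misjudged.

```python
from typing import List, Optional


def change_by_necessary_inputs(input_sats: List[int],
                               output_sats: List[int]) -> Optional[int]:
    """Return the index of the likely change output, or None if undecided.

    The smaller output is treated as change only if no single input
    would have been enough to cover the larger output.
    """
    if len(input_sats) < 2 or len(output_sats) != 2:
        return None
    if output_sats[0] == output_sats[1]:
        return None  # equal amounts give no hint
    larger = max(output_sats)
    if max(input_sats) >= larger:
        return None  # one input alone could have paid the larger output
    return output_sats.index(min(output_sats))


# Inputs of 2 BTC and 3 BTC pay 4 BTC; the remainder (minus an assumed fee) is change.
print(change_by_necessary_inputs([200_000_000, 300_000_000],
                                 [400_000_000, 99_990_000]))  # 1

# Here the 5 BTC input alone covers the larger output, so the heuristic abstains.
print(change_by_necessary_inputs([500_000_000, 100_000_000],
                                 [400_000_000, 199_990_000]))  # None
```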
Address Reuse
Reusing addresses in Bitcoin is a bad idea. The following figure shows how a reused address (address B) can link two independent addresses (address A and C). Addresses A and B are used in the same transaction, which hints that the same user controls them. Now, address B is used together with address C in transaction 2. Therefore, it is plausible that address C is also controlled by the same user who controls addresses B and A.
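The linking of A, B, and C can be sketched as a clustering step with a union-find structure. This sketch assumes that "used in the same transaction" means the addresses appear together as inputs, and it represents each transaction simply as a list of its input addresses; the function name `cluster_addresses` is hypothetical.

```python
from typing import Dict, List, Set


def cluster_addresses(transactions: List[List[str]]) -> List[Set[str]]:
    """Group addresses that are linked by appearing as inputs together."""
    parent: Dict[str, str] = {}

    def find(addr: str) -> str:
        # Register unknown addresses as their own cluster root.
        parent.setdefault(addr, addr)
        while parent[addr] != addr:
            parent[addr] = parent[parent[addr]]  # path halving
            addr = parent[addr]
        return addr

    def union(a: str, b: str) -> None:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for input_addresses in transactions:
        if not input_addresses:
            continue
        first = input_addresses[0]
        find(first)
        for addr in input_addresses[1:]:
            union(first, addr)

    clusters: Dict[str, Set[str]] = {}
    for addr in list(parent):
        clusters.setdefault(find(addr), set()).add(addr)
    return list(clusters.values())


# Transaction 1 spends from A and B, transaction 2 from B and C.
# "D" is an unrelated address added for contrast.
clusters = cluster_addresses([["A", "B"], ["B", "C"], ["D"]])
print(sorted(sorted(c) for c in clusters))  # [['A', 'B', 'C'], ['D']]
```

Because B appears in both transactions, A and C end up in the same cluster even though they never appear together.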
Multiple Inputs – Many Outputs – Batch Spend or Coinjoin
This pattern indicates large-scale economic activity. Services that do such batch spends are, for example, exchanges. If it is an exchange, all inputs belong to the same entity, and it is likely that all but one output belong to different persons. However, it is difficult to say which output is the change address.
Another problem is that such a pattern also occurs if a coinjoin is done. Here, multiple users bundle their UTXOs in one transaction to obscure the outputs.
Coinbase Transaction
In coinbase transactions, there is no change address. If a coinbase transaction points to multiple addresses, they likely belong to the same entity.
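Before applying any of the heuristics, an analysis would typically sort transactions into the patterns discussed in this article. The following sketch shows one way to do that. The function name, the labels, and the threshold of three outputs for "many" are assumptions for illustration only; whether a transaction is a coinbase transaction is passed in as a flag rather than derived from raw data.

```python
def classify_shape(n_inputs: int, n_outputs: int,
                   is_coinbase: bool = False,
                   many_outputs: int = 3) -> str:
    """Map a transaction to one of the patterns discussed above."""
    if is_coinbase:
        return "coinbase"  # no change address to look for
    if n_inputs == 1 and n_outputs == 2:
        return "one-input-two-outputs"
    if n_inputs == 2 and n_outputs == 2:
        return "two-inputs-two-outputs"
    if n_inputs >= 2 and n_outputs >= many_outputs:
        return "batch-spend-or-coinjoin"
    return "other"


print(classify_shape(1, 2))                    # one-input-two-outputs
print(classify_shape(2, 2))                    # two-inputs-two-outputs
print(classify_shape(5, 40))                   # batch-spend-or-coinjoin
print(classify_shape(1, 3, is_coinbase=True))  # coinbase
```

The shape alone cannot separate a batch spend from a coinjoin, which matches the ambiguity described above.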