Hash Functions Compared — MD5, SHA family, bcrypt, Argon2
Not all hash functions do the same job. When MD5 is still fine, why SHA-256 is the default, and why you must never hash passwords with either.
“Hash function” covers three categories of job that overlap only at the edges: cryptographic digests, password-hashing functions, and fast non-cryptographic checksums. Using the wrong category for the job is why so many stolen password databases turned out to be cheap to crack. This guide is the practical separation.
The three categories
| Category | Goal | Examples | Speed | OK for passwords? |
|---|---|---|---|---|
| Cryptographic digest | Collision + preimage resistance | SHA-256, SHA-3, BLAKE3 | Fast | No |
| Password-hashing function | Slow on purpose, memory-hard | bcrypt, Argon2id, scrypt | Slow (tunable) | Yes — this is the job |
| Non-cryptographic checksum | Detect accidental corruption | CRC32, xxHash, MurmurHash | Very fast | No |
A common mistake is to reach for SHA-256 to “hash the password”. SHA-256 is a cryptographic digest and is extremely fast. That speed is the problem: a single GPU can compute billions of SHA-256 hashes per second against a stolen password database. Password-hashing functions (bcrypt, Argon2id) are designed to be slow and tunably expensive.
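To make the speed argument concrete, here is a rough back-of-the-envelope sketch. Both rates are assumptions for illustration only: 10 billion guesses per second stands in for "billions of SHA-256 hashes per second", and 1,000 guesses per second is a hypothetical figure for an attacker facing a tuned password-hashing function. Real numbers depend on hardware and parameters.

```python
# Time to try every 8-character lowercase password at two guessing rates.
# Both rates are illustrative assumptions, not measurements.
KEYSPACE = 26 ** 8              # all 8-letter lowercase passwords
FAST_RATE = 10_000_000_000      # assumed SHA-256 guesses per second on a GPU
SLOW_RATE = 1_000               # hypothetical guesses per second against a slow hash

SECONDS_PER_YEAR = 365 * 24 * 3600


def seconds_to_exhaust(keyspace: int, guesses_per_second: int) -> float:
    """Worst-case time to try every candidate once."""
    return keyspace / guesses_per_second


fast_seconds = seconds_to_exhaust(KEYSPACE, FAST_RATE)
slow_years = seconds_to_exhaust(KEYSPACE, SLOW_RATE) / SECONDS_PER_YEAR

print(f"fast digest: {fast_seconds:.1f} seconds")
print(f"slow hash:   {slow_years:.1f} years")
```

Under these assumed rates the same keyspace falls in roughly 21 seconds with a fast digest and takes more than six years with the slow function. The ratio between the two rates is what the cost parameter buys you.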
The cryptographic digests
| Algorithm | Output | Status | When to use |
|---|---|---|---|
| MD5 | 128 bit | Broken for security | Checksums only; file deduplication, cache keys |
| SHA-1 | 160 bit | Broken (SHAttered 2017) | Legacy only; Git uses it but is migrating to SHA-256 |
| SHA-256 | 256 bit | Current standard | Default choice for any cryptographic digest |
| SHA-512 | 512 bit | Current standard | Same security class; sometimes faster on 64-bit CPUs |
| SHA-3 (Keccak) | 224-512 bit | Current standard | NIST-approved alternative design; use if you need diverse primitives |
| BLAKE3 | 256 bit default (extendable) | Modern, fast | Very fast parallel digest; newer, less battle-tested |
MD5 is not a dirty word. It is broken for cryptographic use, because collisions can be constructed cheaply, but it is fine for detecting accidental corruption. If you are checksumming a file against corruption in transit, MD5 or even CRC32 is usually cheaper than SHA-256 (less so on CPUs with SHA hardware extensions), and the attack model does not matter. If you are verifying that a file has not been tampered with by an attacker, you need SHA-256 or better.
# These commands hash the same file with different algorithms
md5sum file.tar
sha256sum file.tar
b3sum file.tar # BLAKE3, if installed
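The same check can be done in application code. The sketch below is one way to do it with Python's standard hashlib, reading the file in chunks so large files do not have to fit in memory. The helper name file_digest and the 1 MiB chunk size are illustrative choices, not something the commands above prescribe.

```python
import hashlib


def file_digest(path: str, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Return the hex digest of a file, streaming it chunk by chunk."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()
```

Calling file_digest("file.tar") should give the same hex string as sha256sum for the same file. Passing algorithm="md5" covers the corruption-only case, although some locked-down (FIPS-mode) builds refuse MD5.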
SHA-1 survived a bit longer than MD5 because its collision cost was higher, but Google and CWI Amsterdam demonstrated a practical collision in 2017 (the SHAttered paper). Git still uses SHA-1 for object IDs by default but has a migration path to SHA-256. New systems should default to SHA-256.
Password-hashing functions
The attack model for passwords is offline: an attacker steals your database and tries to crack it on their own hardware. Your defense is to make each hash computation expensive enough that a GPU farm cannot brute-force the weak passwords before you can respond.
Two things you need:
- A slow function, parameterized so you can scale it over time as hardware gets faster.
- A unique salt per user, stored with the hash, so precomputed rainbow tables are useless.
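As a minimal sketch of those two requirements together, the example below uses scrypt from Python's standard library (hashlib.scrypt needs an OpenSSL-backed build). The cost parameters and the helper names hash_password and verify_password are assumptions for illustration, not tuned recommendations.

```python
import hashlib
import hmac
import os

# Illustrative scrypt cost parameters (assumed values, not a recommendation).
SCRYPT_N, SCRYPT_R, SCRYPT_P = 2 ** 14, 8, 1


def _derive(password: str, salt: bytes) -> bytes:
    """Run the slow, memory-hard derivation for one password and salt."""
    return hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt,
        n=SCRYPT_N,
        r=SCRYPT_R,
        p=SCRYPT_P,
        dklen=32,
    )


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest); a fresh random salt is drawn for every call."""
    salt = os.urandom(16)
    return salt, _derive(password, salt)


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute with the stored salt and compare in constant time."""
    return hmac.compare_digest(_derive(password, salt), expected)
```

Two users with the same password get different salts and therefore different stored digests, so a precomputed table built for one salt is useless against another. Raising SCRYPT_N over time is the "scale it as hardware gets faster" knob.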
The current recommended options, in rough order of modernity:
- Argon2id: OWASP’s 2026 recommendation. Memory-hard (attackers cannot just throw GPUs at it), side-channel resistant, and the winner of the Password Hashing Competition in 2015. Use it if your stack has a vetted library (argon2-cffi in Python, argon2 in Node, golang.org/x/crypto/argon2 in Go).
- bcrypt: The 1999 standard, still fine. Well-audited, ubiquitous. The cost parameter (the “work factor”) is tunable. Libraries have limits on password length (72 bytes for bcrypt) that you must handle: either truncate explicitly, or pre-hash with SHA-256 and base64-encode the digest before bcrypt so that no NUL byte reaches it.
- scrypt: Memory-hard like Argon2. Less used today because Argon2 subsumes its design goals, but still secure.
- PBKDF2: FIPS-approved. Use it only if you have a compliance reason; otherwise it is strictly worse than Argon2id.
# Argon2id in Python (argon2-cffi)
from argon2 import PasswordHasher
ph = PasswordHasher(
    time_cost=3,         # iterations
    memory_cost=65_536,  # in KiB, so 64 MiB
    parallelism=4,
)
hash_ = ph.hash("user-password")
# "$argon2id$v=19$m=65536,t=3,p=4$saltsalt...$hash..."
ph.verify(hash_, "user-password")  # returns True; raises VerifyMismatchError on a wrong password
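For bcrypt, the 72-byte limit noted in the list above can be handled with a pre-hash step. The sketch below shows only that step, using the standard library; the helper name prehash_for_bcrypt is hypothetical, and the result would then be passed to whichever bcrypt library your stack uses. Base64-encoding the digest is a common precaution because raw SHA-256 output can contain NUL bytes, which some bcrypt implementations treat as the end of the input.

```python
import base64
import hashlib

BCRYPT_MAX_BYTES = 72  # bcrypt ignores input beyond this length


def prehash_for_bcrypt(password: str) -> bytes:
    """Collapse a password of any length into 44 ASCII bytes for bcrypt."""
    digest = hashlib.sha256(password.encode("utf-8")).digest()  # 32 raw bytes
    return base64.b64encode(digest)  # 44 bytes, base64 alphabet only
```

With this step, two long passphrases that differ only after the 72nd byte no longer collapse to the same bcrypt input. The trade-off is that the scheme must be applied consistently at both hash time and verify time.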
Rule: never use SHA-256, SHA-3, BLAKE3, or any plain cryptographic digest to hash passwords. Not even with a salt. Not even with multiple iterations (you are badly reimplementing PBKDF2). Use a password-hashing function.
For the authentication tokens built on top of these foundations, see the JWT in 2026 piece; for the URL-safe token encoding that often accompanies them, see the Base64 in production guide.
Non-cryptographic hashes
These are tools for hash tables, cache keys, content-addressed storage of trusted data, and Bloom filters. They are fast and they distribute inputs well, but they make no attempt to resist deliberate collisions: anyone who knows the algorithm can construct colliding inputs.
| Algorithm | Speed | When to use |
|---|---|---|
| CRC32 | Very fast, 32 bit | File checksums in legacy formats, quick integrity checks |
| FNV | Fast, any width | String hashing in simple applications |
| MurmurHash3 | Fast, well-distributed | Hash tables, Bloom filters, cache sharding |
| xxHash | Fastest non-crypto | Large-payload checksums, log deduplication |
| CityHash / FarmHash | Fast | Google-origin; Protocol Buffers and BigQuery use relatives |
Never use these on untrusted input where collisions matter for security. An attacker can craft inputs that collide in MurmurHash and use them to mount a HashDoS attack on a hash table, which is why modern runtimes randomize their hash seeds at startup.
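To get a feel for how little protection a 32-bit checksum offers, the sketch below finds two different inputs with the same CRC32 by plain random search. This is only a birthday-style illustration, assuming random 8-byte inputs; real HashDoS attacks use structured collisions that are cheaper still. The function name find_crc32_collision is illustrative.

```python
import random
import zlib


def find_crc32_collision(seed: int = 0) -> tuple[bytes, bytes]:
    """Draw random 8-byte inputs until two distinct ones share a CRC32."""
    rng = random.Random(seed)        # seeded so the search is repeatable
    seen: dict[int, bytes] = {}      # checksum -> first input that produced it
    while True:
        candidate = rng.getrandbits(64).to_bytes(8, "big")
        checksum = zlib.crc32(candidate)
        previous = seen.get(checksum)
        if previous is not None and previous != candidate:
            return previous, candidate
        seen[checksum] = candidate


first, second = find_crc32_collision()
print(first.hex(), second.hex(), hex(zlib.crc32(first)))
```

With a 32-bit output, a collision is expected after on the order of a hundred thousand random inputs, which takes well under a second on ordinary hardware.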
HMAC — the thing most people actually need
If you are signing an API request, a webhook, or a cookie, you do not want a raw hash. You want HMAC (Hash-based Message Authentication Code). HMAC wraps a cryptographic hash with a key in a way that proves the message was signed by someone with the key.
import hmac, hashlib
key = b"secret"
message = b"payload=value&ts=123"
signature = hmac.new(key, message, hashlib.sha256).hexdigest()
Three things HMAC guards against that a raw sha256(key + message) does not:
- Length-extension attacks against the raw concatenation.
- Truncation attacks.
- Misuse of the hash’s internal state.
Default to HMAC-SHA256 for signing. Stripe, GitHub, and most other webhook providers use it because it is the no-footgun option.
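Signing is only half of the job; the receiver also has to check the signature. A minimal sketch of that side, assuming a hex-encoded HMAC-SHA256 signature and hypothetical helper names sign and verify, might look like this. It uses hmac.compare_digest so the comparison time does not leak how many leading characters matched.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> str:
    """Return the hex-encoded HMAC-SHA256 of message under key."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, signature_hex: str) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign(key, message)
    # Compare as bytes so any string a caller sends is handled safely.
    return hmac.compare_digest(expected.encode("utf-8"), signature_hex.encode("utf-8"))


key = b"secret"
message = b"payload=value&ts=123"
signature = sign(key, message)
print(verify(key, message, signature))                  # signature matches
print(verify(key, b"payload=other&ts=123", signature))  # message was altered
```

Real webhook providers each define their own header name, encoding, and timestamp rules, so treat this as the shape of the check and not as any particular provider's scheme.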
Takeaways
Three categories, three tools: a cryptographic digest (SHA-256 by default) for integrity; a password-hashing function (Argon2id or bcrypt) for passwords; a non-cryptographic hash (xxHash, Murmur) for hash tables and checksums. MD5 is only dead for crypto — it is fine for dedup. Always use HMAC when you are signing, not a raw hash. Never hash passwords with anything other than a dedicated password-hashing function.