
Binary to Text Converter — Decode Binary to ASCII / UTF-8

Last verified May 2026 — runs in your browser


Binary to Text — Decode Binary Code to Readable Text

Paste a binary string — space-separated bytes (01001000 01100101...) or a continuous stream — and the page decodes it back to readable text in real time. The decoder splits input into 8-bit groups — one byte per character for ASCII, even though ASCII itself (ASA X3.4-1963, published 17 June 1963) defines only 7-bit code points; the 8th bit was claimed first for parity, then for extended encodings. For ASCII-only inputs (U+0000–U+007F), each 8-bit group maps to one character. For UTF-8 inputs (RFC 3629, Yergeau 2003), the decoder follows the multi-byte rules: code points U+0080–U+07FF use 2 bytes, U+0800–U+FFFF use 3 bytes (the BMP, including most non-Latin scripts), and supplementary-plane code points U+10000+ use 4 bytes. Errors surface inline rather than producing silent mojibake. Useful for reverse-engineering protocols, decoding embedded-device dumps, learning UTF-8 byte structure, or sanity-checking that a binary stream matches an expected ASCII pattern.
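The same decode flow can be sketched in a few lines of TypeScript. This is an illustration, not the tool's actual source: it strips whitespace, splits the bit stream into 8-bit groups, and hands the bytes to the browser's built-in TextDecoder, which applies the RFC 3629 UTF-8 rules. The fatal option makes invalid sequences throw instead of silently emitting U+FFFD replacement characters.

```typescript
// Sketch of the decode path described above (illustrative, not the tool's source).
function decodeBinary(input: string): string {
  const bits = input.replace(/\s+/g, "");          // join space-separated bytes or a continuous stream
  if (!/^[01]*$/.test(bits)) throw new Error("non-binary character in input");
  if (bits.length % 8 !== 0) throw new Error("input is not byte-aligned (length not a multiple of 8)");

  const bytes = new Uint8Array(bits.length / 8);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(bits.slice(i * 8, i * 8 + 8), 2);   // each 8-bit group becomes one byte
  }
  // fatal: true -> invalid UTF-8 sequences raise an error instead of producing mojibake
  return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
}

console.log(decodeBinary("01001000 01101001")); // "Hi"
```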

About this tool

The decoder runs entirely in the browser. The input parser tolerates whitespace separators (spaces, tabs, newlines) and joins continuous streams into a single bit-string before splitting into 8-bit groups — non-binary characters trigger an inline error rather than corrupting downstream byte alignment. Each byte is interpreted as a UTF-8 code unit per RFC 3629 §3, which means a 1-byte input in U+0000–U+007F maps directly to ASCII, and multi-byte sequences (lead byte 110xxxxx for 2-byte, 1110xxxx for 3-byte, 11110xxx for 4-byte, with continuation bytes 10xxxxxx) are validated against the byte-pattern rules. Three encoding distinctions matter when reading binary data. ASCII (ASA X3.4-1963 → ANSI X3.4-1986) defines 128 code points in 7 bits. Extended ASCII (ISO 8859-1 / Latin-1 / Windows-1252) extends to 8 bits with regional characters. UTF-8 (RFC 3629) covers the full Unicode repertoire (Unicode 16.0 assigns roughly 155,000 characters) using variable-length encoding. A binary stream alone does not carry encoding information — that lives in HTTP Content-Type headers, BOM (Byte Order Mark) markers per Unicode Standard 16.0 §2.6, or out-of-band metadata. Misidentified encoding produces 'mojibake' — visible-but-wrong characters.
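As a rough illustration of those lead-byte patterns, here is a small TypeScript sketch (again not taken from the tool's source) of how the top bits of a sequence's first byte determine its length per RFC 3629 §3:

```typescript
// The lead byte announces how many continuation bytes (10xxxxxx) must follow.
function utf8SequenceLength(lead: number): number {
  if ((lead & 0b1000_0000) === 0b0000_0000) return 1; // 0xxxxxxx: ASCII, U+0000–U+007F
  if ((lead & 0b1110_0000) === 0b1100_0000) return 2; // 110xxxxx: U+0080–U+07FF
  if ((lead & 0b1111_0000) === 0b1110_0000) return 3; // 1110xxxx: U+0800–U+FFFF
  if ((lead & 0b1111_1000) === 0b1111_0000) return 4; // 11110xxx: U+10000–U+10FFFF
  throw new Error("invalid UTF-8 lead byte");          // e.g. a stray continuation byte
}

console.log(utf8SequenceLength(0x48)); // 1  ('H')
console.log(utf8SequenceLength(0xC3)); // 2  (lead byte of 'é' = C3 A9)
console.log(utf8SequenceLength(0xE2)); // 3  (lead byte of '€' = E2 82 AC)
```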

  • Decode binary to ASCII / UTF-8 text in real time
  • Handles space-separated bytes and continuous bit streams
  • 8-bit byte alignment, with ASCII character mapping per ASA X3.4-1963
  • UTF-8 multi-byte handling per RFC 3629 (1-, 2-, 3-, and 4-byte sequences)
  • Inline error for invalid characters or misaligned input
  • One-click copy of decoded text to clipboard
  • Pure client-side decoding — no API call, no upload
  • Reactive — re-decodes as you type or paste
  • Useful for protocol reverse-engineering and embedded-device debugging

Free. No signup. Your inputs stay in your browser. Ads via Google AdSense (consent required).

Frequently asked questions

Why do my decoder results look like garbage?

Three common causes: (a) bit-grouping mismatch — the input is a continuous stream of 1s and 0s but the decoder split it into 7-bit groups when the source was 8-bit (or vice versa); (b) encoding mismatch — the input is base64 but the decoder treats it as raw binary, or the input is binary representing UTF-8 bytes but the decoder assumes ASCII; (c) padding error — base64 inputs missing the trailing '=' padding characters confuse strict decoders. Verify the source format first (binary vs base64 vs hex), then confirm byte alignment (8 bits per character for byte-oriented ASCII, whose underlying 7-bit code is ASA X3.4-1963).
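Cause (a) is easy to reproduce. The TypeScript snippet below (illustrative only) decodes the same bit stream with both group sizes; the 7-bit split lands on entirely different byte boundaries and produces garbage:

```typescript
const stream = "0100100001101001"; // "Hi" as a continuous 8-bit stream

// Correct: 8-bit groups -> 0x48, 0x69
const as8bit = stream.match(/.{8}/g)!.map(b => parseInt(b, 2));
console.log(String.fromCharCode(...as8bit)); // "Hi"

// Wrong: 7-bit groups -> 0100100 | 0011010 -> 0x24, 0x1A
const as7bit = stream.slice(0, 14).match(/.{7}/g)!.map(b => parseInt(b, 2));
console.log(String.fromCharCode(...as7bit)); // "$" followed by a control character
```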

Why is base64 33% larger than the original binary?

Base64 (RFC 4648) encodes every 3 input bytes (24 bits) as 4 output characters (each carrying 6 bits = 24 bits total). The encoding alphabet is 64 printable ASCII characters (A–Z, a–z, 0–9, +, /), which fits in 6 bits per character. Result: 4 output chars per 3 input bytes = 33.3% expansion, plus '=' padding adds up to 2 chars at the end. Base32 expands +60%, base16 (hex) +100%. The trade-off is intentional: a smaller alphabet produces larger output but transports cleanly through case-folding, URL-encoding, and other channel constraints.
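A quick worked example of the 3-bytes-in / 4-chars-out ratio, using the browser's built-in btoa (which base64-encodes a binary string): 6 input bytes become 8 output characters (+33%), and 5 input bytes also become 8 characters because '=' padding rounds the final group up.

```typescript
const bytes6 = "\x48\x65\x6c\x6c\x6f\x21";   // 6 bytes: "Hello!"
console.log(btoa(bytes6));                    // "SGVsbG8h" — 8 chars, no padding

const bytes5 = "\x48\x65\x6c\x6c\x6f";        // 5 bytes: "Hello"
console.log(btoa(bytes5));                    // "SGVsbG8=" — 8 chars, 1 padding char
```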

What's the difference between binary, hex, and base64 encodings?

All three are binary-to-text encodings (lossless). Binary (base2) lists each byte as 8 ones and zeros — 8× size, fully readable bits. Hex (base16, RFC 4648) groups every 4 bits as 0–9/A–F — 2× size, readable for short hashes, case-insensitive by spec. Base64 (RFC 4648) packs 6 bits per character — 1.33× size, compact for longer payloads (PEM keys, email attachments via MIME RFC 2045, data URIs). Choose by transport constraint: binary for human-readable bit inspection, hex for short identifiers, base64 for compact transport over text channels.
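To make the size ratios concrete, here is a short TypeScript sketch rendering the same 3 bytes in all three encodings (btoa is the browser's built-in base64 encoder):

```typescript
const data = new Uint8Array([0x4f, 0x4b, 0x21]); // the bytes of "OK!"

const asBinary = [...data].map(b => b.toString(2).padStart(8, "0")).join(" ");
const asHex    = [...data].map(b => b.toString(16).padStart(2, "0")).join("");
const asBase64 = btoa(String.fromCharCode(...data));

console.log(asBinary); // "01001111 01001011 00100001" — 24 digits, 8x the data
console.log(asHex);    // "4f4b21"                      — 6 characters, 2x
console.log(asBase64); // "T0sh"                        — 4 characters, 1.33x
```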

How do I detect what encoding a binary stream uses?

Several heuristics: (a) BOM (Byte Order Mark) at the stream start — 'EF BB BF' is UTF-8, 'FF FE' is UTF-16 LE, 'FE FF' is UTF-16 BE per Unicode Standard 16.0 §2.6; (b) HTTP Content-Type header 'charset=' parameter; (c) statistical analysis — UTF-8 sequences have specific byte patterns (continuation bytes always start with bits '10') per RFC 3629 §3. None is foolproof. A UTF-8 stream missing both BOM and Content-Type can be mistaken for Latin-1 — the failure mode is 'mojibake', visible-but-wrong characters where the decoder picked the wrong codepage and silently corrupted the text.
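Heuristic (a) reduces to comparing the first bytes of the stream against known BOM values. A minimal sketch in TypeScript, checking only the three BOMs listed above (a hypothetical helper, not part of this tool):

```typescript
// Returns the encoding implied by a leading BOM, or null if none is present.
function sniffBom(bytes: Uint8Array): string | null {
  if (bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) return "UTF-8";
  if (bytes[0] === 0xff && bytes[1] === 0xfe) return "UTF-16LE";
  if (bytes[0] === 0xfe && bytes[1] === 0xff) return "UTF-16BE";
  return null; // no BOM — fall back to Content-Type or statistical detection
}

console.log(sniffBom(new Uint8Array([0xef, 0xbb, 0xbf, 0x48, 0x69]))); // "UTF-8"
console.log(sniffBom(new Uint8Array([0x48, 0x69])));                   // null
```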

Why do older email systems use base64 even when the network supports binary?

SMTP (RFC 5321, Klensin 2008) historically accepted only 7-bit ASCII characters in message bodies — a constraint inherited from teleprinter-era 7-bit channels. RFC 2045 (MIME, Freed & Borenstein 1996) standardized the framework for transporting 8-bit binary content (images, executables, non-ASCII text) over 7-bit transports via Content-Transfer-Encoding values: 7bit, 8bit, binary, quoted-printable, base64. Modern SMTP servers support the 8BITMIME extension, but base64 remains the safe default — intermediate gateways may strip the 8th bit, and base64-encoded content survives any 7-bit channel without corruption.

Sources (6)
  • American Standards Association, X3.2 Subcommittee (1963). American Standard Code for Information Interchange (ASCII), ASA X3.4-1963. Published 17 June 1963 (7-bit, no lowercase); revised as USAS X3.4-1967 (added lowercase) and ANSI X3.4-1986 (final revision).
  • Yergeau, F. (2003). UTF-8, a transformation format of ISO 10646. RFC 3629, IETF (STD 63).
  • Freed, N., & Borenstein, N. (1996). Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies. RFC 2045, IETF (November 1996).
  • Josefsson, S. (2006). The Base16, Base32, and Base64 Data Encodings. RFC 4648, IETF (October 2006; obsoletes RFC 3548).
  • Klensin, J. (2008). Simple Mail Transfer Protocol. RFC 5321, IETF (October 2008; defines the base protocol and its 7-bit message-body baseline).
  • The Unicode Consortium (2024). The Unicode Standard, Version 16.0 — Chapter 2 (General Structure) and §2.6 (Byte Order Mark). Unicode Consortium, Mountain View, CA.

These are the original standards and RFCs the decoding rules in this tool are based on. Locate them by title and year; the RFCs are published by the IETF.

Related guides