Character Counter — Count Characters & Letters Online

Last verified May 2026 — runs in your browser

Character Counter — Count Letters, Words & Symbols Online

Type or paste any text and the counter shows two numbers: total characters with whitespace included, and a separate count with whitespace removed. Two coloured progress bars track the Twitter/X 280-character limit and the SMS 160-character limit (160 is the capacity of a single message in the original GSM 7-bit alphabet; longer messages are split into multi-part SMS, with each part billed separately). The crucial detail: this counter counts real Unicode code points via the spread operator, not the naive `string.length`, which returns UTF-16 code units. That means "😀" counts as 1, not 2; "日本語" counts as 3; and most flag emoji (which are pairs of regional indicator code points) count as 2. Code-point counts are a baseline rather than any platform's exact rule: X applies weighted counting and SMS capacity depends on the message encoding, as the FAQ below explains.

About this tool

JavaScript's `string.length` returns the UTF-16 code unit count, which is wrong for any character outside the Basic Multilingual Plane: emoji, rarer CJK ideographs, many mathematical symbols, ancient scripts. This page uses `[...input].length`, which iterates by code point (spreading a string has used Unicode-aware iteration since ES2015), giving the number a human reader would expect for most text. The whitespace-stripped count uses the same code-point iteration after a `\s` regex strip.

The Twitter/X bar fills toward 280 and turns red on overflow. X actually counts "weighted characters", where CJK characters and emoji count as 2, so the bar is a guide against the 280 headline figure that most users care about rather than an exact weighted total. The SMS bar fills toward 160 and turns red on overflow. That 160 is the single-message limit for the GSM 7-bit alphabet; any character outside it (many accented Latin letters, all emoji, all CJK) triggers UCS-2 encoding, which drops the per-segment limit to 70. This page intentionally shows the 160 line because it's the most common decision threshold for marketers and SMS API users; the multi-segment math is documented in the SMS spec for those who need it.

Use cases: trimming a tweet to fit, copywriting transactional SMS, fitting a Bluesky post under 300, drafting an Instagram caption under 2,200.
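
The two headline counts described above can be sketched in a few lines. This is a minimal illustration, not the page's actual source: the function name `countCharacters` and the shape of the returned object are assumptions made for the example.

```javascript
// Hypothetical helper: names are illustrative, not taken from this page's code.
function countCharacters(input) {
  // UTF-16 code units: what string.length reports.
  const codeUnits = input.length;
  // Code points: spreading a string iterates by code point.
  const withSpaces = [...input].length;
  // Strip anything matched by \s, then count code points again.
  const withoutSpaces = [...input.replace(/\s/gu, "")].length;
  return { codeUnits, withSpaces, withoutSpaces };
}

console.log(countCharacters("hi 😀"));
// { codeUnits: 5, withSpaces: 4, withoutSpaces: 3 }
```

The emoji contributes two code units but a single code point, which is why the first two numbers differ.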

  • Real Unicode code-point counting (spread operator, not string.length)
  • Counts single-code-point emoji as 1 and CJK as 1 per character, matching human expectation
  • Total count + whitespace-stripped count side by side
  • Twitter/X progress bar at 280 char limit, turns red on overflow
  • SMS progress bar at 160 char limit (GSM 7-bit alphabet)
  • Reactive — counts update on every keystroke
  • Live region (aria-live polite) — screen readers announce changes
  • No upload — your text never leaves the browser
  • Useful for tweets, SMS marketing, Bluesky posts, Instagram captions
  • Helps avoid the multi-segment SMS billing trap when using accents/emoji
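
The progress-bar behaviour in the list above (fill toward a limit, flag overflow) reduces to a small pure function. The sketch below is one plausible way to express it; `limitState` and its field names are hypothetical and not taken from this page.

```javascript
// Hypothetical helper: computes what a limit bar needs to render.
function limitState(count, limit) {
  return {
    fill: Math.min(count / limit, 1), // 0..1, capped so the bar never overdraws
    overflow: count > limit,          // true when the bar should turn red
    remaining: limit - count,         // negative once over the limit
  };
}

const tweetBar = limitState(140, 280); // fill 0.5, overflow false, remaining 140
const smsBar = limitState(161, 160);   // fill 1, overflow true, remaining -1
```

Keeping this logic free of DOM access makes it easy to reuse for other limits mentioned here, such as 300 for Bluesky or 2,200 for Instagram captions.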

Free. No signup. Your inputs stay in your browser. Ads via Google AdSense (consent required).

Frequently asked questions

Why does string.length return wrong numbers for emoji and CJK text?

JavaScript's `string.length` returns UTF-16 code units, the storage unit the language uses internally. For characters in the Basic Multilingual Plane (the original ~65,536 code points up to U+FFFF), one code point fits in one code unit. But emoji like U+1F600 😀, less common CJK ideographs above U+FFFF, some mathematical symbols, and ancient scripts require a surrogate pair: two UTF-16 code units encoding one code point. ECMA-262's String iterator (used by the `[...str]` spread since ES2015) iterates over code points instead, so `[...'😀'].length` returns 1, the human-expected value. UAX #29 (Unicode 16.0, revision 45) defines a higher level of segmentation called grapheme clusters, where ZWJ-joined family emoji like 👨‍👩‍👧 count as one user-perceived character even though they are built from five code points. Counting at that level requires `Intl.Segmenter`, which many counters don't use.
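
To see the three levels side by side, the sketch below counts the same string as code units, code points, and grapheme clusters. It assumes a runtime that provides `Intl.Segmenter` (current browsers and recent Node.js releases); the function name `countLevels` is made up for this example.

```javascript
// Hypothetical helper comparing the three counting levels discussed above.
function countLevels(text) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  return {
    codeUnits: text.length,                        // UTF-16 code units
    codePoints: [...text].length,                  // Unicode code points
    graphemes: [...segmenter.segment(text)].length // user-perceived characters
  };
}

// Man + ZWJ + woman + ZWJ + girl, written with escapes to keep the source unambiguous.
const family = "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}";
console.log(countLevels(family));
// { codeUnits: 8, codePoints: 5, graphemes: 1 }
```

Each of the three people is a surrogate pair (2 code units) and each joiner is 1, which gives 8 code units for 5 code points.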

How does Twitter/X actually count characters under the 280 limit?

X Corp's developer documentation (docs.x.com/fundamentals/counting-characters) specifies weighted character counting after Unicode Normalization Form C (NFC). Most characters count as 1; Chinese, Japanese (Kanji, Hiragana, Katakana), Korean (Hangul), and fullwidth forms count as 2; all emoji count as 2 regardless of skin-tone or ZWJ complexity; URLs are wrapped to t.co at a fixed weight of 23 characters regardless of original length. The 280 number became the headline figure when Twitter doubled the original 140-character ceiling in 2017, but because each CJK character weighs 2, Japanese or Chinese content still fits only about 140 characters. The official open-source twitter-text library is the canonical reference implementation when integration precision matters.
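
A rough approximation of that weighting can be sketched as follows. This is not the twitter-text library and should not be used where exact results matter. The weight-1 code-point ranges are an assumption recalled from the published twitter-text configuration, emoji detection is approximated with Unicode property escapes (so text-style symbols such as © are over-counted), and URL shortening to 23 is left out entirely. `weightedLength` is a hypothetical name.

```javascript
// Assumed weight-1 ranges (decimal code points); everything else weighs 2.
const LIGHT_RANGES = [[0, 4351], [8192, 8205], [8208, 8223], [8242, 8247]];
// Approximate emoji test: pictographs and regional indicators (flags).
const EMOJI = /\p{Extended_Pictographic}|\p{Regional_Indicator}/u;

function weightedLength(text) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  let total = 0;
  // Normalize to NFC first, then walk user-perceived characters.
  for (const { segment } of segmenter.segment(text.normalize("NFC"))) {
    if (EMOJI.test(segment)) {
      total += 2; // a whole emoji sequence costs a flat 2
      continue;
    }
    for (const ch of segment) {
      const cp = ch.codePointAt(0);
      const light = LIGHT_RANGES.some(([lo, hi]) => cp >= lo && cp <= hi);
      total += light ? 1 : 2;
    }
  }
  return total;
}

console.log(weightedLength("hello"));  // 5
console.log(weightedLength("日本語")); // 6
console.log(weightedLength("😀"));     // 2
```

Segmenting by grapheme first is what lets a ZWJ family or a flag cost a flat 2 instead of being charged per code point.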

Where does the 160-character SMS limit come from?

3GPP TS 23.038 (originally GSM Recommendation 03.38, mandatory for GSM handsets) defines the GSM 7-bit default alphabet. An SMS message envelope carries up to 140 octets of payload; with 7 bits per character that yields ⌊140 × 8 / 7⌋ = 160 characters per single SMS. If a message contains any character outside the 7-bit tables (many accented Latin letters such as á, ê, and õ, all emoji, all CJK), it has to be sent in UCS-2 encoding (16 bits per character) and the per-segment limit drops to 70. The default alphabet does include a handful of accented letters, among them é, ñ, and ü, so those alone do not force UCS-2. Some markets ship national language shift tables (Portuguese, Turkish, several Brahmic scripts) that extend the 7-bit set. Multi-part SMS (per 3GPP TS 23.040) adds a User Data Header that further reduces per-segment payload to 153 (7-bit) or 67 (UCS-2).
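
The segment arithmetic above can be sketched as a small estimator. It is a simplification under stated assumptions: national language shift tables are ignored, characters from the GSM extension table are charged two septets (the escape plus the character), and segment boundaries are computed by plain division, so cases where an escape sequence or surrogate pair would straddle a boundary are not modelled. `smsSegments` is a hypothetical name.

```javascript
// GSM 7-bit default alphabet (basic table), listed without regard to position.
const GSM_BASIC = new Set(
  "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?¡" +
  "ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà"
);
// Extension table: each character needs an escape, so it costs 2 septets.
const GSM_EXTENSION = new Set("\f^{}\\[~]|€");

function smsSegments(text) {
  let septets = 0;
  let fitsGsm = true;
  for (const ch of text) {
    if (GSM_BASIC.has(ch)) septets += 1;
    else if (GSM_EXTENSION.has(ch)) septets += 2;
    else { fitsGsm = false; break; }
  }
  if (fitsGsm) {
    // 160 septets in a single message, 153 per part once a header is needed.
    const segments = septets <= 160 ? 1 : Math.ceil(septets / 153);
    return { encoding: "GSM-7", units: septets, segments };
  }
  // UCS-2 path: count UTF-16 code units; 70 single, 67 per part when split.
  const units = text.length;
  const segments = units <= 70 ? 1 : Math.ceil(units / 67);
  return { encoding: "UCS-2", units, segments };
}

console.log(smsSegments("a".repeat(161)));
// { encoding: 'GSM-7', units: 161, segments: 2 }
console.log(smsSegments("Olá"));
// { encoding: 'UCS-2', units: 3, segments: 1 }
```

A single á is enough to move the whole message to UCS-2, which is the billing trap the 70-character figure describes.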

Are emoji always 2 characters everywhere?

It depends on the system. ECMA-262 code-point counting treats a simple emoji like U+1F600 😀 as 1; a regional indicator pair like 🇺🇸 (two code points U+1F1FA + U+1F1F8) as 2; and a ZWJ family 👨‍👩‍👧 as 5. UAX #29 grapheme-cluster counting collapses all three to 1 user-perceived character. X Corp's weighted counter charges every emoji 2 characters regardless of underlying complexity. SMS using the GSM 7-bit alphabet doesn't carry emoji at all — the message gets re-encoded as UCS-2 and each emoji costs one or two UTF-16 code units depending on plane. The 'right' count depends on which platform's billing or limit rule the user is trying to satisfy.

How does this counter handle accessibility for screen readers?

The total and stripped counts and the Twitter/SMS progress bars sit inside a region marked `aria-live="polite"`. That WAI-ARIA attribute lets assistive technology announce content updates without moving keyboard focus, which is what W3C WCAG Success Criterion 4.1.3 Status Messages (introduced in WCAG 2.1, W3C Recommendation 5 June 2018; carried unchanged into WCAG 2.2, Recommendation 5 October 2023) requires of status messages. The polite setting queues announcements behind any speech the user is already hearing, which suits non-urgent tally updates; assertive would interrupt mid-sentence on every keystroke. Screen readers (NVDA, JAWS, VoiceOver) consume the live region automatically; nothing else is required from the user.
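
The live-region pattern described above might look roughly like this. It is a sketch, not this page's markup or script: the function names and the element id `counts` are hypothetical, and the functions accept any object with `setAttribute` and `textContent` so they can be exercised outside a browser.

```javascript
// Mark a container as a polite live region (done once, at setup time).
function setUpLiveRegion(region) {
  region.setAttribute("aria-live", "polite");
}

// Update the region's text; assistive technology announces the change
// after any speech already in progress.
function announceCounts(region, withSpaces, withoutSpaces) {
  region.textContent =
    `${withSpaces} characters with spaces, ${withoutSpaces} without spaces`;
}

// Browser usage, assuming an element with the hypothetical id "counts":
// const region = document.getElementById("counts");
// setUpLiveRegion(region);
// announceCounts(region, 4, 3);
```

Setting the attribute before the first update matters, because screen readers generally only announce changes made to a region they are already observing.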

Sources (6)
  • The Unicode Consortium (2024). The Unicode Standard, Version 16.0. Unicode Consortium, Mountain View, CA (released 10 September 2024).
  • Davis, M. (Ed.) (2024). UAX #29: Unicode Text Segmentation. Unicode Standard Annex, Revision 45 (Unicode 16.0).
  • ECMA International (2025). ECMAScript 2025 Language Specification — String.prototype [@@iterator] / String Iterator Objects. ECMA-262, 16th edition, June 2025.
  • 3GPP (2024). TS 23.038: Alphabets and language-specific information (Release 18). 3rd Generation Partnership Project, technical specification (originally GSM 03.38).
  • X Corp (2024). Counting Characters. X Developer Platform documentation, docs.x.com/fundamentals/counting-characters.
  • World Wide Web Consortium (W3C) (2018). Web Content Accessibility Guidelines (WCAG) 2.1 — Success Criterion 4.1.3 Status Messages. W3C Recommendation 5 June 2018; carried unchanged into WCAG 2.2 (Recommendation 5 October 2023).

These are the specifications and documentation pages that the counting rules in this tool are based on. Locate them by title on the publishing organization's website.