What Is an SMS Counter?
An SMS counter is a tool that calculates how many characters your message contains, detects the encoding type (GSM 7-bit or UCS-2), and determines how many SMS segments the message will be split into. This matters because mobile carriers charge per segment, not per message, and exceeding the single-SMS limit can double or triple your cost without warning.
Unlike a regular character counter, an SMS counter accounts for the technical constraints of the Short Message Service, in particular the character encodings defined in GSM 03.38 (maintained today as 3GPP TS 23.038). It considers not just the length of your text, but which characters it contains and how they must be encoded for transmission over cellular networks.
How GSM 7-Bit Encoding Works
The GSM 03.38 specification defines a default alphabet of 128 code positions that can be encoded using 7 bits each. It covers uppercase and lowercase Latin letters, digits, common punctuation, a handful of accented letters, ten Greek capitals, and a few special symbols; one position is reserved as the escape code for the extension table. Because each character uses only 7 bits instead of 8, a single SMS can fit up to 160 characters, calculated as (140 bytes × 8 bits) / 7 bits per character.
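The 160-character figure follows from packing 7-bit values ("septets") back to back into the 140-byte payload. The snippet below is a minimal sketch of that packing, assuming the input is already a list of GSM 7-bit code values; `pack_septets` is an illustrative name, not a function from any library.

```python
def pack_septets(septets):
    """Pack 7-bit values into bytes, least significant bits first."""
    acc = 0      # bit accumulator
    nbits = 0    # number of valid bits currently held in acc
    out = bytearray()
    for s in septets:
        acc |= (s & 0x7F) << nbits
        nbits += 7
        while nbits >= 8:
            out.append(acc & 0xFF)
            acc >>= 8
            nbits -= 8
    if nbits:
        out.append(acc & 0xFF)  # flush the remaining bits, zero-padded
    return bytes(out)

# Basic Latin letters have the same code values in GSM 7-bit as in ASCII,
# so ord() is sufficient for this demonstration.
print(pack_septets([ord(c) for c in "hello"]).hex())  # e8329bfd06
print(len(pack_septets([0x41] * 160)))                # 140
```

Five characters fit in five bytes here only because of padding; the saving shows at scale, where 160 septets occupy exactly 140 bytes.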
GSM 7-bit basic character set:
| Category | Characters |
|---|---|
| Letters | A–Z, a–z |
| Digits | 0–9 |
| Accented and national letters | è, é, ù, ì, ò, à, Ç, Ø, ø, Å, å, Æ, æ, ß, É, Ä, Ö, Ñ, Ü, ä, ö, ñ, ü |
| Greek capitals | Δ, Φ, Γ, Λ, Ω, Π, Ψ, Σ, Θ, Ξ |
| Punctuation | !, ", #, %, &, ', (, ), *, +, ,, -, ., /, :, ;, ?, ¡, ¿, _ |
| Symbols | @, $, £, ¥, ¤, §, <, =, > |
| Special | Space, newline, carriage return |
| Extended | ^, {, }, \, [, ~, ], \|, € (require escape byte) |
Extended GSM 7-bit characters (^, {, }, \, [, ], ~, |, and €) are sent as an escape code (0x1B) followed by a second code from the extension table. The escape occupies one 7-bit slot in the message, so each extended character effectively counts as 2 characters toward the 160-character limit.
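To make the two-for-one cost concrete, here is a small sketch that maps each extended character to the code sent after the escape and counts septets accordingly. The code values are those of the GSM 03.38 extension table; the function names are illustrative, and `septet_length` assumes every non-extended character is in the basic set.

```python
ESC = 0x1B

# GSM 03.38 extension table: character -> code sent after the escape.
GSM7_EXT = {
    "^": 0x14, "{": 0x28, "}": 0x29, "\\": 0x2F,
    "[": 0x3C, "~": 0x3D, "]": 0x3E, "|": 0x40, "€": 0x65,
}

def septet_length(text):
    """Count septets, assuming all other characters are in the basic set."""
    return sum(2 if ch in GSM7_EXT else 1 for ch in text)

def encode_extended(ch):
    """Return the two septets transmitted for one extended character."""
    return [ESC, GSM7_EXT[ch]]

print(septet_length("Price: €50"))  # 11 (10 characters, one of them extended)
print(encode_extended("€"))         # [27, 101]
```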
UCS-2 Encoding: When 7 Bits Is Not Enough
When your message contains characters outside the GSM 7-bit alphabet, such as emoji, curly quotes, accented letters like ě or ř (é and ü, by contrast, are part of the basic set), dashes like — or –, or most other Unicode symbols, the entire message switches to UCS-2 (16-bit) encoding. In UCS-2 mode, each character occupies 16 bits, reducing the single-SMS capacity from 160 to 70 characters.
Characters that force UCS-2 encoding:
- Emoji and pictographs (😊, 🚀, ❤️)
- Curly quotes (“ ”, ‘ ’)
- Long dashes (—, –)
- Accented characters not in GSM basic set (ě, ř, č)
- Currency symbols beyond $, £, ¥, €
- Arrows (→, ←, ↑, ↓)
- Mathematical symbols (∑, ∫, √, ∞)
- Most CJK (Chinese, Japanese, Korean) characters
Once a message contains even a single character outside the GSM 7-bit alphabet, the entire message uses 16-bit encoding. There is no hybrid mode: it is all 7-bit or all 16-bit.
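Because one stray character changes the encoding of the whole message, it helps to locate the offenders. The following sketch lists every character that falls outside both the basic set and the extension table; `ucs2_triggers` is a hypothetical helper name chosen for this example.

```python
# GSM 03.38 basic set (127 characters; the escape code itself is excluded).
GSM7_BASIC = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
GSM7_EXT = set("^{}\\[~]|€")

def ucs2_triggers(text):
    """Return (index, character) pairs for characters outside GSM 7-bit."""
    return [(i, ch) for i, ch in enumerate(text)
            if ch not in GSM7_BASIC and ch not in GSM7_EXT]

print(ucs2_triggers("Café at 5pm"))              # []
print(ucs2_triggers("It\u2019s 5 \u2013 7pm"))   # [(2, '’'), (7, '–')]
```

The second call shows how a curly apostrophe and an en dash, both common after pasting from a word processor, are enough to force UCS-2.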
Multipart SMS and Segment Limits
When your message exceeds the single-SMS limit (160 GSM characters or 70 UCS-2 characters), it is sent as multiple segments that the receiving phone reassembles. Each segment carries a 6-byte User Data Header (UDH) for reassembly, which reduces the available payload space.
| Encoding | Single SMS | Multipart (per segment) |
|---|---|---|
| GSM 7-bit | 160 characters | 153 characters |
| UCS-2 | 70 characters | 67 characters |
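For reference, the 6-byte header that produces these per-segment limits can be sketched as follows. This assumes the common concatenation header with an 8-bit reference number; a variant with a 16-bit reference is one byte longer and is not covered here. `concat_udh` is an illustrative name.

```python
def concat_udh(ref, total, seq):
    """Build the 6-byte concatenation header (8-bit reference variant)."""
    if not (0 <= ref <= 255 and 1 <= seq <= total <= 255):
        raise ValueError("ref, total and seq must each fit in one byte")
    return bytes([
        0x05,   # UDHL: five header bytes follow
        0x00,   # IEI: concatenated short message, 8-bit reference
        0x03,   # IEDL: three data bytes follow
        ref,    # same value in every segment of one message
        total,  # total number of segments
        seq,    # 1-based index of this segment
    ])

print(concat_udh(0x2A, 2, 1).hex())  # 0500032a0201
```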
How the calculation works:
GSM 7-bit multipart:
characters_per_segment = floor((140 - 6) × 8 / 7) = floor(153.14) = 153
UCS-2 multipart:
characters_per_segment = (140 - 6) / 2 = 67
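A 200-character message in GSM 7-bit encoding splits into 2 segments (153 + 47). A message containing emoji that comes to 150 UCS-2 characters splits into 3 segments (67 + 67 + 16). Note that UCS-2 length is counted in 16-bit code units, and most emoji occupy two of them, so an emoji usually counts as 2 characters. Most carriers bill per segment, so understanding these limits helps you control costs.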
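The two worked examples can be reproduced with a few lines of arithmetic. This is a simplified sketch: it takes a length already expressed in septets (GSM 7-bit) or 16-bit code units (UCS-2), and it ignores the detail that an escape pair or a surrogate pair should not be split across two segments. `segment_lengths` and `LIMITS` are names made up for this example.

```python
# (single-SMS limit, per-segment limit in a multipart message)
LIMITS = {"GSM 7-bit": (160, 153), "UCS-2": (70, 67)}

def segment_lengths(units, encoding):
    """Return the payload length of each segment for a message of `units`."""
    single, multi = LIMITS[encoding]
    if units <= single:
        return [units] if units else []
    full, rest = divmod(units, multi)
    return [multi] * full + ([rest] if rest else [])

print(segment_lengths(200, "GSM 7-bit"))  # [153, 47]
print(segment_lengths(150, "UCS-2"))      # [67, 67, 16]
print(segment_lengths(160, "GSM 7-bit"))  # [160]
```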
Code Examples
PHP: Count SMS Segments
function countSmsSegments(string $text): array {
    // GSM 03.38 basic set. Newline and carriage return are appended separately
    // because single-quoted PHP strings do not interpret \n or \r.
    $gsm7 = '@£$¥èéùìòÇØøÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !"#¤%&\'()*+,-./0123456789:;<=>?¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà' . "\n\r";
    $gsm7ext = ['^', '{', '}', '\\', '[', '~', ']', '|', '€'];
    $isUcs2 = false;
    $extCount = 0;

    // mb_str_split() requires PHP 7.4 or later.
    foreach (mb_str_split($text) as $char) {
        if (in_array($char, $gsm7ext, true)) {
            $extCount++;
        } elseif (mb_strpos($gsm7, $char) === false) {
            $isUcs2 = true;
            break;
        }
    }

    if ($isUcs2) {
        // Count 16-bit code units: characters outside the BMP (most emoji) take two.
        $length = intdiv(strlen(mb_convert_encoding($text, 'UTF-16BE', 'UTF-8')), 2);
        [$encoding, $single, $multi] = ['UCS-2', 70, 67];
    } else {
        // Each extended character costs one extra septet for the escape code.
        $length = mb_strlen($text) + $extCount;
        [$encoding, $single, $multi] = ['GSM 7-bit', 160, 153];
    }

    // Multipart messages lose space to the 6-byte UDH in every segment.
    $maxPerSegment = $length <= $single ? $single : $multi;
    $segments = (int) ceil($length / $maxPerSegment);

    return ['encoding' => $encoding, 'segments' => $segments, 'chars_per_segment' => $maxPerSegment];
}
JavaScript: Detect SMS Encoding
// GSM 03.38 basic set, including newline (\n) and carriage return (\r).
// A string is iterable, so Set receives one entry per character.
const GSM7_BASIC = new Set(
  '@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !"#¤%&\'()*+,-./0123456789:;<=>?¡' +
  'ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà'
);
const GSM7_EXT = new Set(['^', '{', '}', '\\', '[', '~', ']', '|', '€']);

function detectSmsEncoding(text) {
  let extCount = 0;
  for (const char of text) {
    if (GSM7_EXT.has(char)) {
      extCount++;
    } else if (!GSM7_BASIC.has(char)) {
      // One character outside GSM 7-bit switches the whole message to UCS-2,
      // where the escape mechanism (and so extCount) no longer applies.
      return { encoding: 'UCS-2', extCount: 0 };
    }
  }
  return { encoding: 'GSM 7-bit', extCount };
}

const result = detectSmsEncoding('Hello World! €50');
console.log(result); // { encoding: 'GSM 7-bit', extCount: 1 }
Python: Calculate SMS Parts
def sms_parts(text: str) -> dict:
    # GSM 03.38 basic set, including newline and carriage return
    gsm7_basic = set('@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !"#¤%&\'()*+,-./0123456789:;<=>?¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà')
    gsm7_ext = {'^', '{', '}', '\\', '[', '~', ']', '|', '€'}
    is_ucs2 = False
    ext_count = 0
    for char in text:
        if char in gsm7_ext:
            ext_count += 1
        elif char not in gsm7_basic:
            is_ucs2 = True
            break
    if is_ucs2:
        # Count 16-bit code units: most emoji need two (a surrogate pair)
        units = len(text.encode('utf-16-be')) // 2
        limit = 70 if units <= 70 else 67
        parts = (units + limit - 1) // limit
        return {'encoding': 'UCS-2', 'parts': parts, 'chars_per_part': limit}
    # Each extended character costs one extra septet for the escape code
    effective = len(text) + ext_count
    limit = 160 if effective <= 160 else 153
    parts = (effective + limit - 1) // limit
    return {'encoding': 'GSM 7-bit', 'parts': parts, 'chars_per_part': limit}

print(sms_parts('Hello World'))     # 1 part, GSM 7-bit
print(sms_parts('Hello 😊 World'))  # UCS-2 encoding triggered, still 1 part
Why SMS Encoding Matters for Developers
If you build applications that send SMS notifications — such as two-factor authentication codes, appointment reminders, or marketing campaigns — understanding SMS encoding helps you control both deliverability and cost.
Key takeaways for developers:
- Always validate message length before sending to avoid unexpected multipart charges
- Strip or replace UCS-2 characters when possible to keep messages in GSM 7-bit
- Test with extended characters like € and []: each one takes two 7-bit slots instead of one
- Account for the 6-byte UDH overhead when estimating multipart capacity
- Consider using a dedicated SMS counter tool during development and testing
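As one way to act on the "strip or replace" advice above, the sketch below swaps a few common typographic characters for GSM 7-bit equivalents before sending. The replacement table is an assumption chosen for illustration and is far from exhaustive; whether such substitutions are acceptable depends on your content.

```python
# Illustrative mapping only; extend it to match the text your application sends.
REPLACEMENTS = {
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2013": "-", "\u2014": "-",   # en dash, em dash
    "\u2026": "...",                # horizontal ellipsis
    "\u00a0": " ",                  # non-breaking space
}
_TABLE = str.maketrans(REPLACEMENTS)

def to_gsm_friendly(text):
    """Replace common typographic characters with GSM 7-bit equivalents."""
    return text.translate(_TABLE)

print(to_gsm_friendly("\u201cHi\u201d \u2013 it\u2019s ready\u2026"))
# "Hi" - it's ready...
```

Characters with no sensible equivalent, such as emoji, are left untouched, so the result should still be checked for its encoding afterwards.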
Online Tool
The SMS Counter tool on Help2Code provides real-time character counting, encoding detection, and segment preview. Paste your message and instantly see whether it uses GSM 7-bit or UCS-2 encoding, how many segments it requires, and exactly where each segment breaks. The color-coded encoding preview shows you which characters trigger Unicode mode and which are safe in 7-bit.
Conclusion
An SMS counter is an essential utility for anyone who sends text messages programmatically or wants to avoid unexpected carrier charges. By understanding the difference between GSM 7-bit and UCS-2 encoding, and knowing how multipart segmentation works, you can write more cost-effective SMS applications and debug delivery issues faster. Use the SMS Counter tool to test your messages before sending them in production.