Smart character encoding


A single SMS message typically supports up to 160 characters. However, some characters change the encoding scheme and reduce that limit to 70 characters. To prevent this, we automatically replace certain Unicode characters with their GSM 7-bit equivalents to maximize the number of characters in your message.

How it works

By default, SMS messages use the GSM 7-bit character set, which supports up to 160 characters per message. If you go over the character limit, your message is split into multiple messages. This may not be a great experience for your customers, and each message (called a segment or a credit depending on your plan) counts towards your bill!
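To make the billing impact concrete, here is a minimal sketch that estimates how many messages a text splits into under the simple 160-character limit described above. The function name `estimate_segments` is hypothetical, and the model is deliberately simplified: carriers typically reserve part of each segment for reassembly information in multi-part messages, so real segment counts can be slightly higher.

```python
import math

GSM7_LIMIT = 160  # characters per message with GSM 7-bit, per the text above


def estimate_segments(text_length: int, limit: int = GSM7_LIMIT) -> int:
    """Estimate how many messages a text of the given length is split into.

    Simplified model (an assumption): every segment holds `limit` characters.
    """
    if text_length <= 0:
        return 0
    return math.ceil(text_length / limit)


print(estimate_segments(160))  # 1
print(estimate_segments(161))  # 2
print(estimate_segments(400))  # 3
```

A single character over the limit doubles the number of billed messages, which is why keeping messages in the 160-character encoding matters.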

While the default character set covers roughly 128 common symbols, it doesn’t cover all languages or symbols you might use in text messages. When you use a character outside the GSM 7-bit set, the entire message switches to 16-bit UCS-2 encoding. Because each character then takes 16 bits instead of 7, you can only send 70 characters in your message.
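The 160 and 70 limits follow from the size of an SMS payload. The sketch below assumes the standard 140-octet SMS payload (a general SMS fact, not something stated on this page) and uses the GSM 7-bit default alphabet to check whether a message can stay in 7-bit encoding. The helper names `max_characters` and `is_gsm7` are hypothetical, and this is not our actual implementation.

```python
SMS_PAYLOAD_BITS = 140 * 8  # assumption: one SMS carries 140 octets of user data

# GSM 7-bit default alphabet (128 entries, including LF, CR and ESC).
GSM7_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞ\x1bÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extension characters: still GSM 7-bit, but each one is sent with an
# escape prefix, so it occupies two character slots.
GSM7_EXTENSION = "\f^{}\\[~]|€"


def max_characters(bits_per_character: int) -> int:
    """How many characters of the given width fit into one SMS payload."""
    return SMS_PAYLOAD_BITS // bits_per_character


def is_gsm7(text: str) -> bool:
    """True if every character is in the GSM 7-bit set (basic or extension)."""
    allowed = set(GSM7_BASIC + GSM7_EXTENSION)
    return all(ch in allowed for ch in text)


print(max_characters(7))    # 160
print(max_characters(16))   # 70
print(is_gsm7("Hello, world!"))      # True
print(is_gsm7("Cr\u00e8me br\u00fbl\u00e9e"))  # False: û is outside the set
```

A single character outside the set is enough to move the whole message to the 70-character limit.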

To prevent this problem, we automatically replace certain characters with their GSM 7-bit equivalents to maximize the number of characters in your messages, so you don’t have to worry about complicated encoding schemes and message length limits.

This is all a fancy way of saying that we encode messages to support 160 characters whenever possible.
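As a rough illustration of the replacement step, the sketch below applies a small subset of the replacement table further down this page using Python's built-in `str.translate`. The names `REPLACEMENTS` and `smart_encode` are hypothetical, and this is only an approximation of the behavior described here, not the actual implementation.

```python
# Illustrative subset of the replacement table; see the full list on this page.
REPLACEMENTS = {
    "\u2018": "'",     # left single quotation mark
    "\u2019": "'",     # right single quotation mark
    "\u201c": '"',     # left double quotation mark
    "\u201d": '"',     # right double quotation mark
    "\u2013": "-",     # en dash
    "\u2014": "-",     # em dash
    "\u00a9": "(C)",   # copyright sign
    "\u00ae": "(R)",   # registered sign
    "\u2122": "(TM)",  # trademark sign
    "\u00bd": "1/2",   # vulgar fraction one half
}

# str.maketrans accepts single-character keys mapped to replacement strings.
_TRANSLATION_TABLE = str.maketrans(REPLACEMENTS)


def smart_encode(text: str) -> str:
    """Replace known non-GSM characters with their GSM 7-bit equivalents."""
    return text.translate(_TRANSLATION_TABLE)


message = "\u201cSave 50%\u201d \u2014 today only\u2122"
print(smart_encode(message))  # "Save 50%" - today only(TM)
```

Characters that are not in the table pass through unchanged, so a message containing them would still fall back to 16-bit encoding.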

Some symbols still consume more than one character

You’ll notice that, while we replace many characters with their GSM 7-bit equivalents, some symbols still consume more than 1 character after we replace them.

For example, we replace the trademark symbol with (TM), which takes up 4 characters in your message. This is the most efficient way to represent the symbol while keeping the message within the 160-character encoding.
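The trade-off can be shown with a little arithmetic. This is a minimal sketch using the 160 and 70 limits from this page; the sample text and variable names are made up for illustration.

```python
UCS2_LIMIT = 70    # limit when the message contains a non-GSM character
GSM7_LIMIT = 160   # limit when every character is in the GSM 7-bit set

original = "Acme\u2122 sale"                    # contains the trademark symbol
replaced = original.replace("\u2122", "(TM)")   # "Acme(TM) sale"

# Room left in a single message before and after the replacement.
remaining_before = UCS2_LIMIT - len(original)
remaining_after = GSM7_LIMIT - len(replaced)

print(len(original), remaining_before)  # 10 60
print(len(replaced), remaining_after)   # 13 147
```

The replacement makes the text three characters longer, but the message keeps far more room overall because it stays in the 160-character encoding.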

Emojis limit messages to 70 characters

Emojis are a special case. They’re not part of the GSM 7-bit character set and have no GSM 7-bit equivalents, so we can’t replace them like we do with other Unicode characters.

So, when you use emojis, you’re limited to 70 characters in your message. Each emoji is counted as 2 of the 70 available characters.
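One way to see why an emoji counts as 2 characters is to measure the message in 16-bit units. The sketch below assumes the emoji lies outside Unicode's Basic Multilingual Plane, so it is encoded as a pair of 16-bit units; emoji sequences built from several code points (skin tones or joined emojis, for example) would take more. The helper name `ucs2_length` is hypothetical.

```python
def ucs2_length(text: str) -> int:
    """Count the 16-bit units needed to encode the text."""
    # UTF-16 uses 2 bytes per unit; "-be" avoids adding a byte-order mark.
    return len(text.encode("utf-16-be")) // 2


message = "Hi \U0001F600"  # "Hi" followed by a grinning-face emoji

print(len(message))          # 4 (Python counts code points)
print(ucs2_length(message))  # 5 (the emoji takes two 16-bit units)
```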

Smart encoding character replacements

By default, we replace the following characters with their GSM 7-bit equivalents:

UNICODE    ORIGINAL    REPLACEMENT
‘`’`'
‘\u00a2’¢cents
‘\u00a8’¨"
‘\u00a9’©(C)
‘\u00aa’ªa
‘\u00ab’««
‘\u00ac’¬NOT
‘\u00ad’-
‘\u00ae’®(R)
‘\u00af’¯-
‘\u00b0’°deg
‘\u00b1’±+/-
‘\u00b2’²^2
‘\u00b3’³^3
‘\u00b4’´'
‘\u00b5’µu
‘\u00b6’P
‘\u00b7’·.
‘\u00b8’¸,
‘\u00ba’ºo
‘\u00bb’»»
‘\u00bc’¼1/4
‘\u00bd’½1/2
‘\u00be’¾3/4
‘\u00c0’ÀA
‘\u00c1’ÁA
‘\u00c2’ÂA
‘\u00c3’ÃA
‘\u00c8’ÈE
‘\u00ca’ÊE
‘\u00cb’ËE
‘\u00cc’ÌI
‘\u00cd’ÍI
‘\u00ce’ÎI
‘\u00cf’ÏI
‘\u00d0’ÐD
‘\u00d2’ÒO
‘\u00d3’ÓO
‘\u00d4’ÔO
‘\u00d5’ÕO
‘\u00d7’×x
‘\u00d9’ÙU
‘\u00da’ÚU
‘\u00db’ÛU
‘\u00dd’ÝY
‘\u00de’ÞTH
‘\u00e1’áa
‘\u00e2’âa
‘\u00e3’ãa
‘\u00e7’çc
‘\u00ea’êe
‘\u00eb’ëe
‘\u00ed’íi
‘\u00ee’îi
‘\u00ef’ïi
‘\u00f0’ðd
‘\u00f3’óo
‘\u00f4’ôo
‘\u00f5’õo
‘\u00f7’÷/
‘\u00fa’úu
‘\u00fb’ûu
‘\u00fd’ýy
‘\u00fe’þth
‘\u00ff’ÿy
‘\u01c3’ǃ!
‘\u0262’ɢG
‘\u026a’ɪI
‘\u0274’ɴN
‘\u0280’ʀR
‘\u028f’ʏY
‘\u0299’ʙB
‘\u029c’ʜH
‘\u029f’ʟL
‘\u02b9’ʹ'
‘\u02ba’ʺ"
‘\u02bb’ʻ'
‘\u02bc’ʼ'
‘\u02bd’ʽ'
‘\u02c6’ˆ^
‘\u02c8’ˈ'
‘\u02ca’ˊ'
‘\u02cb’ˋ'
‘\u02dc’˜~
‘\u02ee’ˮ"
‘\u02f7’˷~
‘\u0302’̂^
‘\u0303’̃~
‘\u0313’̓'
‘\u0314’̔'
‘\u0330’̰~
‘\u0332’̲_
‘\u0334’̴~
‘\u0337’̷/
‘\u0338’̸/
‘\u0347’͇=
‘\u1d00’A
‘\u1d04’C
‘\u1d05’D
‘\u1d07’E
‘\u1d0a’J
‘\u1d0b’K
‘\u1d0d’M
‘\u1d0f’O
‘\u1d18’P
‘\u1d1b’T
‘\u1d1c’U
‘\u1d20’V
‘\u1d21’W
‘\u1d22’Z
‘\u1dcd’^
‘\u2010’-
‘\u2011’-
‘\u2012’-
‘\u2013’-
‘\u2014’-
‘\u2015’-
‘\u2017’__
‘\u2018’''
‘\u2019’''
‘\u201a’,
‘\u201b’'
‘\u201c’""
‘\u201d’""
‘\u201e’,,
‘\u201f’"
‘\u2020’+
‘\u2021’++
‘\u2022’*
‘\u2023’>
‘\u2024’.
‘\u2025’..
‘\u2026’
‘\u2027’-
‘\u2028’
‘\u2029’
‘\u2030’/1000
‘\u2031’/10000
‘\u2032’'
‘\u2033’''
‘\u2034’’''
‘\u2035’'
‘\u2036’''
‘\u2037’’''
‘\u2038’^
‘\u2039’<
‘\u203a’>
‘\u203b’*
‘\u203c’!!
‘\u203d’?!
‘\u203e’-
‘\u2043’-
‘\u2044’/
‘\u2045’[
‘\u2046’]
‘\u2047’??
‘\u2048’?!
‘\u2049’!?
‘\u204a’&
‘\u204b’P
‘\u204c’<
‘\u204d’>
‘\u204e’*
‘\u204f’;
‘\u2051’**
‘\u2052’-
‘\u2053’~
‘\u2054’~
‘\u2055’*
‘\u2056’
‘\u2057’’’''
‘\u2058’….
‘\u2059’…..
‘\u205a’..
‘\u205b’….
‘\u205c’+
‘\u205d’:
‘\u205e’:
‘\u2070’^0
‘\u2071’^i
‘\u2072’^n
‘\u2073’^m
‘\u2074’^4
‘\u2075’^5
‘\u2076’^6
‘\u2077’^7
‘\u2078’^8
‘\u2079’^9
‘\u207a’^+
‘\u207b’^-
‘\u207c’^=
‘\u207d’^(
‘\u207e’^)
‘\u207f’^n
‘\u2080’_0
‘\u2081’_1
‘\u2082’_2
‘\u2083’_3
‘\u2084’_4
‘\u2085’_5
‘\u2086’_6
‘\u2087’_7
‘\u2088’_8
‘\u2089’_9
‘\u208a’_+
‘\u208b’_-
‘\u208c’_=
‘\u208d’_(
‘\u208e’_)
‘\u208f’_y
‘\u20a9’KRW
‘\u20b9’INR
‘\u20ba’TRY
‘\u20bd’RUB
‘\u20d2’'
‘\u20d3’'
‘\u20e5’\
‘\u2122’(TM)
‘\u2150’1/7
‘\u2151’1/9
‘\u2152’1/10
‘\u2153’1/3
‘\u2154’2/3
‘\u2155’1/5
‘\u2156’2/5
‘\u2157’3/5
‘\u2158’4/5
‘\u2159’1/6
‘\u215a’5/6
‘\u215b’1/8
‘\u215c’3/8
‘\u215d’5/8
‘\u215e’7/8
‘\u2202’partial
‘\u2207’nabla
‘\u220f’prod
‘\u2211’sum
‘\u221a’sqrt
‘\u221d’prop
‘\u221e’inf
‘\u2220’angle
‘\u2221’mangle
‘\u2222’sangle
‘\u2224’!
‘\u2225’
‘\u2226’!
‘\u2227’and
‘\u2228’or
‘\u2229’intersect
‘\u222a’union
‘\u222b’int
‘\u2248’~=
‘\u2260’!=
‘\u2264’<=
‘\u2265’>=
‘\u2282’subset
‘\u2283’superset
‘\u2284’!subset
‘\u2285’!superset
‘\u2286’subseteq
‘\u2287’supseteq
‘\u2288’!subseteq
‘\u2289’!supseteq
‘\u228a’subsetneq
‘\u228b’supersetneq
‘\ua730’F
‘\ua731’S
‘\ufe10’'
‘\ufe11’'
‘\ufe13’:
‘\ufe14’;
‘\ufe15’!
‘\ufe16’?
‘\ufe50’,
‘\ufe51’,
‘\ufe52’.
‘\ufe54’;
‘\ufe56’?
‘\ufe57’!
‘\ufe59’(
‘\ufe5a’)
‘\ufe5b’{
‘\ufe5c’}
‘\ufe5f’#
‘\ufe60’&
‘\ufe61’*
‘\ufe62’+
‘\ufe63’-
‘\ufe64’<
‘\ufe65’>
‘\ufe66’=
‘\ufe68’\
‘\ufe69’$
‘\ufe6a’%
‘\ufe6b’@
‘\uff01’!
‘\uff02’"
‘\uff03’#
‘\uff04’$
‘\uff05’%
‘\uff06’&
‘\uff07’'
‘\uff08’(
‘\uff09’)
‘\uff0a’*
‘\uff0b’+
‘\uff0c’,
‘\uff0d’-
‘\uff0e’.
‘\uff0f’/
‘\uff10’0
‘\uff11’1
‘\uff12’2
‘\uff13’3
‘\uff14’4
‘\uff15’5
‘\uff16’6
‘\uff17’7
‘\uff18’8
‘\uff19’9
‘\uff1a’:
‘\uff1b’;
‘\uff1c’<
‘\uff1d’=
‘\uff1e’>
‘\uff1f’?
‘\uff20’@
‘\uff21’A
‘\uff22’B
‘\uff23’C
‘\uff24’D
‘\uff25’E
‘\uff26’F
‘\uff27’G
‘\uff28’H
‘\uff29’I
‘\uff2a’J
‘\uff2b’K
‘\uff2c’L
‘\uff2d’M
‘\uff2e’N
‘\uff2f’O
‘\uff30’P
‘\uff31’Q
‘\uff32’R
‘\uff33’S
‘\uff34’T
‘\uff35’U
‘\uff36’V
‘\uff37’W
‘\uff38’X
‘\uff39’Y
‘\uff3a’Z
‘\uff3b’[
‘\uff3c’\
‘\uff3d’]
‘\uff3e’^
‘\uff3f’__
‘\uff5b’{
‘\uff5c’
‘\uff5d’}
‘\uff5e’~
‘\uff61’.
‘\uff64’,