Glossary

MT SMS

Mobile Terminated SMS, i.e. an SMS generated by an application going through a gateway to a device. Often also referred to as Sent SMS and Outgoing SMS.

MO SMS

Mobile Originated SMS. An SMS sent by a device coming through a gateway to an application. Often also referred to as Incoming SMS and Received SMS.

MSISDN

Mobile Station International Subscriber Directory Number. You may think of this as the full mobile number, including the country code and the area code if available, but without leading zeros or +.

Examples:

  • 4510203040 (typical Danish format: 10 20 30 40)
  • 46735551020 (typical Swedish format: 073-555 10 20)
  • 17325551020 (typical US format: (732) 555-1020)

When our APIs parse the MSISDN, we allow for some slack. Whitespace and a leading + are discarded before the value is parsed as an integer, which is then checked against a maximum chosen to be well beyond any reasonable number.

The MSISDN is easily interchangeable with E.164 numbers: you simply remove or add the leading + of E.164. It can contain up to 15 digits, so we use an unsigned 64-bit integer.
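The parsing rules above can be sketched roughly as follows. This is a minimal illustration, not the API's actual implementation: the names normalize_msisdn and to_e164 are hypothetical, and the upper bound is an assumption based on the 15-digit limit of E.164, since the real maximum is not stated here.

```python
import re

# Assumed upper bound: E.164 allows at most 15 digits. The maximum the
# API actually enforces is not documented in this glossary.
MAX_MSISDN = 10**15 - 1


def normalize_msisdn(raw: str) -> int:
    """Drop whitespace and one leading '+', then parse as an integer."""
    cleaned = re.sub(r"\s+", "", raw)
    if cleaned.startswith("+"):
        cleaned = cleaned[1:]
    if not (cleaned.isascii() and cleaned.isdigit()):
        raise ValueError(f"not a valid MSISDN: {raw!r}")
    value = int(cleaned)  # leading zeros disappear here
    if value > MAX_MSISDN:
        raise ValueError(f"MSISDN out of range: {raw!r}")
    return value


def to_e164(msisdn: int) -> str:
    """An E.164 number is the MSISDN with a leading '+'."""
    return f"+{msisdn}"
```

For example, normalize_msisdn("+45 10 20 30 40") gives 4510203040, and to_e164(4510203040) gives "+4510203040". The value always fits in an unsigned 64-bit integer, since 15 digits is far below 2^64.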

E.164

The standard format for international phone numbers. Up to 15 digits.

MCC

Mobile Country Code, as defined by the ITU-T E.212 standard.

MNC

Mobile Network Code, as defined by the ITU-T E.212 standard.

GSM-7

The GSM standard defines the GSM-7 character encoding, frequently used for encoding SMS contents to a compact size. It uses a 7-bit character set, with an escape that selects a double-width code point for certain characters, which makes GSM-7 a variable-width encoding.

It achieves the compact size by using a few tricks:

  • Use a small character set
  • Use bit-packing to fit 160 characters into 140 bytes
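The bit-packing trick can be illustrated with a short sketch. It packs 7-bit values (septets) least significant bit first, which is the usual SMS packing scheme. The function name pack_septets is hypothetical, and the input is assumed to be already mapped to GSM-7 code values (for a-z, A-Z and 0-9 these coincide with ASCII).

```python
def pack_septets(septets) -> bytes:
    """Pack 7-bit values into bytes, least significant bits first."""
    out = bytearray()
    acc = 0   # bit accumulator
    bits = 0  # number of valid bits currently held in acc
    for septet in septets:
        acc |= (septet & 0x7F) << bits
        bits += 7
        while bits >= 8:
            out.append(acc & 0xFF)
            acc >>= 8
            bits -= 8
    if bits:
        # Flush the remaining bits, zero-padded to a full byte.
        out.append(acc & 0xFF)
    return bytes(out)


# 160 septets * 7 bits = 1120 bits = 140 bytes
assert len(pack_septets([0] * 160)) == 140
```

Packing the five septets of "hello" this way yields the bytes E8 32 9B FD 06, one byte shorter than the unpacked text would need.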

National Shift Tables

Some GSM-7 codecs make it possible to encode text using specific shift tables that alter the meaning of the code pages in GSM-7.

GatewayAPI has no support for this currently. In situations where this would have been useful, we recommend using UCS-2.

The character set is designed so most characters used in western text can be encoded without any problems.

The full character set is:

  • Basic Latin
    • a b c d e f g h i j k l m n o p q r s t u v w x y z
    • A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
    • 0 1 2 3 4 5 6 7 8 9
    • ! " # $ % & ' ( ) * + , - . / _ : ; < = > ? @
  • Whitespace
    • Space, Newline \n and Carriage Return \r
  • Special Characters
    • £ ¥ § ¿
  • Greek Characters
    • Δ Φ Γ Λ Ω Π Ψ Σ Θ Ξ
  • Diacritics
    • è é ù ì ò Ç Ø ø Å å Æ æ ß É Ä Ö Ñ Ü ä ö ñ ü à
  • Extra Special Characters, each encoded using a double-width code point
    • ^ { } [ ] ~ |

Furthermore, Wikipedia has a useful page which shows the encoding as a table.
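As a rough sketch of how the character set above could be used, the following checks whether a text fits in GSM-7 and counts how many 7-bit units it needs, with the double-width characters counting twice. The sets simply mirror the lists in this glossary, and the name gsm7_septets is hypothetical. It is not necessarily how our APIs decide on an encoding.

```python
# Single-width characters, as listed in this glossary.
BASIC = set(
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789"
    "!\"#$%&'()*+,-./_:;<=>?@"
    " \n\r"
    "£¥§¿"
    "ΔΦΓΛΩΠΨΣΘΞ"
    "èéùìòÇØøÅåÆæßÉÄÖÑÜäöñüà"
)

# Double-width characters: an escape followed by the character code.
EXTENDED = set("^{}[]~|")


def gsm7_septets(text: str):
    """Return the number of 7-bit units needed, or None if the text
    contains a character outside the GSM-7 set listed above."""
    total = 0
    for ch in text:
        if ch in BASIC:
            total += 1
        elif ch in EXTENDED:
            total += 2
        else:
            return None
    return total
```

With this sketch, gsm7_septets("Hello") is 5 and gsm7_septets("[ok]") is 6, while a text containing an emoji returns None and would have to be sent as UCS-2 instead.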

For details on how we split long messages, check the SMS Length section.

UCS-2

Universal Coded Character Set, an early Unicode implementation. For the most part you can use UTF-16BE (Big Endian) interchangeably with UCS-2. This encoding is used when sending messages that cannot be encoded in GSM-7, such as messages in languages not based on the Latin alphabet or messages containing emojis.

UCS-2 Lies

Unlike UTF-16, UCS-2 is a fixed width encoding using 16 bits per code point. This limits the available code points to 2^16, or 65 536, which technically limits UCS-2 encodings to the Unicode Basic Multilingual Plane.

These days, UCS-2 has been superseded by UTF-16 and is therefore not easily available on many platforms, from programming languages to operating systems. Within the SMS ecosystem, UCS-2 exists as an alias for UTF-16, which does allow using multiple code units per code point, vastly expanding the encoding's character set.

UTF-16BE is a superset of UCS-2: a range of code points near the end of the Basic Multilingual Plane, left unassigned in UCS-2, is reserved for surrogates, which allows a single character to be encoded as a pair of 16-bit code units.

This does mean that an SMS that contains a code point outside of the Unicode Basic Multilingual Plane is strictly speaking not quite valid UCS-2, as it travels from origin to destination. But every system in the chain of delivery either does not care if it is encoded correctly or just uses UTF-16BE.
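The difference is easy to see with Python's built-in utf-16-be codec. This small sketch counts 16-bit code units and checks whether a text stays inside the Basic Multilingual Plane. The helper names are hypothetical.

```python
def utf16_code_units(text: str) -> int:
    """Number of 16-bit code units the text occupies in UTF-16BE."""
    return len(text.encode("utf-16-be")) // 2


def is_strict_ucs2(text: str) -> bool:
    """True if every code point lies in the Basic Multilingual Plane."""
    return all(ord(ch) <= 0xFFFF for ch in text)


smiley = "\U0001F642"  # slightly smiling face, outside the BMP

assert utf16_code_units("hi") == 2
assert utf16_code_units(smiley) == 2  # one character, one surrogate pair
assert smiley.encode("utf-16-be") == bytes.fromhex("d83dde42")
assert not is_strict_ucs2(smiley)
```

The emoji is a single character but takes two code units (D83D DE42), so it counts as two when message length is measured in 16-bit units.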

For details on how we split long messages, check the SMS Length section.

Webhooks

Despite the somewhat unfamiliar name, the concepts backing webhooks should be familiar to most developers who have worked with web technologies. A webhook system is, in essence, a user-configurable system that allows events to be sent from one system to a remote system reachable at a certain URL. Such an event could be an SMS status change that needs to be communicated from our systems to yours.

Say, for the sake of example, that you have configured our webhook system to send events to example.com/sms/status. Then, for each event about SMS messages that happens in our systems or that we receive from local operators, we send a request to that URL with the event encoded as JSON.

All it requires from you is a web server of some sort that is able to receive and process the event and respond with a 2xx status code.
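A receiving endpoint can be very small. The sketch below uses only the Python standard library and makes several assumptions that are not stated in this glossary: that events arrive as HTTP POST requests, that port 8000 is reachable from our systems, and that printing the event stands in for your real processing. The names handle_event, WebhookHandler and serve are hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def handle_event(body: bytes) -> int:
    """Process one webhook delivery and return the HTTP status to send."""
    try:
        event = json.loads(body)
    except ValueError:
        return 400  # body was not valid JSON
    # Placeholder processing: replace with e.g. a database update.
    print("received event:", event)
    return 204  # any 2xx status acknowledges the event


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):  # assumption: events are delivered via POST
        length = int(self.headers.get("Content-Length", 0))
        status = handle_event(self.rfile.read(length))
        self.send_response(status)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Run the receiver until interrupted."""
    HTTPServer(("", port), WebhookHandler).serve_forever()
```

Calling serve() starts the receiver. Keeping the processing in a plain function such as handle_event makes it easy to test without starting a server.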

JSON

A simple and fairly common data format, derived from JavaScript.