What Is URL Encoding
URL encoding (percent-encoding) is a mechanism for converting characters into a format that is safe for use in URLs. The URI standard (RFC 3986) allows only a limited set of ASCII characters in addresses: Latin letters, digits, a few unreserved marks (-, _, ., ~), and a set of reserved characters that act as delimiters. Everything else, including spaces and non-ASCII text such as Cyrillic, must be encoded, and so must reserved characters when they appear as data rather than as delimiters.
An encoded character looks like a percent sign % followed by two hexadecimal digits representing a UTF-8 byte. For example, a space is encoded as %20, and the Cyrillic letter "а" becomes %D0%B0 (two UTF-8 bytes).
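The byte-by-byte mapping can be checked with a minimal Python sketch. It uses the standard urllib.parse.quote() function; the variable names are only illustrative.

```python
from urllib.parse import quote

# Each character is converted to its UTF-8 bytes, and every byte becomes %XX.
space = quote(" ")
cyrillic_a = quote("а")  # Cyrillic small letter a (U+0430), not Latin "a"

print(space)                              # %20
print(cyrillic_a)                         # %D0%B0
print("а".encode("utf-8").hex().upper())  # D0B0, the same two bytes
```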
Why URL Encoding Is Needed
A URL has a strict structure: protocol, host, path, query parameters, fragment. Special characters serve as delimiters: / separates path segments, ? starts the query string, & separates parameters, and # marks the fragment. If these characters appear in the data (for example, in a parameter value), they must be encoded so the browser and server can parse the URL correctly.
Example of the problem: suppose you want to pass the search query "blue shirt & white" in a URL:
https://shop.example.com/search?q=blue shirt & white
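Without encoding, & is read as a parameter separator, so the value of q is cut off after "blue shirt ", and the raw spaces are not valid in a URL at all: depending on the client they may be rewritten or treated as the end of the URL. The correct encoded URL: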
https://shop.example.com/search?q=blue%20shirt%20%26%20white
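As a rough sketch, the same URL can be built in Python by letting the standard library encode the parameter value. Passing quote_via=quote makes urlencode() emit %20 instead of + for spaces; the variable names are illustrative.

```python
from urllib.parse import quote, urlencode

base = "https://shop.example.com/search"

# urlencode() escapes each value; quote_via=quote encodes spaces as %20.
query = urlencode({"q": "blue shirt & white"}, quote_via=quote)
url = f"{base}?{query}"

print(url)  # https://shop.example.com/search?q=blue%20shirt%20%26%20white
```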
Which Characters Are Encoded
Characters fall into three groups:
- Unreserved: do not require encoding: A-Z, a-z, 0-9, -, _, ., ~.
- Reserved: have special meaning in a URL: :, /, ?, #, [, ], @, !, $, &, ', (, ), *, +, ,, ;, =. Encoded when used outside their intended purpose (e.g., inside a parameter value).
- Everything else: spaces, Cyrillic, CJK characters, and special symbols. Always encoded.
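The three groups can be expressed as a small lookup. This is an illustrative sketch only: classify() is a hypothetical helper, not a standard library function, and the character sets are copied from the lists above.

```python
import string

# Character groups as listed above (following RFC 3986).
UNRESERVED = set(string.ascii_letters + string.digits + "-_.~")
RESERVED = set(":/?#[]@!$&'()*+,;=")


def classify(ch: str) -> str:
    """Return the group a single character belongs to."""
    if ch in UNRESERVED:
        return "unreserved"  # never needs encoding
    if ch in RESERVED:
        return "reserved"    # encode when it is data, not a delimiter
    return "other"           # always encode


for ch in ["a", "~", "&", " ", "я"]:
    print(repr(ch), classify(ch))
```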
URL Encoding in Different Languages
JavaScript
JavaScript has two pairs of functions:
- encodeURIComponent()/decodeURIComponent(): encodes parameter values. Encodes all special characters except - _ . ! ~ * ' ( ).
- encodeURI()/decodeURI(): encodes an entire URL. Does not encode delimiter characters (:, /, ?, #, &, =).
Important: use encodeURIComponent() for encoding parameter values and encodeURI() for entire URLs. Confusing the two is a common source of bugs.
PHP
urlencode() encodes a string for use in a query string (spaces are replaced with +). rawurlencode() uses %20 for spaces, which conforms to the RFC 3986 standard.
Python
The urllib.parse module: quote() for encoding path components, quote_plus() for query parameters (space as +).
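A short sketch of the difference between the two Python functions, which mirrors the rawurlencode()/urlencode() split in PHP. Note that quote() leaves "/" unescaped by default; pass safe="" to encode it too. The sample strings are illustrative.

```python
from urllib.parse import quote, quote_plus

# quote() keeps "/" by default, which suits whole paths.
path = quote("summer sale/2024")

# safe="" also escapes "/", which suits a single path segment or a value.
segment = quote("summer sale/2024", safe="")

# quote_plus() follows form encoding: spaces become "+".
form_value = quote_plus("blue shirt & white")

print(path)        # summer%20sale/2024
print(segment)     # summer%20sale%2F2024
print(form_value)  # blue+shirt+%26+white
```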
Common Issues and Solutions
- Double encoding. If data is encoded twice, %20 turns into %2520. Check whether the data is already encoded before encoding it again.
- Space: %20 or +? In a query string, both are valid, but %20 is universal and works in any part of a URL. The + as a space is only valid in application/x-www-form-urlencoded.
- Cyrillic in URLs. Modern browsers display Cyrillic URLs nicely, but internally encode them using percent-encoding. When copying a URL from the address bar, you may get either the Cyrillic or the encoded version.
- Incorrect encoding. URL encoding assumes UTF-8. If the source data is in a different encoding (e.g., Windows-1251), the result will be incorrect.
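Three of the issues above can be reproduced in a few lines of Python. This is a sketch using only urllib.parse; the sample strings are illustrative.

```python
from urllib.parse import quote, unquote, unquote_plus

# Double encoding: the "%" of %20 is itself encoded as %25.
once = quote("blue shirt")
twice = quote(once)
print(once, twice)  # blue%20shirt blue%2520shirt

# "+" means a space only under form decoding rules.
plus_as_space = unquote_plus("blue+shirt")
plus_literal = unquote("blue+shirt")
print(plus_as_space, "|", plus_literal)  # blue shirt | blue+shirt

# Wrong source encoding: Windows-1251 bytes are not valid UTF-8,
# so a UTF-8 decoder yields the replacement character U+FFFD.
legacy = quote("а", encoding="cp1251")
misread = unquote(legacy)
print(legacy, repr(misread))  # %E0 '\ufffd'
```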
URL Encoding and UTM Tags
When creating UTM tags with our UTM tag generator, parameter values are encoded automatically. But if you build a URL manually and the value contains special characters (for example, a campaign name like "50% off + free gift"), be sure to encode it. Otherwise, % will be read as the start of an escape sequence and + may be decoded as a space, so the campaign name arrives corrupted.
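A minimal sketch of encoding that campaign name by hand in Python. The landing page URL and the utm_source value are made up for illustration.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

# Hypothetical landing page and tag values.
params = {
    "utm_source": "newsletter",
    "utm_campaign": "50% off + free gift",
}
url = "https://shop.example.com/sale?" + urlencode(params, quote_via=quote)
print(url)
# https://shop.example.com/sale?utm_source=newsletter&utm_campaign=50%25%20off%20%2B%20free%20gift

# Round trip: decoding recovers the original campaign name.
decoded = parse_qs(urlsplit(url).query)["utm_campaign"][0]
print(decoded)  # 50% off + free gift
```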
URL Encoding vs. Base64
Both techniques convert data to text, but for different purposes. URL encoding makes a string safe for use in a URL while keeping ASCII characters readable. Base64 encodes binary data into text for transmission over text-based protocols. To transmit binary data in a URL, the two methods are often combined: first Base64, then URL encoding (or the Base64URL variant is used).
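The two ways of putting binary data into a URL can be compared with a short Python sketch. The three sample bytes are chosen only because their standard Base64 form contains + and /.

```python
import base64
from urllib.parse import quote

raw = bytes([0xFB, 0xFF, 0xFE])  # arbitrary sample bytes

# Option 1: standard Base64, then URL encoding of "+" and "/".
standard = base64.b64encode(raw).decode("ascii")
escaped = quote(standard, safe="")

# Option 2: Base64URL, whose alphabet uses "-" and "_" instead.
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")

print(standard)  # +//+
print(escaped)   # %2B%2F%2F%2B
print(url_safe)  # -__-
```

Base64URL output is shorter here because none of its alphabet characters need percent-encoding. Its "=" padding, when present, may still need encoding or stripping.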
Frequently Asked Questions
Why encode a URL if the browser displays Cyrillic just fine?
The browser displays a decoded URL for user convenience, but when sending a request to the server, the URL is always encoded. Problems arise when you build URLs programmatically (in APIs, scripts, email campaigns) — there, encoding is mandatory.
How do I decode a URL?
Use our URL encoding tool, which supports both encoding and decoding. Most programming languages also have the corresponding functions: decodeURIComponent() in JavaScript, urldecode() in PHP, urllib.parse.unquote() in Python.
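For example, decoding in Python might look like the sketch below. unquote() decodes a single string, while parse_qs() splits a query string and decodes every value; the URL is the search example from earlier.

```python
from urllib.parse import parse_qs, unquote, urlsplit

url = "https://shop.example.com/search?q=blue%20shirt%20%26%20white"

# Decode one already-extracted value.
value = unquote("blue%20shirt%20%26%20white")
print(value)  # blue shirt & white

# Split the query string first, then decode each parameter.
params = parse_qs(urlsplit(url).query)
print(params)  # {'q': ['blue shirt & white']}
```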
Can URL encoding be used for security?
No. URL encoding is not a security measure. Attackers can use encoding to bypass filters (for example, encoding SQL injection characters). Always validate and sanitize input data regardless of encoding.
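A simplified illustration of such a bypass, assuming a hypothetical filter that inspects only the raw, still-encoded input. naive_filter() is invented for this example and is not a recommended defense.

```python
from urllib.parse import unquote


def naive_filter(raw: str) -> bool:
    """Hypothetical check: accept input that contains no single quote."""
    return "'" not in raw


payload = "id=1%27%20OR%20%271%27%3D%271"

passes = naive_filter(payload)  # the quote is hidden as %27
decoded = unquote(payload)      # what the application sees after decoding

print(passes)   # True
print(decoded)  # id=1' OR '1'='1
```

The filter passes the payload because it runs before decoding. Validation should be applied to the decoded value, and queries should be parameterized regardless.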
What is the maximum URL length?
The standard does not set a limit, but in practice: Internet Explorer supported up to 2,083 characters, while modern browsers accept much longer URLs (on the order of 65,000 characters or more). Servers typically limit URLs to about 8 KB. URL encoding increases length: one Cyrillic character (2 UTF-8 bytes) takes 6 characters, and a 3-byte character such as a CJK ideograph takes 9. This is important to consider when building long URLs.
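The growth is easy to measure with a short Python sketch. The sample word is arbitrary.

```python
from urllib.parse import quote

# Two UTF-8 bytes per Cyrillic letter, three %XX characters per byte.
print(len(quote("а")))   # 6
# Three UTF-8 bytes for this CJK ideograph.
print(len(quote("中")))  # 9

word = "привет"  # 6 Cyrillic letters
encoded = quote(word)
print(len(word), len(encoded))  # 6 36
```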