How to Encode and Decode HTML Entities in Any Language

16 Jun 2026

How to Encode and Decode HTML Entities

Encoding and decoding HTML entities is a routine task for any web developer. You encode when you need to display special characters safely in HTML, and you decode when you need to turn HTML content back into plain text. Most languages cover both operations with built-in functions or a well-established library, but the exact function names and behaviours vary.

When to Encode vs Decode

Encoding converts special characters to their entity references. You encode data before inserting it into HTML to prevent XSS attacks and ensure correct rendering. Decoding converts entity references back to characters. You decode when extracting text from HTML content, processing RSS feeds, or reading HTML email bodies.

The critical rule: encode when outputting to HTML, never when storing in the database. Store raw text, encode at render time.
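The rule above can be sketched in a few lines of Python. This is a minimal illustration, not a real persistence layer: the `comments` list stands in for a database, and `save_comment` and `render_comments` are hypothetical names introduced here.

```python
import html

# Hypothetical in-memory "database": it holds the raw text exactly as submitted.
comments = []

def save_comment(raw_text):
    """Store the user's text unmodified; no entity encoding at this stage."""
    comments.append(raw_text)

def render_comments():
    """Encode each comment only at the moment it is written into HTML."""
    items = [f"<li>{html.escape(c)}</li>" for c in comments]
    return "<ul>" + "".join(items) + "</ul>"

save_comment('AT&T <script>alert(1)</script>')
print(comments[0])        # raw text, unchanged
print(render_comments())  # entities appear only in the HTML output
```

Because the stored value stays raw, the same record can later be sent to a JSON API or a plain-text email without having to undo any HTML encoding first.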

PHP

PHP provides four built-in functions for HTML entity handling: htmlspecialchars and htmlentities for encoding, and htmlspecialchars_decode and html_entity_decode for decoding.

// Encode special characters (HTML5)
$text = 'AT&T "special" <offer>';
echo htmlspecialchars($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
// Output: AT&amp;T &quot;special&quot; &lt;offer&gt;

// Encode every character that has a named entity (including accented characters)
echo htmlentities($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// Decode entities back to characters
$encoded = 'AT&amp;T &quot;special&quot;';
echo htmlspecialchars_decode($encoded, ENT_QUOTES);
// Output: AT&T "special"

echo html_entity_decode('&amp; &lt; &gt; &quot;', ENT_QUOTES, 'UTF-8');
// Output: & < > "

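Use htmlspecialchars for most cases: it handles the five characters that are significant in HTML markup (&, <, >, ", and ' when ENT_QUOTES is set). Use htmlentities only when you need every character with a named entity converted, such as accented letters. On pages served as UTF-8 this is rarely necessary.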
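To make the difference between the two strategies concrete, here is a rough Python analogue. It is a sketch only: `escape_special` and `escape_all_named` are hypothetical helper names, and the second one uses Python's HTML 4 entity table (`html.entities.codepoint2name`), so it will not match PHP's output for every character.

```python
import html
from html.entities import codepoint2name

def escape_special(text):
    """Escape only the markup-significant characters: & < > " '."""
    return html.escape(text, quote=True)

def escape_all_named(text):
    """Additionally replace every character that has an HTML 4 named entity."""
    out = []
    for ch in text:
        name = codepoint2name.get(ord(ch))
        if ch == "'":
            # The HTML 4 table has no name for the apostrophe; use a numeric reference.
            out.append("&#x27;")
        elif name is not None:
            out.append(f"&{name};")
        else:
            out.append(ch)
    return "".join(out)

sample = "Caf\u00e9 & <b>"
print(escape_special(sample))    # the accented letter is left as-is
print(escape_all_named(sample))  # the accented letter becomes &eacute;
```

Both results render identically in a browser as long as the page encoding is declared correctly; the second form only matters when the output must stay within ASCII.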

JavaScript (Browser)

In the browser, the simplest approach is to use the DOM API.

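// Encode
function encodeHtml(str) {
  const div = document.createElement('div');
  div.appendChild(document.createTextNode(str));
  return div.innerHTML;
}

console.log(encodeHtml('AT&T "special" <offer>'));
// Output: AT&amp;T "special" &lt;offer&gt;
// Note: innerHTML serialization escapes &, < and > but leaves quotes alone,
// so this result is safe for element content, not for attribute values.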

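// Decode
function decodeHtml(str) {
  // Parse into a separate inert document rather than assigning to innerHTML
  // on an element of the live page, where markup such as <img onerror=...>
  // could run.
  const doc = new DOMParser().parseFromString(str, 'text/html');
  return doc.documentElement.textContent;
}

console.log(decodeHtml('AT&amp;T &quot;special&quot;'));
// Output: AT&T "special"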

JavaScript (Node.js)

Node.js does not have a DOM, so you need a package such as he or entities.

const he = require('he');

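// Encode (named references must be requested; the default is hex, e.g. &#x26;)
console.log(he.encode('AT&T "special" <offer>', { useNamedReferences: true }));
// Output: AT&amp;T &quot;special&quot; &lt;offer&gt;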

// Decode
console.log(he.decode('AT&amp;T &quot;special&quot;'));
// Output: AT&T "special"

Python

Python's html module handles both encoding and decoding.

import html

# Encode
text = 'AT&T "special" <offer>'
safe = html.escape(text, quote=True)
print(safe)
# Output: AT&amp;T &quot;special&quot; &lt;offer&gt;

# Decode
encoded = 'AT&amp;T &quot;special&quot;'
decoded = html.unescape(encoded)
print(decoded)
# Output: AT&T "special"
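Decoding is not limited to the handful of named entities shown above. As a small sketch, `html.unescape` also resolves decimal and hexadecimal numeric references, and it performs a single pass, so double-encoded input comes back one level less encoded rather than fully decoded. The `samples` list is just illustrative data.

```python
import html

# Named, decimal and hexadecimal references for the same character,
# plus a double-encoded ampersand.
samples = ["&copy;", "&#169;", "&#xA9;", "&lt;b&gt;", "&amp;amp;"]

for s in samples:
    # repr() makes it easy to see exactly which characters came back.
    print(f"{s!r} -> {html.unescape(s)!r}")
```

The first three samples all decode to the copyright sign, and the last one decodes to the five-character string `&amp;`.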

Ruby

Ruby's CGI module provides both encoding and decoding functions, and ERB::Util offers an additional encoding helper.

require 'cgi'
require 'erb'

# Encode with CGI
encoded = CGI.escapeHTML('AT&T "special" <offer>')
puts encoded
# Output: AT&amp;T &quot;special&quot; &lt;offer&gt;

# Encode with ERB::Util
encoded = ERB::Util.html_escape('AT&T "special" <offer>')
puts encoded

# Decode
decoded = CGI.unescapeHTML('AT&amp;T &quot;special&quot;')
puts decoded
# Output: AT&T "special"

Java

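Java's standard library has no dedicated HTML entity functions. The usual choice is StringEscapeUtils from the Apache Commons Text library, which must be added as a dependency.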

import org.apache.commons.text.StringEscapeUtils;

public class HtmlEntities {
    public static void main(String[] args) {
        String text = "AT&T \"special\" <offer>";

        // Encode
        String encoded = StringEscapeUtils.escapeHtml4(text);
        System.out.println(encoded);
        // Output: AT&amp;T &quot;special&quot; &lt;offer&gt;

        // Decode
        String decoded = StringEscapeUtils.unescapeHtml4(encoded);
        System.out.println(decoded);
        // Output: AT&T "special" <offer>
    }
}

Online Tool

For quick one-off conversions, use the HTML Entity Encoder & Decoder tool. Paste your text, choose encode or decode, and copy the result. It handles all named and numeric entities.

Common Pitfalls

Double encoding occurs when you encode already-encoded text. If &amp; is encoded again, it becomes &amp;amp;. Always check whether your framework automatically encodes output before adding manual encoding.
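One common way to avoid double encoding is to mark strings that have already been encoded, which is roughly what auto-escaping template engines do. The following is a minimal Python sketch of that idea; the `Escaped` class and `escape_once` function are hypothetical names for illustration, not part of any library.

```python
import html

class Escaped(str):
    """Hypothetical marker type: a string that has already been HTML-encoded."""

def escape_once(value):
    """Encode plain text, but pass through values already marked as Escaped."""
    if isinstance(value, Escaped):
        return value
    return Escaped(html.escape(value))

raw = "AT&T"
once = escape_once(raw)                 # encoded a single time
twice = escape_once(once)               # recognised as encoded, left untouched
naive = html.escape(html.escape(raw))   # the double-encoding mistake

print(once)   # AT&amp;T
print(twice)  # AT&amp;T
print(naive)  # AT&amp;amp;T
```

Tracking the encoded state in the type is more reliable than inspecting the text, because a raw string can legitimately contain the characters `&amp;`.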

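Missing the quote flag in PHP. The ENT_QUOTES flag encodes both single and double quotes. Before PHP 8.1 the default was ENT_COMPAT, which leaves single quotes unencoded and lets input break out of single-quoted attribute values. PHP 8.1 and later default to ENT_QUOTES, but passing the flag explicitly keeps behaviour consistent across versions.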
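The effect of leaving quotes unencoded is easy to reproduce in Python, whose `html.escape` has a comparable `quote` parameter. This is an analogy rather than an exact match for PHP's flags (with `quote=False` Python skips both quote types), and `render_input` is a hypothetical helper.

```python
import html

def render_input(value, quote):
    """Build a single-quoted attribute; `quote` is passed straight to html.escape."""
    return f"<input value='{html.escape(value, quote=quote)}'>"

payload = "x' onfocus='alert(1)"

# Quotes left alone: the payload closes the attribute and adds its own.
print(render_input(payload, quote=False))
# <input value='x' onfocus='alert(1)'>

# Quotes encoded: the payload stays inside the value attribute.
print(render_input(payload, quote=True))
# <input value='x&#x27; onfocus=&#x27;alert(1)'>
```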

Encoding for the wrong context. HTML entity encoding is correct for HTML body text and quoted attribute values but wrong for URLs, JavaScript, and CSS. Use URL encoding, JavaScript string escaping, or CSS escaping respectively.
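As a rough illustration of how the same value needs a different encoding per context, the sketch below uses only the Python standard library. Treating `json.dumps` as the JavaScript string escaper is an assumption that holds for script files and JSON responses; it does not neutralise a `</script>` sequence inside an inline script block. CSS escaping is omitted because the standard library has no helper for it.

```python
import html
import json
from urllib.parse import quote

value = 'Tom & "Jerry" <3'

html_body = html.escape(value)    # for element content and quoted attributes
url_param = quote(value, safe="") # for a query-string parameter value
js_literal = json.dumps(value)    # for a JavaScript string literal

print(html_body)   # Tom &amp; &quot;Jerry&quot; &lt;3
print(url_param)   # Tom%20%26%20%22Jerry%22%20%3C3
print(js_literal)  # "Tom & \"Jerry\" <3"
```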

Conclusion

HTML entity encoding and decoding are well supported across all major programming languages. Use your language's built-in functions or a well-maintained library for production code and an online tool for quick tasks. Always encode at render time, never at storage time.

