Instantly convert HTML entities like &amp; and &lt; back to readable text. Decode scraped content, API responses, and encoded strings for plain-text use.
You copy text from an API response, paste it into your document, and suddenly you're staring at a mess of ampersands and semicolons: &amp; where you expected "&", &lt; where you needed "<", and &quot; littering what should have been simple quotation marks. The content is technically there, but it's wrapped in HTML entity codes that make it unreadable for anyone who isn't a parser.
This is where HTML decoding comes in. It's the cleanup crew for encoded text, converting those entity references back into the characters they represent. One pass through a decoder, and your content goes from machine-readable markup to human-readable text.
HTML decoding reverses the encoding process that makes special characters safe for web markup. When text gets encoded, characters that have meaning in HTML (such as <, >, &, and ") get converted into entity references that browsers can display without breaking the page structure. Decoding flips this transformation, turning &amp; back into &, &lt; back into <, and &quot; back into ".
The process handles three main entity formats. Named entities like &copy; and &reg; use memorable abbreviations for common symbols. Decimal numeric entities like &#38; reference characters by their Unicode code point in base-10. Hexadecimal numeric entities like &#x26; do the same thing in base-16. All three formats can represent the same underlying character, just encoded differently.
HTML uses angle brackets, ampersands, and quotes as structural elements. Write <div> in your content, and the browser tries to create a div element instead of displaying the text. Encoding these characters prevents this conflict, letting you show <div> as text by writing &lt;div&gt; in your HTML source.
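To see the three formats side by side, here is a minimal Python sketch. It assumes the standard library's html.unescape as the decoder; the sample strings are illustrative and not taken from any particular tool.

```python
import html

# Three spellings of the same character: named, decimal, hexadecimal.
ampersand_forms = ["&amp;", "&#38;", "&#x26;"]
copyright_forms = ["&copy;", "&#169;", "&#xA9;"]

decoded_ampersands = [html.unescape(form) for form in ampersand_forms]
decoded_copyrights = [html.unescape(form) for form in copyright_forms]

print(decoded_ampersands)  # ['&', '&', '&']
print(decoded_copyrights)  # ['©', '©', '©']
```

Whichever format the source used, the decoded output is identical, which is why a decoder can treat all three the same way.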
This safety mechanism works great for web display. The problem shows up when you extract that content for other uses—pasting it into a document, feeding it to an analytics tool, or loading it into a database that expects plain text, not markup.
The need for decoding shows up in predictable patterns, usually when content crosses boundaries between systems or formats. Recognizing these scenarios saves you from manually cleaning text character by character.
Web scraping tools pull HTML as-is, entities and all. Export functions from content management systems often preserve the encoding used for web display. Before you can use this content in spreadsheets, documents, or text analysis tools, you need to decode it. Otherwise you're analyzing the encoded version, which skews word counts, search operations, and readability metrics.
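The skew is easy to demonstrate. The sketch below uses a made-up scraped string and Python's html.unescape (an assumed stand-in for whatever decoder you use) to compare lengths and a simple search before and after decoding.

```python
import html

# Hypothetical snippet as a scraper might return it.
scraped = "Tom &amp; Jerry&#39;s &quot;best&quot; episodes"
decoded = html.unescape(scraped)

print(decoded)  # Tom & Jerry's "best" episodes

# The encoded version is longer, so character counts are inflated.
print(len(scraped), len(decoded))

# A plain-text search misses the phrase until the text is decoded.
print("Jerry's" in scraped, "Jerry's" in decoded)  # False True
```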
Some APIs return HTML-encoded strings in their JSON or XML payloads, particularly older services or those repurposing content originally formatted for browsers. The API sends you &quot;Hello&quot; when you expected "Hello". Decoding becomes a preprocessing step before you can actually use the data in your application or display it to users.
Content migration projects frequently produce encoding artifacts. Moving blog posts from one CMS to another sometimes results in double-encoded text where & becomes &amp;, which then becomes &amp;amp;. Each migration layer adds another round of encoding unless you decode before re-import. The same issue appears when converting between different rich text formats or moving content between databases with different character handling.
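One way to undo stacked encoding is to decode repeatedly until the text stops changing. The sketch below is one possible approach, not a standard routine: decode_fully and its max_rounds cap are hypothetical names introduced for illustration, built on Python's html.unescape.

```python
import html


def decode_fully(text: str, max_rounds: int = 5) -> str:
    """Unescape repeatedly until the text is stable (hypothetical helper)."""
    for _ in range(max_rounds):
        decoded = html.unescape(text)
        if decoded == text:
            break  # nothing left to decode
        text = decoded
    return text


double_encoded = "Fish &amp;amp; Chips"

print(html.unescape(double_encoded))  # Fish &amp; Chips
print(decode_fully(double_encoded))   # Fish & Chips
```

Use this with care: if the content legitimately talks about entities (an HTML tutorial, say), repeated decoding will also convert the entity text the author meant to show literally.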
Email systems sometimes encode special characters in message bodies for compatibility across different mail clients. What you see in the source doesn't match what the sender intended. Decoding reveals the actual text. The same applies when debugging web applications—inspecting HTML source or database contents often shows encoded entities where you need to see the actual characters to understand what's happening.
The basic process stays consistent across most HTML decoders. Paste your HTML-encoded text into the input field. The tool processes the entities and returns plain text output, usually in real-time as you type or paste. Copy the decoded result and use it wherever you need clean, readable text.
The speed matters when you're processing multiple chunks of content. A good decoder handles the conversion instantly, letting you move through extracted API responses or exported content without waiting. Some tools show both input and output side by side, making it easy to verify the conversion handled every entity correctly.
Named entities use readable abbreviations that describe the character. &copy; represents the copyright symbol ©, &reg; represents the registered trademark symbol ®, and &nbsp; represents a non-breaking space. These read intuitively if you know HTML, but they're still encoded text that needs conversion for plain-text contexts.
Numeric entities reference characters by their position in the Unicode standard. &#38; is the decimal representation of the ampersand, character number 38 in Unicode. &#x26; is the same character in hexadecimal. Both decode to &. Numeric entities appear frequently in content that includes special symbols, mathematical notation, or characters from non-Latin writing systems.
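The link between a character and its numeric entities is just its code point written in two bases. As a small illustration, the Python sketch below builds both forms from a character and decodes them again; to_numeric_entities is a hypothetical helper, and html.unescape is assumed as the decoder.

```python
import html


def to_numeric_entities(char: str) -> tuple[str, str]:
    """Return the decimal and hexadecimal entity for one character (hypothetical helper)."""
    code_point = ord(char)
    return f"&#{code_point};", f"&#x{code_point:X};"


for char in ["&", "é", "π"]:
    decimal, hexadecimal = to_numeric_entities(char)
    # Both forms decode back to the original character.
    print(char, decimal, hexadecimal, html.unescape(decimal), html.unescape(hexadecimal))
```

For the ampersand this yields the decimal form with 38 and the hexadecimal form with 26, matching the values described above.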
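Running plain text through an HTML decoder rarely damages it. If the input doesn't contain valid HTML entity patterns, the tool returns it unchanged. A literal ampersand in your text stays as & because it's not followed by a recognized entity name or numeric pattern that ends with a semicolon. One caveat: decoders that follow the HTML5 parsing rules also accept a handful of legacy named entities without the trailing semicolon, so a string such as &copy can still be converted.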
This safety feature matters when you're processing mixed content where only some portions are encoded. You can decode an entire document without worrying about corrupting the parts that were already in plain text. The tool only transforms recognized entity patterns, leaving everything else alone.
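A quick check of this behavior, sketched in Python with html.unescape as an assumed decoder and made-up sample strings:

```python
import html

# Plain text with literal ampersands and angle brackets, plus one encoded string.
samples = [
    "AT&T",
    "if a < b && b > c",
    "5 & 6",
    "R&D budget &lt; $5M",
]

results = [html.unescape(sample) for sample in samples]

for before, after in zip(samples, results):
    status = "unchanged" if before == after else "decoded"
    print(f"{before!r} -> {after!r} ({status})")
```

Only the last sample changes, and only its &lt; entity; the literal ampersand in "R&D" in that same string is left alone.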
HTML decoding often connects to other text processing operations. When you're building data pipelines or content processing workflows, understanding how decoding fits with related tools helps you clean and transform text more effectively.
If you're working with URL parameters and query strings, those use percent-encoding rather than HTML entities—a different encoding scheme that requires a different decoder. Base64-encoded content represents another encoding layer you might encounter in API responses or email attachments, particularly for binary data or embedded images.
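To make the distinction concrete, the sketch below decodes the same phrase from three encodings using Python's standard library: urllib.parse.unquote for percent-encoding, html.unescape for HTML entities, and base64.b64decode for Base64. The sample values are illustrative.

```python
import base64
import html
from urllib.parse import unquote

percent_encoded = "fish%20%26%20chips"
html_encoded = "fish &amp; chips"
base64_encoded = "ZmlzaCAmIGNoaXBz"

from_percent = unquote(percent_encoded)
from_html = html.unescape(html_encoded)
from_base64 = base64.b64decode(base64_encoded).decode("utf-8")

print(from_percent, from_html, from_base64, sep=" | ")

# Using the wrong decoder leaves the text as it was.
print(html.unescape(percent_encoded))  # fish%20%26%20chips
```

Each scheme needs its own decoder, so identifying the encoding first saves a round of confused debugging.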
When you need to go the opposite direction, converting plain text into HTML-safe markup, an HTML encoder handles that transformation. This becomes necessary when you're programmatically generating HTML content or preparing user-submitted text for display on web pages.
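For the encoding direction, Python's standard library offers html.escape. The sketch below encodes a made-up piece of user text and confirms that decoding restores it.

```python
import html

# Hypothetical user-submitted text containing markup characters.
user_text = '<div class="note">Tom & Jerry</div>'

encoded = html.escape(user_text)
print(encoded)
# &lt;div class=&quot;note&quot;&gt;Tom &amp; Jerry&lt;/div&gt;

# Decoding is the exact inverse for this input.
round_trip = html.unescape(encoded)
print(round_trip == user_text)  # True
```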
Data formats like JSON sometimes contain HTML-encoded strings within their structure. A JSON formatter can help you visualize the structure, but you'll still need to decode any HTML entities embedded in the string values before using that text in plain-text contexts.
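If the encoded strings are scattered through a nested payload, a small recursive walk can decode them all after parsing. This is a sketch under assumptions: unescape_strings is a hypothetical helper, the payload is invented, and it deliberately leaves object keys untouched.

```python
import html
import json


def unescape_strings(value):
    """Decode HTML entities in every string value of parsed JSON (hypothetical helper)."""
    if isinstance(value, str):
        return html.unescape(value)
    if isinstance(value, list):
        return [unescape_strings(item) for item in value]
    if isinstance(value, dict):
        # Keys are kept as-is; only values are decoded.
        return {key: unescape_strings(item) for key, item in value.items()}
    return value  # numbers, booleans, and null pass through


payload = '{"title": "Q&amp;A", "tags": ["&lt;b&gt;", "plain"], "views": 12}'
cleaned = unescape_strings(json.loads(payload))

print(cleaned)
# {'title': 'Q&A', 'tags': ['<b>', 'plain'], 'views': 12}
```

Parsing the JSON first and decoding afterwards matters: decoding the raw payload text could turn &quot; into a bare quotation mark inside a string and break the JSON syntax.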
Have you encountered HTML entities in unexpected places? The next time you're copy-pasting content and spot those telltale ampersands and semicolons, you'll know exactly what you're looking at—and how to clean it up.