About Textformatter Normalize UTF8

Textformatter Normalize UTF8 uses a lightweight PHP class for UTF8 normalization.

Category: Text Formatters
Textformatter modules that provide run-time formatting for blocks of text (typically used with Text/Textarea fields).
Release State: Stable
Should be safe for use in production environments.
Author: justb3a
Module Version: 1.0.0
Class Name: TextformatterNormalizeUtf8
Compatibility: 2.6, 2.7
Date Added: July 24, 2015
Last Updated: September 14, 2015

Instructions

This module's files should be placed in /site/modules/TextformatterNormalizeUtf8/

README

ProcessWire Textformatter Normalize UTF8

Textformatter Normalize UTF8 uses a lightweight PHP class (Patchwork UTF-8).

Use it if ...

  1. The W3C HTML5 validator reports the following warning for your page:

    Text run is not in Unicode Normalization Form C.

  2. You notice strange output in some browsers (bold letters, shifted characters, ...).
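
To find out whether a piece of text would trigger that validator warning, it can be compared with its NFC-normalized copy. The following is a minimal sketch, not the module's code: it uses Python's standard `unicodedata` module instead of the Patchwork UTF-8 PHP class, and the helper name `is_nfc` is hypothetical.

```python
import unicodedata


def is_nfc(text: str) -> bool:
    """Return True if text is already in Unicode Normalization Form C."""
    # A string is in NFC exactly when normalizing it to NFC changes nothing.
    return unicodedata.normalize("NFC", text) == text


composed = "vil\u00e1g"     # á stored as one code point (U+00E1)
decomposed = "vila\u0301g"  # a (U+0061) followed by combining acute (U+0301)

print(is_nfc(composed))     # True
print(is_nfc(decomposed))   # False
```

Text for which the check returns False is the kind of input the textformatter is meant to clean up.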

What it does

In Unicode it is possible to produce the same text with different sequences of characters. For example, take the Hungarian word világ. The fourth letter could be stored in memory as a precomposed U+00E1 LATIN SMALL LETTER A WITH ACUTE (a single character) or as a decomposed sequence of U+0061 LATIN SMALL LETTER A followed by U+0301 COMBINING ACUTE ACCENT (two characters).

világ = világ

The Unicode Standard allows either of these alternatives, but requires that both be treated as identical. To improve efficiency, an application will usually normalize text before performing searches or comparisons. Normalization, in this case, means converting the text to use all precomposed or all decomposed characters.

There are four normalization forms specified by the Unicode Standard: NFC, NFD, NFKC and NFKD. The C stands for (pre-)composed, and the D for decomposed. The K stands for compatibility. To improve interoperability, the W3C recommends the use of NFC normalized text on the Web.

-- W3C
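
The four normalization forms described in the quote can be tried out with a short sketch. It is illustrative only: it relies on Python's standard `unicodedata` module rather than the Patchwork UTF-8 class the module uses, and the names `FORMS` and `normalize_all` are hypothetical.

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def normalize_all(text: str) -> dict:
    """Return the text normalized to each of the four Unicode forms."""
    return {form: unicodedata.normalize(form, text) for form in FORMS}


composed = "vil\u00e1g"     # 5 code points, á precomposed
decomposed = "vila\u0301g"  # 6 code points, a + combining acute

# The two spellings differ as code point sequences ...
assert composed != decomposed

# ... but normalization maps each of them onto the same result.
result = normalize_all(decomposed)
assert result["NFC"] == composed
assert result["NFD"] == decomposed

# The K (compatibility) forms additionally replace compatibility
# characters, for example the "fi" ligature U+FB01.
ligature = normalize_all("\ufb01")
assert ligature["NFC"] == "\ufb01"
assert ligature["NFKC"] == "fi"
```

Because NFC keeps characters such as the ligature intact while still composing accents, it changes the stored text less than the K forms do.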
