[RESOLVED] How to decode " %u5D16 " in PHP?
DavidNels
Dec 31st, 2009, 01:23 AM
Hi, how would I go about decoding text like this in PHP:
%u85E4%u5CA1%u85E4%u5DFB%u3068%u5927%u6A4B%u306E%u305E%u307F - %u5D16%u306E%u4E0A%u306E%u30DD%u30CB%u30E7
It appears to be URL encoded, but PHP's urldecode() function does not decode the text to its intended form:
藤岡藤巻と大橋のぞみ - 崖の上の ....
Thanks in advance!
David
Justa Lol
Dec 31st, 2009, 06:54 AM
Conforming HTML user agents may receive or output a document, or represent a document internally, using any character encoding. A character encoding represents some subset of the document character set. Character encodings such as ISO-8859-1 (commonly referred to as "Latin-1" since it encodes most Western European languages), ISO-8859-5 (which supports Cyrillic), SHIFT_JIS (a Japanese encoding), and euc-jp (another Japanese encoding) save bandwidth by representing only slices of the document character set.
http://www.w3.org/TR/WD-html40-970708/charset.html
It's all about the charset.
Or why not just copy it right in?
Or did I misunderstand this? By the way, this doesn't look like a URL: Fujioka Fujimaki & Nozomi Ohashi - on the cliff
sciguyryan
Dec 31st, 2009, 07:00 AM
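The issue is that those are %uXXXX escapes, the non-standard format that JavaScript's escape() produces for characters above U+00FF. urldecode() only understands %XX sequences, so you'll need to convert the %uXXXX ones yourself. I hit this a while back and wrote a script to get around it.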
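function urldecode_utf8($input)
{
    // Decode the ordinary %XX sequences; %uXXXX is left untouched by urldecode().
    $input = urldecode($input);
    // Turn each %uXXXX escape (always four hex digits) into a numeric HTML entity.
    $result = preg_replace('/%u([0-9a-f]{4})/i', '&#x\\1;', $input);
    // Convert the entities to UTF-8 characters. Explicit flags are used because
    // passing null here is deprecated in newer PHP versions.
    return html_entity_decode($result, ENT_QUOTES, 'UTF-8');
}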
That should do what you need.
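One side effect of going through html_entity_decode() is that any entities already present in the input (for example &amp;) get decoded too. Below is a rough sketch of an alternative that converts the escapes directly, assuming the mbstring extension is available. The name decode_percent_u is made up for illustration and is not part of the original script. It treats each %uXXXX as a UTF-16 code unit, so consecutive escapes forming a surrogate pair should stay together.

```php
<?php
// Hypothetical helper (not from the original post): decodes %uXXXX and %XX
// in a single pass so that decoded output is never decoded a second time.
function decode_percent_u(string $input): string
{
    return preg_replace_callback(
        '/((?:%u[0-9a-f]{4})+)|%([0-9a-f]{2})/i',
        function (array $m): string {
            if ($m[1] !== '') {
                // A run of %uXXXX escapes: drop the "%u" markers, which
                // leaves the hex digits of the UTF-16 code units.
                $hex = str_ireplace('%u', '', $m[1]);
                // Read those bytes as big-endian UTF-16 and convert to UTF-8.
                return mb_convert_encoding(pack('H*', $hex), 'UTF-8', 'UTF-16BE');
            }
            // An ordinary %XX escape: emit the single byte it stands for.
            return chr(hexdec($m[2]));
        },
        $input
    );
}

echo decode_percent_u('%u5D16%u306E%u4E0A%u306E%u30DD%u30CB%u30E7'), "\n";
```

Unlike urldecode(), this sketch leaves "+" alone, which matches what escape() emits (spaces come through as %20).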
Edit:
One note: make sure you set the Content-Type header to declare UTF-8 content, or it'll render incorrectly.
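As a minimal sketch of that note, assuming an HTML page and that nothing has been output yet, the header can be sent like this before echoing the decoded text:

```php
<?php
// Declare the response encoding before any output is sent.
header('Content-Type: text/html; charset=utf-8');

// Numeric entity for U+5D16, decoded to its UTF-8 bytes.
$text = html_entity_decode('&#x5D16;', ENT_QUOTES, 'UTF-8');
echo $text, "\n";
```

The text/html type is an assumption; use whatever content type the page actually serves, keeping the charset=utf-8 part.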
DavidNels
Dec 31st, 2009, 12:23 PM
Great! Thanks, worked a charm.
vbforums.com