Aug 2nd, 2004, 03:47 PM
Perl's handling of foreign (Polish) characters?
I am trying to write a script to anglicise Polish text by replacing accented characters with their English counterparts, but I am encountering some really odd behaviour.
This code (coloured to look like Kate):
Code:
sub anglicise ($){
    $_ = $_[0];
    print "ang: $_\n" ;
    tr/?????Ó????????ó???/ACELNOSZZacelnoszz/ ;
    print "anged: $_\n" ;
    return;
} #end sub anglicise ()
print "returned: " .anglicise ("?????ó???");
print "\n" ; exit(0);
produces this:
ang: ?????ó???
anged: AzAzAzSzSCczSzSzSz
returned:
[edit]
<rant>
*sigh* this site doesn't seem to like foreign characters either... but then again this is not surprising since it doesn't even specify a character set:
<meta http-equiv="MSThemeCompatible" content="Yes">
That "Yes" part would make me laugh if it weren't so sad.
</rant>
The garbled part of the tr// is the accented versions of ACELNOSZZ, the upper case followed by the lower case. The argument to anglicise() is just the lower-case letters. The string starting with "ang: " shows all the lower-case letters correctly displayed. The string starting with "anged: " is how it actually appears, and strangely there is nothing after "returned: ".
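For comparison, here is a minimal sketch of the intended behaviour with the Polish letters written out in full. It rests on two assumptions that the post does not confirm: that the script file is saved as UTF-8, and that the terminal expects UTF-8 output. Under those assumptions, "use utf8" lets tr/// treat each accented letter as one character rather than two bytes, which would be consistent with the 18-character "anged: " output for a 9-letter input. Returning the string explicitly replaces the bare "return;", which hands nothing back to print after "returned: ".

```perl
use strict;
use warnings;
use utf8;                              # assumption: this file is saved as UTF-8
binmode(STDOUT, ':encoding(UTF-8)');   # assumption: the terminal expects UTF-8

# Replace Polish accented letters with their unaccented counterparts.
sub anglicise {
    my ($text) = @_;                   # work on a copy, not on the global $_
    $text =~ tr/ĄĆĘŁŃÓŚŹŻąćęłńóśźż/ACELNOSZZacelnoszz/;
    return $text;                      # a bare "return;" would return nothing
}

print "returned: ", anglicise("ąćęłńóśźż"), "\n";   # prints "returned: acelnoszz"
```

If the text comes from a file or from standard input rather than from a literal in the script, it would also need decoding on the way in, for example with an ':encoding(UTF-8)' layer on the input handle.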
"There are only two things that are infinite. The universe and human stupidity... and the universe I'm not sure about." - Einstein
If you are programming in Java use www.NetBeans.org