I'm trying the code below:
<?php
echo "ORD ~ = ".ord("~");
Based on the extended ASCII table at http://www.ascii-code.com/, the output is
ORD ~ = 126
That is correct. But when I output a character from the extended part of the table, like Ø:
<?php
echo "ORD Ø = ".ord("Ø");
Gives:
ORD Ø = 195
While in the linked extended ASCII table the correct code for 'Ø' is 216. The same goes e.g. for √ (ord("√") outputs 226
while the proper extended ASCII char for 226 is â and √ is not even in the table).
So my question is: since PHP strings are basically arrays of characters ($str[0] for the first character, $str[1] for the second, C-like, etc.), and since PHP doesn't have a char type, how does PHP handle a 1-byte char when it treats it separately, e.g. when using the ord() function shown above or the pack() and unpack() functions?
Are PHP chars unsigned or signed? What's the difference?
How should I interpret this phrase, taken from the PHP manual?
"A string is series of characters, where a character is the same as a byte. This means that PHP only supports a 256-character set"
Does "256-character" mean that it supports extended ASCII? But why then those differences when calling ord() on extended ASCII chars?
Thanks for the attention!
The PHP core, as it stands right now, has no notion of character encoding. Strings are just, as the manual states, a series of bytes (unsigned 8-bit). How the output medium interprets those bytes is beyond PHP.
In your example the Ø might have been UTF-8 encoded, i.e. stored as the two bytes 195 and 152.
PHP, not being aware of the encoding, treats those two bytes as two separate single-byte "characters".
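To make the byte view concrete, here is a minimal sketch, assuming PHP 7 or later and assuming the characters were saved as UTF-8. The helpers byte_values() and first_code_point() are hypothetical names made up for this illustration, not PHP built-ins. The strings are written as explicit byte escapes so the result does not depend on how the source file is encoded.

```php
<?php
// Hypothetical helper: the unsigned value of every byte in $s.
function byte_values(string $s): array
{
    return array_values(unpack('C*', $s));
}

// Hypothetical helper: decode the first UTF-8 sequence in $s to its code point.
// Assumes $s is non-empty, well-formed UTF-8; no validation is done.
function first_code_point(string $s): int
{
    $b = byte_values($s);
    if ($b[0] < 0x80) {            // 1-byte sequence (plain ASCII)
        return $b[0];
    }
    if ($b[0] < 0xE0) {            // 2-byte sequence
        return (($b[0] & 0x1F) << 6) | ($b[1] & 0x3F);
    }
    if ($b[0] < 0xF0) {            // 3-byte sequence
        return (($b[0] & 0x0F) << 12) | (($b[1] & 0x3F) << 6) | ($b[2] & 0x3F);
    }
    // 4-byte sequence
    return (($b[0] & 0x07) << 18) | (($b[1] & 0x3F) << 12)
         | (($b[2] & 0x3F) << 6) | ($b[3] & 0x3F);
}

$o    = "\xC3\x98";     // "Ø" as UTF-8 bytes
$sqrt = "\xE2\x88\x9A"; // "√" as UTF-8 bytes

echo implode(' ', byte_values($o)), "\n";    // 195 152
echo first_code_point($o), "\n";             // 216
echo implode(' ', byte_values($sqrt)), "\n"; // 226 136 154
echo first_code_point($sqrt), "\n";          // 8730
```

Under that assumption, the 216 from the table is the code point of Ø, while ord() reports only the first of the bytes that encode it. The same holds for √: its first UTF-8 byte is 226, and its code point (8730) lies far outside any 256-entry table.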
ord() only takes the first "character" in a string into account, and so you get 195.
So the answer is: unsigned, no charset at all, just bytes with a length indicator.
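As a small illustration of the signed/unsigned difference asked about, the sketch below reads the same two bytes with unpack() once as unsigned (format code C) and once as signed (format code c). The variable names are only for this example, and the bytes are again assumed to be the UTF-8 encoding of Ø.

```php
<?php
$s = "\xC3\x98"; // the two bytes assumed above for "Ø" in UTF-8

// Same bytes, two interpretations of each 8-bit pattern.
$unsigned = unpack('C*', $s); // values in 0..255
$signed   = unpack('c*', $s); // values in -128..127

print_r($unsigned); // [1] => 195, [2] => 152
print_r($signed);   // [1] => -61, [2] => -104

// ord() always reports the unsigned interpretation of the first byte.
var_dump(ord($s) === $unsigned[1]);          // bool(true)

// Packing -61 as signed or 195 as unsigned produces the identical byte.
var_dump(pack('c', -61) === pack('C', 195)); // bool(true)
```

The stored bits are the same either way; signedness only matters when something like unpack() converts them to an integer, and ord() always uses the unsigned reading.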