Here are miscellaneous tips I have collected while working with PHP tokens.
Tokens are computed by the PHP interpreter: it breaks a program down into terminal pieces of code, called tokens, which can then be manipulated by a program. See the Tokenizer Functions page of the PHP manual.
PHP token lists
Here is a list of PHP tokens and their codes.
The List of Parser Tokens page of the PHP manual provides a list of tokens, but doesn't indicate the associations between token codes and the corresponding constant names.
Having these associations is quite useful when working with tokens, so I put them here, sorted by name and by code. Note that the numeric codes depend on the PHP version, so programs should rely on the constant names rather than on the numbers.
These lists are generated by this code:
function printTokens(){
    // 258..375 is the range of token codes in the PHP version used here;
    // it differs in other PHP versions.
    $res = array();
    for($i=258; $i < 376; $i++){
        $res[$i] = token_name($i);
    }
    echo "<style>#tokenCodes td{ white-space:pre; }</style>\n";
    echo "<div id='tokenCodes'><table><tr>\n";
    // First column: sorted by token name
    asort($res);
    echo "<td style='padding-right:1em;'>"; print_r($res); echo "</td>\n";
    // Second column: sorted by token code
    ksort($res);
    echo "<td style='border-left:1px solid black; padding-left:1em;'>"; print_r($res); echo "</td>\n";
    echo "</tr></table></div>\n";
}
Sorted by name:
Array
(
[345] => T_ABSTRACT
[271] => T_AND_EQUAL
[359] => T_ARRAY
[292] => T_ARRAY_CAST
[326] => T_AS
[279] => T_BOOLEAN_AND
[278] => T_BOOLEAN_OR
[290] => T_BOOL_CAST
[331] => T_BREAK
[329] => T_CASE
[337] => T_CATCH
[352] => T_CLASS
[360] => T_CLASS_C
[298] => T_CLONE
[369] => T_CLOSE_TAG
[365] => T_COMMENT
[273] => T_CONCAT_EQUAL
[334] => T_CONST
[315] => T_CONSTANT_ENCAPSED_STRING
[332] => T_CONTINUE
[374] => T_CURLY_OPEN
[296] => T_DEC
[324] => T_DECLARE
[330] => T_DEFAULT
[274] => T_DIV_EQUAL
[306] => T_DNUMBER
[317] => T_DO
[366] => T_DOC_COMMENT
[373] => T_DOLLAR_OPEN_CURLY_BRACES
[357] => T_DOUBLE_ARROW
[294] => T_DOUBLE_CAST
[375] => T_DOUBLE_COLON
[316] => T_ECHO
[303] => T_ELSE
[302] => T_ELSEIF
[350] => T_EMPTY
[314] => T_ENCAPSED_AND_WHITESPACE
[325] => T_ENDDECLARE
[321] => T_ENDFOR
[323] => T_ENDFOREACH
[304] => T_ENDIF
[328] => T_ENDSWITCH
[319] => T_ENDWHILE
[372] => T_END_HEREDOC
[260] => T_EVAL
[300] => T_EXIT
[354] => T_EXTENDS
[364] => T_FILE
[344] => T_FINAL
[320] => T_FOR
[322] => T_FOREACH
[333] => T_FUNCTION
[362] => T_FUNC_C
[340] => T_GLOBAL
[351] => T_HALT_COMPILER
[301] => T_IF
[355] => T_IMPLEMENTS
[297] => T_INC
[262] => T_INCLUDE
[261] => T_INCLUDE_ONCE
[311] => T_INLINE_HTML
[288] => T_INSTANCEOF
[353] => T_INTERFACE
[295] => T_INT_CAST
[349] => T_ISSET
[283] => T_IS_EQUAL
[284] => T_IS_GREATER_OR_EQUAL
[281] => T_IS_IDENTICAL
[282] => T_IS_NOT_EQUAL
[280] => T_IS_NOT_IDENTICAL
[285] => T_IS_SMALLER_OR_EQUAL
[363] => T_LINE
[358] => T_LIST
[305] => T_LNUMBER
[265] => T_LOGICAL_AND
[263] => T_LOGICAL_OR
[264] => T_LOGICAL_XOR
[361] => T_METHOD_C
[276] => T_MINUS_EQUAL
[272] => T_MOD_EQUAL
[275] => T_MUL_EQUAL
[299] => T_NEW
[310] => T_NUM_STRING
[291] => T_OBJECT_CAST
[356] => T_OBJECT_OPERATOR
[367] => T_OPEN_TAG
[368] => T_OPEN_TAG_WITH_ECHO
[270] => T_OR_EQUAL
[277] => T_PLUS_EQUAL
[266] => T_PRINT
[343] => T_PRIVATE
[342] => T_PROTECTED
[341] => T_PUBLIC
[259] => T_REQUIRE
[258] => T_REQUIRE_ONCE
[335] => T_RETURN
[287] => T_SL
[268] => T_SL_EQUAL
[286] => T_SR
[267] => T_SR_EQUAL
[371] => T_START_HEREDOC
[346] => T_STATIC
[307] => T_STRING
[293] => T_STRING_CAST
[308] => T_STRING_VARNAME
[327] => T_SWITCH
[338] => T_THROW
[336] => T_TRY
[348] => T_UNSET
[289] => T_UNSET_CAST
[339] => T_USE
[347] => T_VAR
[309] => T_VARIABLE
[318] => T_WHILE
[370] => T_WHITESPACE
[269] => T_XOR_EQUAL
[312] => UNKNOWN
[313] => UNKNOWN
)
Sorted by code:
Array
(
[258] => T_REQUIRE_ONCE
[259] => T_REQUIRE
[260] => T_EVAL
[261] => T_INCLUDE_ONCE
[262] => T_INCLUDE
[263] => T_LOGICAL_OR
[264] => T_LOGICAL_XOR
[265] => T_LOGICAL_AND
[266] => T_PRINT
[267] => T_SR_EQUAL
[268] => T_SL_EQUAL
[269] => T_XOR_EQUAL
[270] => T_OR_EQUAL
[271] => T_AND_EQUAL
[272] => T_MOD_EQUAL
[273] => T_CONCAT_EQUAL
[274] => T_DIV_EQUAL
[275] => T_MUL_EQUAL
[276] => T_MINUS_EQUAL
[277] => T_PLUS_EQUAL
[278] => T_BOOLEAN_OR
[279] => T_BOOLEAN_AND
[280] => T_IS_NOT_IDENTICAL
[281] => T_IS_IDENTICAL
[282] => T_IS_NOT_EQUAL
[283] => T_IS_EQUAL
[284] => T_IS_GREATER_OR_EQUAL
[285] => T_IS_SMALLER_OR_EQUAL
[286] => T_SR
[287] => T_SL
[288] => T_INSTANCEOF
[289] => T_UNSET_CAST
[290] => T_BOOL_CAST
[291] => T_OBJECT_CAST
[292] => T_ARRAY_CAST
[293] => T_STRING_CAST
[294] => T_DOUBLE_CAST
[295] => T_INT_CAST
[296] => T_DEC
[297] => T_INC
[298] => T_CLONE
[299] => T_NEW
[300] => T_EXIT
[301] => T_IF
[302] => T_ELSEIF
[303] => T_ELSE
[304] => T_ENDIF
[305] => T_LNUMBER
[306] => T_DNUMBER
[307] => T_STRING
[308] => T_STRING_VARNAME
[309] => T_VARIABLE
[310] => T_NUM_STRING
[311] => T_INLINE_HTML
[312] => UNKNOWN
[313] => UNKNOWN
[314] => T_ENCAPSED_AND_WHITESPACE
[315] => T_CONSTANT_ENCAPSED_STRING
[316] => T_ECHO
[317] => T_DO
[318] => T_WHILE
[319] => T_ENDWHILE
[320] => T_FOR
[321] => T_ENDFOR
[322] => T_FOREACH
[323] => T_ENDFOREACH
[324] => T_DECLARE
[325] => T_ENDDECLARE
[326] => T_AS
[327] => T_SWITCH
[328] => T_ENDSWITCH
[329] => T_CASE
[330] => T_DEFAULT
[331] => T_BREAK
[332] => T_CONTINUE
[333] => T_FUNCTION
[334] => T_CONST
[335] => T_RETURN
[336] => T_TRY
[337] => T_CATCH
[338] => T_THROW
[339] => T_USE
[340] => T_GLOBAL
[341] => T_PUBLIC
[342] => T_PROTECTED
[343] => T_PRIVATE
[344] => T_FINAL
[345] => T_ABSTRACT
[346] => T_STATIC
[347] => T_VAR
[348] => T_UNSET
[349] => T_ISSET
[350] => T_EMPTY
[351] => T_HALT_COMPILER
[352] => T_CLASS
[353] => T_INTERFACE
[354] => T_EXTENDS
[355] => T_IMPLEMENTS
[356] => T_OBJECT_OPERATOR
[357] => T_DOUBLE_ARROW
[358] => T_LIST
[359] => T_ARRAY
[360] => T_CLASS_C
[361] => T_METHOD_C
[362] => T_FUNC_C
[363] => T_LINE
[364] => T_FILE
[365] => T_COMMENT
[366] => T_DOC_COMMENT
[367] => T_OPEN_TAG
[368] => T_OPEN_TAG_WITH_ECHO
[369] => T_CLOSE_TAG
[370] => T_WHITESPACE
[371] => T_START_HEREDOC
[372] => T_END_HEREDOC
[373] => T_DOLLAR_OPEN_CURLY_BRACES
[374] => T_CURLY_OPEN
[375] => T_DOUBLE_COLON
)
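The same associations can also be built without hard-coding a range of codes, which may help since the codes change between PHP versions. The following is only a minimal sketch: it assumes the tokenizer extension is loaded and that its constants are reported under the 'tokenizer' category of get_defined_constants(); the variable names $byName and $byCode are just illustrative.

```php
<?php
// Assumption: the tokenizer extension is loaded, so its constants are
// listed under the 'tokenizer' category.
$constants = get_defined_constants(true);
$tokenConstants = isset($constants['tokenizer']) ? $constants['tokenizer'] : array();

// Keep only the T_* constants; depending on the PHP version, the category
// may also contain other constants (flags, for instance).
$byName = array();
foreach ($tokenConstants as $name => $code) {
    if (strpos($name, 'T_') === 0) {
        $byName[$name] = $code;
    }
}
ksort($byName); // name => code, sorted by name

// code => name, sorted by code. When two names share one code
// (aliases), array_flip() keeps only the last one.
$byCode = array_flip($byName);
ksort($byCode);
```

print_r($byName) and print_r($byCode) then give lists similar to the ones above, with the codes of the PHP version actually running.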
Displaying token array
The tokens of a string containing PHP source code are obtained with the PHP function token_get_all().
For example, for the following code:
/** A comment for $var */
static public $var;
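token_get_all() returns the following (each token is either an array holding the token code, its text and its line number, or a plain string for single-character tokens such as ;):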
[55] => Array(
[0] => 366
[1] => /** A comment for $var */
[2] => 9
)
[56] => Array(
[0] => 370
[1] =>
[2] => 9
)
[57] => Array(
[0] => 346
[1] => static
[2] => 10
)
[58] => Array(
[0] => 370
[1] =>
[2] => 10
)
[59] => Array(
[0] => 341
[1] => public
[2] => 10
)
[60] => Array(
[0] => 370
[1] =>
[2] => 10
)
[61] => Array(
[0] => 309
[1] => $var
[2] => 10
)
[62] => ;
This dump is not easy to read, so the following function returns a more readable version:
/**
    Returns a compact dump of a token array
    @param Array $ta Token array to dump, as returned by token_get_all()
    @param Boolean $stripWhitespaces If true, T_WHITESPACE tokens are not included in the returned array
    @return Array One "text - TOKEN_NAME : line" string per token, original keys preserved
*/
function readableArray($ta, $stripWhitespaces=true){
    $res = array();
    // Iterate with foreach (each() is not available in PHP 8)
    foreach($ta as $key => $val){
        if(is_array($val)){
            if($stripWhitespaces && $val[0] == T_WHITESPACE) continue;
            $res[$key] = $val[1] . ' - ' . token_name($val[0]) . ' : ' . $val[2];
        }
        else{
            // Single-character tokens such as ';' are plain strings
            $res[$key] = $val;
        }
    }
    return $res;
}// end readableArray
For the same code, the result is:
[55] => /** A comment for $var */ - T_DOC_COMMENT : 9
[57] => static - T_STATIC : 10
[59] => public - T_PUBLIC : 10
[61] => $var - T_VARIABLE : 10
[62] => ;
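As an illustration of how such a token array can be manipulated by a program, here is a small sketch that collects the variable names found in a piece of source code. The helper findVariables() is hypothetical (it is not part of PHP nor of the code above), and the sample source is the same snippet as before, prefixed with an opening tag so that the tokenizer treats it as PHP code.

```php
<?php
// Hypothetical helper: returns the names of all variables found
// in a string of PHP source code.
function findVariables($source) {
    $names = array();
    foreach (token_get_all($source) as $token) {
        // Single-character tokens such as ';' are plain strings: skip them.
        if (!is_array($token)) {
            continue;
        }
        // $token[0] is the token code, $token[1] the matching text.
        if ($token[0] === T_VARIABLE) {
            $names[] = $token[1];
        }
    }
    return $names;
}

$source = "<?php\n/** A comment for \$var */\nstatic public \$var;\n";
print_r(findVariables($source));
```

The $var mentioned inside the comment is not reported, because the whole comment is a single T_DOC_COMMENT token. Comparing against the constant T_VARIABLE rather than the number 309 keeps the code valid across PHP versions.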