parse_url
(PHP 4, PHP 5, PHP 7, PHP 8)
parse_url — Parse a URL and return its components
Description
$url
, int $component
= -1): int|string|array|null|falseThis function parses a URL and returns an associative array containing any of the various components of the URL that are present. The values of the array elements are not URL decoded.
This function is not meant to validate the given URL, it only breaks it up into the parts listed below. Partial and invalid URLs are also accepted, parse_url() tries its best to parse them correctly.
Parameters
-
url
-
The URL to parse.
-
component
-
Specify one of
PHP_URL_SCHEME
,PHP_URL_HOST
,PHP_URL_PORT
,PHP_URL_USER
,PHP_URL_PASS
,PHP_URL_PATH
,PHP_URL_QUERY
orPHP_URL_FRAGMENT
to retrieve just a specific URL component as a string (except whenPHP_URL_PORT
is given, in which case the return value will be an int).
Return Values
On seriously malformed URLs, parse_url() may return
false
.
If the component
parameter is omitted, an
associative array is returned. At least one element will be
present within the array. Potential keys within this array are:
- scheme - e.g. http
- host
- port
- user
- pass
- path
-
query - after the question mark
?
-
fragment - after the hashmark
#
If the component
parameter is specified,
parse_url() returns a string (or an
int, in the case of PHP_URL_PORT
)
instead of an array. If the requested component doesn't exist
within the given URL, null
will be returned.
As of PHP 8.0.0, parse_url() distinguishes absent and empty
queries and fragments:
http://example.com/foo → query = null, fragment = null http://example.com/foo? → query = "", fragment = null http://example.com/foo# → query = null, fragment = "" http://example.com/foo?# → query = "", fragment = ""
Previously all cases resulted in query and fragment being null
.
Note that control characters (cf. ctype_cntrl()) in the
components are replaced with underscores (_
).
Changelog
Version | Description |
---|---|
8.0.0 | parse_url() will now distinguish absent and empty queries and fragments. |
Examples
Example #1 A parse_url() example
<?php
$url = 'http://username:password@hostname:9090/path?arg=value#anchor';
var_dump(parse_url($url));
var_dump(parse_url($url, PHP_URL_SCHEME));
var_dump(parse_url($url, PHP_URL_USER));
var_dump(parse_url($url, PHP_URL_PASS));
var_dump(parse_url($url, PHP_URL_HOST));
var_dump(parse_url($url, PHP_URL_PORT));
var_dump(parse_url($url, PHP_URL_PATH));
var_dump(parse_url($url, PHP_URL_QUERY));
var_dump(parse_url($url, PHP_URL_FRAGMENT));
?>
The above example will output:
array(8) { ["scheme"]=> string(4) "http" ["host"]=> string(8) "hostname" ["port"]=> int(9090) ["user"]=> string(8) "username" ["pass"]=> string(8) "password" ["path"]=> string(5) "/path" ["query"]=> string(9) "arg=value" ["fragment"]=> string(6) "anchor" } string(4) "http" string(8) "username" string(8) "password" string(8) "hostname" int(9090) string(5) "/path" string(9) "arg=value" string(6) "anchor"
Example #2 A parse_url() example with missing scheme
<?php
$url = '//www.example.com/path?googleguy=googley';
// Prior to 5.4.7 this would show the path as "//www.example.com/path"
var_dump(parse_url($url));
?>
The above example will output:
array(3) { ["host"]=> string(15) "www.example.com" ["path"]=> string(5) "/path" ["query"]=> string(17) "googleguy=googley" }
Notes
This function may not give correct results for relative or invalid URLs,
and the results may not even match common behavior of HTTP clients.
If URLs from untrusted input need to be parsed, extra validation is
required, e.g. by using filter_var() with the
FILTER_VALIDATE_URL
filter.
Note:
This function is intended specifically for the purpose of parsing URLs and not URIs. However, to comply with PHP's backwards compatibility requirements it makes an exception for the file:// scheme where triple slashes (file:///...) are allowed. For any other scheme this is invalid.
See Also
- pathinfo() - Returns information about a file path
- parse_str() - Parses the string into variables
- http_build_query() - Generate URL-encoded query string
- dirname() - Returns a parent directory's path
- basename() - Returns trailing name component of path
- » RFC 3986