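I had to deal with this for a PHP class I wrote a few weeks ago and ended up with a regex that matches the common forms of YouTube URL strings: with or without a URL scheme, with or without a subdomain, youtube.com URLs, youtu.be URLs, and any ordering of the query parameters. You can check it out on GitHub, or simply copy and paste the code block below: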
/**
 * Check if input string is a valid YouTube URL
 * and try to extract the YouTube Video ID from it.
 * @author Stephan Schmitz <eyecatchup@gmail.com>
 * @param $url string The string that shall be checked.
 * @return mixed Returns YouTube Video ID, or (boolean) false.
 */
function parse_yturl($url)
{
    $pattern = '#^(?:https?://|//)?(?:www\.|m\.)?(?:youtu\.be/|youtube\.com/(?:embed/|v/|watch\?v=|watch\?.+&v=))([\w-]{11})(?![\w-])#';
    preg_match($pattern, $url, $matches);
    return (isset($matches[1])) ? $matches[1] : false;
}
Test cases: https://3v4l.org/GEDT0
JavaScript version: https://stackoverflow.com/a/10315969/624466
To explain the regex, here is a split-up version:
/**
 * Check if input string is a valid YouTube URL
 * and try to extract the YouTube Video ID from it.
 * @author Stephan Schmitz <eyecatchup@gmail.com>
 * @param $url string The string that shall be checked.
 * @return mixed Returns YouTube Video ID, or (boolean) false.
 */
function parse_yturl($url)
{
    $pattern = '#^(?:https?://|//)?' # Optional URL scheme. Either http, or https, or protocol-relative.
             . '(?:www\.|m\.)?'      # Optional www or m subdomain.
             . '(?:'                 # Group host alternatives:
             .   'youtu\.be/'        # Either youtu.be,
             .   '|youtube\.com/'    # or youtube.com
             .     '(?:'             # Group path alternatives:
             .       'embed/'        # Either /embed/,
             .       '|v/'           # or /v/,
             .       '|watch\?v='    # or /watch?v=,
             .       '|watch\?.+&v=' # or /watch?other_param&v=
             .     ')'               # End path alternatives.
             . ')'                   # End host alternatives.
             . '([\w-]{11})'         # 11 characters (length of YouTube video IDs).
             . '(?![\w-])#';         # Reject if the ID is overlong.
    preg_match($pattern, $url, $matches);
    return (isset($matches[1])) ? $matches[1] : false;
}
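As a rough usage sketch, the function can be run against a handful of inputs to see which forms are accepted. The function body is copied from the one-line version above so the snippet runs on its own. The sample URLs and the ID `dQw4w9WgXcQ` are illustrative assumptions, not taken from the linked test cases.

```php
<?php
// Copied from the answer above so that this sketch is self-contained.
function parse_yturl($url)
{
    $pattern = '#^(?:https?://|//)?(?:www\.|m\.)?(?:youtu\.be/|youtube\.com/(?:embed/|v/|watch\?v=|watch\?.+&v=))([\w-]{11})(?![\w-])#';
    preg_match($pattern, $url, $matches);
    return (isset($matches[1])) ? $matches[1] : false;
}

// Hypothetical sample inputs, chosen only for illustration.
$samples = array(
    'https://www.youtube.com/watch?v=dQw4w9WgXcQ',          // full URL
    'youtu.be/dQw4w9WgXcQ',                                 // short host, no scheme
    '//m.youtube.com/embed/dQw4w9WgXcQ?rel=0',              // protocol-relative, m. subdomain
    'http://youtube.com/watch?feature=share&v=dQw4w9WgXcQ', // v is not the first parameter
    'https://www.youtube.com/watch?v=dQw4w9WgXcQx',         // 12 characters: rejected by the lookahead
    'https://www.youtube.com/watch?v=tooShort',             // fewer than 11 characters
    'https://vimeo.com/12345678901',                        // different host
);

foreach ($samples as $sample) {
    $id = parse_yturl($sample);
    // Compare with === because the function returns either a string or boolean false.
    echo $sample, ' => ', ($id === false ? 'false' : $id), PHP_EOL;
}
```

The first four inputs should yield `dQw4w9WgXcQ` and the last three `false`. Checking the result with `=== false` rather than a loose comparison keeps the string and boolean return types apart.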