I've worked out the following expression to match the various enclosures (quotes) and escape sequences:
$pattern = <<<REGEX
/
(?:
" ((?:(?<=\\\\)"|[^"])*) "
|
' ((?:(?<=\\\\)'|[^'])*) '
|
(\S+)
)
/x
REGEX;
preg_match_all($pattern, $input, $matches, PREG_SET_ORDER);
It matches:
- Two double quotes, inside of which a double quote may be escaped
- The same as the first item, but for single quotes
- An unquoted string
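To see which alternative fires for a given token, it can help to print the highest capture group present in each match. The sketch below reuses the same pattern (written as a nowdoc, so the backslashes are not doubled) against a made-up input; `matched_group` is a hypothetical helper name, not part of the original code.

```php
<?php
// Same pattern as above, written as a nowdoc so "\\" reaches PCRE unchanged.
$pattern = <<<'REGEX'
/
(?:
  " ((?:(?<=\\)"|[^"])*) "
|
  ' ((?:(?<=\\)'|[^'])*) '
|
  (\S+)
)
/x
REGEX;

// Hypothetical helper: index of the last capture group present in a match.
// With PREG_SET_ORDER, trailing groups that did not participate are omitted.
function matched_group(array $match): int
{
    return count($match) - 1;
}

$input = 'foo "bar baz" \'qux quux\'';   // sample input, made up for illustration
preg_match_all($pattern, $input, $matches, PREG_SET_ORDER);

$groups = array_map('matched_group', $matches);
foreach ($matches as $i => $match) {
    echo $match[0], ' => group ', $groups[$i], "\n";
}
// foo => group 3
// "bar baz" => group 1
// 'qux quux' => group 2
```

Group 1 is the double-quoted body, group 2 the single-quoted body, and group 3 the unquoted token.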
Afterwards, you need to (carefully) remove the escape characters:
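$args = array();
foreach ($matches as $match) {
    // With PREG_SET_ORDER, trailing groups that did not participate are
    // omitted, so the highest index present tells which alternative matched.
    if (isset($match[3])) {
        // unquoted string: take it as-is
        $args[] = $match[3];
    } elseif (isset($match[2])) {
        // single-quoted: turn \' into ' and \\ into \
        $args[] = str_replace(['\\\'', '\\\\'], ["'", '\\'], $match[2]);
    } else {
        // double-quoted: turn \" into " and \\ into \
        $args[] = str_replace(['\\"', '\\\\'], ['"', '\\'], $match[1]);
    }
}
print_r($args);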
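Putting both steps together, here is a minimal end-to-end sketch. The function name `parse_args` and the sample input are made up for illustration; the pattern and the unescaping logic are the ones shown above, with the pattern written as a nowdoc so the backslashes are not doubled.

```php
<?php
// Hypothetical wrapper around the two steps above: match, then unescape.
function parse_args(string $input): array
{
    $pattern = <<<'REGEX'
/
(?:
  " ((?:(?<=\\)"|[^"])*) "
|
  ' ((?:(?<=\\)'|[^'])*) '
|
  (\S+)
)
/x
REGEX;

    preg_match_all($pattern, $input, $matches, PREG_SET_ORDER);

    $args = [];
    foreach ($matches as $match) {
        if (isset($match[3])) {
            $args[] = $match[3];                       // unquoted
        } elseif (isset($match[2])) {
            $args[] = str_replace(['\\\'', '\\\\'], ["'", '\\'], $match[2]);
        } else {
            $args[] = str_replace(['\\"', '\\\\'], ['"', '\\'], $match[1]);
        }
    }

    return $args;
}

// Sample input (an assumption), kept in a nowdoc so it reads literally.
$input = <<<'TXT'
copy "my \"big\" file.txt" 'it\'s here' dest
TXT;

print_r(parse_args($input));
// [0] => copy
// [1] => my "big" file.txt
// [2] => it's here
// [3] => dest
```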
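Update
For the fun of it, I've written a more formal parser, outlined below. It won't give you better performance; it's about three times slower than the regular expression, mostly because of its object-oriented nature. I suppose the advantage is more academic than practical: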
class ArgvParser2 extends StringIterator
{
    const TOKEN_DOUBLE_QUOTE = '"';
    const TOKEN_SINGLE_QUOTE = "'";
    const TOKEN_SPACE = ' ';
    const TOKEN_ESCAPE = '\\';

    // Walks the whole string and returns the list of arguments.
    public function parse()
    {
        $this->rewind();
        $args = [];
        while ($this->valid()) {
            switch ($this->current()) {
                case self::TOKEN_DOUBLE_QUOTE:
                case self::TOKEN_SINGLE_QUOTE:
                    $args[] = $this->QUOTED($this->current());
                    break;
                case self::TOKEN_SPACE:
                    // separator between arguments
                    $this->next();
                    break;
                default:
                    $args[] = $this->UNQUOTED();
            }
        }
        return $args;
    }

    // Reads a quoted argument; $enclosure is the quote character that opened it.
    private function QUOTED($enclosure)
    {
        $this->next(); // skip the opening quote
        $result = '';
        while ($this->valid()) {
            if ($this->current() == self::TOKEN_ESCAPE) {
                $this->next();
                if ($this->valid() && $this->current() == $enclosure) {
                    // escaped quote
                    $result .= $enclosure;
                } elseif ($this->valid()) {
                    // \\ becomes a single backslash; any other pair is kept as-is
                    $result .= self::TOKEN_ESCAPE;
                    if ($this->current() != self::TOKEN_ESCAPE) {
                        $result .= $this->current();
                    }
                }
            } elseif ($this->current() == $enclosure) {
                // closing quote
                $this->next();
                break;
            } else {
                $result .= $this->current();
            }
            $this->next();
        }
        return $result;
    }

    // Reads an unquoted argument up to the next space.
    private function UNQUOTED()
    {
        $result = '';
        while ($this->valid()) {
            if ($this->current() == self::TOKEN_SPACE) {
                $this->next();
                break;
            } else {
                $result .= $this->current();
            }
            $this->next();
        }
        return $result;
    }

    public static function parseString($input)
    {
        $parser = new self($input);
        return $parser->parse();
    }
}
It's based on StringIterator, which walks through a string one character at a time:
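class StringIterator implements Iterator
{
    private $string;
    // Start at 0 so the iterator is usable even before rewind() is called.
    private $current = 0;

    public function __construct($string)
    {
        $this->string = $string;
    }

    // The attribute silences the PHP 8.1+ return-type deprecation notice;
    // on PHP 7 the line is read as an ordinary comment.
    #[\ReturnTypeWillChange]
    public function current()
    {
        return $this->string[$this->current];
    }

    public function next(): void
    {
        ++$this->current;
    }

    #[\ReturnTypeWillChange]
    public function key()
    {
        return $this->current;
    }

    public function valid(): bool
    {
        return $this->current < strlen($this->string);
    }

    public function rewind(): void
    {
        $this->current = 0;
    }
}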
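For comparison, the same character-by-character idea can be sketched without classes. This is only an illustration: `split_args` is a hypothetical name, and the function follows roughly the same rules as the parser above (a space separates arguments, `\"` or `\'` yields the quote, `\\` yields one backslash); its behavior on edge cases such as a trailing backslash is not guaranteed to be identical.

```php
<?php
// Hypothetical procedural sketch of the character-by-character scan.
function split_args(string $input): array
{
    $args = [];
    $length = strlen($input);
    $i = 0;

    while ($i < $length) {
        $char = $input[$i];

        if ($char === ' ') {             // separator between arguments
            $i++;
            continue;
        }

        $result = '';
        if ($char === '"' || $char === "'") {
            $enclosure = $char;
            $i++;                        // step past the opening quote
            while ($i < $length && $input[$i] !== $enclosure) {
                if ($input[$i] === '\\' && $i + 1 < $length) {
                    $next = $input[$i + 1];
                    if ($next === $enclosure) {
                        $result .= $enclosure;      // escaped quote
                    } elseif ($next === '\\') {
                        $result .= '\\';            // escaped backslash
                    } else {
                        $result .= '\\' . $next;    // any other pair, kept verbatim
                    }
                    $i += 2;
                } else {
                    $result .= $input[$i++];
                }
            }
            $i++;                        // step past the closing quote, if present
        } else {
            while ($i < $length && $input[$i] !== ' ') {
                $result .= $input[$i++];
            }
        }

        $args[] = $result;
    }

    return $args;
}

// Sample input (an assumption), kept in a nowdoc so it reads literally.
$input = <<<'TXT'
copy "my \"big\" file.txt" 'it\'s here' dest
TXT;

print_r(split_args($input));
// [0] => copy
// [1] => my "big" file.txt
// [2] => it's here
// [3] => dest
```

With the classes above loaded, `ArgvParser2::parseString($input)` is the equivalent call.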