PHP: explode a string, but treat quoted words as a single word

2022-08-30 10:19:36

How can I explode the following string:

Lorem ipsum "dolor sit amet" consectetur "adipiscing elit" dolor

into:

array("Lorem", "ipsum", "dolor sit amet", "consectetur", "adipiscing elit", "dolor")

So the text inside quotation marks is treated as a single word.

Here is what I have so far:

$mytext = "Lorem ipsum %22dolor sit amet%22 consectetur %22adipiscing elit%22 dolor";
$noquotes = str_replace("%22", "", $mytext);
$newarray = explode(" ", $noquotes);

But my code splits every word into its own array element. How do I make the words inside quotation marks count as one word?


Answer 1

You could use preg_match_all(...):

$text = 'Lorem ipsum "dolor sit amet" consectetur "adipiscing \\"elit" dolor';
preg_match_all('/"(?:\\\\.|[^\\\\"])*"|\S+/', $text, $matches);
print_r($matches);

which will produce:

Array
(
    [0] => Array
        (
            [0] => Lorem
            [1] => ipsum
            [2] => "dolor sit amet"
            [3] => consectetur
            [4] => "adipiscing \"elit"
            [5] => dolor
        )

)

As you can see, it also accounts for escaped quotes inside quoted strings.

EDIT

A short explanation:
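Note that the matches above keep their surrounding double quotes, while the array asked for in the question does not have them. A minimal sketch of one way to strip them follows. The helper name split_quoted is hypothetical, and un-escaping \" back to " is an assumption about the desired result, not something the answer specifies.

```php
<?php
// Hypothetical helper (not from the original answer): split on whitespace,
// keep double-quoted phrases together, and drop the surrounding quotes.
function split_quoted(string $text): array
{
    preg_match_all('/"(?:\\\\.|[^\\\\"])*"|\S+/', $text, $matches);

    return array_map(function (string $token): string {
        // Only unwrap tokens that both start and end with a double quote.
        if (strlen($token) >= 2 && $token[0] === '"' && substr($token, -1) === '"') {
            $inner = substr($token, 1, -1);
            // Assumption: an escaped character such as \" should become ".
            return preg_replace('/\\\\(.)/s', '$1', $inner);
        }
        return $token;
    }, $matches[0]);
}

$words = split_quoted('Lorem ipsum "dolor sit amet" consectetur "adipiscing elit" dolor');
print_r($words);
```

With the sample sentence this should give the six elements listed in the question, with "dolor sit amet" and "adipiscing elit" each as one element.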

"           # match the character '"'
(?:         # start non-capture group 1 
  \\        #   match the character '\'
  .         #   match any character except line breaks
  |         #   OR
  [^\\"]    #   match any character except '\' and '"'
)*          # end non-capture group 1 and repeat it zero or more times
"           # match the character '"'
|           # OR
\S+         # match a non-whitespace character: [^\s] and repeat it one or more times

And if you need to match %22 instead of double quotes, you would do:

preg_match_all('/%22(?:\\\\.|(?!%22).)*%22|\S+/', $text, $matches);
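As a rough sketch, here is that pattern applied to the $mytext string from the question. Removing the %22 delimiters afterwards is an assumption about what the asker wants, and the variable names are simply reused from the question.

```php
<?php
$mytext = 'Lorem ipsum %22dolor sit amet%22 consectetur %22adipiscing elit%22 dolor';

// Same pattern as above: a %22 ... %22 run, or else a run of non-whitespace.
preg_match_all('/%22(?:\\\\.|(?!%22).)*%22|\S+/', $mytext, $matches);

// Assumption: the %22 delimiters themselves are not wanted in the result.
$newarray = array_map(function (string $token): string {
    return preg_match('/^%22(.*)%22$/s', $token, $m) ? $m[1] : $token;
}, $matches[0]);

print_r($newarray);
```

Tokens that are not wrapped in %22 on both sides are passed through unchanged.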

Answer 2

This would be much easier with str_getcsv().

$test = 'Lorem ipsum "dolor sit amet" consectetur "adipiscing elit" dolor';
var_dump(str_getcsv($test, ' '));

gives you:

array(6) {
  [0]=>
  string(5) "Lorem"
  [1]=>
  string(5) "ipsum"
  [2]=>
  string(14) "dolor sit amet"
  [3]=>
  string(11) "consectetur"
  [4]=>
  string(15) "adipiscing elit"
  [5]=>
  string(5) "dolor"
}
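The question's input uses %22 rather than literal quotes, so the same idea might be applied after converting %22 to ". The sketch below assumes %22 only ever stands for a double quote in the input. It also filters out empty fields, since str_getcsv() treats consecutive delimiters as empty fields; whether that filtering is wanted is an assumption as well.

```php
<?php
$mytext = 'Lorem ipsum %22dolor sit amet%22 consectetur %22adipiscing elit%22 dolor';

// Assumption: %22 always represents a double quote in this input.
$decoded = str_replace('%22', '"', $mytext);

// Space as delimiter, double quote as enclosure, escape character passed explicitly.
$fields = str_getcsv($decoded, ' ', '"', '\\');

// Consecutive spaces would yield empty fields; drop them (assumed unwanted).
$newarray = array_values(array_filter($fields, function ($field) {
    return $field !== null && $field !== '';
}));

print_r($newarray);
```

Unlike the regular expression approach, the enclosing quotes are already removed by str_getcsv() itself.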
