Is there any way to force the Google Speech API to return only words as the response?

android java speech-recognition google-speech-api

2022-09-02 22:40:26

I am using this Google API:

https://www.google.com/speech-api/v2/recognize?output=json&lang="+ language_code+"&key="My key"

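For speech recognition it works well.

The problem is with numbers: if I say "one two three four" the result is "1234", and if I say "one thousand two hundred thirty four" the result is still "1234".

Another problem is with other languages: in German, "elf" is the word for eleven. If you say "elf", the result is "11" rather than "elf".

I know we have no control over the API, but is there any parameter or trick we can add to this API to force it to return only words?

The response sometimes contains the correct result, but not always.

These are sample responses:

1) When I say "one two three four"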

{"result":[{"alternative":[{"transcript":"1234","confidence":0.47215959},{"transcript":"1 2 3 4","confidence":0.25},{"transcript":"one two three four","confidence":0.25},{"transcript":"1 2 34","confidence":0.33333334},{"transcript":"1 to 34","confidence":1}],"final":true}],"result_index":0}

2) When I say "one thousand two hundred thirty four"

{"result":[{"alternative":[{"transcript":"1234","confidence":0.94247383},{"transcript":"1.254","confidence":1},{"transcript":"1284","confidence":1},{"transcript":"1244","confidence":1},{"transcript":"1230 4","confidence":1}],"final":true}],"result_index":0}

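What I have done:

I check whether the result is a number, then split it into digits separated by spaces and check whether the same sequence exists in the results array. For example, the result 1234 becomes 1 2 3 4, and I search the results array for a matching sequence, which I then convert to words. In the second case there is no 1 2 3 4, so I stick with the original result.

Here is the code.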

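  // Needs java.util.regex.Pattern; jsonArray2 is the "alternative" array (org.json.JSONArray).
  // Only rewrite the top result when it consists purely of digits.
  if (Pattern.matches("[0-9]+", output)) {
      // Space out the digits: "1234" -> "1 2 3 4"
      StringBuilder spaced = new StringBuilder();
      for (char c : output.toCharArray()) {
          spaced.append(c).append(' ');
      }
      String digits = spaced.toString().trim();

      // Start at 1: alternative 0 is the top result itself.
      for (int i = 1; i < jsonArray2.length(); i++) {
          String value = jsonArray2.getJSONObject(i).getString("transcript");
          if (digits.equals(value.trim())) {
              output = digits;
              break;
          }
      }
  }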

The problem is that when I say "thirteen four eight", this approach splits 13 into "one three", so it is not a reliable solution.
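A variation on the workaround above is to scan the alternatives for a transcript that contains no digits at all, instead of matching a spaced-out digit sequence. The following is only a sketch under assumptions: the class and method names are hypothetical, and the transcripts are assumed to have already been extracted from the "alternative" array into a list of strings.

```java
import java.util.List;

final class TranscriptPicker {
    /**
     * Returns the first non-empty alternative that contains no digits.
     * Falls back to the top alternative when every alternative has digits.
     */
    static String pickWordTranscript(List<String> transcripts) {
        if (transcripts.isEmpty()) {
            throw new IllegalArgumentException("no alternatives");
        }
        for (String t : transcripts) {
            if (!t.isEmpty() && t.chars().noneMatch(Character::isDigit)) {
                return t;
            }
        }
        return transcripts.get(0);
    }
}
```

For the first sample response this would pick "one two three four". For the second it falls back to "1234", because no alternative is spelled out in words, so it does not solve that case.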

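Update

I tried the new Cloud Speech API (https://cloud.google.com/speech/), and it is a bit better than v2. The result for "one two three four" comes back as the words themselves, and my workaround also works. But when I say "thirteen four eight" the result is still the same as in v2.

And "elf" is still 11 in German.

I also tried speech_context, and that did not work either.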

Answer 1

Take a look at this question and answer.

You can give the API "speech context" hints, like this:

"speech_context": {
  "phrases":["zero", "one", "two", ... "nine", "ten", "eleven", ... "twenty", "thirty", ..., "ninety"]
 }

I would guess this also works for other languages, such as German.

"speech_context": {
  "phrases":["eins", "zwei", "drei", ..., "elf", "zwölf" ... ]
 }
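If the phrase list is kept in code, the fragment can be generated rather than typed by hand. This is a minimal sketch: the class and method names are hypothetical, and the field names "speech_context" and "phrases" are taken from this answer, not checked against the current API.

```java
import java.util.List;
import java.util.stream.Collectors;

final class SpeechContextHint {
    /** Builds the "speech_context" fragment shown above from a list of phrases. */
    static String toJson(List<String> phrases) {
        String items = phrases.stream()
                // Escape backslashes and double quotes so each phrase is a valid JSON string.
                .map(p -> "\"" + p.replace("\\", "\\\\").replace("\"", "\\\"") + "\"")
                .collect(Collectors.joining(", "));
        return "\"speech_context\": {\"phrases\": [" + items + "]}";
    }
}
```

The returned string is only a fragment and still has to be placed inside the request's configuration object.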

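Answer 2

You may need to convert the numerals (the digits, not the numbers) to words yourself. Since most languages (e.g. English, German) follow some logic here, you can do this algorithmically.

See "How to convert number to words in java".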
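As an illustration of the algorithmic approach for English, here is a minimal sketch. The class and method names are hypothetical, the supported range (0 to 999999) is an arbitrary choice, and the output uses no hyphens or "and" so that it matches the phrasing in the question.

```java
final class NumberWords {
    private static final String[] SMALL = {
        "zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine",
        "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen", "sixteen",
        "seventeen", "eighteen", "nineteen"
    };
    private static final String[] TENS = {
        "", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety"
    };

    /** Reads n as a single number, e.g. 1234 -> "one thousand two hundred thirty four". */
    static String toWords(int n) {
        if (n < 0 || n > 999_999) {
            throw new IllegalArgumentException("supported range is 0..999999");
        }
        if (n < 20) {
            return SMALL[n];
        }
        if (n < 100) {
            return TENS[n / 10] + (n % 10 == 0 ? "" : " " + SMALL[n % 10]);
        }
        if (n < 1000) {
            return SMALL[n / 100] + " hundred" + (n % 100 == 0 ? "" : " " + toWords(n % 100));
        }
        return toWords(n / 1000) + " thousand" + (n % 1000 == 0 ? "" : " " + toWords(n % 1000));
    }

    /** Reads a digit string one digit at a time, e.g. "1234" -> "one two three four". */
    static String spellDigits(String digits) {
        StringBuilder sb = new StringBuilder();
        for (char c : digits.toCharArray()) {
            if (c < '0' || c > '9') {
                throw new IllegalArgumentException("not a digit: " + c);
            }
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(SMALL[c - '0']);
        }
        return sb.toString();
    }
}
```

Note that the transcript "1234" alone does not say which of the two readings was spoken, so the caller still has to decide between toWords and spellDigits. German would need its own rules (for example, units come before tens).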