Are there C++ equivalents for the Protocol Buffers delimited I/O functions in Java?

serialization java c++ protocol-buffers

2022-08-31 14:31:11

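I'm trying to read and write multiple Protocol Buffers messages from files, in both C++ and Java. Google suggests writing a length prefix before each message, but as far as I can see there is no way to do that by default.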

However, the Java API in version 2.1.0 received a set of "delimited" I/O functions that apparently do this job:

parseDelimitedFrom
mergeDelimitedFrom
writeDelimitedTo

Are there C++ equivalents? And if not, what is the wire format of the size prefix that the Java API attaches, so that I can parse these messages in C++?

Update:

As of v3.3.0, these now exist in google/protobuf/util/delimited_message_util.h.

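Answer 1

I'm a bit late to the party here, but the implementations below include some optimizations missing from the other answers, and they will not fail after 64MB of input (although they still enforce the 64MB limit on each individual message, just not on the whole stream).

(I'm the author of the C++ and Java protobuf libraries, but I no longer work for Google. I'm sorry this code never made it into the official lib. This is what it would look like if it had.)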

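#include <cstdint>

#include <google/protobuf/io/coded_stream.h>
#include <google/protobuf/io/zero_copy_stream.h>
#include <google/protobuf/message_lite.h>

bool writeDelimitedTo(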
    const google::protobuf::MessageLite& message,
    google::protobuf::io::ZeroCopyOutputStream* rawOutput) {
  // We create a new coded stream for each message.  Don't worry, this is fast.
  google::protobuf::io::CodedOutputStream output(rawOutput);

  // Write the size.
  const int size = message.ByteSize();
  output.WriteVarint32(size);

  uint8_t* buffer = output.GetDirectBufferForNBytesAndAdvance(size);
  if (buffer != NULL) {
    // Optimization:  The message fits in one buffer, so use the faster
    // direct-to-array serialization path.
    message.SerializeWithCachedSizesToArray(buffer);
  } else {
    // Slightly-slower path when the message is multiple buffers.
    message.SerializeWithCachedSizes(&output);
    if (output.HadError()) return false;
  }

  return true;
}

bool readDelimitedFrom(
    google::protobuf::io::ZeroCopyInputStream* rawInput,
    google::protobuf::MessageLite* message) {
  // We create a new coded stream for each message.  Don't worry, this is fast,
  // and it makes sure the 64MB total size limit is imposed per-message rather
  // than on the whole stream.  (See the CodedInputStream interface for more
  // info on this limit.)
  google::protobuf::io::CodedInputStream input(rawInput);

  // Read the size.
  uint32_t size;
  if (!input.ReadVarint32(&size)) return false;

  // Tell the stream not to read beyond that size.
  google::protobuf::io::CodedInputStream::Limit limit =
      input.PushLimit(size);

  // Parse the message.
  if (!message->MergeFromCodedStream(&input)) return false;
  if (!input.ConsumedEntireMessage()) return false;

  // Release the limit.
  input.PopLimit(limit);

  return true;
}

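Answer 2

OK, so I haven't been able to find top-level C++ functions that implement what I need, but some spelunking through the Java API reference turned up the following in the MessageLite interface: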

void writeDelimitedTo(OutputStream output)
/*  Like writeTo(OutputStream), but writes the size of 
    the message as a varint before writing the data.   */

So the Java size prefix is a (Protocol Buffers) varint!
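To make the varint prefix concrete, here is a minimal, library-free sketch of the base-128 encoding. The names appendVarint32 and parseVarint32 are made up for illustration and are not part of the protobuf API, and the decoder is deliberately stricter than a real parser needs to be (it rejects anything longer than 5 bytes).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: appends `value` to `out` as a base-128 varint.
// Each byte carries 7 payload bits, least-significant group first; the
// high bit is set on every byte except the last one.
void appendVarint32(uint32_t value, std::vector<uint8_t>* out) {
  assert(out != nullptr);
  while (value >= 0x80) {
    out->push_back(static_cast<uint8_t>(value | 0x80));
    value >>= 7;
  }
  out->push_back(static_cast<uint8_t>(value));
}

// Hypothetical helper: decodes a varint from the start of `data`.
// Returns the number of bytes consumed, or 0 if the input is truncated
// or the varint runs past 5 bytes.
std::size_t parseVarint32(const uint8_t* data, std::size_t len,
                          uint32_t* value) {
  assert(value != nullptr);
  uint32_t result = 0;
  for (std::size_t i = 0; i < len && i < 5; ++i) {
    result |= static_cast<uint32_t>(data[i] & 0x7F) << (7 * i);
    if ((data[i] & 0x80) == 0) {
      *value = result;
      return i + 1;
    }
  }
  return 0;
}
```

For example, a message of 300 bytes gets the two-byte prefix 0xAC 0x02, while any message shorter than 128 bytes gets a single-byte prefix equal to its length.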

Armed with that information, I went digging through the C++ API and found the CodedStream header, which has these:

bool CodedInputStream::ReadVarint32(uint32 * value)
void CodedOutputStream::WriteVarint32(uint32 value)

Using those, I should be able to roll my own C++ functions that do the job.
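As a rough sketch of what "rolling your own" could look like at the stream level, the following frames already-serialized message bytes (for example, the output of SerializeToString) with a varint length prefix on standard iostreams. It deliberately avoids the protobuf headers so it stands alone. writeFrame, readFrame, roundTripDemo and kMaxFrameSize are hypothetical names, and the 64 MiB cap is an assumption chosen here to mirror the per-message limit mentioned in answer 1.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Assumed per-frame cap (64 MiB) so a corrupt prefix cannot force a huge allocation.
const uint32_t kMaxFrameSize = 64u * 1024u * 1024u;

// Writes one frame: the payload length as a varint, then the payload bytes.
bool writeFrame(const std::string& payload, std::ostream* out) {
  assert(out != nullptr);
  if (payload.size() > kMaxFrameSize) return false;
  uint32_t size = static_cast<uint32_t>(payload.size());
  while (size >= 0x80) {
    out->put(static_cast<char>((size & 0x7F) | 0x80));
    size >>= 7;
  }
  out->put(static_cast<char>(size));
  out->write(payload.data(), static_cast<std::streamsize>(payload.size()));
  return out->good();
}

// Reads one frame into *payload. Returns false at end of stream, on a
// malformed or oversized prefix, or if the payload is cut short.
bool readFrame(std::istream* in, std::string* payload) {
  assert(in != nullptr && payload != nullptr);
  uint32_t size = 0;
  int shift = 0;
  while (true) {
    const int c = in->get();
    if (c == std::char_traits<char>::eof() || shift > 28) return false;
    size |= static_cast<uint32_t>(c & 0x7F) << shift;
    if ((c & 0x80) == 0) break;
    shift += 7;
  }
  if (size > kMaxFrameSize) return false;
  payload->resize(size);
  if (size == 0) return true;
  in->read(&(*payload)[0], static_cast<std::streamsize>(size));
  return in->gcount() == static_cast<std::streamsize>(size);
}

// Usage sketch: frame two payloads into one buffer and read them back in order.
bool roundTripDemo() {
  std::stringstream stream;
  if (!writeFrame("first", &stream) || !writeFrame("second", &stream)) {
    return false;
  }
  std::string a, b;
  return readFrame(&stream, &a) && readFrame(&stream, &b) &&
         a == "first" && b == "second";
}
```

Because the prefix is the same varint that Java's writeDelimitedTo emits, a file produced this way from serialized messages should be readable with parseDelimitedFrom on the Java side, and vice versa. When reading files, open the streams in binary mode.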

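They really should add this to the main Message API, though; it's a missing feature considering that Java has it, and so does Marc Gravell's excellent protobuf-net C# port (via SerializeWithLengthPrefix and DeserializeWithLengthPrefix).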