Spark and the non-serializable DateTimeFormatter
2022-09-01 16:56:21
I'm trying to use DateTimeFormatter from java.time.format in Spark, but it appears not to be serializable. This is the relevant chunk of code:
val pattern = "<some pattern>".r
val dtFormatter = DateTimeFormatter.ofPattern("<some non-ISO pattern>")
val logs = sc.wholeTextFiles(path)
val entries = logs.flatMap(fileContent => {
  val file = fileContent._1
  val content = fileContent._2
  content.split("\\r?\\n").map(line => line match {
    case pattern(dt, ev, seq) => Some(LogEntry(LocalDateTime.parse(dt, dtFormatter), ev, seq.toInt))
    case _ => logger.error(s"Cannot parse $file: $line"); None
  })
})
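How can I avoid the exception? Is there a better library for parsing timestamps? I read that Joda is also not serializable and has been folded into Java 8's time library.
java.io.NotSerializableException: java.time.format.DateTimeFormatter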
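One common way around this kind of exception is to keep the formatter out of the closure entirely, so Spark never has to serialize it. The sketch below is a minimal illustration, not a definitive fix: `LogTimeFormat` is a hypothetical name, and the pattern `uuuu/MM/dd HH:mm:ss` is an assumption standing in for the real non-ISO pattern. It relies on two facts: a Scala `object` is initialized separately in each JVM that touches it, and `DateTimeFormatter` is immutable and thread-safe, so one instance per executor can be shared by all tasks.

```scala
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter

// Hypothetical holder object (not part of the original code).
// A closure that calls LogTimeFormat.parse refers to the object statically,
// so the formatter itself is never captured or serialized; each executor JVM
// builds its own instance the first time the object is used.
object LogTimeFormat {
  // Assumed pattern for illustration only; replace with the real non-ISO pattern.
  val formatter: DateTimeFormatter =
    DateTimeFormatter.ofPattern("uuuu/MM/dd HH:mm:ss")

  // Parses a timestamp string with the shared formatter.
  def parse(text: String): LocalDateTime =
    LocalDateTime.parse(text, formatter)
}
```

In the job above, `LocalDateTime.parse(dt, dtFormatter)` would then become `LogTimeFormat.parse(dt)`, and the local `dtFormatter` value could be dropped. This assumes the object is compiled into the application jar; objects defined interactively in a REPL session may behave differently. Depending on its type, `logger` may need the same treatment.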