Android Java UTF-8 HttpClient Problem

2022-09-03 03:43:20

I'm running into a strange character encoding problem with a JSON array fetched from a web page. The server sends back this header:

Content-Type: text/javascript; charset=UTF-8

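I can also view the JSON output in Firefox or any other browser, and the Unicode characters display correctly. The response sometimes contains words from another language, with accent marks and the like. However, when I pull it down and put it into a String in Java, I get those weird question marks. Here is my code: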

HttpParams params = new BasicHttpParams();
HttpProtocolParams.setVersion(params, HttpVersion.HTTP_1_1);
HttpProtocolParams.setContentCharset(params, "utf-8");
params.setBooleanParameter("http.protocol.expect-continue", false);

HttpClient httpclient = new DefaultHttpClient(params);

HttpGet httpget = new HttpGet("http://www.example.com/json_array.php");
HttpResponse response;
    try {
        response = httpclient.execute(httpget);

        if(response.getStatusLine().getStatusCode() == 200){
            // Connection was established. Get the content. 

            HttpEntity entity = response.getEntity();
            // If the response does not enclose an entity, there is no need
            // to worry about connection release

            if (entity != null) {
                // A Simple JSON Response Read
                InputStream instream = entity.getContent();
                String jsonText = convertStreamToString(instream);

                Toast.makeText(getApplicationContext(), "Response: "+jsonText, Toast.LENGTH_LONG).show();

            }

        }


    } catch (MalformedURLException e) {
        Toast.makeText(getApplicationContext(), "ERROR: Malformed URL - "+e.getMessage(), Toast.LENGTH_LONG).show();
        e.printStackTrace();
    } catch (IOException e) {
        Toast.makeText(getApplicationContext(), "ERROR: IO Exception - "+e.getMessage(), Toast.LENGTH_LONG).show();
        e.printStackTrace();
    } catch (JSONException e) {
        Toast.makeText(getApplicationContext(), "ERROR: JSON - "+e.getMessage(), Toast.LENGTH_LONG).show();
        e.printStackTrace();
    }

private static String convertStreamToString(InputStream is) {
    /*
     * To convert the InputStream to a String we use BufferedReader.readLine().
     * We iterate until the BufferedReader returns null, which means there is
     * no more data to read. Each line is appended to a StringBuilder, and the
     * result is returned as a String.
     */
    StringBuilder sb = new StringBuilder();
    try {
        // Declaring the reader inside the try keeps it definitely assigned.
        BufferedReader reader = new BufferedReader(new InputStreamReader(is, "UTF-8"));
        String line;
        while ((line = reader.readLine()) != null) {
            sb.append(line).append("\n");
        }
    } catch (IOException e) {
        // Also covers UnsupportedEncodingException, which extends IOException.
        e.printStackTrace();
    } finally {
        try {
            is.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
    return sb.toString();
}

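As you can see, I specify UTF-8 on the InputStreamReader, but every time I view the returned JSON text via Toast it has strange question marks. I'm thinking that I need to send the InputStream to a byte[] instead?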

Thanks in advance for any help.


Answer 1

Try this:

if (entity != null) {
    // A Simple JSON Response Read
    // InputStream instream = entity.getContent();
    // String jsonText = convertStreamToString(instream);

    String jsonText = EntityUtils.toString(entity, HTTP.UTF_8);

    // ... toast code here
}
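
The second argument to EntityUtils.toString is, as far as I know, a default charset that is only used when the entity itself does not declare one. The same idea in plain Java, as a rough sketch: the class CharsetFromHeader and the method charsetOf are hypothetical names made up for illustration, not part of HttpClient, and the parsing is deliberately simplistic.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class CharsetFromHeader {

    /**
     * Hypothetical helper: returns the charset named in a Content-Type
     * header value, or the fallback if none is present or it is unknown.
     */
    public static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Illegal or unsupported charset name: use the fallback.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        Charset cs = charsetOf("text/javascript; charset=UTF-8", StandardCharsets.ISO_8859_1);
        // The two bytes 0xC3 0xA9 are the UTF-8 encoding of U+00E9.
        byte[] body = {(byte) 0xC3, (byte) 0xA9};
        System.out.println(cs + " -> " + new String(body, cs));
    }
}
```

Decoding the response bytes with the charset obtained this way, rather than whatever the platform default happens to be, is the point of passing HTTP.UTF_8 above.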

Answer 2

@Arhimed's answer is the solution. But I cannot see anything obviously wrong with your convertStreamToString code.

My guesses are:

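  1. The server is putting a UTF byte order mark (BOM) at the start of the stream. The standard Java UTF-8 character decoder does not remove the BOM, so chances are that it would end up in the resulting String. (However, the code for EntityUtils doesn't seem to do anything with BOMs either.)
  2. Your convertStreamToString is reading the character stream a line at a time, and reassembling it using a hard-wired '\n' as the end-of-line marker. If you are going to write that to an external file or application, you should probably be using a platform-specific end-of-line marker.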
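
Both guesses can be ruled out with a small change to the reading code. This is a minimal sketch in plain Java, not the asker's method: the class name Utf8StreamReader and both method names are hypothetical, and it assumes java.nio.charset.StandardCharsets and try-with-resources are available on your target platform. It reads characters in blocks, so the original line endings are kept as sent, and it drops a leading U+FEFF if the decoder passed one through.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

public class Utf8StreamReader {

    /** Reads the whole stream as UTF-8, keeping line endings as sent. */
    public static String readAll(InputStream is) throws IOException {
        StringBuilder sb = new StringBuilder();
        // Closing the reader also closes the underlying stream.
        try (Reader reader = new InputStreamReader(is, StandardCharsets.UTF_8)) {
            char[] buffer = new char[4096];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                sb.append(buffer, 0, n);
            }
        }
        return stripBom(sb.toString());
    }

    /** Removes a single leading byte order mark (U+FEFF), if present. */
    public static String stripBom(String text) {
        if (!text.isEmpty() && text.charAt(0) == '\uFEFF') {
            return text.substring(1);
        }
        return text;
    }

    public static void main(String[] args) throws IOException {
        // Simulated response body: BOM, a small JSON array, CRLF line ending.
        byte[] body = "\uFEFF[\"caf\u00e9\"]\r\n".getBytes(StandardCharsets.UTF_8);
        String text = readAll(new ByteArrayInputStream(body));
        System.out.println(text.startsWith("["));
    }
}
```

If the question marks are still there after this, the BOM and the line handling are probably not the cause, and the problem is more likely in how the text is decoded or displayed.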
