Monday, December 7, 2009

JSON String Encoding

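While recently looking into JSON string encoding, I stumbled on an interesting trick for telling the members of the Unicode encoding family apart. From RFC 4627: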
JSON text SHALL be encoded in Unicode. The default encoding is UTF-8. Since the first two characters of a JSON text will always be ASCII characters [RFC0020], it is possible to determine whether an octet stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking at the pattern of nulls in the first four octets.
00 00 00 xx UTF-32BE 
00 xx 00 xx UTF-16BE 
xx 00 00 00 UTF-32LE 
xx 00 xx 00 UTF-16LE 
xx xx xx xx UTF-8
Read more: http://www.faqs.org/rfcs/rfc4627.html#ixzz0V5v8SI97
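The table above translates almost directly into code. Below is a minimal sketch in Python, not anything taken from the RFC itself: the function name `detect_json_encoding` is made up for illustration, it assumes the stream carries no byte order mark, and it treats input shorter than four octets as UTF-8 (a two-character JSON text such as `[]` can only be that short in UTF-8).

```python
def detect_json_encoding(data: bytes) -> str:
    """Guess the Unicode encoding of a JSON octet stream.

    Looks at the pattern of null octets in the first four octets,
    following the table quoted from RFC 4627. The returned strings are
    Python codec names. Hypothetical helper, for illustration only.
    """
    # Assumption: fewer than four octets can only be UTF-8, because two
    # ASCII characters already take 4 octets in UTF-16 and 8 in UTF-32.
    if len(data) < 4:
        return "utf-8"

    # True where the octet is null (00), False where it is non-null (xx).
    nulls = tuple(octet == 0 for octet in data[:4])

    if nulls == (True, True, True, False):    # 00 00 00 xx
        return "utf-32-be"
    if nulls == (True, False, True, False):   # 00 xx 00 xx
        return "utf-16-be"
    if nulls == (False, True, True, True):    # xx 00 00 00
        return "utf-32-le"
    if nulls == (False, True, False, True):   # xx 00 xx 00
        return "utf-16-le"
    return "utf-8"                            # xx xx xx xx


if __name__ == "__main__":
    sample = '{"a": 1}'
    for codec in ("utf-8", "utf-16-be", "utf-16-le", "utf-32-be", "utf-32-le"):
        print(codec, "->", detect_json_encoding(sample.encode(codec)))
```

Once the encoding is known, something like `data.decode(detect_json_encoding(data))` gives back the text, which can then be handed to a JSON parser.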
