Re: Specification for UTF-8

From: Scott Ferguson <ferg@xxx.com>
Date: Fri May 12 2006 - 08:09:12 PDT

On May 12, 2006, at 3:32 AM, Petr Gladkikh wrote:

> Hello.
>
> The current Hessian specification is not clear about the lengths of
> UTF-8 encoded data.
> It is reasonable to think that binary data, XML and strings are all
> measured in the number of octets in a chunk to be sent/read. And I
> believe it is the "right thing". But the Java implementation sends the
> number of symbols in the original string, not the number of octets in
> the encoded data.
> I think the specification should at least clarify the units in which
> length is measured. (Although I vote for the number of octets, of
> course :)

The length is in 16-bit characters for strings and XML. That reduces
the computation needed on both ends when the language, like Java,
represents a string as characters (as opposed to an encoded byte array).
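
To illustrate the difference between the two units, here is a minimal
sketch in Java. The class and method names are hypothetical and are not
taken from the Hessian implementation; it only compares the count of
16-bit characters with the count of UTF-8 octets for the same string.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the Hessian library.
public class HessianLengthSketch {
    // Length in 16-bit characters (UTF-16 code units), as String.length() reports it.
    static int lengthInChars(String s) {
        return s.length();
    }

    // Length in octets after encoding the string as UTF-8.
    static int lengthInUtf8Octets(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String ascii = "hello";
        String cyrillic = "\u041f\u0435\u0442\u0440"; // "Petr" written in Cyrillic
        for (String s : new String[] { ascii, cyrillic }) {
            System.out.println(lengthInChars(s) + " chars, "
                    + lengthInUtf8Octets(s) + " octets");
        }
        // prints:
        // 5 chars, 5 octets
        // 4 chars, 8 octets
    }
}
```

For plain ASCII the two counts agree, so the difference only shows up
once the string contains characters that need more than one octet in
UTF-8. Getting the character count is a cheap String.length() call,
while the octet count requires encoding the string first.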

-- Scott

>
>
> --
> Petr Gladkikh
>
Received on Fri 12 May 2006 08:09:12 -0700