Bug 8579 - System.Text.Encoding.GetEncoding("UTF-16LE") or UTF-16 - BOM incompatibility
Status: RESOLVED FIXED
Alias: None
Product: Class Libraries
Classification: Mono
Component: System
Version: 2.10.x
Hardware: PC Linux
Importance: --- normal
Target Milestone: Untriaged
Assignee: Bugzilla
Reported: 2012-11-23 17:44 UTC by Lukáš Fireš
Modified: 2012-11-26 08:57 UTC (History)

Description Lukáš Fireš 2012-11-23 17:44:38 UTC
"UTF-16" should mean UTF-16LE with a BOM. Mono does not have "UTF-16LE", and its "UTF-16" behaves like UTF-16LE (no BOM).
I encountered this while using NPOI on Ubuntu 12.10 with MonoDevelop 3.0.3.2 and Mono 2.10.8, and changed the code as follows (MONO defined, GOOD_MONO not defined):

    public static void PutUnicodeLE(String input, byte[] output, int offset)
    {
#if !MONO
        // .NET: "UTF-16LE" is a recognized encoding name.
        byte[] bytes = Encoding.GetEncoding("UTF-16LE").GetBytes(input);
        Array.Copy(bytes, 0, output, offset, bytes.Length);
#elif GOOD_MONO
        // A Mono whose "UTF-16" emitted a BOM: skip the first two bytes.
        byte[] bytes = Encoding.GetEncoding("UTF-16").GetBytes(input);
        Array.Copy(bytes, 2, output, offset, bytes.Length - 2);
#else
        // Mono as observed here: "UTF-16" yields no BOM, so copy everything.
        byte[] bytes = Encoding.GetEncoding("UTF-16").GetBytes(input);
        Array.Copy(bytes, 0, output, offset, bytes.Length);
#endif
    }

    public static void PutUnicodeLE(String input, ILittleEndianOutput out1)
    {
        byte[] bytes;
        try
        {
#if !MONO
            bytes = Encoding.GetEncoding("UTF-16LE").GetBytes(input);
#else
            bytes = Encoding.GetEncoding("UTF-16").GetBytes(input);
#endif
        }
        catch (EncoderFallbackException)
        {
            throw;
        }
#if !MONO || !GOOD_MONO
        out1.Write(bytes);
#else
        // Skip the two BOM bytes.
        out1.Write(bytes, 2, bytes.Length - 2);
#endif
    }
Comment 2 Marek Safar 2012-11-24 05:25:19 UTC
Could you provide the actual code that is failing on Mono? When I tried to reproduce it, it worked for me just fine. My test code:

using System;
using System.Text;
using System.Linq;

public class Test
{
	public static void Main()
	{
		var enc = Encoding.GetEncoding("UTF-16").GetBytes ("A");
		var enc_le = Encoding.GetEncoding("UTF-16LE").GetBytes ("A");
		Console.WriteLine(Enumerable.SequenceEqual(enc, enc_le));

		return;
	}
}
Comment 3 Lukáš Fireš 2012-11-25 04:50:50 UTC
Encoding.GetEncoding("UTF-16LE") throws an exception (encoding does not exist) for me. I have provided the actual code above; it is from the NPOI project for manipulating Excel files, and I am obviously not the first to hit this problem:

http://npoi.codeplex.com/workitem/4547
Dear developers,
I get an error when running applications using NPOI under mono:

Unhandled Exception: System.ArgumentException: Encoding name 'UTF-16LE' not supported
Parameter name: name
at System.Text.Encoding.GetEncoding (System.String name) [0x00000]
at NPOI.Util.StringUtil.PutUnicodeLE (System.String input, System.Byte[] output, Int32 offset) [0x00000]

Thank you for help,

h2o



...did you compile it under Linux? What version of Linux and Mono?
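
One way to guard against the ArgumentException quoted above is to look the encoding up by name and fall back to a known encoding when the runtime does not recognize it. This is only a sketch: the class `EncodingLookup` and the method `GetOrDefault` are hypothetical names introduced for illustration and are not part of NPOI, .NET, or Mono.

```csharp
using System;
using System.Text;

public static class EncodingLookup
{
    // Hypothetical helper: returns the encoding registered under 'name',
    // or 'fallback' when the runtime rejects the name.
    public static Encoding GetOrDefault(string name, Encoding fallback)
    {
        try
        {
            return Encoding.GetEncoding(name);
        }
        catch (ArgumentException)
        {
            // Thrown when the encoding name is not supported.
            return fallback;
        }
    }
}
```

A call such as `EncodingLookup.GetOrDefault("UTF-16LE", Encoding.Unicode)` would then return a UTF-16 little-endian encoding whether or not the name "UTF-16LE" is known to the runtime.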
Comment 4 Marek Safar 2012-11-25 04:59:14 UTC
I tested the code with Mono 3.0.
Comment 5 Lukáš Fireš 2012-11-25 05:17:15 UTC
What version of Mono Runtime? What OS?

mono --version
Mono JIT compiler version 2.10.8.1 (Debian 2.10.8.1-5ubuntu1)
Copyright (C) 2002-2011 Novell, Inc, Xamarin, Inc and Contributors. www.mono-project.com
        TLS:           __thread
        SIGSEGV:       altstack
        Notifications: epoll
        Architecture:  amd64
        Disabled:      none
        Misc:          softdebug
        LLVM:          supported, not enabled.
        GC:            Included Boehm (with typed GC and Parallel Mark)

Started with new project:

using System;
using System.Text;

namespace MonoEncBug
{
    class MainClass
    {
        public static void Main(string[] args)
        {
            // This should give me FOUR bytes starting with BOM (either FFFE or FEFF) followed by 4100
            byte[] utf16 = Encoding.GetEncoding("UTF-16").GetBytes("A");
            Console.Write("Got UTF-16, 'A' converts to {0:D} bytes:", utf16.Length);
            foreach (byte b in utf16) Console.Write(b.ToString("X2"));
            Console.WriteLine(); // ... but gives only TWO bytes

            // This should give me TWO bytes - 4100
            byte[] utf16le = Encoding.GetEncoding("UTF-16LE").GetBytes("A");
            Console.Write("Got UTF-16LE, 'A' converts to {0:D} bytes:", utf16le.Length);
            foreach (byte b in utf16le) Console.Write(b.ToString("X2"));
            Console.WriteLine(); // ... but crashes:
            // System.ArgumentException: Encoding name 'UTF-16LE' not supported
            // Parameter name: name
            // at System.Text.Encoding.GetEncoding (System.String name) [0x00000] in <filename unknown>:0
            // at MonoEncBug.MainClass.Main (System.String[] args) [0x00057] in /home/firda/devel/MonoTests/MonoEncBug/Main.cs:11
        }
    }
}
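
For context on the BOM question, here is a minimal sketch of where the byte order mark sits in the .NET encoding API. On .NET, `GetBytes` returns only the encoded characters, while the BOM is exposed separately through `GetPreamble`. The class `BomDemo` and the helper `Hex` are made-up names for illustration, and the output comments describe .NET; behavior on Mono 2.10.x was not verified for this sketch.

```csharp
using System;
using System.Text;

public static class BomDemo
{
    // Hypothetical helper: formats bytes as upper-case hex, e.g. "4100".
    public static string Hex(byte[] bytes)
    {
        var sb = new StringBuilder();
        foreach (byte b in bytes) sb.Append(b.ToString("X2"));
        return sb.ToString();
    }

    public static void Main()
    {
        Encoding utf16 = Encoding.Unicode; // UTF-16, little endian

        // The BOM is reported by GetPreamble; GetBytes does not prepend it.
        Console.WriteLine("Preamble: " + Hex(utf16.GetPreamble()));  // FFFE on .NET
        Console.WriteLine("Bytes:    " + Hex(utf16.GetBytes("A")));  // 4100 on .NET
    }
}
```

Code that needs a BOM in its output would therefore write the preamble bytes itself before the encoded text, rather than expecting `GetBytes` to include them.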
Comment 6 Marek Safar 2012-11-26 05:34:04 UTC
Already fixed in Mono 3.0
Comment 7 Lukáš Fireš 2012-11-26 06:21:24 UTC
Thanks. I was confused because I could not find any Mono 3.0 ... it is a BETA, not marked as stable :)
Comment 8 Lukáš Fireš 2012-11-26 08:57:52 UTC
P.S.: Using Encoding.Unicode works in both .NET and Mono 2.x - no need for conditional compilation ;)
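
The suggestion to use Encoding.Unicode could look roughly like the sketch below, which drops the `#if` branches from the byte-array overload of PutUnicodeLE. The class name `StringUtilSketch` is hypothetical and this is not the actual NPOI source; it assumes, as stated in the comment above, that Encoding.Unicode produces UTF-16LE bytes without a BOM on both runtimes.

```csharp
using System;
using System.Text;

public static class StringUtilSketch
{
    // Writes 'input' as UTF-16 little-endian bytes (no BOM) into 'output',
    // starting at 'offset'. No conditional compilation is needed because
    // Encoding.Unicode is a property, not a name lookup.
    public static void PutUnicodeLE(string input, byte[] output, int offset)
    {
        byte[] bytes = Encoding.Unicode.GetBytes(input);
        Array.Copy(bytes, 0, output, offset, bytes.Length);
    }
}
```

For example, with `byte[] buffer = new byte[4];`, calling `StringUtilSketch.PutUnicodeLE("A", buffer, 2)` leaves the first two bytes untouched and stores 0x41 0x00 in the last two.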