Bug 54417 - Native linker fails when linking framework with large number of exported functions
Status: RESOLVED FIXED
Alias: None
Product: iOS
Classification: Xamarin
Component: Tools ()
Version: XI 10.4 (C9)
Hardware: PC Windows
Importance: --- normal
Target Milestone: 15.3
Assignee: Rolf Bjarne Kvinge [MSFT]
URL:
Depends on:
Blocks:
 
Reported: 2017-04-03 19:52 UTC by Rich Zwaap
Modified: 2017-06-02 16:31 UTC (History)

Tags:
Is this bug a regression?: ---
Last known good build:


Attachments
An example of a clang command that exceeds the argument size limit (186.52 KB, text/plain)
2017-04-03 19:52 UTC, Rich Zwaap



Description Rich Zwaap 2017-04-03 19:52:51 UTC
Created attachment 21162 [details]
An example of a clang command that exceeds the argument size limit

Linking a native framework fails under the following conditions:

1.) The framework contains a very large number of exported functions
2.) These functions are made accessible to managed code via p/invoke (i.e. managed functions exist that are decorated with the DllImport attribute and whose signatures match the native exports)
3.) The linker behavior is set to "Don't link"

When these conditions are met, the native linking step of the build fails with the message "Native linking failed. Please review the build log."  However, the build logs do not contain any additional information.

When investigating further, I tried executing the clang command that was used by the Xamarin build process for linking the native framework independently of the build.  That immediately errored out with a message of "Argument list too long."  What appears to be happening is:

1.) When linker behavior is "Don't link," the Xamarin build process includes an argument of "-u _<function name>" for every DllImport
2.) The number of DllImports is large enough in this case that it exceeds the macOS limit on the size of commands, which is 262,144 bytes (256 KB)
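As a rough illustration of the arithmetic behind point 2, the sketch below adds up the argument bytes contributed by the "-u _<function name>" pairs alone. The helper names (u_flag_bytes, exceeds_limit) are hypothetical, and the count is an assumption-laden lower bound: it covers only the argument strings and their terminating NULs, not the rest of the clang command or the environment, which also count against the limit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bytes of argument strings added by "-u _<name>" for every symbol.
   Each symbol contributes two arguments, "-u" and "_<name>", and every
   argument carries a terminating NUL. (Hypothetical helper.) */
size_t u_flag_bytes(const char *const *names, size_t count)
{
    size_t total = 0;

    assert(names != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        total += strlen("-u") + 1;          /* "-u" plus NUL */
        total += 1 + strlen(names[i]) + 1;  /* "_", the name, plus NUL */
    }
    return total;
}

/* Returns 1 when the -u arguments alone are larger than `limit` bytes. */
int exceeds_limit(const char *const *names, size_t count, size_t limit)
{
    return u_flag_bytes(names, count) > limit;
}
```

Under this accounting a 40-character function name costs 45 bytes (3 for "-u" plus 42 for "_", the name, and the NUL), so roughly 5,800 such imports would fill 262,144 bytes by themselves.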

I believe the behavior is not present when linker behavior is set to something other than "Don't link" because the Xamarin linking process analyzes the managed code paths within the app, and trims down the -u arguments to only include those DllImports which are contained within a code path.

These -u flags are ultimately passed down to the native linker process (ld), but, as best I can tell, they may not actually be required.  The man page for ld states the following about the -u flag:

> Specified [sic] that symbol <symbol_name> must be defined for the
> link to succeed. This is useful to force selected functions
> to be loaded from a static library

Some testing locally in Terminal also suggested that the -u flags can be omitted.  It appears to me their inclusion in the build acts as a mechanism to validate the existence of an appropriate native entry point for each managed DllImport.  But if they are, in fact, not required, then it should at least be possible for end users to specify that they be left out.  If such an option exists, I was not able to find it.

As it stands, this would be a blocking issue for us in our upcoming release, as the list of DllImports in internal builds of our API assembly now exceeds this limit.  While it is possible to work around this by setting linker behavior to something other than "Don't link," we don't believe it's reasonable to require this of our users.

An example of a clang command that runs into this issue is attached. If a recent build of our API is desired to help investigate the problem, please follow up with me by email.
Comment 1 Rolf Bjarne Kvinge [MSFT] 2017-04-04 08:29:45 UTC
(In reply to Rich Zwaap from comment #0)
> Created attachment 21162 [details]
> An example of a clang command that exceeds the argument size limit
> 
> Linking a native framework fails under the following conditions:
> 
> 1.) The frameworks contain a very large number of exported functions
> 2.) These functions are made accessible to managed code via p/invoke (i.e.
> managed functions exist that are decorated with the DllImport attribute and
> whose signatures match the native exports)
> 3.) The linker behavior is set to "Don't link"
> 
> When these conditions are met, the native linking step of the build fails
> with the message "Native linking failed. Please review the build log." 
> However, the build logs do not contain any additional information.
> 
> When investigating further, I tried executing the clang command that was
> used by the Xamarin build process for linking the native framework
> independently of the build.  That immediately errored out with a message of
> "Argument list too long."  What appears to be happening is:
> 
> 1.) When linker behavior is "Don't link," the Xamarin build process includes
> an argument of "-u _<function name>" for every DllImport
> 2.) The number of DllImports is large enough in this case that it exceeds
> the macOS limit on the size of commands, which is 262,144 bytes (256 KB)
> 
> I believe the behavior is not present when linker behavior is set to
> something other than "Don't link" because the Xamarin linking process
> analyzes the managed code paths within the app, and trims down the -u
> arguments to only include those DllImports which are contained within a code
> path.
> 
> These -u flags are ultimately passed down to the native linker process (ld),
> but, as best I can tell, they may not actually be required.  The man page
> for ld states the following about the -u flag:
> 
> > Specified [sic] that symbol <symbol_name> must be defined for the
> > link to succeed. This is useful to force selected functions
> > to be loaded from a static library
> 
> Some testing locally in Terminal also suggested that the -u flags can be
> omitted.  It appears to me their inclusion in the build acts as a mechanism
> to validate the existence of an appropriate native entry point for each
> managed DllImport.  But if they are, in fact, not required, then it should
> at least be possible for end users to specify that they be left out.  If
> such an option exists, I was not able to find it.

The -u flag we pass to ld is there to ensure that ld doesn't remove the native function. ld analyzes static libraries at link time and can remove unused code, but P/Invoke uses dlopen/dlsym to resolve methods, which ld doesn't see, so ld might otherwise end up removing P/Invoked functions.

A potential fix would be to generate a .c file that references all these symbols, and then compile and link that file as well. Incidentally we already have another case where we need to find an alternative for -u (bug #51710), so this fix would work for both bugs.
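To make the idea concrete, here is a minimal sketch of such a generator, written in C. It is an illustration only, not the code Xamarin.iOS uses: the function name write_reference_file and the `keeper` parameter are hypothetical. It declares each symbol as an untyped extern and takes its address through a volatile local, so the reference survives optimization regardless of the real function signature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Writes a C source file to `out` that references every symbol in `names`.
   `keeper` is the name given to the generated function. (Hypothetical
   helper; returns 0 on success, -1 on a write error.) */
int write_reference_file(FILE *out, const char *keeper,
                         const char *const *names, size_t count)
{
    assert(out != NULL && keeper != NULL);
    assert(names != NULL || count == 0);

    /* Only the symbol name matters, so declare everything as void *. */
    for (size_t i = 0; i < count; i++)
        fprintf(out, "extern void *%s;\n", names[i]);

    fprintf(out, "void %s (void)\n{\n", keeper);
    for (size_t i = 0; i < count; i++) {
        /* A volatile local keeps the compiler from dropping the reference. */
        fprintf(out, "\tvoid *volatile ref_%zu = &%s;\n", i, names[i]);
        fprintf(out, "\t(void) ref_%zu;\n", i);
    }
    fprintf(out, "}\n");

    return ferror(out) ? -1 : 0;
}
```

The generated file grows with the number of symbols, but it is passed to the compiler as a single path, so the command line stays short.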

> As it stands, this would be a blocking issue for us in our upcoming release, as
> the list of DllImports in internal builds of our API assembly now exceeds
> this limit.  While it is possible to work around this by setting linker
> behavior to something other than "Don't link," we don't believe it's
> reasonable to require this of our users.

I agree, we'll fix this (although at this point I don't think we'll be able to ship a fix until this summer).

> An example of a clang command that runs into this issue is attached. If a
> recent build of our API is desired to help investigate the problem, please
> follow up with me by email.

I think I'll be able to reproduce this myself (I will if your analysis, which is great btw, is correct); otherwise I'll ask for a test case.
Comment 2 Sebastien Pouliot 2017-04-04 13:40:52 UTC
@Rich, as a potential short-term workaround you can add

> [assembly:LinkerSafe]

into your assembly. That will apply the managed linker on your assembly, without requiring any change on the customer projects.
Comment 3 Rich Zwaap 2017-04-04 15:37:53 UTC
Thanks for the quick follow-up.  Regarding this:

> as a potential, short term workaround you can add
> [assembly:LinkerSafe]

We actually already decorate our assembly with the LinkerSafe attribute.  My understanding was that this enables the managed linker to run against our assembly when the linker behavior is set to "Link SDK assemblies only," but that it still does not run with behavior set to "Don't link."  This seems consistent with what we're seeing as well - we don't encounter this issue with "Link SDK assemblies only" or "Link all assemblies," but we do with "Don't link."
Comment 4 Rich Zwaap 2017-04-05 13:08:03 UTC
@Rolf I'm not sure exactly how this suggestion would work:

> A potential fix would be to generate a .c file that references all these symbols,
> and then compile and link that file as well.

Does referencing the symbols necessarily mean declaring invocations of the functions to which the symbols refer?  If so, that might be a rather difficult thing to assemble, as we've thousands of functions with widely varying argument types.

Assuming that's doable, how exactly would this .c file need to be compiled and linked?  Would it be included in the same native library as the symbols themselves, or would it need to be separate?  If separate, then I assume that would be linked with the Xamarin iOS app as we're already doing with our existing native framework.

Thinking through this a little, I assume the idea here is that creating references in native code will provide an alternative mechanism for informing the native linker to preserve these symbols.  That makes sense.  But let's suppose that we put together one massive native function that references all these symbols.  How would the native linker know not to strip away *that* function?  And, assuming the native linker does preserve this master symbol-referencing function, how could that information get back to the managed build so that it omits the troublesome -u flags?

We're certainly open to ideas on this end as to how we might get around it within our product (i.e. without requiring anything of end users).  It sounds like we'll be shipping our next release before the underlying issue can be fixed, and needless to say, we would much rather not go out with a known issue that apps referencing our API will not build with linker behavior set to "don't link."
Comment 5 Rolf Bjarne Kvinge [MSFT] 2017-04-05 13:41:23 UTC
(In reply to Rich Zwaap from comment #4)
> @Rolf I'm not sure exactly how this suggestion would work:
> 
> > A potential fix would be to generate a .c file that references all these symbols,
> > and then compile and link that file as well.
> 
> Does referencing the symbols necessarily mean declaring invocations of the
> functions to which the symbols refer?  If so, that might be a rather
> difficult thing to assemble, as we've thousands of functions with widely
> varying argument types.

No, you just have to use the symbol, the function signature doesn't matter:

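extern void *pinvoke_method;
void dummy (void)
{
	/* Using the symbol is enough to reference it; its real signature is irrelevant. */
	void *used_pinvoke_method = pinvoke_method;
	(void) used_pinvoke_method;
}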

It gets a little bit more complicated when the symbols are Objective-C classes, but not much more.

> Assuming that's doable, how exactly would this .c file need to be compiled
> and linked?  Would it be included in the same native library as the symbols
> themselves, or would it need to be separate?  If separate, then I assume
> that would be linked with the Xamarin iOS app as we're already doing with
> our existing native framework.

Xamarin.iOS would automatically generate this file, and then link it into the final executable.
 
> Thinking through this a little, I assume the idea here is that creating
> references in native code will provide an alternative mechanism for
> informing the native linker to preserve these symbols.  That makes sense. 
> But let's suppose that we put together one massive native function that
> references all these symbols.  How would the native linker know not to strip
> away *that* function?  

Now you can pass "-u that_big_function" to ld :)
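A small sketch of that arrangement follows, assuming two stand-in exports (my_func_a, my_func_b) and a hypothetical keeper function named keep_pinvokes. The exports are defined in the same file only to keep the example self-contained; in a real build they would live in the native framework. The link command in the comment is likewise an assumption about how a single flag could be passed through clang.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for native exports that managed code reaches only via dlsym. */
int my_func_a(void) { return 1; }
int my_func_b(void) { return 2; }

/* One function that takes the address of every export. Keeping this single
   symbol alive keeps all of them alive, for example with something like:
       clang ... -Wl,-u,_keep_pinvokes
   (assumed invocation; C symbols carry a leading underscore on Apple
   platforms). */
size_t keep_pinvokes(void)
{
    /* Volatile pointers so the references are not optimized away. */
    void (*volatile refs[])(void) = {
        (void (*)(void)) my_func_a,
        (void (*)(void)) my_func_b,
    };
    size_t n = sizeof refs / sizeof refs[0];

    for (size_t i = 0; i < n; i++)
        assert(refs[i] != NULL);
    return n;
}
```

The command line then carries one -u argument no matter how many functions are referenced.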

> And, assuming the native linker does preserve this
> master symbol-referencing function, how could that information get back to
> the managed build so that it omits the troublesome -u flags?

This is not a problem, Xamarin.iOS would omit the troublesome -u flags at the same time as it's generating the .c file.

> We're certainly open to ideas on this end as to how we might get around it
> within our product (i.e. without requiring anything of end users).  It
> sounds like we'll be shipping our next release before the underlying issue
> can be fixed, and needless to say, we would much rather not go out with a
> known issue that apps referencing our API will not build with linker
> behavior set to "don't link."

If you're OK with ugly and hacky solutions, you could rename your P/Invoke methods to be shorter, so that the command-line length (hopefully) doesn't go above the max limit.

Depending on your code, it might even be possible to automate this with few source changes:

1. First rewrite all of your P/Invokes to set EntryPoint to a constant:

    const string my_func_name = "my_func";

    [DllImport ("__Internal", EntryPoint=my_func_name)]
    static extern void my_func ();

2. Collect a list of all the P/Invokes:

    grep 'static extern' *.cs | <some awk/sed logic to remove everything but the name>

3. Use the list of P/Invokes to generate the constants:

    var list = new string [] { <list of functions> };
    for (int i = 0; i < list.Length; i++)
        Console.WriteLine ("const string {0}_name = \"f{1}\";", list [i], i);

4. Use the same list to generate a header file that you include in your native code, which renames all the functions:

    var list = new string [] { <list of functions> };
    for (int i = 0; i < list.Length; i++)
        Console.WriteLine ("#define {0} f{1}", list [i], i);

5. Include the header file in all your native source files, and the generated constants in your managed project.
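Steps 3 and 4 can also be driven from one list in a single pass, which guarantees that the managed constants and the native #defines agree on each short name. The sketch below does this in C; write_rename_tables is a hypothetical helper, and the output formats simply mirror the two Console.WriteLine calls above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* For each function name, writes a C# constant to `cs` and a matching
   #define to `hdr`, both mapping the name to the short symbol f<index>.
   (Hypothetical helper; returns 0 on success, -1 on a write error.) */
int write_rename_tables(FILE *cs, FILE *hdr,
                        const char *const *names, size_t count)
{
    assert(cs != NULL && hdr != NULL);
    assert(names != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        /* Managed side: const string my_func_name = "f0"; */
        fprintf(cs, "const string %s_name = \"f%zu\";\n", names[i], i);
        /* Native side: #define my_func f0 */
        fprintf(hdr, "#define %s f%zu\n", names[i], i);
    }
    return (ferror(cs) || ferror(hdr)) ? -1 : 0;
}
```

Because the index is assigned by list position, the list order has to stay stable between the native and managed builds, or both outputs have to be regenerated together.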
Comment 6 Rich Zwaap 2017-04-05 14:05:30 UTC
Sorry, I'm a bit lost on this point:

> > Assuming that's doable, how exactly would this .c file need to be compiled
> > and linked?  Would it be included in the same native library as the symbols
> > themselves, or would it need to be separate?  If separate, then I assume
> > that would be linked with the Xamarin iOS app as we're already doing with
> > our existing native framework.


> Xamarin.iOS would automatically generate this file, and then link it into the
> final executable.

Xamarin.iOS would automatically generate the .c file?  I thought we would be creating the .c file with our master symbol-referencing function, and including that in a native library that we build.  If not, where should we be defining that symbol-referencing function?

Thanks very much for the assistance here.
Comment 7 Rolf Bjarne Kvinge [MSFT] 2017-04-05 14:09:55 UTC
(In reply to Rich Zwaap from comment #6)
> Sorry, I'm a bit lost on this point:
> 
> > > Assuming that's doable, how exactly would this .c file need to be compiled
> > > and linked?  Would it be included in the same native library as the symbols
> > > themselves, or would it need to be separate?  If separate, then I assume
> > > that would be linked with the Xamarin iOS app as we're already doing with
> > > our existing native framework.
> 
> 
> > Xamarin.iOS would automatically generate this file, and then link it into the
> > final executable.
> 
> Xamarin.iOS would automatically generate the .c file?  I thought we would be
> creating the .c file with our master symbol-referencing function, and
> including that in a native library that we build.  If not, where should we
> be defining that symbol-referencing function?

I'm sorry, I realize what I said earlier was confusing:

> A potential fix would be to generate a .c file that references all these
> symbols, and then compile and link that file as well. Incidentally we
> already have another case where we need to find an alternative for -u (bug
> #51710), so this fix would work for both bugs.

What I meant here is that this is the fix *we* need to implement, in Xamarin.iOS, not you, in your app.

With the current version of Xamarin.iOS, there is no way to skip passing the -u flags to ld, which means creating a master function on your end wouldn't accomplish anything.
Comment 8 Rich Zwaap 2017-04-05 14:22:54 UTC
Ok, that makes much more sense now!  From what little I know, that does seem like a workable solution on your end.

Regarding the entry point renaming, that certainly could work.  We're already auto-generating the native exports and managed imports, so we could conceivably make a change at that level as a temporary solution.  The difficulty on our end is that our p/invokes are shared across all the platforms we support, and I don't think we'd want to apply that change globally.  So that would imply a different set of native/managed mappings for iOS as compared to our other platforms.  We already do a bit of that due to the requirement that reverse callbacks be static, but shortening all our symbols would obviously mean touching every single p/invoke.

Still, it's an option.  If anything else comes to mind, I'm all ears (or is that eyes?).
Comment 9 Rich Zwaap 2017-04-14 13:03:36 UTC
I just wanted to report back that getting around this by shortening the name of the native entry points is indeed working for us.  We may well get beyond this limit again as our API surface grows, but the workaround should be sufficient for our upcoming release.  Thanks again for the assistance.
Comment 10 Rolf Bjarne Kvinge [MSFT] 2017-04-17 09:35:45 UTC
Thanks for letting us know that it worked.
Comment 11 Rolf Bjarne Kvinge [MSFT] 2017-06-02 07:28:45 UTC
PR with fix: https://github.com/xamarin/xamarin-macios/pull/2162