Bug 42618 - Accumulation of sshd instances on OSX build host resulting in eventual build failures.
Status: VERIFIED FIXED
Alias: None
Product: Visual Studio Extensions
Classification: Xamarin
Component: XMA
Version: unspecified
Hardware: PC Windows
Importance: High normal
Target Milestone: 15.4
Assignee: Bugzilla
URL:
Depends on:
Blocks:
 
Reported: 2016-07-17 18:05 UTC by Brian Berry
Modified: 2017-08-25 21:09 UTC

Tags:
Is this bug a regression?: ---
Last known good build:


Attachments
Console Log from Mac Build Server (744.24 KB, image/png)
2016-08-11 20:29 UTC, Ace Olszowka



Description Brian Berry 2016-07-17 18:05:48 UTC
Greetings!

I am running into eventual remote build host failures caused by apparent sshd saturation. There are a few variables in the mix here, so bear with me.

Visual Studio 2015 Community Edition
Xamarin 4.2.0.21 (a532c82)
Xamarin.iOS (3afb4af)

Steps to reproduce:
  * Create a new iOS library project.
  * On your Mac build host, establish a shell session and watch "top" or your favorite process view.
  * Repeatedly build the iOS library target on the PC.
  * Be mindful to ignore any builds that did not run remotely, e.g.:
       warning VSX1000: The project XXXX was built while disconnected from a Mac agent, so only the
       main assembly was compiled. Connect to a Xamarin Mac build agent to build the full application.

Observed:
  * An accumulation of sshd instances on the remote host when the builds do run remotely.

Expected:
  * Cleanup after each build to the original base state (note that a few sessions exist even before
    any builds, due to Xamarin's initial build host connection at IDE/project load).
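
The accumulation can be quantified by sampling the sshd count on the Mac build host before and after each remote build. The following is a minimal sketch in Python, not part of the original report. The helper names `count_sshd` and `current_sshd_count` are hypothetical, and it assumes `ps` prints one process per line as a PID followed by the executable path.

```python
import os
import subprocess


def count_sshd(ps_output):
    """Count processes whose executable name is sshd.

    Expects one process per line in the form "<pid> <command path>".
    """
    count = 0
    for line in ps_output.splitlines():
        parts = line.split(None, 1)  # pid, then the rest of the line
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        if os.path.basename(parts[1].strip()) == "sshd":
            count += 1
    return count


def current_sshd_count():
    """Sample the live process table on the Mac build host."""
    result = subprocess.run(
        # Separate -o options with empty header text suppress the header row.
        ["ps", "-ax", "-o", "pid=", "-o", "comm="],
        capture_output=True, text=True, check=True,
    )
    return count_sshd(result.stdout)
```

Calling `current_sshd_count()` once at the base state and again after each remote build gives a number to compare. A count that grows with every build and never returns to the base value matches the behavior described under Observed.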

Notes:
  * There appear to be retained msbuild.exe instances.  Killing said instances results in the expected cleanup.
  * I *do* have MSBUILDDISABLENODEREUSE=1 set globally to prevent msbuild.exe node reuse by VS itself.
  * When building non-Xamarin.iOS targets, I see msbuild.exe processes terminate as expected.
  * Suspicion here that there might be cleanup issues in Xamarin.iOS-related tasks that prevent node
    termination, perhaps related to cleanup of Renci.SshNet elements themselves.

I created a simple testbed that shows proper cleanup of remote Renci.SshNet connections provided that the
SshClient instances (and the commands created with them) are properly disposed (e.g. with "using" semantics).
The remote connections remain if they are not disposed, so that is something to look at.
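
The difference between disposed and undisposed clients can be illustrated without any SSH library at all. The sketch below is in Python and uses a hypothetical stand-in class, `FakeSshClient`. It is not the Renci.SshNet API and not the testbed mentioned above. The class-level counter models the sshd instances alive on the remote host, and the list passed to `build_without_dispose` models a long-lived host process (such as a reused MSBuild node) holding on to its clients.

```python
class FakeSshClient:
    """Hypothetical stand-in for an SSH client; not a real library API."""

    open_sessions = 0  # models sshd instances alive on the remote host

    def __init__(self):
        self._connected = False

    def connect(self):
        if not self._connected:
            self._connected = True
            FakeSshClient.open_sessions += 1

    def close(self):
        if self._connected:
            self._connected = False
            FakeSshClient.open_sessions -= 1

    def __enter__(self):
        self.connect()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()  # runs even if the build step raised
        return False  # do not swallow exceptions


def build_with_dispose():
    # Equivalent in spirit to a C# "using" block: close is guaranteed.
    with FakeSshClient():
        pass  # the remote build work would happen here


def build_without_dispose(retained):
    client = FakeSshClient()
    client.connect()
    retained.append(client)  # the host process keeps the client alive
```

Running `build_with_dispose()` any number of times leaves `FakeSshClient.open_sessions` at zero, while each call to `build_without_dispose(retained)` raises it by one. That is the same shape as the growth seen on the build host.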

(The above test was a simple console app---I may also create a simple test in MSBuild task form to see
if for some reason there is something unique/misbehaved about Renci.SshNet use in that environment.)

Local workarounds are periodic kills of resident MSBuild.exe processes, restarts of VS, etc.
Of course, this is not an ideal remedy.

Cheers.
Comment 1 Brian Berry 2016-07-17 18:20:40 UTC
Additional note: a simple test that uses Renci.SshNet from an MSBuild task servicing a non-Xamarin.iOS target shows proper session cleanup and msbuild.exe termination when the session/command pair is properly disposed.
Comment 2 Daniel Cazzulino 2016-08-11 12:40:51 UTC
Thanks for reporting, and for the very helpful and detailed repro steps and workarounds you have in place.

We'll be looking into this right away, and it should make it into our next major release (cycle8).

We'll keep you posted. Thanks again!
Comment 3 Ace Olszowka 2016-08-11 20:29:14 UTC
Created attachment 16995
Console Log from Mac Build Server

We believe we're seeing similar behavior; is there anything we can do to provide more logging or capture something to help resolve the issue? Groveling through the Console log I found this. I am unsure what exit code 255 out of opensshd means, but it is probably not good.
Comment 4 mag@xamarin.com 2016-08-18 17:08:10 UTC
The sshd instances are a consequence of orphaned MSBuild instances on the Windows side. Each one maintains an active SSH connection, which keeps the corresponding sshd processes running.

We have introduced some fixes to avoid these orphaned MSBuild instances, which may directly affect this bug. For more details on the MSBuild issue, see this related bug: https://bugzilla.xamarin.com/show_bug.cgi?id=42717

The fixes should be available in our cycle8 release, starting with version 4.2.0.541.

Please let us know if that solves your issue.
Comment 7 Brian Berry 2016-09-19 00:49:17 UTC
Still repros here under Xamarin 4.2.0.680 (c4382f5).

Additionally, I have a question about the comments above (mag@xamarin.com's last): While there may be issues with MSBuild.exe node reuse (with tip VS2015, my MSBUILDDISABLENODEREUSE setting does not appear to be honored), MSBuild.exe itself is not the connection initiator or maintainer here. It is the Xamarin.iOS-related task code that maintains the connection, because the hosting MSBuild process sticks around.

I would think that any task, given the knowledge that hosting nodes may be reused, should take extra care to ensure all resources are released (in this case, any established connections).

If you have any additional info on why the connections are not released (Renci.SshNet bug?  Other?), that might help steer me toward helping to identify a more durable fix.

Cheers.
Comment 8 Brian Berry 2016-09-19 00:50:58 UTC
(Additional note: I had upgraded the OSX build host to macOS Sierra RTM in the meantime, in case that is somehow related to any new misbehavior.)
Comment 10 mag@xamarin.com 2017-06-27 15:10:41 UTC
We have identified an issue within our build connection layer, which was causing CPU overhead and also some problems in the dispose-and-close chain for resources.

That fix is tracked in the following bug, and it should also resolve what remains of this one: https://bugzilla.xamarin.com/show_bug.cgi?id=57339

This is the last commit of the series of fixes we did for this case and the bug above: b586e5bc2a59e01f5e14ff32a6e6995238bae433

After applying these changes, we no longer observed any live sshd instances after closing VS.

One important point: the MSBuild instance and the corresponding sshd instance on the Mac will stay alive from the first time you build until you close VS, because we keep that instance and its connection alive for the life of the parent process, which is VS in this case. Also, only one connection is maintained, not one per build.

So, the right way to test this fix is to close all the VS instances that are building against a Mac and then confirm that no orphaned MSBuild or sshd instances are still running.
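
The Windows half of that check can be scripted. The sketch below is a hedged illustration in Python, not an official verification tool. It assumes the output comes from the standard Windows command `tasklist /FI "IMAGENAME eq MSBuild.exe" /FO CSV /NH`, and the helper name `msbuild_pids` is hypothetical. The sshd half would be checked separately on the Mac.

```python
import csv
import io


def msbuild_pids(tasklist_csv):
    """Return the PIDs of MSBuild.exe rows in headerless tasklist CSV output.

    Each row is expected to look like:
    "MSBuild.exe","4312","Console","1","61,204 K"
    Informational lines (for example, when nothing matches) are ignored.
    """
    pids = []
    for row in csv.reader(io.StringIO(tasklist_csv)):
        if len(row) >= 2 and row[0].lower() == "msbuild.exe" and row[1].isdigit():
            pids.append(int(row[1]))
    return pids
```

After every VS instance has been closed, an empty list from `msbuild_pids` would indicate that no orphaned MSBuild processes remain. Any PIDs returned are candidates for the orphaned instances that were keeping SSH connections open.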

The fix should be available in the next 15.3 refresh.