I'm having an issue with a "test" stage that includes multiple Unit tests. In this case a set of tests for Net40\x86 and Net40\x64. I've un-checked the checkbox option on the Nunit screen "Fail action if any tests fail". I've also set the stage gate criteria to "$Stage.Metrics.UnitTests.Passed$ >= 78". This is because it's a legacy edition that's receiving a code patch. The x86 unit tests run and the results are
Passed = 83; Failed = 6; Error = 1.
The next Action (nunit) in the stage doesn't run. Is this because there is an "Error" in the unit test results or is it a bug in the Action logic?
If and when I get to running the 2 unit tests in the same stage, will the $Stage.Metrics.UnitTests.Passed$ variable add the two sets of results together?
I've setup Artifacts to collect the Unit Test results XML files with values which validate, and the files are getting generated, but not picked up in the artifact files tab for the build.
"The agent 'jdbne01' which was executing stage 'Test' has gone offline" .
On the agent machine I have series of Windows Event Logs with the following.
"The description for Event ID 0 from source VSTTExecution cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.If the event originated on another computer, the display information had to be saved with the event. The following information was included with the event: (devenv.exe, PID 13608, Thread 1) UIContextHelper.SetUIContext: The call to IVsAggregatableProject.GetCmdUIContextCookie failed. Error code: -2147467259 the message resource is present but the message is not found in the string/message table"
Coincidentally the Continua Visual Studio build action is occurring the "Test" stage prior to the Nunit actions running.
"The agent 'jdbne01' which was executing stage 'Test' has gone offline" .
I think this is definitely two separate issues. At first I thought, about the "offline" problem was it might be a PEBKAC where I was trying to build a .net 2.0 solution with .net 2.0/3.5/4.0 switches causing something to break. However it seems to randomly occur across all these different options and at different times. I had some +10 minute unit tests that might have been colliding with Repo updates (we try and do lots of very small code changes, to make branch merging easier). I think I'll need to put continua into some sort of detailed logging mode before I can get any useful feedback as to the cause of an agent going offline. How do we do that?
The eventlog error suggests to me that a VS addin has crashed. It’s very unusual for an agent to go offline during a build, that would indicate it has crashed (because the agent has a background pulse thread that talks to the server, the stage is run in a separate thread). We definitely want to get to the bottom of this.
The eventlog error suggests to me that a VS addin has crashed.
Log
Here is an example of a agent just stopping on stage 2 or 4. In this case the stage only has a single step that zipped some of the previous stages output files. It's about as simple as I think a stage could be.
Medium: 11:48:06 PM [Debug] Export >> Released read lock for path C:\ContinuaAgent\Repos\2fef14a9 on thread 20 Medium: 11:48:06 PM [Debug] Export >> End - files count=156 Medium: 11:48:06 PM [Debug] CopySourceToWorkspace >> files exported=156 Medium: 11:48:06 PM [Debug] CopySourceToWorkspace >> End Medium: 11:48:06 PM [Debug] SyncSourceFromServer >> End Medium: 11:48:06 PM [Debug] InitialiseWorkspaceOnAgent >> SyncWorkspaceFromServer - source=Continua.Modules.Builds.Common.Services.DataTransferObjects.TransportContextDTO Medium: 11:48:06 PM [Debug] SyncWorkspaceFromServer >> Start Medium: 11:48:06 PM [Debug] SyncWorkspaceFromServer >> CopyFilesFromServerToAgent - workspacePath=C:\ContinuaAgent\Ws\623 Medium: 11:48:07 PM [Debug] SyncWorkspaceFromServer >> End Medium: 11:48:07 PM [Debug] InitialiseWorkspaceOnAgent >> End Medium: 11:48:07 PM [Debug] Creating Agent Callback service Medium: 11:48:07 PM [Debug] Creating Agent Callback service, Create IP Client. Hostname: 'JDSRV03', Port: '9000'. Medium: 11:48:07 PM [Debug] Running action '7-Zip Create [$Workspace$\Output\JdData-$Build.Version$.zip] (SevenZipCreateAction)'. Medium: 11:48:07 PM [Debug] Creating Agent Callback service Medium: 11:48:07 PM [Debug] Creating Agent Callback service, Create IP Client. Hostname: 'JDSRV03', Port: '9000'. Medium: 11:48:09 PM [Debug] Creating Agent Callback service Medium: 11:48:09 PM [Debug] Creating Agent Callback service, Create IP Client. Hostname: 'JDSRV03', Port: '9000'. Medium: 11:48:09 PM [Debug] Calling back, stage done. Medium: 11:48:09 PM [Background Monitor] Stopping background process Agent Callback Proxy Medium: 11:48:09 PM [Debug] OnExecuteComplete Medium: 11:48:09 PM [Debug] 0 Runners active Medium: 11:48:09 PM [Debug] 0 Runners active Medium: 11:48:12 PM [Registration] Agent has been successfully registered with the Continua server, agent Id is 'b195eeb2-7056-42a4-a3c1-a16200bdcb35'. Medium: 11:48:42 PM [Debug] Registering Agent with Server Medium: 11:48:42 PM [Debug] Create ServiceClient with hostname and port in Register method: 'JDSRV03', Port: '9000'. Medium: 11:48:42 PM [Debug] Calling RegisterAgent Method... Medium: 11:48:43 PM [Debug] SSH Server connect: JDSRV03:9010. Medium: 11:49:04 PM [Registration] Agent has been successfully registered with the Continua server, agent Id is 'b195eeb2-7056-42a4-a3c1-a16200bdcb35'. Medium: 11:49:34 PM [Debug] Registering Agent with Server Medium: 11:49:34 PM [Debug] Create ServiceClient with hostname and port in Register method: 'JDSRV03', Port: '9000'. Medium: 11:49:34 PM [Debug] Calling RegisterAgent Method... Medium: 11:49:34 PM [Debug] SSH Server connect: JDSRV03:9010.
The log indicates normal behaviour on the agent. The agent callback process is stopped after the stage has completed. Did you get an "agent has gone offline" message in the timeline when running this stage? If so could you also post the server log file? Otherwise could you post the logs when running the stage with the multiple NUnit actions?
Posted By Jenasys Design Pty Ltd on 28 Jul 2013 08:34 AM Could it be as simple as using the "Devenv.com /SafeMode" command line option? http://msdn.microsoft.com/en-us/library/vstudio/xee0c8y7.aspx
I've added a SafeMode option for the next build (along with VS2013 support).
Neither of the logs showed anything unusual, did the build error out when that log was captured(doesn’t appear that way). We should have a new build out tomorrow, we’ll take a look at adding some extra logging.
I’ve been reviewing the build/logs and I think in the example logs sent for Nunit, it was an MS Build Error "error BC30009: Reference required to assembly " issue.
Debug is the most verbose level. We’ve been looking at the logs you sent and they look normal, it looks to us like the build completed. We’re in the process of adding a lot more logging, hope to have a new build out later todat with the extra logging.
The following includes the start of the Nunit action, in the agent log, but doesn't seem to go into any details around the logic. On the server side log is significantly larger and the end of the nunit action includes. I conclude that your correct, the "Action Failure" is overriding the stage gate settings
Medium: 4:19:00 PM [Debug] [BuildRunner.OnGetNextStage] : AutomaticallyPromote : True Medium: 4:19:00 PM [Debug] [BuildRunner.OnGetNextStage] : Running next stage. Medium: 4:19:00 PM [Debug] [BuildRunner.UpdateBuildStatus] : Allocating agent Medium: 4:19:00 PM [Debug] Publishing message of type 'BuildStateChangedMessage'. Medium: 4:19:00 PM [Debug] [BuildRunner.OnGetNextStage] : 691 Done Medium: 4:19:00 PM [Debug] Sending message of type 'BuildStateChangedMessage' to consumer of type 'NotificationController'. Medium: 4:19:00 PM [Debug] [BuildRunner.OnReservingAgent] : 691 Medium: 4:19:00 PM [Debug] [BuildRunner.OnReservingAgent] : Getting agent controller Medium: 4:19:00 PM [Debug] [BuildRunner.OnReservingAgent] : while !agentAndLicenseAllocated Medium: 4:19:00 PM [Debug] [BuildRunner.OnReservingAgent] : agent not yet allocated Medium: 4:19:00 PM [Debug] [BuildRunner.OnReservingAgent] : attempting to reserve agent Medium: 4:19:00 PM [Debug] [ServerAgentController.ReserveAgent] Medium: 4:19:00 PM [Debug] [ServerAgentController.ReserveAgent] done Medium: 4:19:00 PM [Debug] [BuildRunner.OnReservingAgent] : Assigning agent 'jdbne01 to stage 'Tests NET40x64' Medium: 4:19:00 PM [Debug] [BuildRunner.UpdateBuildStatus] : Agent 'jdbne01' reserved and license allocated for stage 'Tests NET40x64' Medium: 4:19:00 PM [Debug] Publishing message of type 'BuildStateChangedMessage'. Medium: 4:19:00 PM [Debug] [BuildRunner.UpdateStageStatus] : Stage Tests NET40x64 - StageReady Medium: 4:19:00 PM [Debug] Sending message of type 'BuildStateChangedMessage' to consumer of type 'NotificationController'. Medium: 4:19:00 PM [Debug] [BuildRunner.OnReservingAgent] : saving changes Medium: 4:19:00 PM [Debug] [BuildRunner.OnReservingAgent] : 691 Done Medium: 4:19:00 PM [Debug] [BuildRunner.OnExecutingStage] : 691 Medium: 4:19:00 PM [Debug] [BuildRunner.UpdateBuildStatus] : Initialising Stage... Medium: 4:19:00 PM [Debug] Publishing message of type 'BuildStateChangedMessage'. Medium: 4:19:00 PM [Debug] [BuildRunner.UpdateStageStatus] : Stage Tests NET40x64 - StageExecuting Medium: 4:19:00 PM [Debug] Sending message of type 'BuildStateChangedMessage' to consumer of type 'NotificationController'. Medium: 4:19:00 PM [Debug] [BuildRunner.OnExecutingStage] : geting previous duration Medium: 4:19:00 PM [Debug] [BuildRunner.OnExecutingStage] : opening log file Medium: 4:19:00 PM [Debug] Using a pooled session to log database 'C:\Continua\Ws\JD-Data\691\log'. Medium: 4:19:00 PM [Debug] [BuildRunner.OnExecutingStage] : log.OnStageBegin Medium: 4:19:00 PM [Debug] [BuildRunner.OnExecutingStage] : deserializing actions Medium: 4:19:00 PM [Debug] [BuildRunner.OnExecutingStage] getting ServerAgentController Medium: 4:19:00 PM [Debug] [BuildRunner.OnExecutingStage] agentController.Execute Medium: 4:19:00 PM [Debug] [ServerAgentController.Execute] Stage Tests NET40x64 Medium: 4:19:00 PM [Debug] [ServerAgentController.Execute] getting reserved agent for stage Medium: 4:19:00 PM [Debug] [ServerAgentController.Execute] agent seems to be alive, will check Medium: 4:19:00 PM [Debug] [ServerAgentController.Execute] refreshing agent Medium: 4:19:00 PM [Debug] [ServerAgentController.SetupAgentExecutionContext] stage Tests NET40x64
If this is the same project has you posted about before, the NUnit action is failing because of an test reporting an Error. I’m just about to post build 2370 which has a new option to Fail the action if tests Error, turn off that option and the build should continue.
I’ve been working on further complexities and evolution of the stages for this project to generate multiple combinations of .net framework and compiler plateforms, with matching unit tests. I can confirm, the unit test with an Error metric causes the problem for the progress of the Action/Stage. Thanks for adding new features for MS Build and Nunit, they have proven to be extremely helpful. Again, Vsoft demonstrating how software industry should be supporting software.
That may not be the next build… need to bootstrap the build server but it’s currently uploading a finalbuilder build. Will post on the forums later when a new build is available, apologies for the delay.