Build 2324 - Nunit Actions - Failed vs Error "Fail Action if any tests Fail" option

G'day,

I'm having an issue with a "test" stage that includes multiple Unit tests. In this case a set of tests for Net40\x86 and Net40\x64. I've un-checked the checkbox option on the Nunit screen "Fail action if any tests fail". I've also set the stage gate criteria to "$Stage.Metrics.UnitTests.Passed$ >= 78". This is because it's a legacy edition that's receiving a code patch. The x86 unit tests run and the results are

Passed = 83; Failed = 6; Error = 1.

The next Action (nunit) in the stage doesn't run. Is this because there is an "Error" in the unit test results or is it a bug in the Action logic?

If and when I get to running the 2 unit tests in the same stage, will the $Stage.Metrics.UnitTests.Passed$ variable add the two sets of results together?

I've setup Artifacts to collect the Unit Test results XML files with values which validate, and the files are getting generated, but not picked up in the artifact files tab for the build.

nunit | $Workspace$\Output\UnitTests.$Build.Version$\TestResults_Net40_x64.xml
nunit | $Workspace$\Output\UnitTests.$Build.Version$\TestResults_Net40_x86.xml

Example

Working Directory: C:\Continua_WS\Ws\569
Executable: C:\Program Files (x86)\NUnit 2.6.2\bin\nunit-console.exe
Arguments: C:\Continua_WS\Ws\569\Output\UnitTests.2.1.0.82\net40\x64\JDData.Nunit.dll /framework=net-4.0 /xml=C:\Continua_WS\Ws\569\Output\UnitTests.2.1.0.82\TestResults_Net40_x64.xml /nologo

The Continua timeline view includes a line

"The agent 'jdbne01' which was executing stage 'Test' has gone offline" .

On the agent machine I have series of Windows Event Logs with the following.

 
"The description for Event ID 0 from source VSTTExecution cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.If the event originated on another computer, the display information had to be saved with the event.
The following information was included with the event: (devenv.exe, PID 13608, Thread 1) UIContextHelper.SetUIContext: The call to IVsAggregatableProject.GetCmdUIContextCookie failed. Error code: -2147467259 the message resource is present but the message is not found in the string/message table"

Coincidentally the Continua Visual Studio build action is occurring the "Test" stage prior to the Nunit actions running.

The two issue might be separate as the VSTTExecution Window Event logs have only occurred since Build 2324.

Posted By Jenasys Design Pty Ltd on 25 Jul 2013 12:34 AM

The Continua timeline view includes a line

"The agent 'jdbne01' which was executing stage 'Test' has gone offline" .

 

I think this is definitely two separate issues. At first I thought, about the "offline" problem was it might be a PEBKAC where I was trying to build a .net 2.0 solution with .net 2.0/3.5/4.0 switches causing something to break. However it seems to randomly occur across all these different options and at different times. I had some +10 minute unit tests that might have been colliding with Repo updates (we try and do lots of very small code changes, to make branch merging easier).  I think I'll need to put continua into some sort of detailed logging mode before I can get any useful feedback as to the cause of an agent going offline. How do we do that?

Attachment unavailable

Hi Jamie

The eventlog error suggests to me that a VS addin has crashed. It’s very unusual for an agent to go offline during a build, that would indicate it has crashed (because the agent has a background pulse thread that talks to the server, the stage is run in a separate thread). We definitely want to get to the bottom of this.

This page in the wiki has some info on enabling debug logging : http://wiki.finalbuilder.com/display/continua/Troubleshooting

Could it be as simple as using the “Devenv.com /SafeMode” command line option? http://msdn.microsoft.com/en-us/library/vstudio/xee0c8y7.aspx

Posted By Vincent Parrett on 28 Jul 2013 08:22 AM
Hi Jamie

The eventlog error suggests to me that a VS addin has crashed.


Log

Here is an example of a agent just stopping on stage 2 or 4. In this case the stage only has a single step that zipped some of the previous stages output files. It's about as simple as I think a stage could be.

 

Medium: 11:48:06 PM [Debug] Export >> Released read lock for path C:\ContinuaAgent\Repos\2fef14a9 on thread 20
Medium: 11:48:06 PM [Debug] Export >> End - files count=156
Medium: 11:48:06 PM [Debug] CopySourceToWorkspace >> files exported=156
Medium: 11:48:06 PM [Debug] CopySourceToWorkspace >> End
Medium: 11:48:06 PM [Debug] SyncSourceFromServer >> End
Medium: 11:48:06 PM [Debug] InitialiseWorkspaceOnAgent >> SyncWorkspaceFromServer - source=Continua.Modules.Builds.Common.Services.DataTransferObjects.TransportContextDTO
Medium: 11:48:06 PM [Debug] SyncWorkspaceFromServer >> Start
Medium: 11:48:06 PM [Debug] SyncWorkspaceFromServer >> CopyFilesFromServerToAgent - workspacePath=C:\ContinuaAgent\Ws\623
Medium: 11:48:07 PM [Debug] SyncWorkspaceFromServer >> End
Medium: 11:48:07 PM [Debug] InitialiseWorkspaceOnAgent >> End
Medium: 11:48:07 PM [Debug] Creating Agent Callback service
Medium: 11:48:07 PM [Debug] Creating Agent Callback service, Create IP Client. Hostname: 'JDSRV03', Port: '9000'.
Medium: 11:48:07 PM [Debug] Running action '7-Zip Create [$Workspace$\Output\JdData-$Build.Version$.zip] (SevenZipCreateAction)'.
Medium: 11:48:07 PM [Debug] Creating Agent Callback service
Medium: 11:48:07 PM [Debug] Creating Agent Callback service, Create IP Client. Hostname: 'JDSRV03', Port: '9000'.
Medium: 11:48:09 PM [Debug] Creating Agent Callback service
Medium: 11:48:09 PM [Debug] Creating Agent Callback service, Create IP Client. Hostname: 'JDSRV03', Port: '9000'.
Medium: 11:48:09 PM [Debug] Calling back, stage done.
Medium: 11:48:09 PM [Background Monitor] Stopping background process Agent Callback Proxy
Medium: 11:48:09 PM [Debug] OnExecuteComplete

Medium: 11:48:09 PM [Debug] 0 Runners active
Medium: 11:48:09 PM [Debug] 0 Runners active

Medium: 11:48:12 PM [Registration] Agent has been successfully registered with the Continua server, agent Id is 'b195eeb2-7056-42a4-a3c1-a16200bdcb35'.
Medium: 11:48:42 PM [Debug] Registering Agent with Server
Medium: 11:48:42 PM [Debug] Create ServiceClient with hostname and port in Register method: 'JDSRV03', Port: '9000'.
Medium: 11:48:42 PM [Debug] Calling RegisterAgent Method...
Medium: 11:48:43 PM [Debug] SSH Server connect: JDSRV03:9010.
Medium: 11:49:04 PM [Registration] Agent has been successfully registered with the Continua server, agent Id is 'b195eeb2-7056-42a4-a3c1-a16200bdcb35'.
Medium: 11:49:34 PM [Debug] Registering Agent with Server
Medium: 11:49:34 PM [Debug] Create ServiceClient with hostname and port in Register method: 'JDSRV03', Port: '9000'.
Medium: 11:49:34 PM [Debug] Calling RegisterAgent Method...
Medium: 11:49:34 PM [Debug] SSH Server connect: JDSRV03:9010.

Hi Jamie,

The log indicates normal behaviour on the agent. The agent callback process is stopped after the stage has completed. Did you get an "agent has gone offline" message in the timeline when running this stage?  If so could you also post the server log file?  Otherwise could you post the logs when running the stage with the multiple NUnit actions?

Posted By Jenasys Design Pty Ltd on 28 Jul 2013 08:34 AM
Could it be as simple as using the "Devenv.com /SafeMode" command line option? http://msdn.microsoft.com/en-us/library/vstudio/xee0c8y7.aspx

I've added a SafeMode option for the next build (along with VS2013 support).

I’ve emailed various server/agent logs to Fogbugz.

Neither of the logs showed anything unusual, did the build error out when that log was captured(doesn’t appear that way). We should have a new build out tomorrow, we’ll take a look at adding some extra logging.

http://wiki.finalbuilder.com/display/continua/Troubleshooting - is there a more detailed option for your logger implementation that “Debug”, like “Verbose”?

I’ve been reviewing the build/logs and I think in the example logs sent for Nunit, it was an MS Build Error "error BC30009: Reference required to assembly " issue.

Debug is the most verbose level. We’ve been looking at the logs you sent and they look normal, it looks to us like the build completed. We’re in the process of adding a lot more logging, hope to have a new build out later todat with the extra logging.

Sweet, looking forward to it.

Hi Jamie,

The latest build 1.0.0.2352 includes additional debug logging.

The following includes the start of the Nunit action, in the agent log, but doesn't seem to go into any details around the logic. On the server side log is significantly larger and the end of the nunit action includes.  I conclude that your correct, the "Action Failure" is overriding the stage gate settings

Medium: 4:19:00 PM [Debug] [BuildRunner.OnGetNextStage] : AutomaticallyPromote : True
Medium: 4:19:00 PM [Debug] [BuildRunner.OnGetNextStage] : Running next stage.
Medium: 4:19:00 PM [Debug] [BuildRunner.UpdateBuildStatus] : Allocating agent
Medium: 4:19:00 PM [Debug] Publishing message of type 'BuildStateChangedMessage'.
Medium: 4:19:00 PM [Debug] [BuildRunner.OnGetNextStage] : 691 Done
Medium: 4:19:00 PM [Debug] Sending message of type 'BuildStateChangedMessage' to consumer of type 'NotificationController'.
Medium: 4:19:00 PM [Debug] [BuildRunner.OnReservingAgent] : 691
Medium: 4:19:00 PM [Debug] [BuildRunner.OnReservingAgent] : Getting agent controller
Medium: 4:19:00 PM [Debug] [BuildRunner.OnReservingAgent] : while !agentAndLicenseAllocated
Medium: 4:19:00 PM [Debug] [BuildRunner.OnReservingAgent] : agent not yet allocated
Medium: 4:19:00 PM [Debug] [BuildRunner.OnReservingAgent] : attempting to reserve agent
Medium: 4:19:00 PM [Debug] [ServerAgentController.ReserveAgent]
Medium: 4:19:00 PM [Debug] [ServerAgentController.ReserveAgent] done
Medium: 4:19:00 PM [Debug] [BuildRunner.OnReservingAgent] : Assigning agent 'jdbne01 to stage 'Tests NET40x64'
Medium: 4:19:00 PM [Debug] [BuildRunner.UpdateBuildStatus] : Agent 'jdbne01' reserved and license allocated for stage 'Tests NET40x64'
Medium: 4:19:00 PM [Debug] Publishing message of type 'BuildStateChangedMessage'.
Medium: 4:19:00 PM [Debug] [BuildRunner.UpdateStageStatus] : Stage Tests NET40x64 - StageReady
Medium: 4:19:00 PM [Debug] Sending message of type 'BuildStateChangedMessage' to consumer of type 'NotificationController'.
Medium: 4:19:00 PM [Debug] [BuildRunner.OnReservingAgent] : saving changes
Medium: 4:19:00 PM [Debug] [BuildRunner.OnReservingAgent] : 691 Done
Medium: 4:19:00 PM [Debug] [BuildRunner.OnExecutingStage] : 691
Medium: 4:19:00 PM [Debug] [BuildRunner.UpdateBuildStatus] : Initialising Stage...
Medium: 4:19:00 PM [Debug] Publishing message of type 'BuildStateChangedMessage'.
Medium: 4:19:00 PM [Debug] [BuildRunner.UpdateStageStatus] : Stage Tests NET40x64 - StageExecuting
Medium: 4:19:00 PM [Debug] Sending message of type 'BuildStateChangedMessage' to consumer of type 'NotificationController'.
Medium: 4:19:00 PM [Debug] [BuildRunner.OnExecutingStage] : geting previous duration
Medium: 4:19:00 PM [Debug] [BuildRunner.OnExecutingStage] : opening log file
Medium: 4:19:00 PM [Debug] Using a pooled session to log database 'C:\Continua\Ws\JD-Data\691\log'.
Medium: 4:19:00 PM [Debug] [BuildRunner.OnExecutingStage] : log.OnStageBegin
Medium: 4:19:00 PM [Debug] [BuildRunner.OnExecutingStage] : deserializing actions
Medium: 4:19:00 PM [Debug] [BuildRunner.OnExecutingStage] getting ServerAgentController
Medium: 4:19:00 PM [Debug] [BuildRunner.OnExecutingStage] agentController.Execute
Medium: 4:19:00 PM [Debug] [ServerAgentController.Execute] Stage Tests NET40x64
Medium: 4:19:00 PM [Debug] [ServerAgentController.Execute] getting reserved agent for stage
Medium: 4:19:00 PM [Debug] [ServerAgentController.Execute] agent seems to be alive, will check
Medium: 4:19:00 PM [Debug] [ServerAgentController.Execute] refreshing agent
Medium: 4:19:00 PM [Debug] [ServerAgentController.SetupAgentExecutionContext] stage Tests NET40x64

Medium: 4:20:52 PM [Debug] [ServerAgentController.CompleteStageExecution] callId 36345d14-272a-481a-95f8-28e7fbb10ce7, result : Failure
Medium: 4:20:52 PM [Debug] [ServerAgentController.CompleteStageExecution] done
Medium: 4:20:52 PM [Debug] [ServerAgentController.MonitorStageExecution] refreshing build object from db
Medium: 4:20:52 PM [Debug] [ServerAgentController.MonitorStageExecution] refreshing agent object from db, LastActive : 2/08/2013 6:19:47 AM
Medium: 4:20:52 PM [Debug] [ServerAgentController.MonitorStageExecution] agent refreshed, LastActive : 2/08/2013 6:20:39 AM
Medium: 4:20:52 PM [Debug] [ServerAgentController.MonitorStageExecution] stage Tests NET40x64 Done
Medium: 4:20:52 PM [Debug] [ServerAgentController.Execute] got async call result
Medium: 4:20:52 PM [Debug] [ServerAgentController.Execute] Async call Done
Medium: 4:20:52 PM [Debug] [ServerAgentController.Execute] Done
Medium: 4:20:52 PM [Debug] [BuildRunner.OnExecutingStage] stage finished
Medium: 4:20:52 PM [Debug] [BuildRunner.OnExecutingStage] deserializing variables
Medium: 4:20:52 PM [Debug] [BuildRunner.OnExecutingStage] Registering artifacts
Medium: 4:20:52 PM [Debug] [BuildRunner.OnExecutingStage] Getting unit test manager
Medium: 4:20:52 PM [Debug] [BuildRunner.OnExecutingStage] Adding unit test metrics
Medium: 4:20:52 PM [Debug] [BuildRunner.OnExecutingStage] Updating metric totals
Medium: 4:20:52 PM [Debug] [BuildRunner.UpdateStageStatus] : Stage Tests NET40x64 - StageFailed
Medium: 4:20:52 PM [Debug] [BuildRunner.OnExecutingStage] log.OnStageComplete
Medium: 4:20:52 PM [Debug] [BuildRunner.OnExecutingStage] releasing agent reservation
Medium: 4:20:52 PM [Debug] [BuildRunner.OnExecutingStage] : 691 Done
Medium: 4:20:52 PM [Debug] [BuildRunner.OnFailure] : 691

If this is the same project has you posted about before, the NUnit action is failing because of an test reporting an Error. I’m just about to post build 2370 which has a new option to Fail the action if tests Error, turn off that option and the build should continue.

I’ve been working on further complexities and evolution of the stages for this project to generate multiple combinations of .net framework and compiler plateforms, with matching unit tests. I can confirm, the unit test with an Error metric causes the problem for the progress of the Action/Stage. Thanks for adding new features for MS Build and Nunit, they have proven to be extremely helpful. Again, Vsoft demonstrating how software industry should be supporting software.

Scratch build 2370. Uploading 2372 at the moment.

That may not be the next build… need to bootstrap the build server but it’s currently uploading a finalbuilder build. Will post on the forums later when a new build is available, apologies for the delay.