How to cancel a hung deployment?
isme
Normally my deployment takes about five minutes to complete.
The latest one has been running for 40 minutes with no end in sight.
The log goes no further than starting to execute a custom pre-deployment script:
2013-12-18 17:48:16 +00:00 INFO Looking for PowerShell scripts named PreDeploy.ps1
2013-12-18 17:48:16 +00:00 INFO Calling PowerShell script: 'G:\Temp\p1piiqgl.csd\Packages\..\Applications\INT-CI\Rhubarb.Rhubarb.Rhubarb\1.0.8711.241\db\state\Replication\PreDeploy.ps1'
It turns out that the script contains a subtle infinite loop.
How do I cancel a runaway deployment?
My Deployment Manager version is v2.3.4.13. The host runs PowerShell v2.
EDIT: I removed the script I posted because a different script actually caused the problem. PEBKAC!
Iain Elder, Skyscanner
Comments
Is there a less disruptive way to do this?
Here's what we tried.
I used this command to restart the RGDM service.
Initially it looked like it worked.
But a few seconds later, the service stopped again.
At this point we figured the easiest thing to do was to restart the host.
After a couple of minutes, the host responded to Get-Service requests again.
We tried to start the service again.
This time it stayed up.
Worryingly, there is no history of the bad deployment in the user interface.
The deployment history for my project says that the most recent deployment was successful.
I've omitted the stack trace.
Ask me if you want a complete copy of the filtered log.
To generate this, I connected Event Viewer on my workstation to the RGDM host...
...and used this filter to show only the events from RGDM.
I tried to restart the service twice before restarting the host, which could explain why System.ServiceModel.AddressAlreadyInUseException appears twice.
By the way, the Red Gate Deployment Manager service (process name RedGate.Deploy.Server.exe) doesn't run the deployment directly; the Deployment Agent (RedGate.Deploy.Agent.exe) is responsible for actually executing the deployment. For standard deployments this is the Red Gate Deployment Agent service, normally running on a remote machine. For database deployments, the DM server spawns RedGate.Deploy.Agent.exe as a child process. If you need to terminate a running deployment, it's the agent process that you need to kill.
We don't currently have the ability to cancel deployments that are in progress. If you'd like us to implement that feature, please vote for it on UserVoice.
Redgate Software
The next time we encounter a runaway deployment, we'll try to kill just the agent processes rather than the server process.
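As a rough sketch, stopping only the agent might look like the following in PowerShell v2. It assumes the commands run in an elevated session on the machine hosting the agent process, and that it is acceptable to stop every matching agent process, including any that are running a healthy deployment. The process name comes from the reply above. Nothing in this thread confirms how Deployment Manager records a deployment whose agent was killed.

```powershell
# Process name from the reply above; Get-Process expects it without ".exe".
$agentName = 'RedGate.Deploy.Agent'

# Collect matching processes; @() keeps the result an array even for 0 or 1 matches.
$agents = @(Get-Process -Name $agentName -ErrorAction SilentlyContinue)

# Show what is about to be stopped before doing anything destructive.
$agents | Format-Table Id, ProcessName, StartTime -AutoSize

# Stop only the agent processes; the server process (RedGate.Deploy.Server) is left alone.
foreach ($agent in $agents) {
    Stop-Process -Id $agent.Id -Force
}
```

If several deployments run at once, it would be safer to pick a single Id from the table and pass only that one to Stop-Process.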
We may decide to reduce the default PowerShell time limit for most of our deployments to defend against being blocked by infinite loops.
We expect some of our bulk data transfer operations to take more than one hour. In these situations it might make sense to increase the limit.
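Separately from Deployment Manager's own time limit, a script can defend itself with a deadline check inside any loop that waits on something. The sketch below is one way to do that in PowerShell v2. The ten-minute limit, the five-second poll interval, and the Test-WorkFinished function are all hypothetical stand-ins for illustration, not part of our real PreDeploy.ps1.

```powershell
# Hypothetical limit; a bulk data transfer script would need a larger value.
$limit = New-TimeSpan -Minutes 10
$timer = [System.Diagnostics.Stopwatch]::StartNew()

# Hypothetical stand-in for the real completion check: reports success on the third poll.
$script:polls = 0
function Test-WorkFinished {
    $script:polls++
    return ($script:polls -ge 3)
}

$done = $false
while (-not $done) {
    if ($timer.Elapsed -gt $limit) {
        # Fail loudly so the deployment ends with an error instead of hanging.
        throw "Script exceeded its time limit of $limit"
    }

    $done = Test-WorkFinished
    if (-not $done) {
        Start-Sleep -Seconds 5
    }
}
```

This only catches loops that contain the check. A loop that never reaches it would still need the external time limit or a killed agent process.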
Thanks for your help!