Deployment Manager Broke

SteveGTRSteveGTR Posts: 91
edited December 27, 2013 10:18AM in Deployment Manager
Everything was going great. Developers and management are loving DM.

Then today DM broke. I believe the problem occurred because of not enough disk space on the DM server. This results because the Windows\Temp\Nuget files are not being cleaned up. I've previously posted a message concerning this. I cleared out the deadwood in the Temp directory, but that didn't help.

The problem started when a developer did a manual build which automatically deploys to Development. DeploymentManager.exe generated the following message:
 Deployment Manager Command Line Tools version 2.3.0.21
Finding project: PAGE
Handshaking with Deployment Manager server: http://server:8080/api
Handshake successful. Deployment Manager version: 2.3.12.4; API version: 1.0.0
Finding steps for project...
Getting package versions for each step...
Using latest version (0.20131217.1) of package MergePage.
Using latest version (2.3.31.289) of package WebPage.
A --releaseversion parameter was not specified, so a version number was automatically selected based on the highest package version: 2.3.31.289
Creating release: 2.3.31.289
POST http://server:8080/api/projects/projects-1/releases
Release created successfully!
POST http://server:8080/api/projects/projects-1/releases/releases-1697/deployments
Successfully scheduled release '2.3.31.289' for deployment to environment 'Development'
Deployment not yet finished. It's taken 00:00:00.0625000 so far.
Deployment not yet finished. It's taken 00:00:10.5937500 so far.
Deployment not yet finished. It's taken 00:00:20.6406250 so far.
Deployment not yet finished. It's taken 00:00:30.6875000 so far.
Deployment not yet finished. It's taken 00:00:40.7343750 so far.
Deployment not yet finished. It's taken 00:00:50.7812500 so far.
Deployment not yet finished. It's taken 00:01:00.8125000 so far.
Deployment not yet finished. It's taken 00:01:10.8750000 so far.
Deployment not yet finished. It's taken 00:01:20.9375000 so far.
Deployment not yet finished. It's taken 00:01:31 so far.
Deployment not yet finished. It's taken 00:01:41.0468750 so far.
Deployment not yet finished. It's taken 00:01:51.0781250 so far.
Deployment not yet finished. It's taken 00:02:01.1406250 so far.
Deployment not yet finished. It's taken 00:02:12.3125000 so far.
Deployment not yet finished. It's taken 00:02:22.3593750 so far.
Unexpected character encountered while parsing value: <. Path '', line 2, position 1.
Newtonsoft.Json.JsonReaderException: Unexpected character encountered while parsing value: <. Path '', line 2, position 1.
at Newtonsoft.Json.JsonTextReader.ParseValue()
at Newtonsoft.Json.JsonTextReader.ReadInternal()
at Newtonsoft.Json.JsonTextReader.Read()
at Newtonsoft.Json.Serialization.JsonSerializerInternalReader.ReadForType(JsonReader reader, JsonContract contract, Boolean hasConverter)
at Newtonsoft.Json.Serialization.JsonSerializerInternalReader.Deserialize(JsonReader reader, Type objectType, Boolean checkAdditionalContent)
at Newtonsoft.Json.JsonSerializer.DeserializeInternal(JsonReader reader, Type objectType)
at Newtonsoft.Json.JsonConvert.DeserializeObject(String value, Type type, JsonSerializerSettings settings)
at Newtonsoft.Json.JsonConvert.DeserializeObject[T](String value, JsonSerializerSettings settings)
at Newtonsoft.Json.JsonConvert.DeserializeObject[T](String value)
at RedGate.Deploy.DeploymentManagerTools.Client.DeploymentManagerSession.Get[TResource](String path)
at RedGate.Deploy.DeploymentManagerTools.Client.DeploymentManagerSession.GetTask(DeploymentResource deployment)
at RedGate.Deploy.DeploymentManagerTools.Commands.DeploymentWatcher.<>c__DisplayClass8.<WaitForDeploymentsToFinish>b__0(DeploymentResource x)
at System.Linq.Enumerable.WhereSelectArrayIterator`2.MoveNext()
at System.Collections.Generic.List`1..ctor(IEnumerable`1 collection)
at System.Linq.Enumerable.ToList[TSource](IEnumerable`1 source)
at RedGate.Deploy.DeploymentManagerTools.Commands.DeploymentWatcher.WaitForDeploymentsToFinish(IDeploymentManagerSession session, DeploymentResource[] deployments, TimeSpan timeout, TimeSpan deploymentStatusCheckSleepCycle, Boolean displayDeploymentLog)
at RedGate.Deploy.DeploymentManagerTools.Commands.DeployCommandBase.DoDeployment(ReleaseResource release, IEnumerable`1 deployToEnvironmentNames)
at RedGate.Deploy.DeploymentManagerTools.Commands.DeployCommandBase.ExecuteImpl()
at RedGate.Deploy.DeploymentManagerTools.Commands.ApiCommand.Execute()
at RedGate.Deploy.Shared.Startup.CommandProcessor.<ProcessAsync>d__4.MoveNext()
Deployment manager returned error: 1 

Now DM is unresponsive. When we try to manually deploy via the portal it just sits at the queuing request stage (the 1st step). Tried to check health functionality and it just sits forever at queuing request.

Rebooted DM server, restart DM service and agent with no success.

I looked at the application event log on the DM server and there are problems:
2013-12-18 11:31:29,785 [5] WARN  RedGate.Deploy.Startup [(null)] - Tried to load C:\Program Files (x86)\Red Gate\Deployment Agent\Agent\Plugins\RedGate.Deploy.SqlServerDbPackage.AgentPlugin\SQLite.Interop.dll as potential plugin, but failed.
System.BadImageFormatException: Could not load file or assembly 'file:///C:\Program Files (x86)\Red Gate\Deployment Agent\Agent\Plugins\RedGate.Deploy.SqlServerDbPackage.AgentPlugin\SQLite.Interop.dll' or one of its dependencies. The module was expected to contain an assembly manifest.
File name: 'file:///C:\Program Files (x86)\Red Gate\Deployment Agent\Agent\Plugins\RedGate.Deploy.SqlServerDbPackage.AgentPlugin\SQLite.Interop.dll'
   at System.Reflection.RuntimeAssembly._nLoad(AssemblyName fileName, String codeBase, Evidence assemblySecurity, RuntimeAssembly locationHint, StackCrawlMark& stackMark, Boolean throwOnFileNotFound, Boolean forIntrospection, Boolean suppressSecurityChecks)
   at System.Reflection.RuntimeAssembly.nLoad(AssemblyName fileName, String codeBase, Evidence assemblySecurity, RuntimeAssembly locationHint, StackCrawlMark& stackMark, Boolean throwOnFileNotFound, Boolean forIntrospection, Boolean suppressSecurityChecks)
   at System.Reflection.RuntimeAssembly.InternalLoadAssemblyName(AssemblyName assemblyRef, Evidence assemblySecurity, StackCrawlMark& stackMark, Boolean forIntrospection, Boolean suppressSecurityChecks)
   at System.Reflection.RuntimeAssembly.InternalLoadFrom(String assemblyFile, Evidence securityEvidence, Byte[] hashValue, AssemblyHashAlgorithm hashAlgorithm, Boolean forIntrospection, Boolean suppressSecurityChecks, StackCrawlMark& stackMark)
   at System.Reflection.Assembly.LoadFrom(String assemblyFile)
   at RedGate.Deploy.Agent.Plugins.PluginsModule.GetPotentialPluginAssemblies()

=== Pre-bind state information ===
LOG: User = NT AUTHORITY\SYSTEM
LOG: Where-ref bind. Location = C:\Program Files (x86)\Red Gate\Deployment Agent\Agent\Plugins\RedGate.Deploy.SqlServerDbPackage.AgentPlugin\SQLite.Interop.dll
LOG: Appbase = file:///C:/Program Files (x86)/Red Gate/Deployment Agent/Agent/
LOG: Initial PrivatePath = NULL
Calling assembly : (Unknown).
===
LOG: This bind starts in LoadFrom load context.
WRN: Native image will not be probed in LoadFrom context. Native image will only be probed in default load context, like with Assembly.Load().
LOG: Using application configuration file: C:\Program Files (x86)\Red Gate\Deployment Agent\Agent\RedGate.Deploy.Agent.exe.Config
LOG: Using host configuration file: 
LOG: Using machine configuration file from C:\Windows\Microsoft.NET\Framework\v4.0.30319\config\machine.config.
LOG: Attempting download of new URL file:///C:/Program Files (x86)/Red Gate/Deployment Agent/Agent/Plugins/RedGate.Deploy.SqlServerDbPackage.AgentPlugin/SQLite.Interop.dll.
ERR: Failed to complete setup of assembly (hr = 0x80131018). Probing terminated.

I'm not sure how to proceed as we are dead in the water now. I could reinstall and reconfigure DM and the agent on the server. I will wait for feedback.

Comments

  • Additional information: Restarting IIS gets the following error in the Application Event Log on the DM server:
    Event code: 3005 
    Event message: An unhandled exception has occurred. 
    Event time: 12/18/2013 1:57:28 PM 
    Event time (UTC): 12/18/2013 6:57:28 PM 
    Event ID: 22a5d8d9cf274f0e8f859f587a8badda 
    Event sequence: 345 
    Event occurrence: 1 
    Event detail code: 0 
     
    Application information: 
        Application domain: /LM/W3SVC/2/ROOT-1-130318655548490218 
        Trust level: Full 
        Application Virtual Path: / 
        Application Path: C:\Program Files (x86)\Red Gate\Deployment Manager\Portal\ 
        Machine name: EEREPAGEDEVWEB1 
     
    Process information: 
        Process ID: 2456 
        Process name: w3wp.exe 
        Account name: NT AUTHORITY\NETWORK SERVICE 
     
    Exception information: 
        Exception type: ArgumentNullException 
        Exception message: Value cannot be null.
    Parameter name: httpContext
       at System.Web.HttpContextWrapper..ctor(HttpContext httpContext)
       at Glimpse.AspNet.AspNetFrameworkProvider.get_HttpRequestStore()
       at Glimpse.Core.Framework.Factory.<>c__DisplayClass2.<InstantiateTimerStrategy>b__1()
       at Glimpse.RavenDb.Profiler.get_Enabled()
       at Glimpse.RavenDb.Profiler.StopTrackingStore(Object sender, EventArgs e)
       at Raven.Client.Document.DocumentStore.Dispose()
       at Autofac.Core.Disposer.Dispose(Boolean disposing)
       at Autofac.Util.Disposable.Dispose()
       at Autofac.Core.Lifetime.LifetimeScope.Dispose(Boolean disposing)
       at Autofac.Util.Disposable.Dispose()
       at RedGate.Deploy.Portal.MvcApplication.Application_End()
    
     
     
    Request information: 
        Request URL:  
        Request path:  
        User host address:  
        User:  
        Is authenticated: False 
        Authentication Type:  
        Thread account name: NT AUTHORITY\NETWORK SERVICE 
     
    Thread information: 
        Thread ID: 23 
        Thread account name: NT AUTHORITY\NETWORK SERVICE 
        Is impersonating: False 
        Stack trace:    at System.Web.HttpContextWrapper..ctor(HttpContext httpContext)
       at Glimpse.AspNet.AspNetFrameworkProvider.get_HttpRequestStore()
       at Glimpse.Core.Framework.Factory.<>c__DisplayClass2.<InstantiateTimerStrategy>b__1()
       at Glimpse.RavenDb.Profiler.get_Enabled()
       at Glimpse.RavenDb.Profiler.StopTrackingStore(Object sender, EventArgs e)
       at Raven.Client.Document.DocumentStore.Dispose()
       at Autofac.Core.Disposer.Dispose(Boolean disposing)
       at Autofac.Util.Disposable.Dispose()
       at Autofac.Core.Lifetime.LifetimeScope.Dispose(Boolean disposing)
       at Autofac.Util.Disposable.Dispose()
       at RedGate.Deploy.Portal.MvcApplication.Application_End()
     
     
    Custom event details: 
    

    Getting ready to reinstall since I haven't heard back. Disappointing...
  • I downloaded DM from the Red Gate site. Tried to repair DM through Windows, but Windows didn't care for the msi image I downloaded.

    I uninstalled DM and reinstalled using the newly downloaded image. Amazingly all my settings were still in tact, but it still doesn't appear to work.

    11 minutes and counting on the Check Health screen:

    Settings -> Activity history -> Manual health check

    Loading...

    Details
    Started: Wednesday, December 18, 2013 2:18 PM -05:00
    By user: administrator
    Duration: 11 minutes
    Completed: Queued...
    Status: Queued

    After the install, I checked the Application Event Log and the portal is no longer throwing the error. No other errors in the other log files.

    15 minutes with the Manual health check screen. Nothing appears to be happening. I'll try to manually deploy again, but I anticipate that it will just sit there at the queue state.
  • Reinstalled the agent on the DM server (it's also an agent). This had no affect. The application warning (did I say error before) was still logged. I checked our QA web site target and the agent throws the same warning.

    My next step would be to uninstall, manually remove all the deployment folders, reinstall, and then totally reconfigure DM. I'd prefer not to do this, but its not working now...

    This is really making DM look bad when I thought we were really making a case for using this product beyond our development environment. How can I justify DM when the product breaks like this and is unrepairable? If anything, this begs for the need of a contingency plan for when DM fails in the future.
  • Sorry for us not replying sooner- we try and keep an eye on the forums but sometimes that falls behind other work. If you're ever having something you'd like immediate assistance on, it's generally best to email support@red-gate.com (or visit our portal at redgatesupport.zendesk.com) and a ticket will be raised for us to look into sooner (unanswered forum posts get automatically logged as tickets after a couple of days, but as you answered your own post that won't have happened!)

    Anyway, for your issue you have done all the obvious stuff - restarting, reinstalling and so on. I think the 'SQLite.Interop.dll' error is probably /not/ related as I see this in the logs when I do a SQL deployment myself and it all works OK.
    Are you OK for disk space now? If so, then there's a couple of possibilities. One is that it's trying to work with a corrupt cached package, so you may want to clear out C:\ProgramData\Red Gate\DeploymentManager\Data\PackageCache.

    Given that even the healthcheck is not working though it suggests /possibly/ that your database is corrupt in some way and needs to be restored. Therefore I'd suggest having a look in C:\ProgramData\Red Gate\DeploymentManager\Data\Backups and seeing if there is a backup from before the original issue occurred - copy the backup file (and probably a few preceding ones) elsewhere on the machine. In addition, make a copy of the 'feed' folder (C:\ProgramData\Red Gate\DeploymentManager\Data\feed) as the next step will be to delete the 'data' folder (you want to keep your feed contents!)

    To restore the data, you will need to follow these steps:

    - Stop DM (just the service)
    - Delete the ‘Data’ folder
    - Restart the services
    - Run the smuggler restore command for the relevant .dmbak, for example:

    Open an elevated command prompt, then go to C:\Program Files (x86)\Red Gate\Deployment Manager\Server folder and then the command to run would be:
    Raven.Smuggler.exe in http://localhost:10300/ mybackup.dmbak

    (replacing the filename with the appropriate one you copied out of the data folder earlier.

    That will run the restore process. Once done, restart the service again, and ideally the web portal in IIS too, then see if it works...

    I'll ask the DM team to look at this post tomorrow in case you're still having trouble
    Systems Software Engineer

    Redgate Software

  • Whew! James to the rescue. I'll heed your advice in the future and I'll give what you've suggested a try and report back.

    To answer your questions and provide additional information:

    Yes, there is plenty of free space now.

    We just did another build and now DeploymentManager.exe reports:
    Deployment Manager Command Line Tools version 2.3.0.21
    
    Finding project: PAGE
    Handshaking with Deployment Manager server: http://server:8080/api
    Handshake successful. Deployment Manager version: 2.3.12.4; API version: 1.0.0
    Finding steps for project...
    Getting package versions for each step...
    Using latest version (0.20131218.1) of package MergePage.
    Using latest version (2.3.31.291) of package WebPage.
    A --releaseversion parameter was not specified, so a version number was automatically selected based on the highest package version: 2.3.31.291
    Creating release: 2.3.31.291
    POST http://server:8080/api/projects/projects-1/releases
    Release created successfully!
    POST http://server:8080/api/projects/projects-1/releases/releases-1729/deployments
    The remote server returned an error: (404) Not Found.
    Resource not found: deployment with id deployments-1985, release releases-1729 and project projects-...
    
    Deployment manager returned error: 1
    
  • The command line tool I'm less familiar with I'm afraid, but this line:

    Resource not found: deployment with id deployments-1985, release releases-1729 and project projects-

    Suggests to me missing data. You can try browsing the datastore directly to see if deployment-1985 is there for example:

    http://localhost:10300/raven/studio.html#/documents?database=%3Csystem%3E&collection=Deployments

    or the release-1729:

    http://localhost:10300/raven/studio.html#/documents?database=%3Csystem%3E&collection=Deployments

    Do you have data leading up to that point? It's still sounding to me like a restore may be needed, but I'm afraid as we're in the UK the dev-team aren't here for me to confirm until tomorrow :(
    Systems Software Engineer

    Redgate Software

  • You are a life saving James. The restore did the trick. Check health works just fine. I did a build and it was deployed to Development with no problem.

    I'm going to have to create a job on the server to clear out the \Windows\Temp\Nuget directory for now so that we don't run into this problem again.

    Thanks again :)
  • Good news, glad you're up and running again now!
    Systems Software Engineer

    Redgate Software

  • I just had to follow these database backup restore steps as well and they worked like a charm! These steps should be added to the documentation so that people don't have to dig through forum posts in order to figure out how to do a db restore.
  • I agree! Even better would be an option to perform this processing on the portal.
Sign In or Register to comment.