Setting up Deployment Manager

SteveGTR Posts: 91
edited October 8, 2013 10:07AM in Deployment Manager
We are having all sorts of problems getting Deployment Manager (DM) set up in our environment.

Initially we tried to use Windows Authentication, but the DM portal would never come up; it generated a 500 Internal Server Error.

I found that if I left the "use Windows Authentication" option unchecked, the portal would come up.

Now connecting target machines is a problem. The agent that is installed on the same machine as the manager appears to come up. Sometimes it doesn't, though.

The bigger problem is the database server target doesn't come up. It throws an error:

Failed: http://eerepagedevdb1:10301/

2013-10-04 16:21:58 -04:00 ERROR System.ServiceModel.Security.MessageSecurityException: An unsecured or incorrectly secured fault was received from the other party. See the inner FaultException for the fault code and detail. ---> System.ServiceModel.FaultException: An error occurred when verifying security for the message.
--- End of inner exception stack trace ---

Server stack trace:
at System.ServiceModel.Channels.SecurityChannelFactory`1.SecurityRequestChannel.ProcessReply(Message reply, SecurityProtocolCorrelationState correlationState, TimeSpan timeout)
at System.ServiceModel.Channels.SecurityChannelFactory`1.SecurityRequestChannel.Request(Message message, TimeSpan timeout)
at System.ServiceModel.Security.SecuritySessionSecurityTokenProvider.DoOperation(SecuritySessionOperation operation, EndpointAddress target, Uri via, SecurityToken currentToken, TimeSpan timeout)
at System.ServiceModel.Security.SecuritySessionSecurityTokenProvider.GetTokenCore(TimeSpan timeout)
at System.IdentityModel.Selectors.SecurityTokenProvider.GetToken(TimeSpan timeout)
at System.ServiceModel.Security.SecuritySessionClientSettings`1.ClientSecuritySessionChannel.OnOpen(TimeSpan timeout)
at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout)
at System.ServiceModel.Channels.ReliableChannelBinder`1.ChannelSynchronizer.SyncWaiter.TryGetChannel()
at System.ServiceModel.Channels.ReliableChannelBinder`1.ChannelSynchronizer.SyncWaiter.TryWait(TChannel& channel)
at System.ServiceModel.Channels.ReliableChannelBinder`1.ChannelSynchronizer.TryGetChannel(Boolean canGetChannel, Boolean canCauseFault, TimeSpan timeout, MaskingMode maskingMode, TChannel& channel)
at System.ServiceModel.Channels.ClientReliableChannelBinder`1.Request(Message message, TimeSpan timeout, MaskingMode maskingMode)
at System.ServiceModel.Channels.RequestReliableRequestor.OnRequest(Message request, TimeSpan timeout, Boolean last)
at System.ServiceModel.Channels.ReliableRequestor.Request(TimeSpan timeout)
at System.ServiceModel.Channels.ClientReliableSession.Open(TimeSpan timeout)
at System.ServiceModel.Channels.ReliableRequestSessionChannel.OnOpen(TimeSpan timeout)
at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannel.OnOpen(TimeSpan timeout)
at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannel.CallOnceManager.CallOnce(TimeSpan timeout, CallOnceManager cascade)
at System.ServiceModel.Channels.ServiceChannel.EnsureOpened(TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs, TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(IMethodCallMessage methodCall, ProxyOperationRuntime operation)
at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message)

Exception rethrown at [0]:
at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)
at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)
at RedGate.Deploy.Shared.Contracts.IHealthService.CheckHealth()
at System.Dynamic.UpdateDelegates.UpdateAndExecute1[T0,TRet](CallSite site, T0 arg0)
at CallSite.Target(Closure , CallSite , Object )
at ImpromptuInterface.Optimization.InvokeHelper.InvokeMemberCallSite(Object target, String_OR_InvokeMemberName name, Object[] args, String[] tArgNames, Type tContext, Boolean tStaticContext, CallSite& callSite)
at ImpromptuInterface.Dynamic.ImpromptuForwarder.TryInvokeMember(InvokeMemberBinder binder, Object[] args, Object& result)
at RedGate.Deploy.Server.Proxies.WcfSecurityWrapper`1.TryInvokeMember(InvokeMemberBinder binder, Object[] args, Object& result)
at CallSite.Target(Closure , CallSite , Object )
at ActLike_IHealthService_a341aaf64f0449348fe6942ca27b2141.CheckHealth()
at RedGate.Deploy.Server.Tasks.Health.CheckAgentHealthActivity.<Execute>b__0()

My systems guy disabled the firewalls on both manager and agent machines with no success.

It did appear to work once. I asked the systems guy and he said he just ran the agent tool on the DB server as an administrator... We rebooted both machines to see if we could pinpoint what was happening, now the DB1 agent won't connect.

Very frustrating...

Which log files is the error referring to? I looked in the Red Gate logs directory, but you can't look at those files while the server is up.

Man, haven't even really done anything yet and having all these problems. I'd hate to see what problems we'd have when we try and connect to the production servers in a DMZ.

Help :(

Comments

  • Hi Steve,

    Messages from the deployment agents will be logged to the windows application log, and can be examined with the windows event viewer.

    Please could you look in the services control panel and make sure that the service 'Red Gate Deployment Agent' is started on both machines that you wish to connect to. If its status does not show as started, could you try to start it and then try to connect using deployment manager again. If it doesn't work can you send us any messages that you see in the event log.

    Is it possible that you have another service running on those machines that is also trying to bind to port 10301?

    Regarding the problems you experienced with windows authentication, if you turn the feature back on and check the portal again, then you should get a message in the event log from asp.net. Please could you also let us have those details.
    Robin Hellen
    Test Engineer
    DLM Automation
  • Thanks for the reply. I'll check those items. It's interesting that when I came in and looked at the status screen, both target machines had a green check mark. When I manually performed a health check, the connection to our DB server failed again. I looked at the history and it appears that sometimes it connects and sometimes it doesn't. Here is the activity log for today and the output from a successful health check:

    Failed Manual health check Steve Monday, October 07, 2013 9:19 AM -04:00
    Successful Check Agent health [System] Monday, October 07, 2013 8:59 AM -04:00
    Successful Check for updated version [System] Monday, October 07, 2013 8:54 AM -04:00
    Failed Check Agent health [System] Monday, October 07, 2013 8:29 AM -04:00
    Successful Scheduled database backup [System] Monday, October 07, 2013 8:02 AM -04:00
    Failed Check Agent health [System] Monday, October 07, 2013 7:59 AM -04:00
    Failed Check Agent health [System] Monday, October 07, 2013 7:29 AM -04:00
    Successful Check Agent health [System] Monday, October 07, 2013 6:59 AM -04:00
    Successful Check for updated version [System] Monday, October 07, 2013 6:54 AM -04:00
    Failed Check Agent health [System] Monday, October 07, 2013 6:29 AM -04:00
    Failed Check Agent health [System] Monday, October 07, 2013 5:59 AM -04:00
    Successful Check for updated version [System] Monday, October 07, 2013 4:54 AM -04:00
    Successful Scheduled database backup [System] Monday, October 07, 2013 4:01 AM -04:00
    Successful Check for updated version [System] Monday, October 07, 2013 2:54 AM -04:00
    Successful Check for updated version [System] Monday, October 07, 2013 12:54 AM -04:00
    Successful Scheduled database backup [System] Monday, October 07, 2013 12:04 AM -04:00


    Check Agent health

    2013-10-07 08:59:23 -04:00 INFO The following Agents will be checked:
    2013-10-07 08:59:23 -04:00 INFO - http://eerepagedevweb1:10301/
    2013-10-07 08:59:23 -04:00 INFO - http://eerepagedevdb1:10301/
    2013-10-07 08:59:23 -04:00 INFO Begin health checks
    2013-10-07 08:59:24 -04:00 INFO
    2013-10-07 08:59:24 -04:00 INFO Health results:
    2013-10-07 08:59:24 -04:00 INFO - ONLINE: http://eerepagedevweb1:10301/, running version 2.2.16.1 on machine EEREPAGEDEVWEB1 as SYSTEM
    2013-10-07 08:59:24 -04:00 INFO - ONLINE: http://eerepagedevdb1:10301/, running version 2.2.16.1 on machine EEREPAGEDEVDB1 as SYSTEM
    2013-10-07 08:59:24 -04:00 INFO Health check complete!

    Success: http://eerepagedevweb1:10301/

    2013-10-07 08:59:23 -04:00 INFO Health check successful. Running version: 2.2.16.1

    Success: http://eerepagedevdb1:10301/

    2013-10-07 08:59:24 -04:00 INFO Health check successful. Running version: 2.2.16.1
  • "Messages from the deployment agents will be logged to the windows application log, and can be examined with the windows event viewer."

    I looked in Event Viewer --> Windows Logs --> Application on the EEREPAGEDEVDB1 server, where the agent is failing, and there were no log entries after I ran a health check and it failed.

    I did see entries on the EEREPAGEDEVWEB1 server where the deployment manager runs. The entries are from the Deployment Manager and are an error:

    2013-10-07 09:29:19,898 [40] ERROR RedGate.Deploy.Server.Tasks.TaskRunner [(null)] - System.AggregateException: One or more errors occurred. ---> RedGate.Deploy.Server.Tasks.ActivityFailedException: One or more Agents were not online. Please see the output log for details.
    at RedGate.Deploy.Server.Tasks.Health.HealthControllerActivity.<Execute>d__6.MoveNext()
    --- End of inner exception stack trace ---
    at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
    at RedGate.Deploy.Server.Tasks.TaskRunner.<>c__DisplayClass1.<Execute>b__0()
    ---> (Inner Exception #0) RedGate.Deploy.Server.Tasks.ActivityFailedException: One or more Agents were not online. Please see the output log for details.
    at RedGate.Deploy.Server.Tasks.Health.HealthControllerActivity.<Execute>d__6.MoveNext()<---

    Followed by a warning:

    2013-10-07 09:29:20,008 [40] WARN RedGate.Deploy.Server.Tasks.TaskRunner [(null)] - Task tasks-466 exited with error One or more Agents were not online. Please see the output log for details.

    "Please could you look in the services control panel and make sure that the service 'Red Gate Deployment Agent' is started on both machines that you wish to connect to."

    EEREPAGEDEVDB1

    Red Gate Deployment Agent is running

    EEREPAGEDEVWEB1

    Red Gate Deployment Agent is running
    Red Gate Deployment Manager is running

    Here is some additional information:

    From EEREPAGEDEVWEB1

    portqry -n EEREPAGEDEVDB1 -e 10301

    192.168.5.179

    TCP port 10301 (unknown service): LISTENING

    portqry -n EEREPAGEDEVWEB1 -e 10301

    192.168.5.174

    TCP port 10301 (unknown service): LISTENING

    netstat -ano | findstr 10301

    TCP 0.0.0.0:10301 0.0.0.0:0 LISTENING 4
    TCP [::]:10301 [::]:0 LISTENING 4

    PID 4: System

    From EEREPAGEDEVDB1

    portqry and netstat same as EEREPAGEDEVWEB1 results
  • Thanks Steve.

    I'm assuming both machines are on the same network segment, based on the IP addresses. Is there anything odd, like multiple network cards in one or other machine?

    Does the healthcheck to the local agent instance on the DM server suffer the same issue, or does that always work?

    Another thing to check, though unlikely, is that you've disabled the power-saving feature on the network card in Device Manager, in case it's going to sleep during times of inactivity and isn't reactivating quickly enough for things to work.
    Systems Software Engineer

    Redgate Software

  • SteveGTR Posts: 91
    edited October 8, 2013 10:14AM
    Both machines are on an internal network segment.

    The health check appears to work fine on EEREPAGEDEVWEB1. Both the agent and the manager are on this machine.

    Both machines are awake when the errors occur.

    These machines are set up as virtual servers.
  • I tried using each of the DB1 IP addresses as the Agent URL on the Edit Target Machine dialog, and each failed with the same message:

    System.ServiceModel.Security.MessageSecurityException: An unsecured or incorrectly secured fault was received from the other party.

    I tried regenerating the server keys and resetting both the environment setting and the agent setting on each server. The DB server is still offline.
  • I disabled the two VMware interfaces on the DB1 server, removed all but one IP address on WEB1, and flushed the DNS cache on WEB1. I retried and it still fails on DB1.

    At a loss...
  • Hi Steve,

    We have seen this issue before when the clocks on the agent and the server machine are out of sync (see our troubleshooting page).
    Robin Hellen
    Test Engineer
    DLM Automation
  • Eureka! That was it. The DB server's time was 45 minutes ahead of the actual time. Resetting the time on the DB server solved the problem --- whew.

    Thanks for all your help. I can now move on to the important stuff :)
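
To illustrate the fix that resolved this thread, here is a minimal Python sketch of a clock-skew check. It assumes the two clock readings have already been collected by some other means; the function names and sample timestamps are hypothetical. It also assumes the agent connection uses WCF's default MaxClockSkew of 5 minutes, which the thread does not confirm.

```python
from datetime import datetime, timedelta

# WCF message security rejects messages whose timestamps differ from the
# receiver's clock by more than MaxClockSkew, which is 5 minutes by default.
# Assumption: Deployment Manager leaves that default unchanged.
DEFAULT_MAX_CLOCK_SKEW = timedelta(minutes=5)


def clock_skew(manager_time: datetime, agent_time: datetime) -> timedelta:
    """Return the absolute difference between the two machines' clocks."""
    return abs(agent_time - manager_time)


def within_tolerance(
    manager_time: datetime,
    agent_time: datetime,
    tolerance: timedelta = DEFAULT_MAX_CLOCK_SKEW,
) -> bool:
    """Return True if the clocks are close enough for message security."""
    return clock_skew(manager_time, agent_time) <= tolerance


# Hypothetical readings: the agent clock runs 45 minutes ahead of the
# manager clock, as reported for the DB server in this thread.
manager_now = datetime(2013, 10, 7, 9, 19, 0)
agent_now = manager_now + timedelta(minutes=45)

print(clock_skew(manager_now, agent_now))        # 0:45:00
print(within_tolerance(manager_now, agent_now))  # False
```

Under those assumptions, a 45-minute difference is well outside the tolerance, which would be consistent with the "An error occurred when verifying security for the message" fault shown earlier.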