Backups failing, network or resource error
Model: HP Proliant DL380 G5
CPU: 2 x 4-core Intel Xeon E5440 2.83GHz
OS: Windows 2008 R2 Standard 64 bit
We currently have 79 SQL servers, each with a varying number of databases of varying sizes, backing up to this one server. The backup scheme is differential backups during the week and full backups at the weekend. My problem is that some, though not all, backup jobs are failing with the following error:
8/09/2012 02:04:55: Warning 210: Thread 0 warning: WriteFile failed for file: \\REDGATE-SERVER\SQLBackupE\Data\SQL-CLIENT\LiveDatabase\FULL_(local)_LiveDatabase_20120907_232529.sqb at position: 1698694144
08/09/2012 01:31:59: WriteFile failed for file: \\REDGATE-SERVER\SQLBackupE\Data\NM-HADES\LiveDatabase\FULL_(local)_LiveDatabase_20120907_232529.sqb (121: The semaphore timeout period has expired.)
08/09/2012 01:31:59: CloseTargetFile.FlushFileBuffers error: The specified network name is no longer available.
Also, in the Redgate server's System event log, I am seeing the following error (source SRV, event ID 2012):
While transmitting or receiving data, the server encountered a network error. Occasional errors are expected, but large amounts of these indicate a possible error in your network configuration. The error status code is contained within the returned data (formatted as Words) and may point you towards the problem.
What I would like to know is whether there are any recommended guidelines for the number of SQL clients backing up to one Redgate server. I ask because I get the feeling the server can't handle the amount of data being thrown at it. Should we be using more than one Redgate server for my environment?
If the number of clients/databases is not the problem, does anybody have an idea what might be causing this? Although it obviously points to network problems, I just want to be certain we are following guidelines on number of clients and resource usage before I ask our network team to look into it.
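One way to tell load from a network fault might be to check whether the WriteFile failures cluster around particular clients or hours of the backup window. Below is a rough Python sketch of that tally. It assumes log lines shaped like the ones quoted above and a \\server\share\Data\client\... folder layout; the function name and the regular expression are illustrative only and not part of any Redgate tooling.

```python
import re
from collections import Counter

# Assumed line shape (taken from the entries quoted in this post):
#   <date> <hh:mm:ss>: [optional warning text] WriteFile failed for file: \\server\share\Data\<client>\...
LINE_RE = re.compile(
    r"\d{1,2}/\d{2}/\d{4} (?P<hour>\d{2}):\d{2}:\d{2}: "
    r".*?WriteFile failed for file: "
    r"\\\\[^\\]+\\[^\\]+\\Data\\(?P<client>[^\\]+)\\"
)

def tally_write_failures(lines):
    """Count WriteFile failures per (client, hour of day)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:  # lines that are not WriteFile failures are skipped
            counts[(match.group("client"), match.group("hour"))] += 1
    return counts

if __name__ == "__main__":
    import sys
    # Usage: python tally.py < combined_backup_log.txt
    for (client, hour), n in sorted(tally_write_failures(sys.stdin).items()):
        print(f"{client} {hour}:00 {n}")
```

If the failures pile up in the hours when most of the 79 servers are writing at once, that would point towards load on the backup server; failures spread evenly across quiet hours would look more like a network fault.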
Thanks for any replies.