MX records and "backup servers"

RavenGuard

Member
Jul 22, 2007
134
0
0
Hi all.

I work with email systems frequently, and have run into a fairly "deep" question regarding MX records.

When moving from one email system to another, the MX records are eventually changed to point permanently to the new system. If the old mail server remains functional (able to receive mail, but no longer accessible to users), does it do any harm to leave its MX record in place at a lower priority, in case we need to switch back to the old system?

In this case we are using Google Apps, and the scenario would be as follows.

MX Records in place:
Priority 10 points to Google Apps
Priority 20 points to Google Apps (alternate server 1)
Priority 30 points to Google Apps (alternate server 2)
Priority 40 points to Google Apps (alternate server 3)
Priority 50 points to Google Apps (alternate server 4)
Priority 60 points to Old Email Server

Scenario: a user from outside has a shoddy internet connection. They send mail to this system, and their connection cuts out as their server is attempting to send to each of the first 5 servers listed in the MX records. The connection resumes just as the 5th MX record fails, and the mail goes through to the 6th server. This would mean the mail system presents no errors to the users, and mail unknowingly goes to an old server that is never checked.

Is this a possibility? Does it work entirely differently and I'm crazy?

Thanks in advance for any help!
 

FiLeZz

Diamond Member
Jun 16, 2000
4,778
47
91
It doesn't work that way.

It would only work that way if the mail servers had crappy internet or were down altogether.
The sender tries the first MX; if it's down, it goes to the 2nd, and so on.

It has nothing to do with the user having bad internet.
 

RavenGuard

What I'm getting at is: what if their (sending) mail server had a bad connection, not the guy at his desk?

Does this change things?
 

FiLeZz

The likelihood of all 5 Google servers going down before a sender gets to your server is very low.

Usually you would leave a server like yours online for a grace period, until the MX change has propagated through DNS caches across the internet.
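
For what it's worth, the grace period can be roughed out from the TTL on the old MX records. This is only a sketch: the cutover date, the one-day TTL, and the one-day safety margin are all made-up values, and `earliest_safe_removal` is a hypothetical helper, not part of any DNS tool.

```python
from datetime import datetime, timedelta


def earliest_safe_removal(changed_at: datetime,
                          old_ttl_seconds: int,
                          margin: timedelta = timedelta(days=1)) -> datetime:
    """Resolvers may keep serving the old MX set for up to its TTL after the
    change, so wait at least that long, plus a margin, before retiring the
    old server."""
    return changed_at + timedelta(seconds=old_ttl_seconds) + margin


# Hypothetical cutover: records changed 1 March 2010 at 09:00, old TTL of 1 day.
cutover = datetime(2010, 3, 1, 9, 0)
removal = earliest_safe_removal(cutover, 86400)
print(removal)
```

Some caches hang on to records longer than the TTL says they should, so a more generous margin does no harm.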


The only reason to have the priorities is so that the next server in line acts as a failover.
Usually the failover servers are third-party, and all the mail they received is forwarded to you once your outage is over.

If it were me, I would remove it after 1 week.
If I needed it in the future, I would put it back.
 

RavenGuard


I completely understand that it is very unlikely all 5 Google servers will fail. Essentially, I am looking at scenarios where another failure somewhere along the line causes mail to be delivered to the old server. Again, take the example of a shoddy internet connection on the sender's side: is it POSSIBLE that an intermittent problem such as this could cause the sending server to fail on all 5 Google servers and succeed on the 6th?

Thanks!
 

imagoon

Diamond Member
Feb 19, 2003
5,199
0
0
MX failover is pretty simple. The sending server will try each server, one at a time, from the smallest number to the largest. So if the sending server's connection somehow was timed perfectly to be down for 10 -> 50 but came back online right as it was trying "60", then yes, 60 would get it. However, this is pretty pointless, as most mail servers that can't send will fail on all the MX records, queue the message, and try again later, commonly after something like 5 minutes, 15 minutes, 60 minutes and 8 hours (the exact schedule varies by mail server). Some also try again at 24 hours.
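
The ordering described above can be sketched in a few lines of Python. This is only an illustration, not how any particular mail server implements it: the host names are made up, `pick_mx` is a hypothetical helper, and `can_connect` is a stand-in for a real SMTP connection attempt.

```python
from typing import Callable, List, Optional, Tuple

# (preference, host) pairs mirroring the thread's example.
# Host names are hypothetical; the list is deliberately out of order.
MX_RECORDS: List[Tuple[int, str]] = [
    (60, "old-mail.example.com"),
    (10, "google-primary.example.com"),
    (30, "google-alt2.example.com"),
    (20, "google-alt1.example.com"),
    (50, "google-alt4.example.com"),
    (40, "google-alt3.example.com"),
]


def pick_mx(records: List[Tuple[int, str]],
            can_connect: Callable[[str], bool]) -> Optional[str]:
    """Try hosts from the lowest preference number to the highest.

    Returns the first host that accepts a connection, or None if every
    attempt fails (a real server would then queue the message and retry).
    """
    for _, host in sorted(records):
        if can_connect(host):
            return host
    return None


# Normal case: everything is reachable, so the priority-10 host gets the mail.
normal = pick_mx(MX_RECORDS, lambda host: True)

# The thread's scenario: the sender's link is down for the first five
# attempts and comes back just in time for the sixth.
attempts = iter([False, False, False, False, False, True])
unlucky = pick_mx(MX_RECORDS, lambda host: next(attempts))

# Far more likely: the link is down for the whole pass, so nothing is
# delivered and the message waits for the next retry.
deferred = pick_mx(MX_RECORDS, lambda host: False)

print(normal, unlucky, deferred)
```

Only the middle case ever reaches the old server, and it needs the outage to end in the short gap between the fifth and sixth attempts.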

Basically, keeping the old box accomplishes nothing except to solve a theoretical problem whose odds are so low that it would likely be a 1:1,000,000 chance or lower.
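
To put a rough number on that kind of estimate, here is a back-of-the-envelope calculation. It assumes each connection attempt fails independently with the same probability, which is a crude model (real outages tend to come in bursts), and the 5% failure rate is a made-up figure, not a measurement. `chance_of_landing_on_last` is a hypothetical helper.

```python
def chance_of_landing_on_last(p_fail: float, earlier_hosts: int = 5) -> float:
    """Probability that `earlier_hosts` attempts in a row fail and the very
    next attempt succeeds, assuming independent attempts (an assumption)."""
    return p_fail ** earlier_hosts * (1.0 - p_fail)


# Assumed: a flaky link that drops 5% of connection attempts.
p = chance_of_landing_on_last(0.05)
one_in = round(1 / p)
print(f"about 1 in {one_in:,}")
```

Under those assumptions the odds come out at roughly 1 in 3.4 million per message, the same ballpark as the guess above.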

--edit--
Also realize we are talking about the mail server's connection. User A, sending over a shoddy connection, simply wouldn't get the mail to his own mail server. The user should never be sending email 'directly' from his/her machine.
 
RavenGuard


I am actually a proponent of *REMOVING* the additional MX record... the purpose of my questions is to find real reasons why keeping the extra record can cause issues. I believe you are agreeing that yes, it is possible for something such as a shoddy network connection on an outbound server to cause this mail to go to the "backup" system, when it would be preferable for it to strictly fail and retry in 5 minutes. Is this assumption correct?
 

imagoon


Basically. You would practically need a deity to control the timing, not once but 4-5 times, at the exact moment the retries were happening, to make that last server 'save you.' Now, I would put this out there: if you reconfigured the old server to act as a third-party 'smart host', where it would receive email while all of Google was down and then send it on to Google when Google came back online, I could see that being useful (albeit rarely). This design is typically used by companies not as diverse and redundant as Google, so I would question the need. I.e., the 5 servers Google provides are basically doing this for you already.
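
For anyone curious what that 'smart host' behaviour amounts to, here is a toy sketch of the store-and-forward idea. It is not a real mail relay: `BackupRelay`, `primary_up`, and `deliver` are hypothetical names, messages are plain strings, and a real setup would be done in the mail server's own relay configuration.

```python
from collections import deque
from typing import Callable, Deque, List


class BackupRelay:
    """Toy store-and-forward relay: hold mail while the primary is down."""

    def __init__(self, primary_up: Callable[[], bool],
                 deliver: Callable[[str], None]) -> None:
        self.primary_up = primary_up  # hypothetical health check
        self.deliver = deliver        # hypothetical hand-off to the primary
        self.queue: Deque[str] = deque()

    def accept(self, message: str) -> None:
        """Always accept the message, then try to pass queued mail on."""
        self.queue.append(message)
        self.flush()

    def flush(self) -> int:
        """Forward queued mail in arrival order while the primary is up.

        Returns the number of messages forwarded.
        """
        sent = 0
        while self.queue and self.primary_up():
            self.deliver(self.queue.popleft())
            sent += 1
        return sent


state = {"up": False}          # primary is down to begin with
delivered: List[str] = []
relay = BackupRelay(lambda: state["up"], delivered.append)

relay.accept("msg 1")
relay.accept("msg 2")
held = len(relay.queue)        # both messages are held, none delivered

state["up"] = True             # primary comes back
forwarded = relay.flush()      # queued mail is passed on in order
print(held, forwarded, delivered)
```

The difference from the setup in the original question is that nothing stays on the backup box: mail is only parked there until the primary can take it.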