Filer controller down and auto failover issue


Filer controller down and auto failover issue

Sayla, Mustafa

Last night we had an issue with our production filer: controller A went down, but controller B did not detect that A was down and did not take over. Controller B took over only after I manually powered off controller A. I then brought controller A back up and did the giveback, and the filer worked fine for about 3 hours before the same thing happened again. This time I shut down controller A and left it down, so right now we are running on controller B only. Per NetApp, there is nothing in the logs I sent them or in AutoSupport to suggest a root cause. They are asking us to bring controller A up again, wait for the problem to recur, and collect a core dump if it does. This option is not acceptable to us because it affects production. Has anyone seen this before, or does anyone have any ideas?

 

Filer failover settings are all correct. We are running clustered Data ONTAP (cDOT) 8.3.1 on a FAS8040.

 

Mustafa Sayla

 

Visit us on the Web at mesirowfinancial.com

This communication may contain privileged and/or confidential information. It is intended solely for the use of the addressee. If you are not the intended recipient, you are strictly prohibited from disclosing, copying, distributing or using any of this information. If you received this communication in error, please contact the sender immediately and destroy the material in its entirety, whether electronic or hard copy. Confidential, proprietary or time-sensitive communications should not be transmitted via the Internet, as there can be no assurance of actual or timely delivery, receipt and/or confidentiality. This is not an offer, or solicitation of any offer to buy or sell any security, investment or other product.
_______________________________________________
Toasters mailing list
[hidden email]
http://www.teaparty.net/mailman/listinfo/toasters

Re: Filer controller down and auto failover issue

andrei.borzenkov@ts.fujitsu.com
How did you determine that controller B was down? The fact that the partner did not notice implies that Data ONTAP was running, at least enough to provide a heartbeat. You were also able to shut it down, which again implies it was up. This makes me wonder whether the issue is with client access (network, SAN).
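
One way to separate "node is down" from "clients cannot reach the node" is to probe the management path and the data path independently from a client host. The following is only a minimal sketch in Python, not a NetApp tool: the function names `tcp_reachable` and `classify` are made up for illustration, the hint strings are a rough heuristic rather than a diagnosis, and the addresses and ports you pass in are your own assumptions about where the management and data interfaces listen.

```python
import socket


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def classify(mgmt_ok, data_ok):
    """Map two probe results to a rough hint (heuristic only, hypothetical wording)."""
    if mgmt_ok and data_ok:
        return "node reachable"
    if mgmt_ok and not data_ok:
        return "management up, data path down: suspect network/SAN or data interfaces"
    if not mgmt_ok and data_ok:
        return "data path up, management down: suspect management network"
    return "nothing reachable: suspect a hung node or a wider network outage"
```

For example, `classify(tcp_reachable("192.0.2.10", 22), tcp_reachable("192.0.2.20", 2049))` would probe SSH on a management address and NFS on a data address; both addresses here are placeholders from the documentation range, not values from this thread. A result of "management up, data path down" while the partner still sees a heartbeat would point toward client access rather than a controller failure.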

Sent from iPhone

On June 28, 2016, at 17:36, Sayla, Mustafa <[hidden email]> wrote:

Last night we had an issue with our production filer: controller A went down, but controller B did not detect that A was down and did not take over. Controller B took over only after I manually powered off controller A. I then brought controller A back up and did the giveback, and the filer worked fine for about 3 hours before the same thing happened again. This time I shut down controller A and left it down, so right now we are running on controller B only. Per NetApp, there is nothing in the logs I sent them or in AutoSupport to suggest a root cause. They are asking us to bring controller A up again, wait for the problem to recur, and collect a core dump if it does. This option is not acceptable to us because it affects production. Has anyone seen this before, or does anyone have any ideas?

 

Filer failover settings are all correct. We are running CDOT 8.3.1 on FAS8040.

 


RE: Filer controller down and auto failover issue

Sayla, Mustafa

Yes, controller B was powered up, but when I ran node show it did not show any information for node B. Also, when I logged in through the service processor, I could not get to the system console.

 

Mustafa Sayla

 

From: [hidden email] [mailto:[hidden email]]
Sent: Tuesday, June 28, 2016 10:46 AM
To: Sayla, Mustafa
Cc: [hidden email]
Subject: Re: Filer controller down and auto failover issue

 

How did you determine that controller B was down? The fact that the partner did not notice implies that Data ONTAP was running, at least enough to provide a heartbeat. You were also able to shut it down, which again implies it was up. This makes me wonder whether the issue is with client access (network, SAN).
