RAID Reconstruction after ONTAP upgrade?

RAID Reconstruction after ONTAP upgrade?

Alexander Griesser-2

Hi,

 

Has anyone ever seen a RAID reconstruction start immediately after an ONTAP upgrade?

I just upgraded one of my older filers to 8.3.1P2 and it is now running a reconstruction on one of its aggregates for (at least to me) no obvious reason.

 

While this controller was booting after the upgrade, I saw the following message on the console, which did not show up on the second controller:

 

Creating trace file /etc/log/rastrace/RAID_0_20170402_17:18:00:095890.dmp

 

No disks show as broken, or in maintenance mode, or anything like that – so any hints would be welcome.

 

Here’s the output of `aggr status -r` on this controller:

 

      RAID Disk Device          HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
      --------- ------          ------------- ---- ---- ---- ----- --------------    --------------
      dparity   0b.22.23        0b    22  23  SA:A   0  BSAS  7200 1695466/3472315904 1695759/3472914816
      parity    0b.22.4         0b    22  4   SA:A   0  BSAS  7200 1695466/3472315904 1695759/3472914816
      data      2a.21.5         2a    21  5   SA:B   0  BSAS  7200 1695466/3472315904 1695759/3472914816
      data      0b.22.5         0b    22  5   SA:A   0  BSAS  7200 1695466/3472315904 1695759/3472914816
      data      2a.21.6         2a    21  6   SA:B   0  BSAS  7200 1695466/3472315904 1695759/3472914816
      data      0b.22.6         0b    22  6   SA:A   0  BSAS  7200 1695466/3472315904 1695759/3472914816
      data      2a.21.7         2a    21  7   SA:B   0  BSAS  7200 1695466/3472315904 1695759/3472914816
      data      0b.22.7         0b    22  7   SA:A   0  BSAS  7200 1695466/3472315904 1695759/3472914816 (reconstruction 3% completed)
      data      2a.21.8         2a    21  8   SA:B   0  BSAS  7200 1695466/3472315904 1695759/3472914816
      data      0b.22.8         0b    22  8   SA:A   0  BSAS  7200 1695466/3472315904 1695759/3472914816
      data      2a.21.9         2a    21  9   SA:B   0  BSAS  7200 1695466/3472315904 1695759/3472914816
      data      0b.22.9         0b    22  9   SA:A   0  BSAS  7200 1695466/3472315904 1695759/3472914816
      data      2a.21.10        2a    21  10  SA:B   0  BSAS  7200 1695466/3472315904 1695759/3472914816
      data      0b.22.10        0b    22  10  SA:A   0  BSAS  7200 1695466/3472315904 1695759/3472914816
      data      2a.21.11        2a    21  11  SA:B   0  BSAS  7200 1695466/3472315904 1695759/3472914816
      data      0b.22.11        0b    22  11  SA:A   0  BSAS  7200 1695466/3472315904 1695759/3472914816
      data      2a.21.12        2a    21  12  SA:B   0  BSAS  7200 1695466/3472315904 1695759/3472914816
      data      0b.22.12        0b    22  12  SA:A   0  BSAS  7200 1695466/3472315904 1695759/3472914816
      data      2a.21.13        2a    21  13  SA:B   0  BSAS  7200 1695466/3472315904 1695759/3472914816
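
For anyone who would rather not eyeball a listing like this, here is a minimal Python sketch that scans saved `aggr status -r` output for disks flagged as reconstructing. It assumes the row layout shown above (RAID role first, device name second) and the marker text `(reconstruction N% completed)`. The function name `reconstructing_disks` is hypothetical and not part of any NetApp tool.

```python
import re

# Three rows copied from the listing above; a real run would read the full saved output.
SAMPLE = """\
      dparity   0b.22.23        0b    22  23  SA:A   0  BSAS  7200 1695466/3472315904 1695759/3472914816
      data      0b.22.7         0b    22  7   SA:A   0  BSAS  7200 1695466/3472315904 1695759/3472914816 (reconstruction 3% completed)
      data      2a.21.8         2a    21  8   SA:B   0  BSAS  7200 1695466/3472315904 1695759/3472914816
"""

# Assumption: the progress marker always has this exact wording.
RECON = re.compile(r"\(reconstruction\s+(\d+)%\s+completed\)")

# Assumption: these are the only RAID roles that appear in the first column.
ROLES = ("dparity", "parity", "data")


def reconstructing_disks(text):
    """Return (device, role, percent) for every disk row that is reconstructing."""
    found = []
    for line in text.splitlines():
        fields = line.split()
        # Skip headers, separators, and blank lines.
        if len(fields) < 2 or fields[0] not in ROLES:
            continue
        match = RECON.search(line)
        if match:
            found.append((fields[1], fields[0], int(match.group(1))))
    return found


if __name__ == "__main__":
    for device, role, percent in reconstructing_disks(SAMPLE):
        print(f"{device} ({role}) is reconstructing: {percent}% done")
```

Running it repeatedly against fresh output gives a rough view of how fast the reconstruction is progressing.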

 

Thanks,

 

Alexander Griesser

Head of Systems Operations

 

ANEXIA Internetdienstleistungs GmbH

 

E-Mail: [hidden email]

Web: http://www.anexia-it.com

 

Headquarters address, Klagenfurt: Feldkirchnerstraße 140, 9020 Klagenfurt

Managing Director: Alexander Windbichler

Commercial register: FN 289918a | Place of jurisdiction: Klagenfurt | VAT ID: AT U63216601

 


_______________________________________________
Toasters mailing list
[hidden email]
http://www.teaparty.net/mailman/listinfo/toasters

RE: RAID Reconstruction after ONTAP upgrade?

mrlizard23

Hi Alexander,

 

We have seen this on 7-Mode (FAS3250) following a cf takeover and giveback.

We got the same output as you, and no disks failed before or after the procedure.

 

Kind Regards,

Chris.

 


RE: RAID Reconstruction after ONTAP upgrade?

Alexander Griesser-2

Did you ever find out the reason for this?

 

Alexander Griesser


RE: RAID Reconstruction after ONTAP upgrade?

mrlizard23

NetApp told us to upgrade the disk firmware and allow an aggregate scrub to complete. When we looked at `aggr scrub status -v`, some of the scrubs hadn’t completed for over a year, and others had never completed!

 

As an aside, we have another issue with the same system: a disk reservation error and missing RAID group child objects, which are also preventing a graceful takeover. This is a rare bug that requires an ONTAP upgrade, but as we cannot take over gracefully, we are awaiting an outage window to perform a disruptive upgrade (DU).

 

Kind Regards,

Chris.

 


RE: RAID Reconstruction after ONTAP upgrade?

Alexander Griesser-2

I ran shelf, ACP, and disk firmware upgrades prior to the ONTAP upgrade, and I also installed the new disk qualification package as recommended.

Here’s the output of my scrub status for the affected aggregate:

 

::> aggr scrub -action status -aggregate blabla_data

 

Raid Group:/blabla_data/plex0/rg0, Is Suspended:false, Last Scrub:Sun Apr  2 04:24:20 2017
Raid Group:/blabla_data/plex0/rg1, Is Suspended:true, Last Scrub:Sun Mar 19 06:24:32 2017, Percentage Completed:38%
Raid Group:/blabla_data/plex0/rg2, Is Suspended:true, Percentage Completed:40%
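
In case it helps anyone checking many aggregates, here is a rough Python sketch that turns scrub status output like the above into records, so suspended or never-completed scrubs stand out. It assumes the `Key:Value` fields are comma-separated as shown, that a line starting with a comma is a wrapped continuation of the previous raid group, and that dates are in English in the format seen here. `parse_scrub_status` is a hypothetical helper, not an ONTAP API.

```python
from datetime import datetime

# Output as pasted above, including the wrapped continuation line for rg1.
SAMPLE = """\
Raid Group:/blabla_data/plex0/rg0, Is Suspended:false, Last Scrub:Sun Apr  2 04:24:20 2017
Raid Group:/blabla_data/plex0/rg1, Is Suspended:true, Last Scrub:Sun Mar 19 06:24:32 2017
, Percentage Completed:38%
Raid Group:/blabla_data/plex0/rg2, Is Suspended:true, Percentage Completed:40%
"""


def parse_scrub_status(text):
    """Parse scrub status lines into one dict per raid group."""
    records = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(",") and records:
            # Assumption: a leading comma means the previous record was wrapped.
            records[-1] += line
        elif line.startswith("Raid Group:"):
            records.append(line)

    groups = []
    for record in records:
        # Split "Key:Value" on the first colon only, since timestamps contain colons.
        fields = dict(
            part.strip().split(":", 1) for part in record.split(",") if part.strip()
        )
        last = fields.get("Last Scrub")
        percent = fields.get("Percentage Completed")
        groups.append({
            "raid_group": fields["Raid Group"],
            "suspended": fields.get("Is Suspended") == "true",
            # None means no completed scrub was reported for this raid group.
            "last_scrub": datetime.strptime(last, "%a %b %d %H:%M:%S %Y") if last else None,
            "percent": int(percent.rstrip("%")) if percent else None,
        })
    return groups


if __name__ == "__main__":
    for group in parse_scrub_status(SAMPLE):
        print(group)
```

A raid group with `last_scrub` set to `None` is the case worth watching, since no completed scrub is reported for it.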

 

So I guess I’ll leave that running for some time before I try another takeover.

 

How did you check for the disk reservation and missing raid group child object errors? Does `cf status` on this system tell you that a takeover is not possible due to this issue or does it only tell you when you try to run a takeover?

 

Best,

 

Alexander Griesser


RE: RAID Reconstruction after ONTAP upgrade?

mrlizard23

Hi Alexander,

 

No, unfortunately cf status does not show this issue.

 

It only appears when a takeover is attempted (we have tried this three times now, with the same results).

Errors:

00000016.0000e424 01ca8287 Sat Mar 18 2017 08:45:50 +00:00 [disk.reserveFailed:error] Disk reservation failed on 3a.25.1 CDB 0x5f:0001 - SCSI:illegal request (5 55 4)
00000016.0000e425 01ca8287 Sat Mar 18 2017 08:45:50 +00:00 [disk.reserveFailed:error] Disk reservation failed on 3a.25.5 CDB 0x5f:0001 - SCSI:illegal request (5 55 4)
00000016.0000e426 01ca8287 Sat Mar 18 2017 08:45:50 +00:00 [disk.reserveFailed:error] Disk reservation failed on 3a.25.3 CDB 0x5f:0001 - SCSI:illegal request (5 55 4)
00000016.0000e427 01ca8287 Sat Mar 18 2017 08:45:50 +00:00 [disk.reserveFailed:error] Disk reservation failed on 3a.25.2 CDB 0x5f:0001 - SCSI:illegal request (5 55 4)
Sat Mar 18 08:46:00 GMT [KEN:raid.assim.rg.missingChild:error]: Aggregate partner:aggr0, rgobj_verify: RAID object 0 has only 7 valid children, expected 17.
Sat Mar 18 08:46:00 GMT [KEN:raid.assim.rg.missingChild:error]: Aggregate partner:aggr0, rgobj_verify: RAID object 2 has only 5 valid children, expected 17.
Sat Mar 18 08:46:00 GMT [KEN:raid.assim.rg.missingChild:error]: Aggregate partner:aggr0, rgobj_verify: RAID object 1 has only 12 valid children, expected 17.
Sat Mar 18 08:46:00 GMT [KEN:raid.assim.rg.missingChild:error]: Aggregate partner:aggr0, rgobj_verify: RAID object 3 has only 8 valid children, expected 17.
Sat Mar 18 08:46:15 GMT [BARBIE:ha.takeoverImpNotDef:warning]: Takeover of the partner node is impossible due to reason waiting for partner to recover.
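
If you want to check your own logs for the same pattern after a takeover attempt, a small Python sketch along these lines could summarize the two error types. It assumes the message wording is exactly as in the lines above. The regular expressions and the function name `summarize_takeover_errors` are illustrative assumptions, not anything provided by NetApp.

```python
import re

# A subset of the log lines quoted above.
SAMPLE = """\
00000016.0000e424 01ca8287 Sat Mar 18 2017 08:45:50 +00:00 [disk.reserveFailed:error] Disk reservation failed on 3a.25.1 CDB 0x5f:0001 - SCSI:illegal request (5 55 4)
00000016.0000e425 01ca8287 Sat Mar 18 2017 08:45:50 +00:00 [disk.reserveFailed:error] Disk reservation failed on 3a.25.5 CDB 0x5f:0001 - SCSI:illegal request (5 55 4)
Sat Mar 18 08:46:00 GMT [KEN:raid.assim.rg.missingChild:error]: Aggregate partner:aggr0, rgobj_verify: RAID object 0 has only 7 valid children, expected 17.
Sat Mar 18 08:46:00 GMT [KEN:raid.assim.rg.missingChild:error]: Aggregate partner:aggr0, rgobj_verify: RAID object 2 has only 5 valid children, expected 17.
Sat Mar 18 08:46:00 GMT [KEN:raid.assim.rg.missingChild:error]: Aggregate partner:aggr0, rgobj_verify: RAID object 1 has only 12 valid children, expected 17.
Sat Mar 18 08:46:00 GMT [KEN:raid.assim.rg.missingChild:error]: Aggregate partner:aggr0, rgobj_verify: RAID object 3 has only 8 valid children, expected 17.
"""

# Assumption: message wording matches the quoted log lines exactly.
RESERVE = re.compile(r"\[disk\.reserveFailed:error\] Disk reservation failed on (\S+)")
MISSING = re.compile(
    r"\[\w+:raid\.assim\.rg\.missingChild:error\]: Aggregate ([^,]+), rgobj_verify: "
    r"RAID object (\d+) has only (\d+) valid children, expected (\d+)"
)


def summarize_takeover_errors(text):
    """Return (disks with failed reservations, missing-child count per RAID object)."""
    failed_disks = RESERVE.findall(text)
    missing = {}
    for aggregate, obj, valid, expected in MISSING.findall(text):
        # Missing children = expected count minus valid count.
        missing[(aggregate, int(obj))] = int(expected) - int(valid)
    return failed_disks, missing


if __name__ == "__main__":
    disks, missing = summarize_takeover_errors(SAMPLE)
    print("Reservation failed on:", ", ".join(disks))
    for (aggregate, obj), count in sorted(missing.items()):
        print(f"{aggregate} RAID object {obj}: {count} children missing")
```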

 

By default, aggr scrub is only configured to run for 10 hours every Sunday at 1 a.m.

Once the reconstruction had completed and the disk firmware had been upgraded, we changed this to run continuously so that the scrubs could complete.

 

Kind Regards,

Chris.

 
