SnapMirror vs. SnapVault for 70+ million files...

SnapMirror vs. SnapVault for 70+ million files...

Ray Van Dolson-3
We wanted to use SnapVault to protect a volume containing 70+ million
files (probably also around 30TB of data, though it de-dupes down to
less than 6TB).  However, it appears that with SnapVault a full file
scan is performed prior to the block-based replication, and that scan
can take around 24 hours.  I'm assuming it will do this on subsequent
differential vaults too; though the block transfer part should be much
shorter, we'll still need to wait for the file scan to complete.

As we'd like to "back up" this data at least once a day, would we be
better positioned by using SnapMirror?  My belief is that it does *not*
scan all of the files first and simply replicates changed blocks.

We'd need to keep more snapshots on the source storage to meet our
retention requirements (or maybe further replicate the volume on the
destination side?).
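
For reference, source-side retention in 7-Mode would just be a snap
sched change. A minimal sketch, with a hypothetical volume name and
counts:

    # keep 4 weekly, 14 nightly and 8 hourly snapshots
    # (the @list gives the hours at which the hourlies fire)
    snap sched bigvol 4 14 8@8,12,16,20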

Thanks,
Ray
_______________________________________________
Toasters mailing list
[hidden email]
http://www.teaparty.net/mailman/listinfo/toasters

Re: SnapMirror vs. SnapVault for 70+ million files...

Sebastian Goetze
In 7-Mode, SnapVault logically walks the whole filesystem to find
changed data; dedupe, for example, is not 'seen' at all by SnapVault.
SnapMirror, on the other hand, just looks at the blocks and doesn't
care how big or small the filesystem is.


In High File Count (HFC) situations (or with highly dedupeable data) I
always advise using SnapMirror, if at all possible.

It transfers the data deduped (and compressed, if the source is
compressed) and can also compress on the wire (don't do this if your
source data is already compressed...).
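
A minimal /etc/snapmirror.conf sketch with wire compression enabled;
the names and IPs are hypothetical, and as far as I recall network
compression requires a named connection entry:

    # named connection between source and destination (hypothetical IPs)
    conn_bigvol=multi(10.0.0.10,10.0.1.10)
    # nightly VSM update at 01:00, compressed on the wire
    conn_bigvol:bigvol  dstfiler:bigvol_mir  compression=enable  0 1 * *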


And yes, you could continue the replication sequence with SnapVault
(e.g. run locally on the secondary, keeping, say, only weeklies but
further back). This could offset the extra storage you might otherwise
need on the primary (e.g. if you don't do weeklies on the primary at
all at the moment).
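
A sketch of that cascade, run on the SnapMirror secondary; the volume
and qtree names are hypothetical, and it assumes the mirror-to-vault
cascade is licensed and supported on your release:

    # vault a qtree out of the mirrored volume into a local vault volume
    snapvault start -S dstfiler:/vol/bigvol_mir/users /vol/bigvol_vault/users
    # keep 8 weekly SnapVault snapshots, created on Sundays
    snapvault snap sched -x bigvol_vault sv_weekly 8@sun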


My 2c


Sebastian



On 8/22/2016 6:35 PM, Ray Van Dolson wrote:

> [quoted message trimmed]

Re: SnapMirror vs. SnapVault for 70+ million files...

Klise, Steve-2
About 5 years ago I replicated ~10M files (a 10TB volume) for a roaming profile repository via SnapMirror on 7-Mode.  Worked like a charm.  We would SnapMirror to another filer, and then do an NDMP dump of the secondary volume to a VTL.
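
For reference, a rough sketch of that cycle on the secondary; the
volume name and tape device are hypothetical, and in practice the
NDMP dump would be driven by the backup application:

    # pull the latest changes into the read-only mirror
    snapmirror update profiles_mir
    # level-0 dump of the mirrored volume to a (virtual) tape device
    dump 0uf rst0a /vol/profiles_mir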


On 8/22/16, 9:59 AM, Sebastian Goetze wrote:

    [quoted message trimmed]

RE: SnapMirror vs. SnapVault for 70+ million files...

Payne, Richard
In reply to this post by Sebastian Goetze
Just to be clear...when you say

"In High File Count (HFC) situations (or highly dedupeable data) I always
advise to use SnapMirror, if at all possible."

That is referring to Volume SnapMirror (VSM) in 7-Mode. Qtree SnapMirror (QSM) will have all of the issues of SnapVault as well.

--rdp
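
The difference shows up in the /etc/snapmirror.conf paths; a sketch
with hypothetical names:

    # VSM: whole-volume, block-based -- no per-file scan
    srcfiler:bigvol            dstfiler:bigvol_mir            -  0 1 * *
    # QSM: qtree paths, logical/file-based -- walks the tree like SnapVault
    srcfiler:/vol/bigvol/users dstfiler:/vol/bigvol_mir/users -  0 1 * *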

-----Original Message-----
From: Sebastian Goetze
Subject: Re: SnapMirror vs. SnapVault for 70+ million files...

[quoted message trimmed]

Re: SnapMirror vs. SnapVault for 70+ million files...

Sebastian Goetze
Absolutely, yes! I should have written VSM.

(Does anybody use QSM?)


Sebastian

On 8/22/2016 7:08 PM, Payne, Richard wrote:

> [quoted message trimmed]

Re: SnapMirror vs. SnapVault for 70+ million files...

Michael Bergman
In reply to this post by Klise, Steve-2
On 2016-08-22 19:06, Klise, Steve wrote:
> I used to do ~ 10m files (10TB vol) for a roaming profile repository via
> snapmirror about 5 years ago.  Worked like a charm and was 7-mode.  We
> would sm to another filer, and then ndmp dump of the 2ndary vol to a vtl.

I'm sure 10 M inodes in 10 TB worked A-OK for you. It did for us as
well, no problem. Multiply that by 10x, to 100 M inodes... *then* it's
a problem.  Then several volumes with >100 M inodes each, and it starts
to hurt. A lot.  A FlexVol (qtree or whatever) with 10 M inodes in it
isn't very much, and wasn't even 5 years ago.

Luckily we're rid of all this file-tree-walk c**p in cDOT.

All the bg scanners still in ONTAP that walk file trees, like the
redirect scanner which implicitly runs after you've done an aggr
reallocate, are just a PITA.... *sigh*  They literally can never
finish. I've tried, over a long period of time, and it's just
"#(¤%/%(¤.

I think there are probably some of these bg scanners still in Kahuna as
well, but I don't know for sure.  Redirect scanning was moved to Vol
Affinity in wafl_exempt though, in 8.3.1 (I think it was, correct me if
I'm wrong!)
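
For context, the trigger is the aggregate-level reallocate; a 7-Mode
sketch with a hypothetical aggregate name (the redirect scanner then
runs in the background on its own):

    # optimize free-space layout at the aggregate level
    reallocate start -A aggr1
    # watch the (seemingly endless) progress
    reallocate status -v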

Regards,
M

> On 8/22/16, 9:59 AM, Sebastian Goetze wrote:
>
> [quoted message trimmed]

RE: SnapMirror vs. SnapVault for 70+ million files...

Jordan Slingerland
In reply to this post by Klise, Steve-2
VSM will be much faster.  

-----Original Message-----
From: Klise, Steve
Subject: Re: SnapMirror vs. SnapVault for 70+ million files...

[quoted message trimmed]

Re: SnapMirror vs. SnapVault for 70+ million files...

Ray Van Dolson-3
In reply to this post by Sebastian Goetze
Not us!

Sounds like we'll plan to shift to VSM instead of SnapVault.

(That, and think about migrating to cDOT -- challenging for us, as we
have a lot of 7-Mode N series gear.)

Thanks for everyone's responses.

Ray

On Mon, Aug 22, 2016 at 07:47:29PM +0200, Sebastian Goetze wrote:

> [quoted message trimmed]

RE: SnapMirror vs. SnapVault for 70+ million files...

Payne, Richard
"Does anybody use QSM?"

Yes, we use a lot of it. Pre-8.3 cDOT, VSM mandated that the destination always run the same or a later ONTAP version, and we have lots of one-to-many relationships that cross the world... in some cases a filer may be the source of one relationship and the destination of others.

--rdp
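
That version flexibility matters for fan-out; a 7-Mode
/etc/snapmirror.conf sketch of a one-to-many layout, with hypothetical
filer names:

    # one source qtree fanned out to two destinations on different schedules
    emea01:/vol/proj/data  us01:/vol/proj_mir/data    -  0 2 * *
    emea01:/vol/proj/data  apac01:/vol/proj_mir/data  -  0 3 * *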

-----Original Message-----
From: Ray Van Dolson
Subject: Re: SnapMirror vs. SnapVault for 70+ million files...

[quoted message trimmed]