A computer components & hardware forum. HardwareBanter



ILM and Full Text Search



 
 
  #1  
Old January 30th 07, 11:01 PM posted to comp.arch.storage
[email protected]
external usenet poster
 
Posts: 3
Default ILM and Full Text Search

Hello,

I'm looking into various ILM products such as those from Kazeon, EMC,
NeoPath, etc. One question that comes up is how these products behave
when a client does a full-text search against a volume that contains
data that's been migrated away.

From what I understand, a file access causes many of these products to
bring the file back from a secondary tier. I know that some ILM APIs
allow for redirection, which would seemingly avoid this issue.
However, others do not have redirection. Wouldn't this mean that a
full-text search causes the entire set of data to be brought back onto
the primary tier? Doesn't this cause capacity issues?

What am I missing? Your help is greatly appreciated.

Thanks,
Ron

  #3  
Old February 1st 07, 07:28 PM posted to comp.arch.storage
[email protected]
external usenet poster
 
Posts: 18
Default ILM and Full Text Search

On Jan 30, 2:47 pm, Nik Simpson wrote:

Typically, a content search is performed against a content index, not
against the original file, so the search doesn't touch the file at all.
The file is read during the indexing process. If that occurs before
migration, then the file will not be hit after migration.
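
To illustrate the point above, here is a minimal Python sketch of a content index. It is a toy inverted index, not how any of the products named in this thread work internally; the class name, method names and file paths are made up for the example. The file text is read once in add(), and search() is answered from the index alone without opening any file.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class ContentIndex:
    """Toy inverted index: maps each word to the set of paths containing it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, path, text):
        # The file's content is read once, at indexing time.
        for word in tokenize(text):
            self.postings[word].add(path)

    def search(self, query):
        # Answered purely from the index; no file is opened here.
        words = tokenize(query)
        if not words:
            return set()
        result = set(self.postings.get(words[0], set()))
        for word in words[1:]:
            result &= self.postings.get(word, set())
        return result


# Hypothetical paths and contents, indexed before migration.
index = ContentIndex()
index.add("/vol/projects/plan.doc", "Quarterly storage migration plan")
index.add("/vol/projects/notes.txt", "Notes on tape backup and migration")
hits = index.search("migration plan")
```

If both files were later migrated to a secondary tier, the same search would still return its hits without recalling either file, because only the index is consulted.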

PS. If you're looking at this space, you should also take a look at Scentric
(FTR I work for Scentric, well, at least for another ten days :-)

--
Nik Simpson


What happens when someone opens Windows file explorer and performs a
search through its search tool? Won't it try to read all the files
off of the NAS and, to the OP's point, won't it cause all the files to be
moved from tier II to tier I again?

Dvy

  #5  
Old February 1st 07, 11:37 PM posted to comp.arch.storage
bcwalrus
external usenet poster
 
Posts: 2
Default ILM and Full Text Search


Not for the NeoPath FileDirector. It redirects access traffic to the
migration destination. If you access a file frequently enough, then
depending on how you set up the placement policy, data may be migrated
back to the primary tier. Or you can set up your policy not to do
that. In other words, data access and data placement policy are
independent.
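
As a rough illustration of access and placement being independent, here is a small Python sketch. It is a hypothetical model only, not FileDirector's actual design or API: the TieredStore class, the promote_after setting and the tier names are all assumptions made for the example. A read is always served from wherever the file currently lives, and a separate policy setting decides whether repeated reads ever move it back.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TieredStore:
    """Hypothetical two-tier store where reads are redirected, not recalled."""

    # Number of reads after which a migrated file is moved back to primary.
    # None means access never changes placement.
    promote_after: Optional[int] = None
    location: dict = field(default_factory=dict)  # path -> "primary" | "secondary"
    reads: dict = field(default_factory=dict)     # path -> reads since migration

    def add_file(self, path):
        self.location[path] = "primary"

    def migrate(self, path):
        self.location[path] = "secondary"
        self.reads[path] = 0

    def read(self, path):
        """Serve the read from the file's current tier, then apply the policy."""
        served_from = self.location[path]
        if served_from == "secondary":
            self.reads[path] += 1
            if self.promote_after is not None and self.reads[path] >= self.promote_after:
                self.location[path] = "primary"
        return served_from


# Policy A: frequent access eventually migrates the file back.
store = TieredStore(promote_after=3)
store.add_file("/vol/a.doc")
store.migrate("/vol/a.doc")
tiers_a = [store.read("/vol/a.doc") for _ in range(4)]

# Policy B: access never changes placement.
pinned = TieredStore(promote_after=None)
pinned.add_file("/vol/a.doc")
pinned.migrate("/vol/a.doc")
tiers_b = [pinned.read("/vol/a.doc") for _ in range(4)]
```

Under policy B, a full-text crawl would read every migrated file from the secondary tier and leave the primary tier's capacity untouched.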

(I happen to be the NFS guy at NeoPath.)

Cheers,
bc

  #6  
Old February 2nd 07, 12:01 AM posted to comp.arch.storage
Faeandar
external usenet poster
 
Posts: 191
Default ILM and Full Text Search


So, since we have two people from companies in this space I'd like to
pose the competitive question:

What are your thoughts on Index Engines?

Thanks.

~F
  #7  
Old February 2nd 07, 02:54 AM posted to comp.arch.storage
Nik Simpson
external usenet poster
 
Posts: 73
Default ILM and Full Text Search

Faeandar wrote:

So, since we have two people from companies in this space I'd like to
pose the competitive question:

What are your thoughts on Index Engines?



First, right now I would not see Index Engines as a direct competitor;
they are purely a search application and don't offer much in the way of
classification or policy-based data management, which is needed for ILM.

Second, for enterprise-wide search the problem is that when I'm looking
for document X, I'd rather find it on disk than buried on a backup tape.
If I can't find it online, then I'd go to backup tape. So other than as an
application for helping me keep better track of what I've backed up, I
don't see much of a future for it.

Interesting technology that I suspect will get embedded in things like
VTLs and D2D disk backup appliances. I don't see it as a standalone
technology. Good acquisition candidate for somebody in that space.

--
Nik Simpson
  #8  
Old February 2nd 07, 03:00 AM posted to comp.arch.storage
[email protected]
external usenet poster
 
Posts: 18
Default ILM and Full Text Search


Where does the Google Search Appliance fit into this?

Dvy

  #9  
Old February 2nd 07, 03:32 AM posted to comp.arch.storage
Faeandar
external usenet poster
 
Posts: 191
Default ILM and Full Text Search


They can get metadata directly from NDMP dumps. If someone figures
out how to flag the dump to only pass the metadata, then they will be
able to get an entire storage array's metadata in a matter of hours
instead of the days that file crawlers will take.
Even without the flag they still get data far faster than any file
crawler.

I may have been asking far too open-ended a question. My needs are
fairly simple; tell me what, where, how big, how frequently accessed,
what type of file, etc. I've no need for a deep dive of content.
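
For reference, the file-crawler approach to these stats can be sketched in a few lines of Python. This is only a minimal illustration of walking a tree and collecting per-file metadata (what, where, how big, last accessed, file type); the function names and the 30-day staleness cutoff are assumptions for the example, and it says nothing about how an NDMP-based collector would be built.

```python
import os
import time
from collections import Counter


def crawl_metadata(root):
    """Walk a directory tree and yield one metadata record per regular file."""
    stack = [root]
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    info = entry.stat(follow_symlinks=False)
                    yield {
                        "path": entry.path,
                        "size": info.st_size,
                        "atime": info.st_atime,
                        "type": os.path.splitext(entry.name)[1].lower() or "(none)",
                    }


def summarize(records, now=None, stale_days=30):
    """Aggregate file count and bytes per type, plus bytes not accessed recently."""
    now = time.time() if now is None else now
    count_by_type, bytes_by_type = Counter(), Counter()
    stale_bytes = 0
    for rec in records:
        count_by_type[rec["type"]] += 1
        bytes_by_type[rec["type"]] += rec["size"]
        if now - rec["atime"] > stale_days * 86400:
            stale_bytes += rec["size"]
    return count_by_type, bytes_by_type, stale_bytes


# Usage on a real tree would look like: summarize(crawl_metadata("/some/mount"))
```

The crawler touches every directory entry one at a time, which is why it scales poorly compared with receiving the metadata in bulk.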

I'm looking for typical SRM stats, but on a fair scale.

Hopefully this provides more to go on.

Thanks.

~F
  #10  
Old February 3rd 07, 02:06 PM posted to comp.arch.storage
Nik Simpson
external usenet poster
 
Posts: 73
Default ILM and Full Text Search

Faeandar wrote:


They can get metadata directly from NDMP dumps. If someone figures
out how to flag the dump to only pass the metadata then they will be
able to get an entire storage array's metadata in a matter of hours
instead of days that file crawlers will take.
Even without the flag they still get data far faster than any file
crawler.


Yes, they could do that, but then so could every other competitor. NDMP
is available to anybody, not just Index Engines. EMC does something
similar, though probably proprietary, with its classification product,
which gets a "dump" of metadata from Celerra file servers rather than walking
the file system over the network.

I may have been asking far too open ended a question. My needs are
fairly simple; tell me what, where, how big, how frequently accessed,
what type of file, etc. I've no need for a deep dive of content.


Index Engines wouldn't be a solution then, since to the best of my
knowledge it's all about content indexing & search. However, both
Scentric and Kazeon can do what you want without having to generate a
content index.


I'm looking for typical SRM stats, but on a fair scale.


So you don't actually want to take any actions, like migrating little-used
stuff to tier 2? Anyway, both Scentric and Kazeon offer extensive
SRM reporting, though if reporting is all you want, you might want to
take a look at Monosphere which has a pure file SRM solution. How big is
a "fair scale" to you, 10s, 100s, 1000s of TB?

If you do want to take actions, then a policy engine is something you
want to look at. I can't speak for Kazeon's policy engine, but Scentric
lets you build policies with classification rules. For example, "find all
OFFICE files larger than 50MB and not accessed in 30 days", which can be
combined with one or more actions that work on the results of the
filter. Actions include move, copy, delete, script, archive with
retention, etc.
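
The example rule above can be sketched in Python as a filter plus an action. This is a generic illustration, not Scentric's policy language or API: FileRecord, matches_rule and apply_policy are hypothetical names, the extension list is only a guess at what "OFFICE files" covers, and 50MB is taken here as 50 * 1024 * 1024 bytes.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath

# Assumed meaning of "OFFICE files"; a real product would define this itself.
OFFICE_EXTENSIONS = {".doc", ".docx", ".xls", ".xlsx", ".ppt", ".pptx"}
DAY = 86400  # seconds
MB = 1024 * 1024


@dataclass
class FileRecord:
    path: str
    size: int     # bytes
    atime: float  # last access time, seconds since the epoch


def matches_rule(rec, now, min_size=50 * MB, max_idle_days=30):
    """True for office files larger than min_size and idle longer than max_idle_days."""
    is_office = PurePosixPath(rec.path).suffix.lower() in OFFICE_EXTENSIONS
    is_large = rec.size > min_size
    is_idle = (now - rec.atime) > max_idle_days * DAY
    return is_office and is_large and is_idle


def apply_policy(records, action, now):
    """Run the action on every record the rule selects and return those records."""
    matched = [rec for rec in records if matches_rule(rec, now)]
    for rec in matched:
        action(rec)
    return matched


# Dry run: the "action" only collects what would be moved.
now = 100 * DAY
records = [
    FileRecord("/vol/a/big.DOC", 60 * MB, now - 45 * DAY),
    FileRecord("/vol/a/small.xls", 1 * MB, now - 45 * DAY),
    FileRecord("/vol/a/recent.ppt", 80 * MB, now - 2 * DAY),
    FileRecord("/vol/a/video.avi", 200 * MB, now - 45 * DAY),
]
planned_moves = []
matched = apply_policy(records, planned_moves.append, now)
```

Keeping the filter separate from the action means the same rule can drive a move, a copy, or a report just by swapping the callable.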

You can schedule these policies on a calendar or event trigger (i.e.
once a week, or when file system has less than 20% free), you can also
trigger them from external scripts.
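
An event trigger like the "less than 20% free" example could be approximated from an external script along these lines. This is a sketch using Python's standard shutil.disk_usage, not any vendor's trigger mechanism; the function names and the policy callable are placeholders.

```python
import shutil


def free_fraction(path):
    """Return the fraction of the file system holding `path` that is free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def should_trigger(free, threshold=0.20):
    """True when the free fraction has dropped below the threshold."""
    return free < threshold


def run_if_low_on_space(path, policy, threshold=0.20):
    """Run the policy callable only when free space is below the threshold."""
    if should_trigger(free_fraction(path), threshold):
        policy()
        return True
    return False
```

A calendar trigger would simply call the same policy from cron or a scheduled task instead of from this check.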



--
Nik Simpson
 



