A computer components & hardware forum. HardwareBanter


ILM and Full Text Search



 
 
  #11  
Old February 4th 07, 04:44 AM posted to comp.arch.storage
Faeandar
external usenet poster
 
Posts: 191
Default ILM and Full Text Search

On Sat, 03 Feb 2007 08:06:42 -0500, Nik Simpson
wrote:

Faeandar wrote:



Yes, they could do that, but then so could every other competitor; NDMP
is available to anybody, not just Index Engines. EMC does something
similar, though probably proprietary, with its classification product,
which gets a "dump" of metadata from Celerra file servers rather than walking
the file system over the network.


Any/every other product could but, so far as I've seen, do not. That
one bit is intriguing enough to me to look at them.


I may have been asking far too open ended a question. My needs are
fairly simple; tell me what, where, how big, how frequently accessed,
what type of file, etc. I've no need for a deep dive of content.


Index Engines wouldn't be a solution then, since to the best of my
knowledge it's all about content indexing & search. However, both
Scentric and Kazeon can do what you want without having to generate a
content index.


We have Kazeon on eval and so far I can't say I'm impressed. It's
quite slow. Getting data on an entire filer would take many weeks
based on performance tests. It took 4 days to run a single qtree on a
filer.



I'm looking for typical SRM stats, but on a fair scale.


So you don't actually want to take any actions like migrating little
used stuff to tier2?


That is correct. No automated migrations or anything. I want
information that my staff and I can base decisions on, but our
needs are not simple enough for policy-based file migration.

Anyway, both Scentric and Kazeon offer extensive
SRM reporting, though if reporting is all you want, you might want to
take a look at Monosphere which has a pure file SRM solution. How big is
a "fair scale" to you, 10s, 100s, 1000s of TB?


I thought Monosphere was more of a trending and analysis tool? Not
file level reporting. We are slated to eval them for a different
purpose but I'll keep them in mind for this as well.
Fair scale would be 100's of TB.

~F
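
For readers wondering what the metadata-only report asked for above (what, where, how big, how recently accessed, what type of file) might look like, here is a minimal sketch. It is not how Kazeon, Scentric, or Monosphere work; it is a plain file system walk using Python's standard library. The function name summarize_tree and the grouping by file extension are assumptions made for illustration, and access times are only as reliable as the mount options allow (a noatime mount will not update them).

```python
import os
from collections import defaultdict


def summarize_tree(root):
    """Walk `root` and aggregate, per file extension, the file count,
    total bytes, and most recent access time (seconds since epoch).

    Only metadata is read; file contents are never opened.
    """
    stats = defaultdict(lambda: {"files": 0, "bytes": 0, "last_access": 0.0})
    stack = [root]  # explicit stack avoids recursion limits on deep trees
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    try:
                        if entry.is_dir(follow_symlinks=False):
                            stack.append(entry.path)
                        elif entry.is_file(follow_symlinks=False):
                            info = entry.stat(follow_symlinks=False)
                            ext = os.path.splitext(entry.name)[1].lower() or "(none)"
                            bucket = stats[ext]
                            bucket["files"] += 1
                            bucket["bytes"] += info.st_size
                            bucket["last_access"] = max(
                                bucket["last_access"], info.st_atime
                            )
                    except OSError:
                        continue  # skip entries that vanish or deny access
        except OSError:
            continue  # skip unreadable directories
    return dict(stats)
```

Calling summarize_tree on a mounted share returns a dictionary keyed by extension, which can be sorted by "bytes" to see which file types consume the most space. At hundreds of TB a single-threaded walk like this would likely be too slow on its own; it only illustrates the shape of the data.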
  #12  
Old February 5th 07, 11:52 PM posted to comp.arch.storage
[email protected]
external usenet poster
 
Posts: 18
Default ILM and Full Text Search

On Feb 3, 7:44 pm, Faeandar wrote:
On Sat, 03 Feb 2007 08:06:42 -0500, Nik Simpson

wrote:
Faeandar wrote:


Yes, they could do that, but then so could every other competitor; NDMP
is available to anybody, not just Index Engines. EMC does something
similar, though probably proprietary, with its classification product,
which gets a "dump" of metadata from Celerra file servers rather than walking
the file system over the network.


Any/every other product could but, so far as I've seen, do not. That
one bit is intriguing enough to me to look at them.



I may have been asking far too open ended a question. My needs are
fairly simple; tell me what, where, how big, how frequently accessed,
what type of file, etc. I've no need for a deep dive of content.


Index Engines wouldn't be a solution then, since to the best of my
knowledge it's all about content indexing & search. However, both
Scentric and Kazeon can do what you want without having to generate a
content index.


We have Kazeon on eval and so far I can't say I'm impressed. It's
quite slow. Getting data on an entire filer would take many weeks
based on performance tests. It took 4 days to run a single qtree on a
filer.


Is this the time it takes Kazeon to crawl the filer? How much data is on that
filer, and how many files is that data in?
Is it crawling the filer via the FPolicy link or via an NFS link?




I'm looking for typical SRM stats, but on a fair scale.


So you don't actually want to take any actions like migrating little
used stuff to tier2?


That is correct. No automated migrations or anything. I want
information that my staff and I can base decisions on, but our
needs are not simple enough for policy-based file migration.

Anyway, both Scentric and Kazeon offer extensive
SRM reporting, though if reporting is all you want, you might want to
take a look at Monosphere which has a pure file SRM solution. How big is
a "fair scale" to you, 10s, 100s, 1000s of TB?


I thought Monosphere was more of a trending and analysis tool? Not
file level reporting. We are slated to eval them for a different
purpose but I'll keep them in mind for this as well.
Fair scale would be 100's of TB.

~F



  #13  
Old February 6th 07, 12:45 AM posted to comp.arch.storage
bcwalrus
external usenet poster
 
Posts: 2
Default ILM and Full Text Search

On Feb 3, 7:44 pm, Faeandar wrote:
...

I thought Monosphere was more of a trending and analysis tool? Not
file level reporting. We are slated to eval them for a different
purpose but I'll keep them in mind for this as well.
Fair scale would be 100's of TB.

~F


If the number of files is around tens of millions, then this
Fileyzer tool seems to do what you mentioned:
http://neopathnetworks.com/products/fileyzer.aspx
(Trial download)

It's purely for analysis; no data placement. It's light and
fast. The GUI is neat, too.

Cheers,
bc

  #14  
Old February 6th 07, 03:32 AM posted to comp.arch.storage
Faeandar
external usenet poster
 
Posts: 191
Default ILM and Full Text Search

On 5 Feb 2007 15:45:22 -0800, "bcwalrus" wrote:

On Feb 3, 7:44 pm, Faeandar wrote:
...

I thought Monosphere was more of a trending and analysis tool? Not
file level reporting. We are slated to eval them for a different
purpose but I'll keep them in mind for this as well.
Fair scale would be 100's of TB.

~F


If the number of files is around tens of millions, then this
Fileyzer tool seems to do what you mentioned:
http://neopathnetworks.com/products/fileyzer.aspx
(Trial download)

It's purely for analysis; no data placement. It's light and
fast. The GUI is neat, too.

Cheers,
bc



I will check this out and post back my findings. Thanks.

~F
  #15  
Old February 6th 07, 04:00 AM posted to comp.arch.storage
Faeandar
external usenet poster
 
Posts: 191
Default ILM and Full Text Search

On 5 Feb 2007 15:45:22 -0800, "bcwalrus" wrote:

On Feb 3, 7:44 pm, Faeandar wrote:
...

I thought Monosphere was more of a trending and analysis tool? Not
file level reporting. We are slated to eval them for a different
purpose but I'll keep them in mind for this as well.
Fair scale would be 100's of TB.

~F


If the number of files is around tens of millions, then this
Fileyzer tool seems to do what you mentioned:
http://neopathnetworks.com/products/fileyzer.aspx
(Trial download)

It's purely for analysis; no data placement. It's light and
fast. The GUI is neat, too.

Cheers,
bc


I downloaded it and two things I notice make it less than helpful.

1) the trial version won't analyze network drives (That's where the
problems are !!!)

2) it seems to only analyze the C drive on my windows box and
disregards any selection criteria I give it. It's the same view no
matter what my criteria are.

I may talk to NeoPath and get a full-fledged eval because the concept
is interesting. But this download, though I appreciate the effort and
thought, proved to be useless.

~F
  #16  
Old February 6th 07, 04:22 PM posted to comp.arch.storage
[email protected]
external usenet poster
 
Posts: 1
Default ILM and Full Text Search

On Feb 1, 8:54 pm, Nik Simpson wrote:
Faeandar wrote:

So, since we have two people from companies in this space I'd like to
pose the competitive question:


What are your thoughts on Index Engines?


First, right now I would not see Index Engines as a direct competitor,
they are purely a search application and don't offer much in the way of
classification or policy-based data management which is needed for ILM.

Second, for enterprise-wide search the problem is that when I'm looking
for document X, I'd rather find it on disk than buried on a backup tape.
If I can't find it online, then I'd go backup tape. So other than as an
application for helping me keep better track of what I've backed up I
don't see much of a future for it.

Interesting technology that I suspect will get embedded in things like
VTLs and D2D disk backup appliances. I don't see it as a standalone
technology. Good acquisition candidate for somebody in that space.

--
Nik Simpson


Nik:
I am with Index Engines - and want to update your post above. We
initially entered the market with a search capability, however we have
since added reporting and classification solutions. We are seeing
strong traction in the data classification space as we are the only
vendor that can provide full knowledge of data at the scale required
for enterprise wide engagements. Many of the ILM/classification
vendors have used open source indexing solutions - which do not scale
to millions/billions of files and email. We have architected a
purpose built indexing solution that provides comprehensive insight
into all enterprise data assets. A logical fit for anyone looking
into ILM or classification solutions.

Additionally, we can ingest data from a SAN, LAN or directly from
tape. Our architecture is designed to understand storage protocols -
so plugging us into any of these environments will allow us to ingest
data.

Hope this clarifies how we fit in the market and differentiates us
from the others.

Jim McGann
www.indexengines.com

  #17  
Old February 6th 07, 06:58 PM posted to comp.arch.storage
[email protected]
external usenet poster
 
Posts: 18
Default ILM and Full Text Search

On Feb 5, 7:00 pm, Faeandar wrote:
On 5 Feb 2007 15:45:22 -0800, "bcwalrus" wrote:



On Feb 3, 7:44 pm, Faeandar wrote:
...


I thought Monosphere was more of a trending and analysis tool? Not
file level reporting. We are slated to eval them for a different
purpose but I'll keep them in mind for this as well.
Fair scale would be 100's of TB.


~F


If the number of files is around tens of millions, then this
Fileyzer tool seems to do what you mentioned:
http://neopathnetworks.com/products/fileyzer.aspx
(Trial download)


It's purely for analysis; no data placement. It's light and
fast. The GUI is neat, too.


Cheers,
bc


I downloaded it and two things I notice make it less than helpful.

1) the trial version won't analyze network drives (That's where the
problems are !!!)

2) it seems to only analyze the C drive on my windows box and
disregards any selection criteria I give it. It's the same view no
matter what my criteria are.

I may talk to NeoPath and get a full-fledged eval because the concept
is interesting. But this download, though I appreciate the effort and
thought, proved to be useless.

~F


Faeandar, just out of curiosity, how does Kazeon crawl the
filer? Does it do it via NFS or via NetApp's FPolicy API? And how
much data is on your filer that it took so many days to analyze (how
many TB and how many files)?


Dvy

  #19  
Old February 6th 07, 08:22 PM posted to comp.arch.storage
[email protected]
external usenet poster
 
Posts: 18
Default ILM and Full Text Search

On Feb 6, 11:11 am, Faeandar wrote:
On 6 Feb 2007 09:58:53 -0800, wrote:



Faeandar, just out of curiosity, how does Kazeon crawl the
filer? Does it do it via NFS or via NetApp's FPolicy API? And how
much data is on your filer that it took so many days to analyze (how
many TB and how many files)?


Dvy


It does not use FPolicy for crawls, though I hear it does do
migrations now, so I assume it talks to FPolicy in some fashion?

The amount of data on the filer does not seem to make a difference.
Some qtrees have 10's of millions, others have millions, even others
had 100's of thousands. In all cases it traversed at about 16 objects
per sec. Not good.

~F


16 files per second via NFS seems very bad... One should easily be
able to findfirst/findnext via NFS to get metadata much faster than
16 files per sec... I wonder what is holding them up...
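
To put the 16 objects per second figure in perspective, here is a rough back-of-the-envelope sketch. The rate and the 4-day qtree crawl come from the posts above; the function names and the specific object counts (10 million, 1 million, 100 thousand) are illustrative assumptions standing in for "10's of millions", "millions", and "100's of thousands".

```python
SECONDS_PER_DAY = 86_400


def crawl_days(object_count, objects_per_second):
    """Days needed to traverse `object_count` objects at a steady rate."""
    return object_count / objects_per_second / SECONDS_PER_DAY


def objects_crawled(days, objects_per_second):
    """Objects traversed in `days` at a steady rate."""
    return int(days * SECONDS_PER_DAY * objects_per_second)


RATE = 16  # objects per second, as reported for the Kazeon eval

# Illustrative qtree sizes, not measured values.
for label, count in [
    ("10 million", 10_000_000),
    ("1 million", 1_000_000),
    ("100 thousand", 100_000),
]:
    print(f"{label:>12} objects: {crawl_days(count, RATE):6.2f} days")

# Working backwards from the reported 4-day crawl of a single qtree.
print(f"4 days at {RATE}/s covers {objects_crawled(4, RATE):,} objects")
```

At that rate a qtree of 10 million objects takes a little over 7 days, and a 4-day crawl corresponds to roughly 5.5 million objects, which is consistent with the "many weeks" estimate for an entire filer.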

  #20  
Old February 6th 07, 11:37 PM posted to comp.arch.storage
Nik Simpson
external usenet poster
 
Posts: 73
Default ILM and Full Text Search

wrote:

Hope this clarifies how we fit in the market and differentiates us
from the others.

Jim McGann
www.indexengines.com


Thanks Jim, useful information, though I suspect some in the market
would disagree with your differentiators :-)

--
Nik Simpson
 



