Company-wide file scan. Need recommendations

OutHouse

Lifer
Jun 5, 2000
36,410
616
126
We have been tasked with doing a company-wide file scan looking for just about everything: PSTs, Office docs, engineering docs, voicemails, media files (videos, music, photos), and redundant copies of files.

We need to know what each file is, where it is, when it was last touched, and who owns it, and we need to be able to put all of that into a report.

We are talking about 5,000 servers/PCs/laptops/Buffalo drives across every state, all on separate subnets. What would you guys recommend for this?
 

Zargon

Lifer
Nov 3, 2009
12,218
2
76
*Technically*, if you have Windows Search 4.0 installed on all of them, you could use it, but I wouldn't dare try it.

I've used it to search about 20 machines, and it was time consuming.
 

rsutoratosu

Platinum Member
Feb 18, 2011
2,716
4
81
Ouch, that is not easy. Maybe someone else can chime in; I used auditing software on a server, but basically you're asking to get into local machines as well. I'll go look up what I used, but for a 200 GB drive with probably 300,000 text/PDF/Word files it generated something like a 2,000-page Excel file. Insane.
 

OutHouse

Lifer
Jun 5, 2000
36,410
616
126
Ouch, that is not easy. Maybe someone else can chime in; I used auditing software on a server, but basically you're asking to get into local machines as well. I'll go look up what I used, but for a 200 GB drive with probably 300,000 text/PDF/Word files it generated something like a 2,000-page Excel file. Insane.

Yeah, this project came down from the top. We are not too happy about it.
 

ImpulsE69

Lifer
Jan 8, 2010
14,946
1,077
126
Are you just scanning local system drives, or are they also looking at the NAS/SAN storage they are connected to?

Regardless of how you do it, you are talking a LONG time for even one machine. We do this regularly for CIFS shares prior to big migrations, and it's quite time consuming, and all we are doing is a volume scan. Then someone manually goes through the data.

I don't believe there's anything out there that you can just set and forget. The closest you could get would be to find two or three applications that do portions of what you are looking for, and script the rest to parse the info into a spreadsheet.

PS: This really sounds like something that would come down from the top at my company as well, but I think we'd push it back to local desktop support and/or the OSEs and split up the workload ;p
 

OutHouse

Lifer
Jun 5, 2000
36,410
616
126
Local and NAS/SAN drives. Each office (and there are around 40 offices) has a Buffalo drive.
 

gsaldivar

Diamond Member
Apr 30, 2001
8,691
1
81
Just pick two or three of the most expensive enterprise solutions and pass those up "to the top".

Then watch how quickly the scope of this project shrinks...
 

MrChad

Lifer
Aug 22, 2001
13,507
3
81
What's the motivation behind the report?

If it's something that needs to be done on a semi-regular basis, you could consider adopting an enterprise content management system and migrating your files to that. That would allow you to control access and retention much more easily, and would make searching across content far easier too.

Regardless, the answer to this question will not come cheaply.
 

ImpulsE69

Lifer
Jan 8, 2010
14,946
1,077
126
Just pick two or three of the most expensive enterprise solutions and pass those up "to the top".

Then watch how quickly the scope of this project shrinks...

LMAO. Exactly.

Generally, these types of things are done on a smaller scale by individual divisions of your company. Of course, depending on how your company is laid out, there may be no one to do it but you (or your group).

There are generally a few reasons they would ask for something like this, and they normally revolve around the cost of storage.

The company I work for is very large and very widespread, and we get asked for things like this regularly by different groups. There are a few tools that can look for specific file types and get rid of them, but that's about the extent of it. They are very concerned about storage cost, density, and usage, but they don't do any kind of scan for unneeded data.

There are file lifecycle management products that could be used, but I'm not sure they do any reporting.
 

JustMe21

Senior member
Sep 8, 2011
324
49
91
A quick and easy way to get the info is to map the drive as a share, go to the command line, change to the root of that directory, and redirect the dir results to a text file.

e.g. dir /s *.doc > C:\Docs.txt

Now, parsing the information out into a spreadsheet would be the real trick.
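Something like this rough Python sketch could handle that parsing step, assuming the listing came from dir /s on a US-English system (the date/time patterns and the dir_listing_to_csv helper are my own invention, not part of any particular tool):

Code:
import csv
import re
import sys

# Matches the "Directory of ..." headers and file lines in "dir /s" output
# (assumes US-English date formatting; adjust the patterns for other locales).
DIR_HEADER = re.compile(r"^\s*Directory of (?P<dir>.+)$")
FILE_LINE = re.compile(
    r"^(?P<date>\d{2}/\d{2}/\d{4})\s+(?P<time>\d{1,2}:\d{2}\s*[AP]M)\s+"
    r"(?P<size>[\d,]+)\s+(?P<name>.+)$"
)

def dir_listing_to_csv(listing_path, csv_path):
    current_dir = ""
    with open(listing_path, errors="replace") as listing, \
         open(csv_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["path", "size_bytes", "modified"])
        for line in listing:
            header = DIR_HEADER.match(line)
            if header:
                current_dir = header.group("dir").strip()
                continue
            entry = FILE_LINE.match(line)
            if entry:
                writer.writerow([
                    current_dir + "\\" + entry.group("name").strip(),
                    entry.group("size").replace(",", ""),
                    entry.group("date") + " " + entry.group("time"),
                ])

if __name__ == "__main__":
    # e.g. python parse_dir.py Docs.txt Docs.csv
    dir_listing_to_csv(sys.argv[1], sys.argv[2])

The resulting CSV opens straight into Excel, which covers the spreadsheet part. Ownership still isn't in there, though; dir /q would add the owner, but then the line format (and the parsing) changes.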
 

ForumMaster

Diamond Member
Feb 24, 2005
7,792
1
0
Well, since you don't need to actually index the contents of the files, just the metadata, it is significantly easier.

There are many software products (such as this) that don't index anything themselves. They access the local MFT directly, which allows (almost) instantaneous searches.

Perhaps you could use tools like the one I mentioned to do that.

Best of luck.
 

OutHouse

Lifer
Jun 5, 2000
36,410
616
126
Just pick two or three of the most expensive enterprise solutions and pass those up "to the top".

Then watch how quickly the scope of this project shrinks...

I have a meeting tomorrow with the director about this project. Your advice will be presented.
 

OutHouse

Lifer
Jun 5, 2000
36,410
616
126
What's the motivation behind the report?

If it's something that needs to be done on a semi-regular basis, you could consider adopting an enterprise content management system and migrating your files to that. That would allow you to control access and retention much more easily, and would make searching across content far easier too.

Regardless, the answer to this question will not come cheaply.

We were told to "get an idea of what's out there, who has it, and how much space it's taking up."
 

OutHouse

Lifer
Jun 5, 2000
36,410
616
126

Thanks. I think Autonomy is a strong possibility for doing what is being asked of us; if not, it looks like it will come really close.
 

Nothinman

Elite Member
Sep 14, 2001
30,672
0
0
In theory, you could write and test a script to do that in a few hours. Then you'd just run it on the servers yourself and put it in a logon script for the workstations. It might take a few runs to get the output the way you want.
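In case it helps, here is a minimal sketch of the kind of script being described, assuming Python is available on the machines (pywin32 is needed for the owner lookup, and the extension list and report share path are just made-up examples):

Code:
import csv
import os
import socket
import sys
from datetime import datetime

# Owner lookup needs the pywin32 package on Windows; skip it gracefully if absent.
try:
    import win32security
except ImportError:
    win32security = None

# Illustrative list of the file types the OP mentioned.
WANTED = {".pst", ".doc", ".docx", ".xls", ".xlsx", ".pdf",
          ".wav", ".mp3", ".jpg", ".mov", ".avi"}

def file_owner(path):
    if win32security is None:
        return ""
    try:
        sd = win32security.GetFileSecurity(
            path, win32security.OWNER_SECURITY_INFORMATION)
        name, domain, _ = win32security.LookupAccountSid(
            None, sd.GetSecurityDescriptorOwner())
        return domain + "\\" + name
    except Exception:  # orphaned SID, access denied, etc.
        return ""

def scan(root, report_path):
    host = socket.gethostname()
    with open(report_path, "a", newline="") as out:
        writer = csv.writer(out)
        for dirpath, _, filenames in os.walk(root):
            for name in filenames:
                if os.path.splitext(name)[1].lower() not in WANTED:
                    continue
                full = os.path.join(dirpath, name)
                try:
                    st = os.stat(full)
                except OSError:
                    continue  # locked or unreadable file, skip it
                writer.writerow([host, full, st.st_size,
                                 datetime.fromtimestamp(st.st_mtime).isoformat(),
                                 file_owner(full)])

if __name__ == "__main__":
    # e.g. python scan.py C:\ \\fileserver\scans\report.csv  (share path is hypothetical)
    scan(sys.argv[1], sys.argv[2])

Appending every machine's rows to one CSV on a share keeps the reporting side simple, though with 5,000 machines you'd probably want one file per host and a merge step instead.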
 

sm625

Diamond Member
May 6, 2011
8,172
137
106
Agent Ransack will prepare a text file report that contains the following info (example):

Code:
C:\camera cache\DSC00596.JPG   (2289 KB,  11/23/2011 3:43:40 AM)
C:\camera cache\DSC00597.JPG   (2294 KB,  11/23/2011 3:44:36 AM)
C:\camera cache\DSC00598.JPG   (2284 KB,  11/23/2011 3:45:22 AM)

I did a search for JPG files over 1 MB in size, and that is a partial result. Here is an example of a regular expression you might use to do a wider search:

\.doc$|\.jpg$|\.mov$|\.flv$|\.mp3$

The $ says the filename must end with the dot/three-letter-extension combo (the backslash escapes the dot so it matches a literal period), and the | is an OR operator.
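Purely as an illustration of how that pattern behaves (not tied to Agent Ransack or any other tool), here is a quick Python check:

Code:
import re

# Escaped dots so "." matches a literal period; IGNORECASE so .JPG also matches.
pattern = re.compile(r"\.doc$|\.jpg$|\.mov$|\.flv$|\.mp3$", re.IGNORECASE)

files = ["report.doc", "DSC00596.JPG", "notes.txt", "song.mp3"]
print([f for f in files if pattern.search(f)])
# ['report.doc', 'DSC00596.JPG', 'song.mp3']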
 

pcunite

Senior member
Nov 15, 2007
336
1
76
Agent Ransack will prepare a text file report that contains the following info (example):

Seriously, when is the Agent RanSPAM going to stop? Every file search tool in existence does this: FileSearchEX, Everything, UltraSearch, and on and on. He wants a real product solution, not your user-level search tool...

Now back to the OP ... have you found a real solution yet? If not, I might have somewhere to direct you.