File shredders and recovery software

chrstrbrts · Oct 26, 2017

Hello everyone,

I've been using a file shredder for some time called Eraser.

Recently, I started using a file recovery program called Recuva.

I'm not, however, interested in these programs specifically; I'm interested in how these types of programs work.

Recuva is able to show a list of erased files on a drive.

Many are unrecoverable because they have been overwritten.

But, Recuva still shows the name of the overwritten file.

Some of these files have "normal" names like acpi.mod.

But, some have strange names like 67Ims{bmdY7l9uQ[q(.

Are these strange names the result of the file having been erased with a shredder?

Also, how can you shred a file and leave absolutely no trace whatsoever?

Leaving behind file names is like leaving a meta-data trail.

The recovery software can tell the name of the file that overwrote an earlier file.

How?

How can the recovery software find the name of the file that was there originally after it was overwritten?

Is it possible to shred a drive and leave no trace of anything at all in such a way that if you were to try to run a recovery program focusing on that drive it would return nothing at all?

Also, I noticed that Recuva has a gradation of file designations referring to how well a file can be recovered.

If a file is 'excellent', it hasn't been overwritten at all and can be fully recovered.

But, there are also 'unrecoverable' and 'very poor' designations as well.

The 'excellent' and 'unrecoverable' designations make sense to me.

Either a file can be recovered or not.

But the 'very poor' designation doesn't make sense to me.

How could a file be somewhere in between?

Is it because only some of the file's clusters have been overwritten but not all of them?

Thanks.

Gryz · Oct 27, 2017

What is possible depends completely on the OS you are using, and the file-system type you are using.

If you really want to destroy "all data", I would wonder if that is still possible in these days of SSDs ? Even if the OS wants to overwrite a block on disk, the SSD will probably write the new info in another block. And the old data is still available somewhere on the SSD.

DaveSimmons · Oct 27, 2017

In a simple file system, the disk directory is a set of data blocks on the disk. The files are separate sets of blocks on the disk.

A program like Recuva reads the disk directory blocks to get the names of deleted files and their locations on disk, then checks those locations to see if some or all of the data blocks are still there. It takes advantage of the fact that operating systems don't immediately "garbage collect" the disk directory and file data blocks because that wastes time and goes against wear leveling.

It's a bit like how DOS CHKDSK works when it validates the files on disk.

chrstrbrts · Nov 11, 2017

Gryz said:
What is possible depends completely on the OS you are using, and the file-system type you are using.

Jesus, I had no idea there were hundreds of file systems.

Gryz said:
If you really want to destroy "all data", I would wonder if that is still possible in these days of SSDs ? Even if the OS wants to overwrite a block on disk, the SSD will probably write the new info in another block. And the old data is still available somewhere on the SSD.

Why would an SSD behave that way?

Why wouldn't it follow the orders given to it?

Would it rely on some kind of optimization algorithm burned in as firmware on its integrated controller?

chrstrbrts · Nov 11, 2017

DaveSimmons said:
A program like Recuva reads the disk directory blocks to get the names of deleted files and their locations on disk,

How can you tell by reading the disk directory whether a file has been deleted?

Is there some kind of "file-is-valid" bit somewhere?

DaveSimmons said:
then checks those locations to see if some or all of the data blocks are still there.

But, everything on the disk is just 0's and 1's.

What would it look like in binary for a collection of data blocks to no longer be there?

I mean if you check any location on the disk, you'll find binary information.

DaveSimmons said:
It takes advantage of the fact that operating systems don't immediately "garbage collect" the disk directory and file data blocks because that wastes time and goes against wear leveling.

So, this is how the recovery software can discern that file 1 has overwritten file 2?

If the directory bytes aren't immediately wiped and a file is deleted and if there is a "file-is-valid" bit flipped to show that a file is no longer valid because the user has deleted it, and another file that is still valid is linked to the same data blocks as the first file, then the software can tell that file 1 has been overwritten by file 2, yes?

Red Squirrel · Nov 11, 2017

When you delete a file, you are simply removing a link, the actual data stays on the drive, but since there is no record of a file at that location, it will eventually be overwritten. A primitive way of thinking about it is imagine a large library full of books but they have no titles, just numbers, there is a master book with all the titles and numbers. So you lookup the title and then go get the book by number. If you want to delete a book, you just remove it from the master book. The book itself is still on the shelf at the same number but that number no longer exists in the master book. Eventually another book might be added, only at that point will the old book be removed and replaced with the new one.

The file shredders will basically overwrite the file's content with giberish before deleting the file, at least I think that's how it works. I'm not sure what kind of smarts it might use to ensure it's overwriting the same locations on the drive though as depending on the file system, especially more avanced ones that might have stuff like volume shadow copy, there is no guarantee that the data is being written in the same location.

The most safe way of removing files is to zero out the entire drive. Even then, in theory, some kind of magnetic reader could potentially try to detect some sectors that are "more negative" or "more positive" than others and try to figure out what used to be there. That's why lot of places will do multiple passes of random data and zeroes. I think the DOD standard is like 13 passes or something like that.

DaveSimmons · Nov 11, 2017

To add to the above, the same thing is done with the directory listing itself. A page of file listings might be "torn out" of the master book (unlinked from it) but be left in place on the disk where it can be read using raw block addressing. Or an entry might be "crossed out" but with that done by just marking an "x" next to the entry.

Another way of thinking about it is if you malloc, fill and then free a chunk of RAM. Unless there is some security extension running, when you free a buffer the C++ runtime library will probably not spend the CPU time to fill that buffer with 0s. A "RAM recovery function" could find the data in that buffer. On a modern OS that function should need to be running in the same process.

Read this: https://www.codeproject.com/Articles/28314/Reading-and-Writing-to-Raw-Disk-Sectors

A file system is just a well-defined way of organizing things on a disk. It could be as simple as:

Directory pages begin on sector 10 of the disk and are a doubly-linked list:

One Directory "Page" is 16K in size and holds:
- N x File-Entry structs
- The sector addresses of the next page and the previous page

One File-Entry struct is:
- Type (File, Sub-Directory)
- Deleted flag
- Location of the first sector of the file or the first sector of the sub-directory page
- Size if a file
- Security descriptor, owner, access flags, ...

One file is also a linked list, where sector 1 lists:
- Number of contiguous sectors
- Sector for next clump of sectors (0 if no more clumps)
- Bytes in this clump

There you go, the "Dave Filesystem" or DFS. You could write something like this is a few days.

Gryz · Nov 12, 2017

chrstrbrts said:
Why would an SSD behave that way?
Why wouldn't it follow the orders given to it?
Would it rely on some kind of optimization algorithm burned in as firmware on its integrated controller?

The biggest problem of SSDs is that a single cell (which holds an amount of data, very similar to a "block" on a disk) can be rewritten only a certain number of times. Therefor an SSD will try to avoid rewriting the same cell over and over again. Also, what happens if a cell is "worn out" ? Then the SSD can't write there anymore, and it has to write in another cell.

Therefor, if the OS tells the SSD to "write this data to block 0x1234", the SSD can not always write to the same block. SSDs have a separate table that maps "old fashioned block numbers" to cell locations. And that table gets changed all the time. Maybe every write to block 0x1234 the cell in which that data is held changes location. That's how I would do it. That guarantees you are using all cells in the SSD an equal amount of times, and none of them get worn out quicker than the other cells.

Also, if the OS writes block 0x1234, I would find a new location (cell) for that data (as I wrote above). But I would not clean out the cell that held the previous incarnation of block 0x1234. That would only double the number of changes of cells. And thus half the lifespan of my SSD. Not good. I would keep the old data in the old cell.

See ? Many reasons to assume that old data on an SSD is not easily erased.

I think the only way to make sure all data is removed from an SSD is to write so much data to the SSD that you are guaranteed to have rewritten all cells. Maybe you'll have to overwrite the SSD so often that all cells get worn out ? Very impractical. I think smashing an SSD with a hammer would be easiest.

Red Squirrel · Nov 12, 2017

Oh yes that's a good point, data recovery/erasing on a SSD is going to be different. The block addresses are kinda virtual I think. A given logical address is not actually going to always write to the same physical address/area, as the writes are staggered across all of the flash to get the life higher, as flash has limited writes. This is why you never defrag a SSD as it will never really actually be defragmented and all you're doing is causing unnecessary writes. SSDs are kinda a whole different beast overall when it comes to securely deleting files, as well as recovering files.

AMDisTheBEST · Nov 27, 2017

Guess this means NSA can easily read your data unless you encrypt the entire drive

IHateMyJob2004 · Nov 30, 2017

When you delete a file, the OS typically just deletes the reference to it. The data is still on disk undisturbed until the OS decides to overwrite those parts of the disk.

I'm actually surprised that the references to the data itself are still intact. I thought those pieces were deleted.

mv2devnull · Nov 30, 2017

Red Squirrel said:
The block addresses are kinda virtual I think. A given logical address is not actually going to always write to the same physical address/area, ...

HDDs have a mapping too, but not as extensive as in SSDs.
Sector reallocation: https://kb.acronis.com/content/9105

Search

File shredders and recovery software

chrstrbrts

Senior member

Gryz

Golden Member

DaveSimmons

Elite Member

chrstrbrts

Senior member

chrstrbrts

Senior member

Red Squirrel

No Lifer

DaveSimmons

Elite Member

Gryz

Golden Member

Red Squirrel

No Lifer

AMDisTheBEST

Senior member

IHateMyJob2004

Lifer

mv2devnull

Golden Member

TRENDING THREADS