Any good free Windows decompilers?

Red Squirrel

No Lifer
May 24, 2003
69,884
13,432
126
www.anyf.ca
I'm looking for a program that will take any exe, disassemble it, and then convert it to source that can be compiled with a C++ compiler. Is there such a thing out there that is free?

I understand I will not get the most easy to read code, but if I can compile it with a C++ compiler and basically get the same app, or modify it then recompile, that would be cool. Just something I'd like to screw around with in my spare time, is all. :D
 

dphantom

Diamond Member
Jan 14, 2005
4,763
327
126
So what you are asking for is a tool that will, for example, decompile Word 2007, convert the result to C++ source code, and then allow you to modify the original code and recompile it??

Do you see anything wrong with that?
 

Red Squirrel

No Lifer
May 24, 2003
69,884
13,432
126
www.anyf.ca
So what you are asking for is a tool that will, for example, decompile Word 2007, convert the result to C++ source code, and then allow you to modify the original code and recompile it??

Do you see anything wrong with that?

Pretty much this. It could be fun to play with, and even be useful.
 

Nothinman

Elite Member
Sep 14, 2001
30,672
0
0
Pretty much this. It could be fun to play with, and even be useful.

Except that you'd be kicked out of any respectable software project the second another developer found out about it. There's a reason that Linux kernel developers avoided that NT kernel source code leak like the plague.
 

Modelworks

Lifer
Feb 22, 2007
16,240
7
76
Never going to happen. Something compiled like Word will never decompile into easy-to-read code. Too much is discarded during compilation: things the programmer needed in order to understand the code, but that the compiler tossed out while optimizing. I just upgraded to IDA Pro 5.6 and it has a very good decompiler, but you had better know assembly too, as the code it generates is nowhere near as readable as the code used to create the program.

You also have programmers, myself included, who use obfuscators. Basically, before compiling the program you run the obfuscator on the source code and it converts all the variable names and anything else legible to garbage, so text like 'Start Menu' becomes 'hdyebkdjshe'.

Good luck deciphering that :)
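
As a minimal sketch of what identifier renaming does (the function and parameter names here are made up for illustration, and this is not the output of any particular obfuscator), both functions below behave identically, but only the first one tells a reader what it is for:

```cpp
// Original, readable version (hypothetical example).
int total_price_cents(int unit_price_cents, int quantity) {
    return unit_price_cents * quantity;
}

// The same logic after an assumed identifier-renaming pass.
// Nothing about the behaviour changes; only the hints for humans are gone.
int hdyebkdjshe(int a, int b) {
    return a * b;
}
```

Once the names are gone, someone reading the second version has to work out the purpose from how the function is called.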
 

Red Squirrel

No Lifer
May 24, 2003
69,884
13,432
126
www.anyf.ca
Never going to happen. Something compiled like Word will never decompile into easy-to-read code. Too much is discarded during compilation: things the programmer needed in order to understand the code, but that the compiler tossed out while optimizing. I just upgraded to IDA Pro 5.6 and it has a very good decompiler, but you had better know assembly too, as the code it generates is nowhere near as readable as the code used to create the program.

You also have programmers, myself included, who use obfuscators. Basically, before compiling the program you run the obfuscator on the source code and it converts all the variable names and anything else legible to garbage, so text like 'Start Menu' becomes 'hdyebkdjshe'.

Good luck deciphering that :)

Oh I know this, I would not even expect readable variable names tbh, those are probably discarded once it turns into assembly.

What I basically expect to see is the C++ equivalent of the assembly instructions. For example, a series of jumps may be converted into a case statement, and so on. Variable names will probably all be generic, like a, b, c, d, etc.
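
To make the jumps-versus-case idea concrete, here is a rough sketch. The function names, labels, and status strings are invented for the example and are not output from any real decompiler. The first function is what a programmer would write; the second is the same logic spelled out as a literal chain of compare-and-jump steps with generic names.

```cpp
#include <string>

// What a programmer would typically write.
std::string describe_switch(int code) {
    switch (code) {
        case 0:  return "stopped";
        case 1:  return "running";
        case 2:  return "paused";
        default: return "unknown";
    }
}

// The same decisions expressed as compare-and-jump steps, roughly the
// shape a tool has to recognise before it can rebuild a switch.
std::string describe_jumps(int a1) {
    std::string result;
    if (a1 == 0) goto LABEL_0;
    if (a1 == 1) goto LABEL_1;
    if (a1 == 2) goto LABEL_2;
    result = "unknown";
    goto LABEL_END;
LABEL_0:
    result = "stopped";
    goto LABEL_END;
LABEL_1:
    result = "running";
    goto LABEL_END;
LABEL_2:
    result = "paused";
LABEL_END:
    return result;
}
```

Both functions return the same string for every input. Whether a given tool manages to turn the second shape back into the first depends on the tool and on how the compiler laid out the jumps.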

Something that converts straight to assembler would be ok too, but I know zero assembly. Then again, maybe it would be a nice way to learn. I could code a simple app, compile it, then disassemble it and try to understand each part. Something I've always wanted to learn properly, just never got the time for it. In college we only scratched the surface.
 

MrChad

Lifer
Aug 22, 2001
13,507
3
81
What you're looking for does not exist. The best you can hope for is a decompilation into assembly, but you're not going to get C++.

You might have better luck with decompilers for managed languages like Java that only compile down to intermediate byte code. I know I've gotten readable Java files from JAD before.
 

Jeff7181

Lifer
Aug 21, 2002
18,368
11
81
Never going to happen. Something compiled like Word will never decompile into easy-to-read code. Too much is discarded during compilation: things the programmer needed in order to understand the code, but that the compiler tossed out while optimizing. I just upgraded to IDA Pro 5.6 and it has a very good decompiler, but you had better know assembly too, as the code it generates is nowhere near as readable as the code used to create the program.

You also have programmers, myself included, who use obfuscators. Basically, before compiling the program you run the obfuscator on the source code and it converts all the variable names and anything else legible to garbage, so text like 'Start Menu' becomes 'hdyebkdjshe'.

Good luck deciphering that :)

Tsk tsk... we now have your cipher... all your source are belong to us.
 

Leros

Lifer
Jul 11, 2004
21,867
7
81
As others said, you can often get readable files from managed code like C# or Java.

The best you can get from normal code is assembly, which is pretty tough to decipher. It's not bad if you're only trying to modify strings, as there are tools for locating ASCII strings. But modifying the control flow is difficult in a huge assembly program.
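
For a sense of how simple string locating can be, here is a minimal sketch along the lines of the classic `strings` utility. The function name and the default minimum length of 4 are assumptions made for this example. It scans a byte buffer and collects every run of printable ASCII characters that is long enough to be interesting.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Collect every run of printable ASCII bytes (0x20..0x7E) that is at
// least min_len characters long. min_len = 4 is an assumed default.
std::vector<std::string> find_ascii_strings(const std::vector<unsigned char>& data,
                                            std::size_t min_len = 4) {
    std::vector<std::string> found;
    std::string current;
    for (unsigned char byte : data) {
        if (byte >= 0x20 && byte <= 0x7E) {
            current.push_back(static_cast<char>(byte));
        } else {
            // A non-printable byte ends the current run.
            if (current.size() >= min_len) found.push_back(current);
            current.clear();
        }
    }
    // Flush a run that reaches the end of the buffer.
    if (current.size() >= min_len) found.push_back(current);
    return found;
}
```

Feeding it the raw bytes of an executable would list candidate strings. Short runs are skipped because machine code contains plenty of bytes that happen to fall in the printable range.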
 

Nothinman

Elite Member
Sep 14, 2001
30,672
0
0
Something that converts straight to assembler would be ok too

That's simple, I believe you can do that with gdb and objdump. But you won't get anything really meaningful unless you already know asm.
 

Red Squirrel

No Lifer
May 24, 2003
69,884
13,432
126
www.anyf.ca
Hmm I could have sworn there were apps that would convert to C++ and try to sorta make it readable by figuring out the program logic. Guess I may be out of luck then. That or maybe it's a good time to learn asm. :p
 

Nothinman

Elite Member
Sep 14, 2001
30,672
0
0
Hmm I could have sworn there were apps that would convert to C++ and try to sorta make it readable by figuring out the program logic. Guess I may be out of luck then. That or maybe it's a good time to learn asm. :p

I'm sure they exist, but I'm also sure that the code they generate is crap that you probably wouldn't understand anyway.
 

Modelworks

Lifer
Feb 22, 2007
16,240
7
76
Hmm I could have sworn there were apps that would convert to C++ and try to sorta make it readable by figuring out the program logic. Guess I may be out of luck then. That or maybe it's a good time to learn asm. :p

There are programs that do that. I use IDA Pro 5.6 with the Hex-Rays decompiler. It is one of the best programs out there for this sort of thing, but the output is still not an easy read without knowing asm.


Original C++ code:
Code:
#include <iostream>
using namespace std;

int main ()
{
  cout << "Hello World!";
  return 0;
}

Loading the compiled exe into IDA Pro generates this:
Code:
; +-------------------------------------------------------------------------+
; |   This file	has been generated by The Interactive Disassembler (IDA)    |
; |	   Copyright (c) 2009 by Hex-Rays, <support@hex-rays.com>	    |
; |			 License info: xxxxxxxxxxxxxxxx		    |
; |				 				    |
; +-------------------------------------------------------------------------+
;
; Input	MD5   :	96933C364FC18B14DAF45FA112849185

; File Name   :	C:\Visual Studio 2008\Projects\Hello World\Release\Hello World.exe
; Format      :	Portable executable for	80386 (PE)
; Imagebase   :	400000
; Section 1. (virtual address 00001000)
; Virtual size			: 00000B6D (   2925.)
; Section size in file		: 00000C00 (   3072.)
; Offset to raw	data for section: 00000400
; Flags	60000020: Text Executable Readable
; Alignment	: default
; OS type	  :  MS	Windows
; Application type:  Executable	32bit

include	uni.inc	; see unicode subdir of	ida for	info on	unicode

.686p
.mmx
.model flat


; Segment type:	Pure code
; Segment permissions: Read/Execute
_text segment para public 'CODE' use32
assume cs:_text
;org 401000h
assume es:nothing, ss:nothing, ds:_data, fs:nothing, gs:nothing



; int __cdecl main(int argc, const char	**argv,	const char **envp)
_main proc near
mov	eax, ds:?cout@std@@3V?$basic_ostream@DU?$char_traits@D@std@@@1@A ; std::basic_ostream<char,std::char_traits<char>> std::cout
push	eax
call	sub_401150
add	esp, 4
xor	eax, eax
retn
_main endp


sub_401150 proc	near

var_20=	dword ptr -20h
var_1C=	byte ptr -1Ch
var_18=	dword ptr -18h
var_14=	dword ptr -14h
var_10=	dword ptr -10h
var_C= dword ptr -0Ch
var_4= dword ptr -4
arg_0= dword ptr  8

push	ebp
mov	ebp, esp
push	0FFFFFFFFh
push	offset loc_401B52
mov	eax, large fs:0
push	eax
sub	esp, 14h
push	ebx
push	esi
push	edi
mov	eax, dword_403000
xor	eax, ebp
push	eax
lea	eax, [ebp+var_C]
mov	large fs:0, eax
mov	[ebp+var_10], esp
mov	esi, [ebp+arg_0]
xor	ebx, ebx
mov	eax, offset aHelloWorld	; "Hello World!"
mov	[ebp+var_14], ebx
lea	edx, [eax+1]
jmp	short loc_401190
align 10h



And when it converts to C++ it produces this:
Code:
/* This file has been generated by the Hex-Rays decompiler.
   Copyright (c) 2009 Hex-Rays <info@hex-rays.com>

   Detected compiler: Visual C++
*/

#include <windows.h>
#include <defs.h>


//-------------------------------------------------------------------------
// Data declarations

// extern void *std__cout; weak
extern char aHelloWorld[13]; // weak
extern _UNKNOWN unk_402200; // weak
extern _UNKNOWN unk_402208; // weak
extern int dword_403000; // weak

//-------------------------------------------------------------------------
// Function declarations

#define __thiscall __cdecl // Test compile in C mode

int __cdecl main(int argc, const char **argv, const char **envp);
// int __userpurge sub_401020<eax>(int a1<esi>, int a2);
int __stdcall sub_4010B0(int a1);
int loc_401130(); // weak
int __cdecl sub_401150(int a1);
// int (*__usercall sub_4012E2<eax>(int a1<ebp>))();
int loc_401303(); // weak
int __thiscall sub_40130B(void *this, char a2);
// _DWORD __thiscall __report_gsfailure(_DWORD ecx0, _BYTE _4); weak
int (*__cdecl sub_4017DE())(void);
int (*__cdecl sub_401804())(void);
int __cdecl sub_401A35();
// int __usercall sub_401B10<eax>(int a1<ebp>);
// int __usercall sub_401B40<eax>(int a1<ebp>);
void __cdecl sub_401B4A();
// int __thiscall std__basic_ios_char_std__char_traits_char____setstate(_DWORD, _DWORD, _DWORD); weak
// int __thiscall std__basic_ostream_char_std__char_traits_char____flush(_DWORD); weak
// int __thiscall std__basic_streambuf_char_std__char_traits_char____sputc(_DWORD, _DWORD); weak
// int __thiscall std__basic_streambuf_char_std__char_traits_char_____Unlock(_DWORD); weak
// int __cdecl std__basic_streambuf_char_std__char_traits_char_____Lock(_DWORD); weak
// int __thiscall std__basic_ostream_char_std__char_traits_char_____Osfx(_DWORD); weak
// int __thiscall std__basic_streambuf_char_std__char_traits_char____sputn(_DWORD, _DWORD, _DWORD); weak
// int __cdecl std__uncaught_exception(_DWORD); weak


//----- (00401000) --------------------------------------------------------
int __cdecl main(int argc, const char **argv, const char **envp)
{
  sub_401150((int)std__cout);
  return 0;
}
// 402054: using guessed type void *std__cout;

//----- (00401020) --------------------------------------------------------
int __userpurge sub_401020<eax>(int a1<esi>, int a2)
{
  int v2; // eax@3
  int v3; // eax@4
  unsigned int v5; // [sp-4h] [bp-14h]@1
  char v6; // [sp+0h] [bp-10h]@1
  int v7; // [sp+Ch] [bp-4h]@3

  v5 = (unsigned int)&v6 ^ dword_403000;
  *(_DWORD *)a2 = a1;
  if ( *(_DWORD *)(*(_DWORD *)(*(_DWORD *)a1 + 4) + a1 + 40) )
    std__basic_streambuf_char_std__char_traits_char_____Lock(v5);
  v7 = 0;
  v2 = a1 + *(_DWORD *)(*(_DWORD *)a1 + 4);
  if ( !*(_DWORD *)(v2 + 8) )
  {
    v3 = *(_DWORD *)(v2 + 44);
    if ( v3 )
      std__basic_ostream_char_std__char_traits_char____flush(v3);
  }
  *(_BYTE *)(a2 + 4) = *(_DWORD *)(*(_DWORD *)(*(_DWORD *)a1 + 4) + a1 + 8) == 0;
  return a2;
}
// 40203C: using guessed type int __thiscall std__basic_ostream_char_std__char_traits_char____flush(_DWORD);
// 402048: using guessed type int __cdecl std__basic_streambuf_char_std__char_traits_char_____Lock(_DWORD);
// 403000: using guessed type int dword_403000;

//----- (004010B0) --------------------------------------------------------
int __stdcall sub_4010B0(int a1)
{
  int result; // eax@3
  int v2; // edx@3
  char v3; // [sp+0h] [bp-10h]@1
  int v4; // [sp+Ch] [bp-4h]@1

  v4 = 0;
  if ( !(unsigned __int8)std__uncaught_exception((unsigned int)&v3 ^ dword_403000) )
    std__basic_ostream_char_std__char_traits_char_____Osfx(*(_DWORD *)a1);
  v4 = -1;
  v2 = *(_DWORD *)(**(_DWORD **)a1 + 4);
  result = *(_DWORD *)(v2 + *(_DWORD *)a1 + 40);
  if ( result )
    result = std__basic_streambuf_char_std__char_traits_char_____Unlock(*(_DWORD *)(v2 + *(_DWORD *)a1 + 40));
  return result;
}
// 402044: using guessed type int __thiscall std__basic_streambuf_char_std__char_traits_char_____Unlock(_DWORD);
// 40204C: using guessed type int __thiscall std__basic_ostream_char_std__char_traits_char_____Osfx(_DWORD);
// 402058: using guessed type int __cdecl std__uncaught_exception(_DWORD);
// 403000: using guessed type int dword_403000;

//----- (00401150) --------------------------------------------------------
int __cdecl sub_401150(int a1)
{
  signed int v1; // eax@1
  int v2; // ebx@1
  unsigned int v3; // edi@1
  int v4; // eax@8
  char v5; // cl@8
  int v6; // eax@8
  int v7; // ecx@16
  int v9; // eax@21
  char v10; // cl@21
  int v11; // eax@21
  char v12; // [sp+0h] [bp-30h]@1
  int v13; // [sp+10h] [bp-20h]@4
  char v14; // [sp+14h] [bp-1Ch]@4
  int v15; // [sp+18h] [bp-18h]@8
  int v16; // [sp+1Ch] [bp-14h]@1
  char *v17; // [sp+20h] [bp-10h]@1
  int v18; // [sp+2Ch] [bp-4h]@4

  v17 = &v12;
  v2 = 0;
  v16 = 0;
  v3 = strlen("Hello World!");
  v1 = *(_DWORD *)(*(_DWORD *)(*(_DWORD *)a1 + 4) + a1 + 24);
  if ( v1 > 0 )
  {
    if ( v1 > (signed int)v3 )
      v2 = v1 - v3;
  }
  sub_401020(a1, (int)&v13);
  v18 = 0;
  if ( v14 )
  {
    LOBYTE(v18) = 1;
    if ( (*(_DWORD *)(*(_DWORD *)(*(_DWORD *)a1 + 4) + a1 + 16) & 0x1C0) == 64 )
      goto LABEL_26;
    while ( v2 > 0 )
    {
      v4 = *(_DWORD *)(*(_DWORD *)a1 + 4);
      v5 = *(_BYTE *)(v4 + a1 + 48);
      v6 = *(_DWORD *)(a1 + v4 + 40);
      LOBYTE(v15) = v5;
      if ( std__basic_streambuf_char_std__char_traits_char____sputc(v6, v15) == -1 )
      {
        v16 |= 4u;
        break;
      }
      --v2;
    }
    if ( !v16 )
    {
LABEL_26:
      if ( std__basic_streambuf_char_std__char_traits_char____sputn(
             *(_DWORD *)(*(_DWORD *)(*(_DWORD *)a1 + 4) + a1 + 40),
             "Hello World!",
             v3) == v3 )
      {
        while ( v2 > 0 )
        {
          v9 = *(_DWORD *)(*(_DWORD *)a1 + 4);
          v10 = *(_BYTE *)(v9 + a1 + 48);
          v11 = *(_DWORD *)(a1 + v9 + 40);
          LOBYTE(v15) = v10;
          if ( std__basic_streambuf_char_std__char_traits_char____sputc(v11, v15) == -1 )
          {
            v16 |= 4u;
            break;
          }
          --v2;
        }
      }
      else
      {
        v16 = 4;
      }
    }
    *(_DWORD *)(a1 + *(_DWORD *)(*(_DWORD *)a1 + 4) + 24) = 0;
    v18 = 0;
  }
  else
  {
    v16 = 4;
  }
  std__basic_ios_char_std__char_traits_char____setstate(a1 + *(_DWORD *)(*(_DWORD *)a1 + 4), v16, 0);
  v18 = 3;
  if ( !(unsigned __int8)std__uncaught_exception(*(_DWORD *)&v12) )
    std__basic_ostream_char_std__char_traits_char_____Osfx(v13);
  v18 = -1;
  v7 = *(_DWORD *)(*(_DWORD *)(*(_DWORD *)v13 + 4) + v13 + 40);
  if ( v7 )
    std__basic_streambuf_char_std__char_traits_char_____Unlock(v7);
  return a1;
}
// 402038: using guessed type int __thiscall std__basic_ios_char_std__char_traits_char____setstate(_DWORD, _DWORD, _DWORD);
// 402040: using guessed type int __thiscall std__basic_streambuf_char_std__char_traits_char____sputc(_DWORD, _DWORD);
// 402044: using guessed type int __thiscall std__basic_streambuf_char_std__char_traits_char_____Unlock(_DWORD);
// 40204C: using guessed type int __thiscall std__basic_ostream_char_std__char_traits_char_____Osfx(_DWORD);
// 402050: using guessed type int __thiscall std__basic_streambuf_char_std__char_traits_char____sputn(_DWORD, _DWORD, _DWORD);
// 402058: using guessed type int __cdecl std__uncaught_exception(_DWORD);

//----- (004012E2) --------------------------------------------------------
int (*__usercall sub_4012E2<eax>(int a1<ebp>))()
{
  std__basic_ios_char_std__char_traits_char____setstate(
    *(_DWORD *)(a1 + 8) + *(_DWORD *)(**(_DWORD **)(a1 + 8) + 4),
    4,
    1);
  *(_DWORD *)(a1 - 4) = 0;
  return loc_401303;
}
// 401303: using guessed type int loc_401303();
// 402038: using guessed type int __thiscall std__basic_ios_char_std__char_traits_char____setstate(_DWORD, _DWORD, _DWORD);

//----- (0040130B) --------------------------------------------------------
int __thiscall sub_40130B(void *this, char a2)
{
  if ( this == (void *)dword_403000 )
    __asm { rep retn }
  return __report_gsfailure(this, a2);
}
// 4015C6: using guessed type _DWORD __thiscall __report_gsfailure(_DWORD ecx0, _BYTE _4);
// 403000: using guessed type int dword_403000;

//----- (004017DE) --------------------------------------------------------
int (*__cdecl sub_4017DE())(void)
{
  int (*result)(void); // eax@1
  unsigned int v1; // edi@1

  result = (int (*)(void))&unk_402200;
  v1 = (unsigned int)&unk_402200;
  if ( &unk_402200 < &unk_402200 )
  {
    do
    {
      result = *(int (**)(void))v1;
      if ( *(_DWORD *)v1 )
        result = (int (*)(void))result();
      v1 += 4;
    }
    while ( v1 < (unsigned int)&unk_402200 );
  }
  return result;
}

//----- (00401804) --------------------------------------------------------
int (*__cdecl sub_401804())(void)
{
  int (*result)(void); // eax@1
  unsigned int v1; // edi@1

  result = (int (*)(void))&unk_402208;
  v1 = (unsigned int)&unk_402208;
  if ( &unk_402208 < &unk_402208 )
  {
    do
    {
      result = *(int (**)(void))v1;
      if ( *(_DWORD *)v1 )
        result = (int (*)(void))result();
      v1 += 4;
    }
    while ( v1 < (unsigned int)&unk_402208 );
  }
  return result;
}

//----- (00401A35) --------------------------------------------------------
int __cdecl sub_401A35()
{
  return 0;
}

//----- (00401B10) --------------------------------------------------------
int __usercall sub_401B10<eax>(int a1<ebp>)
{
  int result; // eax@1

  result = *(_DWORD *)(*(_DWORD *)(***(_DWORD ***)(a1 + 4) + 4) + **(_DWORD **)(a1 + 4) + 40);
  if ( result )
    result = std__basic_streambuf_char_std__char_traits_char_____Unlock(result);
  return result;
}
// 402044: using guessed type int __thiscall std__basic_streambuf_char_std__char_traits_char_____Unlock(_DWORD);

//----- (00401B40) --------------------------------------------------------
int __usercall sub_401B40<eax>(int a1<ebp>)
{
  return sub_4010B0(a1 - 32);
}

//----- (00401B4A) --------------------------------------------------------
void __cdecl sub_401B4A()
{
  JUMPOUT(*(unsigned int *)loc_401130);
}
// 401130: using guessed type int loc_401130();

// ALL OK, 12 function(s) have been successfully decompiled

Nothing at all like the original.
 

Snapster

Diamond Member
Oct 14, 2001
3,916
0
0
Sure, when you decompile, all the logic is there. The problem is that it's not expressed the way we as humans would write or read it. All of the useful information, like variable and method names, is lost, making it impossible to understand just by reading, never mind the fact that the code goes through optimisation/obfuscation, so it's never anything like the original.
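
A small sketch of what losing the names means in practice. The `Stream` type, its fields, and the name `sub_401000` are hypothetical, chosen only to resemble the style of the listing above. Both functions read the same four bytes, but only the first says what those bytes mean.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical record type; the field names exist only in the source.
struct Stream {
    std::int32_t flags;
    std::int32_t width;
};

// The raw-offset version below assumes this layout.
static_assert(offsetof(Stream, width) == 4, "width is expected at offset 4");

// Source-level view: the field name documents the intent.
std::int32_t get_width(const Stream* stream) {
    return stream->width;
}

// Decompiler-style view: the same read written as "base address + 4",
// with a generic function name and generic variable names.
std::int32_t sub_401000(const void* a1) {
    std::int32_t v1;
    std::memcpy(&v1, static_cast<const unsigned char*>(a1) + 4, sizeof v1);
    return v1;
}
```

The logic survives intact; what is gone is the knowledge that offset 4 was ever called `width`.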