What is a decompiler?
A decompiler is a computer program that takes an executable file as input and attempts to create a high-level source file that can be recompiled successfully. It is therefore the opposite of a compiler, which takes a source file and produces an executable. Decompilers usually cannot perfectly reconstruct the original source code, so their output is often hard to read, much like obfuscated code. Nonetheless, decompilers remain an important tool in the reverse engineering of computer software.
Decompilation is the act of using a decompiler, although the term can also refer to the output of a decompiler. It can be used for the recovery of lost source code and is also useful in some cases for computer security, interoperability, and error correction. The success of decompilation depends on the amount of information present in the code being decompiled and the sophistication of the analysis performed on it. The bytecode formats used by many virtual machines (such as the Java Virtual Machine or the .NET Framework Common Language Runtime) often include extensive metadata and high-level features that make decompilation quite feasible. The presence of debug data can make it possible to reproduce the original variable and structure names and even the line numbers. Machine language without such metadata or debug data is much harder to decompile.[2]
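To make the point about bytecode metadata concrete, here is a minimal Python sketch that reads the header and constant pool of a Java .class file and lists the strings stored there. The function name read_class_metadata is made up for this illustration, and the decoding treats the JVM's modified UTF-8 as ordinary UTF-8, which is a simplification.

```python
import struct

# Payload size in bytes for each fixed-size constant pool tag.
_FIXED_SIZES = {3: 4, 4: 4, 5: 8, 6: 8, 7: 2, 8: 2, 9: 4, 10: 4,
                11: 4, 12: 4, 15: 3, 16: 2, 17: 4, 18: 4, 19: 2, 20: 2}


def read_class_metadata(data: bytes) -> dict:
    """Return the class file version and every Utf8 string in the constant pool."""
    magic, minor, major, pool_count = struct.unpack_from(">IHHH", data, 0)
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    offset = 10          # first constant pool entry follows the 10-byte header
    index = 1            # constant pool indices start at 1
    strings = []
    while index < pool_count:
        tag = data[offset]
        offset += 1
        if tag == 1:     # CONSTANT_Utf8: u2 length followed by the bytes
            (length,) = struct.unpack_from(">H", data, offset)
            offset += 2
            raw = data[offset:offset + length]
            # Simplification: modified UTF-8 is decoded as plain UTF-8.
            strings.append(raw.decode("utf-8", errors="replace"))
            offset += length
        elif tag in _FIXED_SIZES:
            offset += _FIXED_SIZES[tag]
        else:
            raise ValueError(f"unknown constant pool tag {tag}")
        # Long and Double entries occupy two constant pool slots.
        index += 2 if tag in (5, 6) else 1
    return {"major": major, "minor": minor, "strings": strings}
```

A major version of 52 corresponds to Java 8. If the returned strings include attribute names such as LineNumberTable or LocalVariableTable, the class was probably compiled with debug data, which is the kind of information that lets a decompiler recover line numbers and variable names.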
Some compilers and post-compilation tools produce obfuscated code (that is, they attempt to produce output that is very difficult to decompile, or that decompiles to confusing output). This is done to make it more difficult to reverse engineer the executable.
While decompilers are normally used to (re-)create source code from binary executables, there are also decompilers to turn specific binary data files into human-readable and editable sources.
How do decompilers work?
Decompilers can be thought of as composed of a series of phases, each of which contributes specific aspects of the overall decompilation process.
1. Loader
The first decompilation phase loads and parses the input machine code or intermediate language program's binary file format. It should be able to discover basic facts about the input program, such as the architecture (Pentium, PowerPC, etc.) and the entry point. In many cases, it should be able to find the equivalent of the main function of a C program, which is the start of the user-written code. This excludes the runtime initialization code, which should not be decompiled if possible. If available, the symbol tables and debug data are also loaded. The front end may be able to identify the libraries used even if they are linked with the code, which provides the library interfaces. If it can determine the compiler or compilers used, it may also gain useful information for identifying code idioms.
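As a rough illustration of the "basic facts" a loader discovers, the following Python sketch reads the architecture, word size, and entry point from an ELF header. It assumes the input is an ELF file (other formats such as PE or Mach-O are laid out differently), and the names read_elf_header and _MACHINES are invented for this example.

```python
import struct

# A few e_machine codes defined by the ELF specification.
_MACHINES = {3: "x86", 20: "PowerPC", 40: "ARM", 62: "x86-64", 183: "AArch64"}


def read_elf_header(data: bytes) -> dict:
    """Extract architecture, word size, and entry point from an ELF header."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    bits = {1: 32, 2: 64}[data[4]]        # EI_CLASS
    endian = {1: "<", 2: ">"}[data[5]]    # EI_DATA: little or big endian
    # e_type, e_machine, and e_version follow the 16-byte e_ident array.
    _e_type, e_machine, _e_version = struct.unpack_from(endian + "HHI", data, 16)
    # e_entry is 4 bytes wide in ELF32 and 8 bytes wide in ELF64.
    (entry,) = struct.unpack_from(endian + ("Q" if bits == 64 else "I"), data, 24)
    return {
        "bits": bits,
        "machine": _MACHINES.get(e_machine, f"unknown ({e_machine})"),
        "entry": entry,
    }
```

Note that the entry point found this way is normally the runtime startup code, not main. Locating main from there takes further analysis that this sketch does not attempt.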
2. Disassembly
The next logical phase is the disassembly of machine code instructions into a machine-independent intermediate representation (IR).
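A toy version of this phase can be sketched in Python. The decoder below handles only a handful of one-byte-opcode instructions from 32-bit x86 and emits tuples as a made-up intermediate representation. The tuple format and the function name lift are assumptions for illustration, and processor flags are ignored entirely.

```python
import struct

REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def lift(code: bytes) -> list:
    """Translate a tiny subset of 32-bit x86 machine code into IR tuples."""
    ir = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op == 0x90:                       # nop
            ir.append(("nop",))
            pc += 1
        elif 0x40 <= op <= 0x47:             # inc r32 (32-bit mode encoding)
            reg = REGS[op - 0x40]
            ir.append(("assign", reg, ("add", reg, 1)))
            pc += 1
        elif 0x48 <= op <= 0x4F:             # dec r32 (32-bit mode encoding)
            reg = REGS[op - 0x48]
            ir.append(("assign", reg, ("sub", reg, 1)))
            pc += 1
        elif 0xB8 <= op <= 0xBF:             # mov r32, imm32
            (imm,) = struct.unpack_from("<I", code, pc + 1)
            ir.append(("assign", REGS[op - 0xB8], imm))
            pc += 5
        elif op == 0xC3:                     # ret
            ir.append(("return",))
            pc += 1
        else:
            raise ValueError(f"unsupported opcode 0x{op:02x} at offset {pc}")
    return ir
```

For example, lift(bytes.fromhex("b82a00000040c3")) decodes mov eax, 42; inc eax; ret. The resulting IR no longer mentions opcodes or encodings, only assignments and a return, which is what makes later phases machine-independent.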
3. Idioms
Idiomatic machine code sequences are sequences of code whose combined semantics are not immediately apparent from the instructions' individual semantics. Either as part of the disassembly phase, or as part of later analyses, these idiomatic sequences need to be translated into known equivalent IR. For example, the x86 assembly code:
Some idiomatic sequences are machine-independent; some involve only one instruction. In general, it is best to delay detection of idiomatic sequences if possible, to later stages that are less affected by instruction ordering. For example, the instruction scheduling phase of a compiler may insert other instructions into an idiomatic sequence, or change the ordering of instructions in the sequence. A pattern matching process in the disassembly phase would probably not recognize the altered pattern. Later phases group instruction expressions into more complex expressions, and modify them into a canonical (standardized) form, making it more likely that even the altered idiom will match a higher-level pattern later in the decompilation.
It is particularly important to recognize the compiler idioms for subroutine calls, exception handling, and switch statements. Some languages also have extensive support for strings or long integers.
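One well-known idiom of this kind is the branch-free absolute value on x86: cdq, then xor eax, edx, then sub eax, edx. The Python sketch below shows a naive pattern matcher that rewrites that exact sequence into a single IR assignment. The tuple representation and the name rewrite_abs_idiom are hypothetical, and the rewrite assumes edx is not used afterwards, since the real sequence also overwrites edx.

```python
# cdq sign-extends eax into edx (edx becomes 0 or -1); the xor/sub pair
# then negates eax only when it was negative.
ABS_IDIOM = [("cdq",), ("xor", "eax", "edx"), ("sub", "eax", "edx")]


def rewrite_abs_idiom(instrs: list) -> list:
    """Replace each exact, contiguous occurrence of the abs idiom with one IR node."""
    out = []
    i = 0
    while i < len(instrs):
        if list(instrs[i:i + 3]) == ABS_IDIOM:
            # Assumption: edx is dead after the idiom, so its update is dropped.
            out.append(("assign", "eax", ("abs", "eax")))
            i += 3
        else:
            out.append(instrs[i])
            i += 1
    return out
```

This matcher only fires when the three instructions are adjacent and in order. If a compiler's instruction scheduler places an unrelated instruction between them, the sequence passes through unchanged, which is the weakness of early pattern matching described above and the reason for deferring idiom detection to later, more canonical forms.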
What are the best decompilers for Java?
1. JDProject
JDProject is one of the most frequently used offline Java decompilers. It was developed to decompile Java 5 and later versions (up to Java 8 at the time of writing). It is available for Windows, Mac OS, and Linux. It is also a good choice for Eclipse and IntelliJ IDEA users, since it provides a plug-in for each: JD-Eclipse is a plug-in for the Eclipse platform, while JD-IntelliJ is a plug-in for IntelliJ IDEA.
2. Procyon
Procyon is a Java decompiler developed by Mike Strobel. It handles Java language enhancements from JDK 1.5 onward that most other decompilers do not. Procyon does well with enum declarations, annotations, and Java 8 lambdas and method references. A wiki page compares the original code with the output of the Procyon and JD decompilers. Although that comparison makes Procyon look good, I still prefer JDProject. Procyon is relatively new and still a work in progress.
3. Cavaj Java Decompiler
If you are a Windows user, Cavaj is a good option. It is simple to use and decompiles nearly any Java class file. Its main drawback is that it lacks syntax highlighting.
It is also not available for Mac OS or Linux. In short, it is a freeware standalone Windows application that converts bytecode (.class) files to Java source code.
4. DJ Java Decompiler
DJ Java Decompiler is yet another standalone Windows application. It is available for Windows XP, Windows 2003, Windows Vista, Windows 7, Windows 8, 8.1, and 10. You don't need to have the JVM (Java Virtual Machine) or a JDK installed. The main advantage of DJ Java Decompiler is that you can decompile more than one Java class file at a time. It also lets users save, print, edit, and compile the generated Java code.
5. JBVD
JBVD stands for Java Bytecode Viewer and Decompiler. It is based on the Javassist open source library. It is only available for Windows and requires Java to be installed on your device.
6. AndroChef
AndroChef is also a Windows-based Java decompiler application. With AndroChef Java Decompiler you can decompile .apk, .dex, .jar, and Java .class files. It's simple and easy to use. AndroChef supports Windows XP, Windows 2003, Windows Vista, Windows 7, Windows 8, 8.1, and 10. It successfully decompiles Java 6, Java 7, and Java 8 .jar and .class files.