"From English to 0's and 1's: Exploring the Magic of Programming Languages—Interpreted vs. Compiled vs. Just-In-Time Compilation."

If computers only understand 0s and 1s, how are we able to write computer programs using English words? Most high-level programming languages, from Fortran, one of the earliest, to the new languages that keep emerging, are built around English-like keywords.

Before a program can be executed, it must be converted into 0s and 1s, known as machine code, which is the only form the computer's CPU understands directly. The machine code is loaded into the computer's RAM as a sequence of instructions and executed by the CPU in a repeating fetch-decode-execute cycle. The CPU fetches the next instruction from RAM, decodes it to identify the operation to perform (for example an arithmetic operation, a comparison, or a memory read or write), and then executes it. Finally, the program counter is updated to point to the next instruction in memory. Modern CPUs add features such as instruction pipelining, branch prediction, out-of-order execution, and caching to speed this cycle up.
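The cycle described above can be sketched with a toy simulator. This is only an illustration, not a model of any real CPU: the three-instruction set (LOAD, ADD, HALT), the single accumulator register, and the names run and program are all invented for this example.

```python
# Toy fetch-decode-execute loop. The instruction set here is hypothetical.
def run(memory):
    """Execute a list of (opcode, operand) tuples and return the accumulator."""
    accumulator = 0
    program_counter = 0
    while True:
        opcode, operand = memory[program_counter]  # fetch
        if opcode == "LOAD":                       # decode, then execute
            accumulator = operand
        elif opcode == "ADD":
            accumulator += operand
        elif opcode == "HALT":
            return accumulator
        else:
            raise ValueError(f"unknown opcode: {opcode}")
        program_counter += 1                       # update: point to the next instruction

program = [("LOAD", 2), ("ADD", 3), ("ADD", 5), ("HALT", 0)]
result = run(program)
print(result)
```

A real CPU does the same thing in hardware, with instructions encoded as binary numbers rather than Python tuples.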

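While the above is the general process by which a computer executes a program, programming languages differ in how their code gets to that point. Languages are commonly classified by how they are run: compiled, interpreted, or just-in-time (JIT) compiled. Strictly speaking, this is a property of a language's implementation rather than of the language itself, but each language has a typical approach.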

Compiled Programming Languages:

Programming languages like C, C++, Go, and Rust are called compiled programming languages. A program written in one of these languages is first translated into machine code by a compiler. A linker then combines that machine code with machine code from other files and libraries to produce an executable file. An executable file contains the machine code along with the data and metadata the operating system needs to load and run the program, and the computer can run it directly. Different compiled languages go through different steps to get from source code to machine code. The C/C++ process is Preprocessing -> Compilation -> Assembly -> Linking.

Other programming languages, like Rust and Go, have their own build processes.
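As a rough illustration of the C build stages, the sketch below lists the GCC command usually associated with each one, in the order GCC performs them: preprocessing, compilation to assembly, assembly to object code, and linking. It only builds and prints the commands and does not run them. The file name hello.c, the output names, and the helper gcc_stage_commands are assumptions made for this example.

```python
# Build (but do not run) the GCC command for each stage of compiling a C file.
# "hello.c" and the derived output names are hypothetical.
def gcc_stage_commands(source="hello.c"):
    stem = source.rsplit(".", 1)[0]
    return [
        ("preprocess", ["gcc", "-E", source, "-o", f"{stem}.i"]),       # expand #include and macros
        ("compile",    ["gcc", "-S", f"{stem}.i", "-o", f"{stem}.s"]),  # C to assembly
        ("assemble",   ["gcc", "-c", f"{stem}.s", "-o", f"{stem}.o"]),  # assembly to object code
        ("link",       ["gcc", f"{stem}.o", "-o", stem]),               # object code to executable
    ]

for stage, command in gcc_stage_commands():
    print(f"{stage:<10} {' '.join(command)}")
```

In everyday use a single command such as gcc hello.c -o hello runs all four stages in one go.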

Pros of Compiled Programming:

  1. Performance: Compiled languages generally provide high performance because the code is translated ahead of time into machine code that the hardware executes directly. This allows for efficient use of system resources, resulting in faster execution and less runtime overhead than interpreted languages.

  2. Efficiency: Compiled languages often provide more control over memory allocation and optimization techniques. They allow developers to fine-tune memory management, utilize low-level operations, and optimize code for specific hardware architectures. This control can lead to more efficient resource utilization and improved overall program efficiency.

  3. Standalone Distribution: Compiled languages produce an executable for a specific target platform. Once compiled, the program can be distributed and run on compatible systems without shipping the source code or requiring a compiler or interpreter on the target machine. The same source code can usually be recompiled for other operating systems and hardware architectures.

  4. Static Typing and Safety: Compiled languages often have static type systems, which provide compile-time type checking. This helps catch type-related errors early in the development process, reducing the likelihood of runtime errors and improving program stability. Some compiled languages, such as Rust, additionally enforce memory safety, which guards against vulnerabilities such as buffer overflows and use-after-free errors; C and C++ leave this to the programmer.

  5. Security: Compiled toolchains can provide additional security features, such as compiler-inserted stack protection, and distributing only a binary keeps the source code private. These measures reduce the risk of security vulnerabilities but do not eliminate it.

Cons of Compiled Programming:

  1. Platform Dependence: The generated executable is tied to a specific operating system and CPU architecture. For example, an executable built for a 64-bit architecture will not run on a 32-bit machine.

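  2. Steeper Learning Curve: Most compiled languages are statically typed, which means every variable's type must be known at compile time. In languages like C this means declaring types explicitly, although Go and Rust can infer many of them. This can make the learning curve steeper.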
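To see what a native executable is tied to, the short sketch below reports the operating system, CPU architecture, and pointer width of the machine it runs on. It uses only Python's standard platform and struct modules; the helper name describe_platform is made up for this example, and the printed values depend on your machine.

```python
import platform
import struct

def describe_platform():
    """Return the details a native executable is built for."""
    return {
        "os": platform.system(),                   # e.g. "Linux", "Windows", "Darwin"
        "cpu": platform.machine(),                 # e.g. "x86_64", "arm64"
        "pointer_bits": struct.calcsize("P") * 8,  # pointer width of this Python build
    }

info = describe_platform()
print(info)
```

An executable compiled for one combination of these values generally has to be rebuilt before it can run on another.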

Interpreted Programming Languages:

In this category, we find languages like Python and JavaScript. Programs in these languages are executed by an interpreter, conceptually line by line. Interpreters are software programs, often written in C or C++, that read instructions written in a high-level language and carry them out directly. Interpreted programming languages are often dynamically typed, which means you do not need to declare the type of a variable. One of the major differences between compiled and interpreted languages is that compilation produces a separate output file (the executable), while interpretation produces none.
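The idea of executing a program line by line can be sketched with a toy interpreter. The two-command language below (set and add) is invented purely for illustration and is far simpler than any real interpreter, but it shows the key point: each line is read and acted on immediately, and no executable file is ever produced.

```python
# A toy interpreter for a made-up language with two commands:
#   set <name> <number>   store a number in a variable
#   add <name> <number>   add a number to an existing variable
def interpret(source):
    variables = {}
    for line_number, line in enumerate(source.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        command, name, value = line.split()
        if command == "set":
            variables[name] = int(value)
        elif command == "add":
            variables[name] += int(value)
        else:
            raise SyntaxError(f"line {line_number}: unknown command {command!r}")
    return variables

program = """
set total 10
add total 5
"""
state = interpret(program)
print(state)
```

Note that an error on a later line is only discovered when the interpreter reaches that line, after the earlier lines have already run.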

Pros of Interpreted Programming:

  1. Easy to Learn and Use: Interpreted languages often have simpler syntax and semantics compared to compiled languages, making them more approachable for beginners. They typically have fewer strict rules and allow for more flexibility, making it easier to write and understand code. Python code, for example, often reads almost like English.

  2. Rapid Development and Prototyping: Interpreted languages facilitate rapid development and prototyping. Since there is no compilation step, changes to the code can be immediately executed and tested, allowing for quick iterations and experimentation. This agility is particularly useful in situations that require fast iteration and frequent code modifications.

  3. Platform Independence: Interpreted languages are often designed to be platform-independent. The interpreter serves as a layer between the code and the underlying hardware or operating system, allowing the same code to be executed on different platforms without requiring recompilation or modification.

  4. Dynamic Typing and Flexibility: Interpreted languages often support dynamic typing, allowing variables to be assigned values of different types during runtime. This flexibility enables developers to write code that adapts to changing requirements and simplifies certain programming tasks.

  5. Cross-Language Integration: Interpreted languages often have built-in support or easy integration with other languages. This allows developers to leverage existing code libraries and take advantage of specialized functionalities from different languages, enhancing productivity and code reusability.

  6. Scripting and Automation: Interpreted languages are commonly used for scripting and automation tasks. Their concise syntax and ease of use make them suitable for writing scripts that automate repetitive tasks, interact with system utilities, or perform batch operations.
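As a small illustration of the dynamic typing mentioned in the list above, the Python snippet below rebinds the same variable to values of three different types at runtime. The names item and describe are arbitrary choices for this example.

```python
def describe(value):
    """Return the name of the value's type as determined at runtime."""
    return type(value).__name__

item = 42            # item currently holds an int
first = describe(item)
item = "forty-two"   # the same name now holds a str
second = describe(item)
item = [4, 2]        # and now a list
third = describe(item)
print(first, second, third)
```

No type is declared anywhere; the interpreter tracks the type of each value while the program runs.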

Cons of Interpreted Programming:

  1. Performance Overhead: Interpreted languages generally have slower execution speeds compared to compiled languages. Since code is interpreted at runtime, it incurs additional overhead compared to directly executing pre-compiled machine code. This can result in slower program execution, especially for computationally intensive or performance-critical applications.

  2. Lack of Full Optimization: Interpreted languages often do not provide the same level of optimization as compiled languages. The interpreter focuses on executing the code line by line without the opportunity for extensive optimization and analysis performed by ahead-of-time compilers. This can limit the overall performance and efficiency of the code.

  3. Dependency on Source Code: Interpreted programs are typically distributed as source code, and the target machine must have a compatible interpreter installed. Distributing the source exposes the logic and implementation details of the program, potentially compromising intellectual property or trade secrets.

The slower execution of interpreted languages is the main motivation for Just-In-Time compilation.

Just-In-Time Compilation:

Just-In-Time (JIT) compilation is a technique used by language runtimes to speed up interpreted code. Instead of compiling the entire program into machine code ahead of time, as a conventional compiler does, the JIT compiler monitors the running program and identifies frequently executed portions of code, known as "hot spots." These hot spots are translated into machine code, which is then reused every time that code runs again. A good example is a function that is called many times or a loop that runs for many iterations. The best-known runtimes that use Just-In-Time compilation are the Java Virtual Machine and JavaScript engines such as V8.
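The hot-spot idea can be sketched in a few lines of Python. This is a simplified analogy, not how a production JIT works: the slow path walks an expression tree on every call, and once the call count reaches a threshold the expression is compiled once and the compiled form is reused. Python's built-in compile produces bytecode rather than machine code, so it only stands in for the machine-code step here. The class HotSpotFunction, the threshold of 3, and the expression are all assumptions made for illustration.

```python
import ast
import operator

OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}

def interpret(node, env):
    """Slow path: walk the expression tree on every call."""
    if isinstance(node, ast.Expression):
        return interpret(node.body, env)
    if isinstance(node, ast.BinOp):
        return OPS[type(node.op)](interpret(node.left, env), interpret(node.right, env))
    if isinstance(node, ast.Name):
        return env[node.id]
    if isinstance(node, ast.Constant):
        return node.value
    raise ValueError(f"unsupported syntax: {type(node).__name__}")

class HotSpotFunction:
    """Interpret an expression in x until it becomes 'hot', then compile it once."""
    def __init__(self, expression, threshold=3):
        self.tree = ast.parse(expression, mode="eval")
        self.threshold = threshold
        self.calls = 0
        self.compiled = None

    def __call__(self, x):
        self.calls += 1
        if self.compiled is None and self.calls >= self.threshold:
            # Hot spot detected: compile once and keep the result for later calls.
            code = compile(self.tree, "<hot>", "eval")
            self.compiled = lambda x: eval(code, {"__builtins__": {}}, {"x": x})
        if self.compiled is not None:
            return self.compiled(x)               # fast path
        return interpret(self.tree, {"x": x})     # slow path

square_plus_one = HotSpotFunction("x * x + 1", threshold=3)
results = [square_plus_one(n) for n in range(5)]
print(results, square_plus_one.compiled is not None)
```

The first two calls take the slow path; from the third call onward the compiled version is used, which mirrors how a JIT pays the compilation cost only for code that runs often enough to justify it.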

Special Cases:

It is important to note that some languages combine these approaches. For example, Java source code is first compiled to Java bytecode, the instruction set of the Java Virtual Machine (JVM). Each operating system and CPU architecture has its own JVM implementation, which interprets and executes the same bytecode. JVMs also use Just-In-Time compilation to speed up execution.
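The "compile to bytecode, then run it on a virtual machine" pipeline can be observed directly in Python, which is used here only as a stand-in: CPython also compiles source code to bytecode for its own virtual machine, although its bytecode and tooling are different from Java's javac and java commands. The source string and variable names below are arbitrary.

```python
import dis

source = "result = (2 + 3) * 4"

# Step 1: compile the source text into a bytecode object.
code_object = compile(source, "<example>", "exec")

# Step 2: inspect the bytecode instructions (the exact names vary by Python version).
instructions = [instruction.opname for instruction in dis.get_instructions(code_object)]

# Step 3: let the virtual machine execute the bytecode.
namespace = {}
exec(code_object, namespace)

print(namespace["result"])
print(instructions)
```

The same bytecode object could be executed again without recompiling the source, which is the same separation of steps that lets one compiled Java program run on any machine with a JVM.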

TypeScript, which is a statically typed language, is compiled (transpiled) into JavaScript and then run like any other JavaScript. Its static typing and compile-time type checking help catch type-related errors early, which improves code quality and reduces the chance of runtime errors. The type annotations are removed during compilation, so they do not make the resulting JavaScript run faster by themselves; the benefit is fewer bugs reaching runtime.

Image credit: Adobe Stock, Sitebay, GeeksforGeeks, India dictionary.