Compile vs. Interpret - CS 61C Course Notes

1 Learning Outcomes

Be familiar with the high-level process of executing a compilable program.
Understand the tradeoffs of compilation vs. interpretation.


2 Compilation vs. Interpretation

There are two main ways a program gets run by a computer: compilation and interpretation.

C is a compiled language. C compilers map C programs directly into architecture-specific machine code, or bitstrings of 1s and 0s.

Languages that can be compiled let us transfer programs more easily between different architectures, because the source code stays the same and only the compiler's output changes. For example, in 2020, Apple decided to change the architecture of its Mac computer series, moving from Intel-based x86 processors to ARM-based processors. Even with this huge move, C programs did not change that much. Instead, the change happened in the compilers, which are themselves programs. They were rewritten to handle the translation from the high-level C language to the new instruction set architecture.

How do Python and Java programs compare? They differ mainly in when a program is converted into low-level machine instructions.

Java: Compiled ahead of time to architecture-independent bytecode, which a just-in-time (JIT) compiler then translates to machine code at runtime.
Python: Interpreted. The source is converted to bytecode at runtime, and an interpreter executes that bytecode.
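
To make "interpreted" more concrete, here is a minimal sketch of an interpreter loop in C, assuming a hypothetical four-instruction stack machine. The opcode names and the function `run_bytecode` are invented for illustration; real Java and Python bytecode formats are far richer. The point is only that an interpreter decodes each instruction at runtime instead of translating the whole program to machine code ahead of time.

```c
#include <stddef.h>

/* Hypothetical opcodes for a toy stack machine
 * (not real Java or Python bytecode). */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

#define STACK_MAX 64

/* Interpret `code` one instruction at a time and return the value on
 * top of the stack when OP_HALT is reached. Returns 0 if the program
 * is malformed or never halts. */
int run_bytecode(const int *code, size_t len) {
    int stack[STACK_MAX];
    size_t sp = 0; /* number of values currently on the stack */
    size_t pc = 0; /* index of the next instruction to decode */

    while (pc < len) {
        switch (code[pc++]) {
        case OP_PUSH: /* operand follows the opcode */
            if (pc >= len || sp >= STACK_MAX) return 0;
            stack[sp++] = code[pc++];
            break;
        case OP_ADD: /* pop two values, push their sum */
            if (sp < 2) return 0;
            sp--;
            stack[sp - 1] = stack[sp - 1] + stack[sp];
            break;
        case OP_MUL: /* pop two values, push their product */
            if (sp < 2) return 0;
            sp--;
            stack[sp - 1] = stack[sp - 1] * stack[sp];
            break;
        case OP_HALT:
            return sp > 0 ? stack[sp - 1] : 0;
        default: /* unknown opcode */
            return 0;
        }
    }
    return 0; /* ran off the end without OP_HALT */
}
```

For example, the program `OP_PUSH 2, OP_PUSH 3, OP_ADD, OP_PUSH 4, OP_MUL, OP_HALT` computes (2 + 3) * 4, so `run_bytecode` returns 20.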

2.1 Compilation: Advantages

1. Reasonable compilation time. Imagine you have two source files, foo.c and bar.c. Changing foo.c does not require recompiling bar.c; only the files that changed need to be recompiled. This process is coordinated via Makefiles, which you will see in a future course.
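
As a rough sketch of how a build tool can decide which files to recompile, the code below compares file modification times, which is approximately the rule that tools such as make apply. The function names `is_stale` and `needs_rebuild` are hypothetical, and the code assumes a POSIX system that provides `stat()`.

```c
#include <sys/stat.h>
#include <time.h>

/* Timestamp rule: the object file is stale when the source file
 * was modified more recently than the object file. */
int is_stale(time_t src_mtime, time_t obj_mtime) {
    return src_mtime > obj_mtime;
}

/* Returns 1 if obj_path must be (re)built from src_path,
 * 0 if it is up to date, and -1 if the source cannot be read. */
int needs_rebuild(const char *src_path, const char *obj_path) {
    struct stat src, obj;
    if (stat(src_path, &src) != 0) return -1; /* no source file */
    if (stat(obj_path, &obj) != 0) return 1;  /* never built */
    return is_stale(src.st_mtime, obj.st_mtime);
}
```

Under this rule, editing foo.c makes foo.o stale, while bar.o is still newer than bar.c and is left alone.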

2. Generally much faster runtime performance. Compiled C will generally run faster than functionally equivalent Java code. After all, the compilation process optimizes code for a given architecture.

Note that depending on your application, you may still prefer Python. For example, there are libraries written for Python that are optimized for GPUs, and equivalent usable libraries might not exist for C. Python also has Cython, which you may see in a future class.

3 Compilation: Disadvantages

1. Compiled files, including the executable, are architecture-specific. The executable depends on the processor type (e.g., MIPS vs. x86 vs. RISC-V) and the operating system (e.g., Windows vs. Linux vs. macOS). “Porting your code” to a new architecture means rebuilding your executable: copying the .c files over, then recompiling them with a compiler such as gcc.
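
One way to see that the target is fixed at compile time is through the macros that compilers predefine for each target. The sketch below assumes the macro names commonly predefined by GCC, Clang, and MSVC; the function names `target_arch` and `target_os` are hypothetical. The same source compiles everywhere, but each resulting executable reports only the target it was built for.

```c
/* Name of the processor architecture this file was compiled for.
 * The preprocessor keeps exactly one branch, so the answer is
 * baked into the object file at compile time. */
const char *target_arch(void) {
#if defined(__x86_64__) || defined(_M_X64)
    return "x86-64";
#elif defined(__aarch64__) || defined(_M_ARM64)
    return "ARM64";
#elif defined(__riscv)
    return "RISC-V";
#elif defined(__mips__)
    return "MIPS";
#else
    return "unknown";
#endif
}

/* Name of the operating system this file was compiled for. */
const char *target_os(void) {
#if defined(_WIN32)
    return "Windows";
#elif defined(__APPLE__)
    return "Apple (e.g., macOS)";
#elif defined(__linux__)
    return "Linux";
#else
    return "unknown";
#endif
}
```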

2. Slower development cycle: Unlike Python, C doesn’t really have a “read-evaluate-print” loop (REPL). Instead, the cycle is “edit file, compile, link, run, find error”, meaning that development may be much slower.

3. Linking is a bottleneck. The program executable must be rebuilt whenever any part of the program changes, and compilation is a lengthy process! Some parts of the process can be sped up; for example, independent program subparts can be compiled in parallel. Other parts remain serial, like the linking stage (which we will talk about much later). This “serial bottleneck” is an example of Amdahl’s Law; more later.
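
As a preview, the serial bottleneck can be quantified with the usual statement of Amdahl's Law: if a fraction p of the work is sped up by a factor of n and the rest stays serial, the overall speedup is 1 / ((1 - p) + p / n). The function name `amdahl_speedup` below is hypothetical, and the numbers in the example are illustrative, not measurements of any real build.

```c
/* Amdahl's Law: overall speedup when `parallel_fraction` of the work
 * is sped up by a factor of `workers` and the remainder stays serial. */
double amdahl_speedup(double parallel_fraction, double workers) {
    double serial_fraction = 1.0 - parallel_fraction;
    return 1.0 / (serial_fraction + parallel_fraction / workers);
}
```

If 90% of a build could be parallelized across 8 workers, `amdahl_speedup(0.9, 8.0)` gives a speedup of about 4.7. No matter how many workers are added, the remaining 10% of serial work keeps the speedup below 10.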

4 “Compiling” as a Colloquial Term

Compiling a C program colloquially refers to the full process of using a compiler to translate C programs into executables. We will use this term for now.

In reality, this full process has multiple steps:

Compiling .c files to .o files
Automatic assembling
Linking the .o files into an executable.

We will discuss this as a four-stage process (“CALL”: Compile, Assemble, Link, Load) much later in the course.
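
The steps listed above can be tried by hand on a small file. The sketch below assumes gcc and uses hypothetical file names (`square.c`, `main.o`, `app`); the commands appear in the comment, and the C code is the file being built.

```c
/* square.c: a hypothetical file name used only for this sketch.
 *
 * Assuming gcc, the steps could be run one at a time:
 *
 *   gcc -S square.c              compile:  square.c -> square.s (assembly)
 *   gcc -c square.s              assemble: square.s -> square.o (object file)
 *   gcc -o app main.o square.o   link:     object files -> executable
 *
 * Running `gcc -c square.c` performs the first two steps together.
 * main.o stands for a hypothetical second object file that defines
 * main() and calls square(). */
int square(int x) {
    return x * x;
}
```

Because this file has no `main`, it can be compiled and assembled on its own, but it only becomes part of an executable at the link step.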