Building an Interpreter from scratch

Semantics of programming languages

Ratings: 4.82 / 5.00




Description

How do programming languages work under the hood? What’s the difference between a compiler and an interpreter? What are a virtual machine and a JIT compiler? And what about the difference between functional and imperative programming?

There are so many questions when it comes to implementing a programming language!

The problem with “compiler classes” in school is that they are usually presented as “hardcore rocket science” meant only for advanced engineers.

Moreover, classic compiler books start from the least significant topic, lexical analysis, and go straight into the theoretical aspects of formal grammars. By the time they implement their first tokenizer module, students simply lose interest in the topic, without ever getting the chance to start implementing the programming language itself. All of this is spread over a whole semester of messing with tokenizers and BNF grammars, without understanding the actual semantics of programming languages.

I believe we should be able to build and understand the full semantics of a programming language, end to end, in 4-6 hours, with content that goes straight to the point, is shown in a live coding session as pair programming, and is described in a comprehensible way.

In the Essentials of Interpretations class we focus specifically on runtime semantics, and build an interpreter for a programming language very similar to JavaScript or Python.

Implementing a programming language will also make you more proficient in the other programming languages you use.


Who is this class for?

This class is for any curious engineer who would like to gain skills in building complex systems (and building a programming language is a pretty advanced engineering task!) and obtain transferable knowledge for building such systems.

If you are interested specifically in compilers, interpreters, and source code transformation tools, then this class is also for you.

The only prerequisite for this class is basic data structures and algorithms: trees, lists, traversal.


What is used for implementation?

Since we build a language very similar in semantics to JavaScript or Python (the two most popular programming languages today), we use JavaScript itself: its elegant multi-paradigm structure, which combines functional programming with class-based and prototype-based OOP, is an ideal fit for that.

Many engineers are familiar with JavaScript, so it should be easier to start coding right away. However, the implementation doesn’t rely on overly JS-specific constructs, and the code from the course is easily portable to TypeScript, Python, Java, C++, Rust, or any other language of your taste.

Note: we want our students to actually follow, understand, and implement every detail of the interpreter themselves, instead of just copy-pasting from a final solution. The full source code for the language is available in the video lectures, which show and guide how to structure specific modules.


What’s special about this class?

The main features of these lectures are:

  • Concise and straight to the point. Each lecture is self-sufficient, concise, and covers information directly related to the topic, without getting distracted by unrelated material or talk.

  • Animated presentation combined with live-editing notes. This makes the topics easier to understand, and shows how (and at what point in time) the object structures are connected. Static slides simply don’t work for complex content.

  • Live coding session end-to-end with assignments. The full source code, starting from scratch and going up to the very end, is presented in the video lectures. In the course we implement a full AST interpreter for our programming language.


Reading materials

As further reading and additional literature for this course the following books are recommended:

  • Structure and Interpretation of Computer Programs (SICP) by Harold Abelson and Gerald Jay Sussman

  • Programming Languages: Application and Interpretation (PLAI) by Shriram Krishnamurthi


What is in the course?


The course is divided into four parts, with 18 lectures in total and many sub-topics in each lecture. Please see the curriculum for detailed lecture descriptions.


PART 1: COMPILERS CRASH COURSE

In this part we describe different compilation and interpretation pipelines, see the difference between JIT compilers and AOT compilers, talk about what a virtual machine and a bytecode interpreter are and how they differ from an AST interpreter, show examples of native code and LLVM IR, and cover other topics.
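To make the AST-interpreter versus bytecode-interpreter distinction concrete, here is a minimal sketch that evaluates the same expression, (2 + 3) * 4, both ways. The nested-array AST format, the opcode names (PUSH, ADD, MUL), and the function names are assumptions made for illustration; they are not taken from the course materials.

```javascript
// Hypothetical AST for (2 + 3) * 4, written as nested arrays.
const ast = ['*', ['+', 2, 3], 4];

// AST interpreter: walks the tree recursively and computes the value directly.
function evalAst(node) {
  if (typeof node === 'number') return node;
  const [op, left, right] = node;
  const a = evalAst(left);
  const b = evalAst(right);
  switch (op) {
    case '+': return a + b;
    case '*': return a * b;
    default: throw new Error(`Unknown operator: ${op}`);
  }
}

// Illustrative opcode table (assumed names).
const OPCODES = { '+': 'ADD', '*': 'MUL' };

// Compiler: flattens the tree into a linear instruction list (post-order).
function compile(node, code = []) {
  if (typeof node === 'number') {
    code.push(['PUSH', node]);
    return code;
  }
  const [op, left, right] = node;
  if (!(op in OPCODES)) throw new Error(`Unknown operator: ${op}`);
  compile(left, code);
  compile(right, code);
  code.push([OPCODES[op]]);
  return code;
}

// Bytecode interpreter: a tiny stack machine that executes the instructions.
function run(code) {
  const stack = [];
  for (const [instr, arg] of code) {
    switch (instr) {
      case 'PUSH':
        stack.push(arg);
        break;
      case 'ADD': {
        const b = stack.pop();
        const a = stack.pop();
        stack.push(a + b);
        break;
      }
      case 'MUL': {
        const b = stack.pop();
        const a = stack.pop();
        stack.push(a * b);
        break;
      }
      default:
        throw new Error(`Unknown instruction: ${instr}`);
    }
  }
  return stack.pop();
}

const direct = evalAst(ast);         // tree-walking result
const viaBytecode = run(compile(ast)); // compile first, then execute
```

Both paths compute the same value. The difference is that the AST interpreter works on the tree itself, while the stack machine works on a flat instruction sequence produced by a separate compilation step.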


PART 2: INTERPRETERS: BASIC EXPRESSIONS AND VARIABLES

In this part we start building our programming language: we consider basic expressions, such as numbers and strings, talk about variables, scopes, and lexical environments, cover control structures, and touch on parser generators.
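As a rough illustration of how variables, block scopes, and lexical environments fit together, the sketch below evaluates a small program over a chain of environments. The array-based expression syntax (`var`, `set`, `begin`) and the names `Environment` and `evaluate` are assumptions chosen for this example, not necessarily what the course uses.

```javascript
// An environment is a record of bindings plus a link to the enclosing scope.
class Environment {
  constructor(record = {}, parent = null) {
    this.record = record;
    this.parent = parent;
  }

  // Creates a new binding in this scope.
  define(name, value) {
    this.record[name] = value;
    return value;
  }

  // Reads a variable, searching outward through enclosing scopes.
  lookup(name) {
    return this.resolve(name).record[name];
  }

  // Updates an existing variable in the scope where it was defined.
  assign(name, value) {
    this.resolve(name).record[name] = value;
    return value;
  }

  // Finds the environment that owns the binding, or throws.
  resolve(name) {
    if (Object.prototype.hasOwnProperty.call(this.record, name)) return this;
    if (this.parent === null) {
      throw new ReferenceError(`Variable "${name}" is not defined.`);
    }
    return this.parent.resolve(name);
  }
}

function evaluate(exp, env) {
  if (typeof exp === 'number') return exp;
  if (typeof exp === 'string') {
    // Strings wrapped in double quotes are literals; anything else is a variable name.
    if (exp.length >= 2 && exp.startsWith('"') && exp.endsWith('"')) {
      return exp.slice(1, -1);
    }
    return env.lookup(exp);
  }
  const [tag, ...args] = exp;
  switch (tag) {
    case '+':
      return evaluate(args[0], env) + evaluate(args[1], env);
    case 'var':
      return env.define(args[0], evaluate(args[1], env));
    case 'set':
      return env.assign(args[0], evaluate(args[1], env));
    case 'begin': {
      // Each block gets its own scope whose parent is the enclosing one.
      const blockEnv = new Environment({}, env);
      let result;
      for (const e of args) result = evaluate(e, blockEnv);
      return result;
    }
    default:
      throw new Error(`Unimplemented: ${JSON.stringify(exp)}`);
  }
}

const program = ['begin',
  ['var', 'x', 10],
  ['begin',
    ['var', 'x', 20]],          // shadows the outer x inside the nested block only
  ['set', 'x', ['+', 'x', 1]],  // updates the outer x
  'x'];

const programResult = evaluate(program, new Environment());
```

Here `programResult` is 11: the inner block defines its own `x`, so the outer `x` is untouched until the `set` expression increments it.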


PART 3: FUNCTIONS AND FUNCTIONAL PROGRAMMING

In this part we start discussing and implementing function abstraction and function calls. We describe the concepts of closures, lambda functions, and IILEs (immediately-invoked lambda expressions). In addition, we touch on the call stack, recursion, and syntactic sugar.
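One common way to model closures in an AST interpreter is to have a lambda capture the environment in which it was created, and to run each call in a fresh activation environment whose parent is that captured environment. The sketch below follows that idea. The `lambda` syntax and the names `Env` and `evaluate` are hypothetical, chosen only for this example.

```javascript
// Minimal environment: bindings plus a link to the parent scope.
class Env {
  constructor(record = {}, parent = null) {
    this.record = record;
    this.parent = parent;
  }

  lookup(name) {
    if (Object.prototype.hasOwnProperty.call(this.record, name)) {
      return this.record[name];
    }
    if (this.parent === null) {
      throw new ReferenceError(`Variable "${name}" is not defined.`);
    }
    return this.parent.lookup(name);
  }
}

function evaluate(exp, env) {
  if (typeof exp === 'number') return exp;
  if (typeof exp === 'string') return env.lookup(exp);

  const [tag, ...args] = exp;

  if (tag === '+') {
    return evaluate(args[0], env) + evaluate(args[1], env);
  }

  if (tag === 'lambda') {
    // A closure: the function's code together with its defining environment.
    const [params, body] = args;
    return { params, body, env };
  }

  // Anything else is treated as a function call: (callee arg1 arg2 ...).
  const fn = evaluate(tag, env);
  const argValues = args.map((a) => evaluate(a, env));

  // The activation record holds the parameters for this particular call.
  const activation = {};
  fn.params.forEach((param, i) => {
    activation[param] = argValues[i];
  });

  // Static scope: the parent is the captured environment, not the caller's.
  return evaluate(fn.body, new Env(activation, fn.env));
}

const globalEnv = new Env();

// An immediately-invoked lambda that returns another lambda closing over x = 10.
const add10 = [['lambda', ['x'], ['lambda', ['y'], ['+', 'x', 'y']]], 10];

// Calling the returned closure with y = 5.
const sum = evaluate([add10, 5], globalEnv);
```

In this sketch `sum` is 15: the inner lambda still sees `x` after the outer call has finished, because it keeps a reference to the environment where it was created.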


PART 4: OBJECT-ORIENTED PROGRAMMING

The final part of the course is devoted to object-oriented support in our language. We describe the class-based and prototype-based approaches, and implement the concepts of classes, instances, and modules.
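One possible way to see how class-based and prototype-based approaches relate is to model both as a delegation chain: an instance holds its own fields and delegates missing lookups to its class, which in turn delegates to its parent class. The sketch below is only an illustration under that assumption. The names `Env`, `newInstance`, `pointClass`, and `point3DClass` are hypothetical and not taken from the course.

```javascript
// Bindings plus a parent link; lookups fall through to the parent when missing.
class Env {
  constructor(record = {}, parent = null) {
    this.record = record;
    this.parent = parent;
  }

  lookup(name) {
    if (Object.prototype.hasOwnProperty.call(this.record, name)) {
      return this.record[name];
    }
    if (this.parent === null) {
      throw new ReferenceError(`Property "${name}" is not defined.`);
    }
    return this.parent.lookup(name);
  }
}

// A "class" is modeled as an environment that stores shared methods.
// Methods receive the instance explicitly as `self`.
const pointClass = new Env({
  sum: (self) => self.lookup('x') + self.lookup('y'),
});

// A child class delegates to its parent and may override methods.
const point3DClass = new Env({
  sum: (self) => pointClass.lookup('sum')(self) + self.lookup('z'),
}, pointClass);

// An "instance" is an environment with its own fields whose parent is the class.
function newInstance(classEnv, fields) {
  return new Env({ ...fields }, classEnv);
}

const p = newInstance(pointClass, { x: 10, y: 20 });
const q = newInstance(point3DClass, { x: 1, y: 2, z: 3 });

const pSum = p.lookup('sum')(p); // method found on the class
const qSum = q.lookup('sum')(q); // overridden method calls the parent's version
```

In this model, method lookup on an instance is the same operation as variable lookup in a nested scope, which is why the same environment structure can serve both purposes.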


I hope you’ll enjoy the class, and I will be glad to discuss any questions and suggestions in the comments.

- Dmitry Soshnikov

What You Will Learn!

  • Build a programming language from scratch
  • Interpreters and Compilers
  • AOT, JIT-compilers and Transpilers
  • AST-interpreters and Virtual Machines
  • Bytecode, LLVM, Stack-machines
  • First-class functions, Lambdas and Closures
  • Call-stack and Activation Records
  • OOP: Classes, Instances and Prototypes
  • Modules and Abstractions

Who Should Attend!

  • Curious engineers who want to know and understand how programming languages work under the hood