This blog post will be written in English/Mandarin.

既然被导师要求写点review, 不如挪到知乎这里来。

This blog post is not really reviewing anything, but it suffices as a supplementary material for specific topics.

这则博客不是真的在点评,而是以几篇paper作为中心进行背景补充。


PLMW talk : Mechanizing Metatheory

By Brigitte Pientka

可惜了,youtube version 最后没被上传。就凑合着看吧。

Mechanizing Metatheory usually refers to the action that formalize the meta-theory in a proof assistant.

个人无法翻译 Mechanizing Metatheory。这一般是指用 定理证明器 形式化某个元理论。

Motivation 动机

元理论是指讨论关于一个理论本身的性质。比如说我们想证明所有通过Java类型检查和证明的程序不会出现运行时的Null Exception这样一个定理,(曾经是错的现在不知道怎么样了),这样的定理是讨论Java类型系统本身的,所以被称作元定理。

一般来说这样的定理是用纸笔/Latex做做证明的。但是从Proof Engineering的角度来看,这样在Latex上的定义、定理没什么扩展性,而且细节容易掉进犄角旮旯里而导致某些错漏,加之没有定理证明怎么能相信你的证明呢((

Metatheory is about reasoning about the theory itself. For example, if we want to prove an miniKanren implementation can solve all the propositional logic query (in some level of time complexity) or we want to prove any query on propositional logic will halt in miniKanren and return a complete non-duplicated set of solution; or we want to prove a compiled (typed) Java program will never raise an exception. This kind of reasoning about miniKanren/Java Programming Language itself is an example of metatheory.

Usually this kind of proving is done in latex and paper, but it is hard to keep track of all the details in the paper, and it is not scalable (when you want to alter the definition) – so using proof assistant to let machine check this kind of proof is becoming a prevalent choice.

Challenge 现有问题

然而,使用定理证明器需要大量的训练和经验。最简单的例子:元理论中,在论文里面你可以省略对于变量操作(例如变量替换,自由变量提取)的细节。然而对电脑来说,这类操作必须严谨完整的写出来;而且这些操作本身会有一些trivial的性质。比如说 $X[a \mapsto c][b \mapsto d] = X[b \mapsto d] [a \mapsto c]$ 亦即在某些简单的情况下可以交换替换顺序 这个定理就要被完整的在定理证明器中递推的证明一遍。只是考虑这些变量就耗费大量的精力去证明相关的Lemma。

仅仅考虑变量本身,在定理证明器中就有多种设计模式可以参考。 Pientka 主要是考虑 HOAS (Higher-order Abstract Syntax) 和 Contextual Modal Type Theory

However, using proof assistant requires a lot of training and experience. For example, usually in the paper you can assume the reader has the idea of variable, and variable substitution (substitute all variable occurences with something else), and this kind of boilerplate definition need to be explicitly defined in a proof assistant. And what’s more, some trivial lemma $X[a \mapsto c][b \mapsto d] = X[b \mapsto d] [a \mapsto c]$ (i.e. substitution on a and substitution on b are independent of their order) needs to explicitly proved by the user using proof assistant/cannot easily generate this kind of proof. Just reasoning variable binding in an example can be highly nontrivial and requires a lot of labor.

To achieve this kind of reasoning, there have been multiple proposed ways of handling variables themselves on proving all kinds of meta-theorem. But Pientka is mainly working on HOAS (Higher-order abstract syntax) and contextual modal type theory.

HOAS

简单来说, HOAS是一个用于 将 程序本身 表达成 数据结构 的 设计模式。比如说我们用简单的代数数据类型/BNF 表达一个程序 tm := var | Lam var tm | App tm tm 。这里的Lam还是用一个简单的tuple表达的。现在我们考虑写一个替换函数,可以想象这个函数不断的递归分解每个子结构。然后我们会需要 capture-avoiding substitution, 亦即考虑同名变量的shadowing问题。

然而如果我们用HOAS的话, 这个Lam就会直接存一个函数本身 fun var => tm, 亦即这次我们用一个函数作为数据结构的一部分。这就是Higher-order的意思。

这样做的话,好处是 – 替换函数做 capture-avoiding substitution 会简单一点 (alpha-renaming),直接把传入这个函数一个term即可 ; 坏处是 – 我们有一个函数作为数据的一部分,解构这个数据本身有点复杂,尤其是在定理证明器中使用HOAS的时候(这里可能有点争议,可能参考Pientka的演讲和CPDT会更好)。

然而在定理证明器中使用HOAS不是那么容易,因为定理证明器有Strict Positivity的限制。去解决这个问题,我们有很多种办法。Pientka研究的是使用Contextual Type Theory解决这个问题的同时,combine the best between LF and MLTT. 这个演讲中有更细致的讨论,但需要不少的LF + MLTT基础。

HOAS is a technique to represent program. For example, if we use data structure to represent program, we will have BNF definition tm := var | Lam var tm | App tm tm Look at how lambda is represented as a pair of variable and term. (a data structure again) Consider how (capture-avoiding) substitution will need to inspect/destruct this data structure recursively. Using HOAS, we use function of the program itself to represent a lambda term instead of using a var and a data representing lambda term.

In that way, we gain something – substitution is just applying this stored function inside data structure! But we lose something – functions are not really (algebraic) data type so we cannot inspect/destruct it easily.

There are a lot of ways to handle this challenge in the context of proof assistant. Contextual type theory is an attempt to do that – we can use better representation for variable binding; and we can inspect the data as well. And there are a lot of ongoing work around it, and the theory and tool is not mature compared to the Coq/Agda community. And there are more details in the talk, but require some basic knowledges on LF/dependent type/type theory.


Calculating Dependently-Typed Compilers

by Pickard and Hutton

What does “Calculating” mean? 这里的”计算”是指什么?

这里的计算是指 “根据规范系统地导出 可证明正确 的编译器实现”。举个例子,假如我们要写一个Scheme编译器,我们需要参考某个官方的Scheme Reference Manual,例如 R6RS。这样做的原因是因为,当程序员编程的时候,应该是根据某个标准编程而不是某个实现。比如说GCC和Clang作为不同的C编译器,可能各自实现了不同的非标准的C语言拓展。哪某个程序员如果紧贴着GCC实现,就没法轻易的用Clang/LLVM作为编译器 – 哪如果某个学术、实验级的optimization technique只在LLVM上被实现了,如果这个程序员想要使用这个优化,就头疼了。这个可以看作是某种“可移植性”,也可以看作是“数学定义的必要性”。

Calculating here means “systematic derivation of the provably correct implementation of the compiler from the specification”. To be more precise, say I am writing a compiler for Scheme. We know the reference manual, or definition of Scheme, r6rs is online and we should make our compiler support what r6rs says a Scheme should support. Thus this r6rs is a definition of R6RS in human language.

然而,R6RS是一个几乎用人类语言写成的reference manual,这样我们依旧没法“系统地导出”某个实现。我们需要某个更形式的对Scheme语言的(数学)定义。这类定义一般是由 操作语义 或者 其他形式语义 写成的。既然这是一个数学定义,我们就有可能从这个语义“系统地导出”(或者说“算出”)一个解释器、编译器的实现。(实际上,我们可以直接把一个解释器看作一个大步操作语义(某种程度来说,就像是实现即规范。))

这样的计算和“首先我们在定理证明器中写一个编译器,然后证明这个实现满足了规范”是不同的。

But this human language is not good enough to “systematic deriving” an implementation. We need a formal (mathematical) definition, usually in operational semantic or other formal semantic. Then since it is mathematical, it becomes possible to “systematic deriving” an interpreter/compiler from this semantic. (Actually it is possible to consider an interpreter as a big-step operational semantic (as another specification)). This contrasts the way we first write and formalize a compiler in proof assistant, then prove this compiler satisfy the specification.

解释器可以系统地生成编译器。其实这也和 “正确性” 相关。如果一个编译器和解释器行为一致,哪我们就可以说这个编译器是正确的,(当然前提是这个解释器自己就是一个大步操作语义)。形式化的说, (exec (compile e) s) = eval e s

eval 是定好的解释器,s是栈,compile 就是我们想要的编译器,exec就是编译后代码的执行器(这也由我们决定)。

And interpreter can be used to generate compiler. Actually this is also the correctness condition – a compiler is correct if the compiled program acts “the same” as the interpreter. Formally speaking (exec (compile e) s) = eval e s eval is the given interpreter, s is a stack. compile is the compiler we want to have and exec is the executor for the output of the compiler (also we want to implement it).

之前的工作是写一个 动态类型语言的编译器。这次的工作是一个静态类型语言的编译器,作者选择使用依值类型语言Agda作为实现编译器的实现平台/元语言。

Previous work is about a compiler for an untyped language; and this work is about using dependent type language to write a compiler, and thus this work is formalized in Agda.

Why dependent type? Why Agda? 为什么用依值类型? 为什么用Agda?

我们可以使用依值类型去记录操作的语言的类型信息(还有栈上的类型信息).

  1. 使用依值类型作为元语言可以直接用元语言的类型检查来做到对象语言的类型检查。这里的意思说,假设我们用Agda来写Java的编译器,那么我们可以做到在Agda内定义数据的时候使用依值类型,那么Java的源程序,作为IR数据存在Agda内会需要符合定义 – 依值类型的强大之处在于可以让这些IR数据直接被静态类型检查了(不会有ill-typed的程序作为数据的一部分)这相当于直接类型检查
  2. 同样的检查可以用于静态检查 栈的信息

We can use dependent type to trace type information everywhere. 1. We use dependent type of meta-language to act as a type checker in our compiler. 2. We can also use dependent type to describe type information of the stack-based abstract machine (which is the target of the compiler), also to describe the state of the stack

How to Calculate? 是怎么算的呢?

我们需要根据那个正确性条件实现编译器。所以从某种角度来说,我们需要解以下方程 (exec (compile e) s) = eval e s

未知量是 exec, compile 还有 compile的输出的类型(栈的指令的数据结构也是由我们定义的)。为了解这个方程,我们对e做结构递归(因为源语言已经被定好了),然后对每一个不同的表达式我们都要得到正确的exec,compile 的“反应”。

更详细的信息在论文里可以看见;而且论文中的源语言还支持更多更高级的特性。

Thus we want to solve the following equation (exec (compile e) s) = eval e s where the unknown is exec, compile . To solve this equation, we do induction on e (because it is the source language program is already inductively defined) and for each case we find appropriate compile and exec according to the behaviour of eval .

The paper covers more – as the source language supports more advanced feature.


Leibniz equality is isomorphic to Martin-Löf identity, parametrically (JFP Presentation),

presented by LambdaMan

如果读者不了解 HoTT (不理解为什么同一个等号的命题可以有多个证明而且我们会在乎这些证明是不是相等)以及 parametricity/logical relation (LR)/theorem for free!, 这个论文和这个演讲真的不太能读懂。然而LR这个东西很难讲清楚/讲好,难讲到让我怀疑Progress and Preservation是不是PL界走上的弯路 – 因为如果从最初就没有Syntactic Type Safety, 哪LR可能会在本科PL就教,可能这个理论会被简化到早就可以被轻松的理解了。

This talk is almost non-comprehensible if you don’t have knowledge on HoTT (why an equality can have two proof for it?) and parametricity/logical relation (theorem for free!). Good thing is everything is written in literal agda, that means the whole paper is a reproducible artifact – that is also bad because if the audience has no idea what is a proof assistant then they cannot understand anything.

这个演讲 讲了什么是 Leibniz Equality,但是对 Martin Lof Identity Type 讲的不够清楚。如果感兴趣我建议读读 HoTT,当然定义是非常简单的但是定义是怎么来的并不简单。附上的链接是Harper的课,可能不够数学但是足够CS了。当然可能也需要一些定理证明器的基础。

The talk has demonstrate what is a Leibniz Equality but the part on Martin Lof Identity is not clear enough. I will suggest the HoTT link I gave above (by Robert Harper) to really get some idea. Maybe that is still too steep for people don’t know proof assistant, then I will suggest some prerequisites on knowing proof assistant.