This note is a summary of the exploration of developing Coq Plugin. It will collect a lot of (ir)relevant information that might help understanding implementing a Coq Plugin.
Official Documents
https://github.com/coq/coq/tree/master/dev/doc
https://github.com/coq/coq/tree/master/doc/plugin_tutorial
Toolchain
Coq itself
I setup Coq using opam, and then install the coq-core from opam; and then I git-clone the coq source code in another directory. This can avoid all the possible building issues, but with the source code and working Coq on my hand.
I use VSCode, and together with OCaml-lsp-server. Note that, to turn on code analysis in coq repository, you need to first run dune check
inside coq repository, then the OCaml-lsp-server will starts to work.
Coq Plugin
There are four tutorials that include the basic examples of constructing Coq plugins. However, they all use makefile, and it will make LSP non-functional.
To build plugins using dune: https://coq.discourse.group/t/a-guide-to-building-your-coq-libraries-and-plugins-with-dune/20
Now you should have a functional VSCode helping you.
For these four tutorials:
- Tut0 provides a hello world example, in my opinion it is mainly used to test the building environment
- Tut1 has the most detailed comments. The main theme is about Coq and Coq plugin’s interaction. For example, how to accept a piece of Coq AST code and inspect and manipulate it from Plugin/OCaml’s side.
- Tut2’s main theme is about how to maintain customized data inside plugin, possibly persistently
- Tut3 includes the most important
construction_game.ml
, which is about constructing Coq’s (possibly inductive) data from Plugin/OCaml’s side
Some fragments of thought
- There are also a full (maybe outdated) story of how the parsing command for plugin means. You might find it helpful.
- In
construction_game
,find_reference
is annotated as deprecated and suggesting to useCoqlib.lib_ref
. It is easier to use, for example, if you want to refer to the addition function fornat
, then check Coq doc/source code, and looking for this sentenceRegister add as num.nat.add.
. Then you can directly get the reference viaCoqlib.lib_ref "num.nat.add"
. It is written clear incoqlib.mli
- In Coq, some natural number literals and string literals will not be interpreted until certain syntax scope is open. This means natural number and string literals are not intrinsic data of Coq’s core language. But it is still possible to reflect a OCaml string into Coq’s string. Except for using the above idea that recursively calling constructor of string,
- a quick and dirty way is to use
CAst.make @@ CPrim ( Constrexpr.String (Names.Id.to_string f))
for ` f: string. to get a
Constrexpr.constr_expr, which by
interp_constr_evars`, we can get a typed Coq string. - Check
Notation.interp_prim_token_gen
you will see how the string literals are handled – - which also shows the downside – we have to instantiate the
local_scope
. An easy way out is to, in Coq file, first open the string scope, then issue the command that will interpret theCPrim of string
.
- a quick and dirty way is to use
- If you wonder what is
econstr
, check here. Basically it is a typed term with some unsolved meta-variable. So theEvd.evar_map
store the meta-variable. AndGlobal.env()
is about global definitions
How far can I extend Parser?
Module Component
- The projection on module fields like
a.b.c
will be interpreted asNames.ModPath.t
. MPdot
is the constructor that represent.
projection;MPbound of uniq_ident (* reference to a module parameter in a functor *)
(locate atnames.mli
) is also related <!–- How does
.
becomeMPdot
? - How to make sure ModPath can access to the thing I want? (Like a database, or library something)
- Can I extend grammar for new core language? Not only new command, but also new constr_expr/EConstr.expr ?
- 3.5 What is wit_ all about?
- How does type checking interplay with module parameter?
- How to instantiate [a term parametrized by a module type] by a module? –>
Fragments of thought
- Coq program starts inside the folder
toplevel
, alsostm.mli
will callVernacinterp.interp
which handles the interpretation of vernacular commands in Coq.stm
stands for state-transaction-machine interface pcoq.mli
seems includes the key documentation of parser extensionGRAMMAR EXTEND Gram
is really just context free grammar (complete style), unlike in the plugin- For example, the
GRAMMAR EXTEND Gram\nGlobal: [sth]
sth here are all the non terminal I assume - Unlike in the plugin,
VERNAC COMMAND EXTEND ...
this style is very partial, and doesn’t create new non-terminal symbol explicitly - Look at How Mtac extends grammar, it is definitely possible to extend using
GRAMMAR EXTEND
- Look at How Coq-Equations extends a existent non-terminal
- Of course we know we cannot extend
constr_expr/EConstr.expr
because it is a OCaml data structure - but we don’t know after this parsing, for this new equation for this non-terminal, where will the data constructed by extended law go to? How to make control flow
- Of course we know we cannot extend
- For example, the
- How does
.
becomeMPdot
?- The possible answer is on
g_constr.mlg:189
, we have a rule| id = ident; f = Prim.fields; i = univ_annot -> { let (l,id') = f in CAst.make loc @@ CRef (make_qualid loc (DirPath.make (l@[id])) id', i)
- we should test this idea by putting an
reference/qualid
in plugin
- we should test this idea by putting an
- Once CRef is created, the
Constrintern.internalize
will applyintern_applied_reference
to analyzeCRef
, which again usesintern_qualid
- which is called by
intern_gen
, which is called byinterp_constr_evars
to type check
- which is called by
- Note that,
EConstr.t
⊃GlobRef.t
⊃KerName
⊃ModPath.t
- thus
EConstr.mkRef
should be able to represent a module
- The possible answer is on
- Where does
ModPath
refer to? - When modifying the environment
- The
Summary.frozen
seems to relates to all the current global informationstart_module
will callprotect_summaries
which generate summary and pass intoLib.start_module
- which will again store this summary information by
add_entry.start_mod
inLib.start_mod
Summary.freeze()
andSummary.unfreeze
will store and take out the snapshot of the memory and- The problem is, I still cannot see how environment
Environ.env()
information are changed by theSummary.freeze()
andSummary.unfreeze
- Because
Global.env()
will just return!global_env
at the end, and thisglobal_env
seems something Summary in charge Summary
is working because there are no “higher-order reference”, like pointer to a pointer to a pointer, everyref
points to an immutable stuff. So the idea is that,Summary
freeze
will record all the reference- actually even if it is “higher-order reference”, it is still working as long as every level of reference is registered in
Summary
. Then a one copy of a flatSummary
is just like a deep copy
- Because
- which will again store this summary information by
- Manipulating context related to Module
Environ.add_modtype
might be able to do modification- then what is
the_modtypetab
used for? why there is this side effect stuff
- then what is
add_modtype
seems to have correspondence inGlobal
and other modules
Contrexpr.module_ast_r
only includeCMident, CMapply, CMwith
, the part that refer to the outer usage of moduleGlobal.end_module
has indicate the complexity of module construction
- How to construct a module body?
- The interaction happens by
g_vernac.VernacDefineModule
, it will be translated tovernac_define_module
andvernac_begin_segment
invernacentries
Vernacentries.vernac_end_module
seems to be the starting point of “internalizing” a module- this calls
declaremods.end_module
- this calls
- The interaction happens by
- Consider the complexity of Module and all other facility. Directly simulate the Vernacular Command seems a better idea.
interp_control
will callVernacentries.translate_vernac
seems to be the place that interpret a vernacular command- still – to call it requires setting up the
atts:Attributes.vernac_flags
- I still need to figure out which part is triggering the vernacular commandline.
- It seems like
Stm.query
is the one that mainly handover tointerp_control
, however calling it is also infeasible because it is too stateful - Let’s try to set up
atts:Attributes.vernac_flags
by empty list first - I think
atts:Attributes.vernac_flags
is really just those attributes from compilers – set to empty is totally valid
Vernacentries.translate_vernac
- is called by
vernacinterp.interp_expr
, which is called byvernacinterp.interp_control
, theatts
is setup bycmd.attrs
wherecmd
is inside thevernac_control
interp_control
is called byvernacinterp.interp
which is called byStm.stm_vernac_interp
Stm.stm_vernac_interp
will setupst
usingState.get_cached at
at
is of typeStateid.t
, which can be constructed byStateid.dummy
stm_vernac_interp
will pass onexpr: vernac_control
toVernacinterp.interp
Stm.query
will useparse_sentence
to constructvernac_control
- is called by
Vernacentries.translate_vernac
will not actually do anything but only pass the DSL data tointerp_typed_vernac
, which will return a pair of “Lemma and Program” information- these information will be propagated to
interp_gen
, and set into the global byVernacstate.Declare.set
- it will also propagated up until
stm_vernac_interp
, but the returned value fromstm_vernac_interp
is largely ignored
vernac_assumption
will callcomAssumption.do_assumptions
which seems pretty complicated 4. WhenParameter
andAxiom
are used, it is not related toGlobal.push_named_assum
, so weird. 5.| Some b, NoDischarge -> Global (importability_of_bool b) | None, NoDischarge -> Global ImportDefaultBehavior
inenforce_locality_exp
makes the control (setting thescope
) 6. For example,Parameter
hasdischarge=NoDischarge, Decls.assumption_object_kind=Definitional
, and thusdeclare_axiom
- will be called instead of
declare_variable
, which is callingpush_named_assum
- will be called instead of
- Thank god there is the
Modops.module_type_of_module
function that can transform a module bodyModules
- Why module type can have definition inside??
- Each
ident
in parser can resolve to module, but generally a full path will resolve toquald
=reference
- Look at
g_vernac.mlg
, we use parenthesis to indicate theInclude (module_type)
- Module appliciation that resolve to
CMapply(module_expr, module_expr_atom)
is of type - So it seems that using
module_ast = module_ast_r CAst.t
is a good idea to refer to module, module type
- How to extract module type given a fixed module?
Nametab.locate_module
can getModPath.t
fromqualid
- given a
ModPath.t
, using(Declarations/Global/Environ).lookup_module
we can getmodule_body
- then
module_type_of_module
can extractmodule_type_body
- then
Environ.add_modtype
can addmodule_type_body
into th environmentGlobal
migth have a betteradd_modtype
Declaremods.declare_modtype
might have a good idea on how to use thisadd_modtype
- However, the above seems very unreliable,
- see
Declaremods.declare_modtype
, beforeadd_modtype
, there are a bunch of work on universe solving - So the most reliable idea seems to be check how
Printmod.print_module
reify the module signature information- and then we use parser again (I know it is ugly)
- see
- To resolve