= Hacking the compiler :camel:

This document is a work-in-progress attempt to provide useful
information for people willing to inspect or modify the compiler
distribution's codebase. Feel free to improve it by sending change
proposals.

If you already have a patch that you would like to contribute to the
official distribution, please see link:CONTRIBUTING.md[].

=== Your first compiler modification

1. Create a new git branch to store your changes.
+
----
git checkout -b my-modification
----

2. Consult link:INSTALL.adoc[] for build instructions. Here is the gist of it:
+
----
./configure
make
----

3. Try the newly built compiler binaries `ocamlc`, `ocamlopt` or their
`.opt` versions. To try the toplevel, use:
+
----
make runtop
----

4. Hack frenetically and keep rebuilding.

5. Run the testsuite from time to time.
+
----
make tests
----

6. Install in a new opam switch to try things out:
+
----
opam compiler-conf install
----
+
With opam 2, create a local opam switch with the compiler installed from
the current source directory:
+
----
opam switch create . --empty
opam install .
----

7. You did it, well done! Consult link:CONTRIBUTING.md[] to send your contribution upstream.

See our <<Development tips and tricks>> for various helpful details,
for example on how to automatically <<opam compiler script,create an
opam switch>> from a compiler branch.
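
The rebuild-and-test loop from steps 4 and 5 can be wrapped in a small
shell helper. This is only a sketch: the function names `run` and
`rebuild_and_test` and the `DRY_RUN` variable are made up for this
example and are not part of the distribution; `make` and `make tests`
are the commands shown in the steps above.

```shell
# Hypothetical helpers (not part of the distribution).

# run CMD ARGS...: execute the command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Rebuild the compiler, then run the testsuite only if the build succeeded.
rebuild_and_test() {
  run make && run make tests
}
```

Calling `DRY_RUN=1 rebuild_and_test` prints the two commands without
running them; calling `rebuild_and_test` from the root of the source
tree runs them for real.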

=== What to do

There are always plenty of potential tasks, for old-timers and
newcomers alike. Here are various potential projects:

* http://caml.inria.fr/mantis/view_all_bug_page.php[The OCaml
  bugtracker] contains reported bugs and feature requests. Some
  changes that should be accessible to newcomers are marked with the
  tag link:++http://caml.inria.fr/mantis/search.php?project_id=1&sticky_issues=1&sortby=last_updated&dir=DESC&highlight_changed=24&hide_status_id=90&tag_string=junior_job++[junior_job].

* The
  https://github.com/ocamllabs/compiler-hacking/wiki/Things-to-work-on[OCaml
  Labs compiler-hacking wiki] contains various ideas of changes to
  propose, some easy, some requiring a fair amount of work.

* Documentation improvements are always much appreciated, either in
  the various `.mli` files or in the official manual
  (see link:manual/README.md[]). If you invest effort in understanding
  a part of the codebase, submitting a pull request that adds
  clarifying comments can be an excellent contribution: it helps you,
  next time, as well as other readers of the code.

* The https://github.com/ocaml/ocaml[GitHub project] contains a lot of
  pull requests, many of them in dire need of a review -- we
  have more people willing to contribute changes than to review
  someone else's change. Picking one of them, trying to understand the
  code (looking at the code around it) and asking questions about what
  you don't understand or what feels odd is super-useful. It helps the
  contribution process, and it is also an excellent way to get to know
  various parts of the compiler from the angle of a specific aspect or
  feature.
+
Again, reviewing small or medium-sized pull requests is accessible to
anyone with OCaml programming experience, and helps maintainers and
other contributors. If you also submit pull requests yourself, a good
discipline is to review at least as many pull requests as you submit.
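
To review a pull request, it often helps to fetch its code into a local
branch. GitHub exposes the head of each pull request under the ref
`pull/<number>/head`; the sketch below only builds the corresponding
`git fetch` command. The function name `fetch_pr_cmd`, the remote name
`upstream`, the local branch naming `pr-<number>` and the pull request
number are all assumptions made for this example.

```shell
# Print the git command that fetches pull request number $1 from the
# remote $2 into a local branch named pr-$1. The default remote name
# "upstream" is an assumption: it should point at the GitHub project.
fetch_pr_cmd() {
  pr=$1
  remote=${2:-upstream}
  echo "git fetch $remote pull/$pr/head:pr-$pr"
}

# Example with a placeholder pull request number:
fetch_pr_cmd 1234
```

After running the printed command, `git checkout pr-1234` gives you the
code of the pull request to read, build and test.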

== Structure of the compiler

The compiler codebase can be intimidating at first sight. Here are
a few pointers to get started.

=== Compilation pipeline

==== The driver -- link:driver/[]

The driver contains the "main" function of the compilers that drive
compilation. It parses the command-line arguments and composes the
required compiler passes by calling functions from the various parts
of the compiler described below.

==== Parsing -- link:parsing/[]

Parses source files and produces an Abstract Syntax Tree (AST)
(link:parsing/parsetree.mli[] has a lot of helpful comments). See
link:parsing/HACKING.adoc[].

The logic for Camlp4 and Ppx preprocessing is not in link:parsing/[],
but in link:driver/[]; see link:driver/pparse.mli[] and
link:driver/pparse.ml[].

==== Typing -- link:typing/[]

Type-checks the AST and produces a typed representation of the program
(link:typing/typedtree.mli[] has some helpful comments). See
link:typing/HACKING.adoc[].
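
One way to get a feel for what the parsing and typing passes produce is
to ask a compiler to dump its intermediate representations. The sketch
below uses the `-dparsetree`, `-dtypedtree` and `-i` options of `ocamlc`
on a one-line program. It assumes that some `ocamlc` is available on
your `PATH` (not necessarily the one you just built); the file name
`example.ml` and its contents are placeholders.

```shell
# Write a tiny example program to a temporary directory.
dir=$(mktemp -d)
cat > "$dir/example.ml" <<'EOF'
let double x = 2 * x
EOF

# Dump the intermediate representations if a compiler is available.
if command -v ocamlc >/dev/null 2>&1; then
  # Output of the parser (see parsing/parsetree.mli).
  (cd "$dir" && ocamlc -dparsetree -c example.ml)
  # Output of the type-checker (see typing/typedtree.mli).
  (cd "$dir" && ocamlc -dtypedtree -c example.ml)
  # -i prints the inferred interface instead of compiling.
  (cd "$dir" && ocamlc -i example.ml)
else
  echo "ocamlc not found on PATH; skipping the dumps" >&2
fi
```

For this program, `-i` reports `val double : int -> int`; the two dumps
are more verbose and are best read next to the corresponding `.mli`
files.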

==== The bytecode compiler -- link:bytecomp/[]

==== The native compiler -- link:middle_end/[] and link:asmcomp/[]

=== Runtime system

=== Libraries

link:stdlib/[]:: The standard library. Each file is largely
independent and should not need further knowledge.

link:otherlibs/[]:: External libraries such as `unix`, `threads`,
`dynlink`, `str` and `bigarray`.

Instructions for building the full reference manual are provided in
link:manual/README.md[]. However, if you only modify the documentation
comments in `.mli` files in the compiler codebase, you can observe the
result by running

----
make html_doc
----

and then opening link:./ocamldoc/stdlib_html/index.html[] in a web browser.

=== Tools

link:lex/[]:: The `ocamllex` lexer generator.

link:yacc/[]:: The `ocamlyacc` parser generator. We do not recommend
using it for user projects in need of a parser generator. Please
consider using and contributing to
link:http://gallium.inria.fr/~fpottier/menhir/[menhir] instead, which
has tons of extra features, lets you write more readable grammars, and
has excellent documentation.

=== Complete file listing

BOOTSTRAP.adoc:: instructions for bootstrapping
Changes:: what's new with each release
CONTRIBUTING.md:: how to contribute to OCaml
HACKING.adoc:: this file
INSTALL.adoc:: instructions for installation
LICENSE:: license and copyright notice
Makefile:: main Makefile
Makefile.common:: common Makefile definitions
Makefile.tools:: used by manual/ and testsuite/ Makefiles
README.adoc:: general information on the compiler distribution
README.win32.adoc:: general information on the Windows ports of OCaml
VERSION:: version string
asmcomp/:: native-code compiler and linker
boot/:: bootstrap compiler
build-aux/:: autotools support scripts
bytecomp/:: bytecode compiler and linker
compilerlibs/:: the OCaml compiler as a library
configure:: configure script
configure.ac:: autoconf input file
debugger/:: source-level replay debugger
driver/:: driver code for the compilers
flexdll/:: git submodule -- see link:README.win32.adoc[]
lex/:: lexer generator
man/:: man pages
manual/:: system to generate the manual
middle_end/:: the flambda optimisation phase
ocamldoc/:: documentation generator
ocamltest/:: test driver
otherlibs/:: several additional libraries
parsing/:: syntax analysis -- see link:parsing/HACKING.adoc[]
runtime/:: bytecode interpreter and runtime systems
stdlib/:: standard library
testsuite/:: tests -- see link:testsuite/HACKING.adoc[]
tools/:: various utilities
toplevel/:: interactive system
typing/:: typechecking -- see link:typing/HACKING.adoc[]
utils/:: utility libraries
yacc/:: parser generator

== Development tips and tricks

=== opam compiler script

The separately-distributed script
https://github.com/gasche/opam-compiler-conf[`opam-compiler-conf`] can
be used to easily build opam switches out of a git branch of the
compiler distribution. This lets you easily install and test opam
packages against a compiler version that is still under modification.

=== Useful Makefile targets

Besides the targets listed in link:INSTALL.adoc[] for build and
installation, the following targets may be of use:

`make runtop`:: builds and runs the OCaml toplevel of the distribution
  (optionally uses `rlwrap` for readline+history support)
`make natruntop`:: builds and runs the native OCaml toplevel (experimental)

`make partialclean`:: cleans the OCaml files but keeps the compiled C files.

`make depend`:: regenerates the `.depend` file. Should be used each time new dependencies are added between files.

`make -C testsuite parallel`:: see link:testsuite/HACKING.adoc[]

Additionally, there are some developer-specific targets in link:Makefile.dev[].
These targets are automatically available when working in a Git clone of the
repository, but are not available from a tarball.
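
If you are unsure whether a source tree is a Git clone or an unpacked
tarball, you can check by hand. The helper below is a sketch: the name
`is_git_clone` is made up for this example, and the way the build
system itself decides whether to enable the developer targets is not
shown here and may differ.

```shell
# Succeed when the directory $1 (default: current directory) is inside
# a Git work tree. This is one way to check by hand; it is an
# assumption that it matches what the build system looks at.
is_git_clone() {
  git -C "${1:-.}" rev-parse --is-inside-work-tree >/dev/null 2>&1
}
```

For example, `is_git_clone . && echo "developer targets should be available"`
run from the root of the source tree.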

=== Automatic configure options

If you have options to `configure` which you always (or at least frequently)
use, it's possible to store them in Git, and `configure` will automatically add
them. For example, you may wish to avoid building the debug runtime by default
while developing, in which case you can issue
`git config --global ocaml.configure '--disable-debug-runtime'`. The `configure`
script will alert you that it has picked up this option and added it _before_
any options you specified for `configure`.

Options are added before those passed on the command line, so it's possible to
override them: for example, `./configure --enable-debug-runtime` will build the
debug runtime, since the enable flag appears after the disable flag. You can
also use the full power of Git's `config` command and have options specific to
a particular clone or worktree.
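
As an illustration of a clone-specific setting, the sketch below stores
the option in the configuration of a single repository instead of the
global one. It works in a throwaway repository created with `mktemp`,
which stands in for your real clone so that no existing configuration
is modified.

```shell
# Throwaway repository standing in for a real clone of the compiler.
repo=$(mktemp -d)
git init -q "$repo"

# Without --global, the setting is stored in this clone only.
git -C "$repo" config ocaml.configure '--disable-debug-runtime'

# Read back the value stored for this clone.
git -C "$repo" config --get ocaml.configure
```

In a real clone you would run `git config ocaml.configure '...'` from
inside the source tree. Git also has a `--worktree` scope for
per-worktree settings, which requires the `extensions.worktreeConfig`
setting to be enabled.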

=== Speeding up configure

`configure` includes the standard `-C` option which caches various test results
in the file `config.cache` and can use those results to avoid running tests in
subsequent invocations. This mechanism works fine, except that it is easy to
clean the cache by mistake (e.g. with `git clean -dfX`). The cache is also
host-specific which means the file has to be deleted if you run `configure` with
a new `--host` value (this is quite common on Windows, where `configure` is
also quite slow to run).

You can elect to have host-specific cache files by issuing
`git config --global ocaml.configure-cache .`. The `configure` script will now
automatically create `ocaml-host.cache` (e.g. `ocaml-x86_64-pc-windows.cache`,
or `ocaml-default.cache`). If you work with multiple worktrees, you can share
these cache files by issuing `git config --global ocaml.configure-cache ..`. The
directory is interpreted _relative_ to the `configure` script.
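
The naming scheme of the host-specific cache files can be pictured with
a one-line function. This is an illustration only, written from the two
example names given above: the function `cache_file_name` is
hypothetical and is not the code that `configure` runs.

```shell
# Illustration only: mimic the naming scheme described above.
# $1 is the value passed to --host, or empty when no --host is given.
cache_file_name() {
  echo "ocaml-${1:-default}.cache"
}

cache_file_name x86_64-pc-windows   # prints ocaml-x86_64-pc-windows.cache
cache_file_name                     # prints ocaml-default.cache
```

With `ocaml.configure-cache` set to `..`, files with such names would be
looked up in the parent directory of the one containing `configure`,
which is how several worktrees can share them.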

=== Bootstrapping

The OCaml compiler is bootstrapped. This means that
previously-compiled bytecode versions of the compiler and lexer are
included in the repository under the
link:boot/[] directory. Once the bytecode runtime (which is written in
C) has been built, these bytecode images are used to compile the
standard library and then to build a fresh compiler. Details can be
found in link:BOOTSTRAP.adoc[].

=== Speeding up builds

Once you've built a natively-compiled `ocamlc.opt`, you can use it to
speed up future builds by copying it to `boot`:

----
cp ocamlc.opt boot/
----

If `boot/ocamlc` changes (e.g. because you ran `make bootstrap`), then
the build will revert to the slower bytecode-compiled `ocamlc` until
you do the above step again.
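
Since the copy has to be repeated after each bootstrap, it can be
convenient to wrap it in a helper that first checks that `ocamlc.opt`
has been built. The function name `refresh_boot_ocamlc_opt` is
made up for this sketch; only the `cp` command comes from the text
above.

```shell
# Hypothetical helper: copy a freshly built ocamlc.opt into boot/.
# $1 is the root of the source tree (default: current directory).
refresh_boot_ocamlc_opt() {
  root=${1:-.}
  if [ -x "$root/ocamlc.opt" ]; then
    cp "$root/ocamlc.opt" "$root/boot/"
  else
    echo "no ocamlc.opt in $root; build it first" >&2
    return 1
  fi
}
```

Run `refresh_boot_ocamlc_opt` from the root of the source tree after a
successful build, and again after each `make bootstrap`.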

=== Continuous integration

==== GitHub's CI: Travis and AppVeyor

The script that is run on Travis continuous integration servers is
link:tools/ci/travis/travis-ci.sh[]; its configuration can be found as
a Travis configuration file in link:.travis.yml[].

For example, if you want to reproduce the default build on your
machine, you can use the configuration values and run command taken from
link:.travis.yml[]:

----
CI_KIND=build XARCH=x64 bash -ex tools/ci/travis/travis-ci.sh
----

The script supports two other kinds of tests (values of the
`CI_KIND` variable) which both inspect the patch submitted as part of
a pull request. `tests` checks that the testsuite has been modified
(hopefully, improved) by the patch, and `changes` checks that the
link:Changes[] file has been modified (hopefully to add a new entry).

These tests rely on the `$TRAVIS_COMMIT_RANGE` variable which you can
set explicitly to reproduce them locally.
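
A local run of these two checks could look like the command printed by
the sketch below. Several things here are assumptions rather than facts
taken from the script: the helper name `ci_check_cmd` is made up, the
branch names are placeholders, and the range is written in the
`first...last` form that Travis uses for this variable, on the
assumption that the script accepts any range that `git` understands.

```shell
# Hypothetical helper: print the command line that runs one kind of
# check ($1: tests or changes) on the commit range $2.
ci_check_cmd() {
  kind=$1
  range=$2
  echo "CI_KIND=$kind TRAVIS_COMMIT_RANGE=$range bash -ex tools/ci/travis/travis-ci.sh"
}

# Example: the commits between the branches trunk and my-modification.
ci_check_cmd changes trunk...my-modification
```

Printing the command rather than running it makes it easy to inspect
the range before launching the script from the root of the source tree.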

The `changes` check can be disabled by including "(no change
entry needed)" in one of your commit messages -- but in general all
patches submitted should come with a Changes entry; see the guidelines
in link:CONTRIBUTING.md[].
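
The sketch below shows one way to put the marker in a commit message
and to check by hand that it is present. It runs in a throwaway
repository; the identity and the first line of the message are
placeholders, and the `grep` test is only a manual check, not
necessarily how the CI script looks for the marker.

```shell
# Throwaway repository standing in for a real clone.
repo=$(mktemp -d)
git init -q "$repo"

# Create a commit whose message contains the marker. The identity and
# the first message line are placeholders.
git -C "$repo" -c user.name='Example' -c user.email='example@example.com' \
  commit -q --allow-empty \
  -m 'Fix a typo in a comment' -m '(no change entry needed)'

# Look for the marker in the commit messages of the current branch.
if git -C "$repo" log --format=%B | grep -qF '(no change entry needed)'; then
  echo "marker found"
fi
```

Each `-m` option adds a separate paragraph to the commit message, so the
marker ends up in the body, below the summary line.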

==== INRIA's Continuous Integration (CI)

INRIA provides a Jenkins continuous integration service that OCaml
uses, see link:https://ci.inria.fr/ocaml/[]. It provides wider
architecture support (MSVC and MinGW, a zsystems s390x machine, and
various macOS versions) than the Travis/AppVeyor testing on GitHub,
but only runs on commits to the trunk or release branches, not on every
PR.

You do not need to be an INRIA employee to open an account on this
Jenkins service; anyone can create an account there to access build
logs and manually restart builds. If you
would like to do this but have trouble doing it, please email
ocaml-ci-admin@inria.fr.

To be notified by email of build failures, you can subscribe to the
ocaml-ci-notifications@inria.fr mailing list by visiting
https://sympa.inria.fr/sympa/info/ocaml-ci-notifications[its web page].

==== Running INRIA's CI on a publicly available git branch

If you have suspicions that your changes may fail on exotic architectures
(they touch the build system or the backend code generator,
for example) and would like to get wider testing than GitHub's CI
provides, it is possible to manually start INRIA's CI on arbitrary git
branches even before opening a pull request as follows:

1. Make sure you have an account on INRIA's CI as described before.

2. Make sure you have been added to the ocaml project.

3. Prepare a branch with the code you'd like to test, say "mybranch". It
is probably a good idea to make sure your branch is based on the latest
trunk.

4. Make your branch publicly available. For instance, you can fork
OCaml's GitHub repository and then push "mybranch" to your fork.

5. Visit https://ci.inria.fr/ocaml/job/precheck and log in. Click on
"Build with parameters".

6. Fill in the REPO_URL and BRANCH fields as appropriate and run the build.

7. You should receive a bunch of emails with the build logs for each
slave and each tested configuration (with and without flambda) attached.
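
Steps 3 and 4 boil down to a handful of git commands. The sketch below
plays them out with two throwaway repositories so that it can be run
anywhere: a local bare repository stands in for your fork, and the
remote name `myfork`, the identity and the commit are placeholders. With
a real fork, you would use the URL of the fork instead of the local
path.

```shell
# Stand-ins for a real clone and for a real fork.
work=$(mktemp -d)
fork=$(mktemp -d)
git init -q "$work"
git init -q --bare "$fork"

# Create the branch to test, with one placeholder commit on it.
git -C "$work" checkout -q -b mybranch
git -C "$work" -c user.name='Example' -c user.email='example@example.com' \
  commit -q --allow-empty -m 'Placeholder commit'

# Register the fork as a remote (the name "myfork" is arbitrary) and push.
git -C "$work" remote add myfork "$fork"
git -C "$work" push -q myfork mybranch

# The branch is now visible in the fork.
git -C "$fork" branch --list mybranch
```

The URL of the fork and the branch name are then the values to enter in
the REPO_URL and BRANCH fields of step 6.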

==== Changing what the CI does

INRIA's CI "main" and "precheck" jobs run the script
`tools/ci-build`. In particular, when running the CI on a publicly
available branch via the "precheck" job as explained in the previous
section, you can edit this script to change what the CI will test.

For instance, parallel builds are only tested for the "trunk"
branch. In order to use "precheck" to test parallel builds on a custom
branch, add this at the beginning of `tools/ci-build`:

----
OCAML_JOBS=10
----

=== The `caml-commits` mailing list

If you would like to receive email notifications of all commits made to the main
git repository, you can subscribe to the caml-commits@inria.fr mailing list by
visiting https://sympa.inria.fr/sympa/info/caml-commits[its web page].

Happy Hacking!