I recently spent several days improving the OCaml FreeTDS C bindings for work, and I thought it might be useful to share the problems I ran into and how to solve them.
I tried to order things so the most likely issues are listed first, but if you're trying to debug some C binding crashes, I recommend just reading the whole thing.
This post will assume you're already familiar with the official documentation.
value is an alias for int
When debugging errors in C bindings, one of the first things you should check is whether you're using Val_int and Int_val correctly. Unfortunately, value is just a typedef for an integer type (intnat), and C silently converts between it and int, so the compiler won't warn you if you mix these two macros up.
The way to remember which is which is that they're shorthand for "value of int" (Val_int) and "int of value" (Int_val).
Also note that the same problem exists for Bool_val and Val_bool because booleans are also integers in C.
Bad
value why_is_c_so_bad(value unit) {
    CAMLparam1(unit);
    // error, effectively casting 5 to pointer
    CAMLreturn(Int_val(5));
}
Good
value why_is_c_so_bad(value unit) {
    CAMLparam1(unit);
    CAMLreturn(Val_int(5));
}
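To make the mix-up concrete, here is a minimal, self-contained model of the encoding in plain C. The names toy_value, TOY_VAL_INT, TOY_INT_VAL and toy_is_int are made up for this sketch. They mirror how the runtime tags immediate integers (shift left by one and set the low bit), but they are not the real macros from caml/mlvalues.h.

```c
#include <stdint.h>

/* Toy stand-in for OCaml's `value`: an integer wide enough for a pointer. */
typedef intptr_t toy_value;

/* "value of int": shift left and set the low bit to mark an immediate integer. */
#define TOY_VAL_INT(x) ((toy_value)(((uintptr_t)(x) << 1) + 1))

/* "int of value": drop the tag bit again. */
#define TOY_INT_VAL(v) ((int)((v) >> 1))

/* The runtime tells integers and pointers apart by the low bit. */
static int toy_is_int(toy_value v) { return (v & 1) != 0; }

/* Correct: encodes 5 as a tagged integer. */
static toy_value good_five(void) { return TOY_VAL_INT(5); }

/* Wrong macro, but it compiles without a warning: it "decodes" 5 into 2,
   which has a clear low bit and so looks like a pointer to the runtime. */
static toy_value bad_five(void) { return TOY_INT_VAL(5); }
```

In this model good_five() produces a word that toy_is_int accepts, while bad_five() produces one that would be followed as a pointer, which is where the crash comes from.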
Follow the rules for CAMLparam and CAMLlocal
Whenever you allocate an OCaml value, you give the garbage collector a chance to run. To ensure that values you're using don't get garbage collected, you have to use the macros CAMLparam, CAMLlocal, and CAMLreturn. As far as I can tell, these are basically the equivalent of the standard ref and unref functions most C libraries have, with some special cases to handle cleanup if your C code throws an OCaml exception.
There are cases where you can avoid this (if you're sure you're not allocating), but this is extremely annoying to debug if you get it wrong, since your code will work 99% of the time. As fun as it is to track down heisenbugs, save yourself the trouble and just use those macros every time you have a variable of type value.
Technically OK
value simplified_dbexec(value vdbconn, value vsql) {
    DBPROCESS* dbconn = DBPROCESS_VALUE(vdbconn);
    RETCODE ret = dbsqlexec(dbconn, String_val(vsql));
    return Val_int(ret);
}
Bad
value simplified_dbexec(value vdbconn, value vsql) {
    DBPROCESS* dbconn = DBPROCESS_VALUE(vdbconn);
    RETCODE ret = dbsqlexec(dbconn, String_val(vsql));
    if (ret == FAIL) {
        // Throw `exception Example_exception of string * string`
        value exn = caml_alloc_small(3, 0);
        Field(exn, 0) = EXAMPLE_EXCEPTION_TAG;
        Field(exn, 1) = caml_copy_string("simplified_dbexec");
        // error, vsql could have been GC'd while allocating exn
        // or while copying the first string arg
        // also error, using direct Field assignment after an allocation
        Field(exn, 2) = vsql;
        caml_raise(exn);
    }
    return Val_int(ret);
}
Good
value simplified_dbexec(value vdbconn, value vsql) {
    CAMLparam2(vdbconn, vsql);
    DBPROCESS* dbconn = DBPROCESS_VALUE(vdbconn);
    RETCODE ret = dbsqlexec(dbconn, String_val(vsql));
    CAMLreturn(Val_int(ret));
}

value simplified_dbexec(value vdbconn, value vsql) {
    CAMLparam2(vdbconn, vsql);
    CAMLlocal1(exn);
    DBPROCESS* dbconn = DBPROCESS_VALUE(vdbconn);
    RETCODE ret = dbsqlexec(dbconn, String_val(vsql));
    if (ret == FAIL) {
        // caml_alloc (unlike caml_alloc_small) initializes the fields, so
        // it is safe to allocate again before they are all filled in
        exn = caml_alloc(3, 0);
        Store_field(exn, 0, EXAMPLE_EXCEPTION_TAG);
        Store_field(exn, 1, caml_copy_string("simplified_dbexec"));
        Store_field(exn, 2, vsql);
        caml_raise(exn);
    }
    CAMLreturn(Val_int(ret));
}
Don't hold onto string data
The garbage collector is allowed to move OCaml data around in memory whenever it runs. This doesn't matter much for primitives like ints, but for strings you need to be aware that the result of String_val is only safe to use until the next allocation.
The easiest way to stay safe is to just never hold onto the result of String_val.
Technically ok
value simplified_dbexec(value vdbconn, value vsql) {
    CAMLparam2(vdbconn, vsql);
    DBPROCESS* dbconn = DBPROCESS_VALUE(vdbconn);
    const char* sql = String_val(vsql);
    RETCODE ret = dbsqlexec(dbconn, sql);
    CAMLreturn(Val_int(ret));
}
Bad
value simplified_dbexec(value vdbconn, value vsql) {
    CAMLparam2(vdbconn, vsql);
    CAMLlocal1(exn);
    DBPROCESS* dbconn = DBPROCESS_VALUE(vdbconn);
    const char* sql = String_val(vsql);
    RETCODE ret = dbsqlexec(dbconn, sql);
    if (ret == FAIL) {
        exn = caml_alloc(3, 0);
        Store_field(exn, 0, EXAMPLE_EXCEPTION_TAG);
        Store_field(exn, 1, caml_copy_string("simplified_dbexec"));
        // error, vsql's data could have been moved by the two previous
        // allocations, so sql may no longer point at it
        Store_field(exn, 2, caml_copy_string(sql));
        caml_raise(exn);
    }
    CAMLreturn(Val_int(ret));
}
Good
value simplified_dbexec(value vdbconn, value vsql) {
    CAMLparam2(vdbconn, vsql);
    DBPROCESS* dbconn = DBPROCESS_VALUE(vdbconn);
    RETCODE ret = dbsqlexec(dbconn, String_val(vsql));
    CAMLreturn(Val_int(ret));
}
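To see why a cached pointer goes stale, here is a toy model of a moving collector in plain C. Everything in it (toy_heap, toy_string_val, toy_gc_move) is invented for illustration and has nothing to do with the real runtime's data structures. The only point is that the heap knows where the data lives after a move, while a raw pointer taken earlier does not.

```c
#include <string.h>

/* Toy heap with two halves; a "collection" copies the live string from one
   half to the other, the way a moving collector relocates live data. */
typedef struct toy_heap {
    char space[2][32];
    int  live; /* index of the half that currently holds the string */
} toy_heap;

static void toy_heap_init(toy_heap* h, const char* s) {
    memset(h, 0, sizeof *h);
    strncpy(h->space[0], s, sizeof h->space[0] - 1);
}

/* Plays the role of String_val: valid only until the next move. */
static const char* toy_string_val(const toy_heap* h) {
    return h->space[h->live];
}

/* Plays the role of an allocation that triggers a collection. */
static void toy_gc_move(toy_heap* h) {
    int to = 1 - h->live;
    memcpy(h->space[to], h->space[h->live], sizeof h->space[to]);
    /* Scribble over the old copy, as later allocations eventually would. */
    memset(h->space[h->live], '?', sizeof h->space[0] - 1);
    h->live = to;
}
```

A pointer saved before toy_gc_move still points at the old half, while calling toy_string_val again returns the new location. That is the reason re-reading String_val at the point of use is the safe habit.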
Take the GC rules very seriously if releasing the runtime lock
When you release the runtime lock, OCaml is allowed to run the garbage collector at any time and move memory around at will. This means you definitely want to follow the rule above about using the CAMLparam and CAMLlocal macros correctly, but additionally, you can't reference any value data until the runtime lock is re-acquired.
Since we shouldn't store references to strings when the GC might be running, this means we need to copy any string data we'll need before releasing the runtime lock.
Another thing to keep in mind is that you also shouldn't mess with the list of local variables or parameters while the lock is released, since you can seriously confuse the garbage collector. Don't call CAMLparam, CAMLlocal, or CAMLreturn if you're not holding the runtime lock. If you have a situation where you need to release the runtime lock and then call CAMLreturn, you can use CAMLdrop instead of CAMLreturn, or refactor your code into two functions, where the inner function uses the CAML* macros and the outer function releases the runtime lock after calling it.
Bad
// caml_release_runtime_system and caml_acquire_runtime_system are
// declared in <caml/threads.h>
value simplified_dbexec(value vdbconn, value vsql) {
    CAMLparam2(vdbconn, vsql);
    caml_release_runtime_system();
    // error, vsql's data could be moved at any time
    // (DBPROCESS_VALUE(vdbconn) also reads an OCaml value without the
    // lock, so don't do that either)
    RETCODE ret = dbsqlexec(DBPROCESS_VALUE(vdbconn), String_val(vsql));
    caml_acquire_runtime_system();
    CAMLreturn(Val_int(ret));
}

void error_handler(int error_code)
{
    caml_acquire_runtime_system();
    CAMLparam0();
    CAMLlocal1(verror_code);
    verror_code = Val_int(error_code);
    caml_callback1(*handler, verror_code);
    caml_release_runtime_system();
    // error, messing with the OCaml locals list when we don't have the
    // runtime lock
    CAMLreturn0;
}
Still bad
value simplified_dbexec(value vdbconn, value vsql) {
    CAMLparam2(vdbconn, vsql);
    DBPROCESS* dbconn = DBPROCESS_VALUE(vdbconn);
    const char* sql = String_val(vsql);
    caml_release_runtime_system();
    // error, the data sql points to could be moved at any time
    RETCODE ret = dbsqlexec(dbconn, sql);
    caml_acquire_runtime_system();
    CAMLreturn(Val_int(ret));
}
Good
value simplified_dbexec(value vdbconn, value vsql) {
    CAMLparam2(vdbconn, vsql);
    DBPROCESS* dbconn = DBPROCESS_VALUE(vdbconn);
    // copy the string out of the OCaml heap before releasing the lock
    char* sql = caml_stat_strdup(String_val(vsql));
    caml_release_runtime_system();
    RETCODE ret = dbsqlexec(dbconn, sql);
    caml_acquire_runtime_system();
    caml_stat_free(sql);
    CAMLreturn(Val_int(ret));
}

void error_handler_with_lock(int error_code)
{
    CAMLparam0();
    CAMLlocal1(verror_code);
    verror_code = Val_int(error_code);
    caml_callback1(*handler, verror_code);
    CAMLreturn0;
}

void error_handler(int error_code)
{
    caml_acquire_runtime_system();
    error_handler_with_lock(error_code);
    caml_release_runtime_system();
}

// alternative with CAMLdrop
void error_handler(int error_code)
{
    caml_acquire_runtime_system();
    CAMLparam0();
    CAMLlocal1(verror_code);
    verror_code = Val_int(error_code);
    caml_callback1(*handler, verror_code);
    // unregister the locals while we still hold the lock
    CAMLdrop;
    caml_release_runtime_system();
}
Be careful about throwing OCaml exceptions from C
If you call caml_raise or any of its friends, you will immediately throw away the existing C stack. Any OCaml variables registered with CAMLparam or CAMLlocal will be handled correctly, but you need to be very careful about the current state of the C library you're calling into.
For example, you need to make sure to clean up any non-OCaml memory. Also, if you're in a callback from a C function... just don't do it. You can maybe try to guess that the C function doesn't need to do any cleanup, but really... don't.
If you run into a situation where you have no other choice (for example, the only way to do error handling in FreeTDS's dblib is through callbacks), you can save exceptions and rethrow them when the C library finishes.
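One way to structure the save-and-rethrow approach is sketched below in plain C, with no OCaml runtime involved so it can be compiled on its own. The library function (toy_lib_exec), its callback type, and the pending_error slot are all hypothetical stand-ins. In a real binding, the marked line in the wrapper is where you would call caml_raise, after the C library has returned and cleaned up.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical C library: reports errors only through a callback and
   expects to free its own scratch buffer before returning. */
typedef void (*toy_err_cb)(int code);
static int toy_live_buffers = 0;

static int toy_lib_exec(int should_fail, toy_err_cb on_error) {
    void* scratch = malloc(64);
    if (scratch == NULL) return -1;
    toy_live_buffers++;
    if (should_fail) on_error(42);  /* the callback runs mid-call */
    free(scratch);                  /* cleanup that a raise would skip */
    toy_live_buffers--;
    return should_fail ? -1 : 0;
}

/* The callback only records the error; it never unwinds the stack. */
static int pending_error = 0;
static void remember_error(int code) { pending_error = code; }

/* Wrapper: run the library call to completion, then report the saved
   error. Returns the saved error code if there is one, otherwise the
   library's own return code. */
static int toy_exec_wrapper(int should_fail) {
    pending_error = 0;
    int rc = toy_lib_exec(should_fail, remember_error);
    if (pending_error != 0) {
        int code = pending_error;
        pending_error = 0;
        return code; /* a real binding would raise the OCaml exception here */
    }
    return rc;
}
```

Because the callback returns normally, toy_lib_exec always reaches its own cleanup, and the error only surfaces once control is back in the wrapper.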
Properly saving OCaml values
If you need to save an OCaml value, you use caml_register_global_root and caml_remove_global_root. When doing this, keep in mind that you're registering a pointer to a value variable, not the contents, so the value needs to always be valid!
Bad
typedef struct My_thing {
    value example;
    int have_example;
} My_thing;

static My_thing* my_thing_new(void) {
    My_thing* thing = caml_stat_alloc(sizeof(My_thing));
    thing->have_example = FALSE;
    // error, thing->example is invalid
    return thing;
}

static void my_thing_free_example(My_thing* thing) {
    if (thing->have_example) {
        // error, we never registered this
        caml_remove_global_root(&(thing->example));
        thing->have_example = FALSE;
    }
}

static void my_thing_set_example(My_thing* thing, value example) {
    CAMLparam1(example);
    my_thing_free_example(thing);
    // error, not registering the right variable
    caml_register_global_root(&example);
    thing->example = example;
    thing->have_example = TRUE;
    CAMLreturn0;
}

static void my_thing_free(My_thing* thing)
{
    my_thing_free_example(thing);
    caml_stat_free(thing);
}
Good
typedef struct My_thing {
    value example;
} My_thing;

static My_thing* my_thing_new(void) {
    My_thing* thing = caml_stat_alloc(sizeof(My_thing));
    // give the field a valid value before registering it as a root
    thing->example = Val_unit;
    caml_register_global_root(&(thing->example));
    return thing;
}

static void my_thing_set_example(My_thing* thing, value example) {
    CAMLparam1(example);
    thing->example = example;
    CAMLreturn0;
}

static void my_thing_free(My_thing* thing) {
    caml_remove_global_root(&(thing->example));
    caml_stat_free(thing);
}
And that's all I have.
Figuring out what was wrong in some cases was pretty confusing, so I hope this helps save someone else some time while writing or debugging C bindings. Thanks to the members of the OCaml Discord for helping me find several of these problems!