I'm trying to make a toy/proof-of-concept CFFI-based Python library exposing the PQClean implementation of Classic McEliece KEM version 6960119f.
The documentation notes that you must "Provide instantiations of any of the common cryptographic algorithms used by the implementation". This is true, because this code:
from cffi import FFI
ffibuilder = FFI()
ffibuilder.cdef("""
    int PQCLEAN_MCELIECE6960119F_CLEAN_crypto_kem_enc(
        uint8_t *c,
        uint8_t *key,
        const uint8_t *pk
    );
    int PQCLEAN_MCELIECE6960119F_CLEAN_crypto_kem_dec(
        uint8_t *key,
        const uint8_t *c,
        const uint8_t *sk
    );
    int PQCLEAN_MCELIECE6960119F_CLEAN_crypto_kem_keypair(
        uint8_t *pk,
        uint8_t *sk
    );
""")
ffibuilder.set_source("_libmceliece6960119f", """
    #include "api.h"
""",
    library_dirs=["Lib/PQClean/crypto_kem/mceliece6960119f/clean"],
    include_dirs=["Lib/PQClean/crypto_kem/mceliece6960119f/clean"],
    libraries=["libmceliece6960119f_clean"])
if __name__ == "__main__":
    import os
    assert 'x64' == os.environ['VSCMD_ARG_TGT_ARCH'] == os.environ['VSCMD_ARG_HOST_ARCH']
    ffibuilder.compile(verbose=True)
    # ^ Code crashes here
    from _libmceliece6960119f import lib as libmceliece6960119f
    ...
yields MSVC linker error LNK1120, because nothing provides implementations of void shake256(uint8_t *output, size_t outlen, const uint8_t *input, size_t inlen); and int randombytes(uint8_t *output, size_t n); (which the linker sees as PQCLEAN_randombytes):
Creating library .\Release\_libmceliece6960119f.cp311-win_amd64.lib and object .\Release\_libmceliece6960119f.cp311-win_amd64.exp
LINK : warning LNK4098: defaultlib 'LIBCMT' conflicts with use of other libs; use /NODEFAULTLIB:library
libmceliece6960119f_clean.lib(operations.obj) : error LNK2001: unresolved external symbol shake256
libmceliece6960119f_clean.lib(operations.obj) : error LNK2001: unresolved external symbol PQCLEAN_randombytes
libmceliece6960119f_clean.lib(encrypt.obj) : error LNK2001: unresolved external symbol PQCLEAN_randombytes
.\_libmceliece6960119f.cp311-win_amd64.pyd : fatal error LNK1120: 2 unresolved externals
…
Traceback (most recent call last):
File "C:\Users\████\Documents\code\███\cffi_compile.py", line 34, in <module>
ffibuilder.compile(verbose=True)
File "C:\Users\████\.venvs\███\Lib\site-packages\cffi\api.py", line 725, in compile
return recompile(self, module_name, source, tmpdir=tmpdir,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\████\.venvs\███\Lib\site-packages\cffi\recompiler.py", line 1564, in recompile
outputfilename = ffiplatform.compile('.', ext,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\████\.venvs\███\Lib\site-packages\cffi\ffiplatform.py", line 20, in compile
outputfilename = _build(tmpdir, ext, compiler_verbose, debug)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\████\.venvs\███\Lib\site-packages\cffi\ffiplatform.py", line 54, in _build
raise VerificationError('%s: %s' % (e.__class__.__name__, e))
cffi.VerificationError: LinkError: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.37.32822\\bin\\HostX86\\x64\\link.exe' failed with exit code 1120
I don't want to choose an implementation for either of these functions at compile time. I want to force the consumer of this library to provide them early at run time, ideally as Python functions of type Callable[[bytes], bytes] and Callable[[int], bytes]. (Neither function needs any streaming capability, because the shake256_inc_… suite apparently isn't required by the PQCLEAN_MCELIECE6960119F_crypto_kem_… suite.)
How can this be done?
You can do that by providing C implementations that call back into Python code, and writing that Python code so that it can be overridden.
This should be enough:
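A minimal sketch of such a declaration, assuming CFFI's extern "Python+C" form, which emits each function as a real, non-static C function whose body forwards to Python. The symbol names are taken from the linker errors in the question. The constant CALLBACK_DECLS and the helper declare_callbacks are hypothetical names used only for illustration; in the build script the helper would be called on ffibuilder before ffibuilder.compile().

```python
# Declarations for the two symbols the linker could not resolve.
# extern "Python+C" asks CFFI to emit each one as a non-static C function
# whose body calls back into Python.
CALLBACK_DECLS = """
    extern "Python+C" void shake256(uint8_t *output, size_t outlen,
                                    const uint8_t *input, size_t inlen);
    extern "Python+C" int PQCLEAN_randombytes(uint8_t *output, size_t n);
"""


def declare_callbacks(ffibuilder):
    """Add the callback declarations to an existing cffi.FFI builder.

    Hypothetical helper: `ffibuilder` is assumed to be the FFI() instance
    from the build script, and this must run before ffibuilder.compile().
    """
    ffibuilder.cdef(CALLBACK_DECLS)
```

Passing the same string directly to a second ffibuilder.cdef(...) call in the build script would have the same effect.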
This causes the two functions to be emitted as C code, where the linker should find them. The emitted functions do nothing but call back into Python code. At runtime, that Python code must provide two global functions:
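The shape of those two functions might look like the sketch below, assuming they are attached with CFFI's ffi.def_extern() decorator, which matches a Python function to the extern "Python" declaration of the same name. The wrapper register_stub_callbacks is a hypothetical name, used here only so the snippet does not depend on the compiled module; in a real library the decorated functions would normally sit at module level, after importing ffi from _libmceliece6960119f. The bodies are placeholders.

```python
def register_stub_callbacks(ffi):
    """Attach placeholder Python bodies to the two C-visible symbols.

    `ffi` is assumed to be the object imported from the compiled module
    (from _libmceliece6960119f import ffi).
    """

    @ffi.def_extern()
    def shake256(output, outlen, input, inlen):
        # output: cdata `uint8_t *` that must receive exactly `outlen` bytes.
        # input:  cdata `const uint8_t *` holding `inlen` bytes to hash.
        # Declared void in C, so nothing is returned.
        raise NotImplementedError("placeholder: write the SHAKE256 output here")

    # error=-1 makes the C-level call return -1 if the Python body raises,
    # instead of the default 0, which the C caller would read as success.
    @ffi.def_extern(error=-1)
    def PQCLEAN_randombytes(output, n):
        # output: cdata `uint8_t *` that must receive `n` random bytes.
        # Return 0 on success.
        raise NotImplementedError("placeholder: write random bytes here")
```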
You can implement these however you see fit. For example, in your library's Python code, you can write something like this:
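One possible version, as a sketch under stated assumptions: the callbacks delegate to module-level provider functions that the consumer installs at run time. The names set_providers, register_callbacks and stdlib_shake256 are hypothetical. The SHAKE256 provider here takes the requested output length as a second argument (Callable[[bytes, int], bytes]), since the C function needs it. ffi.buffer and ffi.memmove are the CFFI calls used to copy data between C buffers and Python bytes.

```python
import hashlib
import os

# Module-level providers; the consumer replaces them before the first KEM call.
_shake256_provider = None     # Callable[[bytes, int], bytes]
_randombytes_provider = None  # Callable[[int], bytes]


def set_providers(shake256, randombytes):
    """Install the consumer's implementations, replacing any previous ones."""
    global _shake256_provider, _randombytes_provider
    _shake256_provider = shake256
    _randombytes_provider = randombytes


def register_callbacks(ffi):
    """Bind the two C-visible symbols to whichever providers are installed.

    `ffi` is assumed to be the object imported from the compiled module.
    """

    @ffi.def_extern()
    def shake256(output, outlen, input, inlen):
        if _shake256_provider is None:
            raise RuntimeError("no shake256 provider installed")
        data = bytes(ffi.buffer(input, inlen))  # copy the C input into bytes
        digest = _shake256_provider(data, outlen)
        if len(digest) != outlen:
            raise ValueError("shake256 provider returned the wrong length")
        ffi.memmove(output, digest, outlen)     # copy the result into the C buffer

    @ffi.def_extern(error=-1)
    def PQCLEAN_randombytes(output, n):
        if _randombytes_provider is None:
            raise RuntimeError("no randombytes provider installed")
        chunk = _randombytes_provider(n)
        if len(chunk) != n:
            raise ValueError("randombytes provider returned the wrong length")
        ffi.memmove(output, chunk, n)
        return 0


def stdlib_shake256(data, outlen):
    """Example provider built on the standard library."""
    return hashlib.shake_256(data).digest(outlen)


# Example: a consumer choosing standard-library implementations.
set_providers(stdlib_shake256, os.urandom)
```

An exception raised inside a def_extern function is printed to stderr rather than propagated to the caller of the C function, so a missing or faulty shake256 provider cannot be reported back through the void C signature; checking that providers are installed before calling into the library is safer.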
It's not a very flexible API, because the user has to provide a single, global function, and multiple users would overwrite each other's. But it's hard to do better, because the C API doesn't pass a generic void *data parameter to the callbacks.