I need to execute a data-intensive algorithm in AWS Lambda to reduce costs. The main reason is that the application won't be heavily used.
In Python, the algorithm runs an analysis in 5 seconds, while Julia does it in 0.8 seconds when using a SysImage, so there's a clear advantage.
After developing a custom runtime as explained in the AWS docs, containerizing the Julia code with Docker, and testing it locally, I tried to deploy it to AWS Lambda without success.
First, the error log showed a problem loading the SysImage:
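For context, the local test can be sketched roughly as follows. This assumes the image is tagged `lisa-lambda` (a hypothetical name) and relies on the Runtime Interface Emulator bundled in the AWS base image, which listens on port 8080 inside the container. The helper function names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical image tag and host port; adjust to your own setup.
IMAGE="${IMAGE:-lisa-lambda}"
PORT="${PORT:-9000}"

# Build the invocation URL exposed by the Lambda Runtime Interface Emulator.
invoke_url() {
    printf 'http://localhost:%s/2015-03-31/functions/function/invocations' "$1"
}

# Start the container in the background, mapping the emulator's port 8080.
start_local() {
    docker run --rm -d -p "$PORT:8080" "$IMAGE"
}

# Send a test event (a JSON string) to the running container.
invoke_local() {
    curl -s -XPOST "$(invoke_url "$PORT")" -d "$1"
}
```

Typical use would be `start_local` followed by `invoke_local '{}'`. A container that works this way can still fail in Lambda itself, since the host CPU and memory limits differ.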
/var/task/CustomImage.so: failed to map segment from shared object
This was fixed by increasing Lambda's memory.
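Since more memory made this error go away, it may be worth comparing the size of the SysImage with the memory configured for the function. A minimal sketch, where `file_size_mb` and `fits_in_memory` are hypothetical helpers and the comparison is only a rough lower bound (Julia itself needs memory beyond the image):

```shell
#!/bin/sh
# Print the size of a file in whole mebibytes, rounded up.
file_size_mb() {
    bytes=$(wc -c < "$1")
    echo $(( (bytes + 1048575) / 1048576 ))
}

# Succeed if the file is smaller than the given memory size in MB.
fits_in_memory() {
    [ "$(file_size_mb "$1")" -lt "$2" ]
}

# Example (paths and sizes are illustrative):
#   fits_in_memory /var/task/CustomImage.so 512 || echo "image too large" >&2
# The memory itself can be raised with the AWS CLI, e.g.:
#   aws lambda update-function-configuration \
#       --function-name <your-function> --memory-size 2048
```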
The next issue that arose was:
ERROR: Unable to find compatible target in system image.
I thought it was related to Julia trying to find the target architecture of its runtime, so I updated the SysImage creation function to take the x86_64 architecture into account.
I recreated the SysImage, deployed everything again, and now the Lambda fails without errors.
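As far as I understand, this error appears when none of the CPU targets compiled into the SysImage match the host CPU. One way to see what the Lambda host actually offers is to inspect the flags in `/proc/cpuinfo` from inside the function. A minimal sketch, where `cpu_has_flag` is a hypothetical helper and not part of Julia or AWS tooling:

```shell
#!/bin/sh
# Succeed if the given CPU flag (e.g. avx2) is listed in the cpuinfo file.
# The second argument is optional and defaults to /proc/cpuinfo.
cpu_has_flag() {
    file="${2:-/proc/cpuinfo}"
    # Take the first "flags" line, split it into one word per line,
    # and look for an exact, literal match of the requested flag.
    grep -m1 '^flags' "$file" | tr ' \t' '\n\n' | grep -qxF "$1"
}

# Example: report a few features Julia targets commonly depend on.
#   for f in sse4_2 avx avx2; do
#       cpu_has_flag "$f" && echo "$f: yes" || echo "$f: no"
#   done
```

Running the same check on the build machine and in Lambda would show whether the two environments differ in the features the image was built for.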
Relevant code:
This function creates the SysImage:
using PackageCompiler
create_sysimage(["DataFrames", "CSV", "GeoDataFrames", "SpatialDependence", "Statistics", "LisaLambda"];
sysimage_path="CustomImage.so",
precompile_execution_file="precompile/precompile.jl",
cpu_target="generic")
My bootstrap file:
#!/bin/sh
# This script is called by the lambda execution environment when it
# receives the very first invocation request.
cd /var/task
/usr/local/julia/bin/julia --sysimage=CustomImage.so --project=. main.jl
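Because the failure is currently silent, a variant of this bootstrap that reports the Julia process's exit status on stderr might surface more information in the logs. A minimal sketch, where `run_logged` is a hypothetical helper:

```shell
#!/bin/sh
# Run a command; if it exits non-zero, report the status on stderr
# and pass the same status back to the caller.
run_logged() {
    "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "bootstrap: '$*' exited with status $status" >&2
    fi
    return "$status"
}

# In the bootstrap, this would wrap the Julia call, e.g.:
#   cd /var/task
#   run_logged /usr/local/julia/bin/julia --sysimage=CustomImage.so --project=. main.jl
```

A status above 128 usually means the process was killed by a signal (for example 137 for SIGKILL), which would point to something other than a Julia-level exception.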
And the Dockerfile:
# AWS provided base image (Amazon Linux 2)
# It includes Lambda Runtime Emulator for testing locally.
FROM public.ecr.aws/lambda/provided:al2
# Download and install Julia
WORKDIR /usr/local
RUN yum install -y tar gzip \
&& curl -LO https://julialang-s3.julialang.org/bin/linux/x64/1.9/julia-1.9.2-linux-x86_64.tar.gz \
&& tar xf julia-1.9.2-linux-x86_64.tar.gz \
&& rm julia-1.9.2-linux-x86_64.tar.gz \
&& ln -s julia-1.9.2 julia
# Dependency
RUN yum install -y gcc make \
&& curl -LO https://github.com/madler/zlib/archive/refs/tags/v1.2.9.tar.gz \
&& tar xf v1.2.9.tar.gz \
&& rm v1.2.9.tar.gz \
&& cd zlib-1.2.9 \
&& ./configure; make; make install \
&& ln -sf ./lib/libz.so.1.2.9 /lib64/libz.so.1
# && ln -sf ./lib/libz.so.1.2.9 /lib/libz.so.1
# Install application
WORKDIR /var/task
# Use a special depot path to store precompiled binaries
ENV JULIA_DEPOT_PATH /var/task/.julia
# Instantiate project and precompile packages
COPY Manifest.toml .
COPY Project.toml .
RUN /usr/local/julia/bin/julia --project=. -e "using Pkg; Pkg.instantiate();"
RUN /usr/local/julia/bin/julia --project=. -e "using Pkg; Pkg.precompile();"
# Copy application code
COPY . .
# Uncomment this line to allow more precompilation in Lambda just in case.
# That's because /var/task is a read-only path during runtime.
ENV JULIA_DEPOT_PATH /tmp/.julia:/var/task/.julia
# Install bootstrap script
WORKDIR /var/runtime
COPY bootstrap .
# Create an empty extensions directory
WORKDIR /opt/extensions
# Shared libraries:
ENV LD_PRELOAD /usr/local/lib/libz.so.1
WORKDIR /lib
COPY CustomImage.so .
WORKDIR /opt/lib
COPY CustomImage.so .
# Which module/function to call?
CMD [ "LisaLambda.handle_event" ]
The Julia algorithm also uses some CSV and GeoPackage (.gpkg) files stored alongside the code.
The project structure is as follows:
.
├── bootstrap
├── CustomImage.so
├── data
│   ├── fileA.gpkg
│   └── fileB.csv
├── Dockerfile
├── main.jl
├── Manifest.toml
├── precompile
│   ├── create_sysimage.jl
│   └── precompile.jl
├── Project.toml
└── src
    ├── GeoData.jl
    ├── LisaLambda.jl
    └── Moran.jl