Interview with Thomas Graf of Cisco, regarding the Cilium project.
Cilium is a “science project” that Thomas and others at Cisco and
elsewhere are hacking on, to explore the question of how to handle
policy in a legacy-free container environment that scales to millions of
endpoints. It's an experiment because the outcome isn't yet certain, and
it's a question that hasn't seen much work outside of hyperscale
providers.
Cilium is based on eBPF, a
Linux kernel technology that introduces the ability for userspace to
inject custom programs into the kernel using a bytecode analogous to Java
virtual machine bytecode. Cilium uses eBPF-based hooks that intercept
packets at various places in their path through the kernel to implement a
flexible policy engine.
Topics include:
How Chris Wright encouraged Thomas to become involved with Open
vSwitch, when Thomas was at Red Hat.
The important differences between containers and VMs (quantity,
duration of workloads, and frequency of state changes) and how Cilium
addresses these issues.
The toolchain that Cilium uses to generate an eBPF program customized
for the policy of each container in a minimal and complete way.
The potential to bypass kernel sk_buff overhead by
intercepting packets before these data structures are constructed, via
Express Data Path (XDP), leading to the potential for DPDK-like packet
forwarding performance without ever leaving the kernel. What's the
drawback? We don't know yet how it will play out.
The main benefit of getting XDP-based early access to packets is to
avoid the main Linux networking code paths, which are optimized for
delivery to socket buffers (taking advantage of segmentation offload)
as opposed to forwarding.
Dropping packets quickly is important because many operators are under
attack all the time.
Importance of performance for small and large packets.
Languages available for writing eBPF: C via LLVM/Clang and Python
(among others?).
Limitations due to the verifier (primarily restrictions on loops), and
how to work around them.
How Cilium generates its eBPF programs: a base C program, plus an agent
in Go that generates a C header file. Cilium compiles the C program to
eBPF bytecode, using LLVM, then loads it into the running kernel with
the tc utility.
Potential for difficulty in getting a compiler toolchain into
production deployments. Is the simplicity of supplying Cilium as a
container image that builds in the toolchain an advantage?
How often eBPF programs need to be recompiled in Cilium.
eBPF “map” data structures that can be shared among eBPF programs,
the rest of the kernel, and userspace.
Policy in Cilium. The basic idea is that whoever specifies Cilium
policy should not have to understand traditional networking concepts
like IP addresses and ports. Instead, abstract labels specify which
classes of containers can talk to each other.
Lessons learned from policy in Cilium.
How Cilium does datapath packet processing, how it passes labels from
source to destination, and where it applies policy.
The direction in which Cilium points for eBPF support in Open vSwitch:
first, it shows that it is possible; second, it shows that the tracing
buffer mechanism available from eBPF is a potential replacement for
Open vSwitch “upcalls” currently implemented via Netlink; third, it
points out an alternative to the flow-based model. (Does it make
sense to implement OVN directly via code generation?)
Connection tracking in eBPF.
eBPF helper functions in Linux, and the limitations of the current
ones.
Potential for applying eBPF to other targets such as DPDK or the OVS
port to Hyper-V.
Performance penalty for eBPF versus native code.
Early controversy in the kernel community over eBPF when it was
introduced.
What's next for Cilium: load balancing, IPsec, IPv4.
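To make the code-generation topic above more concrete, here is a minimal sketch in Go of the general idea: an agent turns a label-based policy into a C header that a base C program could include before being compiled to eBPF. This is not Cilium's actual code. The `Policy` type, the `GenerateHeader` function, the macro names, and the numeric label IDs are all hypothetical, chosen only for illustration. The sketch emits the allowed-source check as a flat chain of comparisons rather than a loop, which is one way generated code can sidestep the verifier's restrictions on loops.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Policy is a hypothetical, simplified policy for one container:
// its own numeric label ID and the label IDs allowed to send to it.
type Policy struct {
	LabelID        uint32
	AllowedSources []uint32
}

// GenerateHeader renders the policy as C preprocessor definitions.
// The macro names are invented for this sketch.
func GenerateHeader(p Policy) string {
	// Copy and sort so the output is deterministic and the input is untouched.
	srcs := append([]uint32(nil), p.AllowedSources...)
	sort.Slice(srcs, func(i, j int) bool { return srcs[i] < srcs[j] })

	var b strings.Builder
	b.WriteString("#ifndef POLICY_GEN_H\n#define POLICY_GEN_H\n\n")
	fmt.Fprintf(&b, "#define LOCAL_LABEL_ID %du\n", p.LabelID)

	// Emit an unrolled comparison chain; "0" keeps the macro valid
	// (and always false) when no sources are allowed.
	b.WriteString("#define POLICY_ALLOWS(src) (0")
	for _, s := range srcs {
		fmt.Fprintf(&b, " || (src) == %du", s)
	}
	b.WriteString(")\n\n#endif /* POLICY_GEN_H */\n")
	return b.String()
}

func main() {
	// Example: the container labeled 256 accepts traffic from labels 257 and 258.
	fmt.Print(GenerateHeader(Policy{LabelID: 256, AllowedSources: []uint32{258, 257}}))
}
```

In a pipeline like the one described in the interview, a header of this kind would be regenerated whenever a container's policy changes, and the base C program would then be recompiled against it and reloaded.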
More information about Cilium: slides
and the code repository.
You can find Thomas on the ovs-dev mailing
list, @tgraf__ on Twitter,
or on Facebook.
OVS Orbit is produced by Ben Pfaff. The
intro and bumper music is Electro
Deluxe, featuring Gurdonack, copyright 2014 by My Free Mickey. The
outro music is Girls like
you, featuring Thespinwires, copyright 2014 by Stefan Kartenberg.
All content is licensed under a Creative Commons Attribution 3.0
Unported (CC BY 3.0) license.