
Sign up to save your podcasts
Or
This week on BSDNow, reports from AsiaBSDcon, TrueOS and FreeBSD news, Optimizing IllumOS Kernel, your questions and more.
Previously, Kris directly handed out commit bits. Now, the Core developers have provided a small list of requirements for gaining a TrueOS commit bit:
Create five or more pull requests in a TrueOS Project repository within a single six month period.
Stay active in the TrueOS community through at least one of the available community channels (Gitter, Discourse, IRC, etc.).
Request commit access from the core developers via [email protected] OR Core developers contact you concerning commit access.
Pull requests can be any contribution to the project, from minor documentation tweaks to creating full utilities.
At the end of every month, the core developers review the commit logs, removing elements that break the Project or deviate too far from its intended purpose. Additionally, outstanding pull requests with no active dissension are immediately merged, if possible. For example, a user submits a pull request which adds a little-used OpenRC script. No one from the community comments on the request or otherwise argues against its inclusion, resulting in an automatic merge at the end of the month. In this manner, solid contributions are routinely added to the project and never left in a state of “limbo”.
Contributors to the TrueOS Project enjoy a number of benefits, including:
A personal TrueOS email alias: @trueos.org
Full access for managing TrueOS issues on GitHub.
Regular meetings with the core developers and other contributors.
Access to private chat channels with the core developers.
Recognition as part of an online Who’s Who of TrueOS developers.
The eternal gratitude of the core developers of TrueOS.
A warm, fuzzy feeling.
Intel will be more actively engaging with the FreeBSD Foundation and the FreeBSD Project to deliver more timely support for Intel products and technologies in FreeBSD.
Intel has contributed code to FreeBSD for individual device drivers (i.e. NICs) in the past, but is now seeking a more holistic “systems thinking” approach.
We will work closely with the FreeBSD Foundation to ensure the drivers, tools, and applications needed on Intel® SSD-based storage appliances are available to the community. This collaboration will also provide timely support for future Intel® 3D XPoint™ products.
iSCSI is often touted as a low-cost replacement for fibre-channel (FC) Storage Area Networks (SANs). Instead of having to setup a separate fibre-channel network for the SAN, or invest in the infrastructure to run Fibre-Channel over Ethernet (FCoE), iSCSI runs on top of standard TCP/IP. This means that the same network equipment used for routing user data on a network could be utilized for the storage as well.
This article will cover a very basic setup where a FreeBSD server is configured as an iSCSI Target, and another FreeBSD server is configured as the iSCSI Initiator. The iSCSI Target will export a single disk drive, and the initiator will create a filesystem on this disk and mount it locally. Advanced topics, such as multipath, ZFS storage pools, failover controllers, etc. are not covered.
The real magic is the /etc/ctl.conf file, which contains all of the information necessary for ctld to share disk drives on the network. Check out the man page for /etc/ctl.conf for more details; below is the configuration file that I created for this test setup. Note that on a system that has never had iSCSI configured, there will be no existing configuration file, so go ahead and create it.
Then, enable ctld and start it:
You can use the ctladm command to see what is going on:
root@bsdtarget:/dev # ctladm lunlist
(7:0:0/0): Fixed Direct Access SPC-4 SCSI device
(7:0:1/1): Fixed Direct Access SPC-4 SCSI device
root@bsdtarget:/dev # ctladm devlist
LUN Backend Size (Blocks) BS Serial Number Device ID
0 block 10485760 512 MYSERIAL 0 MYDEVID 0
1 block 10485760 512 MYSERIAL 1 MYDEVID 1
In order for a FreeBSD host to become an iSCSI Initiator, the iscsd daemon needs to be started.
sysrc iscsid_enable=”YES”
service iscsid start
Next, the iSCSI Initiator can manually connect to the iSCSI target using the iscsictl tool. While setting up a new iSCSI session, this is probably the best option. Once you are sure the configuration is correct, add the configuration to the /etc/iscsi.conf file (see man page for this file). For iscsictl, pass the IP address of the target as well as the iSCSI IQN for the session:
You should now have a new device (check dmesg), in this case, da1
The guide them walks through partitioning the disk, and laying down a UFS file system, and mounting it
This it walks through how to disconnect iscsi, incase you don’t want it anymore
This all looked nice and easy, and it works very well. Now lets see what happens when you try to mount the iSCSI from Windows
Ok, that wasn’t so bad.
Now, instead of sharing an entire space disk on the host via iSCSI, share a zvol. Now your windows machine can be backed by ZFS. All of your problems are solved.
Technical Lead at SysFive, and Former OpenBSD Committer
Even though mdoc(7) is a semantic markup language, traditionally none of the semantic annotations were communicated to the reader. [...] Now, at least in -T html output mode, you can see the semantic function of marked-up words by hovering your mouse over them.
In terminal output modes, we have the ctags(1)-like internal search facility built around the less(1) tag jump (:t) feature for quite some time now. We now have a similar feature in -T html output mode. To jump to (almost) the same places in the text, go to the address bar of the browser, type a hash mark ('#') after the URI, then the name of the option, command, variable, error code etc. you want to jump to, and hit enter.
Recently I've had some motivation to look into the KCF on Illumos and discovered that, unbeknownst to me, we already had an AES-NI implementation that was automatically enabled when running on Intel and AMD CPUs with AES-NI support. This work was done back in 2010 by Dan Anderson.This was great news, so I set out to test the performance in Illumos in a VM on my Mac with a Core i5 3210M (2.5GHz normal, 3.1GHz turbo).
So now comes the test for the KCF. I wrote a quick'n'dirty crypto test module that just performed a bunch of encryption operations and timed the results.
What the hell is that?! This is just plain unacceptable. Obviously we must have hit some nasty performance snag somewhere, because this is comical. And sure enough, we did.
When looking around in the AES-NI implementation I came across this bit in aes_intel.s that performed the CLTS instruction.
This is a problem: 3.1.2 Instructions That Cause VM Exits ConditionallyCLTS. The CLTS instruction causes a VM exit if the bits in position 3 (corresponding to CR0.TS) are set in both the CR0 guest/host mask and the CR0 read shadow.
The CLTS instruction signals to the CPU that we're about to use FPU registers (which is needed for AES-NI), which in VMware causes an exit into the hypervisor. And we've been doing it for every single AES block! Needless to say, performing the equivalent of a very expensive context switch every 16 bytes is going to hurt encryption performance a bit. The reason why the kernel is issuing CLTS is because for performance reasons, the kernel doesn't save and restore FPU register state on kernel thread context switches. So whenever we need to use FPU registers inside the kernel, we must disable kernel thread preemption via a call to kpreempt_disable() and kpreempt_enable() and save and restore FPU register state manually. During this time, we cannot be descheduled (because if we were, some other thread might clobber our FPU registers), so if a thread does this for too long, it can lead to unexpected latency bubbles
The solution was to restructure the AES and KCF block crypto implementations in such a way that we execute encryption in meaningfully small chunks. I opted for 32k bytes, for reasons which I'll explain below. Unfortunately, doing this restructuring work was a bit more complicated than one would imagine, since in the KCF the implementation of the AES encryption algorithm and the block cipher modes is separated into two separate modules that interact through an internal API, which wasn't really conducive to high performance (we'll get to that later). Anyway, having fixed the issue here and running the code at near native speed, this is what I get:
AES-128/CTR: 439 MB/s
AES-128/CBC: 483 MB/s
AES-128/GCM: 252 MB/s
Not disastrous anymore, but still, very, very bad. Of course, you've got keep in mind, the thing we're comparing it to, OpenSSL, is no slouch. It's got hand-written highly optimized inline assembly implementations of most of these encryption functions and their specific modes, for lots of platforms. That's a ton of code to maintain and optimize, but I'll be damned if I let this kind of performance gap persist.
Fixing this, however, is not so trivial anymore. It pertains to how the KCF's block cipher mode API interacts with the cipher algorithms. It is beautifully designed and implemented in a fashion that creates minimum code duplication, but this also means that it's inherently inefficient.
ECB, CBC and CTR gained the ability to pass an algorithm-specific "fastpath" implementation of the block cipher mode, because these functions benefit greatly from pipelining multiple cipher calls into a single place.
ECB, CTR and CBC decryption benefit enormously from being able to exploit the wide XMM register file on Intel to perform encryption/decryption operations on 8 blocks at the same time in a non-interlocking manner. The performance gains here are on the order of 5-8x.CBC encryption benefits from not having to copy the previously encrypted ciphertext blocks into memory and back into registers to XOR them with the subsequent plaintext blocks, though here the gains are more modest, around 1.3-1.5x.
After all of this work, this is how the results now look on Illumos, even inside of a VM:
Algorithm/Mode 128k ops
AES-128/CTR: 3121 MB/s
AES-128/CBC: 691 MB/s
AES-128/GCM: 1053 MB/s
On the decryption side of things, CBC decryption also jumped from 627 MB/s to 3011 MB/s. Seeing these performance numbers, you can see why I chose 32k for the operation size in between kernel preemption barriers. Even on the slowest hardware with AES-NI, we can expect at least 300-400 MB/s/core of throughput, so even in the worst case, we'll be hogging the CPU for at most ~0.1ms per run.
Overall, we're even a little bit faster than OpenSSL in some tests, though that's probably down to us encrypting 128k blocks vs 8k in the "openssl speed" utility. Anyway, having fixed this monstrous atrocity of a performance bug, I can now finally get some sleep.
I remember the Y2K problem quite vividly. The world was going crazy for years, paying insane amounts of money to experts to fix critical legacy systems, and there was a neverending stream of predictions from the media on how it’s all going to fail. Most didn’t even understand what the problem was, and I remember one magazine writing something like the following:
Michael Lucas: Get your name in “Absolute FreeBSD 3rd Edition”
ZFS compressed ARC stats to top
Matthew Dillon discovered HAMMER was repeating itself when writing to disk. Fixing that issue doubled write speeds
TedU on Meaningful Short Names
vBSDcon and EuroBSDcon Call for Papers are open
4.9
8989 ratings
This week on BSDNow, reports from AsiaBSDcon, TrueOS and FreeBSD news, Optimizing IllumOS Kernel, your questions and more.
Previously, Kris directly handed out commit bits. Now, the Core developers have provided a small list of requirements for gaining a TrueOS commit bit:
Create five or more pull requests in a TrueOS Project repository within a single six month period.
Stay active in the TrueOS community through at least one of the available community channels (Gitter, Discourse, IRC, etc.).
Request commit access from the core developers via [email protected] OR Core developers contact you concerning commit access.
Pull requests can be any contribution to the project, from minor documentation tweaks to creating full utilities.
At the end of every month, the core developers review the commit logs, removing elements that break the Project or deviate too far from its intended purpose. Additionally, outstanding pull requests with no active dissension are immediately merged, if possible. For example, a user submits a pull request which adds a little-used OpenRC script. No one from the community comments on the request or otherwise argues against its inclusion, resulting in an automatic merge at the end of the month. In this manner, solid contributions are routinely added to the project and never left in a state of “limbo”.
Contributors to the TrueOS Project enjoy a number of benefits, including:
A personal TrueOS email alias: @trueos.org
Full access for managing TrueOS issues on GitHub.
Regular meetings with the core developers and other contributors.
Access to private chat channels with the core developers.
Recognition as part of an online Who’s Who of TrueOS developers.
The eternal gratitude of the core developers of TrueOS.
A warm, fuzzy feeling.
Intel will be more actively engaging with the FreeBSD Foundation and the FreeBSD Project to deliver more timely support for Intel products and technologies in FreeBSD.
Intel has contributed code to FreeBSD for individual device drivers (i.e. NICs) in the past, but is now seeking a more holistic “systems thinking” approach.
We will work closely with the FreeBSD Foundation to ensure the drivers, tools, and applications needed on Intel® SSD-based storage appliances are available to the community. This collaboration will also provide timely support for future Intel® 3D XPoint™ products.
iSCSI is often touted as a low-cost replacement for fibre-channel (FC) Storage Area Networks (SANs). Instead of having to setup a separate fibre-channel network for the SAN, or invest in the infrastructure to run Fibre-Channel over Ethernet (FCoE), iSCSI runs on top of standard TCP/IP. This means that the same network equipment used for routing user data on a network could be utilized for the storage as well.
This article will cover a very basic setup where a FreeBSD server is configured as an iSCSI Target, and another FreeBSD server is configured as the iSCSI Initiator. The iSCSI Target will export a single disk drive, and the initiator will create a filesystem on this disk and mount it locally. Advanced topics, such as multipath, ZFS storage pools, failover controllers, etc. are not covered.
The real magic is the /etc/ctl.conf file, which contains all of the information necessary for ctld to share disk drives on the network. Check out the man page for /etc/ctl.conf for more details; below is the configuration file that I created for this test setup. Note that on a system that has never had iSCSI configured, there will be no existing configuration file, so go ahead and create it.
Then, enable ctld and start it:
You can use the ctladm command to see what is going on:
root@bsdtarget:/dev # ctladm lunlist
(7:0:0/0): Fixed Direct Access SPC-4 SCSI device
(7:0:1/1): Fixed Direct Access SPC-4 SCSI device
root@bsdtarget:/dev # ctladm devlist
LUN Backend Size (Blocks) BS Serial Number Device ID
0 block 10485760 512 MYSERIAL 0 MYDEVID 0
1 block 10485760 512 MYSERIAL 1 MYDEVID 1
In order for a FreeBSD host to become an iSCSI Initiator, the iscsd daemon needs to be started.
sysrc iscsid_enable=”YES”
service iscsid start
Next, the iSCSI Initiator can manually connect to the iSCSI target using the iscsictl tool. While setting up a new iSCSI session, this is probably the best option. Once you are sure the configuration is correct, add the configuration to the /etc/iscsi.conf file (see man page for this file). For iscsictl, pass the IP address of the target as well as the iSCSI IQN for the session:
You should now have a new device (check dmesg), in this case, da1
The guide them walks through partitioning the disk, and laying down a UFS file system, and mounting it
This it walks through how to disconnect iscsi, incase you don’t want it anymore
This all looked nice and easy, and it works very well. Now lets see what happens when you try to mount the iSCSI from Windows
Ok, that wasn’t so bad.
Now, instead of sharing an entire space disk on the host via iSCSI, share a zvol. Now your windows machine can be backed by ZFS. All of your problems are solved.
Technical Lead at SysFive, and Former OpenBSD Committer
Even though mdoc(7) is a semantic markup language, traditionally none of the semantic annotations were communicated to the reader. [...] Now, at least in -T html output mode, you can see the semantic function of marked-up words by hovering your mouse over them.
In terminal output modes, we have the ctags(1)-like internal search facility built around the less(1) tag jump (:t) feature for quite some time now. We now have a similar feature in -T html output mode. To jump to (almost) the same places in the text, go to the address bar of the browser, type a hash mark ('#') after the URI, then the name of the option, command, variable, error code etc. you want to jump to, and hit enter.
Recently I've had some motivation to look into the KCF on Illumos and discovered that, unbeknownst to me, we already had an AES-NI implementation that was automatically enabled when running on Intel and AMD CPUs with AES-NI support. This work was done back in 2010 by Dan Anderson.This was great news, so I set out to test the performance in Illumos in a VM on my Mac with a Core i5 3210M (2.5GHz normal, 3.1GHz turbo).
So now comes the test for the KCF. I wrote a quick'n'dirty crypto test module that just performed a bunch of encryption operations and timed the results.
What the hell is that?! This is just plain unacceptable. Obviously we must have hit some nasty performance snag somewhere, because this is comical. And sure enough, we did.
When looking around in the AES-NI implementation I came across this bit in aes_intel.s that performed the CLTS instruction.
This is a problem: 3.1.2 Instructions That Cause VM Exits ConditionallyCLTS. The CLTS instruction causes a VM exit if the bits in position 3 (corresponding to CR0.TS) are set in both the CR0 guest/host mask and the CR0 read shadow.
The CLTS instruction signals to the CPU that we're about to use FPU registers (which is needed for AES-NI), which in VMware causes an exit into the hypervisor. And we've been doing it for every single AES block! Needless to say, performing the equivalent of a very expensive context switch every 16 bytes is going to hurt encryption performance a bit. The reason why the kernel is issuing CLTS is because for performance reasons, the kernel doesn't save and restore FPU register state on kernel thread context switches. So whenever we need to use FPU registers inside the kernel, we must disable kernel thread preemption via a call to kpreempt_disable() and kpreempt_enable() and save and restore FPU register state manually. During this time, we cannot be descheduled (because if we were, some other thread might clobber our FPU registers), so if a thread does this for too long, it can lead to unexpected latency bubbles
The solution was to restructure the AES and KCF block crypto implementations in such a way that we execute encryption in meaningfully small chunks. I opted for 32k bytes, for reasons which I'll explain below. Unfortunately, doing this restructuring work was a bit more complicated than one would imagine, since in the KCF the implementation of the AES encryption algorithm and the block cipher modes is separated into two separate modules that interact through an internal API, which wasn't really conducive to high performance (we'll get to that later). Anyway, having fixed the issue here and running the code at near native speed, this is what I get:
AES-128/CTR: 439 MB/s
AES-128/CBC: 483 MB/s
AES-128/GCM: 252 MB/s
Not disastrous anymore, but still, very, very bad. Of course, you've got keep in mind, the thing we're comparing it to, OpenSSL, is no slouch. It's got hand-written highly optimized inline assembly implementations of most of these encryption functions and their specific modes, for lots of platforms. That's a ton of code to maintain and optimize, but I'll be damned if I let this kind of performance gap persist.
Fixing this, however, is not so trivial anymore. It pertains to how the KCF's block cipher mode API interacts with the cipher algorithms. It is beautifully designed and implemented in a fashion that creates minimum code duplication, but this also means that it's inherently inefficient.
ECB, CBC and CTR gained the ability to pass an algorithm-specific "fastpath" implementation of the block cipher mode, because these functions benefit greatly from pipelining multiple cipher calls into a single place.
ECB, CTR and CBC decryption benefit enormously from being able to exploit the wide XMM register file on Intel to perform encryption/decryption operations on 8 blocks at the same time in a non-interlocking manner. The performance gains here are on the order of 5-8x.CBC encryption benefits from not having to copy the previously encrypted ciphertext blocks into memory and back into registers to XOR them with the subsequent plaintext blocks, though here the gains are more modest, around 1.3-1.5x.
After all of this work, this is how the results now look on Illumos, even inside of a VM:
Algorithm/Mode 128k ops
AES-128/CTR: 3121 MB/s
AES-128/CBC: 691 MB/s
AES-128/GCM: 1053 MB/s
On the decryption side of things, CBC decryption also jumped from 627 MB/s to 3011 MB/s. Seeing these performance numbers, you can see why I chose 32k for the operation size in between kernel preemption barriers. Even on the slowest hardware with AES-NI, we can expect at least 300-400 MB/s/core of throughput, so even in the worst case, we'll be hogging the CPU for at most ~0.1ms per run.
Overall, we're even a little bit faster than OpenSSL in some tests, though that's probably down to us encrypting 128k blocks vs 8k in the "openssl speed" utility. Anyway, having fixed this monstrous atrocity of a performance bug, I can now finally get some sleep.
I remember the Y2K problem quite vividly. The world was going crazy for years, paying insane amounts of money to experts to fix critical legacy systems, and there was a neverending stream of predictions from the media on how it’s all going to fail. Most didn’t even understand what the problem was, and I remember one magazine writing something like the following:
Michael Lucas: Get your name in “Absolute FreeBSD 3rd Edition”
ZFS compressed ARC stats to top
Matthew Dillon discovered HAMMER was repeating itself when writing to disk. Fixing that issue doubled write speeds
TedU on Meaningful Short Names
vBSDcon and EuroBSDcon Call for Papers are open
1,971 Listeners
272 Listeners
283 Listeners
265 Listeners
215 Listeners
154 Listeners
65 Listeners
189 Listeners
181 Listeners
44 Listeners
21 Listeners
135 Listeners
92 Listeners
29 Listeners
47 Listeners