The 2.6.14 kernel release has an important change in the way SELinux uses kernel memory.
Under SELinux, checkpolicy translates security policy entries into low-level access rules and compiles them into a binary format. The binary policy is then loaded into the kernel by writing it to /selinux/load. During the load, these access rules are inserted into the kernel's access vector table (avtab), which is later consulted for every SELinux access control check, with results cached in the access vector cache (AVC).
A typical security policy may contain tens or even hundreds of thousands of these low-level rules, and thus requires a significant amount of kernel memory to store them in the avtab. A patch from Stephen Smalley (here) was merged upstream during 2.6.14 development which dramatically reduces the memory used by the avtab. First, it shrinks the data structures used to hold the access rule information, which cuts the size of each avtab node by roughly 40-50%, depending on architecture. Second, it reduces the total number of avtab nodes, essentially by pre-computing less information. The latter theoretically imposes additional runtime overhead, but any calculated results are stored in the AVC, and no significant performance hit has been measured.
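To make the avtab/AVC relationship concrete, here is a toy model in Python. It is only a sketch of the idea described above: the rule, the function names and the data layout are all hypothetical and bear no resemblance to the kernel's actual structures. It shows a decision being computed from the rule table once and then served from the cache.

```python
# Toy model only: names and layout are illustrative, not the kernel's.
avtab = {
    # (source type, target type, object class) -> allowed permissions
    # This single rule is a made-up example.
    ("httpd_t", "httpd_log_t", "file"): {"append", "getattr"},
}
avc = {}      # cache of decisions that have already been computed
lookups = 0   # counts how often the slower avtab path is taken


def compute_av(source, target, tclass):
    """Slow path: consult the rule table."""
    global lookups
    lookups += 1
    return frozenset(avtab.get((source, target, tclass), ()))


def has_perm(source, target, tclass, perm):
    """Check the cache first; fall back to the rule table on a miss."""
    key = (source, target, tclass)
    if key not in avc:
        avc[key] = compute_av(*key)
    return perm in avc[key]


first = has_perm("httpd_t", "httpd_log_t", "file", "append")
second = has_perm("httpd_t", "httpd_log_t", "file", "write")
print(first, second, lookups)
```

The second check asks about a different permission but hits the same cached entry, so the rule table is only consulted once. That is the reason extra work on the slow path need not show up as a measurable slowdown.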
This results in memory savings of up to around 20x, depending on the system architecture and policy configuration. I measured the number of avtab slab objects on a 64-bit system at the time (via "grep avtab_node /proc/slabinfo"):
                    #objs   objsize   kernmem
Targeted policy:
  Before:          237888        40     9.1MB
  After:            19968        24     468KB
Strict policy:
  Before:          571680        40   21.81MB
  After:           221052        24    5.06MB
This shows a massive kernel memory saving for targeted policy, from over 9MB to about 470KB. The avtab object size is reduced from 40 bytes to 24 (on 32-bit, from 32 bytes to 16), and the total number of entries is reduced by an order of magnitude. Beyond the sheer reduction in kernel memory, a smaller object size brings other efficiencies, such as fitting more objects into each slab. Improvements for strict policy are not as large, but still significant. Given that the default is targeted policy, this is a very welcome outcome.
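The kernmem figures are simply object count multiplied by object size. As a quick check, the following Python sketch redoes that arithmetic from the measured numbers; the function and variable names are mine, and it ignores any per-slab overhead.

```python
# Object counts and sizes as measured above (64-bit system).
measurements = {
    "targeted": {"before": (237888, 40), "after": (19968, 24)},
    "strict": {"before": (571680, 40), "after": (221052, 24)},
}


def kernmem_bytes(objs, objsize):
    """Memory taken by the avtab nodes themselves, in bytes."""
    return objs * objsize


def reduction_factor(policy):
    """How many times smaller the avtab is after the patch."""
    before = kernmem_bytes(*measurements[policy]["before"])
    after = kernmem_bytes(*measurements[policy]["after"])
    return before / after


for policy in measurements:
    print(policy, round(reduction_factor(policy), 1))
```

Targeted policy comes out at a little under 20x and strict policy at a little over 4x, which is where the "up to around 20x" figure comes from.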
Generic security xattr handling
Another interesting patch (here) changes the VFS to punt security extended attribute calls to the loaded security module if the filesystem does not support them natively. This allows the security module to perform in-kernel labeling for all such filesystems, and allows the existing security xattr code for devpts and tmpfs to be removed (along with the need to write similar code for other pseudo-filesystems). Under SELinux, this change means that security xattrs on pseudo-filesystems now "just work". Calls to setxattr(2), getxattr(2) and listxattr(2) operate on in-kernel values automatically, and you'll actually see security contexts now when doing things like:
# ls -Z /selinux/null
crw-rw-rw-  root root system_u:object_r:null_device_t  /selinux/null
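The same label can be read programmatically through getxattr(2). Here is a minimal Python sketch, assuming a Linux system where SELinux stores its label in the security.selinux attribute; the helper name is my own, and it returns None wherever the attribute can't be read.

```python
import os


def security_context(path):
    """Return the SELinux context stored on path, or None if unavailable."""
    if not hasattr(os, "getxattr"):  # os.getxattr only exists on Linux
        return None
    try:
        raw = os.getxattr(path, "security.selinux")
    except OSError:
        # No such file, no such attribute, or xattrs not supported here.
        return None
    # The kernel may hand back a trailing NUL byte; drop it.
    return raw.rstrip(b"\x00").decode("ascii", "replace")


print(security_context("/selinux/null"))
```

On a kernel with this patch, the call should return a context for pseudo-filesystem nodes such as the one above, rather than failing because the filesystem has no xattr support of its own.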
Atomic inode create and label
This patchset implements atomic inode labeling, making it possible to create and label a file in a single atomic operation. Previously, there was a gap during file creation where it was possible to access an inode before the labeling hook was invoked. This has always been safe under SELinux, because inodes are created with a safe default label, although under heavy load you would sometimes see false denials caused by this default label being referenced. The security risk here was indirect: people may end up loosening their security policies to avoid the false denial messages. The patchset fixes the problem by adding a new LSM hook, inode_init_security, to be called during file creation from within each filesystem that supports security xattrs. Each of these filesystems has been modified to call the hook.
As well as minor fixes and the usual audit-related adjustments, there's an update to the way IP protocol sockets are classified by SELinux, to correctly accommodate more recently implemented protocols such as SCTP and DCCP.
Also, there's a month left to get nominations in for the SELinux Symposium Community Delegate Program. If you know someone (including yourself) who's active in the SELinux community, and you think they deserve a free trip to Baltimore, be sure to nominate them before November 30th.