SELinux Sandbox and Ambient Authority
Dan Walsh recently introduced SELinux sandbox. This is a mechanism for launching untrusted applications from the command line, which uses a strict MAC policy to isolate the executed application from the rest of the system. There's been a good discussion of the topic on LWN, and I thought it might be worth highlighting a few points.
Firstly, this sandboxing scheme is not a separate package. It's an addition to the standard SELinux security policy to define the sandboxed domain (sandbox_t) coupled with a script to set up the environment and launch applications in the sandboxed domain.
The idea for this came out of a few emails following a recent discussion about extending seccomp for more generalized sandboxing. Essentially, the question was asked "what can we do with SELinux and simple sandboxing?", and the result is now available in Fedora development. If you update to the latest policycoreutils and selinux-policy packages, it should simply be there ready to go.
The security policy for the sandbox_t domain is designed to provide the sandboxed application with only the absolute minimum set of permissions required to run. It can load shared libraries, for example, although a future refinement could provide an option to run only static binaries. It cannot interact in an ad-hoc manner with the rest of the system. A scratch tmpfs filesystem may be optionally mounted for the application if required, and unique MCS labels are used to separate sandboxes from each other. Another future refinement will likely include launching sandboxes in private namespaces.
# sandbox id -Z
unconfined_u:unconfined_r:sandbox_t:s0:c226,c674
The above shows how the id command launched via the new sandbox utility is running in the sandbox_t domain, with MCS categories c226 and c674. The values of these don't matter, as long as they're unique on the system.
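To make the "unique on the system" requirement concrete, here is a minimal Python sketch of how a launcher might pick a category pair for each sandbox. This is my own illustration, not the code from the sandbox script: the function names are hypothetical, and the category range c0..c1023 is an assumption about the MCS policy in use.

```python
import random

# Assumption: the MCS policy defines categories c0..c1023.
NUM_CATEGORIES = 1024


def pick_mcs_pair(in_use, rng=random):
    """Return a sorted pair of distinct category numbers not already in use."""
    while True:
        c1, c2 = sorted(rng.sample(range(NUM_CATEGORIES), 2))
        if (c1, c2) not in in_use:
            in_use.add((c1, c2))  # remember it so the next sandbox differs
            return c1, c2


def format_level(pair):
    """Render a category pair as the level part of a security context."""
    return "s0:c%d,c%d" % pair


in_use = set()
first = pick_mcs_pair(in_use)
second = pick_mcs_pair(in_use)
print(format_level(first), format_level(second))
```

The actual values are irrelevant; the only property that matters is that two sandboxes never share the same pair.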
As root (and note that this is not designed to be run as root, but for demonstration purposes it helps to show the confinement of privileges if they exist), you can't do anything special via sandbox:
In fact, you can't open any files on the global system.
Ambient authority describes the form of authority commonly seen in general purpose operating systems. This form of authority is what allows, for example, a user on a Linux system to open any file for which she has read access, whether she needs to open the file or not. It is seen as problematic in establishing strong security, due to problems such as The Confused Deputy, where authority (i.e. the ability to perform an action) is arbitrarily escalated throughout the system.
(These concepts are explained particularly clearly in the first ten minutes of this talk by David Wagner.)
When an application is launched via sandbox, with no inessential permissions, as much ambient authority as is possible has been removed by SELinux MAC. Instead, authority is explicitly provided to the sandboxed application through a pipe file descriptor handed to it by the launching process (i.e. the standard Unix scheme of constructing pipelines).
Note carefully the difference between these two commands:
# wc -l /etc/shadow
43 /etc/shadow
# cat /etc/shadow | wc -l
43
In the first example, the wc application directly opened the file /etc/shadow for reading. It used ambient authority to do this.
In the second example, wc was handed a file descriptor which was already opened by the calling process, and did not require any ambient authority to read the data in the file: the authority was explicitly tied to the file by the caller, and wc was entirely unaware of which file it was reading. wc in this case does not need any permissions except to access the file descriptor passed by the caller. (It still has ambient authority, however; it just didn't need to use it here.)
Running the above with SELinux sandboxing in effect:
Note that wc now has no authority except as invoked by the calling process and passed via the sandbox. In other words, it does not have ambient authority when invoked via the sandbox.
This is a very simple and powerful concept for security purposes, as it is possible to define strict information flows between applications in a dynamic and controlled manner, without the need for additional global security policy. It's inherently Unix-y, too.
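The difference between the two wc invocations can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names are made up, and a scratch file stands in for /etc/shadow. The first function needs permission to open a path; the second only ever touches a descriptor that somebody else opened for it.

```python
import os
import tempfile


def count_lines_by_path(path):
    """Like `wc -l FILE`: opens the file itself, relying on ambient authority."""
    with open(path, "rb") as f:
        return sum(1 for _ in f)


def count_lines_from_fd(fd):
    """Like `cat FILE | wc -l`: reads only from a descriptor the caller opened."""
    total = 0
    while True:
        chunk = os.read(fd, 65536)
        if not chunk:
            return total
        total += chunk.count(b"\n")


# Scratch file standing in for the data being counted.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"one\ntwo\nthree\n")
    demo_path = tmp.name

by_path = count_lines_by_path(demo_path)

# The caller decides which file is opened; the counter never sees a path.
fd = os.open(demo_path, os.O_RDONLY)
try:
    by_fd = count_lines_from_fd(fd)
finally:
    os.close(fd)

os.unlink(demo_path)
print(by_path, by_fd)
```

In a sandbox, only something shaped like the second function can work, since the confined process has no permission to open files on its own.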
There are many potential applications of this form of sandboxing, particularly where you need to process information between different security realms (e.g. incoming mail which needs to be passed through a chain of scanning and filtering applications), and for dealing with large and complicated applications processing arbitrary untrusted data.
Keep an eye on Dan's blog for upcoming work on desktop security with SELinux sandboxing.
Dan Walsh has previously implemented SELinux lockdown for browser plugins via nspluginwrapper, as discussed here. Unfortunately, this has been disabled by default, due to a clash with the mozplugger package, which uses nspluginwrapper to launch applications inside the browser.
Personally, I'm happy to have OpenOffice or similar open up in a separate window, using the standard Firefox mechanism for doing so, especially if it means I'm able to keep browser plugin confinement enabled.
This of course removes mozplugger, but I don't seem to need it. When downloading a PDF, for example, Firefox prompts if I want to open it with evince, and provides me with an option to always do that without further prompting. YMMV.
The setsebool commands change several nspluginwrapper options in SELinux, while the -P option ensures that the changes persist across reboots (see setsebool(8)).
Enabling allow_unconfined_nsplugin_transition ensures that nspluginwrapper transitions to a new security label when running a plugin, so that special security policy can be applied to it. This is required for any useful effect.
Disabling allow_nsplugin_execmem ensures that memory protections are being enforced to prevent plugins from executing code on the stack and in mapped memory.
Disabling nsplugin_can_network prevents plugins from connecting to anything other than reserved ports. Apparently, this may upset some flash code which wants to call home (you'd be surprised how much of this goes on, or perhaps not), so you may want to leave this as-is, or at least keep an eye on the messages from setroubleshoot.
Note that if you do run into problems, you can put SELinux into permissive mode rather than disabling it, which will at least provide some useful logging information (and feel free to post questions to the fedora-selinux-list).
Btw, here's how to configure SELinux for permissive mode:
System -> Administration -> SELinux Management
Set 'System Default Enforcing Mode' to 'Permissive'
And you're done.
A bugzilla ticket has been opened on the issue of finding a long-term solution which allows both mozplugger and plugin confinement to co-exist, but unfortunately, users currently need to decide whether they prefer increased security or a more Windows-like experience, with the latter as the default.
The video from my sVirt (MAC security for Linux virtualization) talk is available as an OGG file. I've also re-uploaded it to Google Video.
I'd suggest having a copy of the slides open when watching, as they're not always shown in the video, and you're definitely better off looking at them than me in any case.
LCA was a genuinely enjoyable conference: laid-back and really well organized, with a good balance of talks. One really great aspect was the way internet access was provided to the accommodation, which at least in my case, worked perfectly, with a microwave link from UTAS connected to the hotel's internal wiring. I often need to work during conferences, and having good network access is probably my top priority in selecting accommodation.
I was glad to be part of the security miniconf organized by Casey Schaufler, which brought together folk from the kernel security community and various highly technical folk. There were talks from several leading security developers, including Casey (fs capabilities and rootless systems), Russell Coker (standing in for Kaigai Kohei on SE-postgresql and web application MAC), and Kentaro Takeda (TOMOYO). The miniconf concluded with an open panel discussion which was covered by LWN. For reasons I can't quite recall now, I ended up doing an ad-hoc presentation on Fedora Kiosk Mode, which I think helped demonstrate some of the progress SELinux has made in terms of usability and extension to general use scenarios.
It's looking very much like the deeply developer-focused event the organizers were hoping for. On the schedule is a mix of technical talks and workout sessions. I'll be involved in the Kernel Quality Improvement Workout headed up by Christoph Hellwig, as well as giving a talk on Fedora Kiosk Mode. This will be a slightly expanded version of the talk I gave at FOSS.MY, thanks to the extra time available.
I was going to talk about sVirt at the planned Fudcon, but the Fudcon was unfortunately cancelled. Fedora folk will still be there, though, and if anyone wants to talk about sVirt and get involved in some really cool and innovative hacking, catch up with me.
The main hall has been set aside for an entire day to host a Linux Kernel Hacker Gathering (LKHG), with sessions on Filesystems, Tracing, Power Management and Porting. It seems that this will be something like an open mini kernel summit, with participants to include Suparna Bhattacharya, Ananth N Mavinakayanahalli, Christoph Hellwig, Aneesh Kumar K V, Balbir Singh, Srikanth Srinivasan, Harald Welte, Srivatsa Vaddagiri, Amit Shah, myself, and Dipankar Sarma.
The final slot will be open for lightning talks from the audience, with the kernel hacker panel providing feedback, followed by an open Q&A session. This is somewhat based on the format of the LF symposium BoF day, and will be a great opportunity for people working on kernel projects to bounce their ideas off upstream kernel hackers. This includes people working on drivers and various kernel projects which are not currently upstream (i.e. work projects), who would like to get some advice on how to get their project upstreamed and how to work more effectively with the community.
A CfP will go out for the lightning talks soon, so if you want to participate, keep an eye out for that.
The organizers have made a video to promote and explain the conference:
And yes, you can hack on the roof of the building, or even hold talks: there's an outdoor auditorium up there.
Currently, there are over 900 delegates registered, which is a lot for a developer conference. (Linux Plumbers had 300, IIRC).
I think this promotional banner sums up my experience so far:
Upcoming conference talks on SELinux applications: sVirt and Kiosk Mode
Recently, I've been busy getting the initial cut of sVirt out, and am currently processing community feedback before issuing an update. The basic idea behind sVirt is to apply MAC label security (SELinux, Smack etc.) to Linux-based virtualization schemes such as KVM, allowing the existing OS-level security mechanisms to be re-used for process-based VMs. This is an application of one of the core advantages of Linux-based virtualization, where generally, all of the Linux process management infrastructure within the kernel and wider OS may be applied to domains which run inside Linux processes. So, for MAC label security in this case, we don't need to do anything in terms of modifying kernel security mechanisms, and can simply modify security policy as desired. We can focus on developing the appropriate high-level abstractions (e.g. management tool support) rather than developing a new security mechanism.
How can this be useful? In the simplest case, we can increase isolation between virtual machines by assigning them different security labels, and enforcing a MAC policy which prevents them from interacting. This helps ameliorate the increased risk arising from running domains on the same hardware where previously they may have been physically separated on different machines. This is just a start. There are plenty of interesting things which can be done once the core functionality is in place, although the initial idea is to simply provide stronger isolation to better protect domains from each other.
At an architectural level, security labeling support is being added to libvirt, a virtualization API which abstracts various aspects of virtualization including different hypervisor types, storage, networking, and, with sVirt, MAC security. With sVirt integrated at the API level, security labeling support can be integrated into high-level tools via standardized and flexible abstractions. For example, when creating a new domain, the graphical virt-manager tool may include a checkbox to designate the domain as "isolated", or perhaps just do it by default for true zeroconf.
I'll be introducing sVirt more completely at LCA next January, so if you're marching south and have interests in both security and virtualization, it might be worth popping in. I'm up against Tridge in the timeslot, so it might be an intimate session.
Next week, I'll be giving a talk on Fedora Kiosk Mode at Malaysia's inaugural developer conference, FOSS.MY. Kiosk Mode is another high-level MAC security application, where anonymous users can safely access desktop sessions and browse the internet. If you have the xguest package installed, it Just Works, as people are starting to notice.
I've been shortlisted on the same topic at the revamped FOSS.IN a few weeks later. There's also been some discussion of a kernel development workout session, in which I'd love to participate, although it's not yet short-listed. There's also the FUDCon attached to FOSS.IN. We're hoping to have a Fedora box there running Kiosk Mode for people to play with.
A FUDCon is being held in conjunction with foss.in, which should also help attract developers. It's the closest upcoming FUDCon to me in geographic terms, and I'm working on attending for that at least.
From discussion with some of the folk involved in the wider event, it seems that many fine details are yet to be worked out, and while the emphasis is very much on Indian developers, I'd suggest that international developers who've been considering submitting a proposal this year definitely still do so.
I've just uploaded slides from the talks, which may be found next to their respective entries in the schedule.
Some of the talks I found particularly useful/interesting:
Josh Brindle on SELinux in Ubuntu. They're making good progress, although the idea of SELinux is to introduce ubiquitous, generalized MAC security, so he is advocating they enable SELinux by default as is done in Fedora, and as you typically do with other OS security layers.
John Weeks from Sun talking about OpenSolaris FMAC (introducing Flask/TE to their OS). It was interesting to see a dtrace graph of the AVC operating—a kernel mechanism for which I've developed an abstract mental model but never "seen".
Dan Walsh talking about his ongoing work in utilizing SELinux to create practical security features for everyday users.
The above is from a demonstration where nsplugin (the framework for Firefox plugins, i.e. where flash etc. is run) is being sandboxed by SELinux, so that a flawed or malicious plugin cannot be used to snoop your keystrokes. In this case, a simulated (and trivial) exploit was blocked from capturing internet banking passwords by SELinux.
Btw, Dan will be demonstrating this today during his OLS talk on Confining the User. There's a lot of really cool stuff coming in this area & the talk should be well worth attending.
Karl MacMillan on alternatives to comprehensive least-privilege, where he described some ideas and plans for simplifying the way SELinux policy is deployed for general purpose use. He has some really promising ideas on reducing the granularity of the policy while still maintaining strong security. This can lead to simpler and smaller policy, which is important for all kinds of users.
Peter White talked about two higher-level languages being developed to express SELinux policy, Lobster and Shrimp, which will introduce features such as type checking and object orientation to the policy language area. Peter is a Haskell guy, and it all looks very promising.
Yuichi Nakamura talking about embedded systems and SELinux.
The format worked reasonably well—a series of short talks and discussions—although it would have been nicer to have a more relaxed schedule and more time for deep discussions on specific issues. There's already been discussion of what to do next year, and we may move it to a two-day event. Certainly, I think we'll want to have it again in conjunction with a major developer conference, which makes it a good environment for collaboration with the wider FOSS community.
For those that couldn't make it this year, I believe notes were taken and will be sent out to the mailing list. There are more photos here.
This patch by Ahmed Darwish allows a particular security module to be selected at kernel boot time, so that distributions can ship multiple security modules and allow the user to decide which one (if any) to enable. For example: security=selinux selects SELinux, while security=smack selects Smack. (In Fedora, you don't need to do anything: SELinux is the default).
New SELinux open permission
Until now, opening a file under SELinux invoked the same permission checks as the intended operation on the file, such as read, write, execute and append. There was no separate "open" check: opening a file for write, for example, was considered by SELinux policy as equivalent to actually writing to the file. Experience has shown that this approach is not ideal for handling cases such as IO redirection via the shell, because policy writers cannot usefully guess where users will send redirected output. This is a very common use-case for Linux, so a solution is most definitely necessary, while also preserving strong security. Can it be done? Yes!
Implemented by Eric Paris, the new open permission addresses the issue by allowing applications to be granted liberal read/write/execute/append permissions while the ability to open a file is tightly locked down. In the case of redirecting output via the shell:
bash# /sbin/do-stuff > /tmp/output
the shell forks and creates /tmp/output, calls dup2(2) to replace stdout with the newly created file descriptor, then execs do-stuff. With the old permissions, do-stuff would have required an SELinux write permission on the new file, which it very likely would not have had. By providing do-stuff with liberal file access permissions, but not the new open permission, its output may now be redirected to the file without needing to give it the ability to directly open the file. The invoking shell of course needs the open permission, which it effectively delegates to do-stuff via the open file descriptor.
Updated security policy which utilizes this technique should be available soon in rawhide, and integrated into Fedora 10, providing significant usability improvements for sysadmins and power users.
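For anyone who hasn't looked at what the shell does for an output redirect, here is a rough Python sketch of the fork / open / dup2 / exec sequence. It is an illustration under my own assumptions, not shell source: run_redirected is a hypothetical helper, echo stands in for do-stuff, and a scratch directory replaces /tmp/output. The point is that the file is opened on the "shell" side, and the executed program simply inherits file descriptor 1 (stdout).

```python
import os
import tempfile


def run_redirected(argv, out_path):
    """Run argv with stdout sent to out_path, roughly as a shell does for `cmd > file`."""
    pid = os.fork()
    if pid == 0:
        # Child: still running the launcher's code, so this side opens the file.
        try:
            fd = os.open(out_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
            os.dup2(fd, 1)  # fd 1 (stdout) now refers to the file
            os.close(fd)
            os.execvp(argv[0], argv)  # the program inherits fd 1 already open
        finally:
            os._exit(127)  # only reached if open or exec failed
    _, status = os.waitpid(pid, 0)
    return status


out_path = os.path.join(tempfile.mkdtemp(), "output")
status = run_redirected(["echo", "hello"], out_path)

with open(out_path) as f:
    captured = f.read()
print(repr(captured))
```

The executed program never opens out_path itself, which is why the write and open permissions can be usefully separated in policy.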
Also implemented by Eric, permissive types (aka permissive domains) allow permissive mode to be selected on the fly on a per-domain basis. Permissive mode is where security policy is being checked and logged, but not actually enforced, and was previously only possible on a system-wide basis. By making this per-domain, applications which are experiencing SELinux policy issues may be flipped into permissive mode, allowing them to do what they need until a proper fix is available, without disabling policy enforcement for the rest of the system.
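As a toy model of the behaviour described here (this is not kernel code, and the function name, log format and example domain names are purely illustrative), the decision looks something like this: a denial is always logged, but it is only enforced if the domain has not been marked permissive.

```python
def access_decision(domain, policy_allows, permissive_domains, audit_log):
    """Toy model: return True if the access goes ahead, logging any policy denial."""
    if policy_allows:
        return True
    # Policy says no: the denial is logged whether or not it is enforced.
    audit_log.append("denied: " + domain)
    # A permissive domain carries on regardless; everything else is blocked.
    return domain in permissive_domains


audit_log = []
permissive_domains = {"httpd_t"}  # hypothetical: one misbehaving domain flipped to permissive

web = access_decision("httpd_t", False, permissive_domains, audit_log)
mail = access_decision("sendmail_t", False, permissive_domains, audit_log)
print(web, mail, audit_log)
```

Only the one domain gets the relaxed treatment; every other domain on the system is still enforced.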
Network Port SID Cache
Paul Moore implemented a cache to improve the performance of the SELinux networking code, so that network port labels are no longer looked up in the (typically large) kernel policy database on a per-packet basis, and are instead retrieved from an RCU-based cache. This addresses a long standing network performance issue which has been observed with very high loads on network servers.
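The general shape of the optimisation can be sketched as a small lookup cache in front of an expensive policy query. This is a simplified Python model under my own assumptions: the class, the toy lookup function and the port labels in its table are illustrative, and it makes no attempt to model RCU or the kernel data structures.

```python
class PortLabelCache:
    """Toy model: remember port labels so the slow policy lookup runs once per port."""

    def __init__(self, policy_lookup):
        self._policy_lookup = policy_lookup
        self._cache = {}
        self.slow_lookups = 0  # counts how often the policy was actually consulted

    def label_for(self, protocol, port):
        key = (protocol, port)
        if key not in self._cache:
            self.slow_lookups += 1
            self._cache[key] = self._policy_lookup(protocol, port)
        return self._cache[key]


def toy_policy_lookup(protocol, port):
    """Stand-in for the expensive search through the policy database."""
    table = {("tcp", 80): "http_port_t", ("tcp", 22): "ssh_port_t"}
    return table.get((protocol, port), "port_t")


cache = PortLabelCache(toy_policy_lookup)
# A thousand "packets" to the same port cost a single policy lookup.
labels = [cache.label_for("tcp", 80) for _ in range(1000)]
print(labels[0], cache.slow_lookups)
```

The win comes from the fact that a busy server sees the same handful of ports over and over.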
There's quite a lot happening in security for 2.6.27, some of which has already been merged into Linus' tree. Due to the pervasive nature of some of the patches (including David Howells' credentials rework), I'm feeding all of the SELinux stuff via my security-testing tree. The "devel" branch is where bleeding edge changes are initially stabilized before being applied to the "next" branch, which is in turn fed into linux-next.