Filesystem

=Should we mount the root filesystem from the network?=
This is a major issue that needs to be decided, and it will greatly affect how the rest of the project is planned.
 * I'm really not a huge fan of trying to rsync large amounts of data one-to-many rather than just serving it automatically from a single source. Remember, AFS is a heavily caching network filesystem, and we can give it gigabytes upon gigabytes of local cache to work with.  Hell, we can have redundancy, with each machine storing some of the data and serving it up on demand.  I agree /etc should not be locked to be identical on every machine, but I think we can have the root filesystem be a local stub, with symlinks into /afs for anything that's actually needed. So yeah, if we do that, then there's no single point of failure; everything's a distributed filesystem.  --Elizabeth@ugcs.caltech.edu 02:13, 4 February 2007 (PST)
 * So it seems relatively settled that we'll mount the majority of stuff from the network. I personally don't favor the idea of having each login node be a fileserver: that could get messy to set up, there's not a whole lot to gain, and it poses potential security issues if there is a break-in (login computers are much more likely to get rooted than the fileservers). I'm going to start playing with AFS on my computer (maybe through a few UMLs) and see what I can learn. Jdhutchin@ugcs.caltech.edu 02:39, 5 February 2007 (PST)
 * Agreed, although you shouldn't spend a horrific amount of time/effort setting up a local test environment using UML, Xen, KVM, etc. I am planning to immediately purchase three test machines as detailed on the hardware page, and we'll have free rein to set them up and tear them down to our liking. If someone wants to start configuring and pricing things to be reasonable ($2k a server, $500 per user machine), then we can get to work as soon as we discuss how to place orders with the bursar. --Elizabeth@ugcs.caltech.edu 05:08, 5 February 2007 (PST)
 * Ok, we can discuss this some more after the meeting tomorrow. Jdhutchin@ugcs.caltech.edu 13:41, 5 February 2007 (PST)
 * Ummm... A network-mounted root over AFS seems a bit like trying to drive a screw with a hammer. If you want this kind of setup, you should look at LTSP or a diskless OpenMosix setup for the lab machines. AFS seems like a good idea for the /home directory, but I would caution you to have at least two fileservers set up with a heartbeat DNS configuration, so that if the primary goes down the secondary steps in; three would be better. Also, putting a second Gb NIC in each fileserver, with a dedicated switch just for the fileservers to talk to each other, would help. It would be a very bad idea to have the core servers running the same root. You really want more fine-grained control over those boxes, since they will be providing different services and need different software. Not to mention that if the shared root fs gets rooted, all of the boxes are rooted. Silasb@ugcs.caltech.edu Wed Feb 7 23:01:37 2007

Pros

 * Makes updates really easy Jdhutchin@ugcs.caltech.edu 20:22, 3 February 2007 (PST)

Cons
 * Adds a bunch of network traffic, more load on the fileservers, etc. However, UGCS already mounts a lot of stuff from the network, and it more or less works fine. Jdhutchin@ugcs.caltech.edu 20:22, 3 February 2007 (PST)
 * Single point of failure. However, if the fileserver goes down, people don't have /home anyway, so it's kind of a moot point either way. Jdhutchin@ugcs.caltech.edu 20:22, 3 February 2007 (PST)
   --Which is why you would want redundant servers. Silasb@ugcs.caltech.edu Wed Feb 7 23:01:37 2007
 * Different machines may need different /etc contents. This will especially be true if we set some of them up as 'lab' machines with monitors and X logins.  Evan brought up this concern when I discussed the idea with him. Jdhutchin@ugcs.caltech.edu 20:22, 3 February 2007 (PST)
 * If we add another architecture, it doubles the amount of space we take up on the fileservers.
 * It may not really be necessary: every machine has plenty of local disk space, and there are other ways of keeping the machines in sync that might be better. Jdhutchin@ugcs.caltech.edu 20:22, 3 February 2007 (PST)
 * Speed: a bunch of machines doing this could make them less responsive and less pleasant to work with (especially for people doing graphical work in the lab).

=AFS=
I've been doing some reading on AFS and the possibility of using it with UGCS. http://www.openafs.org/ has a lot of information and an administrator's guide. Debian packages for AFS exist, and AFS is stable and not likely to go anywhere soon (as some other cluster filesystems, like InterMezzo and Coda, have).
 * I agree, we will definitely be using AFS instead of Coda or NFS. --Elizabeth@ugcs.caltech.edu 03:10, 2 February 2007 (PST)
 * Ok, then we'll have to decide: AFS for /home or AFS for everything? I'll also start looking into setting up Kerberos on Linux Jdhutchin@ugcs.caltech.edu 19:31, 3 February 2007 (PST)
 * I used to admin an Engineering network for GE-Security. That network had Kerberos and OpenLDAP at its heart. It was a beautiful thing! I have working configs we could use. Silasb@ugcs.caltech.edu Wed Feb 7 23:01:37 2007

Here's a brief summary of AFS; you can read more if you want:
 * AFS uses ACLs instead of Unix permissions for file access control
 * AFS requires real authentication between the AFS server and AFS client. More about that below.
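As a concrete illustration of the ACL point, AFS permissions are managed per-directory with the `fs` command rather than with chmod. A sketch (the cell path and usernames here are hypothetical examples, and these commands require a configured AFS client):

```
# Grant a user full rights (read, lookup, insert, delete, write,
# lock, administer) on a directory
fs setacl -dir /afs/ugcs.caltech.edu/home/jdhutchin -acl jdhutchin rlidwka

# Give everyone read and lookup access to a public subdirectory
fs setacl -dir /afs/ugcs.caltech.edu/home/jdhutchin/public -acl system:anyuser rl

# Show the resulting ACL
fs listacl /afs/ugcs.caltech.edu/home/jdhutchin
```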

Authentication
As noted above, AFS requires real authentication between the AFS server and client. There are a couple of ways of doing this:
 * Have each user log in once to the login machine, and then run a program to get an AFS token.
 * Use a modified login to get the AFS token automatically upon login.
 * Use Kerberos 5 (and some updated AFS utilities) to handle all login stuff.
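The first two options boil down to a command sequence like the following (the realm name is a made-up example); with Kerberos 5 in place, this is what a modified login or PAM module would do behind the scenes:

```
# Authenticate to the Kerberos realm (hypothetical realm name)
kinit jdhutchin@UGCS.CALTECH.EDU

# Convert the Kerberos ticket into an AFS token for the local cell
aklog

# Verify that we now hold a token
tokens
```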

Of these, I would suggest using Kerberos. This way we can kill a few birds with one stone: many major daemons have Kerberos support, and this would also take care of our central login server.
 * It looks like OpenSSH will take care of it if we use Kerberos for authentication (there are options in sshd_config to authenticate against a Kerberos server, and also an option to get the AFS token before trying to access /home). Jdhutchin@ugcs.caltech.edu 19:48, 3 February 2007 (PST)
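For reference, the relevant sshd_config options look roughly like this (a sketch only; the exact set depends on the OpenSSH version and build options we end up with, so check the sshd_config man page):

```
# /etc/ssh/sshd_config (excerpt)
KerberosAuthentication yes      # validate passwords against the KDC
KerberosTicketCleanup yes       # destroy the ticket cache on logout
KerberosGetAFSToken yes         # acquire an AFS token before touching /home
GSSAPIAuthentication yes        # allow ticket-based (passwordless) logins
```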

Filesystem Hierarchy

 * The comments below apply if we decide to install everything on AFS. I was thinking of maybe just putting /home (and maybe /etc) on AFS, and everything else locally.  Each machine has plenty of disk space for that, and it would cut down on network traffic and load on the fileservers.  The advantage of serving everything from the central server is that we wouldn't have to push updates to every machine, and configuration would be easier.  Maybe we could just rsync (or something like that) the filesystem every night? Jdhutchin@ugcs.caltech.edu 20:22, 3 February 2007 (PST)
 * I agree with /home over AFS. Things like /etc should definitely not be on AFS, since /etc is full of tiny files that you don't want to be identical across machines: essentially anything that has to do with the network, or services tied to a hostname. I agree with having a central software repository for the lab machines (again, LTSP or diskless OpenMosix), which would simplify updates for the lab boxes, but the servers should be their own entities. Best practice in server administration dictates installing as little software as possible, since that greatly reduces the risk of a root exploit. Servers running software from /mnt/afs/bin is a _very_ bad idea. Silasb@ugcs.caltech.edu Wed Feb 7 23:01:37 2007


 * dpkg should be told to install things to /afs
 * /afs/bin etc. should be added to the default path on all machines
 * All software should be installed via dpkg on the fast fileserver, and set at high cache priority, allowing for centralization
 * All home folders should be appropriately symlinked into the afs paths, along with symlinks to user-specific slow storage
 * What the hell, maybe do everything netbooted and diskless/using disk as swap/cache? That would make adding new machines trivial - plug and chug.
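The local-stub-plus-symlinks idea mentioned above can be sketched like this (the AFS cell path is a hypothetical example). Note that the links can be created even before the AFS side exists, since symlinks are allowed to dangle:

```shell
# Build a stub root whose software directories point into AFS.
# The cell path below is made up for illustration.
cell=/afs/ugcs.caltech.edu/sys
root=$(mktemp -d)          # stand-in for / on a client machine

for d in bin lib share; do
    mkdir -p "$root/usr/local"
    ln -s "$cell/$d" "$root/usr/local/$d"
done

readlink "$root/usr/local/bin"   # -> /afs/ugcs.caltech.edu/sys/bin
```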

The way it's done now, everything is pulled off a master mirror onto the clients. We use a multiple-remote-shell command to do this. Note that even so, you probably want /var and /tmp to be local at the very least. --Goldstei@ugcs.caltech.edu 05:29, 5 February 2007 (PST)
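That multiple-remote-shell approach could look something like the loop below (the mirror and client hostnames are placeholders, and the excludes follow the suggestion above to keep /etc, /var, and /tmp local). It only prints each command; dropping the `echo` would actually run them:

```shell
# Push the master mirror's root image to each client (names are examples)
MIRROR=mirror.ugcs.caltech.edu
HOSTS="lab1 lab2 lab3"

for h in $HOSTS; do
    # -aHx: archive mode, preserve hard links, stay on one filesystem
    echo ssh "root@$h" \
        rsync -aHx --delete --exclude=/etc --exclude=/var --exclude=/tmp \
        "rsync://$MIRROR/root/" /
done
```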