Difference between revisions of "H: drive on cluster"
Revision as of 16:47, 8 November 2017
Introduction
NB: A major issue with any networking facility on the University of St Andrews network is that, by default, security policy leaves only a minimum of services unblocked. Facilities and ports are unblocked only when a specific request is made to I.T. Services.
Some older machines may have quite an open panorama, as requests to unblock certain ports have been made over the years. However, any new service has a high likelihood of not working because its ports are blocked: not through bad configuration or malfunction, but through policy.
Accepting the importance of this point will help curtail the amount of exhaustive testing that sometimes occurs when troubleshooting these issues.
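A quick reachability check can help separate a policy block from a misconfiguration before any deeper debugging. The following is a minimal sketch, assuming bash and the coreutils timeout command are available. The function name port_open is hypothetical, and port 445 (the standard SMB port) appears only as an illustration.

```shell
# Hypothetical helper: succeeds if a TCP connection to host:port opens within 3 seconds.
# It relies on bash's /dev/tcp pseudo-device, so no extra tools are needed.
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Example (not run here): is the SMB port of the file server reachable from this machine?
# port_open cfs.st-andrews.ac.uk 445 && echo "open" || echo "blocked or unreachable"
```

A failure here does not say why the port is unreachable, but it is a cheap first test before suspecting the service configuration.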
Due to the MS Windows nature of much of the St Andrews network, and because H: is the Windows device name it is usually mapped to, "the H: drive" is the name used for the following St Andrews network drive:
//cfs.st-andrews.ac.uk/shared/Med_Research/res
One can of course simply copy files over to the cluster from the H: drive, but for large datasets this is costly in terms of disk space. A viable alternative is to "mount" this network drive on marvin, which avoids the duplication. Once set up, this is better than copying because the directory appears to be available locally.
However, mounting H: depends on individual authentication, and so it is not easy to mount system-wide. Every user who wants it must do it manually. This also means that it cannot be tested without the cooperation of the user, who must enter their ID and password.
So, as it is not an entirely easy thing to do, several methods are presented.
Several methods
MiSeq Data Area Backup
Probably the only detail a user now (Nov 2017) needs to know is that a mirror of the MiSeq Data Area is held on marvin, updated nightly, and visible to all nodes (and can therefore be used on all queues) at:
/shelf/MiSeq_Data_Area_Backup
The actual procedure is very clunky, but automation via cron jobs now makes the clunkiness transparent. Read on for all the gory details.
- A backup of the MiSeq Data Area exists on the 138.251.175.12 machine and is synchronised with the H: drive every day at midnight.
- This backup can then be mounted (only in user kap6's area) via
sshfs kap6@138.251.175.12:/mnt/vst2/MiSeq_Data_Area ~/miseqdabkmpt
- Every night at 02:20, a cron job belonging to user kap6 mounts this backup on marvin and performs an rsync to /shelf/MiSeq_Data_Area_Backup
- The log of this rsync is held in kap6's home directory, in the folder mountpointlogs, i.e.
/storage/home/users/kap6/mountpointlogs/mt_to_shelf.txt
This is visible to members of the miseq0 group.
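The nightly job described above could be sketched roughly as follows. This is not the actual kap6 script, only an illustration under stated assumptions: the rsync flags, the run helper, the DRY_RUN switch and the script path in the crontab comment are all hypothetical, while the host, paths and log file are those given in the text.

```shell
# Sketch of the nightly mirror job. A crontab entry for 02:20 would look like:
#   20 2 * * * /path/to/this_script.sh        (script path is hypothetical)
REMOTE="kap6@138.251.175.12:/mnt/vst2/MiSeq_Data_Area"
MOUNTPOINT="$HOME/miseqdabkmpt"
TARGET="/shelf/MiSeq_Data_Area_Backup"
LOG="$HOME/mountpointlogs/mt_to_shelf.txt"

# With DRY_RUN=1 (the default here) each step is only printed, never executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

mirror_miseq() {
    run sshfs "$REMOTE" "$MOUNTPOINT" || return 1   # mount the backup over SSH
    run rsync -a "$MOUNTPOINT/" "$TARGET/"          # copy into the shared area (flags are an assumption)
    run fusermount -u "$MOUNTPOINT"                 # unmount afterwards, even if rsync failed
}

# For a real run, set DRY_RUN=0 and append the output to the log:
#   DRY_RUN=0 mirror_miseq >> "$LOG" 2>&1
mirror_miseq
```

Run as is, the sketch only prints the three commands, which makes it safe to inspect before anything touches the shared area.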
Why only the kap6 user?
A user ID is required on the 138.251.175.12 machine for this to work, and the user's public key on marvin also needs to be recorded in the user's home directory on that machine.
However, /shelf/MiSeq_Data_Area_Backup is visible to all members of the miseq0 group. User kap6 only carries out the mirroring, nightly, so this is a sufficient solution.
Unmounting
To unmount, the fusermount command may be used with the -u option, like so:
fusermount -u ~/mnt/MiSeq_Data_Backup
NOTE: This mounts the directory inside a user's home directory, but the user's home directory is in turn exported via NFS to the nodes. SSHFS and NFS are two entirely different technologies: besides the SSHFS mount not being available on the nodes, it is likely to cause some trouble inside the NFS protocol, which may lead to some instability. However, the trouble appears to be minor, so this solution, while ugly, is viable.
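Because a stale or missing mount makes fusermount complain, it can be convenient to test the directory first. A minimal sketch, assuming the mountpoint utility (util-linux) is installed; the function name safe_unmount is hypothetical:

```shell
# Hypothetical helper: unmount a FUSE mount only if something is actually mounted there.
safe_unmount() {
    dir=$1
    if mountpoint -q "$dir"; then
        fusermount -u "$dir"
    else
        echo "not mounted: $dir"
    fi
}

# Example (not run here):
# safe_unmount ~/mnt/MiSeq_Data_Backup
```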
The GVFS method
The key to this is the GNOME Virtual File System, gvfs.
It is possible to get the H: drive mounted on the marvin frontend, mainly because it is running GNOME.
However, the nodes are not, so currently they cannot mount the H: drive.
This means when working with the raw data, only the marvin.q can be used.
Administration Aspects
Tests
Note that smbclient (Samba's ftp-like client) works well and navigates folders fine.
The debugging level can be increased according to the following link.
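For reference, an smbclient test of this kind might be assembled as below. This is only a sketch: it assumes that "shared" is the share name and Med_Research/res a directory within it (read off the network path given in the Introduction), and the username and debug level are placeholders. The helper smb_test_cmd is hypothetical and only prints the command rather than running it.

```shell
# Build (but do not run) an smbclient command that lists the H: drive directory.
# -U sets the user, -D the initial directory, -d the debug level, -c the command to execute.
smb_test_cmd() {
    user=$1
    debug=${2:-1}
    echo "smbclient //cfs.st-andrews.ac.uk/shared -U $user -D Med_Research/res -d $debug -c ls"
}

smb_test_cmd someuser 3
```

Raising the debug level in the second argument gives progressively more detail about the connection and authentication steps.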
GVFS Environment
GVFS is part of the Gnome mega project.
Being the Window Manager it's a rather important component and cannot be dealt with abruptly. It's not clear how to restart it remotely. The usual Ctrl+Alt+Backspace
To restart gdm, the following rather rough method is actually the recommended one as can be seen here: https://access.redhat.com/solutions/36382
(This applies only to RHEL6. RHEL7 uses systemd and the new GNOME 3, which are coordinated and provide a systemctl method for restarting.)
The command is as follows
pkill -f gdm-binary
This definitely appears to have the desired effect, as can be seen from this interaction:
[root@marvin etc]# psg gdm
root 35930 0.0 0.0 134028 2040 ? Ssl Aug04 0:00 /usr/sbin/gdm-binary -nodaemon
root 35980 0.0 0.0 176912 3068 ? Sl Aug04 0:00 /usr/libexec/gdm-simple-slave --display-id /org/gnome/DisplayManager/Display1
root 35983 0.0 0.0 357396 29472 tty1 Ssl+ Aug04 5:23 /usr/bin/Xorg :0 -br -verbose -audit 4 -auth /var/run/gdm/auth-for-gdm-yv0pnl/database -nolisten tcp vt1
gdm 36020 0.0 0.0 20048 448 ? S Aug04 0:00 /usr/bin/dbus-launch --exit-with-session
gdm 36021 0.0 0.0 44060 848 ? Ssl Aug04 0:00 /bin/dbus-daemon --fork --print-pid 5 --print-address 7 --session
gdm 36023 0.0 0.0 269204 7100 ? Ssl Aug04 0:00 /usr/bin/gnome-session --autostart=/usr/share/gdm/autostart/LoginWindow/
gdm 36026 0.0 0.0 133292 2412 ? S Aug04 0:06 /usr/libexec/gconfd-2
gdm 36027 0.0 0.0 120724 4856 ? S Aug04 0:05 /usr/libexec/at-spi-registryd
gdm 36031 0.1 0.0 435356 39176 ? Ssl Aug04 26:22 /usr/libexec/gnome-settings-daemon --gconf-prefix=/apps/gdm/simple-greeter/settings-manager-plugins
gdm 36033 0.0 0.0 358560 2848 ? Ssl Aug04 0:00 /usr/libexec/bonobo-activation-server --ac-activate --ior-output-fd=12
gdm 36040 0.0 0.0 135288 2164 ? S Aug04 0:00 /usr/libexec/gvfsd
gdm 36041 0.0 0.0 346416 8808 ? S Aug04 0:05 metacity
gdm 36042 0.0 0.0 442112 13656 ? S Aug04 0:36 /usr/libexec/gdm-simple-greeter
gdm 36044 0.0 0.0 248320 6344 ? S Aug04 0:00 /usr/libexec/polkit-gnome-authentication-agent-1
gdm 36045 0.0 0.0 273864 7624 ? S Aug04 0:18 gnome-power-manager
root 36054 0.0 0.0 141792 1968 ? S Aug04 0:00 pam: gdm-password
root 47707 0.0 0.0 122752 1580 pts/11 S+ 15:05 0:00 grep gdm
[root@marvin etc]# pkill -f gdm-binary
[root@marvin etc]# psg gdm
root 47876 0.1 0.0 134028 2176 ? Ssl 15:09 0:00 /usr/sbin/gdm-binary -nodaemon
root 47926 0.2 0.0 176912 3536 ? Sl 15:09 0:00 /usr/libexec/gdm-simple-slave --display-id /org/gnome/DisplayManager/Display1
root 47929 5.2 0.0 354528 34536 tty1 Ssl+ 15:09 0:02 /usr/bin/Xorg :0 -br -verbose -audit 4 -auth /var/run/gdm/auth-for-gdm-QdHU0O/database -nolisten tcp vt1
gdm 47967 0.0 0.0 20048 696 ? S 15:09 0:00 /usr/bin/dbus-launch --exit-with-session
gdm 47968 0.0 0.0 44060 1236 ? Ssl 15:09 0:00 /bin/dbus-daemon --fork --print-pid 5 --print-address 7 --session
gdm 47970 0.2 0.0 269204 8644 ? Ssl 15:09 0:00 /usr/bin/gnome-session --autostart=/usr/share/gdm/autostart/LoginWindow/
gdm 47973 0.2 0.0 133292 5276 ? S 15:09 0:00 /usr/libexec/gconfd-2
gdm 47974 0.0 0.0 120724 5736 ? S 15:09 0:00 /usr/libexec/at-spi-registryd
gdm 47978 1.0 0.0 408268 13224 ? Ssl 15:09 0:00 /usr/libexec/gnome-settings-daemon --gconf-prefix=/apps/gdm/simple-greeter/settings-manager-plugins
gdm 47980 0.0 0.0 358560 3596 ? Ssl 15:09 0:00 /usr/libexec/bonobo-activation-server --ac-activate --ior-output-fd=12
gdm 47987 0.0 0.0 135288 2168 ? S 15:09 0:00 /usr/libexec/gvfsd
gdm 47988 0.1 0.0 346416 10704 ? S 15:09 0:00 metacity
gdm 47989 0.4 0.0 452400 16716 ? S 15:09 0:00 /usr/libexec/gdm-simple-greeter
gdm 47991 0.0 0.0 248320 8140 ? S 15:09 0:00 /usr/libexec/polkit-gnome-authentication-agent-1
gdm 47992 0.1 0.0 273864 9476 ? S 15:09 0:00 gnome-power-manager
root 48002 0.0 0.0 141792 2344 ? S 15:09 0:00 pam: gdm-password
root 48020 0.0 0.0 122748 1572 pts/11 S+ 15:10 0:00 grep gdm
Methods
GVFS will allow the user to mount the filesystem, though it also requires a "running user d-bus session, typically started with desktop session on login".
Two tools are used for this: gvfs and fuse
- a user must be a member of group "fuse"
- a gvfs daemon must be running under user gdm: the system administrator should ensure this.
- The script to use is
#!/bin/bash
export $(dbus-launch)
gvfs-mount smb://cfs.st-andrews.ac.uk/shared/med_research/res
/usr/libexec/gvfs-fuse-daemon ~/.gvfs
which can be launched as a normal user.
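Before running the script, the two requirements listed above (membership of group "fuse" and a gvfs daemon running under user gdm) can be checked from the command line. A minimal sketch; both function names are hypothetical, and the process name gvfsd is taken from the ps listing earlier on this page:

```shell
# Hypothetical pre-flight checks for the GVFS method.
in_group() {
    # succeeds if the current user belongs to the named group
    id -nG | tr ' ' '\n' | grep -qx "$1"
}

gvfsd_running_as_gdm() {
    # succeeds if a gvfsd process owned by user gdm exists
    pgrep -u gdm -x gvfsd >/dev/null 2>&1
}

# Example (not run here):
# in_group fuse || echo "ask the administrator to add you to group fuse"
# gvfsd_running_as_gdm || echo "gvfsd is not running under gdm"
```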
Notes
- gvfs-mount -l seems useless, reports nothing.
Relevant help pages
/usr/libexec/gvfs-fuse-daemon
usage: /usr/libexec/gvfs-fuse-daemon mountpoint [options]
general options:
    -o opt,[opt...]        mount options
    -h   --help            print help
    -V   --version         print version
FUSE options:
    -d   -o debug          enable debug output (implies -f)
    -f                     foreground operation
    -s                     disable multi-threaded operation
    -o allow_other         allow access to other users
    -o allow_root          allow access to root
    -o nonempty            allow mounts over non-empty file/dir
    -o default_permissions enable permission checking by kernel
    -o fsname=NAME         set filesystem name
    -o subtype=NAME        set filesystem type
    -o large_read          issue large read requests (2.4 only)
    -o max_read=N          set maximum size of read requests
    -o hard_remove         immediate removal (don't hide files)
    -o use_ino             let filesystem set inode numbers
    -o readdir_ino         try to fill in d_ino in readdir
    -o direct_io           use direct I/O
    -o kernel_cache        cache files in kernel
    -o [no]auto_cache      enable caching based on modification times (off)
    -o umask=M             set file permissions (octal)
    -o uid=N               set file owner
    -o gid=N               set file group
    -o entry_timeout=T     cache timeout for names (1.0s)
    -o negative_timeout=T  cache timeout for deleted names (0.0s)
    -o attr_timeout=T      cache timeout for attributes (1.0s)
    -o ac_attr_timeout=T   auto cache timeout for attributes (attr_timeout)
    -o intr                allow requests to be interrupted
    -o intr_signal=NUM     signal to send on interrupt (10)
    -o modules=M1[:M2...]  names of modules to push onto filesystem stack
    -o max_write=N         set maximum size of write requests
    -o max_readahead=N     set maximum readahead
    -o async_read          perform reads asynchronously (default)
    -o sync_read           perform reads synchronously
    -o atomic_o_trunc      enable atomic open+truncate support
    -o big_writes          enable larger than 4kB writes
    -o no_remote_lock      disable remote file locking
Module options:
[subdir]
    -o subdir=DIR           prepend this directory to all paths (mandatory)
    -o [no]rellinks         transform absolute symlinks to relative
[iconv]
    -o from_code=CHARSET   original encoding of file names (default: UTF-8)
    -o to_code=CHARSET      new encoding of the file names (default: UTF-8)
Debian Jessie mounts it, Redhat doesn't
Red Hat appears to mount the H: drive, but cannot get into any of the subdirectories except hallsport. The command getcifsacl fails on Med_Research. It seems to be a wrapper for getxattr (get extended attribute). But all this may really be saying the same thing: that the subdirectories are just not visible.
Debian Linux has no problem mounting the H: drive. mount reports its options as the following:
noauto,users,rw,credentials=/storage/home/users/ramon/.smbcredentials,nosuid,nodev,noexec,relatime,vers=1.0,sec=ntlm,cache=strict,uid=0,noforceuid,gid=0,noforcegid,file_mode=0755,dir_mode=0755,nounix,serverino,mapposix,rsize=61440,wsize=65536,echo_interval=60,actimeo=1
Maybe these defaults are necessary? Attempt made but failed.
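For anyone repeating the attempt, the credentials= option above points at a small key=value file read by mount.cifs. The sketch below writes such a file with owner-only permissions. The helper name write_smbcredentials, the mount point /mnt/hdrive and all values are placeholders, and the mount command in the comment is only one plausible way of passing the options Debian reports, not a known-working recipe for Red Hat.

```shell
# Write a credentials file in the key=value format read by mount.cifs,
# readable only by its owner. All values passed in are placeholders.
write_smbcredentials() {
    file=$1; user=$2; pass=$3
    ( umask 077; printf 'username=%s\npassword=%s\n' "$user" "$pass" > "$file" )
    chmod 600 "$file"   # tighten permissions in case the file already existed
}

# Example (not run here), as root, with a hypothetical mount point /mnt/hdrive:
# write_smbcredentials ~/.smbcredentials myuser 'mypassword'
# mount -t cifs //cfs.st-andrews.ac.uk/shared/Med_Research/res /mnt/hdrive \
#     -o credentials=$HOME/.smbcredentials,vers=1.0,sec=ntlm
```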
As well as version differences, there is the issue of the CIFS kernel module. There is a vague remembrance of this having worked before, and it's possible that there is a bug in the latest kernel, which is nasty.
The latest action on this was to post a description of the problem on the Red Hat Customer Portal. As may be expected, Red Hat has a less recent version of cifs-utils and, indeed, of mount (which belongs to util-linux).
