Jay's journal

Tuesday, July 2, 2013

Some Usages of cpio

The command cpio allows us to copy files between archives and directories. There are three operation modes: output (-o), input (-i), and pass-through (-p).

To create an archive for directory tree src, we issue:

find src -print0 | cpio -ov0 > src.cpio

To extract files from the archive just created, we issue:

cpio -ivd < src.cpio
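
The two commands above can be rehearsed as a round trip in a scratch directory. This is only a sketch: it assumes GNU cpio, and the directory and file names are made up for the demo.

```shell
#!/bin/sh
# Round-trip sketch: archive a scratch tree with cpio, then extract it elsewhere.
# Assumes GNU cpio; every path below is a scratch name created for this demo.
command -v cpio >/dev/null 2>&1 || { echo "cpio not installed; skipping" >&2; exit 0; }

work=$(mktemp -d)
mkdir -p "$work/in/src/sub" "$work/out"
echo "hello" > "$work/in/src/a.txt"
echo "world" > "$work/in/src/sub/b.txt"

# Copy-out: member names are relative because find starts at "src".
( cd "$work/in" && find src -print0 | cpio -o0 > "$work/src.cpio" ) 2>/dev/null

# List the archive without extracting anything (-t).
listing=$(cpio -it < "$work/src.cpio" 2>/dev/null)
echo "$listing"

# Copy-in: -d creates leading directories as needed.
( cd "$work/out" && cpio -id < "$work/src.cpio" ) 2>/dev/null

restored=$(cat "$work/out/src/sub/b.txt")
echo "restored: $restored"
```

Listing with -t before extracting is a cheap way to check whether the member names are relative or absolute.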

To copy files from directory tree src to directory dest, we issue:

find src -print0 | cpio -pmd0 dest

To copy files from src to dest that were modified less than 2 days ago and whose names contain 'txt', we let find do the name matching (a plain grep cannot filter a NUL-separated list), and issue:

find src -mtime -2 -name '*txt*' -print0 | cpio -pmd0 dest

To copy files from src to dest that were modified less than 2 days ago and whose contents contain the string 'txt', we issue:

find src -mtime -2 -type f -print0 | xargs -0 grep -lZ txt | cpio -pmd0 dest
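
Because 'cpio -0' expects NUL-terminated names, every stage that feeds it has to emit NUL separators. The sketch below shows a content filter that stays NUL-safe from end to end, using a file name with a space in it. It assumes GNU find, xargs, grep, and cpio; the file names are invented for the demo.

```shell
#!/bin/sh
# Pass-through sketch: copy only files whose contents mention "txt",
# keeping names NUL-separated through the whole pipeline.
command -v cpio >/dev/null 2>&1 || { echo "cpio not installed; skipping" >&2; exit 0; }

work=$(mktemp -d)
mkdir -p "$work/src" "$work/dest"
printf 'see notes.txt\n' > "$work/src/my report.log"   # content matches; name has a space
printf 'nothing here\n'  > "$work/src/other.log"       # content does not match

cd "$work" || exit 1

# find -print0 -> xargs -0 -> grep -Z (NUL after each matching name) -> cpio -0
find src -type f -print0 | xargs -0 grep -lZ txt | cpio -pmd0 dest 2>/dev/null

copied=$(find dest -type f | sort)
echo "$copied"
```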

Combined with ssh (or netcat, if you are not concerned with security), cpio allows us to copy files to remote hosts. Here are some examples:

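To back up the directory tree src to a remote host, we may issue:

find src -print0 | cpio -oaV0 -H tar --rsh-command=ssh -O user@remotehost:src.tar

which writes the archive file src.tar on the remote host. GNU cpio reaches the remote host through rsh by default; the option '--rsh-command=ssh' makes it use ssh instead.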

To copy files from src to a remote host, we issue:

find src -print0 | cpio -oaV0 -H tar | ssh user@remotehost "cpio -imd"

Monday, July 1, 2013

Access UFS File System under Linux

The Unix file system (UFS) is used by many Unix systems, for example, FreeBSD, OpenBSD, and HP-UX. There are times when we need to access UFS under Linux. The following command mounts a UFS2 file system read-only (ro) under Linux:

mount -t ufs -o ufstype=ufs2,ro /dev/sdXY /mnt/path

Write support for UFS is not compiled into Linux kernels by default. To get it, one needs to configure and compile a kernel with UFS write support enabled.

Saturday, June 29, 2013

Netcat for File Transfer

File transfer is one of the most practical usages of netcat.

To transfer a file named filename from client to server, we first issue the following command on server:

nc -l -p 1234 > filename

where 1234 is the port number used by the server. On the client side, we issue:

nc -q 10 server 1234 < filename

where '-q 10' tells nc to wait 10 seconds after EOF on stdin and then quit. Once the client closes the connection, nc on the server quits as well.

To transfer a directory tree named /path from client to server, we issue the following command on server:

nc -l -p 1234 | tar xvzf -

On the client side, we issue:

tar cvzf - /path | nc -q 10 server 1234

If we want to reverse the direction of file transfer, i.e., the client pulls a file from the server, we use:

nc -q 10 -l -p 1234 < filename

on server, and

nc server 1234 > filename

on client. Similarly, to reverse the direction of directory tree transfer, we use:

tar cvzf - /path | nc -q 10 -l -p 1234

on server, and

nc server 1234 | tar xvzf -

on client.
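
The two tar halves of the directory transfer can be rehearsed locally by joining them with a plain pipe where nc would sit. The sketch below uses scratch directories named client and server (invented for the demo) and the -C option to control where each side works. With an absolute name such as /path, GNU tar strips the leading '/' from member names, so the tree is recreated under whatever directory the receiving tar runs in.

```shell
#!/bin/sh
# Local rehearsal of "tar c ... | nc" and "nc | tar x ...", with the network
# hop replaced by a pipe. Scratch directories stand in for the two hosts.
work=$(mktemp -d)
mkdir -p "$work/client/path/sub" "$work/server"
echo "payload" > "$work/client/path/sub/file.txt"

# Sender half | receiver half. -C sets the working directory of each side,
# so members are stored as "path/..." and land under server/.
tar czf - -C "$work/client" path | tar xzf - -C "$work/server"

received=$(cat "$work/server/path/sub/file.txt")
echo "received: $received"
```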

Thursday, June 27, 2013

Using findutils to delete files

Here is a summary of the "Deleting Files" page of the findutils manual.

The most efficient and secure method to delete every file whose name ends in '~' under the directory /path is:

find /path -name \*~ -delete

Using the command xargs may allow us to achieve the same efficiency, but it is not as secure:

find /path -name \*~ -print0 | xargs -0 /bin/rm

where '-print0' specifies using ASCII NUL to separate the entries in the file list, and similarly '-0' for command xargs.

If the '-delete' action is not available, we may use action '-execdir' or '-exec':

find /path -name \*~ -execdir /bin/rm {} \+

find /path -name \*~ -exec /bin/rm {} \+

Action '-execdir' is secure but less portable. On the other hand, action '-exec' is the most efficient portable option but is insecure. These two actions can also be used for things other than deleting files.
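
Whichever action is used, it is worth previewing the matches before deleting anything. A minimal sketch, assuming GNU find (for -delete) and a scratch directory with made-up file names:

```shell
#!/bin/sh
# Preview-then-delete sketch in a scratch tree.
work=$(mktemp -d)
mkdir -p "$work/sub"
touch "$work/notes.txt" "$work/notes.txt~" "$work/sub/draft~" "$work/sub/keep me.txt"

# Preview: same tests as the real run, but -print instead of -delete.
find "$work" -type f -name '*~' -print

# Delete once the preview shows only what should go.
find "$work" -type f -name '*~' -delete

remaining=$(find "$work" -type f | wc -l)
backups=$(find "$work" -name '*~' | wc -l)
echo "remaining=$remaining backups=$backups"
```

Quoting the pattern as '*~' has the same effect as the backslash form \*~: both keep the shell from expanding it before find sees it.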

Tuesday, June 25, 2013

Use command find to clean up your file system

Some editors create back-up files with the same file name and a '~' suffix. Once you are done with editing, there is no need to keep these back-up files. The command find allows us to find and remove all such back-up files in the working directory and below:

find . -name \*~ -exec rm {} \;

If you want to remove files that have not been accessed for more than one year (365 days), you may issue:

find . -atime +365 -exec rm {} \;

The command find comes with many other tests. For example, you may use '-mtime' for the last modification time, '-ctime' for the last status change time, or even '-inum' for the inode number.
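
The time tests can be tried safely on scratch files whose timestamps are set by hand. This sketch assumes GNU touch (for the relative date given to -d) and uses invented file names:

```shell
#!/bin/sh
# Time-test sketch: one fresh file, one file back-dated by 400 days.
work=$(mktemp -d)
touch "$work/fresh.txt"
touch -d '400 days ago' "$work/stale.txt"   # GNU touch; sets both atime and mtime

# find counts age in whole 24-hour periods and drops the fraction,
# so "+365" matches files last accessed at least 366 days ago.
old=$(find "$work" -type f -atime +365)

# "-2" matches files modified less than 2 days ago.
recent=$(find "$work" -type f -mtime -2)

echo "old: $old"
echo "recent: $recent"
```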

Monday, May 6, 2013

Using public keys for SSH authentication

SSH supports many methods for user authentication. Public key, password, and host-based are three main methods specified in RFC 4252 (The Secure Shell (SSH) Authentication Protocol). With the public key method, the possession of a user's private key serves as authentication. Host-based authentication works similarly: the possession of host's (client's) private key enables the authentication based on the user names on the server and the client.

SSH authentication using public keys can be achieved in two steps:

create a public and private key pair on client side, and
copy your public key to server.

Once these are done, we should be able to log in remotely without being prompted for a password.

The command ssh-keygen allows us to generate and manage authentication keys. To generate a pair of RSA keys, we use:

ssh-keygen -t rsa

It generates a pair of RSA keys and saves them in directory $HOME/.ssh; the default name for public (private) key is id_rsa.pub (id_rsa). Always use passphrases to protect your private keys.

Public keys are not sensitive data in general, so we may choose any method to copy them. However, the command ssh-copy-id provides an easy way to accomplish this. You use:

ssh-copy-id user@server

to copy your public key to the server. The key is appended to the file $HOME/.ssh/authorized_keys on the server. Don't forget to specify the path of your public key file (option -i) if it is not $HOME/.ssh/id_rsa.pub on the client side.
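
If ssh-copy-id is not available, the key can be appended by hand. The sketch below rehearses the server-side steps against a scratch directory that stands in for the remote $HOME; the key string is a placeholder and install_key is a hypothetical helper name. The 700 and 600 modes are the conventional ones, chosen here because sshd may refuse keys kept in files that others can write to.

```shell
#!/bin/sh
# Rehearsal of what has to happen on the server, run against a scratch
# directory instead of a real $HOME. Nothing here touches your real keys.
fake_home=$(mktemp -d)                                   # stands in for $HOME on the server
pubkey='ssh-rsa AAAAB3NzaC1yc2E_PLACEHOLDER user@client' # placeholder, not a real key

# install_key is a hypothetical helper for this demo.
install_key() {
    mkdir -p "$fake_home/.ssh"
    chmod 700 "$fake_home/.ssh"
    touch "$fake_home/.ssh/authorized_keys"
    # Append only if this exact line is not already present.
    grep -qxF "$1" "$fake_home/.ssh/authorized_keys" ||
        printf '%s\n' "$1" >> "$fake_home/.ssh/authorized_keys"
    chmod 600 "$fake_home/.ssh/authorized_keys"
}

install_key "$pubkey"
install_key "$pubkey"    # second call is a no-op

count=$(wc -l < "$fake_home/.ssh/authorized_keys")
mode=$(stat -c '%a' "$fake_home/.ssh/authorized_keys")   # GNU stat
echo "keys=$count mode=$mode"
```

Against a real server, the same steps would be run through ssh, with the public key file piped in on stdin.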

An ssh-agent is very helpful in using public keys for SSH authentication; it is strongly recommended.

Wednesday, May 1, 2013

Hard and Symbolic links

A hard link is an entry in a directory file that associates a name with an (existing) file on a file system, which allows a file to appear in multiple paths.

Unix/Linux systems do not allow hard links on directories, since they may create endless cycles. Hard links are limited to files on the same volume, because the name and file association in each hard link is through the inode. Most file systems that support hard links use a link count to keep track of the total number of links that point to the inode (file). To find all the files which refer to the same file as NAME, we may use the command find with the option '-samefile NAME' or '-inum INODE', where INODE is the inode number of NAME. The command ls with option '-il' gives you information on the link count and inode of files.
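
The link count and the find tests can be watched at work in a scratch directory. This is a small sketch, assuming GNU stat and find; the names original and alias are made up.

```shell
#!/bin/sh
# Hard-link sketch: two names, one inode.
work=$(mktemp -d)
cd "$work" || exit 1

echo "data" > original
ln original alias                 # second directory entry for the same inode

ls -il original alias             # first column is the inode; link count follows the mode

links=$(stat -c '%h' original)    # GNU stat: number of hard links
same=$(find . -samefile original | sort | tr '\n' ' ')
echo "links=$links same=$same"

rm original                       # removes one name; the data stays reachable
content=$(cat alias)
after=$(stat -c '%h' alias)
echo "content=$content links_after=$after"
```

Removing a name only decrements the link count; the file's data is released when the count reaches zero and no process still has the file open.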

A symbolic link is a special type of file that contains a text string which is interpreted by the operating system as a path to another file/directory. The other file/directory is usually called the "target". A symbolic link is another file that exists independently of its target, i.e., they are two files/directories indexed by two different inodes, as opposed to hard links. Symbolic links are different from hard links in that:

a symbolic link may point to a directory, and
a symbolic link may point to a directory/file on a different volume.

There is one issue with symbolic links. If a symbolic link is removed, its target remains unaffected. However, there is no automatic update for a symbolic link if its target is moved, renamed, or deleted. The symbolic link continues to exist and point to the original target, which no longer exists. This is called a broken link.
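
A broken link is easy to produce and to detect in a scratch directory. The sketch below assumes GNU find (for -xtype); the file names are invented for the demo.

```shell
#!/bin/sh
# Broken-symlink sketch: rename the target and watch the link dangle.
work=$(mktemp -d)
cd "$work" || exit 1

echo "data" > target
ln -s target link
via_link=$(cat link)              # reads through the link while the target exists

mv target renamed                 # the link is not updated

# -L is true for a symlink itself; -e follows the link, so it is false
# when the target is missing.
if [ -L link ] && [ ! -e link ]; then state=broken; else state=ok; fi

stored=$(readlink link)           # the path text stored in the link, unchanged
dangling=$(find . -xtype l)       # GNU find: symlinks whose target cannot be resolved
echo "state=$state stored=$stored dangling=$dangling"
```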