I was tired of searching the interwebs for these common commands again and again. So, I thought this would be a good way to initiate my blog and keep track of these handy commands.
a.k.a. Uncompress and unarchive a File/Folder
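A typical extraction command for a gzip-compressed tarball (the archive name here is just a placeholder):
tar xzf archive_name.tar.gz
- x extracts, z filters the archive through gzip, f takes the archive file name.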
a.k.a. Compress and archive a File/Folder
tar czf new_archive_name /path/to/file_or_folder
find <path> -name <filename>
- This command is recursive (it will look in subfolders).
- Shell wildcard patterns (globs) such as '*.txt' may be used in the filename; quote the pattern so the shell does not expand it first.
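For example, to list every .log file under a (hypothetical) ~/projects directory:
find ~/projects -name '*.log'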
Create Symbolic Links
Navigate to the location where you wish to have the link created.
ln -s path/to/file/or/folder link-name
Create symbolic links for multiple files at a time
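A common sketch, assuming the files live in `/path/to/folder' and you want the links in the current directory, is to let the shell expand a glob:
ln -s /path/to/folder/* .
- One symbolic link is created here (`.') for every file the glob matches.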
Delete Symbolic Links
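A minimal sketch, assuming `link-name' is the link created above:
rm link-name
- This removes only the link itself, not the file or folder it points to. `unlink link-name' does the same thing.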
Find the number of processors in your Unix system.
grep -c "processor" /proc/cpuinfo
- If your system supports hyper-threading, this counts logical processors, so the result will be a multiple of the actual number of physical cores.
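On systems with GNU coreutils, `nproc' is a shorter alternative; it also reports logical processors (the processing units available to the current process):
nproc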
Find and replace one-liner
sed -i 's/findThis/replaceWith/g' filename
- `-i' means in place (the file is edited directly, on the spot).
- The command says: with sed, find all instances (the g flag) of "findThis" and substitute (s) them with "replaceWith", in place (-i), in "filename".
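For example (the file name and words here are only an illustration), this replaces every occurrence of "colour" with "color" throughout notes.txt:
sed -i 's/colour/color/g' notes.txt
- Note for BSD/macOS sed: -i requires a backup suffix argument (which may be empty), e.g. sed -i '' 's/colour/color/g' notes.txt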
Sort a file in ascending order
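A typical invocation (column 2 and the filename are placeholders) looks like:
sort -k 2 -g filename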
- `-k' means column number, starting from 1, i.e. there is no column `0'.
- `-g' means sort the column numerically (general numeric sort, which also handles decimals and scientific notation).
Sort a file in descending order
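Same idea, with the order reversed (again, column 2 and the filename are placeholders):
sort -k 2 -g -r filename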
- `-k' means column number, starting from 1, i.e. there is no column `0'.
- `-g' means sort the column numerically (general numeric sort, which also handles decimals and scientific notation).
- `-r' reverses the order (descending)
Get unique values in a column
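A sketch, assuming you want the unique values of column 2 of `filename':
sort -u -k 2 filename
- One full line is kept for each distinct key; use -k 2,2 to restrict the key to column 2 only.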
- `-u' means get unique values only
- `-k' means column number, starting from 1, i.e. there is no column `0'.
Calculating the sum of a column
awk '{ total += $3 } END {print total}' filename
- $3 = column 3 (start counting from 1); change 3 to whichever column number you want the sum for.
- total = any variable
Calculating the average of a column
awk '{ total += $3 } END {print total/NR}' filename
- $3 = column 3 (start counting from 1); change 3 to whichever column number you want the average for.
- total = any variable
- NR = number of records.
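If you want both values in one pass, a combined sketch (same placeholder column and filename) is:
awk '{ total += $3 } END { print "sum:", total, "average:", total/NR }' filename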
Sync your folders/directories across servers
From your local machine:
rsync -chavzrP --stats user@remote.host:/path/to/copy /path/to/local/storage
- -c, --checksum skip based on checksum, not mod-time & size
- -h, --human-readable output numbers in a human-readable format
- -a, --archive archive mode; equals -rlptgoD (no -H,-A,-X)
- -v, --verbose increase verbosity
- -z, --compress compress file data during the transfer
- -r, --recursive recurse into directories
- -P same as --partial --progress
- --partial keep partially transferred files
- --progress show progress during transfer
- --stats give some file-transfer stats
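To push in the other direction (local machine to remote server), swap the source and destination; the host and paths are the same placeholders as above:
rsync -chavzrP --stats /path/to/local/storage user@remote.host:/path/to/copy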
Download the contents of a webpage to a file
cURL is a software package consisting of a command-line tool and a library for transferring data using URL syntax.
curl -O https://www.someWebsite.com/somePage.htm
- -o (lowercase o): the result will be saved to the filename you provide on the command line
- -O (uppercase O): the filename from the URL is used as the name of the saved file (in the example above, you'll get a file called "somePage.htm")
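For example, with the lowercase -o flag (the output filename here is just an illustration):
curl -o somePage_copy.html https://www.someWebsite.com/somePage.htm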
Recursively md5sum (checksum) all files in a directory
find /path/to/dir/ -name '*.fastq' -type f -execdir md5sum {} \; >> fastq_checksum.md5
- The snippet above says: find all the files (-type f) with the extension fastq (-name '*.fastq') in the `path/to/dir' directory, run md5sum on each file from within its own directory (-execdir), and append the checksums to the `fastq_checksum.md5' file.
- `{}' is a placeholder for the file that is found; each file name found replaces the placeholder, and the command is executed with that file name.
- The `\' escapes the `;' which indicates the end of the command to be executed.
To verify the checksums against the actual files:
md5sum -c fastq_checksum.md5
- Note: because -execdir records only bare file names, run the check from the directory containing the files (or use -exec to record full paths).