HPC 0

Introduction to Linux for HPC

Research Computing Team and Service

  • Here to support research(ers)
    • Provide training
    • Support users of Grid and Cloud Computing platforms
    • Provide consultancy
      • To develop project proposals
      • To help recruit people with specialist skills
      • Working directly on research projects
  • For details please see our Website
  • Contact us via the IT Service Desk

Training Themes

Aims of this training

Syllabus

  • Logging in
  • Entering commands
  • Finding out about commands
  • File system navigation
  • Commands to list, create, copy, move and delete files
  • Hidden files
  • Command history
  • Control key combinations
  • Editing files
  • File permissions
  • Commands to explore and filter data
  • Wildcards
  • Shell scripting basics

Logging in

Host Operating System

  • OSX/Linux:
    • Use terminal
  • Windows:
    • Use MobaXTerm

Connection

  • On campus (wireless is like ‘off campus’)
  • Off campus (connected to the University VPN is like ‘on campus’)

Logging in demos

Logging in exercise

  • Please log in

What is a shell? 1/5

  • A shell is a command interpreter program and scripting language; it runs inside a terminal
  • The terminal is a window on the screen with a prompt at which commands are entered and where Standard Output (stdout) and Standard Error (stderr) streams get displayed
  • During login, set up is finalised by reading files from your HOME directory
  • When you login you start in your HOME directory

What is a shell? 2/5

  • BASH (bash) is a type of shell

What is a shell? 3/5

What is a shell? 4/5

  • Shells have built-in commands:
    • When called from a shell, these run without invoking (calling) other programs
  • Shells call other programs by name, and these are looked for in the user PATH
  • PATH is a variable holding an ordered list of filesystem directories
  • Each user has a PATH set upon login
  • PATH is a variable and can be changed
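
As a minimal sketch of the points above, the following inspects PATH and asks the shell how it would interpret two command names. The directories printed will differ between systems, and type and tr are standard tools that are not otherwise covered in this course.

```shell
# Print PATH: an ordered list of directories separated by colons
echo "$PATH"

# Print the same list one directory per line, in the order it is searched
echo "$PATH" | tr ':' '\n'

# 'type -t' reports how the shell would interpret a name
cd_kind=$(type -t cd)   # cd is built in to the shell
ls_kind=$(type -t ls)   # ls is normally a program file found via PATH
echo "cd: $cd_kind, ls: $ls_kind"
```

At an interactive prompt ls may be reported as an alias rather than a file, depending on how your login files are set up.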

What is a shell? 5/5

  • On ARC3 and ARC4 additional Linux command utilities are available
  • Additional software has also been installed and made available via a module system
  • This functionality can differ between ARC3 and ARC4, but is all accessed via the shell

Filesystem basics

Forward-slash / and back-slash \

Absolute path

Relative path

The default prompt

  • Upon login, you should see a prompt that looks like:
[exuser@host ~]$
  • exuser should be your username
  • host should start with login1. or login2. and end with arc3 or arc4
  • The ~ (tilde) indicates you are in your Home Directory
  • After the $ (dollar) is where commands are formulated and entered
  • Example commands often show the $, but you should enter only the part after the $

First commands

Command  Description
pwd      Print the current working directory
man      Load the manual
ls       List the contents of a directory
mkdir    Make a new directory
cd       Change directory

pwd

  • A command to print the working directory
[exuser@host ~]$ pwd
/home/homeXX/exuser
  • When you run this the output will be slightly different:
    • XX should be 01 or 02
    • exuser should be your username

pwd exercise

  • Type pwd at the prompt, then press the <return> or <enter> key:
[exuser@host ~]$ pwd
/home/homeXX/exuser

ls 1/3

  • A command to list the contents of a directory
  • There are many options for ls:
    • The -a option will list all the contents of a directory
      • Files and directories starting with . are hidden by default
    • The -l option will list each item line by line with added details
    • The -h option will report sizes in a more human readable format
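
A minimal sketch of these options, assuming a throwaway directory made with mktemp so that none of your own files are touched. The file names are invented for illustration, and touch and rm are introduced later in the course.

```shell
demo_dir=$(mktemp -d)                 # temporary directory chosen by the system
touch "$demo_dir/visible.txt" "$demo_dir/.hidden.txt"

plain=$(ls "$demo_dir")               # the hidden file is not listed
all=$(ls -a "$demo_dir")              # the hidden file, . and .. are listed
echo "$plain"
echo "$all"

# Long listing of everything, with human readable sizes
ls -alh "$demo_dir"

rm -r "$demo_dir"                     # tidy up the temporary directory
```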

ls 2/3

  • As for most commands, options can be listed in any order and can be combined; the following are equivalent:
$ ls -al
$ ls -la
$ ls -a -l
$ ls -l -a
  • Note that the prompt has been abbreviated to $ in these examples…

ls 3/3

[exuser@host ~]$ ls -al
total 8
drwx-------     2 user group   4096 MMM DD hh:mm .
drwxr-xr-x-     3 root root    4096 MMM DD hh:mm ..
...

ls exercise

  • List the Root directory:
$ ls /
  • List the non-hidden contents of your Home directory:
$ ls ~

  • List all the details of the contents of your Home directory, including the hidden files and directories:
$ ls -a -l ~

--help and man

  • --help is a command option that prints help about using the command
  • man is a command to bring up the manual
  • For help using man: press the h-key <h>.
  • To quit the help about man, or man itself: press <q>
  • Use <space_bar> to page through

--help and man exercise

  • Bring up the manual page about the manual:
$ man man
  • For help using man: press the h-key <h>.
  • To quit the help about man or any man page: press <q>.
  • To page through a man page: press <space_bar>.
  • Have a look at the help for ls:
$ ls --help
  • Bring up the manual page about ls:
$ man ls

Home directory hidden files and folders

  • .bash_profile
    • Finalises your user login shell
  • .bashrc
    • Finalises your user subshell

mkdir 1/2

  • A command to create a directory.
  • The following should create a directory named test so long as you are in a directory where you have permission to write:
$ mkdir test
  • Attempting to create a directory that already exists should print a warning:
$ mkdir test
mkdir: cannot create directory ‘test’: File exists

mkdir 2/2

  • The -p option allows for creating parent directories
  • With write permission, the following should create the directory test2 and in this the directory test3
$ mkdir -p test2/test3
  • To create a directory, you must have permission to write to the directory in which it is being created
  • Without write permissions a warning is printed, for example:
$ mkdir /test
mkdir: cannot create directory ‘/test’: Permission denied
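
A minimal sketch of the difference the -p option makes, again assuming a temporary directory. It uses $?, which holds the exit status of the last command (0 means success), and 2>/dev/null, which hides the warning message; neither is otherwise covered in this course.

```shell
demo_dir=$(mktemp -d)                   # temporary directory, so your own files are untouched

mkdir -p "$demo_dir/test2/test3"        # creates test2 and, inside it, test3
mkdir -p "$demo_dir/test2/test3"        # with -p, repeating the command is not an error
status_p=$?                             # exit status of the previous command

mkdir "$demo_dir/test2" 2>/dev/null     # without -p this fails: the directory exists
status_no_p=$?

echo "with -p: $status_p, without -p: $status_no_p"
rm -r "$demo_dir"                       # tidy up
```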

cd

  • A command to change directory
  • Change to the test directory (from the current directory):
$ cd test
  • Change to the user HOME directory:
$ cd ~
$ cd
  • Without execute permission on a directory you cannot change into it and a warning is printed:
$ cd /root
-bash: cd: /root: Permission denied

ls, mkdir and cd demo

ls, mkdir and cd exercise

  • Use ls, mkdir and cd in your home directory.
  • Make a directory for your files in /nobackup (change “exuser” to be your USERNAME):
$ mkdir /nobackup/exuser

Check point

  • You can only list directories you have permission to read, and only change into directories you have permission to execute.
  • Any questions?

Time savers 1/2

  • <up> and <down> arrow keys allow you to scroll previously entered commands
    • <return> will enter the command
  • Pressing the control key <Ctrl> and another key will often do things:
    • <Ctrl> + <c> will cancel
      • This can terminate a running command or cancel what is at the prompt
    • <Ctrl> + <a> will move the cursor to the start of what is at the prompt
    • <Ctrl> + <e> will move the cursor to the end of what is at the prompt
    • <Ctrl> + <r> will do a reverse search through the history of commands entered at the prompt which can then be edited or entered

Time savers 2/2

  • The prompt cursor does not have to be at the end of what is written at the prompt to submit what is there to the interpreter (run it)
  • The tab key <Tab> can be used to help autocomplete file paths
    • Press <Tab> repeatedly two or three times to reveal optional paths
      • This can make your computer output an audible “bleep”
  • Copy and paste using the mouse:
    • Highlight by left-click-hold and drag mouse
    • Copy with right click then select copy option
    • Paste with right click and select paste option
    • Or paste highlighted text using third mouse button

Time savers demo

Time savers exercise

  • Use <up> and <down> arrow keys to load previously entered commands into the prompt
  • Use <Ctrl> + <a> and <Ctrl> + <e> to move the cursor between the start and end of what is at the prompt
  • Use <Ctrl> + <c> to cancel
  • Use <Ctrl> + <r> to reverse search through the history of commands, and select one to edit or run
  • Use copy and paste to paste a command at the prompt

history command

  • history
    • Another way to repeat previous work is to use the history command to get a numbered list of commands that have been executed, and then to use ! followed by a number from that list (for example !42) to repeat that command.
  • .bash_history
    • A log of commands that you have entered is saved in this hidden file in your home directory.
    • The commands for a session are only written to the file when you exit. If your session crashes, the commands are not written.

More commands

Command  Description
mv       Move a file or directory (can be used to rename)
cp       Copy a file, or a directory using the -r option
touch    Create an empty file or update the timestamps of an existing file
rmdir    Remove (delete) an empty directory
rm       Remove (delete) a file, or a directory using the -r option

mv

  • A command to move a file or directory (can be used to rename)
  • The following would rename the file/directory test to test_renamed
$ mv test test_renamed

cp

  • A command to copy a file or directory
  • If test is a file the following would make a copy named test_copy:
$ cp test test_copy
  • If test is a directory a warning would be printed:
$ cp test test_copy
cp: omitting directory 'test'
  • The -r option can copy a directory:
$ cp -r test test_copy
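
A minimal sketch combining cp -r and mv, assuming a temporary directory and invented file names. It copies a directory with its contents and then renames the original.

```shell
demo_dir=$(mktemp -d)
mkdir "$demo_dir/test"
touch "$demo_dir/test/file.txt"

cp -r "$demo_dir/test" "$demo_dir/test_copy"     # copy the directory and its contents
mv "$demo_dir/test" "$demo_dir/test_renamed"     # rename the original directory

listing=$(ls "$demo_dir")                        # test_copy and test_renamed
copied=$(ls "$demo_dir/test_copy")               # file.txt was copied too
echo "$listing"
echo "$copied"
rm -r "$demo_dir"                                # tidy up
```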

touch, >, >>

  • touch is a command that will create an empty file, or update the access and modification times of an existing file to the current time
  • The symbol > can be used to direct the standard output to a file.
    • This will overwrite any existing file.
  • The following would direct the output of ls to a file named README
$ ls > README
  • >> can be used in place of > to append to rather than overwrite the file
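
A minimal sketch of the difference between > and >>, assuming a temporary file made with mktemp. The echo and cat commands are described later in the course.

```shell
out_file=$(mktemp)              # temporary file chosen by the system
echo "first"  > "$out_file"     # > creates or overwrites the file
echo "second" > "$out_file"     # overwrites: only "second" remains
echo "third" >> "$out_file"     # >> appends a new line to the file

contents=$(cat "$out_file")     # "second" then "third"
echo "$contents"
rm "$out_file"                  # tidy up
```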

rmdir

  • A command to remove (delete) a directory.
$ rmdir test
  • It will only remove the directory if it is empty.
  • If the directory is not empty, there will be a warning and the directory is not deleted.
$ rmdir test
rmdir: failed to remove 'test': Directory not empty
  • If the user does not have permission to delete the directory, a warning is printed.
$ rmdir /root
rmdir: failed to remove '/root': Permission denied

rm

  • A command to remove (delete) files and directories.
  • The following will remove a file called test:
$ rm test
  • To delete a directory and all contents the -r option can be used.
  • The -i option allows for interactivity so the user can choose what to delete.
  • The -f option forces removal: it never prompts and ignores files that do not exist.
  • Take care with rm, especially combined with -r and -f options.
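
A minimal sketch of how rm behaves with and without these options, run inside a temporary directory so that nothing of yours can be deleted. It uses $? to capture the exit status of each command (0 means success).

```shell
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/test/sub"
touch "$demo_dir/test/sub/file"

rm "$demo_dir/test" 2>/dev/null   # fails: test is a directory and -r was not given
rm_plain=$?
rm -r "$demo_dir/test"            # removes test and everything inside it
rm_recursive=$?
rm -f "$demo_dir/no_such_file"    # -f: no error for a file that does not exist
rm_force=$?

echo "$rm_plain $rm_recursive $rm_force"
rmdir "$demo_dir"                 # the temporary directory is now empty
```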

mv, cp, rmdir, rm, touch, >, >> exercise

  • Create some directories and files as follows:
$ mkdir -p test0/test01
$ mkdir test0/test02
$ touch test0/test01/test_file
  • Copy test0/test01 into test0/test03
  • Rename test0/test02 to test0/test04
  • Delete test0/test03
  • Direct the output of ls into ls.out
  • Append the output of ls -al into ls.out

Check point 2

  • Any questions?

Yet more commands

Command  Description
cat      Concatenate and print
less     Open into a paginated reader view
sort     Sort lines
cut      Extract fields from lines, split on a given character
head     Print the first lines of a file (default 10)
tail     Print the last lines of a file (default 10)
wc       Print newline, word, and byte counts

git clone exercise

  • Get a copy of some data using the following git command:
$ git clone https://github.com/ARCTraining/shell-training/
  • git is a distributed version control system
  • Change directory into the directory called shell-training.
  • List the directory contents.

cat, sort, cut and pipe |

  • cat concatenates (streams one thing after another) and prints to standard output
  • sort sorts lines (by default alphabetically)
  • cut extracts fields from each line, splitting the line on a given character
  • Commands that create an output stream can have the output piped into another command that can process the stream using the symbol |
    • For example two files test and test2 can be concatenated, sorted alphabetically and piped into head as follows:
    $ cat test test2 | sort | head
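
A minimal sketch of this pipeline with two small files whose contents are invented for illustration. It assumes a temporary directory and uses printf to write the sample lines.

```shell
demo_dir=$(mktemp -d)
printf 'pear\napple\n' > "$demo_dir/test"     # two lines in the first file
printf 'fig\nbanana\n' > "$demo_dir/test2"    # two lines in the second file

# Concatenate both files, sort the combined lines, print up to the first ten
result=$(cat "$demo_dir/test" "$demo_dir/test2" | sort | head)
echo "$result"
rm -r "$demo_dir"                             # tidy up
```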

cat, sort, cut, pipe | exercise

  • Combine cat, cut, and sort to print out the Latin names from the file IOM-animals/birds.txt in alphabetical order.
  • Save the output to a new file. Hint: cut -d ',' -f 2 will split lines around the comma and yield the second field.
$ cat birds.txt | cut -d ',' -f 2 | sort > sorted-birds.txt

Another option uses awk instead of cut:

$ cat birds.txt | awk 'BEGIN{FS=","}{print $2}' | sort > sorted-birds.txt
  • awk is a scripting language used for manipulating data and generating reports
    • It supports variables, numeric functions, string functions, and logical operators
    • It allows for setting field separators of multiple characters, whereas cut is restricted to a single character field separator.
  • The BEGIN{FS=","} part of the awk command sets the field separator to a comma before any input is read; to set it to a space followed by a comma use BEGIN{FS=" ,"}
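
A minimal sketch of setting the awk field separator, using invented sample lines rather than the course data. The separator is set in a BEGIN block (or with the -F option) so that it is already in effect when the first line is read.

```shell
data='one,alpha
two,beta'

# Set the separator in a BEGIN block, before any input is read
with_begin=$(echo "$data" | awk 'BEGIN{FS=","}{print $2}')

# The -F option is an equivalent shorthand
with_flag=$(echo "$data" | awk -F ',' '{print $2}')

# A separator of more than one character: a space followed by a comma
multi=$(echo 'one ,alpha' | awk 'BEGIN{FS=" ,"}{print $2}')

echo "$with_begin"
echo "$multi"
```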

head and tail

  • The head command prints the first lines of a file
  • The tail command prints the last lines of a file
  • The default number of lines is 10
    • This can be changed with the -n option

head and tail exercise

List all the lines in the files in the IOM-animals directory alphabetically and find the 50th item.

Use head to get the first 50 lines and pipe it to tail to get the last one:

$ sort *.txt | head -n 50 | tail -n 1

Another option is to use the sed command instead of head and tail:

$ sort *.txt | sed -n 50p

Yet even more commands

Command  Description
grep     Filter lines
sed      Stream editor

sed 1/2

  • A command-line tool for stream editing that reads an input stream and produces an output stream
  • Internally it has a pattern space and a hold buffer
  • Data is read from the input stream until the next newline character and placed into the pattern space
  • Operations can be performed on the data in the pattern space
  • Data can be exchanged between the pattern space and the hold buffer
  • Once all specified operations have been performed, the pattern space is output and a newline character is added at the end

sed 2/2

  • A simple sed program:
$ sed 's/foo/bar/'
  • This program replaces the first instance of foo with bar on every line; adding the g flag (s/foo/bar/g) replaces all instances
  • Take care using sed as, with the -i option, it can modify files in place
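
A minimal sketch of the substitute command on a single invented input line, showing the effect of the g flag. The input comes from echo through a pipe, so no file is modified.

```shell
# Without a flag, only the first match on each line is replaced
first_only=$(echo 'foo foo' | sed 's/foo/bar/')

# With the g flag, every match on each line is replaced
every=$(echo 'foo foo' | sed 's/foo/bar/g')

echo "$first_only"
echo "$every"
```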

grep

  • A command to print all lines containing or not containing a string of characters
  • The following will stream out all input lines containing foo:
$ grep foo
  • The following will stream out all input lines not containing foo:
$ grep -v foo
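
A minimal sketch of both forms of grep, using three invented input lines held in a variable and passed to grep through a pipe.

```shell
lines='foo one
bar two
foo three'

with_foo=$(echo "$lines" | grep foo)          # lines containing foo
without_foo=$(echo "$lines" | grep -v foo)    # lines not containing foo

echo "$with_foo"
echo "$without_foo"
```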

sed and grep demo

Data challenge exercise

  • shell-training/data/ contains 300 data files, each of which should contain 100 values. One or more of these files are missing some data though…
  • Use a series of commands connected by pipes to identify which files have missing data.
  • Hints:
    • wc -w will tell you the number of values in a file
    • sort -n will sort numerically

Using grep:

$ wc -w values* | sort -n | grep -v '100 values'

Using head:

$ wc -w values* | sort -n | head

Using awk:

$ wc -w values* | sort -n | awk '$1 != 100 {print $0}'
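
A minimal sketch of the same idea on a small stand-in data set, assuming three invented files that should each hold 5 values rather than the 100 used in the exercise. Because wc adds a final "total" line when given several files, the awk condition here also excludes that line.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir"
echo "1 2 3 4 5" > values_1.txt     # complete
echo "1 2 3 4"   > values_2.txt     # one value missing
echo "1 2 3 4 5" > values_3.txt     # complete

# Keep lines whose count is not 5, skip the "total" line, print the file name
short=$(wc -w values* | awk '$1 != 5 && $2 != "total" {print $2}')
echo "$short"

cd /
rm -r "$demo_dir"                   # tidy up
```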

Permissions 1/4

  • File permissions can be identified using the ls command with the -l option
  • By default, the owner has read, write and execute permissions on a directory they create, and read and write (but not execute) permission on a file they create
  • On ARC4, group and all have no permissions by default
  • On ARC3, group and all have read permissions on files and also execute permissions on directories by default
$ mkdir -p test0/test
$ touch test0/test/test-file
$ ls -la test0/test
drwx------- 2 <owner> <group> <size> <date> <time> .
drwx------- 3 <owner> <group> <size> <date> <time> ..
-rw-------- 1 <owner> <group> <size> <date> <time> test-file

Permissions 2/4

  • A schematic to help explain what ls -l shows:
-rw-------- 1 <owner> <group> <size> <date> <time> test-file
     |      |    |       |       |      |      |
     |      |    |       |       |      |      +----> time last modified
     |      |    |       |       |      +-----------> date last modified
     |      |    |       |       +------------------> size in bytes
     |      |    |       +--------------------------> group assignment
     |      |    +----------------------------------> owner username
     |      +---------------------------------------> number of hard links
     +----------------------------------------------> type and permissions
  • A schematic to help explain the type and permissions part:
-rw--------
| |  |  | |
| |  |  | +--> another type
| |  |  +----> all user permissions (3 characters)
| |  +-------> group user permissions (3 characters)
| +----------> owner permissions (3 characters)
+------------> type (1 character)

Permissions 3/4

Type and permissions examples

  • drwxr-x----
    • type: directory
    • permissions
      • owner: read, write, execute
      • group: read, execute
      • all: none
  • -rw--------
    • type: file
    • permissions
      • owner: read, write
      • group: none
      • all: none

Permissions 4/4

  • Permissions may be changed using the chmod command.
  • Specific permissions can be granted or removed or all permissions can be specified
  • The following would set the permissions of test.file to -rwx------:
$ chmod 700 test.file
  • Don’t worry about remembering details of how to change permissions.
  • Try to remember that there are:
    • 3 levels of permission:
      • user
      • group
      • all
    • 3 types of permission:
      • read
      • write
      • execute
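
A minimal sketch of chmod on a temporary file. In the numeric form each digit is the sum of 4 (read), 2 (write) and 1 (execute), with one digit each for user, group and all. The symbolic form u+x and the cut -c option (select by character position) are not otherwise covered in this course.

```shell
demo_file=$(mktemp)

chmod 700 "$demo_file"                           # user: rwx, group: ---, all: ---
perms_700=$(ls -l "$demo_file" | cut -c 1-10)

chmod 640 "$demo_file"                           # user: rw-, group: r--, all: ---
perms_640=$(ls -l "$demo_file" | cut -c 1-10)

chmod u+x "$demo_file"                           # add execute for the user only
perms_ux=$(ls -l "$demo_file" | cut -c 1-10)

echo "$perms_700 $perms_640 $perms_ux"
rm "$demo_file"                                  # tidy up
```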

Groups

  • System users belong to groups (at least one)
  • A user can change the group assignment for a file or directory to other groups they belong to using the command chgrp
  • The command groups prints the groups that a user is a member of
  • This can be useful if you want to share files between those in a group on the system

Check point 3

  • Any questions?

Variables

  • Variables are defined and accessed as follows:
$ var=1
$ echo $var
  • The variable var is set equal to the number 1.
  • echo is a command that will send what comes after it to the Standard Output (stdout).
  • The value of the variable is accessed using the $ symbol followed by the name of the variable.
  • The command printenv will print all the environment variables; a variable such as var only appears there once it has been exported with the export command.
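
A minimal sketch of the difference between a variable that exists only in the current shell and one that has been exported to the environment. The name demo_var is chosen for this example and is assumed not to be set already.

```shell
demo_var=1                            # a shell variable, visible to echo
echo "$demo_var"

printenv demo_var                     # prints nothing: not yet in the environment
before=$?                             # non-zero exit status: printenv did not find it

export demo_var                       # export places the variable in the environment
after_value=$(printenv demo_var)      # now printenv can print its value

echo "status before export: $before, value after export: $after_value"
```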

For loop 1/2

Examples

  • Iterator syntax to print the numbers 1 to 5:
$ for ((i=1;i<=5;i=i+1)) do echo $i; done
  • Sequence syntax to print the numbers 1 to 5:
$ for i in {1..5}; do echo $i; done
  • Sequence syntax to print even numbers from 2 to 10:
$ for i in {2..10..2}; do echo $i; done
  • Iterator syntax to print a geometric series from 2 to 256:
$ for ((i=2;i<=256;i=i*2)) do echo $i; done

For loop 2/2

  • Example for loop to loop through directory contents:
$ for f in *; do echo $f; done
  • Where there is a ; there can be a new line:
$ for f in *
do echo $f
done
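
A minimal sketch of looping over files in a temporary directory, with invented file names. Writing "$f" in double quotes is a common precaution, because it keeps a file name that contains spaces as a single item.

```shell
demo_dir=$(mktemp -d)
echo "A" > "$demo_dir/plain.txt"
echo "B" > "$demo_dir/with space.txt"    # a file name containing a space

# Print the contents of each file in turn
contents=$(for f in "$demo_dir"/*
do
    cat "$f"
done)

echo "$contents"
rm -r "$demo_dir"                        # tidy up
```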

For loops exercise 1

  • Use a for loop to create five directories called calculation_?, where ? is a number.
  • For loop with an iterator syntax
$ for ((i=1;i<6;i++))
do
mkdir calculation_$i
done
  • i++ is shorthand for i=i+1
  • For loop with a sequence syntax
$ for i in {1..5}
do
mkdir calculation_$i
done

For loops exercise 2

  • Use a for loop to create five directories, each one the parent of the next.
$ for i in {1..5}; do
mkdir calculation_$i
cd calculation_$i
done

Wildcards

  • In Linux:
    • * can represent any string of characters, including none.
    • ? can represent any single character.
    • [] can represent any single character detailed in the square brackets.
  • The following will list all files starting with a:
$ ls a*
  • The following will list all files starting with a and ending .txt that have one character between these:
$ ls a?.txt
  • The following will list all files starting with a and ending .txt that have a single character, either 1 or 2, between these:
$ ls a[12].txt
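
A minimal sketch of the three wildcards, assuming a temporary directory containing six empty files with invented names. The matches are counted with wc -l because the order in which names are listed can depend on system language settings.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir"
touch a1.txt a2.txt a3.txt ab.txt abc.txt b1.txt

star_count=$(ls a* | wc -l)               # all five names starting with a
question_count=$(ls a?.txt | wc -l)       # a1.txt a2.txt a3.txt ab.txt
brackets=$(echo a[12].txt)                # only a1.txt and a2.txt

echo "$star_count $question_count"
echo "$brackets"

cd /
rm -r "$demo_dir"                         # tidy up
```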

Wildcards exercise

  • Print out the first line of each file in the wildcards directory.
$ for f in *.txt
do
head -n 1 $f
done

Quiz

If the following is run from the wildcards directory, what will be printed?

$ for f in *.txt
do
echo $f
cat $f > new-file.txt
done

What will the contents of new-file.txt be and why?

  • The command will print the name of each .txt file:
    • The echo command prints a different file name each iteration
  • The content of new-file.txt will be the same as xyz.txt:
    • The contents of a different file are written to new-file.txt each iteration, overwriting whatever was written on the previous iteration
    • Using >> instead of > would append instead of overwrite.

Editing files

  • There are three programs available on the system that can be used for text editing:
    • nano
    • vim
    • emacs
  • If you are unfamiliar with any of these, we recommend you try nano.
  • Care is needed if transferring scripts and data between Windows and Linux due to differences in line endings.
    • dos2unix and unix2dos may be helpful to get this right.
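
A minimal sketch of what the line ending difference looks like. Windows text files end each line with a carriage return followed by a newline, whereas Linux uses a newline only. Since dos2unix may not be installed everywhere, this sketch assumes the standard tr command as a stand-in to delete the carriage returns; the file names are invented.

```shell
demo_dir=$(mktemp -d)

# A two-line file with Windows line endings (\r\n instead of \n)
printf 'line one\r\nline two\r\n' > "$demo_dir/windows.txt"

# Count the lines that contain a carriage return character
crlf_before=$(grep -c $'\r' "$demo_dir/windows.txt")

# Delete every carriage return, leaving Linux line endings
tr -d '\r' < "$demo_dir/windows.txt" > "$demo_dir/linux.txt"
crlf_after=$(grep -c $'\r' "$demo_dir/linux.txt")

echo "before: $crlf_before, after: $crlf_after"
rm -r "$demo_dir"                     # tidy up
```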

Shell Scripts

  • These are simple text files:
    • By convention the filename ends .sh
  • The file should start with a shebang:
    • A line which tells the Linux system which program to use to run the script:
    #!/bin/bash
  • After the shebang, simply enter all the commands for the script.
  • Lines starting with # will be regarded as comments
    • Use these to make your script easier to understand.
  • To run the script it must have execute permission (this can be granted to the owner with chmod u+x script.sh).
  • A script saved as script.sh can be run using:
$ ./script.sh
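
A minimal sketch of the whole cycle: writing a script, making it executable and running it. The script name hello.sh and its contents are invented for this example, and the script is written with printf here only so the sketch is self-contained; in practice you would create it with an editor such as nano.

```shell
demo_dir=$(mktemp -d)

# Write a three-line script: shebang, comment, one command
printf '#!/bin/bash\n# Print a greeting\necho "Hello from a script"\n' > "$demo_dir/hello.sh"

chmod u+x "$demo_dir/hello.sh"        # give the owner execute permission
output=$("$demo_dir/hello.sh")        # run the script and capture what it prints

echo "$output"
rm -r "$demo_dir"                     # tidy up
```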

Shell Scripts demo

Shell Script exercise

  • Create and run a shell script to print out the first line of each file in a directory. Test using the wildcards directory.
  • Hints:
    • Use the history command to see the commands you’ve entered
    • Pipe the history output into tail
    • Direct the output into a file and then use nano to modify this into a script
    • Modify the file permissions so you can execute the file

Thank you

If you have any questions or would like to learn more about Research Computing, please do not hesitate to get in touch with us.


We are always here to assist you!