Crash Course in Scientific Computing

Operating Systems

We now jump to the top layer, directly to the system that operates all these features and brings together programs (so-called software) with data (files, data in memory, data downloaded from the web). This is the so-called Operating System.

From a scientific perspective, the Operating System of choice is called Unix. This was the system used initially by computer scientists, scientists in general and even the military. It was mainly developed for mainframes, workstations, etc. Someone very clever understood there was a big market in bringing computers to the general public: Bill Gates introduced a user-friendly, layman's system to use so-called Personal Computers (PC). This is the so-called Windows, already a marketing coup since the concept of windows in the computer environment predates it and also exists on Unix (the so-called X Window System). Gates' company, Microsoft, which everybody knows, simplified everything and delivered a baby's version of operating systems, which got so popular (since the wide public is not technically minded) that Gates became the richest man on the planet for many years. Another very clever guy, Steve Jobs, sensed that there was a niche market in providing computers for even less technically minded people, who would favour an even simpler, shinier, fancier computer environment. Along with a tech guy (Steve Wozniak), he founded the Apple company. At the same time, on the Unix side, something wonderful happened when a student, Linus Torvalds, developed on his own a Unix-like operating system for PCs, as an exercise or toy project. This was at the time when internet forums were starting to become popular, and other people followed him in playing with this fun little project. The collaboration of dedicated, enthusiastic and curious people turned this fun project into one of today's most powerful, important and growing computer environments: Linux (a blend of Linus and Unix).

A fourth figure, less famous, needs to be mentioned at this point: Richard Stallman. He is a computer scientist who developed a series of software for the Unix environment, known as GNU, standing for "GNU's Not Unix", whose principle is that of free software, meaning that the source code is free (as in freedom, not as in cost-free) for the user to view, modify and adapt to their own needs. This contrasts with the Windows/Apple philosophy, which is that of copyright and intellectual property, where not only must you pay for each copy, but you also cannot look at what's inside and of course have no way to actually modify it. Stallman had actually planned to write a GNU version of the Unix operating system, which was called Hurd. But this proved too difficult and the GNU team focused on the software instead. Linus, on the other hand, managed to create the dynamics to do just that, independently: an open-source, free-software version of a Unix-like operating system. Together, these two things provide the GNU/Linux environment, which is the one we encourage you to use for the following reasons:

  • It is free (as in freedom, you can do whatever you want).
  • It is also free in the sense of cost-free.
  • It is more powerful for scientific/technological purposes than Windows & Apple products.
  • It is more ethical.

You can install Linux on your personal computer, alongside your other operating systems, in a so-called dual-boot configuration: at start-up, you decide which OS takes command of the computer.

Once Linux is installed, one needs a window manager/desktop environment, which abstracts concepts such as files and directories into folders, etc. Two popular ones for Linux are KDE and GNOME. It's just a matter of getting used to them.

Let us look at some simple/basic commands or functions that the Linux (or Unix) operating systems put at our disposal. We can go back to the more basic Operating System from the windowing environment by using a console, or terminal. In KDE, this is done with konsole. That is what one would normally access before launching a window manager, but nowadays everybody starts at the windowing level.

In the OS, one is logged in as a user. Linux answers your most existential question with the command whoami:

laussy@wulfrun2:~$ whoami
laussy

Note the (typical, could differ depending on the configuration) structure of the prompt, i.e., the point at which you enter your command: laussy is the user name, wulfrun2 is the name of the computer:

laussy@wulfrun2:~$ hostname
wulfrun2

so here you have laussy working @ wulfrun2, which is the computer with which I am writing these notes. The symbol ~ tells us where we are in the directory tree; it is a shortcut for "home". We can ask for this to be made explicit with the command:

laussy@wulfrun2:~$ pwd
/home/laussy

We can change directory with the cd command, e.g., cd .. goes one level up:

laussy@wulfrun2:~$ cd ..
laussy@wulfrun2:/home$ cd ..
laussy@wulfrun2:/$ 

You see that we went from home (~ or /home/laussy) to /home to /.
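
As a small sketch of the same navigation, here is a script that moves around a throwaway directory tree. The names scratch, a and b are made up for the illustration, and cd - (go back to the previous directory) is an extra not used in these notes:

```shell
#!/usr/bin/env bash
# Sketch: moving around with cd, inside a temporary directory
# so that nothing real is touched (scratch, a and b are hypothetical names).
scratch=$(mktemp -d)
mkdir -p "$scratch/a/b"       # -p also creates the intermediate directory a

cd "$scratch/a/b"             # absolute path: it starts from /
cd ..                         # relative path: one level up, we are now in a
up_one=$(basename "$PWD")     # basename keeps only the last part of the path
cd b                          # relative path: back down into b
cd - > /dev/null              # back to the previous directory (cd - also prints it)
back=$(basename "$PWD")

echo "after cd ..: $up_one"
echo "after cd - : $back"
```

Both lines should report the directory a: once reached by going up from b, once by jumping back to where we were before.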

What's in /? We can see the content of a directory with the command ls (for "list"):

laussy@wulfrun2:/$ ls
bin   cdrom  etc   lib    lib64   lost+found  mnt  proc  run   snap  sys  usr
boot  dev    home  lib32  libx32  media       opt  root  sbin  srv   tmp  var

For now, we do not need to go into much more detail about these directories. /media, for instance, is where the system will link external devices (like a USB key). Let's get back home and make a working directory:

laussy@wulfrun2:/$ cd
laussy@wulfrun2:~$ mkdir julia
laussy@wulfrun2:~$ rmdir julia
laussy@wulfrun2:~$ mkdir lecture-1
laussy@wulfrun2:~$ cd lecture-1
laussy@wulfrun2:~/lecture-1$ 

We created (mkdir) but also deleted (rmdir) a directory julia since we shall get to that later on. Instead, we created lecture-1 and went there. Of course, it is currently empty:

laussy@wulfrun2:~/lecture-1$ ls

We can create an empty file with touch and then list it:

laussy@wulfrun2:~/lecture-1$ touch file
laussy@wulfrun2:~/lecture-1$ ls
file

Not very exciting. In Unix (Linux), commands typically come with options. For instance, we can ask for a listing with more information with:

laussy@wulfrun2:~/lecture-1$ ls -l
total 0
-rw-rw-r-- 1 laussy laussy 0 Jan 13 14:06 file

and this is already more than you'd like to know. It tells us the permissions on the file: r is for read (the file can be read), w is for write (the file can be modified) and x is for executable, meaning the file can be run as a program, which here is not set so we have - instead. The first batch rw- is for the user (laussy), the second, also rw-, is for the group (which is also called laussy; groups matter when you work with other people, a relic of the Unix environment on mainframes) and the last batch is for people not in the group: they can only read the file, not change it. Note that each rwx status requires three bits, so nine bits for each file, conveniently written as three octal digits, which you can change with the command chmod. So if we want to make the file not readable to people not in our group (confidential file), we'd write:

laussy@wulfrun2:~/lecture-1$ chmod 660 file 
laussy@wulfrun2:~/lecture-1$ ls -l
total 0
-rw-rw---- 1 laussy laussy 0 Jan 13 14:06 file

If we'd like to make it executable for ourselves only (not for people in our group), while allowing everybody to read the file again, we'd ask:

laussy@wulfrun2:~/lecture-1$ chmod 764 file 
laussy@wulfrun2:~/lecture-1$ ls -l
total 0
-rwxrw-r-- 1 laussy laussy 0 Jan 13 14:06 file

It is now executable (by us) but does nothing, since it's empty.
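
To see where numbers such as 660 and 764 come from, here is a minimal sketch. Each of the three classes (user, group, others) gets one digit, obtained by adding 4 for r, 2 for w and 1 for x. The script works in a temporary directory, and the symbolic form chmod u+x is an alternative notation not used in these notes:

```shell
#!/usr/bin/env bash
# Sketch: how the three digits given to chmod are built.
# One digit per class (user, group, others): r=4, w=2, x=1, added together.
scratch=$(mktemp -d)               # temporary directory, hypothetical name
cd "$scratch"
touch file

chmod 764 file                     # 7=4+2+1 (rwx), 6=4+2 (rw-), 4 (r--)
perms_octal=$(ls -l file | cut -c1-10)   # keep only the permission column
echo "$perms_octal"                # -rwxrw-r--

chmod 660 file                     # 6=4+2 (rw-) twice, 0 (---) for others
chmod u+x file                     # symbolic form: add x for the user only
perms_symbolic=$(ls -l file | cut -c1-10)
echo "$perms_symbolic"             # -rwxrw----
```

The symbolic form changes one permission and leaves the others alone, whereas the numeric form sets all nine bits at once.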

Let's populate this file with something, for instance with the date:

laussy@wulfrun2:~/lecture-1$ date
Wed 13 Jan 14:26:23 CET 2021
laussy@wulfrun2:~/lecture-1$ date > file 
laussy@wulfrun2:~/lecture-1$ cat file
Wed 13 Jan 14:26:26 CET 2021

The first command asks for the date (it is echoed on the screen), the second sends the output to file, the last one displays its content. If we use >>, the content is added at the end rather than overwriting:

laussy@wulfrun2:~/lecture-1$ date >> file 
laussy@wulfrun2:~/lecture-1$ cat file
Wed 13 Jan 14:26:26 CET 2021
Wed 13 Jan 14:27:26 CET 2021
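
The difference between > and >> can be checked with fixed text instead of the date, so that the result is the same on every run. This is only a sketch in a temporary directory; the < used to count lines (feeding a file to a command as its input) is an addition that does not appear in these notes:

```shell
#!/usr/bin/env bash
# Sketch: > overwrites, >> appends.
scratch=$(mktemp -d)      # temporary directory, hypothetical name
cd "$scratch"

echo "first"  > file      # > creates the file (or empties it if it exists)
echo "second" > file      # overwrites: "first" is gone
echo "third" >> file      # >> adds at the end of the file

cat file                  # shows: second, then third
lines=$(wc -l < file)     # < gives the file to wc as its input
echo "number of lines: $((lines))"
```

The file ends up with two lines only, since the second > wiped out what the first one wrote.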

Remember that the file is executable. If we ask to run its content:

laussy@wulfrun2:~/lecture-1$ ./file 
./file: line 1: Wed: command not found

we get an error, since we populated it with data, not code! On the other hand:

laussy@wulfrun2:~/lecture-1$ echo date > file
laussy@wulfrun2:~/lecture-1$ ./file 
Wed 13 Jan 14:29:55 CET 2021

where echo echoes what comes next rather than executing it. Hey! We made our first program. Here's a slightly more interesting one:

laussy@wulfrun2:~/lecture-1$ echo "date; sleep 1; date" > file
laussy@wulfrun2:~/lecture-1$ ./file 
Wed 13 Jan 14:31:16 CET 2021
Wed 13 Jan 14:31:17 CET 2021

and you can guess what happened as well! We could give this a more meaningful name, by moving (mv) or copying (cp) the file:

laussy@wulfrun2:~/lecture-1$ cp file just-a-sec
laussy@wulfrun2:~/lecture-1$ ls
file  just-a-sec
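
Copying leaves the original in place, as the listing shows, while moving does not. A small sketch of the difference, in a temporary directory (the name wait-a-sec is made up for the illustration):

```shell
#!/usr/bin/env bash
# Sketch: cp duplicates a file, mv renames it.
scratch=$(mktemp -d)               # temporary directory, hypothetical name
cd "$scratch"
echo "date; sleep 1; date" > file

cp file just-a-sec                 # copy: file and just-a-sec both exist
mv file wait-a-sec                 # move: file becomes wait-a-sec
ls                                 # just-a-sec and wait-a-sec; file is gone
```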

You can see the options available for ls with the --help argument:

laussy@wulfrun2:~/lecture-1$ ls --help
Usage: ls [OPTION]... [FILE]...
List information about the FILEs (the current directory by default).
Sort entries alphabetically if none of -cftuvSUX nor --sort is specified.

Mandatory arguments to long options are mandatory for short options too.
  -a, --all                  do not ignore entries starting with .
  -A, --almost-all           do not list implied . and ..
      --author               with -l, print the author of each file
  -b, --escape               print C-style escapes for non-graphic characters
      --block-size=SIZE      with -l, scale sizes by SIZE when printing them;
                               e.g., '--block-size=M'; see SIZE format below
  -B, --ignore-backups       do not list implied entries ending with ~
  -c                         with -lt: sort by, and show, ctime (time of last
                               modification of file status information);
                               with -l: show ctime and sort by name;
                               otherwise: sort by ctime, newest first
  -C                         list entries by columns
      --color[=WHEN]         colorize the output; WHEN can be 'always' (default
                               if omitted), 'auto', or 'never'; more info below
  -d, --directory            list directories themselves, not their contents
  -D, --dired                generate output designed for Emacs' dired mode
  -f                         do not sort, enable -aU, disable -ls --color
  -F, --classify             append indicator (one of */=>@|) to entries
      --file-type            likewise, except do not append '*'
      --format=WORD          across -x, commas -m, horizontal -x, long -l,
                               single-column -1, verbose -l, vertical -C
      --full-time            like -l --time-style=full-iso
  -g                         like -l, but do not list owner
      --group-directories-first
                             group directories before files;
                               can be augmented with a --sort option, but any
                               use of --sort=none (-U) disables grouping
  -G, --no-group             in a long listing, don't print group names
  -h, --human-readable       with -l and -s, print sizes like 1K 234M 2G etc.
      --si                   likewise, but use powers of 1000 not 1024
  -H, --dereference-command-line
                             follow symbolic links listed on the command line
      --dereference-command-line-symlink-to-dir
                             follow each command line symbolic link
                               that points to a directory
      --hide=PATTERN         do not list implied entries matching shell PATTERN
                               (overridden by -a or -A)
      --hyperlink[=WHEN]     hyperlink file names; WHEN can be 'always'
                               (default if omitted), 'auto', or 'never'
      --indicator-style=WORD  append indicator with style WORD to entry names:
                               none (default), slash (-p),
                               file-type (--file-type), classify (-F)
  -i, --inode                print the index number of each file
  -I, --ignore=PATTERN       do not list implied entries matching shell PATTERN
  -k, --kibibytes            default to 1024-byte blocks for disk usage;
                               used only with -s and per directory totals
  -l                         use a long listing format
  -L, --dereference          when showing file information for a symbolic
                               link, show information for the file the link
                               references rather than for the link itself
  -m                         fill width with a comma separated list of entries
  -n, --numeric-uid-gid      like -l, but list numeric user and group IDs
  -N, --literal              print entry names without quoting
  -o                         like -l, but do not list group information
  -p, --indicator-style=slash
                             append / indicator to directories
  -q, --hide-control-chars   print ? instead of nongraphic characters
      --show-control-chars   show nongraphic characters as-is (the default,
                               unless program is 'ls' and output is a terminal)
  -Q, --quote-name           enclose entry names in double quotes
      --quoting-style=WORD   use quoting style WORD for entry names:
                               literal, locale, shell, shell-always,
                               shell-escape, shell-escape-always, c, escape
                               (overrides QUOTING_STYLE environment variable)
  -r, --reverse              reverse order while sorting
  -R, --recursive            list subdirectories recursively
  -s, --size                 print the allocated size of each file, in blocks
  -S                         sort by file size, largest first
      --sort=WORD            sort by WORD instead of name: none (-U), size (-S),
                               time (-t), version (-v), extension (-X)
      --time=WORD            with -l, show time as WORD instead of default
                               modification time: atime or access or use (-u);
                               ctime or status (-c); also use specified time
                               as sort key if --sort=time (newest first)
      --time-style=TIME_STYLE  time/date format with -l; see TIME_STYLE below
  -t                         sort by modification time, newest first
  -T, --tabsize=COLS         assume tab stops at each COLS instead of 8
  -u                         with -lt: sort by, and show, access time;
                               with -l: show access time and sort by name;
                               otherwise: sort by access time, newest first
  -U                         do not sort; list entries in directory order
  -v                         natural sort of (version) numbers within text
  -w, --width=COLS           set output width to COLS.  0 means no limit
  -x                         list entries by lines instead of by columns
  -X                         sort alphabetically by entry extension
  -Z, --context              print any security context of each file
  -1                         list one file per line.  Avoid '\n' with -q or -b
      --help     display this help and exit
      --version  output version information and exit

The SIZE argument is an integer and optional unit (example: 10K is 10*1024).
Units are K,M,G,T,P,E,Z,Y (powers of 1024) or KB,MB,... (powers of 1000).

The TIME_STYLE argument can be full-iso, long-iso, iso, locale, or +FORMAT.
FORMAT is interpreted like in date(1).  If FORMAT is FORMAT1<newline>FORMAT2,
then FORMAT1 applies to non-recent files and FORMAT2 to recent files.
TIME_STYLE prefixed with 'posix-' takes effect only outside the POSIX locale.
Also the TIME_STYLE environment variable sets the default style to use.

Using colour to distinguish file types is disabled both by default and
with --color=never.  With --color=auto, ls emits colour codes only when
standard output is connected to a terminal.  The LS_COLORS environment
variable can change the settings.  Use the dircolors command to set it.

Exit status:
 0  if OK,
 1  if minor problems (e.g., cannot access subdirectory),
 2  if serious trouble (e.g., cannot access command-line argument).

GNU coreutils online help: <https://www.gnu.org/software/coreutils/>
Full documentation at: <https://www.gnu.org/software/coreutils/ls>
or available locally via: info '(coreutils) ls invocation'

As you can see, it's quite mighty. We usually don't need to enter into such details. Those are the very basics, and all these things you can do with the mouse by clicking on files, but it's important to know how they arise at a more fundamental level because when we program, or instruct the computer, we'll need to refer to these basic commands. There are of course very many commands, even to do fairly similar things. For instance, cat outputs the entire file, which, if it's big, will flood the screen. If you want only the beginning, you can use head; actually, the content of the bitmap files above was obtained with, e.g., head -20 fabrice.laussy.pgm (to get the first 20 lines). If you want the last lines, you'd ask for tail instead. To see the content of a file page by page, we would use more, but this has been replaced with less, which allows you to go back (as you can see, Unix computer scientists have a sense of humour).
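
Here is a minimal sketch of head and tail on a file whose content we know in advance. The file numbers is made up for the illustration and is created with seq, a command not introduced in these notes; head -n 3 is the same request as the shorter head -3 form used above:

```shell
#!/usr/bin/env bash
# Sketch: looking at the beginning and the end of a file.
scratch=$(mktemp -d)          # temporary directory, hypothetical name
cd "$scratch"
seq 1 100 > numbers           # a 100-line file, one number per line

first=$(head -n 3 numbers)    # the first three lines
last=$(tail -n 2 numbers)     # the last two lines

echo "$first"                 # 1, 2, 3 on separate lines
echo "$last"                  # 99, 100 on separate lines
```

For the page-by-page view, one would type less numbers in the terminal and leave with q; this is interactive, so it is not part of the script.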

Anyway, let's have a look at more powerful things, involving text processing, images, music, movies, etc. Actually, although you may not know the basics so well, you certainly know the advanced applications, such as Word for text editing (PowerPoint for presentations), etc. In Linux, there are many text editors. A very powerful one, but maybe not the most user-friendly, is Emacs, which you can invoke as follows:

laussy@wulfrun2:~/lecture-1$ emacs just-a-sec 

at which point the console freezes (if you do not want it to freeze but to remain independent, use an ampersand: emacs just-a-sec &) and a window opens:

[Screenshot: the Emacs window editing just-a-sec]

There one can edit the "program", for instance making it wait 2 seconds. Everybody has a text editor of their choice; you can use whichever you want. Some are even more cryptic than Emacs (such as vi; launch it and get frustrated). To get something that looks like Windows software, you can invoke LibreOffice, but this is certainly not the best for computer programming.
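
Coming back to the ampersand used to keep the console free: the sketch below uses sleep as a stand-in for a long-running program such as an editor. The special variable $! and the commands kill -0 and wait are additions that do not appear in these notes:

```shell
#!/usr/bin/env bash
# Sketch: running a command in the background with &.
# sleep stands in for a long-running program such as an editor.
sleep 1 &                 # & hands the prompt back immediately
pid=$!                    # $! is the process ID of the last background command

# kill -0 sends no signal; it only checks that the process exists.
if kill -0 "$pid" 2>/dev/null; then
    running="yes"
else
    running="no"
fi
echo "still running right after launch: $running"

wait "$pid"               # block until the background command ends
status=$?                 # 0 means it finished without error
echo "exit status: $status"
```

The script carries on while sleep is still counting, which is exactly what happens to the console when Emacs is launched with &.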

Emacs works with keyboard commands rather than mouse clicks. To open a new file, try C-x C-f (meaning first press Control-x, then Control-f). To give you an idea of the GNU spirit, run C-h C-t; it will bring you to the Emacs TODO list, which gives a list of things you can do to enhance the editor. At the time of writing (which you know from just-a-sec above), it reads:

* Speed up Elisp execution
** Speed up function calls
Change src/bytecode.c so that calls from byte-code functions to byte-code
functions don't go through Ffuncall/funcall_lambda/exec_byte_code but instead
stay within exec_byte_code.

** Improve the byte-compiler to recognize immutable (lexical) bindings
and get rid of them if they're used only once and/or they're bound to
a constant expression.

Etc...

Emacs has been benefiting from hundreds of contributors for decades. It is possibly the most powerful text editor in existence. Some say it's not even a text editor anymore, but a religion. If you are interested in learning to use it, try C-h t, which will bring you to a self-contained tutorial.

What are other applications of interest? Here are some which you are invited to explore:

In the next lecture, we will look at one specific piece of software, $\TeX$, which is a typesetting program.