The Hitchhiker's Guide to CLIs in Python — Part 1: Anatomy of a Terminal and CLI

May 04, 2020

tty (short for teletype) is a Unix command that prints the name of the terminal connected to standard input. It is also the prefix in names for virtual terminals on Unix-based operating systems.
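The same question can be asked from Python. Here is a minimal sketch using os.isatty and os.ttyname from the standard library; the helper name terminal_name is made up for this example.

```python
import os


def terminal_name(fd):
    """Return the terminal device path for fd, or None if fd is not a terminal."""
    if not os.isatty(fd):
        return None
    return os.ttyname(fd)


if __name__ == "__main__":
    # File descriptor 0 is standard input. In a terminal this prints a device
    # path; when input comes from a file or a pipe it prints "not a tty".
    print(terminal_name(0) or "not a tty")
```

CLIs often use a check like this to decide whether to prompt interactively or to read input silently from a pipe.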

The fundamental type of application that runs on a terminal is the shell. The shell prompts the user for commands and sends each one for execution when they press Enter, similar to the old teletype workflow.


    keyboard
            \
             \ input
              \
            (terminal)- - - - - - - -(process)
              /
             / output
            /
    display

Intuitively, the whole shebang looks like the diagram above. The keyboard passes input to the terminal, which passes it to the process. The process does some work and hands its output back to the terminal, which prints it on the display.


    keyboard
            \
             \ input
              \
            (terminal)- -(termios)- -(process)
              /
             / output
            /
    display

But an illusion sits between the terminal and the process.

termios and stty

termios is the interface to the terminal settings that affect how text is entered and printed on a terminal.


  $ man termios

The man page for termios shows all the available settings.


  $ stty -a
  speed 38400 baud; rows 34; columns 166; line = 0;
  intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D;
  -ignbrk brkint -ignpar -parmrk -inpck -istrip -inlcr -igncr
  opost -olcuc -ocrnl onlcr -onocr -onlret -ofill -ofdel nl0 cr0
  isig icanon iexten echo echoe echok -echonl -noflsh -xcase -tostop

The stty utility can be used to turn these settings on or off. Running stty -a shows all the current settings and their values, for example the speed of serial communication and the number of rows and columns.
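A few of the values that stty -a prints can also be read from Python. This is a rough sketch, assuming a Unix-like system; the function name describe_terminal and the choice of fields are ours, and a freshly opened pseudo-terminal stands in for a real terminal so the code also runs where standard input is not a terminal.

```python
import os
import pty
import termios


def describe_terminal(fd):
    """Return a handful of the settings that `stty -a` reports for fd."""
    iflag, oflag, cflag, lflag, ispeed, ospeed, cc = termios.tcgetattr(fd)
    size = os.get_terminal_size(fd)
    return {
        "rows": size.lines,
        "columns": size.columns,
        "icanon": bool(lflag & termios.ICANON),
        "echo": bool(lflag & termios.ECHO),
        "onlcr": bool(oflag & termios.ONLCR),
        "intr": cc[termios.VINTR],   # the byte behind "intr = ^C"
        "eof": cc[termios.VEOF],     # the byte behind "eof = ^D"
    }


if __name__ == "__main__":
    # Assumption: a pseudo-terminal is used instead of the real terminal.
    master_fd, slave_fd = pty.openpty()
    for name, value in describe_terminal(slave_fd).items():
        print(f"{name} = {value!r}")
```

In a real terminal session, passing 0 (standard input) instead of slave_fd would describe the terminal we are typing into.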

Let's see what some of these settings do. We'll go through the same examples Brandon Rhodes shared in his 2017 North Bay Python keynote.

icanon


  $ man termios
  ...
      ICANON Enable canonical mode (described below).
  ...

The first one is icanon, which is on by default. It enables canonical mode, a small built-in line editor that buffers input until Enter is pressed and allows basic editing of the pending line, such as removing characters with backspace. Interactive applications like text editors turn off icanon and handle all the text editing themselves.


  $ stty -icanon

We can turn off icanon using stty -icanon, with - being the off switch. This is how stty works in general: the utility name followed by the setting name, prefixed with - to turn the setting off and left bare to turn it on.

Let's open cat and type in some text. Since the canonical text editor is on by default, we can see that the input is buffered till Enter is pressed. We can also use backspace to remove characters from our input. Let's see how the terminal behaves when icanon is off.

We can see that the text is not being buffered for editing. cat receives each character as soon as we type it and prints it right away, rather than one line at a time. To turn icanon back on, we just remove the - and run stty icanon.
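The same switch can be flipped from Python with the termios module. The sketch below is one way to do it, not the only one; set_canonical is a name made up for this example, and a pseudo-terminal plays the role of the terminal so that a single "keystroke" can be simulated.

```python
import os
import pty
import termios


def set_canonical(fd, enabled):
    """Turn ICANON on or off for fd, like `stty icanon` and `stty -icanon`."""
    attrs = termios.tcgetattr(fd)
    if enabled:
        attrs[3] |= termios.ICANON    # index 3 holds the local flags (lflag)
    else:
        attrs[3] &= ~termios.ICANON
    termios.tcsetattr(fd, termios.TCSANOW, attrs)


if __name__ == "__main__":
    # Assumption: a pseudo-terminal stands in for the real terminal.
    master_fd, slave_fd = pty.openpty()
    set_canonical(slave_fd, False)
    os.write(master_fd, b"h")       # one "keystroke", with no Enter
    print(os.read(slave_fd, 1))     # the process side can read it right away
```

With icanon left on, the read on the process side would wait until a whole line had been entered.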

onlcr


  $ man termios
  ...
      ONLCR  (XSI) Map NL to CR-NL on output.
  ...

The second one is onlcr, which is on by default. Here nl stands for new line and cr for carriage return. onlcr finds new lines in the output and adds a carriage return to each one. A carriage return makes sure that the cursor moves back to the first column after a new line. This mirrors the teletype days, when the paper carriage would return to the first column after a new line.

A carriage return, without the new line character, is used to make progress bars on modern terminals. The program updates the progress, moves the cursor back to the first column, and then overwrites the earlier progress with the new one.
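Here is a small sketch of such a progress bar in Python. The names render_bar and show_progress, and the look of the bar, are invented for this example.

```python
import sys
import time


def render_bar(done, total, width=20):
    """Return a text bar such as [#####.....] 5/10 (total must be positive)."""
    filled = width * done // total
    return "[" + "#" * filled + "." * (width - filled) + f"] {done}/{total}"


def show_progress(total, stream=None, delay=0.0):
    """Draw the bar total + 1 times on one line, then finish with a new line."""
    stream = sys.stdout if stream is None else stream
    for done in range(total + 1):
        # "\r" moves the cursor back to the first column without starting
        # a new line, so each bar overwrites the previous one.
        stream.write("\r" + render_bar(done, total))
        stream.flush()
        time.sleep(delay)
    stream.write("\n")


if __name__ == "__main__":
    show_progress(10, delay=0.05)
```

The flush after every write matters: without it, output buffering may hold the updates back and the bar would appear all at once at the end.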


  $ stty -onlcr

We can turn it off using stty -onlcr. Let's see the output for ps, a utility that reports the status of the current processes. The output looks very structured. Let's turn off onlcr and see the output again.

It's all over the place! The illusion is gone; this is the real thing. Each line is printed on a new line, but the cursor does not return to the first column. A lot of applications are written with the assumption that the terminal will automatically move the cursor back to the first column when a new line is printed.
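We can watch onlcr at work from Python by writing through a pseudo-terminal and looking at the bytes that come out on the other side. This is a sketch, assuming a Unix-like system; write_through_terminal is a made-up helper, and the short select timeout is just a simple way to know when the output has been fully read.

```python
import os
import pty
import select
import termios


def write_through_terminal(data, onlcr):
    """Write data to a fresh pseudo-terminal; return the bytes it passes on."""
    master_fd, slave_fd = pty.openpty()
    try:
        attrs = termios.tcgetattr(slave_fd)
        attrs[1] |= termios.OPOST         # index 1 holds the output flags (oflag)
        if onlcr:
            attrs[1] |= termios.ONLCR
        else:
            attrs[1] &= ~termios.ONLCR
        termios.tcsetattr(slave_fd, termios.TCSANOW, attrs)
        os.write(slave_fd, data)          # what the process prints
        chunks = []
        # Keep reading what the terminal would draw until nothing is left.
        while select.select([master_fd], [], [], 0.1)[0]:
            chunks.append(os.read(master_fd, 1024))
        return b"".join(chunks)
    finally:
        os.close(slave_fd)
        os.close(master_fd)


if __name__ == "__main__":
    print(write_through_terminal(b"one\ntwo\n", onlcr=True))
    print(write_through_terminal(b"one\ntwo\n", onlcr=False))
```

With onlcr on, every new line arrives with a carriage return in front of it; with onlcr off, the bytes arrive exactly as the process wrote them.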

echo


  $ man termios
  ...
      ECHO   Echo input characters.
  ...

The third one is echo, which is also on by default. It directs the terminal to print every character that we input, back on the display.


  $ stty -echo

We can turn it off using stty -echo. We'll look at cat again. With echo on, we can see the input text as it is typed. What happens if we turn echo off?

This time neither the cat command nor the input text showed up as we typed them. The text only appeared when cat, which was actually running, printed it back for us. Programs turn off echo when they ask users for sensitive input, like passwords.
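In Python, the standard library's getpass.getpass already does this for password prompts. To see the mechanism itself, here is a sketch of a context manager that turns echo off and restores the previous settings afterwards; echo_disabled is a name invented for this example, and a pseudo-terminal stands in for the real terminal.

```python
import contextlib
import pty
import termios


@contextlib.contextmanager
def echo_disabled(fd):
    """Turn ECHO off for fd, then restore the saved settings on exit."""
    saved = termios.tcgetattr(fd)
    quiet = termios.tcgetattr(fd)
    quiet[3] &= ~termios.ECHO          # index 3 holds the local flags (lflag)
    termios.tcsetattr(fd, termios.TCSANOW, quiet)
    try:
        yield
    finally:
        termios.tcsetattr(fd, termios.TCSANOW, saved)


if __name__ == "__main__":
    # Assumption: a pseudo-terminal is used instead of the real terminal.
    master_fd, slave_fd = pty.openpty()
    with echo_disabled(slave_fd):
        inside = termios.tcgetattr(slave_fd)[3] & termios.ECHO
        print("echo on inside the block:", bool(inside))
    after = termios.tcgetattr(slave_fd)[3] & termios.ECHO
    print("echo on afterwards:", bool(after))
```

Restoring in a finally clause means the terminal gets its echo back even if the code inside the block raises an exception.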


  $ reset

If you're experimenting with termios settings, you can use the reset command to return all of them to their default values.


  import termios

The Python standard library also contains the termios module that can be used to turn these settings on or off from Python code.

Signals

Another way to change a terminal's state is through in-band and out-of-band signals. In-band signaling means we put special characters into the text itself, which the terminal interprets as commands. The terminal does not print these special characters; it carries out the intended effect instead.

Control characters

One way to do in-band signaling is using control characters. For example:

^H backspace
^J add a new line
^C interrupt the running process
^D end text input or exit the shell
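The ^ notation is not arbitrary: each control character is the ASCII code of the letter shown with one bit flipped. A short sketch, with ctrl being a helper name made up for this example:

```python
def ctrl(letter):
    """Return the control character written as ^letter in caret notation."""
    # Caret notation flips bit 0x40 of the ASCII code of the character shown:
    # ^C is 0x43 ^ 0x40 = 0x03, and ^? is 0x3F ^ 0x40 = 0x7F.
    return chr(ord(letter.upper()) ^ 0x40)


if __name__ == "__main__":
    meanings = [("H", "backspace"), ("J", "add a new line"),
                ("C", "interrupt the running process"),
                ("D", "end text input or exit the shell")]
    for letter, meaning in meanings:
        print(f"^{letter} = {ord(ctrl(letter)):#04x}  {meaning}")
```

This is also why the stty output earlier shows erase = ^?: it is the byte 0x7F, which many terminals send for the backspace key.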

Escape sequences

Another way is to use escape sequences, which control things like cursor location and text color. For example:

\u001b[2J: clear screen
\u001b[1m: make text bold
\u001b[31m: make text red
\u001b[{n}A: move cursor up by n lines

Printing each escape sequence will have the corresponding effect.
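A tiny sketch of this in Python follows. The constant and function names are ours. The RESET sequence is not in the list above; it is the standard sequence that switches all text attributes off, added here so the styling does not leak into later output.

```python
ESC = "\u001b"
CLEAR_SCREEN = ESC + "[2J"
BOLD = ESC + "[1m"
RED = ESC + "[31m"
RESET = ESC + "[0m"   # assumption: not listed above; turns all attributes off


def cursor_up(n):
    """Return the sequence that moves the cursor up by n lines."""
    return f"{ESC}[{n}A"


def style(text, *codes):
    """Wrap text in the given escape sequences and reset afterwards."""
    return "".join(codes) + text + RESET


if __name__ == "__main__":
    print(style("error:", BOLD, RED), "something went wrong")
```

On a terminal that understands these sequences, the word error: shows up bold and red while the rest of the line stays plain.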

Streams

Terminals are also preconfigured with input and output streams, where the input stream is mapped to the keyboard and the output stream to the display.

This ability to automatically map input and output to the keyboard and display, by default, was a Unix breakthrough! In operating systems before Unix, programs had to explicitly connect to appropriate input-output devices, which was a tedious thing to do, because of a lack of standards across systems.

stdin

stdin is the input stream, where the program reads its input data.

stdout and stderr

stdout and stderr are output streams, where the program writes its output data and error messages.
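Python exposes the three streams as sys.stdin, sys.stdout and sys.stderr. The sketch below keeps data and diagnostics apart; shout is a made-up function, and its stream parameters exist only so that it can be tried without a keyboard.

```python
import io
import sys


def shout(stdin=None, stdout=None, stderr=None):
    """Copy lines from the input stream to the output stream in upper case."""
    stdin = sys.stdin if stdin is None else stdin
    stdout = sys.stdout if stdout is None else stdout
    stderr = sys.stderr if stderr is None else stderr
    count = 0
    for line in stdin:
        stdout.write(line.upper())     # the program's actual output
        count += 1
    # Status messages go to stderr so they never mix with the data.
    print(f"{count} line(s) processed", file=stderr)
    return count


if __name__ == "__main__":
    # Assumption: an in-memory stream replaces the keyboard for this demo.
    shout(stdin=io.StringIO("hello\nworld\n"))
```

Calling shout() with no arguments would read from the real standard input instead, which is the keyboard unless something else has been connected to it.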

Redirection

Program output can also be redirected using the redirection operators.

> and >>


  $ echo "hello" > file
  $ echo "world" >> file

The greater than (>) and the double greater than (>>) are operators that redirect a program's output to a file. The only difference between the two is that > will overwrite the file, while >> will append to the file.
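Roughly speaking, the shell opens the file and hands it to the program as its standard output. Here is a sketch of the same idea from Python using subprocess; run_redirected is a name invented for this example, and it assumes an echo program is available on the system.

```python
import os
import subprocess
import tempfile


def run_redirected(argv, path, append=False):
    """Run argv with stdout sent to path, like `argv > path` or `argv >> path`."""
    # Mode "wb" empties the file first (>), mode "ab" adds to the end (>>).
    with open(path, "ab" if append else "wb") as target:
        subprocess.run(argv, stdout=target, check=True)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "file")
    run_redirected(["echo", "hello"], path)                # echo "hello" > file
    run_redirected(["echo", "world"], path, append=True)   # echo "world" >> file
    with open(path) as result:
        print(result.read(), end="")
```

The program being run does not know about the file at all; it just writes to its standard output as usual.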

|


  $ echo "hello" | cat
  hello

Another redirection operator is the pipe (|), which makes the output of one program the input of another.
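A pipe can be wired up from Python as well. This sketch connects two programs with subprocess.Popen; pipe is a helper name made up for this example, and it assumes echo and cat are available on the system.

```python
import subprocess


def pipe(first, second):
    """Run `first | second` and return what the second program printed."""
    producer = subprocess.Popen(first, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(second, stdin=producer.stdout,
                                stdout=subprocess.PIPE)
    # Close our copy of the pipe so the producer is notified if the
    # consumer exits early.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    return output.decode()


if __name__ == "__main__":
    print(pipe(["echo", "hello"], ["cat"]), end="")   # echo "hello" | cat
```

Neither program needs to know about the other: one writes to its stdout, the other reads from its stdin, and the pipe sits in between.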

Now that we have an understanding of how terminals evolved and how they work, let's look at programs that run inside a terminal.

Command-line interfaces

The words interfaces, applications, programs and tools are used interchangeably here, since they refer to the same thing most of the time. CLIs are kinda fun to use and make it easy to automate repetitive tasks via shell scripting!


  Prompt command option argument <Enter>
  Output

The general usage pattern of a CLI looks like the one above. The shell displays the prompt as a sign that it's ready to take in commands. The user types in the command they want to run, along with options and arguments, and ends the input by pressing the Enter key. This completes the command line of text. The command is then executed and its output is printed on the terminal. But what are these arguments and options?

Arguments

Arguments are required items of information for a program. Required, in the sense that the program won't work without them. They are often positional, which means an argument's position in the line helps the program identify the argument's type.


  $ cp src dst

For example, here's the cp command. It can't function without both the required arguments. The argument in the first position will always be identified as source, and the argument in the second position will always be the destination where files from source need to be copied.
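To make "positional" concrete, here is a bare-bones sketch of a program that tells its arguments apart purely by position. The program name copy and the function parse_positional are hypothetical; this is not how cp itself is implemented.

```python
def parse_positional(argv):
    """Split an argument list into (src, dst) purely by position."""
    if len(argv) != 2:
        # Both arguments are required, so anything else is a usage error.
        raise SystemExit("usage: copy SRC DST")
    src, dst = argv   # first position is the source, second the destination
    return src, dst


if __name__ == "__main__":
    # In a real script the list would come from sys.argv[1:].
    src, dst = parse_positional(["src", "dst"])
    print(f"copying from {src} to {dst}")
```

Swapping the two values on the command line swaps their meaning, which is exactly what positional implies.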

Options

Options, or flags, modify the operation of a command. As the name suggests, they are optional and have default values. The general convention is to put one or two hyphens in front of a character or word to identify the option.


  $ cp -r src dst

For example, in the cp command, -r changes the operation by recursively looking for files in the source, and then copying them to the destination.
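Python's argparse module from the standard library handles both kinds of input. Below is a sketch of a parser for a hypothetical copy program with two positional arguments and one optional flag; the names and help strings are invented for illustration.

```python
import argparse


def build_parser():
    """Build a parser for a hypothetical `copy [-r] src dst` program."""
    parser = argparse.ArgumentParser(prog="copy", description="Copy SRC to DST.")
    parser.add_argument("src", help="file or directory to copy from")
    parser.add_argument("dst", help="where to copy to")
    # store_true makes the flag optional with a default value of False.
    parser.add_argument("-r", "--recursive", action="store_true",
                        help="copy directories and their contents")
    return parser


if __name__ == "__main__":
    parser = build_parser()
    print(parser.parse_args(["src", "dst"]))
    print(parser.parse_args(["-r", "src", "dst"]))
```

Leaving out -r does not cause an error; the option simply keeps its default value.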

Documentation

One of the criticisms of a CLI is the lack of cues to the user about the available actions, in contrast to a GUI, which usually informs the user of available actions with menus, icons, or other visual cues.

Help text


  $ cp --help

To overcome this limitation, many CLI programs display some documentation around the arguments and options that they support. This documentation can be viewed by invoking the CLI with the --help option.
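Parsers built with Python's argparse get this for free: a -h/--help option is added automatically and its text is assembled from the help strings of each argument. A small sketch, with greet being a hypothetical program:

```python
import argparse

parser = argparse.ArgumentParser(prog="greet", description="Print a greeting.")
parser.add_argument("name", help="who to greet")
parser.add_argument("-s", "--shout", action="store_true",
                    help="print the greeting in upper case")

if __name__ == "__main__":
    # format_help() returns the text that `greet --help` would print
    # before exiting.
    print(parser.format_help())
```

Keeping the help strings next to the argument definitions makes it harder for the documentation and the behavior to drift apart.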

Manual pages


  $ man termios

Some of them also have man pages (short for manual pages). By default, the man command uses a terminal pager program such as more or less to display the often lengthy documentation for a CLI. This makes it easy for the user to scroll and search through it.

Standards

You must be thinking that there are a lot of moving parts here. Each programmer could write their CLI differently. For example, they could use -x instead of -h to display help text. Are there any standards to make sure every CLI follows some basic conventions?

POSIX

There is a standard, and it's called POSIX. POSIX makes the APIs provided by Unix-based operating systems uniform, including command-line interfaces. To follow the POSIX standard is to be POSIX-compliant.

XDG base directory specification

There's also the XDG base directory spec, which dictates where a CLI should store the different types of files it needs to function. This makes sure that a CLI doesn't save files all over the place.

$XDG_CONFIG_HOME=$HOME/.config for configuration files
$XDG_DATA_HOME=$HOME/.local/share for data files
$XDG_CACHE_HOME=$HOME/.cache for the program cache
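The paths above are the defaults, used when the corresponding variable is unset or empty. A sketch of how a CLI might resolve them in Python follows; xdg_dir and the application name mycli are made up for this example, and the environ parameter exists only to make the function easy to try out.

```python
import os
from pathlib import Path

# Default locations relative to $HOME, as listed above.
_DEFAULTS = {
    "XDG_CONFIG_HOME": ".config",
    "XDG_DATA_HOME": ".local/share",
    "XDG_CACHE_HOME": ".cache",
}


def xdg_dir(variable, app_name, environ=None):
    """Return the directory an application should use for the given XDG variable."""
    environ = os.environ if environ is None else environ
    base = environ.get(variable)
    if not base:
        # Unset or empty: fall back to the default under the home directory.
        home = environ.get("HOME") or Path.home()
        base = Path(home) / _DEFAULTS[variable]
    return Path(base) / app_name


if __name__ == "__main__":
    for variable in _DEFAULTS:
        print(variable, "->", xdg_dir(variable, "mycli"))
```

Putting everything in a subdirectory named after the application keeps one CLI's files from colliding with another's.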

Continue to Part 2 — Python packages for writing CLIs