The Hitchhiker's Guide to CLIs in Python — Part 1: Anatomy of a Terminal and CLI
04 May 2020
tty (short for teletype) is a Unix command that prints the name of the terminal connected to standard input. It is also the prefix in names for virtual terminals on Unix-based operating systems.
The fundamental type of application that runs on a terminal is the shell. The shell prompts the user for commands and sends each one for execution after they press Enter, similar to the old teletype workflow.
keyboard
        \
         \ input
          \
           (terminal)- - - - - - - -(process)
          /
         / output
        /
display
Intuitively, the whole shebang looks like the diagram above. The keyboard passes input to the terminal, which passes it to the process. The process does some work and gives the output back to the terminal, which prints it on the display.
keyboard
        \
         \ input
          \
           (terminal)- -(termios)- -(process)
          /
         / output
        /
display
But an illusion sits between the terminal and the process: termios.
termios and stty
termios is an interface to some default settings that affect how text is entered and printed on a terminal.
$ man termios
The man page for termios shows all the available settings.
$ stty -a
speed 38400 baud; rows 34; columns 166; line = 0;
intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D;
-ignbrk brkint -ignpar -parmrk -inpck -istrip -inlcr -igncr
opost -olcuc -ocrnl onlcr -onocr -onlret -ofill -ofdel nl0 cr0
isig icanon iexten echo echoe echok -echonl -noflsh -xcase -tostop
The stty utility can be used to turn these settings on or off. Running stty -a shows all the current settings and their values, for example the speed of serial communication and the number of rows and columns.
Let's see what some of these settings do. We'll go through the same examples Brandon Rhodes shared in his 2017 North Bay Python keynote.
icanon
$ man termios
...
ICANON Enable canonical mode (described below).
...
The first one is icanon, which is on by default. It refers to the canonical text editor, which buffers input one line at a time and allows basic editing of commands, for example removing characters with backspace or discarding the whole line with ^U. Interactive applications like text editors turn off icanon and handle all the text editing themselves.
$ stty -icanon
We can turn off icanon using stty -icanon, with - being the off switch. This is how stty works: the utility name, followed by the setting name, prefixed with - for off and with nothing for on.
Let's open cat and type in some text. Since the canonical text editor is on by default, the input is buffered until Enter is pressed. We can also use backspace to remove characters from our input. Let's see how the terminal behaves when icanon is off.
We can see that the text is not being buffered for editing. cat receives a character as soon as we type it and prints it right away, rather than one line at a time. To turn icanon back on, we just remove the - and run stty icanon.
onlcr
$ man termios
...
ONLCR (XSI) Map NL to CR-NL on output.
...
The second one is onlcr, which is on by default. Here nl stands for new line and cr for carriage return. onlcr finds new lines in the output and adds a carriage return before each one. A carriage return makes sure that the cursor moves back to the first column after a new line, just as the paper carriage on a teletype would return to the first column.
A carriage return, without the new line character, is used to make progress bars on modern terminals. The program updates the progress, moves the cursor back to the first column, and then overwrites the earlier progress with the new one.
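The progress-bar trick can be sketched in a few lines of Python. This is only an illustration: the names render_bar and show_progress, the bar width, and the delay are made up for the example.

```python
import sys
import time


def render_bar(done, total, width=20):
    """Return one progress line that begins with a carriage return."""
    filled = width * done // total
    return "\r[" + "#" * filled + "." * (width - filled) + f"] {done}/{total}"


def show_progress(total=5, delay=0.1, stream=sys.stdout):
    """Redraw the bar in place: each write starts at column one again."""
    for done in range(total + 1):
        stream.write(render_bar(done, total))
        stream.flush()  # no newline is written, so flush to make the bar visible
        time.sleep(delay)
    stream.write("\n")  # finally move to the next line
```

Calling show_progress() in a terminal should redraw a single line instead of printing six separate ones.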
$ stty -onlcr
We can turn it off using stty -onlcr. Let's see the output for ps, a utility that reports the status of the current processes. The output looks very structured. Let's turn off onlcr and see the output again.
It's all over the place! The illusion is gone; this is the real thing. Each line is printed on a new line, but the cursor does not return to the first column. Many applications are written with the assumption that the terminal will automatically move the cursor back to the first column when a new line is printed.
echo
$ man termios
...
ECHO Echo input characters.
...
The third one is echo, which is also on by default. It directs the terminal to print every character that we input back on the display.
$ stty -echo
We can turn it off using stty -echo. We'll look at cat again. With echo on, we can see the input text as it is typed. What happens if we turn it off?
We didn't see the cat command or the input text as they were typed, until cat, which was actually running, printed the text back for us. Programs turn off echo when they ask users for sensitive input, like passwords.
$ reset
If you're experimenting with termios settings, you can use the reset command to return all of them to their default values.
import termios
The Python standard library also contains the termios module that can be used to turn these settings on or off from Python code.
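As a rough sketch of what that can look like, the snippet below mimics stty -echo for a single line of input. The helper names without_lflag and read_without_echo are invented for this example, and it assumes standard input is a real terminal. For actual password prompts, the standard library's getpass module is the usual choice.

```python
import sys
import termios

LFLAG = 3  # position of the local-modes field in the list returned by tcgetattr()


def without_lflag(attrs, flag):
    """Return a copy of a tcgetattr() attribute list with one local flag cleared."""
    new_attrs = list(attrs)
    new_attrs[LFLAG] = attrs[LFLAG] & ~flag
    return new_attrs


def read_without_echo(prompt="Secret: "):
    """Read one line from the terminal with ECHO off, similar to `stty -echo`."""
    fd = sys.stdin.fileno()
    saved = termios.tcgetattr(fd)  # raises termios.error if fd is not a terminal
    try:
        termios.tcsetattr(fd, termios.TCSADRAIN, without_lflag(saved, termios.ECHO))
        sys.stdout.write(prompt)
        sys.stdout.flush()
        return sys.stdin.readline().rstrip("\n")
    finally:
        # Always restore the saved settings, even if reading fails.
        termios.tcsetattr(fd, termios.TCSADRAIN, saved)
        sys.stdout.write("\n")
```

Passing termios.ICANON instead of termios.ECHO to without_lflag would give the equivalent of stty -icanon.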
Signals
Another way to change a terminal's state is through in-band and out-of-band signals. In-band signaling means we include special characters in our text input, which the terminal interprets as commands. The terminal does not print these special characters; instead, it carries out the intended effect.
Control characters
One way to do in-band signaling is using control characters. For example:
- ^H backspace
- ^J add a new line
- ^C interrupt the running process
- ^D end text input or exit the shell
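The caret notation maps directly to character codes: ^X is the code of X with one bit flipped. A small sketch of that mapping, where the function name ctrl is made up for illustration:

```python
def ctrl(key):
    """Return the control character typed as Ctrl+key, e.g. ctrl("C") for ^C."""
    return chr(ord(key.upper()) ^ 0x40)


# Show the code behind each control character from the list above.
for key in "HJCD":
    print(f"^{key} -> {ctrl(key)!r} (code {ord(ctrl(key))})")
```

The same rule explains the erase = ^? entry in the stty output earlier: ctrl("?") is the delete character, code 127.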
Escape sequences
Another way is to use escape sequences, which control things like cursor location and text color. For example:
- \u001b[2J: clear screen
- \u001b[1m: make text bold
- \u001b[31m: make text red
- \u001b[{n}A: move cursor up by n lines
Printing each escape sequence will have the corresponding effect.
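Here is a minimal sketch of using these sequences from Python. The helper names are made up, and the reset sequence \u001b[0m is an addition that is not in the list above: it turns the styling off again so later output is unaffected.

```python
ESC = "\u001b"
RESET = ESC + "[0m"  # assumption: not listed above; resets colors and styles
CLEAR_SCREEN = ESC + "[2J"


def bold(text):
    """Wrap text so that a terminal renders it in bold."""
    return f"{ESC}[1m{text}{RESET}"


def red(text):
    """Wrap text so that a terminal renders it in red."""
    return f"{ESC}[31m{text}{RESET}"


def cursor_up(n):
    """Return the sequence that moves the cursor up by n lines."""
    return f"{ESC}[{n}A"


print(red("error:"), bold("something went wrong"))
```

On a terminal that understands these sequences, the last line should print a red "error:" followed by bold text.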
Streams
Terminals are also preconfigured with input and output streams, where the input stream is mapped to the keyboard and the output stream to the display.
This ability to automatically map input and output to the keyboard and display, by default, was a Unix breakthrough! In operating systems before Unix, programs had to explicitly connect to appropriate input-output devices, which was a tedious thing to do, because of a lack of standards across systems.
stdin
stdin is the input stream, where the program reads its input data.
stdout and stderr
stdout and stderr are output streams, where the program writes its output data and error messages.
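In Python, the three streams are available as sys.stdin, sys.stdout and sys.stderr. The sketch below uses a made-up function, number_lines, that takes the streams as parameters so it can be tried with in-memory stand-ins instead of a real keyboard and display.

```python
import io
import sys


def number_lines(in_stream=sys.stdin, out_stream=sys.stdout, err_stream=sys.stderr):
    """Copy non-empty lines to out_stream with numbers; report blank ones on err_stream."""
    for number, line in enumerate(in_stream, start=1):
        text = line.rstrip("\n")
        if text:
            out_stream.write(f"{number}: {text}\n")
        else:
            err_stream.write(f"line {number} is empty\n")


# Demo with in-memory streams standing in for the keyboard and the display.
fake_in = io.StringIO("hello\n\nworld\n")
fake_out, fake_err = io.StringIO(), io.StringIO()
number_lines(fake_in, fake_out, fake_err)
```

Calling number_lines() with no arguments would read from the keyboard and write to the display, since those are the defaults.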
Redirection
Program output can also be redirected using the redirection operators.
> and >>
$ echo "hello" > file
$ echo "world" >> file
The greater than (>) and the double greater than (>>) are operators that redirect a program's output to a file. The only difference between the two is that > will overwrite the file, while >> will append to the file.
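The same difference shows up in Python as the file mode passed to open(). This is a loose analogy rather than what the shell does internally, and the helper write_line is made up for the example.

```python
import os
import tempfile


def write_line(path, text, append=False):
    """Write one line to path: mode "w" behaves like >, mode "a" like >>."""
    with open(path, "a" if append else "w") as f:
        print(text, file=f)


def read_text(path):
    with open(path) as f:
        return f.read()


path = os.path.join(tempfile.mkdtemp(), "file")

write_line(path, "hello")               # like: echo "hello" > file
write_line(path, "world", append=True)  # like: echo "world" >> file
after_append = read_text(path)

write_line(path, "again")               # like: echo "again" > file
after_overwrite = read_text(path)
```

After the append, the file holds both lines; after the final overwrite, only the last one remains.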
|
$ echo "hello" | cat
hello
Another redirection operator is the pipe that makes the output of one program the input to another.
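A pipe can also be set up by hand from Python with the subprocess module. In this sketch, the function name pipe is made up, and the two child commands are small Python one-liners standing in for echo and cat so that the example does not depend on which Unix tools are installed.

```python
import subprocess
import sys


def pipe(producer_cmd, consumer_cmd):
    """Run `producer_cmd | consumer_cmd` and return the consumer's output as text."""
    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    try:
        # The consumer reads its stdin straight from the producer's stdout.
        consumer = subprocess.run(
            consumer_cmd,
            stdin=producer.stdout,
            stdout=subprocess.PIPE,
            text=True,
            check=True,
        )
    finally:
        producer.stdout.close()
        producer.wait()
    return consumer.stdout


say_hello = [sys.executable, "-c", "print('hello')"]  # stands in for: echo "hello"
copy_stdin = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"]  # stands in for: cat
result = pipe(say_hello, copy_stdin)
```

On a Unix system, pipe(["echo", "hello"], ["cat"]) should behave the same way.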
Now that we have an understanding of how terminals evolved and how they work, let's look at programs that run inside a terminal.
Command-line interfaces
The words interfaces, applications, programs and tools are used interchangeably, and they refer to the same thing most of the time. CLIs are kinda fun to use and make it easy to automate repetitive tasks via shell scripting!
Prompt command option argument <Enter>
Output
The general usage pattern of a CLI looks like the one above. The shell displays the prompt as a sign that it's ready to take in commands. The user types in the command that they want to run, along with options and arguments, and ends the input by pressing the Enter key. This completes the command line of text. The command is then executed and its output is printed on the terminal. But what are these arguments and options?
Arguments
Arguments are required items of information for a program. Required, in the sense that the program won't work without them. They are often positional, which means an argument's position in the line helps the program identify the argument's type.
$ cp src dst
For example, here's the cp command. It can't function without both the required arguments. The argument in the first position will always be identified as the source, and the argument in the second position will always be the destination where files from the source need to be copied.
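Positional parsing can be sketched by hand in Python. The function parse_cp_args below is hypothetical and much simpler than the real cp, which accepts more than two arguments. It only shows how position decides meaning.

```python
def parse_cp_args(argv):
    """Split an argument list like ["src", "dst"] into (source, destination)."""
    if len(argv) != 2:
        # Both arguments are required, so anything else is a usage error.
        raise SystemExit("usage: cp src dst")
    source, destination = argv  # meaning is decided purely by position
    return source, destination


source, destination = parse_cp_args(["notes.txt", "backup.txt"])
```

In a real script, the list would come from sys.argv[1:], which holds everything typed after the command name.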
Options
An option, or a flag, is used to modify the operation of a command. As the name suggests, they are optional and have some default values. The general convention is to have hyphen(s) in front of a character or word to identify the option.
$ cp -r src dst
For example, in the cp command, -r changes the operation by recursively looking for files in the source, and then copying them to the destination.
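Here is a small sketch of the same idea using argparse from the Python standard library. The program name copy and the long form --recursive are made up for the example; the parser only reads the command line and does not copy anything.

```python
import argparse

parser = argparse.ArgumentParser(prog="copy")  # hypothetical program name
parser.add_argument("src")  # positional, required
parser.add_argument("dst")  # positional, required
parser.add_argument(
    "-r",
    "--recursive",
    action="store_true",  # optional flag, defaults to False
    help="copy directories recursively",
)

plain = parser.parse_args(["src", "dst"])
recursive = parser.parse_args(["-r", "src", "dst"])
```

Without the flag, plain.recursive keeps its default of False; with -r, recursive.recursive is True, while the positional arguments are read the same way in both cases.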
Documentation
One of the criticisms of a CLI is the lack of cues to the user about the available actions, in contrast to a GUI, which usually informs the user of available actions with menus, icons, or other visual cues.
Help text
$ cp --help
To overcome this limitation, many CLI programs display some documentation around the arguments and options that they support. This documentation can be viewed by invoking the CLI with the --help option.
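As an illustration of where such help text can come from, argparse in the Python standard library builds it from the declared arguments and adds the -h and --help options on its own. The program greet and its arguments are hypothetical.

```python
import argparse

parser = argparse.ArgumentParser(prog="greet", description="Print a greeting.")
parser.add_argument("name", help="who to greet")
parser.add_argument("--shout", action="store_true", help="print the greeting in upper case")

# format_help() returns the text that `greet --help` would print before exiting.
help_text = parser.format_help()
print(help_text)
```

The printed text lists the usage line, the description, and one entry per argument with its help string.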
Manual pages
$ man termios
Some of them also have man pages (short for manual pages). By default, the man Unix command uses a terminal pager program such as more or less to display the lengthy documentation for a CLI. This makes it easy for the user to scroll and search through it.
Standards
You might be thinking that there are a lot of moving parts here. Each programmer could write their CLI differently. For example, they could use -x instead of -h to display help text. Are there any standards to make sure every CLI follows some basic conventions?
POSIX
There is a standard and it's called POSIX. POSIX makes the APIs provided by Unix-based operating systems, such as command-line interfaces, uniform. To follow the POSIX standard is to be POSIX-compliant.
XDG base directory specification
There's also the XDG base directory spec that dictates where a CLI should store the different types of files that it needs to function. This makes sure that a CLI doesn't save files all over the place.
- $XDG_CONFIG_HOME=$HOME/.config for configuration files
- $XDG_DATA_HOME=$HOME/.local/share for data files
- $XDG_CACHE_HOME=$HOME/.cache for the program cache
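A CLI written in Python could resolve these directories roughly as follows. This is a simplified sketch: the function names and the application name mycli are made up, and it only handles the fallback to the defaults listed above when a variable is unset or empty.

```python
import os
from pathlib import Path

# Defaults relative to the home directory, as listed above.
XDG_DEFAULTS = {
    "XDG_CONFIG_HOME": ".config",
    "XDG_DATA_HOME": ".local/share",
    "XDG_CACHE_HOME": ".cache",
}


def xdg_dir(variable, env=None, home=None):
    """Return the directory for an XDG variable, falling back to its default."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    value = env.get(variable, "")
    return Path(value) if value else home / XDG_DEFAULTS[variable]


def app_config_file(app_name, filename, env=None, home=None):
    """Return where a hypothetical application would keep one config file."""
    return xdg_dir("XDG_CONFIG_HOME", env, home) / app_name / filename
```

For example, with no XDG variables set and a home directory of /home/alice, app_config_file("mycli", "config.toml") would point at /home/alice/.config/mycli/config.toml.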
Continue to Part 2 — Python packages for writing CLIs