Comparing Bash and Python for Linux scripting
Sh (from English shell) is a mandatory command interpreter for UNIX-compatible systems according to the POSIX standard. However, its capabilities are limited, so more feature-rich command interpreters such as Bash or Ksh are often used instead. Ksh is typically used in BSD-family operating systems, while Bash is used in Linux-family operating systems. Command interpreters simplify solving small tasks related to working with processes and files. This article will focus on Linux operating systems, so the discussion will revolve around Bash.
Python, on the other hand, is a full-fledged interpreted programming language, often used for writing scripts or solving small application tasks. It is hard to imagine a modern UNIX-like system without both sh and Python, unless it is a device with a minimalist OS like a router. For example, in Ubuntu Oracular, the python3 package cannot be removed because the grub-common package depends on it, and grub-common is in turn required by grub2-common and, consequently, by grub-pc, the actual operating system bootloader. Thus, Python 3 can confidently be used as a replacement for Bash when necessary.
When solving various tasks at the OS or file system level, the question may arise: which language, Bash or Python, is more advantageous to use in a particular case? The answer depends on the task at hand. Bash is advantageous when you need to quickly solve a simple task related to process management, file search, or modification. However, as the logic becomes more complex, Bash code can become cumbersome and difficult to read (although readability primarily depends on the programmer). Of course, you can break the code into scripts and functions, create sh-libraries, and include them via the source command, but covering them with unit tests becomes challenging.
Preface
Who is this article for?
This article is for those who are interested in system administration, are familiar with one of the two languages, and want to understand the other. Or for those who want to learn about some features of Bash and Python that they might not have known before. Basic command-line skills and familiarity with programming fundamentals are required to understand the material.
For a complete picture, including code readability, the article will compare debugging capabilities, syntax, and various use cases. Similar examples in both languages will be provided. In Python code, you may occasionally see commas at the end of enumerations—this is not an error. Such a style is considered good practice because it avoids marking the last element as modified when adding new elements to the enumeration.
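The trailing-comma style mentioned above can be illustrated with a minimal sketch: adding a new element to the list changes only one line in a diff, because the previous last element already ends with a comma.

```python
# Each element ends with a comma, including the last one.
# Appending 'yellow' later would touch only one line in the diff:
colors = [
    'red',
    'blue',
    'green',
]
print(len(colors))  # 3
```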
The article will consider Bash version 3.0 or higher and Python version 3.7 or higher.
Debugging Scripts
Both languages are interpreted, meaning that during script execution, the interpreter knows a lot about the current execution state.
Debugging in Bash
Debugging via xtrace
Bash supports the xtrace option (-x), which can be set either on the command line when starting the interpreter or within the script itself:
#!/bin/bash
# Specify where to write logs, open the file for writing:
exec 3>/path/to/log/file
BASH_XTRACEFD=3 # which file descriptor to output debug information to
set -x # enable debugging
# ... code to debug ...
set +x # disable debugging
Such logs can also be written to the systemd journal if implementing a simple service:
#!/bin/bash
# Specify where to write logs:
exec 3> >(systemd-cat --priority=debug)
BASH_XTRACEFD=3 # which stream to output debug information to
set -x # enable debugging
# ... code to debug ...
set +x # disable debugging
Debugging in Bash will show which commands are being executed and with which arguments. If you need to get the current values of variables or the code of executed functions, you can do this with the set command without arguments. However, since the output of that command can be quite large, set is more suitable for manual debugging than for event logging.
Debugging via trap
Another debugging method is setting a handler for command execution using the trap command on the special DEBUG "trap". The command being executed can be obtained through the built-in variable BASH_COMMAND. However, you cannot get the return code from this handler, because it is executed before the command itself is called.
trap 'echo "+ $BASH_COMMAND"' DEBUG
But it is more useful to intercept errors and output the command and the line number where the error occurred. For this interception to be inherited by functions, you also need to set the functrace option:
set -o functrace
trap 'echo "+ line $LINENO: $BASH_COMMAND -> $?"' ERR
# Test:
ls "$PWD"
ls unknown_file
Debugging in Python
Debugging via pdb
Python has rich debugging and logging tools. For debugging, Python has the pdb module. You can run a script with debugging enabled from the console; in that case, debug mode is activated on unhandled exceptions:
python3 -m pdb my_script.py
Directly in the code, you can set breakpoints using the built-in breakpoint() function.
#!/usr/bin/python3
import os
breakpoint()
# Now you can try, for example, the source os command:
# (Pdb) source os
The language is object-oriented, and everything in it is an object. You can see the methods available for an object using the dir() function. For example, dir(1) will show the methods available for the object 1; one of them can be called as (1).bit_length(). In many cases, this helps answer questions without reading the documentation. In debug mode, you can also use dir() to inspect objects and print() to get variable values.
Logging via the logging module
Python provides the logging module, which allows you to log debug information with specified logging levels and log sources. In general, logging looks something like this:
import logging
logging.basicConfig(
    filename='myscript.log',
    level=logging.DEBUG,  # output DEBUG, INFO, WARNING, ERROR, and CRITICAL levels
)
logger = logging.getLogger('MyApp')
logger.debug('Some debug information')
logger.error('Some error')
Comparison of Bash and Python Semantics
Variables and Data Types
Primitive Data Types
In Bash, all variables are strings, but string variables can also be used as numbers. For arithmetic calculations, the $(( expression )) syntax is used.
str_var="some_value" # string, array of characters
int_var=1234 # string "1234", but can be used in calculations
int_var=$(( 1 + (int_var - 44) / 111 - 77 )) # string: "-66"
In Python:
str_var = "some_value" # class str
int_var = 1234 # class int
int_var = 1 + (int_var - 44) // 111 - 77 # -66, class int
Floating-point numbers are not supported in Bash. And this is logical, because if you need to use floating-point numbers in command-line scripts, you are clearly doing something at the wrong level or in the wrong programming language. However, floating-point numbers are supported in Ksh.
String Formatting
Both Bash and Python support variable substitution in formatted strings. In Bash, formatted strings are simply strings enclosed in double quotes, while in Python they are strings with the f prefix.
Both languages also support C-like style output of formatted strings. In Bash, this way you can even format floating-point numbers, although the language itself does not support them (the decimal separator is determined by the locale).
var1='Some string'
var2=0,5
echo "Variable 1: $var1, variable 2: $var2"
# Variable 1: Some string, variable 2: 0,5
# Without the current locale
LANG=C
printf 'String: %s, number: %d, floating-point number: %f.\n' \
    'str' '1234' '0.1'
# String: str, number: 1234, floating-point number: 0.100000.
# With the current locale
printf 'String: %s, number: %d, floating-point number: %f.\n' \
    'str' '1234' '0,1'
# String: str, number: 1234, floating-point number: 0,100000.
In Python:
var1 = 'Some string'
var2 = 0.5
print(f"Variable 1: {var1}, variable 2: {var2}")
# Variable 1: Some string, variable 2: 0.5
# Without the current locale:
print('String: %s, number: %d, floating-point number: %f.'
% ('str', 1234, 0.1))
# String: str, number: 1234, floating-point number: 0.100000.
# With the current locale:
import locale
locale.setlocale(locale.LC_ALL, '') # apply the current locale
print(locale.format_string('String: %s, number: %d, floating-point number: %f.',
                           ('str', 1234, 0.1)))
# String: str, number: 1234, floating-point number: 0,100000.
You may notice a difference regarding the locale: in Python, %-formatting (and therefore print()) ignores the locale. If you need to output values according to the locale, use the locale.format_string() function.
Arrays
In Bash, arrays are essentially text separated by spaces (by default). The syntax is very specific; for example, to copy an array (via @), you must enclose the expansion in quotes, otherwise any spaces inside the elements will cause them to be split apart. But in simple cases, working with arrays is similar:
arr=( 'First item' 'Second item' 'Third item' )
echo "${arr[0]}" "${arr[1]}" "${arr[2]}"
arr_copy=( "${arr[@]}" ) # copying the array, quotes are mandatory
arr[0]=1
arr[1]=2
arr[2]=3
echo "${arr[@]}"
echo "${arr_copy[0]}" "${arr_copy[1]}" "${arr_copy[2]}"
In Python:
arr = [ 'First', 'Second', 'Third' ]
print(arr[0], arr[1], arr[2])
arr_copy = arr.copy() # but you can also do it like in Bash: [ *arr ]
arr[0] = 1
arr[1] = 2
arr[2] = 3
print(*arr)
print(arr_copy[0], arr_copy[1], arr_copy[2])
The * operator in Python unpacks lists, dictionaries, iterators, and other iterables: the elements are passed as if they were listed, comma-separated, as individual arguments.
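Besides unpacking positional arguments with *, Python unpacks dictionaries into keyword arguments with **. A small sketch (the connect function and its parameters are made up for illustration):

```python
def connect(host, port, timeout):
    return f"{host}:{port} (timeout={timeout})"

args = ['example.org', 443]
kwargs = {'timeout': 30}
# * unpacks a list into positional arguments,
# ** unpacks a dictionary into keyword arguments:
print(connect(*args, **kwargs))  # example.org:443 (timeout=30)
```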
Associative Arrays
Bash also supports associative arrays (unlike Sh), but the capabilities for working with them are limited. In Python, associative arrays are called dictionaries, and the language provides very rich capabilities for working with them.
declare -A assoc_array=(
    [name1]='Value 1'
    [name2]='Value 2'
    [name3]='Value 3'
)
# Assigning a value by key:
assoc_array['name4']='Value 4'
# Element-wise access:
echo "${assoc_array['name1']}" \
    "${assoc_array['name2']}" \
    "${assoc_array['name3']}" \
    "${assoc_array['name4']}"
echo "${!assoc_array[@]}" # output all keys
echo "${assoc_array[@]}" # output all values
# Iterate over all elements:
for key in "${!assoc_array[@]}"; do
    echo "Key: $key"
    echo "Value: ${assoc_array[$key]}"
done
In Python:
assoc_array = {
    'name1': 'Value 1',
    'name2': 'Value 2',
    'name3': 'Value 3',
}
# Assigning a value by key:
assoc_array['name4'] = 'Value 4'
# Element-wise access
print(
assoc_array['name1'],
assoc_array['name2'],
assoc_array['name3'],
assoc_array['name4']
)
print(*assoc_array) # output all keys
print(*assoc_array.values()) # output all values
for key, value in assoc_array.items():
    print(f"Key: {key}")
    print(f"Value: {value}")
Module Importing
In Bash, there are no modules as such, but you can execute a script in the current interpreter using the source command. Essentially, this is analogous to importing modules, since all functions of the included script become available in the current interpreter's namespace. In Python, there is full support for modules with the ability to import them. Moreover, the Python standard library contains a large number of modules for a wide variety of use cases. Essentially, what Bash delegates to third-party command-line utilities is often available in Python as standard library modules (and if not, additional libraries can be installed).
In Bash:
# Include the file mylib.sh with some functions:
source mylib.sh
# Let's see the list of available functions (all of them):
declare -F
In Python:
# Import the module mylib.py or mylib.pyc:
import mylib
# Let's see the list of available objects in the mylib module:
print(dir(mylib))
Conditionals and Loops
Conditional Operator
In Bash, conditions work on two principles: either a command is given as the condition and its return code is checked, or built-in double square or double round brackets are used. With a return code, 0 means true (everything is fine); with double round brackets it is the opposite: the result of an arithmetic expression is checked, and 0 is false.
In Python, the approach standard for programming languages is used: False, 0, '' (an empty string), [] (an empty list), set() (an empty set), and other empty containers are all treated as False. Non-empty, non-zero values are treated as True.
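The Python truthiness rule above can be checked directly with bool():

```python
# Empty containers and zero are falsy:
print(bool(''), bool([]), bool(0), bool(set()))  # False False False False
# Non-empty and non-zero values are truthy
# (note that the non-empty string '0' is truthy):
print(bool('0'), bool([0]), bool(-1))  # True True True
```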
In Bash:
if [[ "$PWD" == "$HOME" ]]; then
    echo 'Current directory: ~'
elif [[ "$PWD" == "$HOME"* ]]; then
    echo "Current directory: ~${PWD#$HOME}"
else
    echo "Current directory: $PWD"
fi
if (( UID < 1000 )); then
    echo "You are logged in as a system user. Please log in as yourself."
fi
In Python:
import os
curr_dir = os.environ['PWD']
home_dir = os.environ['HOME']
if curr_dir == home_dir:
    print('Current directory: ~')
elif curr_dir.startswith(home_dir):
    print('Current directory: ~' + curr_dir[len(home_dir):])
else:
    print(f"Current directory: {curr_dir}")
if os.getuid() < 1000: # UID is not exported by default, so use os.getuid()
    print('You are logged in as a system user. Please log in as yourself.')
Loops
Both languages support for and while loops.
Loop with Element Iteration
In both languages, the for loop supports element iteration via the in operator. In Bash, it iterates over the elements of an array, or over the parts of a string split by the separators recorded in the IFS variable (by default: space, tab, and newline). In Python, the in operator iterates over any iterable object, such as lists, sets, tuples, and dictionaries, and is safer to work with.
In Bash:
# Recoding text files from CP1251 to UTF-8
for filename in *.txt; do
    tmp_file=`mktemp`
    iconv -f CP1251 -t UTF-8 "$filename" -o "$tmp_file"
    mv "$tmp_file" "$filename"
done
In Python:
import glob
from pathlib import Path
# Recoding text files from CP1251 to UTF-8
for filename in glob.glob('*.txt'):
    file = Path(filename)
    text = file.read_text(encoding='cp1251')
    file.write_text(text, encoding='utf8')
Loop with Counter
A loop with a counter in Bash looks unusual; the form for arithmetic calculations is used (((initialization; conditions; actions after iteration))
).
In Bash:
# Get a list of all locally registered hosts:
mapfile -t lines < <(grep -P -v '(^\s*$|^\s*#)' /etc/hosts)
# Output the list with numbering:
for ((i = 0; i < ${#lines[@]}; i += 1)); do
    echo "$((i + 1)). ${lines[$i]}"
done
In Python:
from pathlib import Path
import re
def is_host_line(s):
    return not re.match(r'(^\s*$|^\s*#)', s)

lines = list(filter(is_host_line, Path('/etc/hosts').read_text().splitlines()))
for i in range(len(lines)):
    print(f"{i + 1}. {lines[i]}")
Functions
Like full-fledged languages, Bash supports functions. In essence, functions in Bash are similar to separate scripts: they accept arguments the same way scripts do, and they return a return code. Unlike in Python, they cannot return any result other than the return code; however, you can return text through the output stream.
In Bash:
some_function() {
    echo "Script: $0."
    echo "Function: $FUNCNAME."
    echo "Function arguments:"
    for arg in "$@"; do
        echo "$arg"
    done
    return 0
}

some_function One Two Three Four Five
echo $? # Return code
In Python:
import inspect

def some_function_is_ok(*args):
    try: # in case we are run from the interactive interpreter
        script_name = __file__
    except NameError:
        script_name = ""
    print('Script: ' + script_name)
    print('Function: ' + inspect.getframeinfo(inspect.currentframe()).function)
    print('Function arguments:')
    print(*args, sep='\n')
    return True

result = some_function_is_ok('One', 'Two', 'Three', 'Four', 'Five')
print(result) # True
Input, Output, and Error Streams
The input stream is used by a process to receive information, while the output stream is used to emit it. Why streams and not regular variables? Because streams can be processed as the data arrives. Since output-stream data may undergo further processing, error messages would corrupt it; therefore, errors are written to a separate error stream. When a command runs interactively, these streams are mixed on the terminal. Being streams, they can be redirected, for example to a file, or, conversely, a file can be fed to the input stream. In Bash, the input stream has number 0, the output stream 1, and the error stream 2. If no stream number is specified in a redirection operator, the output stream is redirected.
Writing to a File
Writing to a file in Bash is done using the > operator, which redirects the command's output to the specified file. In Python, you can write text files using the pathlib module or by the standard means of opening a file via the open() function. The latter option is more verbose but is well-known to programmers.
In Bash:
# Clear a text file by redirecting an empty string to it:
echo -n > some_text_file.txt
# Write a line to a file, overwriting it:
echo 'Line 1' > some_other_text_file.txt
# Append a line to a file:
echo 'Line 2' >> some_other_text_file.txt
In Python:
from pathlib import Path
# Overwrite the file with an empty string (make it empty):
Path('some_text_file.txt').write_text('')
# Overwrite the file with a line:
Path('some_other_text_file.txt').write_text('Line 1')
# Open the file for appending (a):
with open('some_other_text_file.txt', 'a') as fd:
    print('Line 2', file=fd)
Writing Multi-line Text to a File
For multi-line text, Bash has the special heredoc format (an arbitrary label after <<; repeating the label on a line of its own marks the end of the text), which redirects arbitrary text to a command's input stream, from where it can be redirected to a file (and here you cannot do without the external cat command). Redirecting file contents to a process is much simpler.
In Bash:
# Redirect multi-line text to a file for appending:
cat << EOF >> some_other_text_file.txt
Line 3
Line 4
Line 5
EOF
# Redirect file contents to the cat command:
cat < some_other_text_file.txt
In Python:
# Open the file for appending (a):
with open('some_other_text_file.txt', 'a') as fd:
    print("""Line 3
Line 4
Line 5""", file=fd)
# Open the file for reading (r):
with open('some_other_text_file.txt', 'r') as fd:
    # Output the file contents line by line:
    for line in fd:
        print(line, end='') # each line already ends with \n
# Or fd.read(), but then the entire file will be read into memory.
Reading from a File
In Bash, reading from a file is done via the < sign. In Python, you can read in the standard way via open(), or simply via Path(...).read_text():
In Bash:
cat < some_other_text_file.txt
In Python:
from pathlib import Path

print(Path('some_other_text_file.txt').read_text())
Stream Redirection
Streams can be redirected not only to a file or process but also to another stream.
In Bash:
error() {
    # Redirect the message to the error stream (2):
    >&2 echo "$@"
}

error 'An error occurred.'
In Python:
import sys

print('An error occurred.', file=sys.stderr)
In simple cases, redirection to a file or from a file in Bash looks much clearer and simpler than writing to a file or reading from it in Python. However, in complex cases, Bash code will be less understandable and more difficult to analyze.
Executing External Commands
Running external commands in Python is more cumbersome than in Bash. There are, of course, the simple functions subprocess.getoutput() and subprocess.getstatusoutput(), but they lose Python's advantage of passing each individual argument as a list element.
Getting Command Output
If you simply need to get text from a command and you are sure that it will always work, you can do it as follows:
In Bash:
cmd_path="`which ls`" # backticks execute the command and return its output
echo "$cmd_path" # output the command path
In Python:
import subprocess
cmd_path = subprocess.getoutput('which ls') # the trailing newline is already stripped
print(cmd_path) # output the path to the ls command
But getting command output via backticks in Bash is incorrect when you need an array of lines. In Python, subprocess.getoutput() accepts a whole command line rather than an array of arguments, which carries some risks when substituting values. And both options ignore the return code of the executed command.
Running a utility in Python to get some list into a variable will take much more code than in Bash, although the code in Python will be much clearer and simpler:
In Bash:
mapfile -t root_files < <(ls /) # put the list of files from / into root_files
echo "${root_files[@]}" # Output the list of files
In Python:
import subprocess

result = subprocess.run(
    ['ls', '/'], # we are sure that such a command exists
    capture_output = True, # get the command output
    text = True, # interpret input and output as text
)
root_files = result.stdout.splitlines() # get lines from the output
print(*root_files, sep='\n') # output one file per line
Getting and Processing Return Codes
With full error handling, it becomes even more complicated, adding checks that complicate the code:
In Bash:
root_files="`ls /some/path`" # Run the command in backticks
exit_code=$? # $? must be saved before any other command (including [[) runs
if [[ $exit_code != 0 ]]; then
    exit $exit_code
fi
echo "$root_files" # Output the list of files
In Python:
import subprocess
import sys

result = subprocess.run(
    ['ls', '/some/path'],
    capture_output = True, # get the command output
    text = True, # interpret input and output as text
)
if result.returncode != 0:
    sys.exit(result.returncode)
root_files = result.stdout.split('\n') # get lines from the output
del root_files[-1] # the last line will be empty due to the \n at the end, delete it
print(*root_files, sep='\n') # output one file per line
Executing a Command with Only Getting the Return Code
Executing a command with only getting the return code is slightly simpler:
In Bash:
any_command any_arg1 any_arg2
exit_code=$? # get the return code of the previous command
if [[ $exit_code != 0 ]]; then
    exit 1
fi
In Python:
import subprocess
import sys

result = subprocess.run(
    [
        'any_command',
        'any_arg1',
        'any_arg2',
    ],
    # Note: if the command itself does not exist, subprocess.run()
    # raises FileNotFoundError instead of returning a code.
)
if result.returncode != 0:
    sys.exit(1)
Exceptions Instead of Handling Return Codes
But everything becomes even simpler if the script exit mode on any error is enabled. In Python, this approach is used by default; errors do not need to be checked manually; a function can throw an exception and crash the process.
In Bash:
set -o errexit # crash on command errors
set -o pipefail # the entire pipeline fails if there is an error inside the pipeline
critical_command any_arg1 any_arg2
In Python:
import subprocess

subprocess.run(
    [
        'critical_command',
        'any_arg1',
        'any_arg2',
    ],
    check = True, # throw an exception on a non-zero return code
)
In some cases, exceptions can be caught and handled. In Python, this is done via the try statement. In Bash, such interception is done via the usual if statement.
In Bash:
set -o errexit # crash on command errors
set -o pipefail # the entire pipeline fails if there is an error inside the pipeline
if any_command any_arg1 any_arg2; then
    do_something_else any_arg1 any_arg2
fi
In Python:
import subprocess

try:
    subprocess.run(
        [
            'critical_command',
            'any_arg1',
            'any_arg2',
        ],
        check = True, # throw an exception on a non-zero return code
    )
except subprocess.CalledProcessError:
    subprocess.run(
        [
            'do_something_else',
            'any_arg1',
            'any_arg2',
        ],
        check = True, # throw an exception on a non-zero return code
    )
In high-level languages, error handling via exceptions is preferred. The code becomes simpler and clearer, meaning there is less chance of making a mistake, and code review becomes cheaper. Although sometimes such checks look more cumbersome than a simple return code check. Whether to use this style of error handling largely depends on whether such exception checks will be frequent or will be in exceptional cases.
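When check=True is used, it is better to catch the specific subprocess.CalledProcessError rather than a bare except:, because the exception object carries the failed command and its return code:

```python
import subprocess

try:
    subprocess.run(['false'], check=True)  # 'false' always exits with code 1
except subprocess.CalledProcessError as e:
    # The exception carries the failed command and its return code:
    print(f"Command {e.cmd} failed with code {e.returncode}")
```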
Building Pipelines
In Bash, pipelines are common practice, and the language itself has syntax for creating them. Since Python is not a command interpreter, pipelines there are built somewhat more verbosely via the subprocess module.
In Bash:
ls | grep -v '\.txt$' | grep 'build'
In Python:
import subprocess

p1 = subprocess.Popen(
    ['ls'],
    stdout = subprocess.PIPE, # to pass output to the next command
    text = True,
)
p2 = subprocess.Popen(
    [
        'grep',
        '-v',
        r'\.txt$',
    ],
    stdin = p1.stdout, # create a pipeline
    stdout = subprocess.PIPE, # to pass output to the next command
    text = True,
)
p3 = subprocess.Popen(
    [
        'grep',
        'build',
    ],
    stdin = p2.stdout, # create a pipeline
    stdout = subprocess.PIPE, # now for reading from the current process
    text = True,
)
for line in p3.stdout: # read line by line as data arrives
    print(line, end='') # each line already ends with \n
Pipelines with Parallel Data Processing
In Bash, pipelines can be created both between commands and between commands and interpreter blocks. For example, you can redirect a pipeline to a line-by-line reading loop. In Python, processing data from a parallel process is also done by simple line-by-line reading from the process’s output stream.
In Bash:
# Get a list of file paths whose names contain some text:
find . -name '*.txt' \
    | while read -r line; do # sequentially get file paths
        if [[ "$line" == *'text'* ]]; then # substring in string
            echo "$line"
        fi
    done
In Python:
import subprocess

p = subprocess.Popen(
    [
        'find',
        '.',
        '-name',
        '*.txt',
    ],
    stdout = subprocess.PIPE,
    text = True,
)
while True:
    line = p.stdout.readline().rstrip('\n') # there is always \n at the end
    if not line:
        break
    if 'text' in line: # substring in string
        print(line)
Parallel Process Execution with Waiting for Completion
In Bash, running a process in the background is supported at the language syntax level (the &
operator), and you can run both individual commands in the background and parts of the interpreter (for example, functions or loops). But at this level of complexity, the code will often be simpler and clearer if it is written in Python, especially since the standard library provides capabilities that at the command interpreter level are implemented by third-party utilities that need to be considered as dependencies.
In Bash:
unalias -a # in case someone copies directly into the terminal
get_size_by_url()
url="$1"
# Get the file size from the Content-Length field of the response headers to a HEAD request
curl --head --silent --location "$url"
download_range()
url="$1"
start=$2
end=$3
output_file="$4"
((curr_size = end - start + 1))
curl
--silent
--show-error
--range "$start-$end"
"$url"
--output -
download_url()
url="$1"
output_file="$2"
((file_size = $(get_size "$url")))
# Allocate disk space for the file in advance:
fallocate -l "$file_size" "$output_file"
range_size=10485760 # 10 MiB
# Divide into parts of up to 100 MiB:
((ranges_count = (file_size + range_size - 1) / range_size))
declare -a pids ## We will save all process identifiers
for ((i = 0; i < ranges_count; i += 1)); do
((start = i * range_size))
((end = (i + 1) * range_size - 1))
if ((end >= file_size)); then
((end = file_size - 1))
fi
# Start downloading in the background:
download_range "$url" $start $end "$output_file" &
pids[$i]=$! # remember the PID of the background process
done
wait "$pids[@]" # wait for the processes to complete
In Python:
import os
from multiprocessing import Process

import requests

def get_size_by_url(url):
    response = requests.head(url)
    return int(response.headers['Content-Length'])

def download_range(url, start, end, output_file):
    req = requests.get(
        url,
        headers = {'Range': f'bytes={start}-{end}'},
        stream = True,
    )
    req.raise_for_status()
    with open(output_file, 'r+b') as fd:
        fd.seek(start)
        for block in req.iter_content(4096):
            fd.write(block)

def download_url(url, output_file):
    file_size = get_size_by_url(url)
    range_size = 10485760 # 10 MiB
    ranges_count = (file_size + range_size - 1) // range_size
    with open(output_file, 'wb') as fd:
        # Allocate space for the file in advance:
        os.posix_fallocate(fd.fileno(), 0, file_size)
    processes = []
    for i in range(ranges_count):
        start = i * range_size
        end = start + range_size - 1
        if end >= file_size:
            end = file_size - 1
        # Prepare the process and run it in the background:
        process = Process(
            target = download_range, # this function will work in the background
            args = (url, start, end, output_file),
        )
        process.start()
        processes.append(process)
    for process in processes:
        process.join() # wait for each process to complete
Process Substitution
A separate topic worth mentioning is process substitution in Bash via the <(...) construct: not everyone knows about it, but it makes life much easier. Sometimes you need to pass streams of information from other processes to commands, but the commands themselves only accept file paths as input. You could redirect the output of those processes to temporary files, but such code would be cumbersome. Therefore, Bash supports process substitution: essentially, a virtual file is created in the /dev/fd/ space, and information is transmitted by passing the name of this file to the necessary command as a regular argument.
In Bash:
# Find common processes on two hosts (-12: only lines present in both):
comm -12 \
    <(ssh user1@host1 'ps -x --format cmd' | sort) \
    <(ssh user2@host2 'ps -x --format cmd' | sort)
In Python:
from subprocess import check_output

def get_common_lines(lines1, lines2):
    # Merge-style intersection of two sorted lists:
    i, j = 0, 0
    common = []
    while i < len(lines1) and j < len(lines2):
        if lines1[i] < lines2[j]:
            i += 1
        elif lines1[i] > lines2[j]:
            j += 1
        else:
            common.append(lines1[i])
            i += 1
            j += 1
    return common

lines1 = check_output(
    ['ssh', 'user1@host1', 'ps -x --format cmd'],
    text = True,
).splitlines()
lines1.sort()
lines2 = check_output(
    ['ssh', 'user2@host2', 'ps -x --format cmd'],
    text = True,
).splitlines()
lines2.sort()
print(*get_common_lines(lines1, lines2), sep='\n')
Environment Variables
Working with Environment Variables
Environment variables allow passing information from parent processes to child processes. Bash has built-in support for environment variables at the language level, but there is no associative array of all environment variables; the full list can only be obtained via the external env command.
In Bash:
# Assigning a value to an environment variable:
export SOME_ENV_VAR='Some value'
echo "$SOME_ENV_VAR" # getting the value
env # output the list of environment variables using an external command
In Python:
import os
# Assigning a value to an environment variable:
os.environ['SOME_ENV_VAR'] = 'Some value'
print(os.environ['SOME_ENV_VAR']) # getting the value
print(os.environ) # output the array of environment variables
Setting Values for Individual Processes
Environment variables are passed from the parent process to child processes. Sometimes you may need to change only one environment variable. Since Python is positioned as an application programming language, it will be somewhat more complicated to do this in Python, while in Bash, support for such variable setting is built-in:
In Bash:
# Set Russian localization for launched applications
export LANG='ru_RU.UTF-8'
LANG='C' ls --help # but run this command with English localization
echo "LANG=$LANG" # make sure the environment variables are not affected
In Python:
import os
import subprocess

# Assigning a value to an environment variable:
os.environ['LANG'] = 'ru_RU.UTF-8'
# Copy the environment and change the variable only in the copy:
new_env = os.environ.copy()
new_env['LANG'] = 'C'
subprocess.run(
    ['ls', '--help'],
    env = new_env,
)
print('LANG=' + os.environ['LANG']) # make sure the environment variables are not affected
Executing Arbitrary Code
Executing arbitrary code is not required in everyday situations, but both languages have this capability. In Bash, it may be useful, for example, to return variables modified by a process or to return named execution results. In Python, there are two built-in functions: eval() and exec(). The analog of the Bash eval here is exec(), since it executes a list of statements rather than just evaluating an expression. Using eval() and exec() is very bad practice in Python, and they can always be replaced with something more suitable, unless you need to write your own command interpreter on top of Python.
In Bash:
get_user_info() {
    echo "user=$(whoami)"
    echo "curr_dir=$(pwd)"
}

eval "$(get_user_info)" # execute the command output
echo "$user"
echo "$curr_dir"
In Python:
import getpass
import os

def user_info_code():
    return f"""
user = "{getpass.getuser()}" # very bad practice
curr_dir = "{os.getcwd()}" # please don't do this
"""

exec(user_info_code())
print(user)
print(curr_dir)

# But returning named values is generally better done
# through classes, namedtuple, or dictionaries:
from collections import namedtuple

UserInfo = namedtuple('UserInfo', ['user', 'curr_dir'])

def get_user_info():
    return UserInfo(getpass.getuser(), os.getcwd())

info = get_user_info()
print(info.user)
print(info.curr_dir)
Working with the File System and Processes
Getting and Changing the Current Directory
Changing the current directory in the command line is usually required when doing something manually. But getting the current directory may be needed in scripts, for example, if the script or the program being launched does something with files in the current directory. For the same reason, you may need to change the current directory if you need to run another program that does something in it.
In Bash:
current_dir=`pwd` # get the current directory
echo "$current_dir"
cd /some/path # change to a directory
In Python:
import os
current_dir = os.getcwd() # get the current directory
print(current_dir)
os.chdir('/some/path') # change to a directory
Working with Signals
In Bash, the kill command is built in, which is why man kill displays help for a completely different command with different arguments. Incidentally, sudo kill will already invoke the external kill utility. Still, the Python code is slightly clearer.
In Bash:
usr1_handler() {
    echo "Received USR1 signal"
}

# Set a handler for the SIGUSR1 signal:
trap 'usr1_handler' USR1

# Send a signal to the current interpreter:
kill -USR1 $$ # $$ is the PID of the current interpreter
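In Python, the equivalent is done with the standard signal module; a sketch:

```python
import os
import signal

def usr1_handler(signum, frame):
    print('Received USR1 signal')

# Set a handler for the SIGUSR1 signal:
signal.signal(signal.SIGUSR1, usr1_handler)

# Send the signal to the current process:
os.kill(os.getpid(), signal.SIGUSR1)
```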
Compilation Capability
Bash by definition does not compile its scripts, which is perhaps why everything in it strives for minimalism in names. Python, although interpreted, is compiled into platform-independent bytecode executed by the Python Virtual Machine (PVM). Caching this bytecode can speed up script startup. Bytecode files usually have the .pyc extension.
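Bytecode compilation can also be triggered explicitly with the standard py_compile module (or compileall for whole directories); a sketch that compiles a throwaway script:

```python
import os
import py_compile
import tempfile

# Create a small script and compile it to a .pyc file:
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, 'hello.py')
    with open(src, 'w') as fd:
        fd.write("print('hello')\n")
    # compile() returns the path to the generated bytecode file,
    # by default inside a __pycache__ subdirectory:
    pyc_path = py_compile.compile(src)
    print(pyc_path.endswith('.pyc'))  # True
```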
Choosing a Language Depending on the Task
As a summary of the article, the main postulates can be formed about which language is better to use in which cases.
Bash is more advantageous to use in cases:
- solving simple tasks that can be solved faster with good knowledge of the language;
- simple command-line scripts where work is done with processes, files, directories, or even hard drives and the file system;
- if wrappers are created over other commands (starting a command interpreter can be faster than starting the Python interpreter);
- if Python is not available in the system for some reason.
Python is more suitable for cases:
- solving tasks related to text processing, mathematical calculations, or implementing non-trivial algorithms;
- if Bash code would be difficult to read and understand;
- if you need to cover the code with unit tests (the unittest module);
- if you need to parse a large set of command-line parameters with a hierarchy of options between commands;
- if you need to display graphical dialog boxes;
- if script performance is critical (starting in Python may be slower, but executing code can be faster);
- for creating constantly running services (systemd services).