Overview

When working with an existing Python script, particularly a legacy one, or one that was meant to be used once and thrown away but grew into a business-critical application (yep, this happens), it is common to see extensive use of print or logging statements. Those statements can be spread across the code and often provide useful information about the status of the process while the script runs.

However, once you have finished working on a new script, or once the script output is no longer of interest, you most likely don't want to clutter the console with print/logging output (particularly if the script is part of a larger pipeline). Still, the information emitted can be worth keeping in a log.

Redirecting to a file

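Instead of removing each print statement (or switching from logging.info to logging.debug), it is possible to point sys.stdout at a file. The print calls will then write to a file on disk instead of the console. Logging calls are affected only if their handler writes to sys.stdout and was created after the redirection; by default, logging.basicConfig() sets up a handler that writes to sys.stderr.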

import sys

# keeping the original reference to the output destination
stdout = sys.stdout

print("Started script")

# redirecting the print statements to the file
f = open('log.txt', 'a')
sys.stdout = f

# main program execution, gets logged to a file
print("Getting work done")

# setting it to the original output destination
sys.stdout = stdout
f.close()

print("Finished script")

Now, when running the program, the output of the print() calls within the main program logic is appended to a file on disk (the file is opened in 'a' mode, so repeated runs add to it).

$ python3 program_print.py
Started script
Finished script
$ cat log.txt 
Getting work done
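
The save-and-restore steps above can also be expressed with contextlib.redirect_stdout from the standard library, which puts the original sys.stdout back even if the wrapped code raises an exception. This is a minimal sketch rather than a drop-in replacement; the helper name run_logged is hypothetical and not part of the original script.

```python
from contextlib import redirect_stdout


def run_logged(path):
    """Run the 'main' part of a script with print() output appended to path."""
    with open(path, "a") as log_file, redirect_stdout(log_file):
        # Everything printed inside this block goes to the file.
        print("Getting work done")
    # On leaving the block, sys.stdout is restored and the file is closed.
```

Calling run_logged('log.txt') between the "Started script" and "Finished script" print calls should behave like the example above.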

Redirecting to StringIO

It is also possible to use an io.StringIO object to capture everything that is written to stdout, either for the whole script or for only a portion of it.

import sys
from io import StringIO

print("Started script")

# to capture anything that will be written to the stdout
buf = StringIO()
stdout = sys.stdout
sys.stdout = buf

print('Getting work done')

sys.stdout = stdout
# collecting what has been written into a variable
captured = buf.getvalue()

print("Finished script\n")
print(captured)

Now, when running the program, the output of the print() calls within the main program logic is collected into a variable (which is printed here for examination, but can be used for any custom logging).

$ python3 program_stringio_var.py
Started script
Finished script

Getting work done
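
Logging calls are not captured this way unless a handler happens to write to sys.stdout. One way to collect log records in the same kind of buffer is to attach a logging.StreamHandler to the StringIO object directly. The sketch below assumes a dedicated logger; the logger name "demo", the format string and the helper name capture_logging are all illustrative.

```python
import logging
from io import StringIO


def capture_logging(logger_name="demo"):
    """Return (logger, buffer); records sent to logger end up in buffer."""
    buf = StringIO()
    handler = logging.StreamHandler(buf)  # write records to the buffer
    handler.setFormatter(logging.Formatter("%(levelname)s:%(message)s"))

    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.propagate = False  # keep records away from the root logger's handlers
    return logger, buf


logger, buf = capture_logging()
logger.info("Getting work done")
logger.debug("Not captured: below the INFO level")

# collecting what has been logged into a variable
captured = buf.getvalue()
```

As with the print() example, captured can then be printed, stored, or passed to any custom logging.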

Overriding the sys.stdout.write method

In both of the examples above, the text that was sent to stdout wasn't shown in the console (it was either written to a file or captured into a variable). However, it is sometimes useful to both print the output to the console and capture it in a variable. For this use case, we are essentially after what the tee command does on Linux (it reads stdin and writes it both to stdout and to a file). In Python, this can be achieved by replacing sys.stdout with an object whose write method forwards the text to several destinations.

import sys
from io import StringIO


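class StdOutTee:
    def __init__(self, *writers):
        # every object that should receive the text, e.g. a buffer and the real stdout
        self.writers = writers

    def write(self, text):
        for writer in self.writers:
            writer.write(text)

    def flush(self):
        # needed so that print(..., flush=True) keeps working
        for writer in self.writers:
            writer.flush()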


print("Started script")

# to capture anything that will be written to the stdout
buf = StringIO()
stdout = sys.stdout
sys.stdout = StdOutTee(buf, stdout)

print('Getting work done 1')
print('Getting work done 2')

sys.stdout = stdout
# collecting what has been written into a variable
captured = buf.getvalue()

print("Finished script\n")

print(captured)

Now, when running the program, the output of the print() calls within the main program logic is collected into a variable (which is printed here for examination, but can be used for any custom logging). This time, the output of every print() call is shown in the console as well.

$ python3 program_tee.py
Started script
Getting work done 1
Getting work done 2
Finished script

Getting work done 1
Getting work done 2

Buffering and flushing

When you run a Python program, the standard output (stdout) of its process is buffered. If stdout is connected to your active terminal, it is line-buffered: text may not be shown until a newline character (\n) is written. If stdout is redirected to some other target, such as a file or a pipe, it is block-buffered: text is written out only when the buffer fills up, when it is flushed explicitly, or when the program exits.
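
Whether stdout is attached to a terminal, and whether it is currently line-buffered, can be checked from inside a script. The following is a small sketch, assuming the stream is an io.TextIOWrapper, as sys.stdout normally is; describe_stream is a hypothetical helper name.

```python
import sys


def describe_stream(stream):
    """Report whether a text stream is a terminal and whether it is line-buffered."""
    return {
        "isatty": stream.isatty(),
        # TextIOWrapper exposes line_buffering; fall back to None for other objects.
        "line_buffering": getattr(stream, "line_buffering", None),
    }


if __name__ == "__main__":
    # Write the report to stderr so it stays visible when stdout is redirected.
    print(describe_stream(sys.stdout), file=sys.stderr)
```

Run directly in a terminal, this typically reports both values as True; with stdout redirected to a file, both are typically False.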

This program won't show anything in your Python console or terminal while the loop is running; the numbers appear all at once when the program exits and the buffer is flushed:

import time 
for i in range(5): 
    print(i, end=" ") 
    time.sleep(.2)

Adding a final print call (which by default has the newline character as its end parameter) makes the flush explicit, because the newline causes the buffered line to be shown; however, all the numbers are still printed at once (not one after another with a 0.2 second interval):

import time 
for i in range(5): 
    print(i, end=" ") 
    time.sleep(.2) 
print()

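To see each number as it is printed instead of waiting for the loop to complete, a commonly suggested approach is to change the stdout buffering with the stdbuf utility (the end parameter then has to be a newline character). Note, however, that stdbuf adjusts the buffering of C standard I/O streams, while Python 3 manages its own buffers in the io module, so the command below may have no effect on python3; the flush parameter and the -u option described further down are more dependable: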

$ stdbuf -oL python3 program.py > result.log

Alternatively, one can use the flush parameter of the print function:

import time 
for i in range(5): 
    print(i, flush=True) 
    time.sleep(2)

The script can then be run as follows (running tail -F result.log in another terminal will let you see the numbers appear in real time):

$ python3 std.py > result.log
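
On Python 3.7 and later, another option (not used in the original script) is to switch the stream to line buffering from inside the program with sys.stdout.reconfigure(line_buffering=True), so that each newline triggers a flush without passing flush=True to every print() call. A minimal sketch; enable_line_buffering is a hypothetical helper, and the guard is there because sys.stdout may have been replaced by an object that has no reconfigure method.

```python
import sys
import time


def enable_line_buffering(stream):
    """Switch a text stream to line buffering if it supports reconfigure()."""
    reconfigure = getattr(stream, "reconfigure", None)
    if reconfigure is None:
        return False  # e.g. a StringIO or a custom tee object
    reconfigure(line_buffering=True)
    return True


if __name__ == "__main__":
    enable_line_buffering(sys.stdout)
    for i in range(5):
        print(i)  # flushed at each newline, even when redirected to a file
        time.sleep(0.2)
```

This keeps the change in one place at the top of the script, at the cost of requiring a newline at the end of each chunk of output.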

A solution that does not involve flushing is to set the PYTHONUNBUFFERED environment variable to a non-empty string. When this environment variable is set, the stdout of the Python process is written to its destination in real time, whether that is the terminal, a file, or a pipe (which can be useful for tailing application logs, particularly inside a Docker container).

The same effect can also be achieved by passing the -u option to the interpreter:

$ python3 -u std.py > result.log
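
The effect of both switches can be observed from a parent process. The sketch below is an illustration rather than something from the original scripts: it starts a child interpreter with its stdout connected to a pipe and asks it to report sys.stdout.write_through, an attribute available on Python 3.7+ that tells whether the text layer passes writes straight through. The helper name child_write_through is hypothetical.

```python
import os
import subprocess
import sys

# The child only reports how its own stdout has been configured.
CHILD_CODE = "import sys; print(sys.stdout.write_through)"


def child_write_through(extra_args=(), unbuffered_env=False):
    """Run a child interpreter and return its reported write_through flag."""
    env = dict(os.environ)
    env.pop("PYTHONUNBUFFERED", None)  # start from a known baseline
    if unbuffered_env:
        env["PYTHONUNBUFFERED"] = "1"
    result = subprocess.run(
        [sys.executable, *extra_args, "-c", CHILD_CODE],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip() == "True"
```

Comparing child_write_through() with child_write_through(extra_args=("-u",)) and child_write_through(unbuffered_env=True) should show the flag switching on in the latter two cases.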

Happy printing!

