Check if a file exists in a directory with Python

Python offers several ways to check whether a file exists in a given directory. The check is often performed right before accessing (reading and/or writing) the file. Below we go through each method of checking whether a file exists (and whether it is accessible) and discuss some of the potential issues with each one.

1. os.path.isfile(path)

This function returns True if the given path is an existing, regular file. It follows symbolic links, so os.path.islink(path) and os.path.isfile(path) can both be True for the same path. This is a handy way to check if a file exists, because it's a simple one-liner. Unfortunately, the function only checks whether the specified path is a file; it does not guarantee that the user has access to it. It also only tells you that the file existed at the moment you called the function. It is possible (although unlikely) that between the time you called this function and the time you access the file, it has been deleted or moved/renamed.

For example, opening the file can still fail even though the check succeeded:

[python]
>>> import os
>>> os.path.isfile('foo.txt')
True
>>> f = open('foo.txt', 'r')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
PermissionError: [Errno 13] Permission denied: 'foo.txt'
[/python]
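The "regular file" part of the description can be seen in a minimal sketch. It assumes a throwaway temporary directory, and the variable names are illustrative rather than part of the example above:

```python
import os
import tempfile

# Build a throwaway directory so the checks below have known answers.
with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = os.path.join(tmp_dir, 'foo.txt')
    with open(file_path, 'w') as f:
        f.write('hello')

    file_is_file = os.path.isfile(file_path)   # an existing regular file
    dir_is_file = os.path.isfile(tmp_dir)      # exists, but is a directory
    missing_is_file = os.path.isfile(os.path.join(tmp_dir, 'missing.txt'))

print(file_is_file, dir_is_file, missing_is_file)
```

Only the first check is True: a directory exists but is not a regular file, and a missing path is simply reported as False rather than raising an exception.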

2. os.access(path, mode)

This function tests whether the current user (using the real uid/gid) has access to a given path. Use os.R_OK to test whether the file is readable and os.W_OK to test whether it is writable. For example:

[python]
>>> # Check for read access to foo.txt
>>> os.access('foo.txt', os.R_OK)
True # This means the file exists AND you can read it.
>>>
>>> # Check for write access to foo.txt
>>> os.access('foo.txt', os.W_OK)
False # You cannot write to the file. It may or may not exist.
[/python]
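Besides os.R_OK and os.W_OK, the os module also provides os.F_OK, which tests only for existence, and the mode flags can be combined with a bitwise OR. The following is a minimal sketch; describe_access is a hypothetical helper name and the temporary file stands in for foo.txt:

```python
import os
import tempfile

def describe_access(path):
    """Collect several os.access() checks for one path (hypothetical helper)."""
    return {
        'exists': os.access(path, os.F_OK),
        'readable': os.access(path, os.R_OK),
        'writable': os.access(path, os.W_OK),
        # Both flags at once: True only if reading AND writing are allowed.
        'read_and_write': os.access(path, os.R_OK | os.W_OK),
    }

with tempfile.NamedTemporaryFile() as tmp:
    existing = describe_access(tmp.name)

# The temporary file is deleted when the with block ends,
# so this path is guaranteed not to exist.
missing = describe_access(tmp.name + '.missing')

print(existing)
print(missing)
```

Every check on the missing path comes back False, which is why a False result alone cannot tell you whether the file is absent or merely inaccessible.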

If you are planning on accessing a file, using this function is somewhat safer (although not completely recommended) because it also checks whether you can access (read or write) the file. However, it is possible (although unlikely) that between the time you check that the file is accessible and the time you access it, the file has been deleted or moved/renamed. This is known as a race condition, and should be avoided. The following is an example of how it can happen.

[python]
>>> # The file 'foo.txt' currently exists and is readable.
>>> if os.access('foo.txt', os.R_OK):
...     # After executing os.access() and before open(),
...     # another program deletes the file.
...     f = open('foo.txt', 'r')
...
Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
FileNotFoundError: [Errno 2] No such file or directory: 'foo.txt'
[/python]

3. os.stat()

In addition to the file existence check, the os.stat() function provides a powerful way to retrieve detailed information about a file in Python. This function allows you to access a wide range of file attributes, including permissions, size, ownership, timestamps, and more.

While it offers comprehensive file details, it requires additional setup and error handling compared to some of the more straightforward methods.

The os.stat() function operates on a specified file path and returns a stat_result object that encapsulates various file attributes. Here's an example showcasing the usage of os.stat():

[python]
import os

path_to_file = '/the/path/to/file.txt'
try:
    file_details = os.stat(path_to_file)
    print('This file exists.')
    # Accessing specific attributes
    print(f"File size: {file_details.st_size} bytes")
    print(f"File permissions: {file_details.st_mode}")
    print(f"Last modified: {file_details.st_mtime}")
    # ... and more attributes available in the 'stat_result' object
except OSError:
    # Raised when the path does not exist or cannot be accessed
    print('The file does not exist or is inaccessible.')
[/python]

In the example above, os.stat() is used to retrieve information about the file specified by path_to_file. The stat_result object, file_details, contains various attributes that can be accessed using dot notation.

Here, we demonstrate accessing the file size (st_size), permissions (st_mode), and the last modified timestamp (st_mtime). Of course, you can skip accessing these attributes entirely and still use os.stat().

It's important to note that os.stat() may raise an OSError if the file is inaccessible or does not exist. Thus, enclosing the os.stat() call within a try-except block is crucial for proper error handling, as demonstrated in the code above.
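Because OSError covers several different failures, it can help to catch its subclasses separately and to confirm that the path is a regular file. The sketch below is one way to do that; classify_path is a hypothetical helper name, and the temporary directory is only there to give the checks known answers:

```python
import os
import stat
import tempfile

def classify_path(path):
    """Describe what os.stat() reveals about a path (hypothetical helper)."""
    try:
        details = os.stat(path)
    except FileNotFoundError:
        return 'missing'
    except PermissionError:
        return 'permission denied'
    except OSError as error:
        return f'error: {error}'
    if stat.S_ISREG(details.st_mode):
        # stat.filemode() renders st_mode in the style of `ls -l`.
        return f'regular file ({stat.filemode(details.st_mode)})'
    return 'not a regular file'

with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = os.path.join(tmp_dir, 'file.txt')
    open(file_path, 'w').close()
    file_status = classify_path(file_path)
    dir_status = classify_path(tmp_dir)
    missing_status = classify_path(os.path.join(tmp_dir, 'missing.txt'))

print(file_status)
print(dir_status)
print(missing_status)
```

FileNotFoundError and PermissionError are both subclasses of OSError, so they must be listed before the general OSError handler.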

4. os.listdir()

As its name suggests, os.listdir() returns a list of the names of the files and directories inside the specified directory. You can also use it to verify whether a specific file exists.

To do so, search the function's output (the list of file and directory names) for the name of the file you want to check.
Here's the function in action:

[python]
import os

path_to_file = '/the/path/to/file.txt'
directory, name_of_file = os.path.split(path_to_file)
if name_of_file in os.listdir(directory):
    print('The function found that the file exists.')
else:
    print('The file is not in the specified path.')
[/python]

When the condition name_of_file in os.listdir(directory) evaluates to True, it indicates the file exists. Conversely, if the file is not found, the condition evaluates to False, indicating that the file does not exist in the specified path.

It's important to note that os.listdir() returns only the names of files and directories present within the specified directory, not their full paths. Therefore, we extract the directory path separately using os.path.split() to compare the file name against the list of names returned by os.listdir().
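Two edge cases are worth handling when using this approach: os.listdir() raises an exception if the directory itself is missing, and os.path.split() returns an empty directory string for a bare file name. The following sketch covers both; name_in_directory is a hypothetical helper name:

```python
import os
import tempfile

def name_in_directory(path_to_file):
    """Return True if the final path component is listed in its directory."""
    directory, name_of_file = os.path.split(path_to_file)
    try:
        # os.path.split() returns '' for a bare file name,
        # so fall back to the current directory in that case.
        names = os.listdir(directory or '.')
    except (FileNotFoundError, NotADirectoryError):
        # The directory itself does not exist (or is not a directory).
        return False
    return name_of_file in names

with tempfile.TemporaryDirectory() as tmp_dir:
    present = os.path.join(tmp_dir, 'file.txt')
    open(present, 'w').close()
    found_present = name_in_directory(present)
    found_missing = name_in_directory(os.path.join(tmp_dir, 'missing.txt'))
    found_in_missing_dir = name_in_directory(
        os.path.join(tmp_dir, 'no_such_dir', 'file.txt'))

print(found_present, found_missing, found_in_missing_dir)
```

Note that this matches any directory entry with that name, including subdirectories, so it answers "is something with this name here?" rather than "is this a regular file?".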

5. Using the glob Module

Globbing refers to the process of specifying sets of directories and files with wildcard characters. Python's standard library includes the glob module, which lets you search for files using these "globbing" patterns.

To find a file with this module, you must specify its globbing pattern. Of course, you must first import the module.

You can then use the glob.glob() function to search for files matching the pattern. The function returns a list of the paths of the files and directories that match. At this stage, all you have to do is use an if statement to see whether the returned list is empty.

[python]
import glob

# Defining globbing pattern
pattern = '/path/to/*.txt'
files = glob.glob(pattern)
if files:
    print('A file matching the globbing pattern was found.')
else:
    print('No file matching the globbing pattern was found.')
[/python]
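One caveat when using glob for an exact file name: characters such as [ and ] are treated as wildcards, so a literal name containing them may not match itself. The standard library's glob.escape() addresses this. The sketch below assumes a temporary directory and made-up file names purely for illustration:

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    for name in ('a.txt', 'b.txt', 'notes[1].md'):
        open(os.path.join(tmp_dir, name), 'w').close()

    # Escape the directory so any special characters in it are taken literally.
    safe_dir = glob.escape(tmp_dir)

    # Wildcard search: every .txt file in the directory.
    txt_names = sorted(
        os.path.basename(p) for p in glob.glob(os.path.join(safe_dir, '*.txt')))

    # Unescaped, '[1]' is a character class that matches the single character '1'.
    unescaped_matches = glob.glob(os.path.join(safe_dir, 'notes[1].md'))

    # Escaped, the brackets are matched literally.
    escaped_matches = glob.glob(
        os.path.join(safe_dir, glob.escape('notes[1].md')))

print(txt_names)
print(len(unescaped_matches), len(escaped_matches))
```

The unescaped pattern looks for a file named notes1.md and finds nothing, while the escaped pattern finds the file that is actually there.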

6. Using the shutil Module

The shutil module is another standard library module in Python that enables its users to perform high-level operations such as moving, copying, and deleting files.

This is the right module to rely on if you want to check whether an executable exists in one of the directories listed in your PATH variable.

When you provide a filename to the shutil.which() function, it scans for that file within the directories specified in the PATH environment variable. It provides the path to the executable file if it discovers it and returns None if it is not found.

Let's see the function in action:

[python]
import shutil

# Check if an executable file exists
executable_name = 'python'
if shutil.which(executable_name):
    print('The executable file exists.')
else:
    print('The executable file does not exist.')
[/python]
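Since shutil.which() returns the full path rather than just a Boolean, it can also be used to pick the first available executable from several candidates. The sketch below is one way to do that; first_available and the program names are hypothetical and chosen only for illustration:

```python
import shutil

def first_available(names, path=None):
    """Return the full path of the first name found on PATH, or None."""
    for name in names:
        # path=None tells shutil.which() to search the PATH environment variable.
        found = shutil.which(name, path=path)
        if found is not None:
            return found
    return None

# Which interpreter name resolves depends on the machine running this.
interpreter = first_available(['python3', 'python'])
nothing = first_available(['no-such-program-for-this-example'])

print('Interpreter:', interpreter)
print('Missing program:', nothing)
```

The optional path argument accepts a string of directories in the same format as PATH, which makes it possible to search somewhere other than the environment's default.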

7. Using the subprocess Module

The subprocess module in Python allows you to run programs outside of Python and capture their output.

While it is not specifically designed for file existence checks, you can leverage its functionality by executing commands that help determine file existence.

After importing the module, you must define the name of the file you're trying to check, as well as the ls command and its arguments.

You can then use the subprocess.run() function to run the command and capture its output. The function returns a CompletedProcess object whose stdout attribute holds the command's output. You can then look for your file's name in that output with the in operator.

Let's see this method in action:

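[python]
import subprocess

name_of_file = 'file.txt'
command = ['ls', '/path/to/']
# text=True decodes stdout to str so it can be compared with a str file name
output = subprocess.run(command, capture_output=True, text=True)
# Compare against whole lines so 'file.txt' does not match 'my_file.txt'
if name_of_file in output.stdout.splitlines():
    print('The ls command found the file via subprocess module')
else:
    print('The ls command did not find the file via subprocess module')
[/python]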
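An alternative to searching the output is to pass the full path to ls and inspect its exit status, since ls exits with a non-zero status when the path does not exist. This is a rough sketch that assumes a POSIX system with ls available; exists_via_ls is a hypothetical helper name:

```python
import os
import subprocess
import tempfile

def exists_via_ls(path):
    """Return True if `ls` can find the path (assumes a POSIX `ls`)."""
    result = subprocess.run(
        # -d lists a directory itself rather than its contents;
        # -- stops option parsing so paths starting with '-' are safe.
        ['ls', '-d', '--', path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # A zero exit status means ls found the path.
    return result.returncode == 0

with tempfile.TemporaryDirectory() as tmp_dir:
    present = os.path.join(tmp_dir, 'file.txt')
    open(present, 'w').close()
    found_present = exists_via_ls(present)
    found_missing = exists_via_ls(os.path.join(tmp_dir, 'missing.txt'))

print(found_present, found_missing)
```

Starting a separate process for every check is much slower than the os and pathlib functions, so this approach is mainly of interest when a script already shells out for other reasons.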

8. Using the pathlib Module

The pathlib module is a valuable addition to Python's standard library, introduced in Python 3.4.

It offers an object-oriented and consistent interface for working with file paths, making code more readable and concise. Further, it comprises methods that enable file system operations, including deletion and renaming.

So, with pathlib, you can conveniently manipulate file paths and check the existence of files without relying on external libraries.

Using this method to verify a file's existence involves defining the file's path and creating a Path object for it. Of course, all this can be done only after importing the module.

Next, you can use the Path.exists() method to verify the file's existence. If the path exists, the method returns the Boolean True. Note that exists() also returns True for directories; Path.is_file() is the stricter check for regular files. As in the examples discussed earlier, you can use an if statement to handle the result:

[python]
from pathlib import Path

path_to_file = '/path/to/file.txt'
path = Path(path_to_file)
if path.exists():
    print('The file exists')
else:
    print('The file does not exist')
[/python]
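The difference between Path.exists(), Path.is_file(), and Path.is_dir() can be seen in a short sketch. It assumes a temporary directory, and the file names are illustrative:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    directory = Path(tmp_dir)
    file_path = directory / 'file.txt'   # the / operator joins path components
    file_path.write_text('hello')
    missing_path = directory / 'missing.txt'

    checks = {
        'directory.exists()': directory.exists(),
        'directory.is_file()': directory.is_file(),
        'directory.is_dir()': directory.is_dir(),
        'file_path.exists()': file_path.exists(),
        'file_path.is_file()': file_path.is_file(),
        'missing_path.exists()': missing_path.exists(),
    }

for label, result in checks.items():
    print(f'{label}: {result}')
```

If the goal is specifically "is there a regular file at this path?", is_file() is usually the better fit, because exists() is also satisfied by a directory.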

9. Attempting to access (open) the file

To guarantee that the file not only exists but is also accessible right now, the simplest method is to actually attempt to open it.

[python]
try:
    f = open('foo.txt')
    f.close()
except OSError:
    # IOError is an alias of OSError in Python 3
    print('Uh oh!')
[/python]

This can be turned into an easy-to-use function, as follows.

[python]
def file_accessible(filepath, mode):
    ''' Check if a file exists and is accessible. '''
    try:
        f = open(filepath, mode)
        f.close()
    except OSError:
        # Covers a missing file, denied permissions, and other I/O errors
        return False

    return True
[/python]

For example, you can use it as follows:

[python]
>>> # Say the file 'foo.txt' exists and is readable,
>>> # whereas the file 'bar.txt' doesn't exist.
>>> file_accessible('foo.txt', 'r')
True
>>>
>>> file_accessible('bar.txt', 'r')
False
[/python]
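In practice, the check and the actual use of the file are often combined, so there is no gap between them in which the file could disappear. The following sketch shows one way to do that with a with statement, which also closes the file automatically; try_read is a hypothetical helper name:

```python
import os
import tempfile

def try_read(filepath):
    """Return (contents, None) on success or (None, reason) on failure."""
    try:
        # The with statement closes the file even if reading fails.
        with open(filepath, 'r') as f:
            return f.read(), None
    except FileNotFoundError:
        return None, 'missing'
    except PermissionError:
        return None, 'permission denied'
    except OSError as error:
        return None, f'other error: {error}'

with tempfile.TemporaryDirectory() as tmp_dir:
    present = os.path.join(tmp_dir, 'foo.txt')
    with open(present, 'w') as f:
        f.write('hello')
    present_result = try_read(present)
    missing_result = try_read(os.path.join(tmp_dir, 'bar.txt'))

print(present_result)
print(missing_result)
```

Catching FileNotFoundError and PermissionError separately lets the caller tell a missing file apart from one that exists but cannot be read.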

So... which is best?

Which method you should use depends on why you need to check whether a file exists, whether speed matters, and how many files you are trying to open at any given time. In many cases os.path.isfile() should suffice. However, keep in mind that each method has its own benefits and potential issues.

It's important to note that several of these methods (such as os.stat(), os.listdir(), glob, shutil.which(), and subprocess) primarily serve other purposes, such as retrieving file statistics, listing directories, or locating executables. While they can indirectly indicate that a file exists, they aren't designed specifically to verify a file's existence.
