So you want to know how to copy a file in Python? Good! It's a useful thing to learn, since most complex applications you design will need to copy files in some form.

Copying a Single File in Python

Alright, let's get started. This first section will describe how to copy a single file (not a directory) to another location on the hard disk.

Python has a standard library module called shutil for simple, high-level file operations, and it is the natural choice for copying single files.

Here's an example of a function that will copy a single file to a destination file or folder (with error handling/reporting):
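
A minimal sketch of such a function, assuming Python 3. The name copy_file is hypothetical, and errors are reported by printing, as this section describes:

```python
import shutil


def copy_file(src, dest):
    """Copy src to dest (a file path or an existing folder).

    Returns True on success and False if the copy failed.
    """
    try:
        shutil.copy(src, dest)
    except shutil.SameFileError as e:
        # Source and destination refer to the same file
        print(f"Error: {e}")
        return False
    except OSError as e:
        # Source missing, destination folder missing, no permission, ...
        print(f"Error: {e}")
        return False
    return True
```

Returning a boolean is a design choice for this sketch; a real application might prefer to let the exception propagate.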

And that's it! We just call that function, and the file is copied. If the source file or the destination folder doesn't exist, we print an error notifying the user that the operation has failed. If the source and destination are the same file, we don't copy anything and notify the user of the failed operation.

Python shutil's Different Copy Methods

The shutil module has several functions for copying files besides the simple copy function that we have seen above.

I'll go over them in some detail here, explaining the differences between them and situations where we might need them.

shutil.copyfileobj(fsrc, fdst[, length])

This function copies data between file objects rather than between paths. If you have already opened a source file for reading and a destination file for writing with the built-in open function, use shutil.copyfileobj. It is also the function to use when you need to specify the buffer length of the copy operation. When copying large files, increasing the buffer length from its default (16 KB in older Python versions; 64 KB, or 1 MB on Windows, since Python 3.8) may speed up the copy.
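
As a small illustration of working with file objects directly, here is a sketch that copies between two in-memory objects. Using io.BytesIO instead of files from open is an assumption made to keep the example self-contained; the 32 KB buffer is an arbitrary choice:

```python
import io
import shutil

# Any readable/writable file-like objects work, not only files from open()
source = io.BytesIO(b"x" * 100_000)
target = io.BytesIO()

# The third argument is the buffer length in bytes (here 32 KB)
shutil.copyfileobj(source, target, 32 * 1024)

copied = target.getvalue()
```

Note that copyfileobj starts reading from the current position of the source object, so a file that has already been partly read is only copied from that point on.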

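In older Python versions, all of the other copy functions mentioned below call this function at some point, which makes it the "base" copy method. Since Python 3.8 they may instead use faster platform-specific system calls where those are available.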

Let's look at a benchmark that copies a file using buffer sizes of 50 KB, 100 KB, 500 KB, 1 MB, 10 MB, and 100 MB. The test file is a 3.2 GB ISO image.

We'll be using this function to specify the buffer size:
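
A minimal sketch of such a function. The name copy_with_buffer and the 16 KB default are assumptions for illustration:

```python
import shutil


def copy_with_buffer(src, dst, buffer_size=16 * 1024):
    """Copy src to dst, moving buffer_size bytes at a time."""
    # Open both files in binary mode so the bytes are copied unchanged
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        shutil.copyfileobj(fsrc, fdst, buffer_size)
```

For the 1 MB run, for example, the call would be copy_with_buffer(src, dst, 1024 * 1024).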

We'll time each run with the time command built into Bash on Ubuntu.

Here are the results:

50 KB: 29.539s
100 KB: 27.423s
500 KB: 25.245s
1 MB: 26.261s
10 MB: 25.521s
100 MB: 24.886s

As you can see, the buffer size makes quite a difference. Going from a 50 KB buffer to a 100 MB buffer cut the copy time by almost 16%, although the improvement is not strictly steady: the 1 MB run was slightly slower than the 500 KB run.
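
The percentage can be checked directly from the timings listed above. This small sketch uses the 50 KB run as the baseline; the variable names are illustrative:

```python
# Timings in seconds, taken from the results listed above
timings = {
    "50 KB": 29.539,
    "100 KB": 27.423,
    "500 KB": 25.245,
    "1 MB": 26.261,
    "10 MB": 25.521,
    "100 MB": 24.886,
}

baseline = timings["50 KB"]

# Percentage of time saved relative to the 50 KB run, rounded to one decimal
savings = {
    label: round((baseline - seconds) / baseline * 100, 1)
    for label, seconds in timings.items()
}
```

For the 100 MB buffer this gives a saving of about 15.8%. These are single runs, so small differences between neighbouring sizes may simply be noise.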

The optimal buffer size ultimately depends on the amount of RAM you have available as well as the file size.

shutil.copyfile(src, dst)

This function copies the contents of the file at the source path, src, to the destination path, dst. It differs from copy in that dst must be a complete file path whose parent directory already exists. For example, '/home/' would be invalid because it's the name of a directory, while '/home/test.txt' would be valid because it contains a file name.
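
A short sketch of that difference, run inside a temporary directory. The file and folder names are made up for the example:

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "test.txt")
    with open(src, "w") as f:
        f.write("hello")

    target_dir = os.path.join(tmp, "backup")
    os.mkdir(target_dir)

    # A full destination file path works; copyfile returns that path
    good = shutil.copyfile(src, os.path.join(target_dir, "test.txt"))
    copied_ok = os.path.isfile(good)

    # A bare directory as destination is rejected with an OSError subclass
    try:
        shutil.copyfile(src, target_dir)
        dir_rejected = False
    except OSError:
        dir_rejected = True
```

The exact exception type depends on the platform, which is why the sketch catches the general OSError.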

shutil.copy(src, dst)

The copy function that we used above checks whether the destination is an existing directory. If it is, copy keeps the original file name and places the new file inside that directory. It also copies the permission bits to the destination file.

You would use this function if you are uncertain of the destination path format or if you'd like to copy the permission bits of the source file.
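
A sketch of both destination forms, again in a temporary directory with made-up names:

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "report.txt")
    with open(src, "w") as f:
        f.write("quarterly numbers")

    target_dir = os.path.join(tmp, "archive")
    os.mkdir(target_dir)

    # Destination is a directory: the original file name is kept
    to_dir = shutil.copy(src, target_dir)

    # Destination is a file path: the copy gets the new name
    to_file = shutil.copy(src, os.path.join(target_dir, "renamed.txt"))

    names = sorted(os.listdir(target_dir))
```

In Python 3, copy returns the path of the newly created file, which is what to_dir and to_file hold here.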

shutil.copy2(src, dst)

This is the same as the copy function we used except it copies the file metadata with the file. The metadata includes the permission bits, last access time, last modification time, and flags.

You would use this function over copy if you want an almost exact duplicate of the file.
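
The difference is easy to see with the modification time. In this sketch the source file is given an old timestamp on purpose; the file names and the chosen timestamp are assumptions for illustration:

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "old.txt")
    with open(src, "w") as f:
        f.write("data")

    # Give the source an old access/modification time (a date in 2001)
    os.utime(src, (1_000_000_000, 1_000_000_000))

    plain = shutil.copy(src, os.path.join(tmp, "copy.txt"))
    exact = shutil.copy2(src, os.path.join(tmp, "copy2.txt"))

    src_mtime = os.path.getmtime(src)
    plain_mtime = os.path.getmtime(plain)   # time of the copy operation
    exact_mtime = os.path.getmtime(exact)   # carried over from the source
```

After this runs, exact_mtime should match src_mtime, while plain_mtime reflects the moment the copy was made.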

Comparison of Python File Copying Functions

Below we can see a comparison of shutil's file copying functions, and how they differ.

Function             Copies Metadata   Copies Permissions   Can Specify Buffer
shutil.copy          No                Yes                  No
shutil.copyfile      No                No                   No
shutil.copy2         Yes               Yes                  No
shutil.copyfileobj   No                No                   Yes

That's pretty much the end of copying files. I hope you benefited from this article, and I hope it was worth your time to learn a little bit of file operations in Python.

Until the next time!

Ciao!

  • jam

    Nice description.

    Do you know if copyfile and move are synchronous? I mean, is the next Python instruction executed only once the copy/move has finished? I had abnormal results with an external USB disk.
    Best regards

    • http://jacksonc.com Jackson Cooper

      Thanks!

      Yeah, they’re both synchronous. If you want to make them asynchronous, I’d say the easiest option would be to put them in a thread. But keep in mind that move() is usually very quick on the same filesystem (depends on the filesystem being used), because it doesn’t actually move the data, it just changes the pointer. move() is the same as renaming a file, also.

      • jean

        Thanks for the answer, but I don’t understand the problem I experienced when copying a great number of video files from an internal disk to an external one: some files were not copied. Is it a buffering problem in Ubuntu?
        But anyway, thanks a lot

        • rApeNB

          My English is poor. The problem may be disk I/O.
          I suggest you use time.sleep(0.5)
          or
          use context managers to open and write the files, like
          with open('src', 'rb') as f1:
              with open('desc', 'wb') as f2:
                  f2.write(f1.read())