Move pairs of files (.txt & .xml) into their corresponding folder using Python

1.2k Views Asked by At

I have been working this challenge for about a day or so. I've looked at multiple questions and answers asked on SO and tried to 'MacGyver' the code used for my purpose, but still having issues.

I have a directory (lets call it "src\") with hundreds of files (.txt and .xml). Each .txt file has an associated .xml file (let's call it a pair). Example:

src\text-001.txt
src\text-001.xml
src\text-002.txt
src\text-002.xml
src\text-003.txt
src\text-003.xml

Here's an example of how I would like it to turn out so each pair of files are placed into a single unique folder:

src\text-001\text-001.txt
src\text-001\text-001.xml
src\text-002\text-002.txt
src\text-002\text-002.xml
src\text-003\text-003.txt
src\text-003\text-003.xml

What I'd like to do is create an associated folder for each pair and then move each pair of files into its respective folder using Python. I've already tried working from code I found (thanks to a post from Nov '12 by Sethdd, but am having trouble figuring out how to use the move function to grab pairs of files. Here's where I'm at:

import os
import shutil

srcpath = "PATH_TO_SOURCE"
srcfiles = os.listdir(srcpath)

destpath = "PATH_TO_DEST"

# grabs the name of the file before extension and uses as the dest folder name
destdirs = list(set([filename[0:9] for filename in srcfiles]))


def create(dirname, destpath):
    full_path = os.path.join(destpath, dirname)
    os.mkdir(full_path)
    return full_path

def move(filename, dirpath):
    shutil.move(os.path.join(srcpath, filename)
                ,dirpath)

# create destination directories and store their names along with full paths
targets = [
    (folder, create(folder, destpath)) for folder in destdirs
]

for dirname, full_path in targets:
    for filename in srcfile:
        if dirname == filename[0:9]:
            move(filename, full_path)

I feel like it should be easy, but Python isn't something I work with everyday and it's been a while since my scripting days... Any help would be greatly appreciated!

Thanks,

WK2EcoD

4

There are 4 best solutions below

0
On

Use the glob module to interate all of the 'txt' files. From that you can parse and create the folders and copy the files.

0
On

The process should be as simple as it appears to you as a human.

for file_name in os.listdir(srcpath):
    dir = file_name[:9]
    # if dir doesn't exist, create it
    # move file_name to dir

You're doing a lot of intermediate work that seems to be confusing you.

Also, insert some simple print statements to track data flow and execution flow. It appears that you have no tracing output so far.

0
On

You can do it with os module. For every file in directory check if associated folder exists, create if needed and then move the file. See the code below:

import os

SRC = 'path-to-src'

for fname in os.listdir(SRC):
    filename, file_extension = os.path.splitext(fname)
    if file_extension not in ['xml', 'txt']:
        continue
    folder_path = os.path.join(SRC, filename)
    if not os.path.exists(folder_path):
        os.mkdir(folderpath)
    os.rename(
        os.path.join(SRC, fname),
        os.path.join(folder_path, fname)
    )
0
On

My approach would be:

  1. Find the pairs that I want to move (do nothing with files without a pair)
  2. Create a directory for every pair
  3. Move the pair to the directory
#! /usr/bin/env python
# -*- coding: utf-8 -*-

import os, shutil
import re

def getPairs(files):
    pairs = []
    file_re = re.compile(r'^(.*)\.(.*)$')

    for f in files:
        match =  file_re.match(f)

        if match:
            (name, ext) = match.groups()

            if ext == 'txt'  and name + '.xml' in files:
                pairs.append(name)
    return pairs

def movePairsToDir(pairs):

    for name in pairs:
        os.mkdir(name)
        shutil.move(name+'.txt', name)
        shutil.move(name+'.xml', name)


files = os.listdir()
pairs = getPairs(files)

movePairsToDir(pairs)

NOTE: This script works when called inside the directory with the pairs.