
How Can I Run A Python Script On Many Files To Get Many Output Files?

I am new to programming and I have written a script to extract text from a VCF file. I am using a Linux virtual machine running Ubuntu. I have run this script through the comma

Solution 1:

I would put the loop over the files inside the Python script itself. That lets you run it on other platforms too, and it doesn't add much code anyway.

import glob
import os

# Find all files ending in '.vcf'
for vcf_filename in glob.glob('*.vcf'):
    # Similar name with a different extension
    output_filename = os.path.splitext(vcf_filename)[0] + '.txt'

    # Open the input read-only; 'with' closes both files after each iteration
    with open(vcf_filename, 'r') as vcf_file, open(output_filename, 'w') as outputfile:
        # Process the data
        ...

To output the resulting files in a separate directory I would:

import glob
import os

output_dir = 'processed'
os.makedirs(output_dir, exist_ok=True)

# Find all files ending in '.vcf'
for vcf_filename in glob.glob('*.vcf'):
    # Similar name with a different extension, placed in the output directory
    output_filename = os.path.splitext(vcf_filename)[0] + '.txt'
    output_path = os.path.join(output_dir, output_filename)

    # Open the input read-only; 'with' closes both files after each iteration
    with open(vcf_filename, 'r') as vcf_file, open(output_path, 'w') as outputfile:
        # Process the data
        ...

Solution 2:

You don't need to write a shell script. Maybe this question will help you:

How to list all files of a directory?
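As a minimal sketch of the kind of approach that question covers, the files in a directory can be listed with os.listdir and filtered by extension. The helper name list_files_with_extension and its parameters are hypothetical, chosen only for illustration.

```python
import os


def list_files_with_extension(directory, extension):
    """Return the sorted names of regular files in `directory` ending in `extension`.

    Hypothetical helper for illustration; not part of the original answer.
    """
    return sorted(
        name
        for name in os.listdir(directory)
        # Compare case-insensitively so '.VCF' also matches '.vcf'
        if name.lower().endswith(extension.lower())
        # Skip directories that happen to have a matching name
        and os.path.isfile(os.path.join(directory, name))
    )


if __name__ == '__main__':
    # Print the VCF files found in the current directory
    for name in list_files_with_extension('.', '.vcf'):
        print(name)
```

Each returned name can then be opened and processed in a loop, with the output name derived from the input name.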

Solution 3:

It depends on where you want to implement the iteration logic.

  1. If you want to implement it in Python, do the looping inside the script itself.

  2. If you want to implement it in a shell script, change your Python script to accept parameters, then have the shell script call the Python script with the appropriate parameters.
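The second option could look roughly like the sketch below, which uses argparse to accept file paths on the command line. The function names and the input_paths argument are assumptions made for illustration, and process() is only a placeholder that copies text through unchanged; the real extraction logic would go there.

```python
import argparse
import os


def process(input_path, output_path):
    """Placeholder step (an assumption): copy the text through unchanged."""
    with open(input_path, 'r') as infile, open(output_path, 'w') as outfile:
        for line in infile:
            outfile.write(line)


def main(argv=None):
    parser = argparse.ArgumentParser(
        description='Write one .txt output file per input file.')
    # nargs='*' accepts any number of paths, e.g. the shell's expansion of *.vcf
    parser.add_argument('input_paths', nargs='*', help='files to process')
    args = parser.parse_args(argv)

    output_paths = []
    for input_path in args.input_paths:
        # Same base name, different extension
        output_path = os.path.splitext(input_path)[0] + '.txt'
        process(input_path, output_path)
        output_paths.append(output_path)
    return output_paths


if __name__ == '__main__':
    main()
```

If this were saved as extract.py (a hypothetical name), the shell could call it once with all files, as in python extract.py *.vcf, or once per file from a shell for loop.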

Solution 4:

I have a script I frequently use which includes using PyQt5 to pop up a window that prompts the user to select a file... then it walks the directory to find all of the files in the directory:

# Figure out the pathname by finding the last '/'
pathname = first_fname[:(first_fname.rfind('/') + 1)]

# Make a new pathname to be added to the names of new files so that they're
# put in another directory (their names will also be altered)
new_pathname = pathname + 'for release/'

# Make a list of the files in the directory that end in .xls and don't have
# key words in their names that would indicate they're not the kind of file I want
file_list = [f for f in os.listdir(pathname)
             if f.lower().endswith('.xls')
             and 'map' not in f.lower()
             and 'check' not in f.lower()]

You need to import os to use the os.listdir function.
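Because the snippet finds the directory by searching for '/', it assumes POSIX-style paths. A possible variant, sketched here under the assumption that first_fname is the path of the file the user selected, uses os.path so the separator is handled per platform. The helper name list_xls_files and its excluded_words parameter are hypothetical.

```python
import os


def list_xls_files(first_fname, excluded_words=('map', 'check')):
    """Return (pathname, new_pathname, file_list) for the directory of `first_fname`.

    Hypothetical helper; it applies the same .xls and key-word filtering as
    the snippet it follows.
    """
    # os.path.dirname splits on the platform's own path separator
    pathname = os.path.dirname(os.path.abspath(first_fname))
    # Directory for the new files, built without hard-coding '/'
    new_pathname = os.path.join(pathname, 'for release')
    file_list = [
        f for f in os.listdir(pathname)
        if f.lower().endswith('.xls')
        and not any(word in f.lower() for word in excluded_words)
    ]
    return pathname, new_pathname, file_list
```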

Solution 5:

You can use os.listdir (you need to write a condition to filter for the particular extension) or glob. I generally prefer glob. For example:

import os
import glob
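for filename in glob.glob('*.py'):
    # Same base name with a .txt extension
    output_name = os.path.splitext(filename)[0] + '.txt'
    # Open the input read-only; 'with' closes both files when the block ends
    with open(filename, 'r') as data, open(output_name, 'w') as output:
        output.write(data.read())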

This code reads the content of each input file and writes it to an output file with the same base name and a .txt extension.
