Python script to inventory your ab1 files with md5sums

November 7, 2008 at 11:35 am (opensource, tips)

The problem that needed solving this time: I wanted a list of my ab1 files with each file's name, location (directory path) and md5sum, so I can tell whether files with duplicate names are really the same file or just the result of misnaming.

I managed to come up with this after copying from two different scripts: one that was used to make an inventory of a directory of ogg songs, and the other a Python equivalent of the md5sum check in Linux.

Have fun!

#!/usr/bin/env python3
#===============================================================================
#
#         FILE:  inventory-abi.py
#
#        USAGE:  ./inventory-abi.py
#
#  DESCRIPTION:  Lists all the files of extension .ab1 with the directory and its md5sum
#  adapted from code from http://pthree.org/2007/08/09/recursion-in-python/ and
#  used md5sum code from http://code.activestate.com/recipes/266486/
#      OPTIONS:  ---
# REQUIREMENTS:  Python 3
#         BUGS:  ---
#        NOTES:  ---
#       AUTHOR:  Kevin ,
#      VERSION:  1.0
#      CREATED:  11/07/2008 07:03:16 PM SGT
#     REVISION:  ---
#===============================================================================

import hashlib
import os
import sys

counter = 0

def sumfile(fobj):
    '''Returns an md5 hash for a binary object with read() method.'''
    m = hashlib.md5()
    while True:
        d = fobj.read(8096)  # read in chunks so large files are not loaded whole
        if not d:
            break
        m.update(d)
    return m.hexdigest()

def md5sum(fname):
    '''Returns an md5 hash for file fname, or stdin if fname is "-".'''
    if fname == '-':
        return sumfile(sys.stdin.buffer)
    try:
        with open(fname, 'rb') as f:
            return sumfile(f)
    except OSError:
        return 'Failed to open file'

def PrintFiles(thisDir, indent):
    '''Writes one line per .ab1 file or subdirectory under thisDir, recursively.'''
    global counter

    for name in sorted(os.listdir(thisDir)):
        if name.startswith('.'):
            continue  # skip hidden files and directories
        path = os.path.join(thisDir, name)

        if os.path.isdir(path):
            # directories are listed for structure only; no md5 is computed
            ab1File.write('%s%s\t%s\t\n' % (indent, name, thisDir))
            PrintFiles(path, indent + '  ')
        elif name.endswith('.ab1'):
            counter += 1
            ab1File.write('%s%s\t%s\t%s\n' % (indent, name, thisDir, md5sum(path)))

try:
    ab1File = open('ab1files.txt', 'w')
except OSError as e:
    print("Unable to open 'ab1files.txt' for writing: ", e)
else:
    # start from the current directory; paths are built with os.path, so no "pwd" call
    PrintFiles(os.getcwd(), '')
    ab1File.write('\nCurrent number of ab1 files: %d\n\n' % counter)
    ab1File.close()
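
Once ab1files.txt exists, the original question (are files that share a name really the same file?) can be answered by grouping the rows by filename and comparing checksums. Here is a minimal sketch in Python 3, assuming the tab-separated layout the script writes (indented name, directory, md5). The names find_name_clashes and report are hypothetical, not part of the script above.

```python
from collections import defaultdict

def find_name_clashes(inventory_lines):
    """Map each .ab1 filename seen more than once to {md5: [directories]}."""
    seen = defaultdict(lambda: defaultdict(list))
    for line in inventory_lines:
        parts = line.rstrip('\n').split('\t')
        if len(parts) != 3:
            continue  # blank lines and the trailing count line have no tabs
        name, directory, md5 = parts[0].strip(), parts[1], parts[2]
        if not name.endswith('ab1'):
            continue  # directory rows are only there to show structure
        if not md5 or md5 == 'Failed to open file':
            continue  # no usable checksum for this row
        seen[name][md5].append(directory)
    # keep only the names that turn up in more than one place
    return {name: dict(sums) for name, sums in seen.items()
            if sum(len(dirs) for dirs in sums.values()) > 1}

def report(inventory_path):
    """Print, for every repeated filename, whether the copies match."""
    with open(inventory_path) as inventory:
        clashes = find_name_clashes(inventory)
    for name, sums in sorted(clashes.items()):
        verdict = 'same content' if len(sums) == 1 else 'DIFFERENT content'
        print('%s: %s' % (name, verdict))
        for md5, dirs in sorted(sums.items()):
            print('  %s  %s' % (md5, ', '.join(dirs)))
```

Calling report('ab1files.txt') after running the inventory script prints one entry per repeated filename. A name with a single checksum is a true duplicate, while a name with several checksums is probably a misnamed file.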

Python script to split a text file by even or odd line numbers

June 20, 2008 at 11:50 am (opensource, software, tips)

I've written a short script to split a file into its even-numbered and odd-numbered lines 🙂

#!/usr/bin/env python3
## loop over each line of the input file
## write the even line numbers to one file
## and the odd line numbers to another
## note that line numbers start at 0 (not 1!), so the first line counts as even
## usage: sort-even-odd.py inputfile
##  written by kevinl @ kevinl.wordpress.com

import sys

def isodd(n):
    return bool(n % 2)

# 'with' closes all three files on exit, so the output is flushed to disk
with open(sys.argv[1], 'r') as infile, \
     open('evenout', 'w') as evenout, \
     open('oddout', 'w') as oddout:
    # enumerate() supplies the zero-based line number without reading the whole file first
    for linecount, line in enumerate(infile):
        if isodd(linecount):
            oddout.write(line)
        else:
            evenout.write(line)
        #print("line number is " + str(linecount))
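
The same split can also be written with list slicing, which makes the zero-based numbering explicit. This is a minimal sketch in Python 3, not the script's own method. The helper name split_even_odd is hypothetical, and it holds every line in memory at once.

```python
def split_even_odd(lines):
    """Return (even, odd) lines, counting the first line as line 0."""
    lines = list(lines)
    # step 2 from index 0 gives lines 0, 2, 4...; from index 1 gives 1, 3, 5...
    return lines[0::2], lines[1::2]

even, odd = split_even_odd(['a\n', 'b\n', 'c\n', 'd\n', 'e\n'])
print(even)  # ['a\n', 'c\n', 'e\n']
print(odd)   # ['b\n', 'd\n']
```

Because counting starts at 0, the first line of the file always lands in the even output.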
