I wrote some simple examples about handling excel with python in article Read Excel Table With Python.
Now let’s deal with a more complex task, to combine two different tables based on a common column.


Sheet0
 



Sheet1

Above two tables have the common column, 编号,the number of rows in sheet0 is more than 200, so it’s disgusting to copy and paste tables and create a new sheet manually. We need a better solution to get things done.
Python is a powerful tool to handle trivial stuff in daily life, we can use it to complete this task, generate the following result sheet.


 Sheet2

The python library openpyxl is very convenient to read/write xlsx/xlsm files, we will use it on this occasion, more implement details are in the following code snippet.

#coding=UTF-8

import openpyxl
fileName = "/Users/weiyang/Desktop/table1.xlsx"
resultSheetName = 'Sheet2'
wb = openpyxl.load_workbook( fileName )
sheet0 = wb.get_sheet_by_name( 'Sheet0' )
sheet1 = wb.get_sheet_by_name( 'Sheet1' )
if resultSheetName in wb.get_sheet_names():
    wb.remove_sheet( wb.get_sheet_by_name( resultSheetName ) )
sheet2 = wb.create_sheet( resultSheetName )

print openpyxl.__version__
print "size of sheet0 is:", sheet0.max_row, sheet0.max_column
print "size of sheet0 is:", sheet1.max_row, sheet1.max_column

# read info in sheet0
families = {}
for row in range( 8, sheet0.max_row+1 ):
    if sheet0.cell( row, 1 ).value != "" and sheet0.cell( row, 1).value != None :
        key = sheet0.cell( row, 1 ).value
        family = []
        person = []
        for col in range(2, 7):
            person.append( sheet0.cell( row, col ).value )
        family.append( person )
        row = row + 1
        while sheet0.cell( row, 1 ).value == None and sheet0.cell( row, 2 ).value != None:
            person = []
            for col in range(2, 6):
                person.append( sheet0.cell( row, col ).value )
            family.append( person )
            row = row + 1
        row = row - 1
        families[ key ] = family

# read info in sheet1
numberInfo = []
for row in range( 2, sheet1.max_row+1 ):
    info = []
    for col in range( 1, 4 ):
        info.append( sheet1.cell( row, col ).value )
    numberInfo.append( info )
    #numberInfo[ sheet1.cell( row, 3 ).value ] = info

# write keys in sheet2
Keys = []
for col in range(1, 4):
    Keys.append( sheet1.cell( 1, col ).value )
for col in range(1, 7):
    Keys.append( sheet0.cell( 7, col ).value )
for col in range(1, Keys.__len__()):
    sheet2.cell( 1, col ).value = Keys[col]

# write values in sheet2
sheet2Row = 2
for row in range( 0, numberInfo.__len__() ):
    info = numberInfo[row]
    key = info[2]
    family = families[key]
    #print family
    for col in range( 1, 4 ):
        sheet2.cell( sheet2Row, col ).value = info[col-1]
    for i in range( 0, family.__len__() ):
        person = family[i]
        rowInSheet2 = sheet2[sheet2Row]
        for index in range(0, person.__len__()):
            rowInSheet2[3+index].value = person[index]
        sheet2Row = sheet2Row+1

wb.save( fileName )

 

Categories: Python

0 Comments

Leave a Reply

Your email address will not be published. Required fields are marked *

You cannot copy content of this page