User Tools

Site Tools


python:exelprocess

Exel Processing

Function in Exel

Trước khi xử lý các hàm tìm kiếm Vlookup and Hlookup, If trong exel cần phải format dữ liệu và sort trước → Lúc đó hàm xử lý mới chạy đúng VLOOKUP($B2,omni!$A$1:$C$4778,3,FALSE) → Tìm chính xác nhờ tuỳ chọn FALSE

Install

  • Install on windows:
    easy_install.exe xlrd
    easy_install.exe xlwt
    easy_install.exe xlutils
  • Install on linux:
    pip install xlrd
    pip install xlwt
    pip install xlutils

Using xlrd

Basic APIs:

  • xlrd.open_workbook('adminup/shop.xlsx') return xlrd.book.Book object

Example codes:

Open Workbook and get sheets

  • Simple Open workbook
    import xlrd
     
    xlsfile = xlrd.open_workbook('adminup/shop.xlsx')
    print xlsfile

    output:

    <xlrd.book.Book object at 0x023263B0>
  • Get sheets:
    import xlrd
    from xlrd.book import Book
     
    xlsfile = xlrd.open_workbook('adminup/shop.xlsx')
    print xlsfile.sheets()

    output:

    [<xlrd.sheet.Sheet object at 0x02256530>, <xlrd.sheet.Sheet object at 0x022564D0>, <xlrd.sheet.Sheet object at 0x022569F0>]

Using sheet(xlrd.sheet.Sheet)

  • dump sheet to get basic informations
    import xlrd
    from xlrd.book import Book
    from xlrd.sheet import Sheet
     
    xlsfile = xlrd.open_workbook('adminup/shop.xlsx')
    for sheet in xlsfile.sheets():
        print '******************************'
        sheet.dump()

    output for sheet1:

    _cell_types: <type 'list'>, len = 8
    _cell_values: <type 'list'>, len = 8
    _dimncols: 4
    _dimnrows: 8
    _first_full_rowx: 3
    
    name: u'Th\xf4ng tin shop'
    ncols: 4
    nrows: 8
    number: 0
    

    output for sheet2:

    _cell_types: <type 'list'>, len = 6
    _cell_values: <type 'list'>, len = 6
    _dimncols: 3
    _dimnrows: 54
    _first_full_rowx: 2
    
    name: u'Danh m\u1ee5c'
    ncols: 3
    nrows: 6
    number: 1
  • Get sheet by name: Using http://www.rapidmonkey.com/unicodeconverter/ to convert utf-8 to ASCII(Unicode Escaped) Content
    import xlrd
    from xlrd.book import Book
    from xlrd.sheet import Sheet
     
    xlsfile = xlrd.open_workbook('adminup/shop.xlsx')
    sheet_info = xlsfile.sheet_by_name(u'Th\xf4ng tin shop')
    sheet_cat = xlsfile.sheet_by_name(u'Danh m\u1ee5c')
    sheet_products = xlsfile.sheet_by_name(u'S\u1EA3n ph\u1EA9m')

    Or

    import xlrd
    from xlrd.book import Book
    from xlrd.sheet import Sheet
     
    xlsfile = xlrd.open_workbook('adminup/shop.xlsx')
    sheet_info = None
    sheet_cat = None
    sheet_products = None
    for sheet in xlsfile.sheets():
        if sheet.name == u'Th\xf4ng tin shop':
            sheet_info = sheet
        elif sheet.name == u'Danh m\u1ee5c':
            sheet_cat = sheet
        elif sheet.name == u'S\u1EA3n ph\u1EA9m':
            sheet_products = sheet
    if sheet_info is not None:
        print sheet_info.dump()        

Parse xls file

import xlrd
from xlrd.book import Book
from xlrd.sheet import Sheet
nhapkhaufile = 'nhapkhau.xlsx'
xuatkhaufile = 'xuatkhau.xlsx'
constantsfile = 'constants.xlsx'
def parsexls(filename):
    xlsfile = xlrd.open_workbook(filename)
    for sheet in xlsfile.sheets():
        n_cells = sheet.ncols             
        for curr_row in range(sheet.nrows):
            curr_cell = 0
            while curr_cell < n_cells:
                cell_type = sheet.cell_type(curr_row, curr_cell)
                if cell_type == xlrd.XL_CELL_EMPTY:
                    curr_cell += 1
                    continue
                value = sheet.cell_value(curr_row, curr_cell)
                curr_cell += 1
                print value
parsexls(constantsfile)  

Error Parse UTF-8 file

UnicodeDecodeError: 'utf16' codec can't decode bytes in position 3438-3439: unexpected end of data

⇒ Check exel format in encoding(not utf-8)(May be exel file was not stored as encoding utf-8, It can be saved as utf16 default encoding) You can debug for information:

xlsfile = xlrd.open_workbook(filename,verbosity=3)

output:

BOF: op=0x0809 vers=0x0600 stream=0x0005 buildid=14420 buildyr=1997 -> BIFF80
CODEPAGE: codepage 1200 -> encoding 'utf_16_le'
DATEMODE: datemode 0

⇒ default encoding: 'utf_16_le'

Fix: Lỗi do file exel, mở lại file exel và lưu lại sẽ hết lỗi

Using xlwt

python/exelprocess.txt · Last modified: by 127.0.0.1