User Tools

Site Tools


python:exelprocess

This is an old revision of the document!


Exel Processing

Function in Exel

Trước khi xử lý các hàm tìm kiếm Vlookup and Hlookup, If trong exel cần phải format dữ liệu và sort trước → Lúc đó hàm xử lý mới chạy đúng VLOOKUP($B2,omni!$A$1:$C$4778,3,FALSE) → Tìm chính xác nhờ tuỳ chọn FALSE

Install

  • Install on windows:
    easy_install.exe xlrd
    easy_install.exe xlwt
    easy_install.exe xlutils
  • Install on linux:
    pip install xlrd
    pip install xlwt
    pip install xlutils

Using xlrd

Basic APIs:

  • xlrd.open_workbook('adminup/shop.xlsx') return xlrd.book.Book object

Example codes:

Open Workbook and get sheets

  • Simple Open workbook
    import xlrd
     
    xlsfile = xlrd.open_workbook('adminup/shop.xlsx')
    print xlsfile

    output:

    <xlrd.book.Book object at 0x023263B0>
  • Get sheets:
    import xlrd
    from xlrd.book import Book
     
    xlsfile = xlrd.open_workbook('adminup/shop.xlsx')
    print xlsfile.sheets()

    output:

    [<xlrd.sheet.Sheet object at 0x02256530>, <xlrd.sheet.Sheet object at 0x022564D0>, <xlrd.sheet.Sheet object at 0x022569F0>]

Using sheet(xlrd.sheet.Sheet)

  • dump sheet to get basic informations
    import xlrd
    from xlrd.book import Book
    from xlrd.sheet import Sheet
     
    xlsfile = xlrd.open_workbook('adminup/shop.xlsx')
    for sheet in xlsfile.sheets():
        print '******************************'
        sheet.dump()

    output for sheet1:

    _cell_types: <type 'list'>, len = 8
    _cell_values: <type 'list'>, len = 8
    _dimncols: 4
    _dimnrows: 8
    _first_full_rowx: 3
    
    name: u'Th\xf4ng tin shop'
    ncols: 4
    nrows: 8
    number: 0
    

    output for sheet2:

    _cell_types: <type 'list'>, len = 6
    _cell_values: <type 'list'>, len = 6
    _dimncols: 3
    _dimnrows: 54
    _first_full_rowx: 2
    
    name: u'Danh m\u1ee5c'
    ncols: 3
    nrows: 6
    number: 1
  • Get sheet by name: Using http://www.rapidmonkey.com/unicodeconverter/ to convert utf-8 to ASCII(Unicode Escaped) Content
    import xlrd
    from xlrd.book import Book
    from xlrd.sheet import Sheet
     
    xlsfile = xlrd.open_workbook('adminup/shop.xlsx')
    sheet_info = xlsfile.sheet_by_name(u'Th\xf4ng tin shop')
    sheet_cat = xlsfile.sheet_by_name(u'Danh m\u1ee5c')
    sheet_products = xlsfile.sheet_by_name(u'S\u1EA3n ph\u1EA9m')

    Or

    import xlrd
    from xlrd.book import Book
    from xlrd.sheet import Sheet
     
    xlsfile = xlrd.open_workbook('adminup/shop.xlsx')
    sheet_info = None
    sheet_cat = None
    sheet_products = None
    for sheet in xlsfile.sheets():
        if sheet.name == u'Th\xf4ng tin shop':
            sheet_info = sheet
        elif sheet.name == u'Danh m\u1ee5c':
            sheet_cat = sheet
        elif sheet.name == u'S\u1EA3n ph\u1EA9m':
            sheet_products = sheet
    if sheet_info is not None:
        print sheet_info.dump()        

Parse xls file

import xlrd
from xlrd.book import Book
from xlrd.sheet import Sheet
nhapkhaufile = 'nhapkhau.xlsx'
xuatkhaufile = 'xuatkhau.xlsx'
constantsfile = 'constants.xlsx'
def parsexls(filename):
    xlsfile = xlrd.open_workbook(filename)
    for sheet in xlsfile.sheets():
        n_cells = sheet.ncols             
        for curr_row in range(sheet.nrows):
            curr_cell = 0
            while curr_cell < n_cells:
                cell_type = sheet.cell_type(curr_row, curr_cell)
                if cell_type == xlrd.XL_CELL_EMPTY:
                    curr_cell += 1
                    continue
                value = sheet.cell_value(curr_row, curr_cell)
                curr_cell += 1
                print value
parsexls(constantsfile)  

Error Parse UTF-8 file

UnicodeDecodeError: 'utf16' codec can't decode bytes in position

⇒ Check exel format in encoding(not utf-8)

Using xlwt

python/exelprocess.1618022457.txt.gz · Last modified: 2022/10/29 16:15 (external edit)