====== Exel Processing ====== ===== Function in Exel ===== Trước khi xử lý các hàm tìm kiếm Vlookup and Hlookup, If trong exel cần phải format dữ liệu và sort trước -> Lúc đó hàm xử lý mới chạy đúng VLOOKUP($B2,omni!$A$1:$C$4778,3,FALSE) -> Tìm chính xác nhờ tuỳ chọn **FALSE** ===== Install ===== * Install on windows: easy_install.exe xlrd easy_install.exe xlwt easy_install.exe xlutils * Install on linux: pip install xlrd pip install xlwt pip install xlutils ===== Using xlrd ===== Basic APIs: * **xlrd.open_workbook('adminup/shop.xlsx')** return **xlrd.book.Book** object Example codes: ==== Open Workbook and get sheets ==== * Simple Open workbook import xlrd xlsfile = xlrd.open_workbook('adminup/shop.xlsx') print xlsfile output: * Get sheets: import xlrd from xlrd.book import Book xlsfile = xlrd.open_workbook('adminup/shop.xlsx') print xlsfile.sheets() output: [, , ] ==== Using sheet(xlrd.sheet.Sheet) ==== * dump sheet to get basic informations import xlrd from xlrd.book import Book from xlrd.sheet import Sheet xlsfile = xlrd.open_workbook('adminup/shop.xlsx') for sheet in xlsfile.sheets(): print '******************************' sheet.dump() output for sheet1: _cell_types: , len = 8 _cell_values: , len = 8 _dimncols: 4 _dimnrows: 8 _first_full_rowx: 3 name: u'Th\xf4ng tin shop' ncols: 4 nrows: 8 number: 0 output for sheet2: _cell_types: , len = 6 _cell_values: , len = 6 _dimncols: 3 _dimnrows: 54 _first_full_rowx: 2 name: u'Danh m\u1ee5c' ncols: 3 nrows: 6 number: 1 * Get sheet by name: Using http://www.rapidmonkey.com/unicodeconverter/ to convert utf-8 to ASCII(Unicode Escaped) Content import xlrd from xlrd.book import Book from xlrd.sheet import Sheet xlsfile = xlrd.open_workbook('adminup/shop.xlsx') sheet_info = xlsfile.sheet_by_name(u'Th\xf4ng tin shop') sheet_cat = xlsfile.sheet_by_name(u'Danh m\u1ee5c') sheet_products = xlsfile.sheet_by_name(u'S\u1EA3n ph\u1EA9m') Or import xlrd from xlrd.book import Book from xlrd.sheet import Sheet xlsfile = xlrd.open_workbook('adminup/shop.xlsx') sheet_info = None sheet_cat = None sheet_products = None for sheet in xlsfile.sheets(): if sheet.name == u'Th\xf4ng tin shop': sheet_info = sheet elif sheet.name == u'Danh m\u1ee5c': sheet_cat = sheet elif sheet.name == u'S\u1EA3n ph\u1EA9m': sheet_products = sheet if sheet_info is not None: print sheet_info.dump() ==== Parse xls file ==== import xlrd from xlrd.book import Book from xlrd.sheet import Sheet nhapkhaufile = 'nhapkhau.xlsx' xuatkhaufile = 'xuatkhau.xlsx' constantsfile = 'constants.xlsx' def parsexls(filename): xlsfile = xlrd.open_workbook(filename) for sheet in xlsfile.sheets(): n_cells = sheet.ncols for curr_row in range(sheet.nrows): curr_cell = 0 while curr_cell < n_cells: cell_type = sheet.cell_type(curr_row, curr_cell) if cell_type == xlrd.XL_CELL_EMPTY: curr_cell += 1 continue value = sheet.cell_value(curr_row, curr_cell) curr_cell += 1 print value parsexls(constantsfile) ==== Error Parse UTF-8 file ==== UnicodeDecodeError: 'utf16' codec can't decode bytes in position 3438-3439: unexpected end of data => Check exel format in encoding(not utf-8)(May be exel file was not stored as encoding utf-8, It can be saved as utf16 default encoding) You can debug for information: xlsfile = xlrd.open_workbook(filename,verbosity=3) output: BOF: op=0x0809 vers=0x0600 stream=0x0005 buildid=14420 buildyr=1997 -> BIFF80 CODEPAGE: codepage 1200 -> encoding 'utf_16_le' DATEMODE: datemode 0 => default encoding: **'utf_16_le'** **Fix**: Lỗi do file exel, mở lại file exel và lưu lại sẽ hết lỗi ===== Using xlwt =====