python - Sort xls file content by mutiple column with data-type -
i have sort xls file content 4 columns in ascending order.
i converted xls file content list of list. following input
input:
data = """abc, not consider1, 101, title , subtitle, not consider2, 30/12/2015 abc, not consider1, 100, title , subtitle, not consider2, 31/12/2015 abc, not consider1, 99, bic codes, not consider2, 31/12/2015 abc, not consider1, 98, title , subtitle, not consider2, 25/12/2015 abc, not consider1, 100, atitle , subtitle, not consider2, 30/12/2015 xyz, not consider1, 100, atitle , subtitle, not consider2, 30/12/2015 xyz, not consider1, 100, atitle , subtitle, not consider2, 30/12/2015 abc, not consider1, 100, title , subtitle, not consider2, 30/12/2015"""
respective output in string format:
data = """abc, not consider1, 98, title , subtitle, not consider2, 25/12/2015 abc, not consider1, 99, bic codes, not consider2, 31/12/2015 abc, not consider1, 100, atitle , subtitle, not consider2, 30/12/2015 abc, not consider1, 100, title , subtitle, not consider2, 30/12/2015 abc, not consider1, 100, title , subtitle, not consider2, 31/12/2015 abc, not consider1, 101, title , subtitle, not consider2, 30/12/2015 xyz, not consider1, 100, atitle , subtitle, not consider2, 30/12/2015 xyz, not consider1, 100, atitle , subtitle, not consider2, 30/12/2015 """
first split data list format:
# split data list. >>> data_list = [i.split(", ") in data.split("\n")] >>> print "\n".join([", ".join(i) in data_list]) abc, not consider1, 101, title , subtitle, not consider2, 30/12/2015 abc, not consider1, 100, title , subtitle, not consider2, 31/12/2015 abc, not consider1, 99, bic codes, not consider2, 31/12/2015 abc, not consider1, 98, title , subtitle, not consider2, 25/12/2015 abc, not consider1, 100, atitle , subtitle, not consider2, 30/12/2015 xyz, not consider1, 100, atitle , subtitle, not consider2, 30/12/2015 xyz, not consider1, 100, atitle , subtitle, not consider2, 30/12/2015 abc, not consider1, 100, title , subtitle, not consider2, 30/12/2015
following sorting requirement:
- have sort index0 , if index0 have same values multiple items sort index2 if index0 , index2 same multiple items sort index3 if index0, index2 , index3 same multiple items sort index5
my logic
- to create string of index0, index2, index5 , index5
- create dictionary key step 1
- use sorted function sort key list
- again create xls file.
code:
>>> collections import defaultdict >>> data_dict = defaultdict(list) >>> in data_list: ... key = "%s%s%s%s"%(i[0].strip(), i[2].strip(), i[3].strip(), i[5].strip()) ... data_dict[key].append(i) ... >>> sorted_keys = sorted(data_dict.keys()) >>> >>> in sorted_keys: ... j in data_dict[i]: ... print j ... ... ['abc', 'do not consider1', '100', 'atitle , subtitle', 'do not consider2', '30/12/2015'] ['abc', 'do not consider1', '100', 'title , subtitle', 'do not consider2', '30/12/2015'] ['abc', 'do not consider1', '100', 'title , subtitle', 'do not consider2', '31/12/2015'] ['abc', 'do not consider1', '101', 'title , subtitle', 'do not consider2', '30/12/2015'] ['abc', 'do not consider1', '98', 'title , subtitle', 'do not consider2', '25/12/2015 '] ['abc', 'do not consider1', '99', 'bic codes', 'do not consider2', '31/12/2015'] ['xyz', 'do not consider1', '100', 'atitle , subtitle', 'do not consider2', '30/12/2015'] ['xyz', 'do not consider1', '100', 'atitle , subtitle', 'do not consider2', '30/12/2015']
but there numbers in index2 i.e. column 2 , date in index5 i.e. column5, not sorted data.
can me fix this?
you can sort multiple keys using sorted
function follows:-
sorted_list = sorted(data_list, key=lambda item: (item[0], int(item[2]), item[3])) print "\n".join([", ".join(i) in sorted_list])
returns
abc, not consider1, 98, title , subtitle, not consider2, 25/12/2015 abc, not consider1, 99, bic codes, not consider2, 31/12/2015 abc, not consider1, 100, atitle , subtitle, not consider2, 30/12/2015 abc, not consider1, 100, title , subtitle, not consider2, 31/12/2015 abc, not consider1, 100, title , subtitle, not consider2, 30/12/2015 abc, not consider1, 101, title , subtitle, not consider2, 30/12/2015 xyz, not consider1, 100, atitle , subtitle, not consider2, 30/12/2015 xyz, not consider1, 100, atitle , subtitle, not consider2, 30/12/2015
the trick let key
lambda return tuple containing values sort on, , convert third column's values integers using int()
function.
Comments
Post a Comment