csv - How to read index data as string with Python pandas?


I'm trying to read a CSV file into a DataFrame with pandas, and I want to read the index column as strings. However, if the index values contain only digits, pandas treats them as integers. How can I read them as strings?

To be specific, my code is as follows.

sample.csv

uid,f1,f2,f3
01,0.1,1,10
02,0.2,2,20
03,0.3,3,30

The code

import pandas as pd

df = pd.read_csv('sample.csv', index_col="uid", dtype=float)
print(df.index.values)

The result

>>> [1 2 3]

But I was hoping for this result

>>> ['01', '02', '03']

There is one additional condition.

The rest of the columns hold numeric values, and there are so many of them that I can't list them by column name.

Pass the dtype param to specify the dtype of the index column:

In [159]:
import io
import pandas as pd

t = """uid,f1,f2,f3
01,0.1,1,10
02,0.2,2,20
03,0.3,3,30"""
df = pd.read_csv(io.StringIO(t), dtype={'uid': str})
df.set_index('uid', inplace=True)
df.index

Out[159]: Index(['01', '02', '03'], dtype='object', name='uid')

So in your case the following should work:

df = pd.read_csv('sample.csv', dtype={'uid': str})
df.set_index('uid', inplace=True)
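If the remaining columns should all end up as float, as the dtype=float in the question suggests, one possible variation is to convert them after the index has been set. This is only a sketch: the csv_text variable is an illustrative stand-in for sample.csv, and it assumes every non-index column can be converted to float.

```python
import io

import pandas as pd

# Illustrative stand-in for sample.csv from the question.
csv_text = "uid,f1,f2,f3\n01,0.1,1,10\n02,0.2,2,20\n03,0.3,3,30\n"

df = (
    pd.read_csv(io.StringIO(csv_text), dtype={"uid": str})  # keep uid as text
    .set_index("uid")   # move the string column into the index
    .astype(float)      # convert every remaining column to float
)

print(df.index.tolist())
print(df.dtypes)
```

Because the conversion happens after set_index, the index itself is left untouched and keeps its leading zeros.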

There is still an outstanding bug here: the dtype param is ignored on columns that are treated as the index, so the following doesn't work:

df = pd.read_csv('sample.csv', dtype={'uid':str}, index_col='uid') 

You can do this dynamically if you assume the first column is the index column:

In [171]:
t = """uid,f1,f2,f3
01,0.1,1,10
02,0.2,2,20
03,0.3,3,30"""
cols = pd.read_csv(io.StringIO(t), nrows=1).columns.tolist()
index_col_name = cols[0]
dtypes = dict(zip(cols[1:], [float] * len(cols[1:])))
dtypes[index_col_name] = str
df = pd.read_csv(io.StringIO(t), dtype=dtypes)
df.set_index('uid', inplace=True)
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 3 entries, 01 to 03
Data columns (total 3 columns):
f1    3 non-null float64
f2    3 non-null float64
f3    3 non-null float64
dtypes: float64(3)
memory usage: 96.0+ bytes

In [172]: df.index

Out[172]: Index(['01', '02', '03'], dtype='object', name='uid')

Here we read just the header and the first row to get the column names:

cols = pd.read_csv(io.StringIO(t), nrows=1).columns.tolist()

We then generate a dict that maps the column names to the desired dtypes:

index_col_name = cols[0]
dtypes = dict(zip(cols[1:], [float] * len(cols[1:])))
dtypes[index_col_name] = str

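We take the index name, assuming it's the first entry, then create a dict from the rest of the columns and assign float as their desired dtype. Finally we add the index column, specifying its type as str. This dict can then be passed as the dtype param to read_csv.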
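The steps above could be wrapped in a small helper. The sketch below is one way to do it, not part of the original answer: the function name read_csv_str_index and its value_dtype parameter are hypothetical, and it assumes the first column is the index and that all other columns share one dtype. It reads the header with nrows=0, which returns the column names without any data rows.

```python
import os
import tempfile

import pandas as pd


def read_csv_str_index(path, value_dtype=float):
    """Read a CSV whose first column is the index, keeping that column as str.

    Hypothetical helper; assumes all other columns share value_dtype.
    """
    # Read only the header (no data rows) to learn the column names.
    cols = pd.read_csv(path, nrows=0).columns.tolist()
    index_col_name = cols[0]
    # Every column after the first gets value_dtype; the first is read as str.
    dtypes = {name: value_dtype for name in cols[1:]}
    dtypes[index_col_name] = str
    return pd.read_csv(path, dtype=dtypes).set_index(index_col_name)


# Usage with the sample data from the question, written to a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.csv")
    with open(sample_path, "w") as fh:
        fh.write("uid,f1,f2,f3\n01,0.1,1,10\n02,0.2,2,20\n03,0.3,3,30\n")
    df = read_csv_str_index(sample_path)

print(df.index.tolist())
print(df.dtypes)
```

The helper takes a file path rather than a file-like object because the file is read twice, once for the header and once for the data.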

