Python: How to format the IPython HTML display of a pandas DataFrame?

How can I format the IPython HTML display of pandas DataFrames so that:

  1. numbers are right justified
  2. numbers have commas as thousands separator
  3. large floats have no decimal places

I understand that numpy has the facility of set_printoptions where I can do:

int_frmt = lambda x: '{:,}'.format(x)
np.set_printoptions(formatter={'int_kind':int_frmt})

and similarly for other data types.
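
For instance, floats can be handled through the 'float_kind' key in the same dict. Below is a minimal sketch with made-up sample arrays; it passes the formatters to np.array2string so that the global print options stay untouched, which is a choice made for this illustration rather than something from the question.

```python
import numpy as np

# Formatters keyed by NumPy's formatter categories
int_frmt = lambda x: '{:,}'.format(x)
float_frmt = lambda x: '{:,.2f}'.format(x)
formatters = {'int_kind': int_frmt, 'float_kind': float_frmt}

# Illustrative sample data (not from the original question)
ints = np.array([1000, 2000000])
floats = np.array([1234.5678, 0.5])

# array2string accepts the same formatter dict as set_printoptions,
# but applies it to this one call only
ints_text = np.array2string(ints, formatter=formatters)
floats_text = np.array2string(floats, formatter=formatters)

print(ints_text)
print(floats_text)
```

These formatters only control how NumPy arrays are printed, which is consistent with the pandas HTML display not picking them up.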

But IPython does not pick up these formatting options when displaying dataframes in html. I still need to have

pd.set_option('display.notebook_repr_html', True)

but with requirements 1, 2 and 3 above applied.

Edit: Below is my solution for 2 and 3 (I'm not sure this is the best way), but I still need to figure out how to right-justify the numeric columns.

import numpy as np
from IPython.display import HTML

# Thousands separators for ints; no decimals for floats above 1,000
int_frmt = lambda x: '{:,}'.format(x)
float_frmt = lambda x: '{:,.0f}'.format(x) if x > 1e3 else '{:,.2f}'.format(x)
# Map each supported dtype to its formatter, then build a per-column dict
frmt_map = {np.dtype('int64'): int_frmt, np.dtype('float64'): float_frmt}
frmt = {col: frmt_map[df.dtypes[col]] for col in df.columns if df.dtypes[col] in frmt_map}
HTML(df.to_html(formatters=frmt))
Answer:1

This question was asked a long time ago. Back then, pandas didn't yet include the Styler class (accessed through df.style). It was added in version 0.17.1.

Here's how you would use this to achieve your desired goal and some more:

  • center the header
  • right-align the numeric columns
  • left-align the other columns
  • add a formatter for the numeric columns, as you want
  • give every column the same width

Here's some example data:

In [1]:
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10,3)*2000, columns=['A','B','C'])
df['D'] = np.random.randint(0,10000,size=10)
df['TextCol'] = np.random.choice(['a','b','c'], 10)
df.dtypes

Out[1]:
A          float64
B          float64
C          float64
D            int64
TextCol     object
dtype: object

Let's format this using df.style:

# Construct a mask of which columns are numeric
numeric_col_mask = df.dtypes.apply(lambda d: issubclass(np.dtype(d).type, np.number))

# Dict used to center the table headers
d = dict(selector="th",
    props=[('text-align', 'center')])

# Style
df.style.set_properties(subset=df.columns[numeric_col_mask], # right-align the numeric columns and set their width
                        **{'width':'10em', 'text-align':'right'})\
        .set_properties(subset=df.columns[~numeric_col_mask], # left-align the non-numeric columns and set their width
                        **{'width':'10em', 'text-align':'left'})\
        .format(lambda x: '{:,.0f}'.format(x) if x > 1e3 else '{:,.2f}'.format(x), # format the numeric values
                subset=pd.IndexSlice[:,df.columns[numeric_col_mask]])\
        .set_table_styles([d]) # center the header
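
As a quick check of the building blocks used above, here is a minimal sketch on a small made-up frame. It picks the numeric columns with df.select_dtypes(include='number'), which is a swapped-in alternative to the dtype mask rather than the method from this answer, and it gives the formatting lambda a hypothetical name, fmt, so its behaviour can be inspected directly.

```python
import pandas as pd

# Small illustrative frame (values are made up)
df = pd.DataFrame({'A': [1234.567, 12.3456],
                   'D': [10000, 5],
                   'TextCol': ['a', 'b']})

# Alternative to the dtype mask: let pandas select the numeric columns
numeric_cols = list(df.select_dtypes(include='number').columns)

# Same formatter as in the Styler chain, under a hypothetical name
fmt = lambda x: '{:,.0f}'.format(x) if x > 1e3 else '{:,.2f}'.format(x)

# Try it on a large value, a small value, the boundary, and a negative value
samples = [fmt(x) for x in (1234.567, 12.3456, 1000, -5000.0)]

print(numeric_cols)
print(samples)
```

Because the test is x > 1e3, exactly 1000 and large negative values keep two decimals; comparing abs(x) instead would be one way to change that, if it matters for your data.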

[Image: result using Styler]


Note that, instead of calling .format on the subset of columns, you can set the global default pd.options.display.float_format. It applies to float values only, so integer columns such as D are left unformatted:

pd.options.display.float_format = lambda x: '{:,.0f}'.format(x) if x > 1e3 else '{:,.2f}'.format(x)
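
To try this without changing the setting for the whole session, the option can be scoped with pd.option_context. The sketch below uses a small made-up frame and a hypothetical name, fmt, for the formatter; it shows the option being picked up by the standard to_string and to_html renderers, and it makes no claim about df.style.

```python
import pandas as pd

# Same formatter as above, under a hypothetical name
fmt = lambda x: '{:,.0f}'.format(x) if x > 1e3 else '{:,.2f}'.format(x)

# Small illustrative frame (values are made up)
df = pd.DataFrame({'A': [1234.567, 12.3456], 'D': [10000, 5]})

# option_context restores the previous setting when the block exits
with pd.option_context('display.float_format', fmt):
    text = df.to_string()
    html = df.to_html()

print(text)
```

The float column A is formatted, while the integer column D is printed without thousands separators, so it would still need its own formatter.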


