Introduction
Pandas is an important data science library and everybody involved in data science uses it extensively. It presents the data in the form of a table similar to what we see in excel. If you have worked with excel, you must be aware that you can customize your sheets, add colors to the cells, and mark important figures that need extra attention.
Pandas Dataframe
Let's create a sample dataset and display the output data frame.
Introduction to Pandas Styling
import pandas as pd
import numpy as np
np.random.seed(100)
df = pd.DataFrame({'A': np.linspace(1,10,10)})
df=pd.concat([df,pd.DataFrame(np.random.randn(10,5),columns=list('BCDEF'))],axis=1)
df.iloc[3,3] = np.nan
df.iloc[1,2] = np.nan
df
Styling the Data Frame: To access all the styling properties for the pandas data frame, you need to use the accessor. This accessor helps in the modification of the styler object(df.style), which controls the display of the data frame on the web
df.style
Highlight Min-Max-null values
.highlight_max() For highlighting maximum number values in dataframe
.highlight_min() For highlighting minimum values in dataframe.
.highlight_null() for highlighting null values in dataframe.
df.style.highlight_max()
df.style.highlight_min()
df.style.highlight_null()
set_na_rep()
along with highlighting the missing values, they may be represented as ‘nan’
df.style.set_na_rep("OutofSpace").highlight_null()
Create Heatmap within data frame
Heatmaps are used to represent values with the color shades. the higher is the color shade, the larger is the value present. These color shades represent the intensity of values as compared to other values. To plot such a mapping in the data frame itself.
.background_gradient()
df.style.background_gradient()
Table Properties
As mentioned earlier also, the data frame presented in the jupyter notebooks is a table rendered using HTML and CSS. The table properties can be controlled using the “set_properties” method.
Example = Making table borders green with text color as purple.
df.style.set_properties(**{'border':'1px solid green','color':'magenta'})
Create Bar Charts
Just as the heatmap, the bar charts can also be plotted within the data frames itself.
“.bar()”
df.style.bar()
Control precision
The current values of the dataframe have float values and their decimals have no boundary condition. Even the column”A”, which had to hold a single value is having too many decimal places.to control this behavior, you can use the “.set_precision()” function and pass the values for maximum decimals to be allowed.
df.style.set_precision(2)
Add captions
Like every image has a caption that defines the post text, you can add captions to your dataframes using the .set_caption() function
df.style.set_caption("this is heatmap with background gradient ").set_precision(2).background_gradient()
Hiding Index or Column
As the title suggests, you can hide the index or any particular column from dataframe.
df.style.hide_index()
Control display values
Using the styler objects “.format()” function, you can distinguish between the actual values held by the dataframes and the values you present. the “format” function takes in the format spec string that defines how individual values are presented.
df.style.format("{:.3%}")
df.style.format("{:.3%}",na_rep=" ")
Create your Own Styling Method
You can create your function and use it with the styler object in two ways:
- apply function: When you chain the “apply” function to the styler object, it sends out the entire row(series) or the dataframe depending upon the axis selected. if you make your function work with the “apply” function.
- apply map function: This function sends out scaler values (or element wise) and therefore, your functioon should return a scaler only with CSS attribute values pair.
def highlighting_mean_greater(s):'''highlighting yello value is greater than mean else red'''is_max = s>s.mean()return['background-color:yellow' if i else 'background-color:red'for i in is_max]df.style.apply(highlighting_mean_greater)
Target : apply map Function
def color_negative_red(val):"""Takes a scalar and returns a string withthe css property `'color: red'` for negativestrings, black otherwise."""color = 'red' if val < 0 else 'black'return 'color: %s' % colordf.style.apply(color_negative_red)styles = [dict(selector="tr:hover",props=[("background", "#f4f4f4")]),dict(selector="th", props=[("color", "#fff"),("border", "1px solid #eee"),("padding", "12px 35px"),("border-collapse", "collapse"),("background", "#00cccc"),("text-transform", "uppercase"),("font-size", "18px")]),dict(selector="td", props=[("color", "#999"),("border", "1px solid #eee"),("padding", "12px 35px"),("border-collapse", "collapse"),("font-size", "15px")]),dict(selector="table", props=[("font-family" , 'Arial'),("margin" , "25px auto"),("border-collapse" , "collapse"),("border" , "1px solid #eee"),("border-bottom" , "2px solid #00cccc"),]),dict(selector="caption", props=[("caption-side", "bottom")])]df.style.set_table_styles(styles).set_caption("Image by Author (Made in Pandas)").highlight_max().highlight_null(null_color='red')
Export to Excel
You can store all the styling you have done on your dataframe in an excel file. The “.to_excel” function on the styler object makes it possible. The function needs two parameters: the name of the file to be saved (with extension XLSX) and the “engine” parameter should be “openpyxl”.
df.style.set_precision(2).background_gradient().hide_index().to_excel('styled.xlsx', engine='openpyxl')
Conclusion
In this detailed article, we saw all the built-in methods to style the data frame. Then we looked at how to create custom styling functions and then we saw how to customize the data frame by modifying it at HTML and CSS levels. We also saw how to save our styled data frame into excel files.