Summarizing data with a pivot table

A pivot table is a data analysis tool that enables you to summarize, reorganize, and explore large datasets. It aggregates data, rotates perspectives, and filters, sorts, and groups information into meaningful categories.

Key benefits

Simplify Complex Data — Break down large datasets into manageable summaries
Identify Trends — Reveal patterns not immediately obvious in raw data
Flexible Analysis — Quickly change layout and structure to answer different questions
Efficiency — Automate data analysis for faster, regular reporting

Pandas pivot table syntax

import pandas as pd

pd.pivot_table(df, values='D', index=['A', 'B'], columns=['C'], aggfunc='sum')

Practical example: analyzing odd lots in AAPL trades

This example demonstrates analyzing AAPL trade data by:

Extracting trades between 9:30 AM and 4 PM
Creating an odd_lots boolean column (shares < 100)
Extracting hour from timestamp
Pivoting to count shares by hour and lot size
Calculating percentage of odd lots per hour

import liberator

%time df = liberator.get_dataframe(
    liberator.query(
        name='trades',
        symbols='AAPL',
        as_of='2024-07-30 15:59:58',
        back_to='2024-07-30 09:30:01'
    )
)

df['odd_lots'] = df['shares'] < 100

df['Hour'] = df.apply(lambda row: str(row.timestamp)[11:13], axis=1)

pivot_table = df.pivot_table(
    values='shares',
    index='Hour',
    columns='odd_lots',
    aggfunc='sum',
    fill_value=0
)

pivot_table['% Odd Lots'] = (
    pivot_table[True] / (pivot_table[True] + pivot_table[False]) * 100
)

pivot_table

Parameter reference

Parameter	Purpose
`aggfunc`	Specifies aggregation function (`'sum'`, `'mean'`, etc.)
`values`	Target column for aggregation
`index`	Column(s) defining row groupings
`columns`	Column(s) defining column splits
`fill_value`	Default value for missing data

Using Market Calendar (mcal) to Identify Trading Dates

Melt and Wide to Long - Unpivoting a Pivot

⌘I

​Summarizing data with a pivot table

​Key benefits

​Pandas pivot table syntax

​Practical example: analyzing odd lots in AAPL trades

​Parameter reference

Summarizing data with a pivot table

Key benefits

Pandas pivot table syntax

Practical example: analyzing odd lots in AAPL trades

Parameter reference