> ## Documentation Index
> Fetch the complete documentation index at: https://knowledge.cloudquant.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Summarizing Data with a Pivot Table

> Use pandas pivot tables to summarize, reorganize, and explore large market datasets by aggregating data into meaningful categories.

# Summarizing data with a pivot table

A pivot table is a data analysis tool that enables you to summarize, reorganize, and explore large datasets. It aggregates data, rotates perspectives, and filters, sorts, and groups information into meaningful categories.

## Key benefits

* **Simplify Complex Data** — Break down large datasets into manageable summaries
* **Identify Trends** — Reveal patterns not immediately obvious in raw data
* **Flexible Analysis** — Quickly change layout and structure to answer different questions
* **Efficiency** — Automate data analysis for faster, regular reporting

## Pandas pivot table syntax

```python theme={null}
import pandas as pd

pd.pivot_table(df, values='D', index=['A', 'B'], columns=['C'], aggfunc='sum')
```

## Practical example: analyzing odd lots in AAPL trades

This example demonstrates analyzing AAPL trade data by:

1. Extracting trades between 9:30 AM and 4 PM
2. Creating an `odd_lots` boolean column (shares \< 100)
3. Extracting hour from timestamp
4. Pivoting to count shares by hour and lot size
5. Calculating percentage of odd lots per hour

```python theme={null}
import liberator

%time df = liberator.get_dataframe(
    liberator.query(
        name='trades',
        symbols='AAPL',
        as_of='2024-07-30 15:59:58',
        back_to='2024-07-30 09:30:01'
    )
)

df['odd_lots'] = df['shares'] < 100

df['Hour'] = df.apply(lambda row: str(row.timestamp)[11:13], axis=1)

pivot_table = df.pivot_table(
    values='shares',
    index='Hour',
    columns='odd_lots',
    aggfunc='sum',
    fill_value=0
)

pivot_table['% Odd Lots'] = (
    pivot_table[True] / (pivot_table[True] + pivot_table[False]) * 100
)

pivot_table
```

## Parameter reference

| Parameter    | Purpose                                                  |
| ------------ | -------------------------------------------------------- |
| `aggfunc`    | Specifies aggregation function (`'sum'`, `'mean'`, etc.) |
| `values`     | Target column for aggregation                            |
| `index`      | Column(s) defining row groupings                         |
| `columns`    | Column(s) defining column splits                         |
| `fill_value` | Default value for missing data                           |
