When working with timestamped market data, you often need to extract specific components like the date, time, or hour. Rather than using slower row-by-row loops, you can use vectorized operations for much better performance.
String Slicing with Lambda Functions
The primary approach uses apply() with lambda functions to extract substrings from timestamp data:
df = liberator.get_dataframe(
liberator.query(
name='minute_bars',
symbols='SPY',
as_of='2024-07-26',
back_to='2024-07-25'
)
)
df['Date'] = df.apply(lambda row: str(row.timestamp)[0:10], axis=1)
df['Time'] = df.apply(lambda row: str(row.timestamp)[11:16], axis=1)
df['Hour'] = df.apply(lambda row: str(row.timestamp)[11:13], axis=1)
df['DateTime'] = df.apply(lambda row: str(row.timestamp)[0:16], axis=1)
df['DateHour'] = df.apply(lambda row: str(row.timestamp)[0:13], axis=1)
df[['timestamp', 'Date', 'Time', 'Hour', 'DateTime', 'DateHour']]
This creates new columns by slicing the timestamp string at specific positions:
| Column | Slice | Example Output |
|---|
Date | [0:10] | 2024-07-26 |
Time | [11:16] | 14:30 |
Hour | [11:13] | 14 |
DateTime | [0:16] | 2024-07-26 14:30 |
DateHour | [0:13] | 2024-07-26 14 |
Alternative: FLOOR Method Using Pandas DateTime
You can also use pandas built-in datetime methods:
df['seconds'] = df['timestamp'].dt.floor('s')
df['Date'] = pd.to_datetime(df["timestamp"]).dt.date
String slicing with lambda functions provides fast processing for extracting timestamp components, making it preferable to manual loops for DataFrame operations.