IT Learning

Learn IT through hands-on practice

【Python】How to aggregate by datetime with pandas

Description

pandas is a powerful library for working with data. This post shows how to aggregate time-series data by datetime, with some examples.

Data for test

We prepare a time-series DataFrame for testing as follows.

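import pandas as pd

# 30-second timestamps from 2021-12-31 00:00:00 to 2022-01-02 00:00:00
datetime = pd.date_range(start='2021-12-31', end='2022-01-02', freq='30s')

# Use the timestamps as the index; 'val' is a simple running counter
df = pd.DataFrame(range(len(datetime)), index=datetime, columns=['val'])
df.head()
df.tail()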

Method1 : resample

The resample function is mainly meant for changing the frequency of time-series data (for example, downsampling), but it is also useful for aggregating by datetime groups.

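Remember to set the datetime column as the index beforehand. The test data above already has a DatetimeIndex, so this step only applies when the datetimes are stored in a regular column.

# Only needed when the datetimes are stored in a column named 'datetime'
df['datetime'] = pd.to_datetime(df['datetime'])
df.set_index('datetime', inplace=True)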
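
As a minimal sketch of this step, assume the timestamps arrive as strings in a column named 'datetime'. The small frame `raw` below is made up for illustration and is not the test data above.

```python
import pandas as pd

# Hypothetical input: timestamps stored as strings in a regular column
raw = pd.DataFrame({
    'datetime': ['2021-12-31 00:00:00', '2021-12-31 00:00:30', '2022-01-01 00:00:00'],
    'val': [0, 1, 2],
})

# Convert the strings to Timestamps, then move the column into the index
raw['datetime'] = pd.to_datetime(raw['datetime'])
raw = raw.set_index('datetime')

# resample now works because the index is a DatetimeIndex
daily = raw.resample('D').sum()
```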

Then pass a frequency alias that represents the unit: year, month, day, hour, minute, second, or microsecond. The following examples aggregate by year and by month. (In pandas 2.2 and later, 'Y' and 'M' are deprecated in favor of 'YE' and 'ME'.)

df.resample('Y').sum()
df.resample('M').sum()
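
To make the effect concrete, here is a small sketch that rebuilds the test data and aggregates it by day. The 'D' (calendar day) alias is an assumption of this example, chosen only because the data spans three days; it is not one of the original examples.

```python
import pandas as pd

# Rebuild the test data: one row every 30 seconds over two full days
idx = pd.date_range(start='2021-12-31', end='2022-01-02', freq='30s')
df = pd.DataFrame(range(len(idx)), index=idx, columns=['val'])

# Sum 'val' within each calendar day
daily_sum = df.resample('D').sum()

# Other aggregations work the same way, e.g. the number of rows per day
daily_count = df.resample('D').count()
```

Each full day holds 2,880 rows (24 hours times 120 rows per hour), and the last day holds only the single midnight row.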

Method2 : Grouper

The second way is to use pandas Grouper together with groupby. It offers more flexibility in how you group for aggregation.

These are simple examples that count rows by year and by month.

df.groupby(pd.Grouper(freq="Y")).count()
df.groupby(pd.Grouper(freq="M")).count()

You can use groupby with Grouper in much the same way as the resample function.
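
A quick sketch to check this similarity on the test data, again using the 'D' alias as an assumption of this example:

```python
import pandas as pd

idx = pd.date_range(start='2021-12-31', end='2022-01-02', freq='30s')
df = pd.DataFrame(range(len(idx)), index=idx, columns=['val'])

# Same daily aggregation written in both styles
by_resample = df.resample('D').sum()
by_grouper = df.groupby(pd.Grouper(freq='D')).sum()

# Compare the bin labels and the aggregated values
same_index = by_resample.index.equals(by_grouper.index)
same_values = (by_resample['val'] == by_grouper['val']).all()
```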

Grouper can group not only by the index but also by other columns.

Here is an example of grouping by a column that is not the index.

df_notindex = df.reset_index().rename(columns={'index':'datetime'})
df_notindex.head()

You can specify any non-index column as the key, as follows.

df_notindex.groupby(pd.Grouper(key="datetime",freq="Y")).count()
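
A self-contained sketch of the same idea, grouping by day ('D', an assumption of this example) so that the three bins are easy to check:

```python
import pandas as pd

idx = pd.date_range(start='2021-12-31', end='2022-01-02', freq='30s')
df = pd.DataFrame(range(len(idx)), index=idx, columns=['val'])

# Move the timestamps out of the index into a regular 'datetime' column
df_notindex = df.reset_index().rename(columns={'index': 'datetime'})

# key= tells Grouper which column holds the timestamps
daily_count = df_notindex.groupby(pd.Grouper(key='datetime', freq='D')).count()
```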

Moreover, you can group by multiple columns by passing a list to groupby itself, which makes this approach more versatile for grouping than the resample function.

# 'column1' and 'column2' stand for other columns in your DataFrame
df_notindex.groupby([pd.Grouper(key="datetime", freq="Y"), 'column1', 'column2']).count()
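
Since the test data has no extra columns, the sketch below adds a made-up 'sensor' column (alternating 'a' and 'b') purely for illustration, then groups by day and by that column together. The name `df_multi` and the 'D' alias are also assumptions of this example.

```python
import pandas as pd

idx = pd.date_range(start='2021-12-31', end='2022-01-02', freq='30s')
df_multi = pd.DataFrame({
    'datetime': idx,
    # Hypothetical category column, not part of the original test data
    'sensor': ['a' if i % 2 == 0 else 'b' for i in range(len(idx))],
    'val': range(len(idx)),
})

# A list can mix a time-based Grouper with ordinary column names
counts = df_multi.groupby([pd.Grouper(key='datetime', freq='D'), 'sensor']).count()
```

The result is indexed by a MultiIndex of (datetime, sensor), with one row per day and sensor combination.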

Summary

  • You can use the resample function or groupby with Grouper to aggregate by datetime.
  • Grouper is more flexible than resample for grouping, because it can use non-index columns and multiple keys.
