Differencing a Time Series and Restoring It

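In time series analysis, we often need to difference the original series, fit a model or make predictions on the differenced data, and then transform the fitted or predicted values back to the scale of the original series. The diff and cumsum methods of a pandas Series make this straightforward: diff computes the change between consecutive values, and cumsum, applied after putting the dropped first value back, undoes it.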

First-Order Differencing and Restoration

import matplotlib.pyplot as plt
import pandas as pd

time_series = pd.Series([2, 4, 3, 5, 6, 7, 4, 5, 6, 3, 2, 4])
# First-order difference; the first element becomes NaN, so drop it.
time_series_diff = time_series.diff(1).dropna()
# Restore: put the first original value back in front, then take the cumulative sum.
# Series.append was removed in pandas 2.0, so pd.concat is used instead.
first_value = time_series.iloc[:1]
time_series_restored = pd.concat([first_value, time_series_diff]).cumsum()
print(time_series)
print(time_series_diff)
print(time_series_restored)
plt.plot(time_series, color='red', label='time_series')
plt.plot(time_series_diff, color='green', label='time_series_diff')
plt.plot(time_series_restored, color='blue', linestyle='--', label='time_series_restored')
plt.legend()
plt.show()
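
The same idea applies to predictions made on the differenced scale. The following is a minimal sketch, not a definitive recipe: the values in predicted_diff are made-up numbers standing in for the output of whatever model was fitted to the differences, and the names observed, predicted_diff and predicted_levels are illustrative only. Each predicted change is accumulated on top of the last observed value.

```python
import pandas as pd

observed = pd.Series([2, 4, 3, 5, 6, 7, 4, 5, 6, 3, 2, 4])

# Hypothetical forecast of the next three first differences
# (illustrative numbers, not produced by any real model).
predicted_diff = pd.Series([1.0, -2.0, 0.5], index=[12, 13, 14])

# Anchor on the last observed value, then accumulate the predicted changes.
predicted_levels = observed.iloc[-1] + predicted_diff.cumsum()
print(predicted_levels)
```

With the last observed value 4, the restored predictions are 5.0, 3.0 and 3.5.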

Higher-Order Differencing and Restoration

import matplotlib.pyplot as plt
import pandas as pd

# Year-end frequency: 'YE' in pandas 2.2 and later; older versions use 'A'.
time_series = pd.Series([2, 4, 3, 5, 6, 7, 4, 5, 6, 3, 2, 4],
                        index=pd.date_range(start='2000', periods=12, freq='YE'))
time_series_diff = time_series
diff_times = 3
first_values = []
for i in range(diff_times):
    # Save the leading value of each level; diff() discards it.
    first_values.append(time_series_diff.iloc[:1])
    time_series_diff = time_series_diff.diff(1).dropna()

# Undo the differences in reverse order, one cumulative sum per level.
time_series_restored = time_series_diff
for first in reversed(first_values):
    # Series.append was removed in pandas 2.0, so pd.concat is used instead.
    time_series_restored = pd.concat([first, time_series_restored]).cumsum()
print(time_series)
print(time_series_diff)
print(time_series_restored)
plt.plot(time_series, color='red', label='time_series')
plt.plot(time_series_diff, color='green', label='time_series_diff')
plt.plot(time_series_restored, color='blue', linestyle='--', label='time_series_restored')
plt.legend()
plt.show()
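
The two loops above can also be wrapped into a pair of helpers so that the round trip is easy to check. This is only a sketch under stated assumptions: the function names difference and restore are hypothetical and are not part of pandas.

```python
import pandas as pd


def difference(series, order):
    """Apply first-order differencing `order` times (hypothetical helper).

    Returns the differenced series and the leading value dropped at each level.
    """
    heads = []
    current = series
    for _ in range(order):
        heads.append(current.iloc[:1])      # one-element Series, keeps its label
        current = current.diff(1).dropna()  # the first element is NaN after diff
    return current, heads


def restore(diffed, heads):
    """Invert `difference` by undoing the levels in reverse order."""
    restored = diffed
    for head in reversed(heads):
        restored = pd.concat([head, restored]).cumsum()
    return restored


series = pd.Series([2, 4, 3, 5, 6, 7, 4, 5, 6, 3, 2, 4], dtype=float)
diffed, heads = difference(series, order=3)
restored = restore(diffed, heads)
print(restored.tolist() == series.tolist())
```

The leading value of every level has to be kept because each call to diff discards one observation; without those values the cumulative sums would recover the series only up to an unknown offset and trend.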

References: