add_dates
This is an old revision of the document!
Table of Contents
Add dates
with pandas; multiple dates
Given
In [1]: import pandas as pd df = pd.DataFrame({'begin': ['20201107', '20201121'], 'offset': [5, -6]}) df Out[1]: begin offset 0 20201107 5 1 20201121 -6 In [2]: df.dtypes Out[2]: begin object offset int64 dtype: object
we want
begin offset end 0 20201107 5 20201112 1 20201121 -6 20201115
where end date = begin date + offset
Solution:
In [3]: fmt='%Y%m%d' from datetime import datetime df['end'] = (df['begin'].apply(lambda x: datetime.strptime(x, fmt)) + pd.to_timedelta(df['offset'], 'd')).dt.strftime(fmt) df Out[3]: begin offset end 0 20201107 5 20201112 1 20201121 -6 20201115
Intermediate steps:
In [4]: df['begin'].apply(lambda x: datetime.strptime(x, fmt)) Out[4]: 0 2020-11-07 1 2020-11-21 Name: begin, dtype: datetime64[ns] In [5]: df['begin'].apply(lambda x: datetime.strptime(x, fmt)) + pd.to_timedelta(df['offset'], 'd') Out[5]: 0 2020-11-12 1 2020-11-15 dtype: datetime64[ns] In [6]: (df['begin'].apply(lambda x: datetime.strptime(x, fmt)) + pd.to_timedelta(df['offset'], 'd')).dt.strftime(fmt) Out[6]: 0 20201112 1 20201115 dtype: object
without pandas; single date
For a single date, we can do this without pandas also
In [1]: from datetime import datetime, timedelta print(datetime.now()) 2020-11-21 23:54:10.971599 In [2]: print(datetime.now() + timedelta(days=10)) 2020-12-01 23:54:32.393776
Related links
offset calculation
If the offsets are computed from floating point numbers that were originally derived from integers, watch out for round off errors.
For example, let's say:
- There are two applications App1 and App2 where App1 provides input to App2
- App1 converts the number of days (an integer value) into number of years (a floating point number) and rounds it to 9 digits and gives it to App2.
In [1]: days_original = [158, 928, 882, 341, 596, 878, 526] years_given = [round(x/365.0, 9) for x in days_original] print(years_given) [0.432876712, 2.542465753, 2.416438356, 0.934246575, 1.632876712, 2.405479452, 1.44109589]
Now if App2 tries to compute the offsets by multiplying with 365 and converting that back to integer, it will give wrong results (due to limitations in floating point representation).
In [2]: wrong_offsets = [int(x*365.0) for x in years_given] print(wrong_offsets) [157, 927, 881, 340, 595, 877, 525]
The correct approach is to multiply by 365, apply round() and then convert it back to integer.
In [3]: correct_offsets = [int(round(x*365.0)) for x in years_given] print(correct_offsets) [158, 928, 882, 341, 596, 878, 526]
add_dates.1607020537.txt.gz · Last modified: 2020/12/03 18:35 by raju