User Tools

Site Tools


add_dates

This is an old revision of the document!


Add dates

with pandas; multiple dates

Given

In [1]: 
import pandas as pd
df = pd.DataFrame({'begin': ['20201107', '20201121'], 'offset': [5, -6]})
df
Out[1]: 
      begin  offset
0  20201107       5
1  20201121      -6

In [2]: 
df.dtypes
Out[2]: 
begin     object
offset     int64
dtype: object

we want

      begin  offset       end
0  20201107       5  20201112
1  20201121      -6  20201115

where end date = begin date + offset

Solution:

In [3]: 
fmt='%Y%m%d'
from datetime import datetime
df['end'] = (df['begin'].apply(lambda x: datetime.strptime(x, fmt)) +
             pd.to_timedelta(df['offset'], 'd')).dt.strftime(fmt)
df
Out[3]: 
      begin  offset       end
0  20201107       5  20201112
1  20201121      -6  20201115

Intermediate steps:

In [4]: 
df['begin'].apply(lambda x: datetime.strptime(x, fmt))
Out[4]: 
0   2020-11-07
1   2020-11-21
Name: begin, dtype: datetime64[ns]

In [5]: 
df['begin'].apply(lambda x: datetime.strptime(x, fmt)) + pd.to_timedelta(df['offset'], 'd')
Out[5]: 
0   2020-11-12
1   2020-11-15
dtype: datetime64[ns]

In [6]: 
(df['begin'].apply(lambda x: datetime.strptime(x, fmt)) +
 pd.to_timedelta(df['offset'], 'd')).dt.strftime(fmt)
Out[6]: 
0    20201112
1    20201115
dtype: object

without pandas; single date

For a single date, we can do this without pandas also

In [1]: 
from datetime import datetime, timedelta
print(datetime.now())
2020-11-21 23:54:10.971599

In [2]: 
print(datetime.now() + timedelta(days=10))
2020-12-01 23:54:32.393776

offset calculation

If the offsets are computed from floating point numbers that were originally derived from integers, watch out for round off errors.

For example, let's say:

  • There are two applications App1 and App2 where App1 provides input to App2
  • App1 converts the number of days (an integer value) into number of years (a floating point number) and rounds it to 9 digits and gives that to App2.
In [1]:
days_original = [158, 928, 882, 341, 596, 878, 526]
years_given = [round(x/365.0, 9) for x in days_original]
print(years_given)
[0.432876712, 2.542465753, 2.416438356, 0.934246575, 1.632876712, 2.405479452, 1.44109589]

Now if App2 tries to compute the offsets by multiplying with 365 and converting back to integers, it will give wrong results (due to limitations in floating point representation)

In [2]:
wrong_offsets = [int(x*365.0) for x in years_given]
print(wrong_offsets)
[157, 927, 881, 340, 595, 877, 525]

The correct approach is to multiply by 365, apply round() and convert that into integers.

In [3]:
correct_offsets = [int(round(x*365.0)) for x in years_given]
print(correct_offsets)
[158, 928, 882, 341, 596, 878, 526]
add_dates.1607020404.txt.gz · Last modified: 2020/12/03 18:33 by raju