add_dates
Differences
This shows you the differences between two versions of the page.
add_dates [2020/11/22 16:26] – raju | add_dates [2021/01/13 14:41] (current) – raju | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ==== Add dates ==== | + | ===== Add dates ===== |
+ | |||
+ | ==== single start date; multiple offsets; with pandas ==== | ||
+ | Prepare the input | ||
+ | < | ||
+ | In [1]: | ||
+ | start_yyyymmdd = ' | ||
+ | fmt = ' | ||
+ | from datetime import datetime | ||
+ | start_date = datetime.strptime(start_yyyymmdd, | ||
+ | |||
+ | In [2]: | ||
+ | print(start_date) | ||
+ | 2020-12-03 00:00:00 | ||
+ | |||
+ | In [3]: | ||
+ | import pandas as pd | ||
+ | days_original = [158, 928, 882, 341, 596, 878, 526] | ||
+ | offset_years = [round(x/ | ||
+ | df = pd.DataFrame({' | ||
+ | |||
+ | In [4]: | ||
+ | print(df) | ||
+ | | ||
+ | 0 0.432877 | ||
+ | 1 2.542466 | ||
+ | 2 2.416438 | ||
+ | 3 0.934247 | ||
+ | 4 1.632877 | ||
+ | 5 2.405479 | ||
+ | 6 1.441096 | ||
+ | </ | ||
+ | Convert the offset to days | ||
+ | < | ||
+ | In [5]: | ||
+ | import numpy as np | ||
+ | df[' | ||
+ | df | ||
+ | Out[5]: | ||
+ | | ||
+ | 0 0.432877 | ||
+ | 1 2.542466 | ||
+ | 2 2.416438 | ||
+ | 3 0.934247 | ||
+ | 4 1.632877 | ||
+ | 5 2.405479 | ||
+ | 6 1.441096 | ||
+ | </ | ||
+ | Apply the offset to get end dates | ||
+ | < | ||
+ | In [6]: | ||
+ | df[' | ||
+ | df | ||
+ | Out[6]: | ||
+ | | ||
+ | 0 0.432877 | ||
+ | 1 2.542466 | ||
+ | 2 2.416438 | ||
+ | 3 0.934247 | ||
+ | 4 1.632877 | ||
+ | 5 2.405479 | ||
+ | 6 1.441096 | ||
+ | </ | ||
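Several statements in the session above are cut off in this diff view, so the steps can be sketched end to end as follows. This is a minimal sketch under stated assumptions, not the page author's exact code: the `'%Y%m%d'` format string, the 6-digit rounding of the year offsets, and the column names `offset_years`, `offset_days` and `end_date` are all assumed for illustration.

```python
from datetime import datetime

import numpy as np
import pandas as pd

# Assumption: the start date is given as YYYYMMDD; the original format
# string is truncated in the source.
start_date = datetime.strptime("20201203", "%Y%m%d")

days_original = [158, 928, 882, 341, 596, 878, 526]

# Assumption: year offsets are days / 365 rounded to 6 digits.
offset_years = [round(x / 365.0, 6) for x in days_original]
df = pd.DataFrame({"offset_years": offset_years})  # hypothetical column name

# Round to the nearest whole day before casting; a bare cast would truncate.
df["offset_days"] = np.round(df["offset_years"] * 365.0).astype(int)

# Add each row's day offset to the single start date.
df["end_date"] = pd.Timestamp(start_date) + pd.to_timedelta(df["offset_days"], unit="D")

print(df)
```

Rounding before the integer cast is the step listed under "demonstrates"; the day offsets recovered this way match `days_original`.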
+ | |||
+ | Used | Python 3.8.5, ipython 7.18.1, pandas 1.1.3, numpy 1.19.2 | ||
+ | |||
+ | demonstrates | round float to int | ||
+ | |||
+ | ==== multiple dates; different offsets; with pandas ==== | ||
Given | Given | ||
< | < | ||
Line 66: | Line 134: | ||
</ | </ | ||
- | For a single date, we can do this without pandas | + | ==== single date; single offset; |
< | < | ||
In [1]: | In [1]: | ||
Line 78: | Line 146: | ||
</ | </ | ||
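The body of this example is not visible in the diff view. As a rough sketch of the single-date case using only the standard library (the start date and the 158-day offset are borrowed from the pandas example purely for illustration and are not necessarily what the page uses):

```python
from datetime import datetime, timedelta

# Illustrative values only; the page's own inputs are elided in this diff.
start_date = datetime.strptime("20201203", "%Y%m%d")
offset_days = 158

# timedelta handles month and year boundaries automatically.
end_date = start_date + timedelta(days=offset_days)
print(end_date.strftime("%Y-%m-%d"))
```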
- | Related links: | + | ==== Related links ==== |
* https:// | * https:// | ||
+ | * https:// | ||
+ | |||
+ | ==== offset calculation ==== | ||
+ | If the offsets are computed from floating-point numbers that were originally derived from integers, watch out for round-off errors. | ||
+ | |||
+ | For example, let's say: | ||
+ | * There are two applications App1 and App2 | ||
+ | * App1 converts the number of days (an integer value) into a number of years (a floating-point number) and rounds it to 9 decimal places | ||
+ | * App2 reads that as input | ||
+ | |||
+ | < | ||
+ | In [1]: | ||
+ | days_original = [158, 928, 882, 341, 596, 878, 526] | ||
+ | years_given = [round(x/ | ||
+ | print(years_given) | ||
+ | [0.432876712, | ||
+ | </ | ||
+ | |||
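+ | Now if App2 tries to recover the offsets by multiplying by 365 and converting the result back to an integer with int(), it gets wrong results. int() truncates toward zero, and in this example every product lands just below the original whole number, because the year values were rounded to 9 decimal places and cannot be represented exactly in floating point. | ||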
+ | < | ||
+ | In [2]: | ||
+ | wrong_offsets = [int(x*365.0) for x in years_given] | ||
+ | print(wrong_offsets) | ||
+ | [157, 927, 881, 340, 595, 877, 525] | ||
+ | </ | ||
+ | |||
+ | The correct approach is to multiply by 365, apply round(), and only then convert the result to an integer. | ||
+ | < | ||
+ | In [3]: | ||
+ | correct_offsets = [int(round(x*365.0)) for x in years_given] | ||
+ | print(correct_offsets) | ||
+ | [158, 928, 882, 341, 596, 878, 526] | ||
+ | </ | ||
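To see why truncation fails, it may help to print the intermediate products. The sketch below reuses the inputs from the session above; the names `products`, `wrong_offsets` and `correct_offsets` are just illustrative.

```python
days_original = [158, 928, 882, 341, 596, 878, 526]

# App1: convert days to years and round to 9 decimal places.
years_given = [round(x / 365.0, 9) for x in days_original]

# App2: multiply back by 365. Each product falls slightly short of a whole number.
products = [y * 365.0 for y in years_given]

wrong_offsets = [int(p) for p in products]      # truncates toward zero
correct_offsets = [round(p) for p in products]  # nearest integer

for days, p, wrong, right in zip(days_original, products, wrong_offsets, correct_offsets):
    print(f"{days}: product={p:.9f} int()={wrong} round()={right}")
```

In Python 3, `round()` called with a single argument already returns an `int`, so the outer `int()` in `int(round(x*365.0))` is harmless but not strictly needed.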
add_dates.1606062412.txt.gz · Last modified: 2020/11/22 16:26 by raju