Get the first non null value in each column
Task
Get the first non-null value in each column of a pandas DataFrame.
Corner cases:
- If a column is all NaNs, return a NaN.
For example, given
jim  joe  jolie  jack
  0  1.0    NaN   NaN
  0  NaN    2.0   NaN
We want
jim  joe  jolie  jack
  0  1.0    2.0   NaN
Solution
$ ipython

In [1]: import pandas as pd
        import numpy as np
        df = pd.DataFrame({'jim': [0, 0],
                           'joe': [1, np.nan],
                           'jolie': [np.nan, 2],
                           'jack': [np.nan, np.nan]})
        df
Out[1]:
   jim  joe  jolie  jack
0    0  1.0    NaN   NaN
1    0  NaN    2.0   NaN

In [2]: def get_first_non_nan(s):
            values = s.loc[~s.isnull()]
            value = values.iloc[0] if not values.empty else np.nan
            return value

In [3]: df.groupby('jim').agg(get_first_non_nan)
Out[3]:
     joe  jolie  jack
jim
0    1.0    2.0   NaN

In [4]: df.groupby('jim').agg(get_first_non_nan).reset_index()
Out[4]:
   jim  joe  jolie  jack
0    0  1.0    2.0   NaN
Tested with Python 3.9.4 and IPython 7.22.0.
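As a possible alternative to the custom aggregation function in the solution, pandas has built-ins that appear to cover the same case. The sketch below is not part of the original session and was not run under the versions listed above. It assumes the same example frame. The names first_values and grouped_first are made up for illustration.

```python
import numpy as np
import pandas as pd

# Same example frame as in the solution.
df = pd.DataFrame({'jim': [0, 0],
                   'joe': [1, np.nan],
                   'jolie': [np.nan, 2],
                   'jack': [np.nan, np.nan]})

# Without grouping: back-fill pulls each column's first non-null value
# up to the first row; a column that is all NaN stays NaN.
# The result is a Series indexed by column name.
first_values = df.bfill().iloc[0]

# With grouping: GroupBy.first() takes the first non-null entry of each
# column within every group, so no custom function is needed.
grouped_first = df.groupby('jim').first().reset_index()
```

The back-fill variant ignores the grouping key and treats the whole frame as one group, so it only matches the solution when every row shares the same jim value, as in this example.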
Last modified: 2021/09/15 21:14 by raju