pandas_groupby

Differences

This shows you the differences between two versions of the page.

pandas_groupby [2024/03/26 22:17] – [extract groupby object by key] raju
pandas_groupby [2024/05/07 20:46] – [extract groupby object by key] raju
Line 235: Line 235:
  
 Ref: https://stackoverflow.com/questions/49859182/understanding-level-0-and-group-keys
 +
 +==== filter out groups that don't satisfy a criterion ====
 +tags | pandas groupby filter groups
 +<code>
 +In [2]:
 +import pandas as pd
 +df = pd.DataFrame({
 +    'A' : ['foo', 'bar', 'foo', 'bar', 'foo', 'bar'],
 +    'B' : [1, 2, 3, 4, 5, 6],
 +    'C' : [2.0, 5., 8., 1., 2., 9.]})
 +df
 +Out[2]:
 +     A  B    C
 +0  foo  1  2.0
 +1  bar  2  5.0
 +2  foo  3  8.0
 +3  bar  4  1.0
 +4  foo  5  2.0
 +5  bar  6  9.0
 +
 +In [3]:
 +grouped = df.groupby('A')
 +
 +In [4]:
 +grouped.filter(lambda x: x['B'].mean() > 3.)
 +Out[4]:
 +     A  B    C
 +1  bar  2  5.0
 +3  bar  4  1.0
 +5  bar  6  9.0
 +</code>
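
The criterion passed to filter() is evaluated once per group, and every row of a group is kept or dropped together. As a rough sketch (the variable names kept, mask and kept_via_mask are illustrative, not from the original page), the same rows can also be selected with a boolean mask built from transform(), which broadcasts each group's mean back onto its rows:

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar'],
    'B': [1, 2, 3, 4, 5, 6],
    'C': [2.0, 5., 8., 1., 2., 9.]})

# filter() calls the function once per group and expects a single True/False;
# all rows of a group are kept or dropped together
kept = df.groupby('A').filter(lambda g: g['B'].mean() > 3.)

# Alternative: broadcast each group's mean of B back to its rows,
# then use the comparison as an ordinary row mask
mask = df.groupby('A')['B'].transform('mean') > 3.
kept_via_mask = df[mask]
```

The mask form avoids calling a Python lambda per group, which may matter when there are many groups; whether it is actually faster depends on the data.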
  
 ==== extract groupby object by key ====
Line 321: Line 352:
  
 ==== groupby slicing ====
 +Consider the following DataFrame:
 +<code>
 +In [1]: 
 +import pandas as pd
 +import numpy as np
 +rand = np.random.RandomState(1)
 +df = pd.DataFrame({'A': ['foo', 'bar'] * 3,
 +                   'B': rand.randn(6),
 +                   'C': rand.randint(0, 20, 6)})
 +
 +In [2]: 
 +df
 +Out[2]: 
 +     A         B   C
 +0  foo  1.624345   5
 +1  bar -0.611756  18
 +2  foo -0.528172  11
 +3  bar -1.072969  10
 +4  foo  0.865408  14
 +5  bar -2.301539  18
 +</code>
 +
 +Group by column 'A':
 +<code>
 +In [3]: 
 +gb = df.groupby('A')  # scalar key, so get_group() accepts a plain string
 +</code>
 +
 +You can use get_group() to get a single group:
 +<code>
 +In [4]: 
 +gb.get_group('foo')
 +Out[4]: 
 +     A         B   C
 +0  foo  1.624345   5
 +2  foo -0.528172  11
 +4  foo  0.865408  14
 +</code>
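
To see which keys get_group() accepts, it can help to inspect the groupby object first. The following is a minimal sketch, assuming the B and C values are copied (rounded) from the output shown above rather than regenerated; labels, foo, same and keys are illustrative names:

```python
import pandas as pd

# Values copied (rounded) from the displayed output above
df = pd.DataFrame({'A': ['foo', 'bar'] * 3,
                   'B': [1.624345, -0.611756, -0.528172,
                         -1.072969, 0.865408, -2.301539],
                   'C': [5, 18, 11, 10, 14, 18]})
gb = df.groupby('A')

# groups maps each group key to the row labels that belong to it
labels = {key: list(idx) for key, idx in gb.groups.items()}

# get_group() selects the same rows as a plain boolean selection on 'A'
foo = gb.get_group('foo')
same = foo.index.equals(df[df['A'] == 'foo'].index)

# Iterating yields (key, sub-DataFrame) pairs, sorted by key by default
keys = [key for key, _ in gb]
```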
 +
 +You can select a subset of columns by slicing the groupby object before calling get_group():
 +<code>
 +In [5]: 
 +gb[['A', 'B']].get_group('foo')
 +Out[5]: 
 +     A         B
 +0  foo  1.624345
 +2  foo -0.528172
 +4  foo  0.865408
 +
 +In [6]: 
 +gb[['C']].get_group('foo')
 +Out[6]: 
 +    C
 +0   5
 +2  11
 +4  14
 +</code>
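
Slicing is not limited to get_group(); the sliced object can be aggregated directly. A small sketch, again assuming values copied (rounded) from the output above, with c_sum and bc_mean as illustrative names:

```python
import pandas as pd

# Values copied (rounded) from the displayed output above
df = pd.DataFrame({'A': ['foo', 'bar'] * 3,
                   'B': [1.624345, -0.611756, -0.528172,
                         -1.072969, 0.865408, -2.301539],
                   'C': [5, 18, 11, 10, 14, 18]})
gb = df.groupby('A')

# Single brackets give a SeriesGroupBy; aggregating it returns a Series
c_sum = gb['C'].sum()

# Double brackets keep a DataFrameGroupBy; aggregating it returns a DataFrame
bc_mean = gb[['B', 'C']].mean()
```

Both results are indexed by the group keys ('bar', 'foo'), so a single value can be looked up with, for example, c_sum['foo'].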
  
 +Ref:
 +  * https://stackoverflow.com/questions/14734533/how-to-access-subdataframes-of-pandas-groupby-by-key
  
 ==== apply a function on each group ====
pandas_groupby.txt · Last modified: 2024/05/07 20:47 by raju