pandas_series
Differences
This shows you the differences between two versions of the page.
pandas_series [2021/09/15 19:57] – raju | pandas_series [2024/02/06 05:18] (current) – [return a random element] raju | ||
---|---|---|---|
Line 1: | Line 1: | ||
+ | ===== creating a series ===== | ||
+ | ==== create a series from a list ==== | ||
+ | < | ||
+ | >>> | ||
+ | >>> | ||
+ | 0 sun | ||
+ | 1 mon | ||
+ | 2 tue | ||
+ | dtype: object | ||
+ | </ | ||
+ | |||
+ | To assign an index | ||
+ | < | ||
+ | >>> | ||
+ | >>> | ||
+ | s sun | ||
+ | m mon | ||
+ | t tue | ||
+ | dtype: object | ||
+ | </ | ||
+ | |||
+ | To assign a name to the column | ||
+ | < | ||
+ | >>> | ||
+ | >>> | ||
+ | s sun | ||
+ | m mon | ||
+ | t tue | ||
+ | Name: day, dtype: object | ||
+ | </ | ||
+ | |||
+ | To assign a name to the index | ||
+ | < | ||
+ | >>> | ||
+ | >>> | ||
+ | >>> | ||
+ | letter | ||
+ | s sun | ||
+ | m mon | ||
+ | t tue | ||
+ | Name: day, dtype: object | ||
+ | </ | ||
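The statements in the sessions above are cut off, so the following is only a minimal sketch of one way to produce the same outputs. The values and labels come from the outputs shown; the variable names and the exact constructor arguments are assumptions.

```python
import pandas as pd

# Plain series: pandas assigns the default integer index 0, 1, 2.
days = pd.Series(["sun", "mon", "tue"])

# Same values with an explicit index.
indexed = pd.Series(["sun", "mon", "tue"], index=["s", "m", "t"])

# Same again, with a name for the series (shown as "Name: day").
named = pd.Series(["sun", "mon", "tue"], index=["s", "m", "t"], name="day")

# Name the index itself (shown as "letter" above the index labels).
named.index.name = "letter"

print(days)
print(indexed)
print(named)
```

The name and the index name can also be set after construction, which is handy when the series comes from somewhere else.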
+ | |||
+ | The series name is useful when converting the series to a dataframe, because it becomes the column name. | ||
+ | < | ||
+ | >>> | ||
+ | 0 | ||
+ | s sun | ||
+ | m mon | ||
+ | t tue | ||
+ | |||
+ | >>> | ||
+ | day | ||
+ | s sun | ||
+ | m mon | ||
+ | t tue | ||
+ | </ | ||
+ | |||
+ | If the series has no name but the dataframe column should have one, pass the name to to_frame: | ||
+ | < | ||
+ | >>> | ||
+ | days | ||
+ | s sun | ||
+ | m mon | ||
+ | t tue | ||
+ | </ | ||
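As a rough illustration of the three to_frame cases above (no name, an existing name, and a name supplied at conversion time), here is a small sketch. The variable names are hypothetical, and the series is rebuilt from the values in the outputs.

```python
import pandas as pd

unnamed = pd.Series(["sun", "mon", "tue"], index=["s", "m", "t"])
named = unnamed.rename("day")  # a scalar argument sets the series name

default_frame = unnamed.to_frame()             # column label is the integer 0
named_frame = named.to_frame()                 # column label is "day"
renamed_frame = unnamed.to_frame(name="days")  # column label is "days"

print(default_frame)
print(named_frame)
print(renamed_frame)
```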
+ | |||
+ | The index name comes in handy when resetting the index: it becomes the name of the new column, which is otherwise called index. | ||
+ | < | ||
+ | >>> | ||
+ | index day | ||
+ | 0 s sun | ||
+ | 1 m mon | ||
+ | 2 t tue | ||
+ | >>> | ||
+ | letter | ||
+ | 0 s sun | ||
+ | 1 m mon | ||
+ | 2 t tue | ||
+ | </ | ||
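The two reset_index results above can be reproduced with something like the following sketch. It assumes the series is named day and uses rename_axis to attach the index name; the original session may have set it differently.

```python
import pandas as pd

s = pd.Series(["sun", "mon", "tue"], index=["s", "m", "t"], name="day")

# Without an index name, the old index lands in a column called "index".
without_index_name = s.reset_index()

# With an index name, that name is used for the new column instead.
with_index_name = s.rename_axis("letter").reset_index()

print(without_index_name)
print(with_index_name)
```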
+ | |||
===== dummy ===== | ===== dummy ===== | ||
==== append element to series ==== | ==== append element to series ==== | ||
Line 21: | Line 102: | ||
</ | </ | ||
+ | ==== return a random element ==== | ||
+ | Use pandas.Series.sample | ||
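A minimal sketch of picking one random element with sample. The series contents and the random_state value are assumptions chosen only to make the example repeatable.

```python
import pandas as pd

s = pd.Series(["sun", "mon", "tue"], index=["s", "m", "t"])

# sample returns a Series, even when n=1; random_state makes the draw repeatable.
one = s.sample(n=1, random_state=0)

# Pull the scalar value out of the one-element Series.
value = one.iloc[0]
print(value)
```

Dropping random_state gives a different element on each call.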
+ | |||
+ | Ref:- | ||
+ | * https:// | ||
+ | |||
+ | ==== expand a series ==== | ||
+ | tags | using reindex, change index | ||
+ | |||
+ | Given two series S and I of length n, and an integer N that is >= n and larger than every value in I, the idea is to expand S into an N-element series E so that E[I[:]] = S[:], with every other element set to 0. | ||
+ | |||
+ | For example, if S is [3.4, 1.8], I is [3, 5], and N is 10, we want E to be [0, 0, 0, 3.4, 0, 1.8, 0, 0, 0, 0]. | ||
+ | |||
+ | < | ||
+ | import pandas as pd | ||
+ | import numpy as np | ||
+ | |||
+ | def expand_series(S, | ||
+ | E = pd.Series(S.values, | ||
+ | return E | ||
+ | </ | ||
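The function body above is truncated, so here is one possible sketch of the idea based on reindex (the approach named in the tags). The parameter names I and N, the fill value of 0, and the sample dataframe with columns id and val are assumptions taken from the description and the printed output, not the original code.

```python
import pandas as pd

def expand_series(S, I, N):
    """Place S[k] at position I[k] of a length-N series and fill the rest with 0."""
    # Re-label the values of S with the target positions given by I.
    E = pd.Series(S.to_numpy(), index=pd.Index(I.to_numpy(), name=I.name), name=S.name)
    # Reindex onto 0..N-1; positions not present in I get fill_value.
    return E.reindex(range(N), fill_value=0)

# Hypothetical input matching the worked example: S = [3.4, 1.8], I = [3, 5], N = 10.
df = pd.DataFrame({"id": [3, 5], "val": [3.4, 1.8]})
expanded = expand_series(df["val"], df["id"], 10)
print(expanded)
```

This sketch assumes the values in I are unique and all smaller than N; entries of I outside 0..N-1 are silently dropped by reindex.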
+ | |||
+ | < | ||
+ | df = pd.DataFrame({' | ||
+ | print(df) | ||
+ | </ | ||
+ | |||
+ | < | ||
+ | | ||
+ | 0 | ||
+ | 1 | ||
+ | </ | ||
+ | |||
+ | < | ||
+ | unravelled_series = expand_series(df[' | ||
+ | print(unravelled_series) | ||
+ | </ | ||
+ | |||
+ | < | ||
+ | id | ||
+ | 0 0.0 | ||
+ | 1 0.0 | ||
+ | 2 0.0 | ||
+ | 3 3.4 | ||
+ | 4 0.0 | ||
+ | 5 1.8 | ||
+ | 6 0.0 | ||
+ | 7 0.0 | ||
+ | 8 0.0 | ||
+ | 9 0.0 | ||
+ | Name: val, dtype: float64 | ||
+ | </ | ||
+ | |||
+ | Sample code: https:// | ||
+ | |||
+ | Ref: | ||
+ | |||
+ | * https:// | ||
+ | * https:// | ||
+ | * https:// | ||
+ | |||
+ | ==== Convert series to a dataframe ==== | ||
+ | |||
+ | Use to_frame(). By default, the series name becomes the column name in the dataframe, but you can also pass a name when calling to_frame. | ||
+ | |||
+ | < | ||
+ | >>> | ||
+ | >>> | ||
+ | >>> | ||
+ | s sun | ||
+ | m mon | ||
+ | t tue | ||
+ | dtype: object | ||
+ | >>> | ||
+ | 0 | ||
+ | s sun | ||
+ | m mon | ||
+ | t tue | ||
+ | >>> | ||
+ | >>> | ||
+ | s sun | ||
+ | m mon | ||
+ | t tue | ||
+ | Name: day, dtype: object | ||
+ | >>> | ||
+ | day | ||
+ | s sun | ||
+ | m mon | ||
+ | t tue | ||
+ | >>> | ||
+ | days | ||
+ | s sun | ||
+ | m mon | ||
+ | t tue | ||
+ | </ | ||
===== check if ===== | ===== check if ===== | ||
==== check if a series is empty ==== | ==== check if a series is empty ==== | ||
Line 65: | Line 241: | ||
Used Python 3.9.4 and IPython 7.22.0 | Used Python 3.9.4 and IPython 7.22.0 | ||
+ | |||
+ | tags | check if a series has at least one element | ||
+ | |||
+ | ==== check if all elements in a series are unique ==== | ||
+ | Use pandas.Series.is_unique | ||
+ | |||
+ | < | ||
+ | In [1]: | ||
+ | import pandas as pd | ||
+ | |||
+ | In [2]: | ||
+ | pd.Series([1, | ||
+ | Out[2]: | ||
+ | True | ||
+ | |||
+ | In [3]: | ||
+ | pd.Series([1, | ||
+ | Out[3]: | ||
+ | False | ||
+ | </ | ||
+ | |||
+ | Missing values are treated like any other value, so a series with multiple NaNs is not unique and is_unique returns False. If this is not desired, drop the NaNs first. | ||
+ | < | ||
+ | In [4]: | ||
+ | import numpy as np | ||
+ | pd.Series([1, | ||
+ | Out[4]: | ||
+ | False | ||
+ | |||
+ | In [5]: | ||
+ | pd.Series([1, | ||
+ | Out[5]: | ||
+ | True | ||
+ | </ | ||
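The inputs in the cells above are cut off. The following sketch shows the same behaviour with assumed values: repeated NaNs count as duplicates until they are dropped.

```python
import numpy as np
import pandas as pd

with_nans = pd.Series([1, np.nan, np.nan])

# The two NaNs are treated as the same value, so the series is not unique.
unique_with_nans = with_nans.is_unique

# After dropping the NaNs only the value 1 remains, which is unique.
unique_without_nans = with_nans.dropna().is_unique

print(unique_with_nans, unique_without_nans)
```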
+ | |||
+ | For completeness | ||
+ | < | ||
+ | In [6]: | ||
+ | pd.Series([1, | ||
+ | Out[6]: | ||
+ | False | ||
+ | |||
+ | In [7]: | ||
+ | pd.Series([1, | ||
+ | Out[7]: | ||
+ | False | ||
+ | </ | ||
+ | |||
+ | Using | pandas 1.5.3, python 3.11.4, ipython 8.12.0 | ||
+ | |||
+ | Ref:- | ||
+ | * https:// | ||
+ | * https:// | ||
+ |
pandas_series.1631735874.txt.gz · Last modified: 2021/09/15 19:57 by raju