User Tools

Site Tools


map_one_column_to_another

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revisionPrevious revision
Next revision
Previous revision
Last revisionBoth sides next revision
map_one_column_to_another [2022/12/19 22:21] rajumap_one_column_to_another [2022/12/19 22:45] – [map one series to another in a case insensitive manner] raju
Line 1: Line 1:
 ==== map one series to another in a case insensitive manner ==== ==== map one series to another in a case insensitive manner ====
  
 +Given two dataframes
 +<code>
 +df1
 +        Item    Team
 +0    jetties  Team_1
 +1      Jetty  Team_2
 +2  Joint use  Team_1
  
 +df2
 +        Item
 +0      Jetty
 +1    Jetties
 +2  Joint use
 +</code>
 +
 +map df2['Item'] to the corresponding Team using df1 as the lookup table.
 +
 +Expected output (if case is included):
 +<code>
 +        Item Team_include_case
 +0      Jetty            Team_2
 +1    Jetties               NaN
 +2  Joint use            Team_1
 +</code>
 +
 +Expected output (if case is ignored):
 +<code>
 +        Item Team_ignore_case
 +0      Jetty           Team_2
 +1    Jetties           Team_1
 +2  Joint use           Team_1
 +</code>
 +
 +Solution:
 +<code>
 + % cat 72994107_v4.py 
 +import pandas as pd
 +
 +df1 = pd.DataFrame({
 +    "Item": ["jetties", "Jetty", "Joint use"],
 +    "Team": ["Team_1", "Team_2", "Team_1"]
 +})
 +print("df1")
 +print(df1)
 +
 +df2 = pd.DataFrame({"Item": ["Jetty", "Jetties", "Joint use"]})
 +print("df2")
 +print(df2)
 +
 +
 +def map_series(s, df, key_col, value_col, ignore_case):
 +    if ignore_case:
 +        result = s.str.lower().map(
 +            df.assign(**{key_col: df[key_col].str.lower()})
 +              .set_index(key_col)[value_col]
 +        )
 +    else:
 +        result = s.map(df.set_index(key_col)[value_col])
 +    return result
 +
 +
 +df2["Team_include_case"] = map_series(
 +    df2["Item"], df1, "Item", "Team", ignore_case=False
 +)
 +df2["Team_ignore_case"] = map_series(
 +    df2["Item"], df1, "Item", "Team", ignore_case=True
 +)
 +print("after mapping")
 +print(df2)
 +</code>
 +
 +Sample run:
 +<code>
 + % python 72994107_v4.py
 +df1
 +        Item    Team
 +0    jetties  Team_1
 +1      Jetty  Team_2
 +2  Joint use  Team_1
 +df2
 +        Item
 +0      Jetty
 +1    Jetties
 +2  Joint use
 +after mapping
 +        Item Team_include_case Team_ignore_case
 +0      Jetty            Team_2           Team_2
 +1    Jetties               NaN           Team_1
 +2  Joint use            Team_1           Team_1
 +</code>
 +
 +Ref:
 +  * https://stackoverflow.com/questions/72994107/pandas-using-map-and-ignore-case-sensitivity - I posted the above answer there.
 +  * https://stackoverflow.com/questions/39767718/pandas-assign-with-new-column-name-as-string - shows how to pass a string variable to the dataframe assign function
 +
 +{{tags>["panadas dataframe assign string as variable"]}}
 ==== map one column to another ==== ==== map one column to another ====
 <code> <code>
map_one_column_to_another.txt · Last modified: 2022/12/19 22:46 by raju