There will be times when you are tempted to loop through rows or columns in Pandas to achieve your results – and the lesson I keep learning is **Don’t do it!**

Every time I’m tempted to write a for loop over Pandas data, I find myself clock-watching and cursing… Nine times out of ten there is another way, and here are a few of my favourite recipes. Enjoy :)

## Some sample data

    import numpy as np
    import pandas as pd

    # Eight sample events with two random statistics, sorted by Type
    df = pd.DataFrame({'Type': ['click', 'buy', 'click', 'buy',
                                'click', 'buy', 'click', 'click'],
                       'Event': ['one', 'one', 'two', 'three',
                                 'two', 'two', 'one', 'three'],
                       'Statistic 1': np.random.randn(8),
                       'Statistic 2': np.random.randn(8)}).sort_values("Type")

## The power of groupby() and lambda

Let’s say that for each Type in our dataframe, we want to create a sequenced list of Events. Nothing easier:

    df_grouped = df.groupby("Type")
    df_list_events = df_grouped['Event'].apply(lambda x: list(x)).to_frame().reset_index()
    df_list_events
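To make the result easier to check by eye, here is a minimal sketch of the same idea on fixed data. The names `events` and `event_lists` and the rows themselves are made up for illustration (they mirror the sample above without the random columns), and `apply(list)` is used as a shorthand for the lambda.

```python
import pandas as pd

# Hypothetical fixed data (no random columns) so the result is reproducible.
events = pd.DataFrame({
    "Type": ["buy", "buy", "buy", "click", "click", "click", "click", "click"],
    "Event": ["one", "three", "two", "one", "two", "two", "one", "three"],
})

# groupby keeps the original row order inside each group,
# so each list follows the order of the rows for that Type.
event_lists = events.groupby("Type")["Event"].apply(list).reset_index()

print(event_lists)
```

Each Type ends up as one row, with all of its Events collected into a single list in row order.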

## The power of itertools and lambda

And now let’s say we’re satisfied with our lists, but we’d like to de-dupe where adjacent events occur (e.g. “two, two” above should be reduced to “two”):

    import itertools

    # itertools.groupby yields one key per run of equal neighbours
    df_list_events["Deduped_Events"] = df_list_events["Event"].apply(lambda x: [k for k, g in itertools.groupby(x)])
    df_list_events
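If the lambda feels a bit dense, the same trick can be tried on a plain Python list first. This is only a sketch: `dedupe_adjacent` and `clicks` are hypothetical names, not part of the recipe above.

```python
import itertools


def dedupe_adjacent(items):
    """Collapse each run of equal neighbouring items into a single item."""
    # groupby starts a new group whenever the value changes,
    # so taking only the keys drops the adjacent repeats.
    return [key for key, _group in itertools.groupby(items)]


clicks = ["one", "two", "two", "one", "three"]
print(dedupe_adjacent(clicks))  # ['one', 'two', 'one', 'three']
```

Note that only neighbouring duplicates are removed: the second "one" survives because it is not next to the first.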

## And let’s not forget the magic of list comprehension

Let’s say we want to add a “U-” prefix to each of our Deduped_Events:

    df_list_events["Deduped_Events"] = df_list_events["Deduped_Events"].apply(lambda x: ["U-" + str(k) for k in x])
    df_list_events
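When the comprehension grows beyond a one-liner, a small named function can stand in for the lambda. A minimal sketch, assuming a hypothetical helper `add_prefix` with a hypothetical `prefix` parameter:

```python
def add_prefix(events, prefix="U-"):
    """Return a new list with the prefix prepended to every event."""
    # str() guards against non-string entries such as numbers.
    return [prefix + str(event) for event in events]


print(add_prefix(["one", "two", "one", "three"]))
# ['U-one', 'U-two', 'U-one', 'U-three']
```

It can then be passed straight to `.apply(add_prefix)` on the Deduped_Events column in place of the lambda.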