15 buildings, 35 aspects per building, each weighted according to importance, and then condensed into 6 key categories, with colour coding depending on how the results tumble out.
What helped me get the final result?
- Pandas
  - to import data from a spreadsheet
  - to extract only the data I need
  - to manipulate it so we end up with scores (0 – 1) instead of qualitative assessments like “in progress”
  - and finally to pivot the data into a summarized format
- matplotlib.gridspec (GridSpec)
  - to draw a grid into which I place the graphs
  - to easily achieve the correct spacing and placement between graphs
- matplotlib.pyplot
  - to write a function that draws each graph and colour-codes the bars, instead of writing my code 15 times – don’t laugh, I’m really proud of this part: it was challenging for me!
  - and then to draw the actual graphs
- Plain Python
  - for everything else, like making sure the graphs appear in descending order of building size
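To make the steps above a bit more concrete, here is a minimal sketch of the whole flow on a tiny made-up dataset. It is not my actual project code: the column names, the status-to-score mapping, the colour thresholds, the building sizes and the helper names (bar_colour, draw_building) are all assumptions for illustration, and the real data would come from a spreadsheet via something like pd.read_excel.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.gridspec import GridSpec

# Hypothetical stand-in for the spreadsheet data
raw = pd.DataFrame({
    "building": ["A", "A", "A", "B", "B", "B"],
    "category": ["Safety", "Safety", "Energy", "Safety", "Energy", "Energy"],
    "status": ["done", "in progress", "not started", "done", "done", "in progress"],
    "weight": [2, 1, 1, 1, 3, 1],
})
building_size = {"A": 1200, "B": 3400}  # assumed sizes, used only for ordering

# Qualitative assessments -> scores between 0 and 1 (assumed mapping)
status_score = {"not started": 0.0, "in progress": 0.5, "done": 1.0}
raw["score"] = raw["status"].map(status_score)

# Weighted average per building and category, pivoted to buildings x categories
raw["weighted"] = raw["score"] * raw["weight"]
totals = raw.pivot_table(index="building", columns="category",
                         values=["weighted", "weight"], aggfunc="sum")
summary = totals["weighted"] / totals["weight"]


def bar_colour(score):
    """Pick a bar colour from a 0-1 score (thresholds are assumptions)."""
    if score >= 0.75:
        return "green"
    if score >= 0.4:
        return "orange"
    return "red"


def draw_building(ax, name, scores):
    """Draw one building's colour-coded bar chart on the given axes."""
    colours = [bar_colour(s) for s in scores.values]
    ax.bar(scores.index, scores.values, color=colours)
    ax.set_ylim(0, 1)
    ax.set_title(name)


# Plain Python: largest building first
ordered = sorted(summary.index, key=building_size.get, reverse=True)

# One grid cell per building; wspace controls the gap between graphs
fig = plt.figure(figsize=(8, 3))
grid = GridSpec(nrows=1, ncols=len(ordered), figure=fig, wspace=0.4)
for position, name in enumerate(ordered):
    ax = fig.add_subplot(grid[0, position])
    draw_building(ax, name, summary.loc[name])
```

The point of draw_building is that the loop at the end stays the same whether there are 2 buildings or 15; only the grid dimensions would need to change.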
What was difficult about it?
I ended up with different lists, dictionaries, and dataframes for different aspects of the project – the most challenging thing at the beginning was wrapping my head around extracting a value from one source, and passing it to another source as a selection criterion.
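As a rough illustration of what I mean by passing a value from one source to another, here is a small sketch with invented names and numbers: a value is looked up in a dictionary and then used to select rows from a dataframe.

```python
import pandas as pd

# Hypothetical sources: a dictionary of sizes and a dataframe of scores
sizes = {"North": 1200, "South": 3400}
scores = pd.DataFrame({
    "building": ["North", "South", "South"],
    "category": ["Energy", "Energy", "Safety"],
    "score": [0.2, 0.5, 1.0],
})

# Extract a value from one source (the name of the largest building)...
largest = max(sizes, key=sizes.get)

# ...and pass it to another source as the selection criterion
selected = scores.loc[scores["building"] == largest, "score"]
```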
What was useful in overcoming the difficulties?
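- In trying to make my code as generic as possible I quickly discovered the merits of .iloc[] over .loc[] – especially in my for loops! (.iloc[] selects by integer position, .loc[] by label, and both take square brackets rather than parentheses.)
- For a df with a labelled index, df1.index[0] is useful for getting an index label by its position, without referring to it by name!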
- Which would then allow me to get the column number based on the named index value in a related dataframe: df2.columns.get_loc(df1.index[0])
- And this would allow me to extract values based on numeric locations: df2.columns.values[n] for the column name, or df2.iloc[:, n] for the column itself, where n is the number returned by get_loc
- And then zip was great for quickly building a dictionary from 2 lists: my_dict = dict(zip(my_list1, my_list2))
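Put together, the tricks above look roughly like this. The two dataframes are made up for the example (df1 holds building sizes indexed by building name, df2 holds scores with one column per building), so treat the names and numbers as assumptions.

```python
import pandas as pd

# Hypothetical related dataframes: df1's index labels match df2's column names
df1 = pd.DataFrame({"size": [3400, 1200]}, index=["South", "North"])
df2 = pd.DataFrame({"North": [0.2, 0.9], "South": [0.5, 1.0]},
                   index=["Energy", "Safety"])

# Index label by position, without typing the name
first_label = df1.index[0]

# Column number of that label in the related dataframe
position = df2.columns.get_loc(first_label)

# Extract by numeric location: the column itself, and its name
column_by_position = df2.iloc[:, position]
name_at_position = df2.columns.values[position]

# Build a dictionary from two lists with zip
my_dict = dict(zip(df1.index.tolist(), df1["size"].tolist()))
```

Because nothing after the first two lines mentions a building by name, the same code works inside a for loop over range(len(df1.index)).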