Generate the data and import packages¶
First, we need to create the data. I'll start by defining it as a dictionary and then convert it into a pandas DataFrame, since pandas is commonly used in many projects for data manipulation.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
color_dict = {
"Norway": "#2B314D",
"Denmark": "#A54836",
"Sweden": "#5375D4",
}
xy_ticklabel_color, grand_totals_color, grid_color, datalabels_color = "#757C85", "#101628", "#ECEFEF", "#FFFFFF"
data = {
"year": [2004, 2022, 2004, 2022, 2004, 2022],
"countries": ["Sweden", "Sweden", "Denmark", "Denmark", "Norway", "Norway"],
"sites": [13, 15, 4, 10, 5, 8],
}
df = pd.DataFrame(data)
df
| | year | countries | sites |
|---|---|---|---|
| 0 | 2004 | Sweden | 13 |
| 1 | 2022 | Sweden | 15 |
| 2 | 2004 | Denmark | 4 |
| 3 | 2022 | Denmark | 10 |
| 4 | 2004 | Norway | 5 |
| 5 | 2022 | Norway | 8 |
We will sort the data to match the original chart and add two new columns:
- the year labels ('22, '04) in year_lbl, and
- the color for each country in color.
df = df.sort_values(["sites"], ascending=True).reset_index(drop=True)
df["year_lbl"] = "'" + df["year"].astype(str).str[-2:]
# map the colors of a dict to a dataframe
df["color"] = df.countries.map(color_dict)
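As a quick sanity check of the label logic, here is a minimal plain-Python sketch of the same two-digit year label. The helper name year_label is hypothetical and not part of the original code, and it assumes four-digit years.

```python
def year_label(year: int) -> str:
    """Return an abbreviated label such as '04 for 2004 (hypothetical helper)."""
    # Same idea as the pandas expression: keep the last two digits
    # of the year and prefix them with an apostrophe.
    return "'" + str(year)[-2:]


labels = [year_label(y) for y in (2004, 2022)]
print(labels)
```

The pandas version applies the same string slicing to the whole column at once through the .str accessor.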
We then add a new column called new_sites, which reverses the site counts. This approach allows us to maintain the default direction of the polar chart, avoiding complications when plotting curved text.
Since the arc range goes from 0 to 20, we define a variable called cnt_segments and subtract each site's value from it to reverse the order.
cnt_segments = 20  # number of segments to divide the arc into
df["new_sites"] = cnt_segments - df.sites
df
| | year | countries | sites | year_lbl | color | new_sites |
|---|---|---|---|---|---|---|
| 0 | 2004 | Denmark | 4 | '04 | #A54836 | 16 |
| 1 | 2004 | Norway | 5 | '04 | #2B314D | 15 |
| 2 | 2022 | Norway | 8 | '22 | #2B314D | 12 |
| 3 | 2022 | Denmark | 10 | '22 | #A54836 | 10 |
| 4 | 2004 | Sweden | 13 | '04 | #5375D4 | 7 |
| 5 | 2022 | Sweden | 15 | '22 | #5375D4 | 5 |
Define the elements of the chart¶
Even though the original chart looks like an arc, we will draw it with the ax.bar() method in polar coordinates, which takes the following parameters:
| Parameter | Description | Value |
|---|---|---|
| x | The angular position of each bar, in radians (values between 0 and 2π) | The starting angle of each bar around the circle, defined by the angular_positions variable. |
| width | The angular width of each bar, in radians | Defined by the angular_with variable (the name used in the code). |
| height | The length of each bar, measured outward from its bottom | In a radial plot, this is the radial thickness of the bar, defined by the bar_length variable. |
| bottom | The starting radius of each bar from the center | Shifts the bars outward so they don't start at the origin (the center of the circle), defined by the starting_radius variable. |
Angular positions¶
To calculate the angular_positions, we divide 180 degrees (since we're plotting only a half circle) into 20 segments.
We then map each segment's angle to its corresponding site value, so we know the angle for each country, and add this information to the DataFrame for later use.
# divide 180 degrees into 20 segments
angular_positions = np.deg2rad(np.linspace(0, 180, cnt_segments, endpoint=False))
# map the angle where each site bubble should be and add it to the dataframe
df["angles"] = angular_positions[df.new_sites]
df
| | year | countries | sites | year_lbl | color | new_sites | angles |
|---|---|---|---|---|---|---|---|
| 0 | 2004 | Denmark | 4 | '04 | #A54836 | 16 | 2.513274 |
| 1 | 2004 | Norway | 5 | '04 | #2B314D | 15 | 2.356194 |
| 2 | 2022 | Norway | 8 | '22 | #2B314D | 12 | 1.884956 |
| 3 | 2022 | Denmark | 10 | '22 | #A54836 | 10 | 1.570796 |
| 4 | 2004 | Sweden | 13 | '04 | #5375D4 | 7 | 1.099557 |
| 5 | 2022 | Sweden | 15 | '22 | #5375D4 | 5 | 0.785398 |
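To make the mapping from site count to angle explicit, the sketch below wraps the same lookup in a small function and reproduces two values from the table above. The helper name site_to_angle is hypothetical and not part of the original code.

```python
import numpy as np

cnt_segments = 20  # same value as in the walkthrough

# 20 segment start angles over a half circle; with endpoint=False the
# last start angle is 171 degrees, so each segment is 9 degrees wide.
angular_positions = np.deg2rad(np.linspace(0, 180, cnt_segments, endpoint=False))


def site_to_angle(sites: int) -> float:
    """Angle in radians for a site count (hypothetical helper)."""
    # Reversed index: larger counts land at smaller angles (further right).
    return float(angular_positions[cnt_segments - sites])


step_deg = 180 / cnt_segments      # degrees per segment
sweden_2022 = site_to_angle(15)    # index 5  -> 45 degrees
denmark_2004 = site_to_angle(4)    # index 16 -> 144 degrees
print(step_deg, round(sweden_2022, 6), round(denmark_2004, 6))
```

Note that a count of 0 would index position 20, which is out of range for a 20-element array, so this lookup only covers counts from 1 to 20.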
Angular width and Starting radius¶
Now we calculate the angular width of each bar (stored as angular_with in the code) by dividing the half circle (180 degrees) by the number of segments (20 in our case). We don't need to add this to the DataFrame, as it is a constant.
We will also define an offset for the bars, starting_radius, so they won't start at the center.
angular_with = np.deg2rad(180 / cnt_segments) # the width of each segment
starting_radius = 2 # offset the circle to create an arc
Now we define bar_length, the radial height of each bar measured outward from starting_radius. We choose 0.5 to visually match the original chart. Since the dots (or bubbles) are positioned at different heights to avoid overlapping, we assign a specific radius to each country using radius_map. This mapping is then applied to the DataFrame as the radius column, so the correct radial position can be used later when plotting.
bar_length = 0.5 # height of the segments
#define the radius for each country
radius_map = {
"Denmark": 2 + bar_length / 3,
"Norway": 2 + bar_length / 3 * 2,
"Sweden": 2 + bar_length / 2,
}
# map the radius to the dataframe
df['radius'] = df['countries'].map(radius_map)
df
| | year | countries | sites | year_lbl | color | new_sites | angles | radius |
|---|---|---|---|---|---|---|---|---|
| 0 | 2004 | Denmark | 4 | '04 | #A54836 | 16 | 2.513274 | 2.166667 |
| 1 | 2004 | Norway | 5 | '04 | #2B314D | 15 | 2.356194 | 2.333333 |
| 2 | 2022 | Norway | 8 | '22 | #2B314D | 12 | 1.884956 | 2.333333 |
| 3 | 2022 | Denmark | 10 | '22 | #A54836 | 10 | 1.570796 | 2.166667 |
| 4 | 2004 | Sweden | 13 | '04 | #5375D4 | 7 | 1.099557 | 2.250000 |
| 5 | 2022 | Sweden | 15 | '22 | #5375D4 | 5 | 0.785398 | 2.250000 |
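As a small check on these radii, the sketch below rebuilds the same mapping relative to starting_radius instead of the literal 2, and confirms that every dot center falls inside the band covered by the bars. The radius_fraction dictionary is a hypothetical restatement of the fractions used above, not part of the original code.

```python
starting_radius = 2
bar_length = 0.5

# Same fractions as radius_map, expressed as a share of the bar height
# (hypothetical restatement for illustration).
radius_fraction = {"Denmark": 1 / 3, "Norway": 2 / 3, "Sweden": 1 / 2}
radius_map = {
    country: starting_radius + bar_length * fraction
    for country, fraction in radius_fraction.items()
}

# Every dot center should sit strictly inside the band drawn by the bars,
# which spans from starting_radius to starting_radius + bar_length.
inside_band = all(
    starting_radius < r < starting_radius + bar_length
    for r in radius_map.values()
)
print(radius_map, inside_band)
```

Writing the radii in terms of starting_radius means the dots follow the band automatically if the offset is changed later.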
Finally, before we start plotting, we define a few variables that we will reuse in matplotlib:
colors = df.color
site_positions = df.angles
radius = df.radius
Plot the arch chart¶
We will now create the polar figure and call ax.set_thetamax() so that only 180 degrees of the polar plot are shown.
fig, ax = plt.subplots(figsize=(10, 7), subplot_kw=dict(polar=True))
ax.set_thetamax(180) # plot half circle
ax.bar(
angular_positions,
width = angular_with,
height = bar_length,
bottom = starting_radius,
linewidth = 1,
edgecolor = "white",
color = grid_color,
align = "edge",
)
<BarContainer object of 20 artists>
Plot the dots¶
We then plot the dots (or bubbles) using the ax.scatter() method, which takes the following parameters:
| Parameter | Description | Value |
|---|---|---|
| x | The angular position of each dot | The site_positions variable |
| y | The radial position of each dot | The radius variable |
| s | The size of each dot, in points squared | We derive it from the bar height as np.pi * (bar_length * scale)**2 |
scale = 20 # You can tune this number
ax.scatter(
site_positions,
radius,
color=colors,
s=np.pi * (bar_length * scale)**2,
zorder=2)
fig
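To get a feel for the size value, the sketch below evaluates the same expression on its own. It relies on the matplotlib convention that s is given in points squared and defaults to the square of the line marker size, so the square root of s is treated here as an equivalent marker size; that interpretation is an assumption worth checking against the scatter documentation.

```python
import math

bar_length = 0.5
scale = 20  # tuning factor from the walkthrough

# Value passed to ax.scatter(s=...), in points squared.
marker_s = math.pi * (bar_length * scale) ** 2

# Assumption: s relates to the usual markersize as s = markersize ** 2,
# so the square root gives an equivalent marker size in points.
equivalent_markersize = math.sqrt(marker_s)

# Because scale is squared, doubling it quadruples s.
marker_s_doubled = math.pi * (bar_length * scale * 2) ** 2
print(round(marker_s, 2), round(equivalent_markersize, 2), marker_s_doubled / marker_s)
```

Since the size is defined in points rather than data units, the dots keep their size when the figure is resized, so scale may need retuning for a different figsize.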
Add the data labels¶
To add the data labels, we will use the ax.text() method, which takes the following parameters:
| Parameter | Description | Value |
|---|---|---|
| x | The angular position of the text | df.angles |
| y | The radial position of the text | df.radius |
| text | The label string to display | df.year_lbl |
# bubble data labels
for row in df.itertuples():
    ax.text(
        row.angles,
        row.radius,
        row.year_lbl,
        color="w",
        size=8,
        ha="center",
        va="center",
    )
fig
Add connecting arcs¶
Next, we will add connecting lines (or arcs) between the dots using the arrowprops argument of ax.annotate().
The trick here is that if no annotation text is provided, then:
- xy becomes the end of the arrow,
- xytext becomes the start of the arrow.
Specify the connection style¶
To control how the two points are connected, we use the connectionstyle parameter:
- arc3 specifies that the line should be a smooth arc.
- rad controls the curvature of the arc.
We calculate the value of rad by first converting the polar coordinates to Cartesian coordinates to measure the distance between the points. This length is then used to scale the curvature appropriately.
for country, group in df.groupby("countries"):
    # look the color up by country instead of relying on matching row order
    color = color_dict[country]
    angles_g = group["angles"].values
    radius_g = group["radius"].values
    # calculate the curvature of the arcs
    theta1, r1 = angles_g[0], radius_g[0]
    theta2, r2 = angles_g[1], radius_g[1]
    # first calculate the chord length between the two points
    # (straight-line distance in Cartesian coordinates)
    x1, y1 = r1 * np.cos(theta1), r1 * np.sin(theta1)
    x2, y2 = r2 * np.cos(theta2), r2 * np.sin(theta2)
    chord_length = np.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)
    # scale it by the maximum radius of the axes
    max_r = ax.get_rmax()
    rad = chord_length / max_r * 0.3  # scale factor 0.3 adjusts curvature strength
    connectionstyle = f"arc3,rad={rad:.2f}"
    ax.annotate(
        "",  # leave this blank
        xy=(theta1, r1),  # arrow tip (where the arrow points to)
        xytext=(theta2, r2),  # arrow start (where the arrow comes from)
        zorder=1,
        arrowprops=dict(
            arrowstyle="-",
            connectionstyle=connectionstyle,
            color=color,
            alpha=0.5,
            linewidth=2,
            linestyle="-",
            antialiased=True,
        ),
    )
fig
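The curvature calculation can also be checked in isolation. The sketch below repeats the polar-to-Cartesian conversion in a standalone function and cross-checks the chord length against the law of cosines. The helper name arc_curvature is hypothetical, and max_r=2.5 is an assumed stand-in for ax.get_rmax(), which may return a different value on the real axes.

```python
import math


def arc_curvature(theta1, r1, theta2, r2, max_r, strength=0.3):
    """Return (chord_length, rad) for two polar points (hypothetical helper)."""
    # Polar -> Cartesian for both end points.
    x1, y1 = r1 * math.cos(theta1), r1 * math.sin(theta1)
    x2, y2 = r2 * math.cos(theta2), r2 * math.sin(theta2)
    chord = math.hypot(x2 - x1, y2 - y1)
    # Longer chords get proportionally more curvature.
    return chord, chord / max_r * strength


# Two points on the unit circle, half a turn apart: the chord is the diameter.
chord, rad = arc_curvature(0.0, 1.0, math.pi, 1.0, max_r=2.0)

# Denmark's two dots from the table; max_r=2.5 is an assumption.
t1, t2, r = 2.513274, 1.570796, 2.166667
chord_dk, rad_dk = arc_curvature(t1, r, t2, r, max_r=2.5)

# For two points at the same radius, the law of cosines gives the same chord.
law_of_cosines = math.sqrt(2 * r**2 - 2 * r**2 * math.cos(t1 - t2))
print(round(chord, 6), round(rad, 6), round(chord_dk, 6), f"arc3,rad={rad_dk:.2f}")
```

Scaling rad by the chord length means dots that are far apart get a more pronounced bow than dots that sit close together.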
Axis labels¶
The axis labels are straightforward now that we understand how to position elements at an angle:
- axis_labels contains the text labels.
- ang_positions holds the corresponding angular positions.
With both available, we simply loop through them and use ax.text() to place each label at the correct angle.
# axis labels
# the site counts were reversed, so angle 0 corresponds to 20 sites and 180 degrees to 0 sites
axis_labels = list(range(20, -5, -5))
ang_positions = np.deg2rad(np.linspace(0, 180, len(axis_labels), endpoint=True))
for ang_position, axis_label in zip(ang_positions, axis_labels):
    ax.text(
        ang_position,
        starting_radius - 0.2,
        str(axis_label),
        size=12,
        va="center",
        ha="center",
        color=xy_ticklabel_color,
    )
fig
Curve text¶
To curve the text and position it in line with the original image, we need to:
- Filter the DataFrame to include only the data from the last year.
- Sort the filtered data in a custom order: Norway, Denmark, and Sweden.
country_labels = df[df.year_lbl == "'22"]
# custom sort
sort_order_dict = {"Norway": 1, "Denmark": 2, "Sweden": 3}
country_labels = country_labels.sort_values(by="countries", key=lambda x: x.map(sort_order_dict))
country_labels
| | year | countries | sites | year_lbl | color | new_sites | angles | radius |
|---|---|---|---|---|---|---|---|---|
| 2 | 2022 | Norway | 8 | '22 | #2B314D | 12 | 1.884956 | 2.333333 |
| 3 | 2022 | Denmark | 10 | '22 | #A54836 | 10 | 1.570796 | 2.166667 |
| 5 | 2022 | Sweden | 15 | '22 | #5375D4 | 5 | 0.785398 | 2.250000 |
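A custom order can also be expressed with an ordered categorical instead of a key function. The sketch below is an alternative under that assumption, using a small stand-in DataFrame rather than the real country_labels; it is not what the walkthrough itself does.

```python
import pandas as pd

# Stand-in for the filtered 2022 rows (illustrative data only).
labels = pd.DataFrame(
    {"countries": ["Denmark", "Sweden", "Norway"], "sites": [10, 15, 8]}
)

# Declare the desired order once as an ordered categorical.
order = ["Norway", "Denmark", "Sweden"]
labels["countries"] = pd.Categorical(
    labels["countries"], categories=order, ordered=True
)

# sort_values now follows the category order instead of the alphabet.
sorted_labels = labels.sort_values("countries").reset_index(drop=True)
print(sorted_labels)
```

One trade-off is that the column becomes a categorical dtype, so the key-function approach is the lighter choice when the order is only needed for a single sort.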
Now, to be able to curve the text, we need to define the following parameters:
| Parameter | Description |
|---|---|
| start_angle_deg | The angular positions from the country_labels['angles'] column, converted from radians to degrees. This is where each label will be centered. |
| angle_per_char | The angular width (in degrees) allotted per character. |
| text_radie | The radial distance from the center where the text will be placed. |
The variables computed inside the loop are:
| Parameter | Description |
|---|---|
| arc_span_deg | Calculates the total angular span or length of the word. |
| start_angle | Centers the word on the original angle. |
| text_angles | Generates a sequence of angles (in degrees), one for each character in the word. |
NOTE: I am hardcoding the width of each letter (angle_per_char). For better placement, it would be better to calculate the width of each letter.
I am also adding the final styling.
start_angle_deg = np.rad2deg(country_labels['angles'].values) # starting point for the label
angle_per_char = 1.8 # degrees per character
text_radie = 2.7  # radius
for ang, text, color in zip(start_angle_deg, country_labels.countries, country_labels.color):
    cnt_text = len(text)
    arc_span_deg = cnt_text * angle_per_char  # total length of the word (number of letters * width)
    start_angle = ang - arc_span_deg / 2  # center the position
    text_angles = np.linspace(start_angle, start_angle + arc_span_deg, cnt_text)  # calculate the angles in degrees
    for char, angle_deg in zip(text, reversed(text_angles)):
        angle_rad = np.deg2rad(angle_deg)  # convert to radians
        ax.text(
            angle_rad,
            text_radie,
            char,
            rotation=angle_deg - 90,
            rotation_mode="anchor",
            fontsize=8,
            color=color,
        )
#styling
ax.text(0.5, 0, "World Heritage \nSites", size=12, ha="center", va="center")
ax.set_axis_off()
fig
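The per-character angle calculation from the loop can be tried out without a figure. The sketch below reproduces it in plain Python for a single word. The helper name char_angles is hypothetical, and the evenly spaced list stands in for np.linspace.

```python
def char_angles(center_deg, text, angle_per_char=1.8):
    """Return (char, angle_deg) pairs for text centered on center_deg (hypothetical helper)."""
    n = len(text)
    span = n * angle_per_char      # total angular length of the word
    start = center_deg - span / 2  # center the word on the target angle
    # Evenly spaced angles from start to start + span, one per character,
    # equivalent to np.linspace(start, start + span, n).
    step = span / (n - 1) if n > 1 else 0.0
    angles = [start + i * step for i in range(n)]
    # Angles grow counterclockwise, so the first letter takes the largest angle.
    return list(zip(text, reversed(angles)))


# Norway's 2022 dot sits at 1.884956 rad, which is 108 degrees.
pairs = char_angles(108.0, "Norway")
print(pairs[0], pairs[-1])
```

Because the n characters are spread over a span of n * angle_per_char degrees, neighbouring letters end up span / (n - 1) degrees apart, which is 2.16 degrees for "Norway" rather than 1.8.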