Global Human Rights Disparities Analysis
Ashley Yu
Relevant library imports:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import random
import sympy as sp
from pivottablejs import pivot_ui
import hvplot.pandas
from bokeh.io import output_notebook
import seaborn as sns
Insight Question 1:
Between high-income countries and low/middle-income countries, do people have a different overall right to safety from the state from 2017 to 2023? Specifically, are people's Physical Integrity Rights positively correlated with migrants'/immigrants' risk of torture and ill-treatment?
1. Import the dataset:
hiy is Income-Adjusted Data Sets of High Income Countries
identity is HRMI People at Risk (PaR) Dataset
cpr is HRMI Civil and Political Rights (CPR) Dataset
hiy = pd.read_csv('csv files/esr_hiy_incomeadjusted.csv')
identity = pd.read_csv('csv files/people_at_risk.csv')
cpr = pd.read_csv('csv files/cpr.csv')
2. Data processing
In the dataset hiy, the column High_Income_Country uses numeric labels:
- 1 represents High-Income Countries (HIY).
- 0 represents Low- and Middle-Income Countries (LMY).
3. Label the data
A new dataset hiy_label is created with 'Country' and 'High_Income_Country'.
hiy['High_Income_Country'] = hiy['High_Income_Country'].replace({1: 'HIY', 0: 'LMY'})
hiy = hiy.drop_duplicates(subset=['Country'])
hiy_label = hiy[['Country','High_Income_Country']]
display(hiy_label)
| | Country | High_Income_Country |
|---|---|---|
| 0 | Aruba | HIY |
| 32 | Afghanistan | LMY |
| 64 | Angola | LMY |
| 96 | Albania | LMY |
| 128 | Andorra | HIY |
| ... | ... | ... |
| 6944 | Kosovo | LMY |
| 6976 | Yemen, Rep. | LMY |
| 7008 | South Africa | LMY |
| 7040 | Zambia | LMY |
| 7072 | Zimbabwe | LMY |
222 rows × 2 columns
4. Merge the data:
Merge the income classification labels hiy_label with the Civil and Political Rights dataset cpr, adding the income group (HIY / LMY) for each country.
Extract variables related to Physical Integrity Rights (Physint).
- Physint: Physical Integrity Rights, which measure people's overall right to safety from the state.
cpr = cpr.copy()
hiy_label = hiy_label.copy()
cpr.rename(columns={'country': 'Country'}, inplace=True)
hiy_label.rename(columns={'High_Income_Country': 'Label'}, inplace=True)
label_merge = pd.merge(cpr, hiy_label, on='Country', how='inner')
display(label_merge)
cpr_phy = label_merge[['Country', 'year', 'physint_mean', 'physint_sd', 'physint_lo', 'physint_hi', 'Label']]
display(cpr_phy)
| | Country | year | countryyear | iso3c | iso3n | cowcode | hrmicode | disap_mean | disap_sd | disap_lo | ... | rel_hi | physint_mean | physint_sd | physint_lo | physint_hi | empower_mean | empower_sd | empower_lo | empower_hi | Label |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Angola | 2017 | AGO2017 | AGO | 24.0 | 540.0 | 1 | 6.464161 | 0.898106 | 5.314585 | ... | NaN | 4.401868 | 0.795705 | 3.383366 | 5.420371 | 2.735583 | 0.737381 | 1.791735 | 3.679431 | LMY |
| 1 | Angola | 2018 | AGO2018 | AGO | 24.0 | 540.0 | 1 | 7.174069 | 0.491192 | 6.545342 | ... | NaN | 5.122885 | 0.812793 | 4.082510 | 6.163259 | 4.580472 | 0.701104 | 3.683059 | 5.477886 | LMY |
| 2 | Angola | 2019 | AGO2019 | AGO | 24.0 | 540.0 | 1 | 6.357960 | 0.626105 | 5.556545 | ... | NaN | 4.486554 | 0.790036 | 3.475308 | 5.497800 | 4.226362 | 0.671121 | 3.367327 | 5.085398 | LMY |
| 3 | Angola | 2020 | AGO2020 | AGO | 24.0 | 540.0 | 1 | 5.453531 | 0.792786 | 4.438765 | ... | NaN | 3.438196 | 0.924271 | 2.255129 | 4.621263 | 2.709881 | 0.691141 | 1.825221 | 3.594541 | LMY |
| 4 | Angola | 2021 | AGO2021 | AGO | 24.0 | 540.0 | 1 | 4.963902 | 0.588297 | 4.210882 | ... | NaN | 3.121782 | 0.849364 | 2.034596 | 4.208967 | 2.996323 | 0.736295 | 2.053865 | 3.938781 | LMY |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 177 | Thailand | 2023 | THA2023 | THA | 764.0 | 800.0 | 47 | 8.162083 | 0.621875 | 7.366082 | ... | 9.428472 | 6.726162 | 0.901969 | 5.571642 | 7.880681 | 4.842543 | 0.708837 | 3.935232 | 5.749854 | LMY |
| 178 | Singapore | 2022 | SGP2022 | SGP | 702.0 | 830.0 | 49 | 8.410766 | 0.316615 | 8.005499 | ... | 7.127632 | 7.231867 | 0.765176 | 6.252442 | 8.211292 | 4.089519 | 0.672613 | 3.228574 | 4.950464 | HIY |
| 179 | Singapore | 2023 | SGP2023 | SGP | 702.0 | 830.0 | 49 | 8.403709 | 0.313198 | 8.002816 | ... | 7.073044 | 7.294268 | 0.743383 | 6.342738 | 8.245797 | 4.190406 | 0.712885 | 3.277913 | 5.102899 | HIY |
| 180 | Sri Lanka | 2022 | LKA2022 | LKA | 144.0 | 780.0 | 50 | 7.047200 | 0.292876 | 6.672319 | ... | 6.533591 | 5.090563 | 0.740136 | 4.143188 | 6.037937 | 3.610210 | 0.680381 | 2.739323 | 4.481098 | LMY |
| 181 | Sri Lanka | 2023 | LKA2023 | LKA | 144.0 | 780.0 | 50 | 7.356123 | 0.282373 | 6.994685 | ... | 6.520242 | 5.441030 | 0.729936 | 4.506712 | 6.375348 | 3.370074 | 0.657596 | 2.528351 | 4.211797 | LMY |
182 rows × 52 columns
| | Country | year | physint_mean | physint_sd | physint_lo | physint_hi | Label |
|---|---|---|---|---|---|---|---|
| 0 | Angola | 2017 | 4.401868 | 0.795705 | 3.383366 | 5.420371 | LMY |
| 1 | Angola | 2018 | 5.122885 | 0.812793 | 4.082510 | 6.163259 | LMY |
| 2 | Angola | 2019 | 4.486554 | 0.790036 | 3.475308 | 5.497800 | LMY |
| 3 | Angola | 2020 | 3.438196 | 0.924271 | 2.255129 | 4.621263 | LMY |
| 4 | Angola | 2021 | 3.121782 | 0.849364 | 2.034596 | 4.208967 | LMY |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 177 | Thailand | 2023 | 6.726162 | 0.901969 | 5.571642 | 7.880681 | LMY |
| 178 | Singapore | 2022 | 7.231867 | 0.765176 | 6.252442 | 8.211292 | HIY |
| 179 | Singapore | 2023 | 7.294268 | 0.743383 | 6.342738 | 8.245797 | HIY |
| 180 | Sri Lanka | 2022 | 5.090563 | 0.740136 | 4.143188 | 6.037937 | LMY |
| 181 | Sri Lanka | 2023 | 5.441030 | 0.729936 | 4.506712 | 6.375348 | LMY |
182 rows × 7 columns
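The inner merge keeps 182 rows, while the output for Question 2 shows that cpr.csv holds 230 rows, so some country-years are dropped at this step. A minimal sketch of how the dropped countries could be listed, assuming the cause is a country name that is missing from or spelled differently in the label table. The toy frames and the names in them are illustrative, not taken from the source files:

```python
import pandas as pd

# Toy stand-ins for cpr and hiy_label; the country names are illustrative only.
cpr_toy = pd.DataFrame({
    'Country': ['Angola', 'Angola', 'South Korea'],
    'year': [2017, 2018, 2017],
    'physint_mean': [4.4, 5.1, 8.0],
})
label_toy = pd.DataFrame({
    'Country': ['Angola', 'Korea, Rep.'],
    'Label': ['LMY', 'HIY'],
})

# A left merge with indicator=True keeps every CPR row and records
# in the '_merge' column whether a matching label was found.
checked = pd.merge(cpr_toy, label_toy, on='Country', how='left', indicator=True)

# Countries present in the CPR data but absent from the label table.
unmatched = sorted(checked.loc[checked['_merge'] == 'left_only', 'Country'].unique())
print(unmatched)
```

Any country listed this way could then be renamed in one of the two tables before merging, if the mismatch turns out to be a spelling difference.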
5. Visualize the data
Create a line graph that shows the average Physical Integrity Rights (physint_mean) for high-income (HIY) and low/middle-income (LMY) countries over the years 2017–2023.
- X-axis: year (2017 to 2023)
- Y-axis: Average Physical Integrity Rights (physint_mean)
- Red Line: Represents HIY (High-Income Countries)
- Blue Line: Represents LMY (Low/Middle-Income Countries)
averaged_data = cpr_phy.groupby(['Label', 'year'])['physint_mean'].mean().reset_index()
hiy_data = averaged_data[averaged_data['Label'] == 'HIY']
lmy_data = averaged_data[averaged_data['Label'] == 'LMY']
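# Set the colors explicitly so the lines match the description (red = HIY, blue = LMY)
plt.plot(hiy_data['year'], hiy_data['physint_mean'], label='HIY', marker='o', color='red')
plt.plot(lmy_data['year'], lmy_data['physint_mean'], label='LMY', marker='o', color='blue')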
plt.xlabel('Year')
plt.ylabel('Average Physical Integrity Rights')
plt.title('Average Physical Integrity Rights Over Years for HIY and LMY Countries')
plt.legend(title='Label')
plt.grid(True)
plt.show()
6. Observation
From the line graph, high-income countries (HIY) show a higher average Physical Integrity Rights score than low/middle-income (LMY) countries in every year from 2017 to 2023. Thus, people in high-income countries overall have a stronger right to safety from the state.
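The visual comparison can also be checked numerically. The sketch below pivots a table shaped like averaged_data so that each year has one HIY and one LMY value, then takes the difference. The numbers are toy values, not the notebook's results:

```python
import pandas as pd

# Toy values standing in for averaged_data (columns: Label, year, physint_mean).
averaged_toy = pd.DataFrame({
    'Label': ['HIY', 'HIY', 'LMY', 'LMY'],
    'year': [2017, 2018, 2017, 2018],
    'physint_mean': [7.0, 7.5, 5.0, 5.25],
})

# One row per year, one column per income group.
wide = averaged_toy.pivot(index='year', columns='Label', values='physint_mean')

# Positive gap means HIY scores higher than LMY in that year.
wide['gap'] = wide['HIY'] - wide['LMY']
hiy_always_higher = bool((wide['gap'] > 0).all())
print(wide)
print(hiy_always_higher)
```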
Part 2: In both HIY and LMY, migrant/immigrant risk of torture and ill-treatment shows great variation.
1. Merge Data
Merge the HRMI People at Risk dataset identity with the income classification labels hiy_label, adding a classification of High-Income (HIY) or Low/Middle-Income (LMY) for each country.
2. Extract Migrant/Immigrant related Data
Extract the variables related to migrants and immigrants into identity_imm:
- tort_atrisk_prop24: Proportion of respondents who identified migrants/immigrants at risk of torture and ill-treatment.
- tort_atrisk_count24: Count of respondents identifying migrants/immigrants at risk of torture and ill-treatment.
hiy_label = hiy_label.copy()
hiy_label.rename(columns={'High_Income_Country': 'Label'}, inplace=True)
identity.rename(columns={'country': 'Country'}, inplace=True)
label_identity = pd.merge(identity, hiy_label, on='Country', how='inner')
display(label_identity)
identity_imm = label_identity[['Country', 'year', 'tort_atrisk_prop24', 'tort_atrisk_count24', 'Label']]
display(identity_imm)
| | Country | year | countryyear | iso3c | iso3n | cowcode | hrmicode | food_total_atrisk_resp | food_atrisk_count1 | food_atrisk_prop1 | ... | union_atrisk_prop27 | union_atrisk_count28 | union_atrisk_prop28 | union_atrisk_count29 | union_atrisk_prop29 | union_atrisk_count30 | union_atrisk_prop30 | union_atrisk_count31 | union_atrisk_prop31 | Label |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Angola | 2018 | AGO2018 | AGO | 24.0 | 540.0 | 1 | 11 | 3 | 0.272727 | ... | 0.090909 | 2.0 | 0.181818 | 3.0 | 0.272727 | 4.0 | 0.363636 | 1.0 | 0.090909 | LMY |
| 1 | Angola | 2019 | AGO2019 | AGO | 24.0 | 540.0 | 1 | 14 | 3 | 0.214286 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | LMY |
| 2 | Angola | 2020 | AGO2020 | AGO | 24.0 | 540.0 | 1 | 13 | 3 | 0.230769 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | LMY |
| 3 | Angola | 2021 | AGO2021 | AGO | 24.0 | 540.0 | 1 | 12 | 6 | 0.500000 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | LMY |
| 4 | Angola | 2022 | AGO2022 | AGO | 24.0 | 540.0 | 1 | 10 | 5 | 0.500000 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | LMY |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 139 | Maldives | 2023 | MDV2023 | MDV | 462.0 | 781.0 | 46 | 7 | 0 | 0.000000 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | LMY |
| 140 | Thailand | 2022 | THA2022 | THA | 764.0 | 800.0 | 47 | 13 | 4 | 0.307692 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | LMY |
| 141 | Thailand | 2023 | THA2023 | THA | 764.0 | 800.0 | 47 | 6 | 0 | 0.000000 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | LMY |
| 142 | Singapore | 2023 | SGP2023 | SGP | 702.0 | 830.0 | 49 | 7 | 1 | 0.142857 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | HIY |
| 143 | Sri Lanka | 2023 | LKA2023 | LKA | 144.0 | 780.0 | 50 | 29 | 12 | 0.413793 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | LMY |
144 rows × 1359 columns
| | Country | year | tort_atrisk_prop24 | tort_atrisk_count24 | Label |
|---|---|---|---|---|---|
| 0 | Angola | 2018 | 0.615385 | 8.0 | LMY |
| 1 | Angola | 2019 | 0.562500 | 9.0 | LMY |
| 2 | Angola | 2020 | 0.071429 | 1.0 | LMY |
| 3 | Angola | 2021 | 0.333333 | 4.0 | LMY |
| 4 | Angola | 2022 | 0.090909 | 1.0 | LMY |
| ... | ... | ... | ... | ... | ... |
| 139 | Maldives | 2023 | 0.857143 | 6.0 | LMY |
| 140 | Thailand | 2022 | 0.071429 | 1.0 | LMY |
| 141 | Thailand | 2023 | 0.333333 | 2.0 | LMY |
| 142 | Singapore | 2023 | 0.142857 | 1.0 | HIY |
| 143 | Sri Lanka | 2023 | 0.068966 | 2.0 | LMY |
144 rows × 5 columns
3. Visualize the data
Create a line graph to show the proportion of respondents who identified migrants/immigrants as being at risk of torture and ill-treatment in High-Income (HIY) and Low/Middle-Income (LMY) countries from 2017 to 2023.
- X-axis: year (2017 to 2023)
- Y-axis: The proportion of respondents who identified migrants/immigrants at risk for torture and ill-treatment (tort_atrisk_prop24)
- Red Line: Represents HIY (High-Income Countries)
- Blue Line: Represents LMY (Low/Middle-Income Countries)
averaged_imm = identity_imm.groupby(['Label', 'year'])['tort_atrisk_prop24'].mean().reset_index()
hiy_imm = averaged_imm[averaged_imm['Label'] == 'HIY']
lmy_imm = averaged_imm[averaged_imm['Label'] == 'LMY']
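# Set the colors explicitly so the lines match the description (red = HIY, blue = LMY)
plt.plot(hiy_imm['year'], hiy_imm['tort_atrisk_prop24'], label='HIY', marker='o', color='red')
plt.plot(lmy_imm['year'], lmy_imm['tort_atrisk_prop24'], label='LMY', marker='o', color='blue')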
plt.xlabel('Year')
plt.ylabel('Proportion of respondents \n who identified migrants/immigrants \n at risk for torture and ill-treatment')
plt.title('Averaged Migrants/Immigrants Proportion at Risk of Torture Over Years for HIY and LMY Countries')
plt.legend(title='Label')
plt.grid(True)
plt.show()
4. Calculate standard deviation
Calculate the standard deviation (std) of tort_atrisk_prop24, the proportion of respondents identifying migrants/immigrants as at risk of torture and ill-treatment, for each income group (Label: HIY and LMY). This measures how much the values spread out from the mean within each income group.
print(identity_imm.groupby('Label')['tort_atrisk_prop24'].std())
Label
HIY    0.215330
LMY    0.219511
Name: tort_atrisk_prop24, dtype: float64
5. Observation
The line plot shows that, between high-income (HIY) and low/middle-income (LMY) countries, there is no clear difference in the average proportion of respondents identifying migrants/immigrants as at risk of torture and ill-treatment. This may largely be due to high variability and the influence of outliers within both groups:
- Both HIY and LMY groups show a standard deviation of approximately 0.2, which is substantial given the range of values (0.05 to 0.3). This large variability in responses within each group makes it difficult to detect meaningful differences between the two income groups.
- A high standard deviation may indicate outliers in both groups, and outliers could disproportionately influence the overall averages.
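A large standard deviation alone does not show that outliers exist, so a direct check can help. The sketch below applies Tukey's 1.5 × IQR rule within each income group. The rule is an assumption of this example rather than a method used in the analysis, and the countries A to H and their values are made up:

```python
import pandas as pd

# Toy proportions; the countries A-H and their values are made up for illustration.
toy = pd.DataFrame({
    'Country': list('ABCDEFGH'),
    'Label': ['HIY'] * 4 + ['LMY'] * 4,
    'tort_atrisk_prop24': [0.10, 0.12, 0.14, 0.90, 0.20, 0.25, 0.30, 0.35],
})

col = 'tort_atrisk_prop24'
grouped = toy.groupby('Label')[col]

# Quartiles of each income group, broadcast back to every row of that group.
q1 = grouped.transform(lambda s: s.quantile(0.25))
q3 = grouped.transform(lambda s: s.quantile(0.75))
iqr = q3 - q1

# Tukey's rule: flag values more than 1.5 * IQR outside the quartiles.
toy['is_outlier'] = (toy[col] < q1 - 1.5 * iqr) | (toy[col] > q3 + 1.5 * iqr)
outliers = toy.loc[toy['is_outlier'], 'Country'].tolist()
print(outliers)
```

Applied to identity_imm, the same steps would list the country-years that pull each group's average away from the rest.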
Part 3: People in HIY have more physical integrity rights, and migrants/immigrants experience lower risks of torture and ill-treatment
1. Merge the data
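Merge cpr_phy (Physical Integrity Rights data) with identity_imm (migrants/immigrants' risks). The new dataset after combination is called final_merge. Because the merge key is Country alone, each rights year of a country is paired with every risk year of that country (hence the year_x and year_y columns and the 820 rows). The per-country averages used in the scatter plot are not changed by this pairing.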
final_merge = pd.merge(cpr_phy, identity_imm, on='Country', how='inner')
display(final_merge)
| | Country | year_x | physint_mean | physint_sd | physint_lo | physint_hi | Label_x | year_y | tort_atrisk_prop24 | tort_atrisk_count24 | Label_y |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Angola | 2017 | 4.401868 | 0.795705 | 3.383366 | 5.420371 | LMY | 2018 | 0.615385 | 8.0 | LMY |
| 1 | Angola | 2017 | 4.401868 | 0.795705 | 3.383366 | 5.420371 | LMY | 2019 | 0.562500 | 9.0 | LMY |
| 2 | Angola | 2017 | 4.401868 | 0.795705 | 3.383366 | 5.420371 | LMY | 2020 | 0.071429 | 1.0 | LMY |
| 3 | Angola | 2017 | 4.401868 | 0.795705 | 3.383366 | 5.420371 | LMY | 2021 | 0.333333 | 4.0 | LMY |
| 4 | Angola | 2017 | 4.401868 | 0.795705 | 3.383366 | 5.420371 | LMY | 2022 | 0.090909 | 1.0 | LMY |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 815 | Thailand | 2023 | 6.726162 | 0.901969 | 5.571642 | 7.880681 | LMY | 2023 | 0.333333 | 2.0 | LMY |
| 816 | Singapore | 2022 | 7.231867 | 0.765176 | 6.252442 | 8.211292 | HIY | 2023 | 0.142857 | 1.0 | HIY |
| 817 | Singapore | 2023 | 7.294268 | 0.743383 | 6.342738 | 8.245797 | HIY | 2023 | 0.142857 | 1.0 | HIY |
| 818 | Sri Lanka | 2022 | 5.090563 | 0.740136 | 4.143188 | 6.037937 | LMY | 2023 | 0.068966 | 2.0 | LMY |
| 819 | Sri Lanka | 2023 | 5.441030 | 0.729936 | 4.506712 | 6.375348 | LMY | 2023 | 0.068966 | 2.0 | LMY |
820 rows × 11 columns
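The output above has separate year_x and year_y columns because the two tables are joined on Country only. If a year-by-year comparison were needed, the merge could be aligned on year as well. A minimal sketch with toy frames (the values are illustrative) showing the difference between the two merges:

```python
import pandas as pd

# Toy stand-ins for cpr_phy and identity_imm; the values are illustrative.
phy_toy = pd.DataFrame({
    'Country': ['Angola', 'Angola', 'Angola'],
    'year': [2017, 2018, 2019],
    'physint_mean': [4.4, 5.1, 4.5],
    'Label': ['LMY'] * 3,
})
imm_toy = pd.DataFrame({
    'Country': ['Angola', 'Angola'],
    'year': [2018, 2019],
    'tort_atrisk_prop24': [0.6, 0.5],
    'Label': ['LMY'] * 2,
})

# Joining on Country alone pairs every rights year with every risk year (3 x 2 rows).
country_only = pd.merge(phy_toy, imm_toy, on='Country', how='inner')

# Joining on Country, year and Label keeps only matching years and avoids _x/_y suffixes.
aligned = pd.merge(phy_toy, imm_toy, on=['Country', 'year', 'Label'], how='inner')

print(len(country_only), len(aligned))
```

Note that code written against the aligned table would refer to Label rather than Label_x.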
2. Visualize the data
Create an interactive scatter plot showing the relationship between physint_mean (Physical Integrity Rights) and tort_atrisk_prop24 (Proportion of migrants/immigrants at risk of torture and ill-treatment), comparing high-income (HIY) and low/middle-income (LMY) countries.
- X-axis: Average Physical Integrity Rights (physint_mean).
- Y-axis: Proportion of migrants/immigrants at risk of torture (tort_atrisk_prop24).
- Blue dots represent HIY countries.
- Red dots represent LMY countries.
scatter_data = final_merge.groupby(['Country', 'Label_x']).agg({
'physint_mean': 'mean',
'tort_atrisk_prop24': 'mean'
}).reset_index()
hiy_data = scatter_data[scatter_data['Label_x'] == 'HIY']
lmy_data = scatter_data[scatter_data['Label_x'] == 'LMY']
hiy_scatter = hiy_data.hvplot.scatter(
x='physint_mean',
y='tort_atrisk_prop24',
by='Country',
marker='o',
color='blue',
label='HIY',
xlabel='Average Physical Integrity Rights',
ylabel='Average Proportion of respondents \n who identified migrants/immigrants \n at risk for torture and ill-treatment'
)
lmy_scatter = lmy_data.hvplot.scatter(
x='physint_mean',
y='tort_atrisk_prop24',
by='Country',
marker='o',
color='red',
label='LMY',
xlabel='Average Physical Integrity Rights',
ylabel='Average Proportion of respondents \n who identified migrants/immigrants \n at risk for torture and ill-treatment'
)
final_plot = (hiy_scatter * lmy_scatter).opts(
title='Interactive Scatter Plot: \n Physical Integrity Rights vs. \n Migrant/Immigrant Risks Between HIY and LMY',
legend_position='right',
width=800,
height=600
)
final_plot
3. Observation
In the line graph comparing high-income (HIY) and low/middle-income (LMY) countries, there appears to be no clear distinction in the average proportion of respondents identifying migrants/immigrants as at risk of torture and ill-treatment. This could be attributed to the large variance within each income group. By averaging the countries into just two broad groups (HIY and LMY), we may overlook the heterogeneity and outliers within these groups.
In contrast, the scatter plot above zooms in on individual countries within each group, which reduces the influence of the large within-group variation on the averages and allows a more precise evaluation. With each country plotted individually, there is a clear difference between high-income and low/middle-income countries in the pattern of average physical integrity rights and the proportion of immigrants/migrants at risk of torture and ill-treatment.
For high-income countries: HIY countries cluster mostly in the lower-right corner of the scatter plot. In HIY countries, people have more physical integrity rights (physint_mean), which corresponds to a lower proportion of migrants/immigrants at risk of torture and ill-treatment (tort_atrisk_prop24). Thus, in HIY countries, the plot suggests a negative association between people's physical integrity rights and immigrants/migrants' risks of torture and ill-treatment.
For low/middle-income countries: LMY countries show great variability, with points scattered across the graph. This reflects wide disparities in physical integrity rights and immigrants/migrants' risks of torture and ill-treatment. Thus, in low/middle-income countries, there is no apparent correlation between people's physical integrity rights and immigrants/migrants' risks of torture and ill-treatment.
However, two outliers are identified in HIY:
- The United States: Located at approximately (4.8, 0.6), it exhibits a lower-than-expected physical integrity score and higher risks for migrants/immigrants compared to other HIY countries.
- Saudi Arabia: Found at (3.1, 0.3), it has a low physical integrity score and a moderate risk proportion, deviating from the general HIY trend.
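The correlations described in this observation are read off the scatter plot. They could be quantified with a Pearson coefficient per income group, computed on a table shaped like scatter_data. The sketch below uses toy values chosen so that the HIY points fall on a falling line and the LMY points show no linear trend. It does not reproduce the notebook's data:

```python
import pandas as pd

# Toy per-country averages shaped like scatter_data; the values are illustrative.
scatter_toy = pd.DataFrame({
    'Label_x': ['HIY'] * 4 + ['LMY'] * 4,
    'physint_mean': [6.0, 7.0, 8.0, 9.0, 3.0, 4.0, 5.0, 6.0],
    'tort_atrisk_prop24': [0.40, 0.30, 0.20, 0.10, 0.2, 0.5, 0.5, 0.2],
})

# Pearson correlation between the two measures, computed separately per group.
corr_by_group = {
    label: group['physint_mean'].corr(group['tort_atrisk_prop24'])
    for label, group in scatter_toy.groupby('Label_x')
}
print(corr_by_group)
```

A coefficient near -1 would support a negative relationship within a group, and a value near 0 would indicate no linear relationship. With few countries per group, the coefficient is sensitive to outliers such as the two noted above.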
Insight 1:
From 2017 to 2023, people in high-income countries (HIY) overall have a stronger right to safety from the state than people in low/middle-income countries. Stronger protection of physical integrity rights in high-income countries also goes together with lower proportions of migrants/immigrants at risk of torture and ill-treatment, indicating a negative correlation. Thus, in high-income countries, stronger overall protection of physical integrity is associated with better safety and rights for migrants/immigrants. In contrast, in low/middle-income countries, there is considerable variability in both physical integrity rights and the risks faced by migrants/immigrants. Thus, individuals in low/middle-income countries lack consistent protection, and migrants are exposed to varying degrees of risk of torture and ill-treatment. This variability may reflect governance challenges and the unequal provision of physical integrity rights, leading to inconsistent safety for migrants/immigrants in low/middle-income countries.
Insight Question 2:
From 2017 to 2023, which human right is most often denied or poorly protected in Pacific countries vs. continental countries?
1. Import the dataset
pacific_taiwan_data_df is Data Sources used for Taiwan and Pacific Countries
cpr_df is HRMI Civil and Political Rights (CPR) Dataset
cpr_df = pd.read_csv('csv files/cpr.csv')
pacific_taiwan_data_df = pd.read_csv('csv files/Pacific_Taiwan_data.csv', encoding='latin1')
2. Label the data
Add a binary column 'pacific_binary' to cpr_df:
- 1 indicates the country is in the Pacific region.
- 0 indicates the country is not in the Pacific region.
pacific_countries = pacific_taiwan_data_df['Country'].str.strip().unique()
cpr_df['pacific_binary'] = cpr_df['country'].apply(lambda x: 1 if x in pacific_countries else 0)
display(cpr_df)
| | country | year | countryyear | iso3c | iso3n | cowcode | hrmicode | disap_mean | disap_sd | disap_lo | ... | rel_hi | physint_mean | physint_sd | physint_lo | physint_hi | empower_mean | empower_sd | empower_lo | empower_hi | pacific_binary |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Angola | 2017 | AGO2017 | AGO | 24.0 | 540.0 | 1 | 6.464161 | 0.898106 | 5.314585 | ... | NaN | 4.401868 | 0.795705 | 3.383366 | 5.420371 | 2.735583 | 0.737381 | 1.791735 | 3.679431 | 0 |
| 1 | Angola | 2018 | AGO2018 | AGO | 24.0 | 540.0 | 1 | 7.174069 | 0.491192 | 6.545342 | ... | NaN | 5.122885 | 0.812793 | 4.082510 | 6.163259 | 4.580472 | 0.701104 | 3.683059 | 5.477886 | 0 |
| 2 | Angola | 2019 | AGO2019 | AGO | 24.0 | 540.0 | 1 | 6.357960 | 0.626105 | 5.556545 | ... | NaN | 4.486554 | 0.790036 | 3.475308 | 5.497800 | 4.226362 | 0.671121 | 3.367327 | 5.085398 | 0 |
| 3 | Angola | 2020 | AGO2020 | AGO | 24.0 | 540.0 | 1 | 5.453531 | 0.792786 | 4.438765 | ... | NaN | 3.438196 | 0.924271 | 2.255129 | 4.621263 | 2.709881 | 0.691141 | 1.825221 | 3.594541 | 0 |
| 4 | Angola | 2021 | AGO2021 | AGO | 24.0 | 540.0 | 1 | 4.963902 | 0.588297 | 4.210882 | ... | NaN | 3.121782 | 0.849364 | 2.034596 | 4.208967 | 2.996323 | 0.736295 | 2.053865 | 3.938781 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 225 | Thailand | 2023 | THA2023 | THA | 764.0 | 800.0 | 47 | 8.162083 | 0.621875 | 7.366082 | ... | 9.428472 | 6.726162 | 0.901969 | 5.571642 | 7.880681 | 4.842543 | 0.708837 | 3.935232 | 5.749854 | 0 |
| 226 | Singapore | 2022 | SGP2022 | SGP | 702.0 | 830.0 | 49 | 8.410766 | 0.316615 | 8.005499 | ... | 7.127632 | 7.231867 | 0.765176 | 6.252442 | 8.211292 | 4.089519 | 0.672613 | 3.228574 | 4.950464 | 0 |
| 227 | Singapore | 2023 | SGP2023 | SGP | 702.0 | 830.0 | 49 | 8.403709 | 0.313198 | 8.002816 | ... | 7.073044 | 7.294268 | 0.743383 | 6.342738 | 8.245797 | 4.190406 | 0.712885 | 3.277913 | 5.102899 | 0 |
| 228 | Sri Lanka | 2022 | LKA2022 | LKA | 144.0 | 780.0 | 50 | 7.047200 | 0.292876 | 6.672319 | ... | 6.533591 | 5.090563 | 0.740136 | 4.143188 | 6.037937 | 3.610210 | 0.680381 | 2.739323 | 4.481098 | 0 |
| 229 | Sri Lanka | 2023 | LKA2023 | LKA | 144.0 | 780.0 | 50 | 7.356123 | 0.282373 | 6.994685 | ... | 6.520242 | 5.441030 | 0.729936 | 4.506712 | 6.375348 | 3.370074 | 0.657596 | 2.528351 | 4.211797 | 0 |
230 rows × 52 columns
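The membership flag can also be built without a row-by-row lambda. A minimal sketch using Series.isin, with toy frames in which the country names and the stray whitespace are illustrative and not taken from Pacific_Taiwan_data.csv:

```python
import pandas as pd

# Toy stand-ins; the names and the extra spaces are illustrative only.
cpr_toy = pd.DataFrame({'country': ['Fiji', 'Angola', 'Samoa']})
pacific_toy = pd.Series(['Fiji ', ' Samoa', 'Fiji'])

# Strip whitespace and de-duplicate, as in the cell above.
pacific_names = pacific_toy.str.strip().unique()

# isin returns a boolean Series; astype(int) turns it into the 1/0 flag.
cpr_toy['pacific_binary'] = cpr_toy['country'].isin(pacific_names).astype(int)
print(cpr_toy)
```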
3. Select the data
Dynamically select all columns that include 'mean' in their name, which represent the average scores for different human rights.
- Disap: Right to freedom from disappearance
- Exkill: Right to freedom from extrajudicial execution
- Arrest: Right to freedom from arbitrary or political arrest and imprisonment
- Tort: Right to freedom from torture and ill-treatment
- Dpex: Right to freedom from death penalty execution
- Express: Right to opinion and expression
- Polpart: Right to participate in government
- Assem: Right to assembly and association
- Rel: Right to freedom of religion and belief
- Physint: Overall right to safety from the state, i.e. Physical Integrity Rights
- Empower: Overall right to empowerment
rights_columns = [col for col in cpr_df.columns if 'mean' in col]
display(rights_columns)
['disap_mean', 'exkill_mean', 'arrest_mean', 'tort_mean', 'dpex_mean', 'express_mean', 'polpart_mean', 'assem_mean', 'rel_mean', 'physint_mean', 'empower_mean']
4. Calculation
Calculate the average values of the selected human rights columns, grouped by pacific_binary and year.
grouped_means = cpr_df.groupby(['pacific_binary', 'year'])[rights_columns].mean().reset_index()
display(grouped_means)
| | pacific_binary | year | disap_mean | exkill_mean | arrest_mean | tort_mean | dpex_mean | express_mean | polpart_mean | assem_mean | rel_mean | physint_mean | empower_mean |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 2017 | 6.733320 | 6.247352 | 5.245753 | 4.890763 | 9.230942 | 4.675750 | 4.603258 | 5.023186 | NaN | 5.368153 | 4.579556 |
| 1 | 0 | 2018 | 6.735316 | 6.199742 | 5.115282 | 4.893804 | 9.173853 | 4.625586 | 5.073367 | 5.323752 | NaN | 5.280598 | 4.812908 |
| 2 | 0 | 2019 | 6.911426 | 6.292821 | 4.920348 | 5.026125 | 9.195390 | 4.578641 | 5.143065 | 5.457412 | NaN | 5.460887 | 4.880391 |
| 3 | 0 | 2020 | 6.805790 | 6.411092 | 5.212349 | 4.964187 | 8.866773 | 4.335139 | 5.019546 | 4.740300 | NaN | 5.426631 | 4.448103 |
| 4 | 0 | 2021 | 6.749326 | 6.534168 | 5.391714 | 5.196561 | 8.971095 | 4.411066 | 5.088646 | 4.501338 | 5.103049 | 5.464674 | 4.375018 |
| 5 | 0 | 2022 | 6.827311 | 6.611590 | 5.205224 | 5.132033 | 8.844059 | 4.422194 | 5.056482 | 4.396123 | 6.558734 | 5.480520 | 4.397002 |
| 6 | 0 | 2023 | 6.892017 | 6.650289 | 5.085363 | 5.054111 | 8.794397 | 4.508944 | 4.993601 | 4.393300 | 6.528699 | 5.559597 | 4.387060 |
| 7 | 1 | 2017 | 8.292837 | 7.052309 | 5.469275 | 4.838824 | 10.000000 | 2.841465 | 4.365599 | 3.590295 | NaN | 6.566249 | 3.152234 |
| 8 | 1 | 2018 | 8.554965 | 8.021153 | 6.894804 | 6.795908 | 10.000000 | 5.856057 | 6.803963 | 7.379413 | NaN | 7.612841 | 6.698705 |
| 9 | 1 | 2019 | 8.621577 | 8.031246 | 7.357238 | 7.071760 | 10.000000 | 6.240824 | 7.021942 | 7.347842 | NaN | 7.649969 | 6.901580 |
| 10 | 1 | 2020 | 8.420540 | 8.169713 | 7.810239 | 7.158492 | 9.885901 | 6.118137 | 7.015026 | 6.891534 | NaN | 7.714486 | 6.544884 |
| 11 | 1 | 2021 | 8.492205 | 8.502044 | 8.006396 | 7.448673 | 10.000000 | 6.510010 | 7.129258 | 6.632172 | 8.946402 | 8.001933 | 6.669821 |
| 12 | 1 | 2022 | 8.402668 | 8.419019 | 7.961962 | 6.805907 | 10.000000 | 6.253374 | 6.465792 | 6.645225 | 9.271708 | 7.800919 | 6.443977 |
| 13 | 1 | 2023 | 8.377878 | 8.366530 | 7.955499 | 5.973340 | 10.000000 | 7.621193 | 6.639057 | 7.016024 | 9.297732 | 7.622438 | 7.247096 |
5. Combine the data
Reshape the data into a long format that combines all human rights into one column, right.
Identify the human right with the lowest mean value for each combination of pacific_binary and year.
lowest_means = grouped_means.melt(id_vars=['pacific_binary', 'year'], var_name='right', value_name='right_mean')
lowest_means_per_year = lowest_means.loc[lowest_means.groupby(['pacific_binary', 'year'])['right_mean'].idxmin()]
lowest_means_per_year['right'] = lowest_means_per_year['right'].replace({
'express_mean': 'express',
'empower_mean': 'empower',
'assem_mean': 'assem',
'exkill_mean': 'exkill',
'disap_mean': 'disap',
'arrest_mean': 'arrest',
'tort_mean': 'tort',
'dpex_mean': 'dpex',
'rel_mean': 'rel',
'physint_mean': 'physint',
'polpart_mean': 'polpart'
})
display(lowest_means)
display(lowest_means_per_year)
| | pacific_binary | year | right | right_mean |
|---|---|---|---|---|
| 0 | 0 | 2017 | disap_mean | 6.733320 |
| 1 | 0 | 2018 | disap_mean | 6.735316 |
| 2 | 0 | 2019 | disap_mean | 6.911426 |
| 3 | 0 | 2020 | disap_mean | 6.805790 |
| 4 | 0 | 2021 | disap_mean | 6.749326 |
| ... | ... | ... | ... | ... |
| 149 | 1 | 2019 | empower_mean | 6.901580 |
| 150 | 1 | 2020 | empower_mean | 6.544884 |
| 151 | 1 | 2021 | empower_mean | 6.669821 |
| 152 | 1 | 2022 | empower_mean | 6.443977 |
| 153 | 1 | 2023 | empower_mean | 7.247096 |
154 rows × 4 columns
| | pacific_binary | year | right | right_mean |
|---|---|---|---|---|
| 140 | 0 | 2017 | empower | 4.579556 |
| 71 | 0 | 2018 | express | 4.625586 |
| 72 | 0 | 2019 | express | 4.578641 |
| 73 | 0 | 2020 | express | 4.335139 |
| 144 | 0 | 2021 | empower | 4.375018 |
| 103 | 0 | 2022 | assem | 4.396123 |
| 146 | 0 | 2023 | empower | 4.387060 |
| 77 | 1 | 2017 | express | 2.841465 |
| 78 | 1 | 2018 | express | 5.856057 |
| 79 | 1 | 2019 | express | 6.240824 |
| 80 | 1 | 2020 | express | 6.118137 |
| 81 | 1 | 2021 | express | 6.510010 |
| 82 | 1 | 2022 | express | 6.253374 |
| 55 | 1 | 2023 | tort | 5.973340 |
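The same result can be reached without reshaping to long format, by asking for the column with the smallest value in each row. The sketch below is an alternative to the melt and idxmin steps above, not the method used in this notebook. It uses three rows and four rights, with values rounded from the grouped_means output:

```python
import pandas as pd

# Three rows and four rights, rounded from the grouped_means table above.
grouped_toy = pd.DataFrame({
    'pacific_binary': [0, 0, 1],
    'year': [2022, 2023, 2023],
    'tort_mean': [5.132, 5.054, 5.973],
    'express_mean': [4.422, 4.509, 7.621],
    'assem_mean': [4.396, 4.393, 7.016],
    'empower_mean': [4.397, 4.387, 7.247],
})

indexed = grouped_toy.set_index(['pacific_binary', 'year'])

# idxmin(axis=1) gives the column name of each row's minimum; min(axis=1) gives the value.
lowest = pd.DataFrame({
    'right': indexed.idxmin(axis=1).str.replace('_mean', '', regex=False),
    'right_mean': indexed.min(axis=1),
}).reset_index()
print(lowest)
```

Stripping the _mean suffix with str.replace also covers every right at once, so no mapping dictionary is needed.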
6. Split the data
Split the data for plotting into two subsets:
- pacific_data: Contains Pacific countries.
- continental_data: Contains non-Pacific (continental) countries.
pacific_data = lowest_means_per_year[lowest_means_per_year['pacific_binary'] == 1]
continental_data = lowest_means_per_year[lowest_means_per_year['pacific_binary'] == 0]
display(pacific_data)
display(continental_data)
| | pacific_binary | year | right | right_mean |
|---|---|---|---|---|
| 77 | 1 | 2017 | express | 2.841465 |
| 78 | 1 | 2018 | express | 5.856057 |
| 79 | 1 | 2019 | express | 6.240824 |
| 80 | 1 | 2020 | express | 6.118137 |
| 81 | 1 | 2021 | express | 6.510010 |
| 82 | 1 | 2022 | express | 6.253374 |
| 55 | 1 | 2023 | tort | 5.973340 |
| | pacific_binary | year | right | right_mean |
|---|---|---|---|---|
| 140 | 0 | 2017 | empower | 4.579556 |
| 71 | 0 | 2018 | express | 4.625586 |
| 72 | 0 | 2019 | express | 4.578641 |
| 73 | 0 | 2020 | express | 4.335139 |
| 144 | 0 | 2021 | empower | 4.375018 |
| 103 | 0 | 2022 | assem | 4.396123 |
| 146 | 0 | 2023 | empower | 4.387060 |
7. Visualize the data
Create two bar plots showing how the human rights with the lowest average scores evolved over this period for Pacific and Continental countries:
- X-axis: year
- Y-axis: right_mean (the lowest mean value per year for each group).
# Create a figure with two subplots (1 row, 2 columns)
fig, axes = plt.subplots(1, 2, figsize=(15, 6))
custom_palette = {
'express': 'seagreen',
'empower': 'limegreen',
'assem': 'darkturquoise',
'tort': 'greenyellow'
}
# Plot for Pacific countries (pacific_binary = 1) in the first subplot
sns.barplot(x='year', y='right_mean', hue='right', data=pacific_data, ax=axes[0], palette=custom_palette)
axes[0].set_title('Right with Lowest Mean by Year for Pacific Countries', fontsize=14)
axes[0].set_xlabel('Year')
axes[0].set_ylabel('Right Mean')
axes[0].tick_params(axis='x', rotation=45)
axes[0].set_ylim(0, 7)
# Plot for continental countries (pacific_binary = 0) in the second subplot
sns.barplot(x='year', y='right_mean', hue='right', data=continental_data, ax=axes[1], palette=custom_palette)
axes[1].set_title('Right with Lowest Mean by Year for Continental Countries', fontsize=14)
axes[1].set_xlabel('Year')
axes[1].set_ylabel('Right Mean')
axes[1].tick_params(axis='x', rotation=45)
axes[1].set_ylim(0, 7)
# Adjust layout to avoid overlap
plt.tight_layout()
# Show the combined plot
plt.show()
8. Observation
The graph highlights notable disparities between Pacific and Continental countries in terms of their lowest-scoring human rights over time.
Pacific countries: in 2017 the lowest mean score was for freedom of expression (express), at slightly above 3, which indicates significant challenges to the rights of opinion and expression. The right to expression improved markedly from 2017 to 2018, and its average has remained relatively high since then (around 6). By 2023, the lowest-scoring right had shifted to freedom from torture and ill-treatment (tort), which itself has a relatively high score (around 7). This progression reflects the significant improvement Pacific countries have made in addressing their weakest human rights issues.
Continental countries: the lowest score shows no significant improvement and stays around 4.5 over the years, while the right holding the lowest score changes from year to year. In 2017, empowerment (empower) was the lowest-scoring right, with a mean score around 4. Between 2018 and 2020, freedom of expression (express) was the lowest-scoring right, with scores stabilizing around 5, which indicates significant challenges to the rights of opinion and expression. By 2022, assembly rights (assem) had become the lowest-scoring right, indicating challenges in guaranteeing the right to assembly and association. In 2023, empowerment was once again the lowest-scoring right, showing persistent barriers to individual empowerment.
Pacific countries show more significant progress over time, with their lowest-scoring right improving from about 3 in 2017 to nearly 7 in 2023. In contrast, continental countries improved more slowly, with their lowest-scoring right staying around 4.5 throughout the period. Continental countries also switch more often between lowest-scoring rights, suggesting that more rights are at risk or have low scores.
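The "fluctuation" described above can be made countable. The sketch below is only an illustration: it re-enters the lowest-scoring right per year for Continental countries as named in the observation text (2021 is left out because the text does not name it), and the names lowest, years_as_lowest and n_switches are hypothetical helpers, not part of the original notebook.

```python
import pandas as pd

# Lowest-scoring right per year for Continental countries, taken from the
# observation text above. 2021 is omitted because the text does not name it.
lowest = pd.DataFrame({
    'year': [2017, 2018, 2019, 2020, 2022, 2023],
    'right': ['empower', 'express', 'express', 'express', 'assem', 'empower'],
}).sort_values('year')

# How many of the listed years each right spends as the lowest-scoring one
years_as_lowest = lowest['right'].value_counts()

# Number of times the lowest-scoring right changes between consecutive listed
# years (the first row has no predecessor, so it is skipped)
changed = lowest['right'].ne(lowest['right'].shift())
n_switches = int(changed.iloc[1:].sum())

print(years_as_lowest)
print('switches:', n_switches)
```

Running the same two steps on the full continental_data and on the Pacific frame would give a like-for-like comparison of how often the weakest right changes in each group.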
Insight 2:
In Pacific countries, freedom of expression is the most poorly protected human right, appearing as the lowest-scoring right for 5 years (between 2017 and 2022). In 2023, freedom from torture and ill-treatment became the lowest-scoring right. In continental countries, the empowerment right and freedom of expression are the most frequently denied human rights, each appearing as the lowest-scoring right for three years. Assembly rights are also poorly protected and appear as the lowest-scoring right in 2022.
When comparing the two regions, Pacific countries made more significant improvements in protecting human rights than continental countries. Pacific countries also show fewer changes in their lowest-scoring right, suggesting fewer rights at risk or with low values. Thus, individuals in Pacific countries generally have better-protected human rights than those in continental countries.
Insight Question 3:
What is the status of empowerment rights in Asia-Pacific regions between 2018 and 2023? Specifically, do women in regions with stronger empowerment rights have greater rights in education?
1. Import the dataset
people_at_risk_df is People At Risk Dataset
cpr_df is HRMI Civil and Political Rights (CPR) Dataset
cpr_df = pd.read_csv('csv files/cpr.csv')
people_at_risk_df = pd.read_csv('csv files/people_at_risk.csv')
2. Filter the data
Filter the data to include only the selected Asia-Pacific countries/regions (listed in target_countries). Also extract the relevant columns:
- Empower (empower_mean): Overall right to empowerment
- educ_atrisk_prop10: Proportion of women/girls at risk of lacking right to education
target_countries = ['China', 'South Korea', 'HongKong', 'Taiwan', 'Vietnam', 'Malaysia', 'Thailand', 'Singapore']
filtered_cpr_df = cpr_df[(cpr_df['country'].isin(target_countries))]
filtered_cpr_df = filtered_cpr_df[['country', 'year', 'empower_mean']]
filtered_people_at_risk_df = people_at_risk_df[(people_at_risk_df['country'].isin(target_countries))]
filtered_people_at_risk_df = filtered_people_at_risk_df[['country', 'year', 'educ_atrisk_prop10']]
display(filtered_cpr_df)
display(filtered_people_at_risk_df)
| | country | year | empower_mean |
|---|---|---|---|
| 104 | South Korea | 2017 | 7.091490 |
| 105 | South Korea | 2018 | 7.096694 |
| 106 | South Korea | 2019 | 6.599675 |
| 107 | South Korea | 2020 | 6.437470 |
| 108 | South Korea | 2021 | 7.256850 |
| 109 | South Korea | 2022 | 7.158103 |
| 110 | South Korea | 2023 | 5.785240 |
| 125 | Vietnam | 2017 | 1.207825 |
| 126 | Vietnam | 2018 | 2.108225 |
| 127 | Vietnam | 2019 | 2.430420 |
| 128 | Vietnam | 2020 | 2.496144 |
| 129 | Vietnam | 2021 | 2.357853 |
| 130 | Vietnam | 2022 | 2.212683 |
| 131 | Vietnam | 2023 | 2.257634 |
| 199 | Malaysia | 2019 | 5.672837 |
| 200 | Malaysia | 2020 | 3.773163 |
| 201 | Malaysia | 2021 | 4.088863 |
| 202 | Malaysia | 2022 | 4.383113 |
| 203 | Malaysia | 2023 | 4.775669 |
| 204 | Taiwan | 2019 | 7.072268 |
| 205 | Taiwan | 2020 | 6.988512 |
| 206 | Taiwan | 2021 | 6.993072 |
| 207 | Taiwan | 2022 | 7.069246 |
| 208 | Taiwan | 2023 | 7.247096 |
| 209 | China | 2020 | 1.613400 |
| 210 | China | 2021 | 1.516172 |
| 211 | China | 2022 | 1.627610 |
| 212 | China | 2023 | 1.913970 |
| 223 | Thailand | 2021 | 3.681815 |
| 224 | Thailand | 2022 | 4.080241 |
| 225 | Thailand | 2023 | 4.842543 |
| 226 | Singapore | 2022 | 4.089519 |
| 227 | Singapore | 2023 | 4.190406 |
| | country | year | educ_atrisk_prop10 |
|---|---|---|---|
| 89 | South Korea | 2018 | 0.000000 |
| 90 | South Korea | 2019 | 0.000000 |
| 91 | South Korea | 2020 | 0.000000 |
| 92 | South Korea | 2021 | 0.000000 |
| 93 | South Korea | 2022 | 0.000000 |
| 94 | South Korea | 2023 | 0.125000 |
| 107 | Vietnam | 2018 | 0.235294 |
| 108 | Vietnam | 2019 | 0.153846 |
| 109 | Vietnam | 2020 | 0.000000 |
| 110 | Vietnam | 2021 | 0.200000 |
| 111 | Vietnam | 2022 | 0.052632 |
| 112 | Vietnam | 2023 | 0.000000 |
| 163 | Malaysia | 2020 | 0.200000 |
| 164 | Malaysia | 2021 | 0.166667 |
| 165 | Malaysia | 2022 | 0.333333 |
| 166 | Malaysia | 2023 | 0.166667 |
| 167 | Taiwan | 2020 | 0.000000 |
| 168 | Taiwan | 2021 | 0.033333 |
| 169 | Taiwan | 2022 | 0.040000 |
| 170 | Taiwan | 2023 | 0.066667 |
| 171 | China | 2021 | 0.243243 |
| 172 | China | 2022 | 0.145833 |
| 173 | China | 2023 | 0.090909 |
| 181 | Thailand | 2022 | 0.076923 |
| 182 | Thailand | 2023 | 0.333333 |
| 183 | Singapore | 2023 | 0.000000 |
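Neither filtered table contains rows for 'HongKong', which may mean that this spelling does not match the name used in the datasets. The following is a minimal sketch of how to list target names that match nothing. It uses a small stand-in frame (demo_df) instead of the real cpr_df, and the 'Hong Kong' spelling in it is a hypothetical example, not confirmed from the CSV files.

```python
import pandas as pd

target_countries = ['China', 'South Korea', 'HongKong', 'Taiwan', 'Vietnam',
                    'Malaysia', 'Thailand', 'Singapore']

# Stand-in for cpr_df. The 'Hong Kong' spelling is an assumption used only to
# illustrate a mismatch; check the real 'country' column for the actual name.
demo_df = pd.DataFrame({
    'country': ['China', 'South Korea', 'Hong Kong', 'Taiwan', 'Vietnam',
                'Malaysia', 'Thailand', 'Singapore'],
    'year': [2023] * 8,
})

# Target names that never appear in the 'country' column
available = set(demo_df['country'].unique())
unmatched = [name for name in target_countries if name not in available]
print(unmatched)
```

Applying the same check to cpr_df and people_at_risk_df shows whether a region is truly absent from the data or only filtered out by a naming difference.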
3. Merge the data
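Merge the filtered datasets filtered_cpr_df and filtered_people_at_risk_df on country and year to create a new dataset merged_df. Because pd.merge defaults to an inner join, only country-year pairs present in both datasets are kept.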
merged_df = pd.merge(filtered_cpr_df, filtered_people_at_risk_df, on=['country', 'year'])
display(merged_df)
| | country | year | empower_mean | educ_atrisk_prop10 |
|---|---|---|---|---|
| 0 | South Korea | 2018 | 7.096694 | 0.000000 |
| 1 | South Korea | 2019 | 6.599675 | 0.000000 |
| 2 | South Korea | 2020 | 6.437470 | 0.000000 |
| 3 | South Korea | 2021 | 7.256850 | 0.000000 |
| 4 | South Korea | 2022 | 7.158103 | 0.000000 |
| 5 | South Korea | 2023 | 5.785240 | 0.125000 |
| 6 | Vietnam | 2018 | 2.108225 | 0.235294 |
| 7 | Vietnam | 2019 | 2.430420 | 0.153846 |
| 8 | Vietnam | 2020 | 2.496144 | 0.000000 |
| 9 | Vietnam | 2021 | 2.357853 | 0.200000 |
| 10 | Vietnam | 2022 | 2.212683 | 0.052632 |
| 11 | Vietnam | 2023 | 2.257634 | 0.000000 |
| 12 | Malaysia | 2020 | 3.773163 | 0.200000 |
| 13 | Malaysia | 2021 | 4.088863 | 0.166667 |
| 14 | Malaysia | 2022 | 4.383113 | 0.333333 |
| 15 | Malaysia | 2023 | 4.775669 | 0.166667 |
| 16 | Taiwan | 2020 | 6.988512 | 0.000000 |
| 17 | Taiwan | 2021 | 6.993072 | 0.033333 |
| 18 | Taiwan | 2022 | 7.069246 | 0.040000 |
| 19 | Taiwan | 2023 | 7.247096 | 0.066667 |
| 20 | China | 2021 | 1.516172 | 0.243243 |
| 21 | China | 2022 | 1.627610 | 0.145833 |
| 22 | China | 2023 | 1.913970 | 0.090909 |
| 23 | Thailand | 2022 | 4.080241 | 0.076923 |
| 24 | Thailand | 2023 | 4.842543 | 0.333333 |
| 25 | Singapore | 2023 | 4.190406 | 0.000000 |
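The merged table is shorter than the CPR table because unmatched country-year pairs are dropped. One way to see which rows are lost is an outer merge with indicator=True. The sketch below uses a few rows copied from the filtered tables above. The names cpr_part, par_part, audit and dropped are illustrative and do not appear in the original notebook.

```python
import pandas as pd

# A few rows copied from the filtered tables above (South Korea and Singapore)
cpr_part = pd.DataFrame({
    'country': ['South Korea', 'South Korea', 'Singapore', 'Singapore'],
    'year': [2017, 2018, 2022, 2023],
    'empower_mean': [7.091490, 7.096694, 4.089519, 4.190406],
})
par_part = pd.DataFrame({
    'country': ['South Korea', 'Singapore'],
    'year': [2018, 2023],
    'educ_atrisk_prop10': [0.0, 0.0],
})

# how='outer' keeps unmatched rows; indicator=True adds a '_merge' column
# whose value is 'both', 'left_only' or 'right_only'
audit = pd.merge(cpr_part, par_part, on=['country', 'year'],
                 how='outer', indicator=True)

# Rows that an inner merge would silently drop
dropped = audit[audit['_merge'] != 'both'][['country', 'year', '_merge']]
print(dropped)
```

In this subset, the rows that exist only in the CPR data (South Korea 2017 and Singapore 2022) are the ones that disappear from merged_df.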
4. Visualize the data -- Bar chart of Empowerment Rights in Asia-Pacific Regions
Plot a bar chart comparing empower_mean scores for countries/regions across years, to visualize how empowerment rights change over time for each country. Bars are colored by year.
- X-axis: Country/region (country).
- Y-axis: Empowerment mean scores (empower_mean).
plt.figure(figsize=(12, 6))
sns.barplot(data=merged_df, x='country', y='empower_mean', hue='year', palette='viridis')
plt.title('Empowerment Rights in Asia-Pacific Regions between 2018 and 2023')
plt.ylabel('Empowerment Mean Score')
plt.xlabel('Country/Region')
plt.xticks(rotation=45)
plt.legend(title='Year')
plt.show()
5. Observation
The bar chart compares the empowerment scores of Asia-Pacific regions between 2018 and 2023. Taiwan and South Korea rank highest, with average empowerment scores around 7 and 6.5 respectively. Malaysia follows, with its score improving from about 4 to 5 over time. Vietnam and China exhibit significantly lower empowerment scores (around 2). Interestingly, Vietnam and China are both communist countries, whereas Taiwan and South Korea are capitalist regions. This difference in political and economic systems may be a factor affecting people's empowerment rights. Data are missing for Thailand from 2018 to 2021 and for Singapore from 2018 to 2022, which makes their long-term trends difficult to analyze. However, both have scores similar to Malaysia's (around 4.5), suggesting a similarity among capitalist countries in Southeast Asia. Given the disparity in empowerment scores between capitalist countries/regions and communist countries, further exploration of the impact of governance systems on empowerment scores may provide meaningful insights.
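Because the countries/regions have different numbers of observed years, it can help to report each average together with the number of years behind it. The sketch below does this for three regions using rows copied from the merged table above. The names subset and summary are illustrative only.

```python
import pandas as pd

# Rows copied from the merged table above (three countries/regions only)
rows = [
    ('Taiwan', 2020, 6.988512), ('Taiwan', 2021, 6.993072),
    ('Taiwan', 2022, 7.069246), ('Taiwan', 2023, 7.247096),
    ('Thailand', 2022, 4.080241), ('Thailand', 2023, 4.842543),
    ('Singapore', 2023, 4.190406),
]
subset = pd.DataFrame(rows, columns=['country', 'year', 'empower_mean'])

# Mean score plus the number of years it is based on, so that averages
# resting on one or two years stand out
summary = (
    subset.groupby('country')['empower_mean']
    .agg(mean_score='mean', n_years='count')
    .sort_values('mean_score', ascending=False)
)
print(summary.round(2))
```

Applied to merged_df, the n_years column makes it clear that the Thailand and Singapore averages rest on far fewer years than those of South Korea or Vietnam.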
6. Scatter Plot -- Empowerment vs. Education Rights at Risk in Asia-Pacific Regions
Plot a scatter plot showing the relationship between empowerment rights empower_mean and education rights at risk educ_atrisk_prop10. Both variables are averaged over the available years for each country, so each point represents one country/region.
- X-axis: Empowerment mean scores (empower_mean).
- Y-axis: Average proportion of women/girls at risk of lacking the right to education (educ_atrisk_prop10).
scatter_data = merged_df.groupby('country').agg({
    'empower_mean': 'mean',
    'educ_atrisk_prop10': 'mean'
}).reset_index()
# Create the scatter plot using the 'hvplot.scatter' function
scatter = scatter_data.hvplot.scatter(
    x='empower_mean',
    y='educ_atrisk_prop10',
    by='country',
    marker='o',
    c='empower_mean',  # Color points based on 'empower_mean'
    cmap='viridis',  # Apply 'viridis' colormap
    label='country',
    ylabel='Average Proportion of Women/Girls at Risks of Education',
    xlabel='Empowerment Mean'
)
# Customize the plot options
final_plot = scatter.opts(
    title="Interactive Scatter Plot: Average Proportion of Women/Girls' at Risks of Education vs. \n Empowerment Mean between Asia-Pacific Countries",
    legend_position='right',
    width=800,
    height=600
)
# Display the final plot
final_plot
7. Observation
From the scatter plot, there is a negative association between the empowerment mean and the proportion of women/girls at risk in education: higher empowerment is generally associated with lower education risk for women/girls. The regions with the highest empowerment scores (Taiwan and South Korea) show low proportions of women/girls at education risk, with average values below 0.1. This indicates that higher empowerment may be associated with stronger policies and societal structures that mitigate women/girls' risks in education. In contrast, China and Vietnam, with lower empowerment scores, exhibit higher proportions (above 0.1). Interestingly, Malaysia and Thailand, with moderate empowerment mean scores (~4-5), exhibit the highest proportions of women/girls at education risk (~0.2), while Singapore, with a similar empowerment score, shows a proportion of 0 based on a single year of data (2023). This discrepancy from the general trend suggests that other confounding factors influence women/girls' risks in education (e.g., economic development, governance systems, cultural norms).
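The strength of the association can be checked numerically instead of being read off the plot. The sketch below re-enters the rows of the merged table above, averages them per country as the scatter plot does, and computes a Pearson coefficient and a rank-based (Spearman-style) coefficient. The names data, country_means, pearson_r and spearman_r are illustrative, and with only seven points any coefficient should be read with caution.

```python
import pandas as pd

# (country, empower_mean, educ_atrisk_prop10) rows copied from the merged table
rows = [
    ('South Korea', 7.096694, 0.000000), ('South Korea', 6.599675, 0.000000),
    ('South Korea', 6.437470, 0.000000), ('South Korea', 7.256850, 0.000000),
    ('South Korea', 7.158103, 0.000000), ('South Korea', 5.785240, 0.125000),
    ('Vietnam', 2.108225, 0.235294), ('Vietnam', 2.430420, 0.153846),
    ('Vietnam', 2.496144, 0.000000), ('Vietnam', 2.357853, 0.200000),
    ('Vietnam', 2.212683, 0.052632), ('Vietnam', 2.257634, 0.000000),
    ('Malaysia', 3.773163, 0.200000), ('Malaysia', 4.088863, 0.166667),
    ('Malaysia', 4.383113, 0.333333), ('Malaysia', 4.775669, 0.166667),
    ('Taiwan', 6.988512, 0.000000), ('Taiwan', 6.993072, 0.033333),
    ('Taiwan', 7.069246, 0.040000), ('Taiwan', 7.247096, 0.066667),
    ('China', 1.516172, 0.243243), ('China', 1.627610, 0.145833),
    ('China', 1.913970, 0.090909),
    ('Thailand', 4.080241, 0.076923), ('Thailand', 4.842543, 0.333333),
    ('Singapore', 4.190406, 0.000000),
]
data = pd.DataFrame(rows, columns=['country', 'empower_mean', 'educ_atrisk_prop10'])

# Per-country averages, matching the aggregation used for the scatter plot
country_means = data.groupby('country')[['empower_mean', 'educ_atrisk_prop10']].mean()

# Pearson correlation between the two averaged columns
pearson_r = country_means['empower_mean'].corr(country_means['educ_atrisk_prop10'])

# Spearman correlation computed as Pearson correlation on the ranks
ranks = country_means.rank()
spearman_r = ranks['empower_mean'].corr(ranks['educ_atrisk_prop10'])

print(country_means.round(3))
print('Pearson:', round(pearson_r, 2), 'Spearman:', round(spearman_r, 2))
```

Comparing the two coefficients gives a sense of how much the linear trend depends on a few extreme countries rather than on a consistent ordering.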
Insight 3:
In Asia-Pacific regions, capitalist countries/regions generally have stronger empowerment rights than communist countries, so political and economic systems may be a factor affecting people's empowerment rights. Moreover, a negative correlation is observed between the empowerment score and women/girls' risks in education: a higher empowerment score is associated with lower education risk for women/girls. In countries with weaker empowerment rights, women and girls may be more likely to face systemic barriers such as gender inequality or lack of resources, exposing them to higher risks in education. Combined with the previous observation that capitalist regions generally have stronger empowerment rights, this pattern may be shaped by the socio-economic and political dynamics of different countries.