Global Human Rights Disparities Analysis

Ashley Yu

Relevant library imports:

In [2]:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import random
import sympy as sp
from pivottablejs import pivot_ui
import hvplot.pandas
from bokeh.io import output_notebook
import seaborn as sns

Insight Question 1:

Between high-income countries and low/middle-income countries, do people have a different overall right to safety from the state from 2017 to 2023? Specifically, are people's Physical Integrity Rights positively correlated with migrants'/immigrants' risk of torture and ill-treatment?

Part 1: People in High-Income Countries have more Physical Integrity Rights

1. Import the dataset:
hiy is Income-Adjusted Data Sets of High Income Countries
identity is HRMI People at Risk (PaR) Dataset
cpr is HRMI Civil and Political Rights (CPR) Dataset

In [15]:
hiy = pd.read_csv('csv files/esr_hiy_incomeadjusted.csv')
identity = pd.read_csv('csv files/people_at_risk.csv')
cpr = pd.read_csv('csv files/cpr.csv')

2. Data processing

In the dataset hiy, the column High_Income_Country uses numeric labels:

  • 1 represents High-Income Countries (HIY).
  • 0 represents Low- and Middle-Income Countries (LMY).

3. Label the data

Create a new dataset hiy_label that keeps only the 'Country' and 'High_Income_Country' columns, with one row per country.

In [16]:
hiy['High_Income_Country'] = hiy['High_Income_Country'].replace({1: 'HIY', 0: 'LMY'})
hiy = hiy.drop_duplicates(subset=['Country'])
hiy_label = hiy[['Country','High_Income_Country']]
display(hiy_label)
Country High_Income_Country
0 Aruba HIY
32 Afghanistan LMY
64 Angola LMY
96 Albania LMY
128 Andorra HIY
... ... ...
6944 Kosovo LMY
6976 Yemen, Rep. LMY
7008 South Africa LMY
7040 Zambia LMY
7072 Zimbabwe LMY

222 rows × 2 columns
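
A caution on the de-duplication step: drop_duplicates(subset=['Country']) keeps the first row per country, so if a country's income classification changed across years in the source file, only the first-listed label would survive. The sketch below uses made-up rows (the country names and labels are illustrative, not taken from the dataset) to show one possible way to check for countries carrying more than one label before collapsing.

```python
import pandas as pd

# Toy stand-in for the hiy table; values are illustrative only.
toy = pd.DataFrame({
    "Country": ["A", "A", "B", "B", "C"],
    "High_Income_Country": [0, 1, 1, 1, 0],
})
toy["High_Income_Country"] = toy["High_Income_Country"].replace({1: "HIY", 0: "LMY"})

# Count distinct labels per country; more than one means the label changed.
labels_per_country = toy.groupby("Country")["High_Income_Country"].nunique()
ambiguous = labels_per_country[labels_per_country > 1].index.tolist()

# Same collapse as in the notebook: the first row per country wins.
toy_label = toy.drop_duplicates(subset=["Country"])[["Country", "High_Income_Country"]]
print(ambiguous)
print(toy_label)
```

If the list of ambiguous countries is empty for the real data, the first-row rule loses nothing.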

4. Merge the data:

Merge the income classification labels hiy_label with the Civil and Political Rights dataset cpr, adding the income group (HIY / LMY) for each country. Extract variables related to Physical Integrity Rights (Physint).

  • Physint: Physical Integrity Rights, which measure people's overall right to safety from the state.
In [17]:
cpr = cpr.copy()
hiy_label = hiy_label.copy()
cpr.rename(columns={'country': 'Country'}, inplace=True)
hiy_label.rename(columns={'High_Income_Country': 'Label'}, inplace=True)
label_merge = pd.merge(cpr, hiy_label, on='Country', how='inner')
display(label_merge)
cpr_phy = label_merge[['Country', 'year', 'physint_mean', 'physint_sd', 'physint_lo', 'physint_hi', 'Label']] 
display(cpr_phy)
Country year countryyear iso3c iso3n cowcode hrmicode disap_mean disap_sd disap_lo ... rel_hi physint_mean physint_sd physint_lo physint_hi empower_mean empower_sd empower_lo empower_hi Label
0 Angola 2017 AGO2017 AGO 24.0 540.0 1 6.464161 0.898106 5.314585 ... NaN 4.401868 0.795705 3.383366 5.420371 2.735583 0.737381 1.791735 3.679431 LMY
1 Angola 2018 AGO2018 AGO 24.0 540.0 1 7.174069 0.491192 6.545342 ... NaN 5.122885 0.812793 4.082510 6.163259 4.580472 0.701104 3.683059 5.477886 LMY
2 Angola 2019 AGO2019 AGO 24.0 540.0 1 6.357960 0.626105 5.556545 ... NaN 4.486554 0.790036 3.475308 5.497800 4.226362 0.671121 3.367327 5.085398 LMY
3 Angola 2020 AGO2020 AGO 24.0 540.0 1 5.453531 0.792786 4.438765 ... NaN 3.438196 0.924271 2.255129 4.621263 2.709881 0.691141 1.825221 3.594541 LMY
4 Angola 2021 AGO2021 AGO 24.0 540.0 1 4.963902 0.588297 4.210882 ... NaN 3.121782 0.849364 2.034596 4.208967 2.996323 0.736295 2.053865 3.938781 LMY
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
177 Thailand 2023 THA2023 THA 764.0 800.0 47 8.162083 0.621875 7.366082 ... 9.428472 6.726162 0.901969 5.571642 7.880681 4.842543 0.708837 3.935232 5.749854 LMY
178 Singapore 2022 SGP2022 SGP 702.0 830.0 49 8.410766 0.316615 8.005499 ... 7.127632 7.231867 0.765176 6.252442 8.211292 4.089519 0.672613 3.228574 4.950464 HIY
179 Singapore 2023 SGP2023 SGP 702.0 830.0 49 8.403709 0.313198 8.002816 ... 7.073044 7.294268 0.743383 6.342738 8.245797 4.190406 0.712885 3.277913 5.102899 HIY
180 Sri Lanka 2022 LKA2022 LKA 144.0 780.0 50 7.047200 0.292876 6.672319 ... 6.533591 5.090563 0.740136 4.143188 6.037937 3.610210 0.680381 2.739323 4.481098 LMY
181 Sri Lanka 2023 LKA2023 LKA 144.0 780.0 50 7.356123 0.282373 6.994685 ... 6.520242 5.441030 0.729936 4.506712 6.375348 3.370074 0.657596 2.528351 4.211797 LMY

182 rows × 52 columns

Country year physint_mean physint_sd physint_lo physint_hi Label
0 Angola 2017 4.401868 0.795705 3.383366 5.420371 LMY
1 Angola 2018 5.122885 0.812793 4.082510 6.163259 LMY
2 Angola 2019 4.486554 0.790036 3.475308 5.497800 LMY
3 Angola 2020 3.438196 0.924271 2.255129 4.621263 LMY
4 Angola 2021 3.121782 0.849364 2.034596 4.208967 LMY
... ... ... ... ... ... ... ...
177 Thailand 2023 6.726162 0.901969 5.571642 7.880681 LMY
178 Singapore 2022 7.231867 0.765176 6.252442 8.211292 HIY
179 Singapore 2023 7.294268 0.743383 6.342738 8.245797 HIY
180 Sri Lanka 2022 5.090563 0.740136 4.143188 6.037937 LMY
181 Sri Lanka 2023 5.441030 0.729936 4.506712 6.375348 LMY

182 rows × 7 columns
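
The merged table has 182 rows, while cpr.csv has 230 rows when it is loaded again later in this notebook, so the inner join dropped some country-year rows. A likely cause is country names spelled differently in the two files, though that is an assumption. The sketch below, on toy data with a hypothetical spelling mismatch, shows how a left join with indicator=True could list the names that found no match.

```python
import pandas as pd

# Toy frames; the country spellings are hypothetical examples of a mismatch.
toy_cpr = pd.DataFrame({"Country": ["Angola", "South Korea", "Thailand"],
                        "year": [2017, 2017, 2021]})
toy_label = pd.DataFrame({"Country": ["Angola", "Korea, Rep.", "Thailand"],
                          "Label": ["LMY", "HIY", "LMY"]})

# how='left' keeps every CPR row; indicator=True adds a '_merge' column
# saying whether a match was found in the label table.
check = pd.merge(toy_cpr, toy_label, on="Country", how="left", indicator=True)
unmatched = sorted(check.loc[check["_merge"] == "left_only", "Country"].unique())
print(unmatched)
```

Names reported this way could then be harmonized by hand before the inner merge.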

5. Visualize the data

Create a line graph that shows the average Physical Integrity Rights (physint_mean) for high-income (HIY) and low/middle-income (LMY) countries over the years 2017–2023.

  • X-axis: year (2017 to 2023)
  • Y-axis: Average Physical Integrity Rights (physint_mean)
  • Red Line: Represents HIY (High-Income Countries)
  • Blue Line: Represents LMY (Low/Middle-Income Countries)
In [18]:
averaged_data = cpr_phy.groupby(['Label', 'year'])['physint_mean'].mean().reset_index()
hiy_data = averaged_data[averaged_data['Label'] == 'HIY']
lmy_data = averaged_data[averaged_data['Label'] == 'LMY']
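# Set the colors explicitly so the lines match the description above (red = HIY, blue = LMY)
plt.plot(hiy_data['year'], hiy_data['physint_mean'], label='HIY', marker='o', color='red')
plt.plot(lmy_data['year'], lmy_data['physint_mean'], label='LMY', marker='o', color='blue')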
plt.xlabel('Year')
plt.ylabel('Average Physical Integrity Rights')
plt.title('Average Physical Integrity Rights Over Years for HIY and LMY Countries')
plt.legend(title='Label')
plt.grid(True)
plt.show()
[Figure: Average Physical Integrity Rights Over Years for HIY and LMY Countries]

6. Observation

In the line graph, high-income countries (HIY) show a higher average Physical Integrity Rights score than low/middle-income countries (LMY) in every year. Thus, on average, people in high-income countries have a greater right to safety from the state.
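
The year-by-year comparison can also be checked numerically rather than read off the plot. The sketch below uses made-up averages in the same shape as averaged_data; the gap column and the hiy_always_higher flag are names introduced here for illustration.

```python
import pandas as pd

# Toy averages in the same shape as averaged_data; numbers are made up.
toy_avg = pd.DataFrame({
    "Label": ["HIY", "HIY", "LMY", "LMY"],
    "year": [2017, 2018, 2017, 2018],
    "physint_mean": [7.0, 7.5, 5.0, 5.2],
})

# One row per year, one column per income group.
wide = toy_avg.pivot(index="year", columns="Label", values="physint_mean")
wide["gap"] = wide["HIY"] - wide["LMY"]

# True only if HIY is above LMY in every year.
hiy_always_higher = bool((wide["gap"] > 0).all())
print(wide)
print(hiy_always_higher)
```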

Part 2: In both HIY and LMY, migrants'/immigrants' risk of torture and ill-treatment shows great variation.

1. Merge Data

Merge the HRMI People at Risk dataset identity with the income classification labels hiy_label, adding a classification of High-Income (HIY) or Low/Middle-Income (LMY) for each country.

2. Extract Migrant/Immigrant related Data

Extract the variables related to migrants and immigrants into identity_imm.

  • tort_atrisk_prop24: Proportion of respondents who identified migrants/immigrants at risk of torture and ill-treatment.
  • tort_atrisk_count24: Count of respondents identifying migrants/immigrants at risk of torture and ill-treatment.
In [19]:
hiy_label = hiy_label.copy()
hiy_label.rename(columns={'High_Income_Country': 'Label'}, inplace=True)
identity.rename(columns={'country': 'Country'}, inplace=True)
label_identity = pd.merge(identity, hiy_label, on='Country', how='inner')
display(label_identity)
identity_imm = label_identity[['Country', 'year', 'tort_atrisk_prop24', 'tort_atrisk_count24', 'Label']]
display(identity_imm)
Country year countryyear iso3c iso3n cowcode hrmicode food_total_atrisk_resp food_atrisk_count1 food_atrisk_prop1 ... union_atrisk_prop27 union_atrisk_count28 union_atrisk_prop28 union_atrisk_count29 union_atrisk_prop29 union_atrisk_count30 union_atrisk_prop30 union_atrisk_count31 union_atrisk_prop31 Label
0 Angola 2018 AGO2018 AGO 24.0 540.0 1 11 3 0.272727 ... 0.090909 2.0 0.181818 3.0 0.272727 4.0 0.363636 1.0 0.090909 LMY
1 Angola 2019 AGO2019 AGO 24.0 540.0 1 14 3 0.214286 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN LMY
2 Angola 2020 AGO2020 AGO 24.0 540.0 1 13 3 0.230769 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN LMY
3 Angola 2021 AGO2021 AGO 24.0 540.0 1 12 6 0.500000 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN LMY
4 Angola 2022 AGO2022 AGO 24.0 540.0 1 10 5 0.500000 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN LMY
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
139 Maldives 2023 MDV2023 MDV 462.0 781.0 46 7 0 0.000000 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN LMY
140 Thailand 2022 THA2022 THA 764.0 800.0 47 13 4 0.307692 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN LMY
141 Thailand 2023 THA2023 THA 764.0 800.0 47 6 0 0.000000 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN LMY
142 Singapore 2023 SGP2023 SGP 702.0 830.0 49 7 1 0.142857 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN HIY
143 Sri Lanka 2023 LKA2023 LKA 144.0 780.0 50 29 12 0.413793 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN LMY

144 rows × 1359 columns

Country year tort_atrisk_prop24 tort_atrisk_count24 Label
0 Angola 2018 0.615385 8.0 LMY
1 Angola 2019 0.562500 9.0 LMY
2 Angola 2020 0.071429 1.0 LMY
3 Angola 2021 0.333333 4.0 LMY
4 Angola 2022 0.090909 1.0 LMY
... ... ... ... ... ...
139 Maldives 2023 0.857143 6.0 LMY
140 Thailand 2022 0.071429 1.0 LMY
141 Thailand 2023 0.333333 2.0 LMY
142 Singapore 2023 0.142857 1.0 HIY
143 Sri Lanka 2023 0.068966 2.0 LMY

144 rows × 5 columns

3. Visualize the data

Create a line graph to show the proportion of respondents who identified migrants/immigrants as being at risk of torture and ill-treatment in High-Income (HIY) and Low/Middle-Income (LMY) countries from 2017 to 2023.

  • X-axis: year (2017 to 2023)
  • Y-axis: The proportion of respondents who identified migrants/immigrants at risk for torture and ill-treatment (tort_atrisk_prop24).
  • Red Line: Represents HIY (High-Income Countries)
  • Blue Line: Represents LMY (Low/Middle-Income Countries)
In [20]:
averaged_imm = identity_imm.groupby(['Label', 'year'])['tort_atrisk_prop24'].mean().reset_index()
hiy_imm = averaged_imm[averaged_imm['Label'] == 'HIY']
lmy_imm = averaged_imm[averaged_imm['Label'] == 'LMY']
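# Set the colors explicitly so the lines match the description above (red = HIY, blue = LMY)
plt.plot(hiy_imm['year'], hiy_imm['tort_atrisk_prop24'], label='HIY', marker='o', color='red')
plt.plot(lmy_imm['year'], lmy_imm['tort_atrisk_prop24'], label='LMY', marker='o', color='blue')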
plt.xlabel('Year')
plt.ylabel('Proportion of respondents \n who identified migrants/immigrants \n at risk for torture and ill-treatment')
plt.title('Averaged Migrants/Immigrants Proportion at Risk of Torture Over Years for HIY and LMY Countries')
plt.legend(title='Label')
plt.grid(True)
plt.show()
[Figure: Averaged Migrants/Immigrants Proportion at Risk of Torture Over Years for HIY and LMY Countries]

4. Calculate standard deviation

Calculate the standard deviation (std) of tort_atrisk_prop24, the proportion of respondents identifying migrants/immigrants as at risk of torture and ill-treatment, for each income group (Label: HIY and LMY). This measures how far the values spread from the mean within each income group.

In [21]:
print(identity_imm.groupby('Label')['tort_atrisk_prop24'].std())
Label
HIY    0.215330
LMY    0.219511
Name: tort_atrisk_prop24, dtype: float64

5. Observation

The line plot shows no clear difference between high-income (HIY) and low/middle-income (LMY) countries in the average proportion of respondents who identified migrants/immigrants as at risk of torture and ill-treatment. No statistical test was run here; the lack of a visible difference may be largely due to high variability and the influence of outliers within both groups:

  • Both HIY and LMY groups show a standard deviation of approximately 0.2, which is substantial given the range of values (0.05 to 0.3). This large variability in responses within each group makes it difficult to detect meaningful differences between the two income groups.
  • A high standard deviation may indicate outliers in both groups, although it does not confirm them by itself.

Outliers could disproportionately influence the overall averages.
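
Whether outliers are actually present can be checked directly instead of being inferred from the standard deviation. The sketch below applies the common 1.5 × IQR rule to made-up proportions; the rule is a technique swapped in here for illustration and is not part of the original analysis, and iqr_outliers is a hypothetical helper.

```python
import pandas as pd

# Toy proportions; values are illustrative only.
toy = pd.DataFrame({
    "Label": ["HIY"] * 5 + ["LMY"] * 5,
    "prop": [0.10, 0.12, 0.15, 0.11, 0.90, 0.20, 0.25, 0.30, 0.22, 0.28],
})

def iqr_outliers(s: pd.Series) -> pd.Series:
    """Return the values lying more than 1.5 * IQR beyond the quartiles."""
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    iqr = q3 - q1
    return s[(s < q1 - 1.5 * iqr) | (s > q3 + 1.5 * iqr)]

# Flagged values per income group.
outliers = {label: iqr_outliers(g["prop"]).tolist() for label, g in toy.groupby("Label")}
print(outliers)
```

On the real data, the same helper could be applied to identity_imm grouped by Label.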

Part 3: People in HIY have more physical integrity rights, and migrants/immigrants experience lower risks of torture and ill-treatment

1. Merge the data

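Merge cpr_phy (Physical Integrity Rights data) with identity_imm (migrants'/immigrants' risks). The combined dataset is called final_merge. Because the merge key is Country alone, every CPR year for a country is paired with every People at Risk year for that country, which is why the result has year_x and year_y columns and 820 rows. The per-country averages computed in the next step are not changed by this duplication, since each value is repeated the same number of times.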

In [22]:
final_merge = pd.merge(cpr_phy, identity_imm, on='Country', how='inner')
display(final_merge)
Country year_x physint_mean physint_sd physint_lo physint_hi Label_x year_y tort_atrisk_prop24 tort_atrisk_count24 Label_y
0 Angola 2017 4.401868 0.795705 3.383366 5.420371 LMY 2018 0.615385 8.0 LMY
1 Angola 2017 4.401868 0.795705 3.383366 5.420371 LMY 2019 0.562500 9.0 LMY
2 Angola 2017 4.401868 0.795705 3.383366 5.420371 LMY 2020 0.071429 1.0 LMY
3 Angola 2017 4.401868 0.795705 3.383366 5.420371 LMY 2021 0.333333 4.0 LMY
4 Angola 2017 4.401868 0.795705 3.383366 5.420371 LMY 2022 0.090909 1.0 LMY
... ... ... ... ... ... ... ... ... ... ... ...
815 Thailand 2023 6.726162 0.901969 5.571642 7.880681 LMY 2023 0.333333 2.0 LMY
816 Singapore 2022 7.231867 0.765176 6.252442 8.211292 HIY 2023 0.142857 1.0 HIY
817 Singapore 2023 7.294268 0.743383 6.342738 8.245797 HIY 2023 0.142857 1.0 HIY
818 Sri Lanka 2022 5.090563 0.740136 4.143188 6.037937 LMY 2023 0.068966 2.0 LMY
819 Sri Lanka 2023 5.441030 0.729936 4.506712 6.375348 LMY 2023 0.068966 2.0 LMY

820 rows × 11 columns
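
The year_x and year_y columns in the output come from merging on Country only. The sketch below, on made-up values for one hypothetical country, contrasts that with a merge on both Country and year. Pairing only same-year observations is an alternative design, not what the notebook does, and it restricts the averages to years present in both datasets.

```python
import pandas as pd

# Toy frames: 3 CPR years and 2 PaR years for one country (made-up values).
toy_cpr = pd.DataFrame({"Country": ["X"] * 3, "year": [2017, 2018, 2019],
                        "physint_mean": [4.0, 5.0, 6.0]})
toy_par = pd.DataFrame({"Country": ["X"] * 2, "year": [2018, 2019],
                        "tort_atrisk_prop24": [0.6, 0.2]})

# Key on Country only: every CPR row pairs with every PaR row (3 x 2 = 6).
cross = pd.merge(toy_cpr, toy_par, on="Country", how="inner")

# Key on Country and year: only same-year observations are paired (2 rows).
matched = pd.merge(toy_cpr, toy_par, on=["Country", "year"], how="inner")

print(len(cross), len(matched))
print(cross["physint_mean"].mean(), matched["physint_mean"].mean())
```

The Country-only merge averages physint_mean over all CPR years, while the Country-and-year merge averages only over the overlapping years, so the two can give different per-country means.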

2. Visualize the data

Create an interactive scatter plot showing the relationship between: physint_mean (Physical Integrity Rights) and tort_atrisk_prop24 (Proportion of migrants/immigrants at risk of torture and ill-treatment), comparing high-income (HIY) and low/middle-income (LMY) countries.

  • X-axis: Average Physical Integrity Rights (physint_mean).
  • Y-axis: Proportion of migrants/immigrants at risk of torture (tort_atrisk_prop24).
  • Blue dots represent HIY countries
  • Red dots represent LMY countries.
In [23]:
scatter_data = final_merge.groupby(['Country', 'Label_x']).agg({
    'physint_mean': 'mean',
    'tort_atrisk_prop24': 'mean'
}).reset_index()

hiy_data = scatter_data[scatter_data['Label_x'] == 'HIY']
lmy_data = scatter_data[scatter_data['Label_x'] == 'LMY']

hiy_scatter = hiy_data.hvplot.scatter(
    x='physint_mean',
    y='tort_atrisk_prop24',
    by='Country',
    marker='o',
    color='blue',
    label='HIY',
    xlabel='Average Physical Integrity Rights', 
    ylabel='Average Proportion of respondents \n who identified migrants/immigrants \n at risk for torture and ill-treatment'
)

lmy_scatter = lmy_data.hvplot.scatter(
    x='physint_mean',
    y='tort_atrisk_prop24',
    by='Country',
    marker='o',
    color='red',
    label='LMY',
    xlabel='Average Physical Integrity Rights',
    ylabel='Average Proportion of respondents \n who identified migrants/immigrants \n at risk for torture and ill-treatment'
)

final_plot = (hiy_scatter * lmy_scatter).opts( 
    title='Interactive Scatter Plot: \n Physical Integrity Rights vs. \n Migrant/Immigrant Risks Between HIY and LMY',
    legend_position='right', 
    width=800, 
    height=600
)

final_plot
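Out[23]:
[Interactive scatter plot: Physical Integrity Rights vs. Migrant/Immigrant Risks Between HIY and LMY]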

3. Observation

In the line graph comparing high-income (HIY) and low/middle-income (LMY) countries, there is no clear distinction in the average proportion of respondents identifying migrants/immigrants as at risk of torture and ill-treatment. This could be attributed to the large variance within each income group: by averaging countries into just two broad groups (HIY and LMY), we may overlook the heterogeneity and outliers within them.

In contrast, the scatter plot above zooms in on individual countries within each group, which reduces the influence of within-group variation on the averages and allows a more precise evaluation. With each country plotted individually, there is a clear difference between high-income and low/middle-income countries in the pattern of average physical integrity rights and the proportion of respondents identifying migrants/immigrants as at risk of torture and ill-treatment.

For high-income countries: HIY countries cluster mostly in the lower-right corner of the scatter plot. In HIY countries, people have more physical integrity rights (physint_mean), which corresponds to a lower proportion of migrants/immigrants at risk of torture and ill-treatment (tort_atrisk_prop24). Thus, in HIY countries, the plot suggests a negative correlation between people's physical integrity rights and migrants'/immigrants' risk of torture and ill-treatment.

For low/middle-income countries: LMY countries show great variability, with points scattered across the graph. This reflects wide disparities in physical integrity rights and in migrants'/immigrants' risk of torture and ill-treatment. Thus, in low/middle-income countries, no clear correlation is visible between people's physical integrity rights and migrants'/immigrants' risk of torture and ill-treatment.

However, two outliers are identified in HIY:

  • The United States: Located at approximately (4.8, 0.6), it exhibits a lower-than-expected physical integrity score and higher risks for migrants/immigrants compared to other HIY countries.
  • Saudi Arabia: Found at (3.1, 0.3), it has a low physical integrity score and a moderate risk proportion, deviating from the general HIY trend.
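
The correlation statements above are read from the scatter plot; they could also be quantified. The sketch below computes a Pearson correlation within each income group on made-up per-country averages (the numbers are illustrative, not the notebook's results), using the same column names as scatter_data.

```python
import pandas as pd

# Toy per-country averages; numbers are made up for illustration.
toy = pd.DataFrame({
    "Label_x": ["HIY"] * 4 + ["LMY"] * 4,
    "physint_mean": [6.0, 7.0, 8.0, 9.0, 3.0, 4.0, 5.0, 6.0],
    "tort_atrisk_prop24": [0.40, 0.30, 0.20, 0.10, 0.2, 0.6, 0.1, 0.5],
})

# Pearson correlation between the two columns within each income group.
corr_by_group = {
    label: g["physint_mean"].corr(g["tort_atrisk_prop24"])
    for label, g in toy.groupby("Label_x")
}
print(corr_by_group)
```

Because outliers such as the two listed above can pull a Pearson coefficient strongly, it may be worth computing it both with and without them.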

Insight 1:

From 2017 to 2023, people in high-income countries (HIY) overall have a greater right to safety from the state than people in low/middle-income countries. Stronger protection of physical integrity rights in high-income countries also corresponds to lower proportions of migrants/immigrants at risk of torture and ill-treatment, indicating a negative correlation. Thus, in high-income countries, stronger overall protection of physical integrity is associated with better safety and rights for migrants/immigrants. In contrast, in low/middle-income countries, there is considerable variability in both physical integrity rights and the risks faced by migrants/immigrants. Thus, individuals in low/middle-income countries lack consistent protection, and migrants are exposed to varying degrees of risk of torture and ill-treatment. This variability reflects governance challenges and the unequal provision of physical integrity rights, leading to inconsistent safety for migrants/immigrants in low/middle-income countries.

Insight Question 2:

From 2017 to 2023, which human right is most often denied or poorly protected in Pacific countries vs. continental countries?

1. Import the dataset

pacific_taiwan_data_df is Data Sources used for Taiwan and Pacific Countries
cpr_df is HRMI Civil and Political Rights (CPR) Dataset

In [24]:
cpr_df = pd.read_csv('csv files/cpr.csv') 
pacific_taiwan_data_df = pd.read_csv('csv files/Pacific_Taiwan_data.csv', encoding='latin1')

2. Label the data

Add a binary column 'pacific_binary' to cpr_df:

  • 1 indicates the country is in the Pacific region.
  • 0 indicates the country is not in the Pacific region.
In [25]:
pacific_countries = pacific_taiwan_data_df['Country'].str.strip().unique()
cpr_df['pacific_binary'] = cpr_df['country'].apply(lambda x: 1 if x in pacific_countries else 0)
display(cpr_df)
country year countryyear iso3c iso3n cowcode hrmicode disap_mean disap_sd disap_lo ... rel_hi physint_mean physint_sd physint_lo physint_hi empower_mean empower_sd empower_lo empower_hi pacific_binary
0 Angola 2017 AGO2017 AGO 24.0 540.0 1 6.464161 0.898106 5.314585 ... NaN 4.401868 0.795705 3.383366 5.420371 2.735583 0.737381 1.791735 3.679431 0
1 Angola 2018 AGO2018 AGO 24.0 540.0 1 7.174069 0.491192 6.545342 ... NaN 5.122885 0.812793 4.082510 6.163259 4.580472 0.701104 3.683059 5.477886 0
2 Angola 2019 AGO2019 AGO 24.0 540.0 1 6.357960 0.626105 5.556545 ... NaN 4.486554 0.790036 3.475308 5.497800 4.226362 0.671121 3.367327 5.085398 0
3 Angola 2020 AGO2020 AGO 24.0 540.0 1 5.453531 0.792786 4.438765 ... NaN 3.438196 0.924271 2.255129 4.621263 2.709881 0.691141 1.825221 3.594541 0
4 Angola 2021 AGO2021 AGO 24.0 540.0 1 4.963902 0.588297 4.210882 ... NaN 3.121782 0.849364 2.034596 4.208967 2.996323 0.736295 2.053865 3.938781 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
225 Thailand 2023 THA2023 THA 764.0 800.0 47 8.162083 0.621875 7.366082 ... 9.428472 6.726162 0.901969 5.571642 7.880681 4.842543 0.708837 3.935232 5.749854 0
226 Singapore 2022 SGP2022 SGP 702.0 830.0 49 8.410766 0.316615 8.005499 ... 7.127632 7.231867 0.765176 6.252442 8.211292 4.089519 0.672613 3.228574 4.950464 0
227 Singapore 2023 SGP2023 SGP 702.0 830.0 49 8.403709 0.313198 8.002816 ... 7.073044 7.294268 0.743383 6.342738 8.245797 4.190406 0.712885 3.277913 5.102899 0
228 Sri Lanka 2022 LKA2022 LKA 144.0 780.0 50 7.047200 0.292876 6.672319 ... 6.533591 5.090563 0.740136 4.143188 6.037937 3.610210 0.680381 2.739323 4.481098 0
229 Sri Lanka 2023 LKA2023 LKA 144.0 780.0 50 7.356123 0.282373 6.994685 ... 6.520242 5.441030 0.729936 4.506712 6.375348 3.370074 0.657596 2.528351 4.211797 0

230 rows × 52 columns
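
The labelling step can also be written without a row-by-row lambda. The sketch below is an equivalent formulation using isin on toy data; the country lists are illustrative and not the real file contents.

```python
import pandas as pd

# Toy data; the country lists are illustrative, not the real file contents.
toy_cpr = pd.DataFrame({"country": ["Fiji", "Angola", "Samoa", "Thailand"]})
toy_pacific = pd.Series([" Fiji", "Samoa ", "Fiji"])

# Strip stray whitespace and de-duplicate, as in the notebook.
pacific_names = toy_pacific.str.strip().unique()

# isin() is vectorized; astype(int) turns True/False into 1/0.
toy_cpr["pacific_binary"] = toy_cpr["country"].isin(pacific_names).astype(int)
print(toy_cpr)
```

Either form relies on exact string matches, so a country spelled differently in the two files would silently be labelled 0.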

3. Select the data

Dynamically select all columns that include 'mean' in their name, which represent the average scores for different human rights.

  • Disap: Right to freedom from disappearance
  • Exkill: Right to freedom from extrajudicial execution
  • Arrest: Right to freedom from arbitrary or political arrest and imprisonment
  • Tort: Right to freedom from torture and ill-treatment
  • Dpex: Right to freedom from death penalty execution
  • Express: Right to opinion and expression
  • Polpart: Right to participate in government
  • Assem: Right to assembly and association
  • Rel: Right to freedom of religion and belief
  • Physint: Overall right to safety from the state, i.e. Physical Integrity Rights
  • Empower: Overall right to empowerment
In [26]:
rights_columns = [col for col in cpr_df.columns if 'mean' in col]
display(rights_columns)
['disap_mean',
 'exkill_mean',
 'arrest_mean',
 'tort_mean',
 'dpex_mean',
 'express_mean',
 'polpart_mean',
 'assem_mean',
 'rel_mean',
 'physint_mean',
 'empower_mean']

4. Calculation

Calculate the average values of the selected human rights columns, grouped by pacific_binary and year.

In [27]:
grouped_means = cpr_df.groupby(['pacific_binary', 'year'])[rights_columns].mean().reset_index()
display(grouped_means)
pacific_binary year disap_mean exkill_mean arrest_mean tort_mean dpex_mean express_mean polpart_mean assem_mean rel_mean physint_mean empower_mean
0 0 2017 6.733320 6.247352 5.245753 4.890763 9.230942 4.675750 4.603258 5.023186 NaN 5.368153 4.579556
1 0 2018 6.735316 6.199742 5.115282 4.893804 9.173853 4.625586 5.073367 5.323752 NaN 5.280598 4.812908
2 0 2019 6.911426 6.292821 4.920348 5.026125 9.195390 4.578641 5.143065 5.457412 NaN 5.460887 4.880391
3 0 2020 6.805790 6.411092 5.212349 4.964187 8.866773 4.335139 5.019546 4.740300 NaN 5.426631 4.448103
4 0 2021 6.749326 6.534168 5.391714 5.196561 8.971095 4.411066 5.088646 4.501338 5.103049 5.464674 4.375018
5 0 2022 6.827311 6.611590 5.205224 5.132033 8.844059 4.422194 5.056482 4.396123 6.558734 5.480520 4.397002
6 0 2023 6.892017 6.650289 5.085363 5.054111 8.794397 4.508944 4.993601 4.393300 6.528699 5.559597 4.387060
7 1 2017 8.292837 7.052309 5.469275 4.838824 10.000000 2.841465 4.365599 3.590295 NaN 6.566249 3.152234
8 1 2018 8.554965 8.021153 6.894804 6.795908 10.000000 5.856057 6.803963 7.379413 NaN 7.612841 6.698705
9 1 2019 8.621577 8.031246 7.357238 7.071760 10.000000 6.240824 7.021942 7.347842 NaN 7.649969 6.901580
10 1 2020 8.420540 8.169713 7.810239 7.158492 9.885901 6.118137 7.015026 6.891534 NaN 7.714486 6.544884
11 1 2021 8.492205 8.502044 8.006396 7.448673 10.000000 6.510010 7.129258 6.632172 8.946402 8.001933 6.669821
12 1 2022 8.402668 8.419019 7.961962 6.805907 10.000000 6.253374 6.465792 6.645225 9.271708 7.800919 6.443977
13 1 2023 8.377878 8.366530 7.955499 5.973340 10.000000 7.621193 6.639057 7.016024 9.297732 7.622438 7.247096

5. Combine the data

Reshape the data into long format, combining all human rights into a single column named right.

Identify the human right with the lowest mean value for each combination of pacific_binary and year.

In [28]:
lowest_means = grouped_means.melt(id_vars=['pacific_binary', 'year'], var_name='right', value_name='right_mean')
lowest_means_per_year = lowest_means.loc[lowest_means.groupby(['pacific_binary', 'year'])['right_mean'].idxmin()]
lowest_means_per_year['right'] = lowest_means_per_year['right'].replace({
    'express_mean': 'express',
    'empower_mean': 'empower',
    'assem_mean': 'assem',
    'exkill_mean': 'exkill',
    'disap_mean': 'disap',
    'arrest_mean': 'arrest',
    'tort_mean': 'tort',
    'dpex_mean': 'dpex',
    'rel_mean': 'rel',
    'physint_mean': 'physint'
})
display(lowest_means)
display(lowest_means_per_year)
pacific_binary year right right_mean
0 0 2017 disap_mean 6.733320
1 0 2018 disap_mean 6.735316
2 0 2019 disap_mean 6.911426
3 0 2020 disap_mean 6.805790
4 0 2021 disap_mean 6.749326
... ... ... ... ...
149 1 2019 empower_mean 6.901580
150 1 2020 empower_mean 6.544884
151 1 2021 empower_mean 6.669821
152 1 2022 empower_mean 6.443977
153 1 2023 empower_mean 7.247096

154 rows × 4 columns

pacific_binary year right right_mean
140 0 2017 empower 4.579556
71 0 2018 express 4.625586
72 0 2019 express 4.578641
73 0 2020 express 4.335139
144 0 2021 empower 4.375018
103 0 2022 assem 4.396123
146 0 2023 empower 4.387060
77 1 2017 express 2.841465
78 1 2018 express 5.856057
79 1 2019 express 6.240824
80 1 2020 express 6.118137
81 1 2021 express 6.510010
82 1 2022 express 6.253374
55 1 2023 tort 5.973340
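
The same result can be reached without reshaping to long format. The sketch below, on made-up group means, takes the row-wise minimum of the wide table with idxmin(axis=1); it is an alternative formulation, not the notebook's method. Note that the notebook's comparison includes the summary scores physint and empower alongside the individual rights, whereas this toy table has only three individual rights.

```python
import pandas as pd

# Toy group means in wide form; values are made up.
toy = pd.DataFrame({
    "pacific_binary": [0, 0, 1, 1],
    "year": [2017, 2018, 2017, 2018],
    "express_mean": [4.7, 4.6, 2.8, 5.9],
    "assem_mean": [5.0, 4.4, 3.6, 7.4],
    "tort_mean": [4.9, 4.9, 4.8, 5.5],
})

wide = toy.set_index(["pacific_binary", "year"])

# idxmin(axis=1) gives the column name holding each row's minimum;
# min(axis=1) gives the minimum itself. NaN cells are skipped by default.
lowest = pd.DataFrame({
    "right": wide.idxmin(axis=1).str.replace("_mean", "", regex=False),
    "right_mean": wide.min(axis=1),
}).reset_index()
print(lowest)
```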

6. Split the data

Split the data for plotting into two subsets:

  • pacific_data: Contains Pacific countries.
  • continental_data: Contains non-Pacific (continental) countries.
In [29]:
pacific_data = lowest_means_per_year[lowest_means_per_year['pacific_binary'] == 1]
continental_data = lowest_means_per_year[lowest_means_per_year['pacific_binary'] == 0]
display(pacific_data)
display(continental_data)
pacific_binary year right right_mean
77 1 2017 express 2.841465
78 1 2018 express 5.856057
79 1 2019 express 6.240824
80 1 2020 express 6.118137
81 1 2021 express 6.510010
82 1 2022 express 6.253374
55 1 2023 tort 5.973340
pacific_binary year right right_mean
140 0 2017 empower 4.579556
71 0 2018 express 4.625586
72 0 2019 express 4.578641
73 0 2020 express 4.335139
144 0 2021 empower 4.375018
103 0 2022 assem 4.396123
146 0 2023 empower 4.387060

7. Visualize the data

Create two bar plots showing how the human rights with the lowest average scores evolved over this period for Pacific and Continental countries:

  • X-axis: year
  • Y-axis: right_mean (the lowest mean value per year for each group).
In [30]:
# Create a figure with two subplots (1 row, 2 columns)
fig, axes = plt.subplots(1, 2, figsize=(15, 6))

custom_palette = {
    'express': 'seagreen',
    'empower': 'limegreen',
    'assem': 'darkturquoise',
    'tort': 'greenyellow'
}

# Plot for Pacific countries (pacific_binary = 1) in the first subplot
sns.barplot(x='year', y='right_mean', hue='right', data=pacific_data, ax=axes[0], palette=custom_palette)
axes[0].set_title('Right with Lowest Mean by Year for Pacific Countries', fontsize=14)
axes[0].set_xlabel('Year')
axes[0].set_ylabel('Right Mean')
axes[0].tick_params(axis='x', rotation=45)
axes[0].set_ylim(0, 7)

# Plot for Continental countries (pacific_binary = 0) in the second subplot
sns.barplot(x='year', y='right_mean', hue='right', data=continental_data, ax=axes[1], palette=custom_palette)
axes[1].set_title('Right with Lowest Mean by Year for Continental Countries', fontsize=14)
axes[1].set_xlabel('Year')
axes[1].set_ylabel('Right Mean')
axes[1].tick_params(axis='x', rotation=45)
axes[1].set_ylim(0, 7)

# Adjust layout to avoid overlap
plt.tight_layout()

# Show the combined plot
plt.show()
[Figure: two bar plots, Right with Lowest Mean by Year for Pacific Countries and for Continental Countries]

8. Observation

The graph highlights notable disparities between Pacific and Continental countries in terms of their lowest-scoring human rights over time.

Pacific countries: the lowest mean score in 2017 was for the right to opinion and expression (express), at about 2.8, which indicates significant challenges for that right. However, the expression score improved sharply from 2017 to 2018 and has remained relatively high since (around 6). By 2023, the lowest-scoring right had shifted to freedom from torture and ill-treatment (tort), which also has a relatively high score (about 6). This progression reflects substantial improvement by Pacific countries in their weakest human rights areas.

Continental countries: continental countries show no clear improvement in their lowest score, which stays around 4.5 across the years. The right with the lowest score also changes from year to year. In 2017, empowerment (empower) was the lowest-scoring right, with a mean score of about 4.6. From 2018 to 2020, freedom of expression was the lowest-scoring right, with scores between about 4.3 and 4.6, indicating significant challenges for the right to opinion and expression. In 2021, empowerment was again the lowest-scoring right (about 4.4). In 2022, assembly rights (assem) became the lowest-scoring right, indicating challenges in guaranteeing the right to assembly and association. In 2023, empowerment once again became the lowest-scoring right, showing persistent barriers to individual empowerment.

Pacific countries show more progress over time, with their lowest score rising from about 2.8 in 2017 to about 6 in 2023. In contrast, the lowest score for continental countries stays around 4.5 throughout. Continental countries also see more changes in which right scores lowest, suggesting that more rights are at risk or have low scores.

Insight 2:

In Pacific countries, the right to opinion and expression is the most poorly protected human right, appearing as the lowest-scoring right for six years (2017 through 2022). In 2023, freedom from torture and ill-treatment became the lowest-scoring right. In continental countries, empowerment and the right to opinion and expression are the most often denied human rights, each appearing as the lowest-scoring right for three years. The right to assembly and association is also poorly protected, appearing as the lowest-scoring right in 2022.

When comparing the two regions, Pacific countries made larger improvements in protecting human rights than continental countries. Pacific countries also show fewer changes in which right scores lowest, suggesting fewer rights at risk or with low values. Thus, individuals in Pacific countries generally have better-protected human rights than those in continental countries.
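
The year counts quoted in this insight can be tallied directly. The sketch below re-enters the lowest-scoring right per group and year from the lowest_means_per_year table shown earlier and cross-tabulates it; in the notebook itself, the same call could be applied to lowest_means_per_year.

```python
import pandas as pd

# Lowest-scoring right per group and year, copied from the
# lowest_means_per_year output (0 = continental, 1 = Pacific).
lowest = pd.DataFrame({
    "pacific_binary": [0] * 7 + [1] * 7,
    "year": list(range(2017, 2024)) * 2,
    "right": ["empower", "express", "express", "express", "empower", "assem", "empower",
              "express", "express", "express", "express", "express", "express", "tort"],
})

# Number of years each right was the lowest-scoring one, per group.
counts = pd.crosstab(lowest["pacific_binary"], lowest["right"])
print(counts)
```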

Insight Question 3:

What is the status of empowerment rights in each Asia-Pacific region between 2018 and 2023? Specifically, do women in regions with stronger empowerment rights have greater rights in education?

1. Import the dataset

people_at_risk_df is People At Risk Dataset
cpr_df is HRMI Civil and Political Rights (CPR) Dataset

In [3]:
cpr_df = pd.read_csv('csv files/cpr.csv')
people_at_risk_df = pd.read_csv('csv files/people_at_risk.csv')

2. Filter the data

Filter the data to include only the selected Asia-Pacific regions (listed in target_countries). Note that 'HongKong' matches no rows in either dataset, so Hong Kong does not appear in the outputs below. Also extract the relevant columns:

  • Empower: Overall right to empowerment
  • educ_atrisk_prop10: Proportion of women/girls at risk of lacking right to education
In [11]:
target_countries = ['China', 'South Korea', 'HongKong', 'Taiwan', 'Vietnam', 'Malaysia', 'Thailand', 'Singapore']
filtered_cpr_df = cpr_df[(cpr_df['country'].isin(target_countries))]
filtered_cpr_df = filtered_cpr_df[['country', 'year', 'empower_mean']]
filtered_people_at_risk_df = people_at_risk_df[(people_at_risk_df['country'].isin(target_countries))]
filtered_people_at_risk_df = filtered_people_at_risk_df[['country', 'year', 'educ_atrisk_prop10']]
display(filtered_cpr_df)
display(filtered_people_at_risk_df)
country year empower_mean
104 South Korea 2017 7.091490
105 South Korea 2018 7.096694
106 South Korea 2019 6.599675
107 South Korea 2020 6.437470
108 South Korea 2021 7.256850
109 South Korea 2022 7.158103
110 South Korea 2023 5.785240
125 Vietnam 2017 1.207825
126 Vietnam 2018 2.108225
127 Vietnam 2019 2.430420
128 Vietnam 2020 2.496144
129 Vietnam 2021 2.357853
130 Vietnam 2022 2.212683
131 Vietnam 2023 2.257634
199 Malaysia 2019 5.672837
200 Malaysia 2020 3.773163
201 Malaysia 2021 4.088863
202 Malaysia 2022 4.383113
203 Malaysia 2023 4.775669
204 Taiwan 2019 7.072268
205 Taiwan 2020 6.988512
206 Taiwan 2021 6.993072
207 Taiwan 2022 7.069246
208 Taiwan 2023 7.247096
209 China 2020 1.613400
210 China 2021 1.516172
211 China 2022 1.627610
212 China 2023 1.913970
223 Thailand 2021 3.681815
224 Thailand 2022 4.080241
225 Thailand 2023 4.842543
226 Singapore 2022 4.089519
227 Singapore 2023 4.190406
country year educ_atrisk_prop10
89 South Korea 2018 0.000000
90 South Korea 2019 0.000000
91 South Korea 2020 0.000000
92 South Korea 2021 0.000000
93 South Korea 2022 0.000000
94 South Korea 2023 0.125000
107 Vietnam 2018 0.235294
108 Vietnam 2019 0.153846
109 Vietnam 2020 0.000000
110 Vietnam 2021 0.200000
111 Vietnam 2022 0.052632
112 Vietnam 2023 0.000000
163 Malaysia 2020 0.200000
164 Malaysia 2021 0.166667
165 Malaysia 2022 0.333333
166 Malaysia 2023 0.166667
167 Taiwan 2020 0.000000
168 Taiwan 2021 0.033333
169 Taiwan 2022 0.040000
170 Taiwan 2023 0.066667
171 China 2021 0.243243
172 China 2022 0.145833
173 China 2023 0.090909
181 Thailand 2022 0.076923
182 Thailand 2023 0.333333
183 Singapore 2023 0.000000
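
Note that neither filtered output contains rows for 'HongKong'. One possible cause, which is only an assumption here because the raw CSV files are not shown, is that the spelling in target_countries does not match the name used in the dataset. A minimal sketch for listing target names that match nothing, using a small stand-in frame (hypothetical values) rather than the real cpr_df:

```python
import pandas as pd


def missing_targets(df, targets, column="country"):
    """Return the target names that never appear in df[column]."""
    present = set(df[column].dropna().unique())
    return [name for name in targets if name not in present]


# Stand-in data (hypothetical); in the notebook, pass cpr_df or
# people_at_risk_df instead of sample_df.
sample_df = pd.DataFrame({
    "country": ["China", "Taiwan", "Hong Kong", "Vietnam"],
    "year": [2023, 2023, 2023, 2023],
})
targets = ["China", "HongKong", "Taiwan", "Vietnam"]

print(missing_targets(sample_df, targets))  # ['HongKong']
```

Running the same check on both source datasets before filtering would show whether a target is absent from the data or simply spelled differently.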

3. Merge the data

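Merge the filtered datasets filtered_cpr_df and filtered_people_at_risk_df on country and year to create a new dataset merged_df. Because pd.merge defaults to an inner join, only country-year pairs present in both datasets are kept.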

In [12]:
merged_df = pd.merge(filtered_cpr_df, filtered_people_at_risk_df, on=['country', 'year'])
display(merged_df)
country year empower_mean educ_atrisk_prop10
0 South Korea 2018 7.096694 0.000000
1 South Korea 2019 6.599675 0.000000
2 South Korea 2020 6.437470 0.000000
3 South Korea 2021 7.256850 0.000000
4 South Korea 2022 7.158103 0.000000
5 South Korea 2023 5.785240 0.125000
6 Vietnam 2018 2.108225 0.235294
7 Vietnam 2019 2.430420 0.153846
8 Vietnam 2020 2.496144 0.000000
9 Vietnam 2021 2.357853 0.200000
10 Vietnam 2022 2.212683 0.052632
11 Vietnam 2023 2.257634 0.000000
12 Malaysia 2020 3.773163 0.200000
13 Malaysia 2021 4.088863 0.166667
14 Malaysia 2022 4.383113 0.333333
15 Malaysia 2023 4.775669 0.166667
16 Taiwan 2020 6.988512 0.000000
17 Taiwan 2021 6.993072 0.033333
18 Taiwan 2022 7.069246 0.040000
19 Taiwan 2023 7.247096 0.066667
20 China 2021 1.516172 0.243243
21 China 2022 1.627610 0.145833
22 China 2023 1.913970 0.090909
23 Thailand 2022 4.080241 0.076923
24 Thailand 2023 4.842543 0.333333
25 Singapore 2023 4.190406 0.000000
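
The inner join drops country-year pairs that appear in only one dataset; for example, South Korea 2017 and Singapore 2022 are in the CPR output but not in merged_df. The sketch below, built from a few rows copied from the outputs above, shows one way to list the dropped pairs with an outer join and indicator=True. The names cpr_rows, par_rows, audit, and dropped are illustrative only.

```python
import pandas as pd

# A few rows copied from the filtered outputs above.
cpr_rows = pd.DataFrame({
    "country": ["South Korea", "South Korea", "Singapore", "Singapore"],
    "year": [2017, 2018, 2022, 2023],
    "empower_mean": [7.091490, 7.096694, 4.089519, 4.190406],
})
par_rows = pd.DataFrame({
    "country": ["South Korea", "Singapore"],
    "year": [2018, 2023],
    "educ_atrisk_prop10": [0.0, 0.0],
})

# An outer join keeps every pair; the _merge column records its origin
# ('left_only', 'right_only', or 'both').
audit = pd.merge(cpr_rows, par_rows, on=["country", "year"],
                 how="outer", indicator=True)

# Pairs that an inner join would have discarded.
dropped = audit[audit["_merge"] != "both"]
print(dropped[["country", "year", "_merge"]])
```

In the notebook, the same call on filtered_cpr_df and filtered_people_at_risk_df would show how many observations each country loses in the merge.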

4. Visualize the data -- Bar chart of Empowerment Rights in Asia-Pacific Regions

Plots a bar chart comparing empower_mean scores for countries/regions across different years to visualize changes in empowerment rights over time for each country.

  • X-axis: country
  • Y-axis: Empowerment mean scores empower_mean.
In [13]:
plt.figure(figsize=(12, 6))
sns.barplot(data=merged_df, x='country', y='empower_mean', hue='year', palette='viridis')
plt.title('Empowerment Rights in Asia-Pacific Regions between 2018 and 2023')
plt.ylabel('Empowerment Mean Score')
plt.xlabel('Country/Region')
plt.xticks(rotation=45)
plt.legend(title='Year')
plt.show()

5. Observation

The bar chart compares the empowerment scores of Asia-Pacific countries/regions between 2018 and 2023. Taiwan and South Korea rank highest, with average empowerment scores of around 7 and 6.7 respectively. Malaysia follows, with its empowerment score improving from about 4 to about 5 over time. Vietnam and China exhibit significantly lower empowerment scores (around 2). Interestingly, Vietnam and China are both communist countries, while Taiwan and South Korea are capitalist regions. This difference in political and economic systems may be a factor affecting people's empowerment rights. The merged data cover only 2022-2023 for Thailand and only 2023 for Singapore, which makes it difficult to analyze their long-term trends. However, both countries have scores similar to Malaysia's (around 4.5), suggesting a similarity among capitalist countries in Southeast Asia. Given the disparity in empowerment scores between capitalist countries/regions and communist countries, further exploration of the impact of governance systems on empowerment scores may provide meaningful insights.
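
The averages and data gaps discussed in this observation can be checked numerically rather than read off the chart. The following is a minimal sketch that uses a subset of rows copied from the merged_df output above; the names subset and summary are illustrative only.

```python
import pandas as pd

# Rows copied from the merged_df output above (four countries/regions).
rows = [
    ("South Korea", 2018, 7.096694), ("South Korea", 2019, 6.599675),
    ("South Korea", 2020, 6.437470), ("South Korea", 2021, 7.256850),
    ("South Korea", 2022, 7.158103), ("South Korea", 2023, 5.785240),
    ("Taiwan", 2020, 6.988512), ("Taiwan", 2021, 6.993072),
    ("Taiwan", 2022, 7.069246), ("Taiwan", 2023, 7.247096),
    ("Thailand", 2022, 4.080241), ("Thailand", 2023, 4.842543),
    ("Singapore", 2023, 4.190406),
]
subset = pd.DataFrame(rows, columns=["country", "year", "empower_mean"])

# Mean score plus year coverage for each country/region.
summary = subset.groupby("country").agg(
    mean_empower=("empower_mean", "mean"),
    first_year=("year", "min"),
    last_year=("year", "max"),
    n_years=("year", "nunique"),
).sort_values("mean_empower", ascending=False)

print(summary.round(2))
```

Reporting n_years next to each mean makes it visible when an average rests on only one or two observations, as it does for Singapore and Thailand.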

6. Scatter Plot -- Empowerment vs. Education Rights at Risk in Asia-Pacific Regions

Plots a scatter plot showing the relationship between empowerment rights empower_mean and education rights at risk educ_atrisk_prop10. Both variables are averaged over the available years for each country/region, so each point represents one country/region.

  • X-axis: Empowerment mean scores empower_mean.
  • Y-axis: Average proportion of women/girls at risk of lacking the right to education educ_atrisk_prop10.
In [14]:
scatter_data = merged_df.groupby('country').agg({
    'empower_mean': 'mean',
    'educ_atrisk_prop10': 'mean'
}).reset_index()

# Create the scatter plot using the 'hvplot.scatter' function
scatter = scatter_data.hvplot.scatter(
    x='empower_mean', 
    y='educ_atrisk_prop10', 
    by='country', 
    marker='o', 
    c='empower_mean',  # Color points based on 'empower_mean'
    cmap='viridis',  # Apply 'viridis' colormap
    label='country',
    ylabel='Average Proportion of Women/Girls at Risk in Education', 
    xlabel='Empowerment Mean'
)

# Customize the plot options
final_plot = scatter.opts(
    title="Interactive Scatter Plot: Average Proportion of Women/Girls at Risk in Education vs. \n Empowerment Mean across Asia-Pacific Countries/Regions",
    legend_position='right', 
    width=800, 
    height=600
)

# Display the final plot
final_plot
Out[14]:

7. Observation

The scatter plot suggests a negative relationship between the empowerment mean and the proportion of women/girls at risk in education: higher empowerment is generally associated with lower education risk for women/girls. The regions with the highest empowerment scores, Taiwan and South Korea, show low average proportions of women/girls at education risk (below 0.1). This may indicate that higher empowerment is associated with stronger policies and societal structures that mitigate women/girls' risks in education. In contrast, China and Vietnam, with lower empowerment scores, show higher proportions (above 0.1). Interestingly, Malaysia and Thailand, with moderate empowerment mean scores (~4-5), show the highest proportions of women/girls at education risk (~0.2), while Singapore, with a similar empowerment score, shows a proportion of 0 based on a single year of data (2023). These departures from the general trend suggest that other confounding factors influence women/girls' education risk (e.g., economic development, governance systems, cultural norms).
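
The direction and strength of this relationship can be quantified instead of judged by eye. The sketch below rebuilds merged_df from the values printed in step 3, averages by country/region as the scatter plot does, and computes a Pearson correlation. With only seven points, some resting on one or two years, the coefficient should be read as descriptive rather than as a test of the relationship. The names country_means, n_years, and pearson_r are illustrative.

```python
import pandas as pd

# Rows copied from the merged_df output in step 3.
rows = [
    ("South Korea", 2018, 7.096694, 0.000000),
    ("South Korea", 2019, 6.599675, 0.000000),
    ("South Korea", 2020, 6.437470, 0.000000),
    ("South Korea", 2021, 7.256850, 0.000000),
    ("South Korea", 2022, 7.158103, 0.000000),
    ("South Korea", 2023, 5.785240, 0.125000),
    ("Vietnam", 2018, 2.108225, 0.235294),
    ("Vietnam", 2019, 2.430420, 0.153846),
    ("Vietnam", 2020, 2.496144, 0.000000),
    ("Vietnam", 2021, 2.357853, 0.200000),
    ("Vietnam", 2022, 2.212683, 0.052632),
    ("Vietnam", 2023, 2.257634, 0.000000),
    ("Malaysia", 2020, 3.773163, 0.200000),
    ("Malaysia", 2021, 4.088863, 0.166667),
    ("Malaysia", 2022, 4.383113, 0.333333),
    ("Malaysia", 2023, 4.775669, 0.166667),
    ("Taiwan", 2020, 6.988512, 0.000000),
    ("Taiwan", 2021, 6.993072, 0.033333),
    ("Taiwan", 2022, 7.069246, 0.040000),
    ("Taiwan", 2023, 7.247096, 0.066667),
    ("China", 2021, 1.516172, 0.243243),
    ("China", 2022, 1.627610, 0.145833),
    ("China", 2023, 1.913970, 0.090909),
    ("Thailand", 2022, 4.080241, 0.076923),
    ("Thailand", 2023, 4.842543, 0.333333),
    ("Singapore", 2023, 4.190406, 0.000000),
]
merged_df = pd.DataFrame(
    rows, columns=["country", "year", "empower_mean", "educ_atrisk_prop10"]
)

# One point per country/region, matching the aggregation used for the plot.
country_means = merged_df.groupby("country")[
    ["empower_mean", "educ_atrisk_prop10"]
].mean()
n_years = merged_df.groupby("country")["year"].nunique().rename("n_years")

# Pearson correlation across the country-level averages.
pearson_r = country_means["empower_mean"].corr(
    country_means["educ_atrisk_prop10"]
)

print(country_means.join(n_years).round(3))
print(f"Pearson r across {len(country_means)} countries/regions: {pearson_r:.2f}")
```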

Insight 3:

In Asia-Pacific regions, capitalist countries/regions generally have higher empowerment rights than communist countries, so political and economic systems may be a factor affecting people's empowerment rights. Moreover, a negative correlation is observed between the empowerment score and women/girls' risks in education: a higher empowerment score is associated with lower education risk for women/girls. In countries with weaker empowerment rights, women and girls may be more likely to experience systemic barriers such as gender inequality or lack of resources, exposing them to higher risks in education. Combined with the earlier observation that capitalist regions generally have higher empowerment rights, this pattern may be shaped by the socio-economic and political dynamics of different countries.