1 Overview

1.1 Setting the Scene

Since late December 2019, an outbreak of a novel coronavirus disease (COVID-19; previously known as 2019-nCoV) was reported in Wuhan, China, which had subsequently affected 210 countries worldwide. In general, COVID-19 is an acute resolved disease but it can also be deadly, with a 2% case fatality rate.

The COVID-19 vaccination in Indonesia is an ongoing mass immunisation in response to the COVID-19 pandemic in Indonesia. On 13 January 2021, the program commenced when President Joko Widodo was vaccinated at the presidential palace. In terms of total doses given, Indonesia ranks third in Asia and fifth in the world.

According to wikipedia, as of 5 February 2023 at 18:00 WIB (UTC+7), 204,266,655 people had received the first dose of the vaccine and 175,131,893 people had been fully vaccinated; 69,597,474 of them had been inoculated with the booster or the third dose, while 1,585,164 had received the fourth dose. Jakarta has the highest percentage of population fully vaccinated with 103.46%, followed by Bali and Special Region of Yogyakarta with 85.45% and 83.02% respectively.

Despite its compactness, the cumulative vaccination rate are not evenly distributed within DKI Jakarta. The question is where are the sub-districts with relatively higher number of vaccination rate and how they changed over time.

1.2 Objectives

Exploratory Spatial Data Analysis (ESDA) hold tremendous potential to address complex problems facing society. In this study, you are tasked to apply appropriate Local Indicators of Spatial Association (LISA) and Emerging Hot Spot Analysis (EHSA) to undercover the spatio-temporal trends of COVID-19 vaccination in DKI Jakarta.

1.3 Tasks

The specific tasks of this take-home exercise are as follows:

1.3.1 Choropleth Mapping and Analysis

Compute the monthly vaccination rate from July 2021 to June 2022 at sub-district (also known as kelurahan in Bahasa Indonesia) level,
Prepare the monthly vaccination rate maps by using appropriate tmap functions,
Describe the spatial patterns revealed by the choropleth maps (not more than 200 words).

1.3.2 Local Gi* Analysis

With reference to the vaccination rate maps prepared in ESDA:

Compute local Gi* values of the monthly vaccination rate,
Display the Gi* maps of the monthly vaccination rate. The maps should only display the significant (i.e. p-value < 0.05)
With reference to the analysis results, draw statistical conclusions (not more than 250 words).

1.3.3 Emerging Hot Spot Analysis(EHSA)

With reference to the local Gi* values of the vaccination rate maps prepared in the previous section:

Perform Mann-Kendall Test by using the spatio-temporal local Gi* values,
Select three sub-districts and describe the temporal trends revealed (not more than 250 words), and
Prepared a EHSA map of the Gi* values of vaccination rate. The maps should only display the significant (i.e. p-value < 0.05).
With reference to the EHSA map prepared, describe the spatial patterns revelaed. (not more than 250 words).

2 Getting Started

2.1 Data Acquisition

The following datasets would be used to study the spatial-temporal geographical distribution of vaccination rates in Jarkata, Indonesia, between June 2021 to May 2022.

Dataset Name	Source
Vaccination Data from June 2021 to May 2022 in Jarkata, Indonesia	Riwayat File Vaksinasi DKI Jakarta
DKI Jakarta Administration Boundary 2019	Indonesia Geospatial Portal

2.2 Installing and Loading Packages

Next, pacman assists us by helping us load R packages that we require, sf, tidyverse and funModeling.

pacman::p_load(readxl, sf, tidyverse, tmap, sfdep, gifski)

The following packages assists us to accomplish the following:

readxl assists us in importing .xlsx aspatial data without having to convert to .csv
sf helps to import, manage and process vector-based geospatial data in R
tidyverse which includes readr to import delimited text file, tidyr for tidying data and dplyr for wrangling data
tmap provides functions to allow us to plot high quality static or interactive maps using leaflet API
gifski helps us to handle the GIF animation for tmap

2.3 Context

In Indonesia, the subdivisions in Indonesia is denoted as follows, this will be important for our analysis as we will be interested in looking at spatio-temporal data of COVID-19 vaccinations at sub-district level which will be mapped to Level 4, Rural or Urban Villages (Desa or Kelurahan).

Level of Administration	Name (English) / Name (Bahasa Indonesia)
Level 1	Province / Provinsi
Level 2	Cities / Kota
Level 3	Districts / Kecamantan
Level 4	Rural or Urban Villages / Desa or Kelurahan

3 Importing and Preparing Geospatial Data

3.1 Geospatial Dataset

3.1.1 Importing Geospatial Dataset

Firstly, in the code chunk below, we will import the geospatial dataset for Jarkata,

In the code below, dsn specifies the filepath where the dataset is located and layer provides the filename of the dataset excluding the file extension.

We will also convert the dataset from WGS84 Geographic Coordinate System to EPSG::23837 (DGN95) which is the national Projected Coordinate System for Jarkata.

jkt = st_read(dsn = "Take-Home_Ex02/data/geospatial", layer = "BATAS_DESA_DESEMBER_2019_DUKCAPIL_DKI_JAKARTA") %>% st_transform(crs = 23837)

Reading layer `BATAS_DESA_DESEMBER_2019_DUKCAPIL_DKI_JAKARTA' from data source 
  `C:\renjie-teo\IS415-GAA\exercises\Take-Home_Ex02\data\geospatial' 
  using driver `ESRI Shapefile'
Simple feature collection with 269 features and 161 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 106.3831 ymin: -6.370815 xmax: 106.9728 ymax: -5.184322
Geodetic CRS:  WGS 84

3.1.2 Preparing Geospatial Dataset

Let us view the dataset columns:

head(jkt, 1)

Simple feature collection with 1 feature and 161 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -765269.9 ymin: 811640 xmax: -764544.3 ymax: 812335.7
Projected CRS: DGN95 / Indonesia TM-3 zone 50.1
  OBJECT_ID  KODE_DESA      DESA   KODE    PROVINSI      KAB_KOTA  KECAMATAN
1     25477 3173031006 KEAGUNGAN 317303 DKI JAKARTA JAKARTA BARAT TAMAN SARI
  DESA_KELUR JUMLAH_PEN JUMLAH_KK LUAS_WILAY KEPADATAN PERPINDAHA JUMLAH_MEN
1  KEAGUNGAN      21609      7255       0.36     60504        102         68
  PERUBAHAN WAJIB_KTP SILAM KRISTEN KHATOLIK HINDU BUDHA KONGHUCU KEPERCAYAA
1     20464     16027 15735    2042      927    15  2888        2          0
   PRIA WANITA BELUM_KAWI KAWIN CERAI_HIDU CERAI_MATI   U0   U5  U10  U15  U20
1 11049  10560      10193 10652        255        509 1572 1751 1703 1493 1542
   U25  U30  U35  U40  U45  U50  U55 U60 U65 U70 U75 TIDAK_BELU BELUM_TAMA
1 1665 1819 1932 1828 1600 1408 1146 836 587 312 415       3426       1964
  TAMAT_SD SLTP SLTA DIPLOMA_I DIPLOMA_II DIPLOMA_IV STRATA_II STRATA_III
1     2265 3660 8463        81        428       1244        74          4
  BELUM_TIDA APARATUR_P TENAGA_PEN WIRASWASTA PERTANIAN NELAYAN AGAMA_DAN
1       3927         81         70       8974         1       0         6
  PELAJAR_MA TENAGA_KES PENSIUNAN LAINNYA    GENERATED KODE_DES_1 BELUM_
1       4018         28        57    4447 30 Juni 2019 3173031006   3099
  MENGUR_ PELAJAR_ PENSIUNA_1 PEGAWAI_ TENTARA KEPOLISIAN PERDAG_ PETANI
1    4447     3254         80       48       4         10      31      0
  PETERN_ NELAYAN_1 INDUSTR_ KONSTR_ TRANSP_ KARYAW_ KARYAW1 KARYAW1_1
1       0         1        7       3       2    6735       9         0
  KARYAW1_12 BURUH BURUH_ BURUH1 BURUH1_1 PEMBANT_ TUKANG TUKANG_1 TUKANG_12
1         23   515      1      0        0        1      0        1         0
  TUKANG__13 TUKANG__14 TUKANG__15 TUKANG__16 TUKANG__17 PENATA PENATA_
1          1          0          1          7          1      0       0
  PENATA1_1 MEKANIK SENIMAN_ TABIB PARAJI_ PERANCA_ PENTER_ IMAM_M PENDETA
1         0      11        4     1       0        0       1      0       2
  PASTOR WARTAWAN USTADZ JURU_M PROMOT ANGGOTA_ ANGGOTA1 ANGGOTA1_1 PRESIDEN
1      0        7      6      0      0        0        0          0        0
  WAKIL_PRES ANGGOTA1_2 ANGGOTA1_3 DUTA_B GUBERNUR WAKIL_GUBE BUPATI WAKIL_BUPA
1          0          0          0      0        0          0      0          0
  WALIKOTA WAKIL_WALI ANGGOTA1_4 ANGGOTA1_5 DOSEN GURU PILOT PENGACARA_ NOTARIS
1        0          0          0          0     3   72     1          4       0
  ARSITEK AKUNTA_ KONSUL_ DOKTER BIDAN PERAWAT APOTEK_ PSIKIATER PENYIA_
1       1       1       1     16     3       7       0         0       0
  PENYIA1 PELAUT PENELITI SOPIR PIALAN PARANORMAL PEDAGA_ PERANG_ KEPALA_
1       0      0        0    65      0          0     379       0       0
  BIARAW_ WIRASWAST_ LAINNYA_12 LUAS_DESA KODE_DES_3 DESA_KEL_1 KODE_12
1       0       1370         94     25476 3173031006  KEAGUNGAN  317303
                        geometry
1 MULTIPOLYGON (((-764674.9 8...

The dataset above a variety of information, a breakdown by subdistrict level of the religion, demographic information, education, occupation etc.

In our analysis, we only require information related to the population count, subdistrict and district information and geometry information. Hence, we will only filter and retain information from index 0 to 9.

jkt <- jkt[0:9]
head(jkt, 1)

Simple feature collection with 1 feature and 9 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -765269.9 ymin: 811640 xmax: -764544.3 ymax: 812335.7
Projected CRS: DGN95 / Indonesia TM-3 zone 50.1
  OBJECT_ID  KODE_DESA      DESA   KODE    PROVINSI      KAB_KOTA  KECAMATAN
1     25477 3173031006 KEAGUNGAN 317303 DKI JAKARTA JAKARTA BARAT TAMAN SARI
  DESA_KELUR JUMLAH_PEN                       geometry
1  KEAGUNGAN      21609 MULTIPOLYGON (((-764674.9 8...

Next, we will recode information from Bahasa Indonesia to English for easier processing, using rename() of dplyr.

jkt <- jkt %>% rename("SubDistrictCode" = "KODE_DESA",
                      "SubDistrict" = "DESA",
                      "Code" = "KODE",
                       "Province" = "PROVINSI",
                      "City" = "KAB_KOTA",
                      "District" = "KECAMATAN",
                      "SubDistrictName" = "DESA_KELUR",
                      "TotalPop" = "JUMLAH_PEN")
jkt

Simple feature collection with 269 features and 9 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -814909.7 ymin: 787584.9 xmax: -747172.2 ymax: 919446.7
Projected CRS: DGN95 / Indonesia TM-3 zone 50.1
First 10 features:
   OBJECT_ID SubDistrictCode        SubDistrict   Code    Province
1      25477      3173031006          KEAGUNGAN 317303 DKI JAKARTA
2      25478      3173031007             GLODOK 317303 DKI JAKARTA
3      25397      3171031003      HARAPAN MULIA 317103 DKI JAKARTA
4      25400      3171031006       CEMPAKA BARU 317103 DKI JAKARTA
5      25378      3101011001     PULAU PANGGANG 310101 DKI JAKARTA
6      25379      3101011002       PULAU KELAPA 310101 DKI JAKARTA
7      25390      3171021001         PASAR BARU 317102 DKI JAKARTA
8      25382      3101021002       PULAU TIDUNG 310102 DKI JAKARTA
9      25391      3171021002       KARANG ANYAR 317102 DKI JAKARTA
10     25394      3171021005 MANGGA DUA SELATAN 317102 DKI JAKARTA
               City                  District    SubDistrictName TotalPop
1     JAKARTA BARAT                TAMAN SARI          KEAGUNGAN    21609
2     JAKARTA BARAT                TAMAN SARI             GLODOK     9069
3     JAKARTA PUSAT                 KEMAYORAN      HARAPAN MULIA    29085
4     JAKARTA PUSAT                 KEMAYORAN       CEMPAKA BARU    41913
5  KEPULAUAN SERIBU    KEPULAUAN SERIBU UTARA     PULAU PANGGANG     6947
6  KEPULAUAN SERIBU    KEPULAUAN SERIBU UTARA       PULAU KELAPA     7059
7     JAKARTA PUSAT               SAWAH BESAR         PASAR BARU    15793
8  KEPULAUAN SERIBU KEPULAUAN SERIBU SELATAN.       PULAU TIDUNG     5891
9     JAKARTA PUSAT               SAWAH BESAR       KARANG ANYAR    33383
10    JAKARTA PUSAT               SAWAH BESAR MANGGA DUA SELATAN    35906
                         geometry
1  MULTIPOLYGON (((-764674.9 8...
2  MULTIPOLYGON (((-764859.9 8...
3  MULTIPOLYGON (((-760021.3 8...
4  MULTIPOLYGON (((-759415 810...
5  MULTIPOLYGON (((-796698.3 8...
6  MULTIPOLYGON (((-795419 870...
7  MULTIPOLYGON (((-762362.7 8...
8  MULTIPOLYGON (((-801734.4 8...
9  MULTIPOLYGON (((-762950.3 8...
10 MULTIPOLYGON (((-762822.4 8...

Next, let us quickly plot the jkt geospatial data

tmap_mode("plot")
tm_shape(jkt) +
  tm_borders()

We can see there are many small dots to the top and left of the main Jarkata region. Those are outer islands. In this analysis, we will exclude the outer islands of Jarkata from our analysis.

Now, let us try to identify how to identify and remove the outer islands. Below, we will plot City.

tmap_mode("view")
tm_shape(jkt) +
  tm_fill("City", 
          alpha = 0.2) +
  tm_borders()

From our map above, we can tell that the outer islands are denoted by KEPULAUAN SERIBU which means Thousand Islands.

However, we also spot some missing values. We will fill them in first before removing the outer islands.

filter(jkt, is.na(jkt$City))

Simple feature collection with 2 features and 9 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -761893.4 ymin: 811733.2 xmax: -759659.1 ymax: 812745.6
Projected CRS: DGN95 / Indonesia TM-3 zone 50.1
  OBJECT_ID SubDistrictCode      SubDistrict   Code    Province City District
1     25645        31888888     DANAU SUNTER 318888 DKI JAKARTA <NA>     <NA>
2     25646        31888888 DANAU SUNTER DLL 318888 DKI JAKARTA <NA>     <NA>
  SubDistrictName TotalPop                       geometry
1            <NA>        0 MULTIPOLYGON (((-759659.1 8...
2            <NA>        0 MULTIPOLYGON (((-760863.5 8...

From the output above, we can gather that there are two Villages with NA values:

DANAU SUNTER (OBJECT_ID 25645)
DANAU SUNTER DLL (OBJECT_ID 25646)

By comparing the information from the Geospatial Information Agency of Indonesia (Badan Informasi Geospasial), we now know that these two villages are part of Pademangan Timur based on the screenshot from the website below:

Now, let find out what is the OBJECT_ID of the village PADEMANGAN TIMUR where we are supposed to merge the two other villages (DANAU SUNTER and DANAU SUNTER DLL) into.

jkt %>% filter((jkt$SubDistrictName) == "PADEMANGAN TIMUR")

Simple feature collection with 1 feature and 9 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -762223.7 ymin: 811885.3 xmax: -760240.1 ymax: 814736.4
Projected CRS: DGN95 / Indonesia TM-3 zone 50.1
  OBJECT_ID SubDistrictCode      SubDistrict   Code    Province          City
1     25453      3172051001 PADEMANGAN TIMUR 317205 DKI JAKARTA JAKARTA UTARA
    District  SubDistrictName TotalPop                       geometry
1 PADEMANGAN PADEMANGAN TIMUR    45083 MULTIPOLYGON (((-760244.6 8...

The OBJECT_ID is 25453.

Let’s plot a map of how the three states would look like before we merge them:

tmap_mode("plot")
tm_shape(filter(jkt, jkt$OBJECT_ID == 25453 | jkt$OBJECT_ID == 25645 | jkt$OBJECT_ID == 25646)) +
  tm_fill("SubDistrict") +
  tm_borders()

Now, let us merge the regions together using the code chunk below using the st_union() function of sf:

merged_25453 <-
st_union(filter(jkt,jkt$OBJECT_ID == 25453), filter(jkt,jkt$OBJECT_ID == 25645))[0:9]

merged_25453 <- st_union(merged_25453, filter(jkt, jkt$OBJECT_ID == 25646))[0:9]

list(merged_25453)

[[1]]
Simple feature collection with 1 feature and 9 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: -762223.7 ymin: 811733.2 xmax: -759659.1 ymax: 814736.4
Projected CRS: DGN95 / Indonesia TM-3 zone 50.1
  OBJECT_ID SubDistrictCode      SubDistrict   Code    Province          City
1     25453      3172051001 PADEMANGAN TIMUR 317205 DKI JAKARTA JAKARTA UTARA
    District  SubDistrictName TotalPop                       geometry
1 PADEMANGAN PADEMANGAN TIMUR    45083 POLYGON ((-760404.9 811968....

The attributes of PADEMANGAN TIMUR still matches the data from before.

tmap_mode("plot")
tm_shape(merged_25453) +
  tm_fill("SubDistrict") +
  tm_borders()

We have successfully merged the 3 geometries together.

Now, let us remove the NA records (filtering by default will remove NA values), KEPULAUAN SERIBU which removes the outer islands before we delete and reinsert the PADEMANGAN TIMUR stored in merged_25453 variable.

We check how many outer islands are there to be removed:

filter(jkt, jkt$City == "KEPULAUAN SERIBU")

Simple feature collection with 6 features and 9 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -814909.7 ymin: 824270.9 xmax: -761447 ymax: 919446.7
Projected CRS: DGN95 / Indonesia TM-3 zone 50.1
  OBJECT_ID SubDistrictCode       SubDistrict   Code    Province
1     25378      3101011001    PULAU PANGGANG 310101 DKI JAKARTA
2     25379      3101011002      PULAU KELAPA 310101 DKI JAKARTA
3     25382      3101021002      PULAU TIDUNG 310102 DKI JAKARTA
4     25383      3101021003        PULAU PARI 310102 DKI JAKARTA
5     25380      3101011003     PULAU HARAPAN 310101 DKI JAKARTA
6     25381      3101021001 PULAU UNTUNG JAWA 310102 DKI JAKARTA
              City                  District   SubDistrictName TotalPop
1 KEPULAUAN SERIBU    KEPULAUAN SERIBU UTARA    PULAU PANGGANG     6947
2 KEPULAUAN SERIBU    KEPULAUAN SERIBU UTARA      PULAU KELAPA     7059
3 KEPULAUAN SERIBU KEPULAUAN SERIBU SELATAN.      PULAU TIDUNG     5891
4 KEPULAUAN SERIBU KEPULAUAN SERIBU SELATAN.        PULAU PARI     3524
5 KEPULAUAN SERIBU    KEPULAUAN SERIBU UTARA     PULAU HARAPAN     2579
6 KEPULAUAN SERIBU KEPULAUAN SERIBU SELATAN. PULAU UNTUNG JAWA     2436
                        geometry
1 MULTIPOLYGON (((-796698.3 8...
2 MULTIPOLYGON (((-795419 870...
3 MULTIPOLYGON (((-801734.4 8...
4 MULTIPOLYGON (((-786095.1 8...
5 MULTIPOLYGON (((-795768.5 8...
6 MULTIPOLYGON (((-778934.5 8...

There are 6 outer islands. Together with the 2 villages with NA city district and removing PADEMANGAN TIMUR, we should see 260 features remaining.

jkt <- jkt %>% filter(jkt$City != "KEPULAUAN SERIBU" & jkt$OBJECT_ID != 25453)
jkt

Simple feature collection with 260 features and 9 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -779259.9 ymin: 787584.9 xmax: -747172.2 ymax: 819039
Projected CRS: DGN95 / Indonesia TM-3 zone 50.1
First 10 features:
   OBJECT_ID SubDistrictCode        SubDistrict   Code    Province
1      25477      3173031006          KEAGUNGAN 317303 DKI JAKARTA
2      25478      3173031007             GLODOK 317303 DKI JAKARTA
3      25397      3171031003      HARAPAN MULIA 317103 DKI JAKARTA
4      25400      3171031006       CEMPAKA BARU 317103 DKI JAKARTA
5      25390      3171021001         PASAR BARU 317102 DKI JAKARTA
6      25391      3171021002       KARANG ANYAR 317102 DKI JAKARTA
7      25394      3171021005 MANGGA DUA SELATAN 317102 DKI JAKARTA
8      25386      3171011003       PETOJO UTARA 317101 DKI JAKARTA
9      25403      3171041001              SENEN 317104 DKI JAKARTA
10     25408      3171041006             BUNGUR 317104 DKI JAKARTA
            City    District    SubDistrictName TotalPop
1  JAKARTA BARAT  TAMAN SARI          KEAGUNGAN    21609
2  JAKARTA BARAT  TAMAN SARI             GLODOK     9069
3  JAKARTA PUSAT   KEMAYORAN      HARAPAN MULIA    29085
4  JAKARTA PUSAT   KEMAYORAN       CEMPAKA BARU    41913
5  JAKARTA PUSAT SAWAH BESAR         PASAR BARU    15793
6  JAKARTA PUSAT SAWAH BESAR       KARANG ANYAR    33383
7  JAKARTA PUSAT SAWAH BESAR MANGGA DUA SELATAN    35906
8  JAKARTA PUSAT      GAMBIR       PETOJO UTARA    21828
9  JAKARTA PUSAT       SENEN              SENEN     8643
10 JAKARTA PUSAT       SENEN             BUNGUR    23001
                         geometry
1  MULTIPOLYGON (((-764674.9 8...
2  MULTIPOLYGON (((-764859.9 8...
3  MULTIPOLYGON (((-760021.3 8...
4  MULTIPOLYGON (((-759415 810...
5  MULTIPOLYGON (((-762362.7 8...
6  MULTIPOLYGON (((-762950.3 8...
7  MULTIPOLYGON (((-762822.4 8...
8  MULTIPOLYGON (((-764144 810...
9  MULTIPOLYGON (((-761646.2 8...
10 MULTIPOLYGON (((-761014.1 8...

Yes, there are only 260 features remaining. Now, let us add the updated feature back into the jkt dataframe using rbind() which allows us to merge two sf dataframe objects together.

jkt <- rbind(jkt, merged_25453)

Now, let us plot a map to see if the map has been properly fixed.

tmap_mode("plot")
tm_shape(jkt) +
  tm_fill("City") +
  tm_borders()

Now, we are ready to perform other tasks.

3.2 Aspatial Data

3.2.1 Importing Aspatial Datsets

Next, we will import the aspatial datasets

vac_202106 <- read_excel("Take-Home_Ex02/data/aspatial/20210630.xlsx")
vac_202107 <- read_excel("Take-Home_Ex02/data/aspatial/20210731.xlsx")
vac_202108 <- read_excel("Take-Home_Ex02/data/aspatial/20210831.xlsx")
vac_202109 <- read_excel("Take-Home_Ex02/data/aspatial/20210930.xlsx")
vac_202110 <- read_excel("Take-Home_Ex02/data/aspatial/20211031.xlsx")
vac_202111 <- read_excel("Take-Home_Ex02/data/aspatial/20211130.xlsx")
vac_202112 <- read_excel("Take-Home_Ex02/data/aspatial/20211231.xlsx")
vac_202201 <- read_excel("Take-Home_Ex02/data/aspatial/20220131.xlsx")
vac_202202 <- read_excel("Take-Home_Ex02/data/aspatial/20220227.xlsx")
vac_202203 <- read_excel("Take-Home_Ex02/data/aspatial/20220331.xlsx")
vac_202204 <- read_excel("Take-Home_Ex02/data/aspatial/20220430.xlsx")
vac_202205 <- read_excel("Take-Home_Ex02/data/aspatial/20220531.xlsx")

3.2.2 Preparing Aspatial Data

Let’s have a look at one of the asptial dataset files and it’s columns:

vac_202106

# A tibble: 268 × 21
   `KODE KELURAHAN` `WILAYAH KOTA`  KECAMATAN   KELURAHAN SASARAN `BELUM VAKSIN`
   <chr>            <chr>           <chr>       <chr>       <dbl>          <dbl>
 1 <NA>             <NA>            <NA>        TOTAL     7739060        5113425
 2 3172051003       JAKARTA UTARA   PADEMANGAN  ANCOL       20393          13394
 3 3173041007       JAKARTA BARAT   TAMBORA     ANGKE       25785          16743
 4 3175041005       JAKARTA TIMUR   KRAMAT JATI BALE KAM…   25158          19068
 5 3175031003       JAKARTA TIMUR   JATINEGARA  BALI MES…    8683           5816
 6 3175101006       JAKARTA TIMUR   CIPAYUNG    BAMBU AP…   22768          15575
 7 3174031002       JAKARTA SELATAN MAMPANG PR… BANGKA      18930          12655
 8 3175051002       JAKARTA TIMUR   PASAR REBO  BARU        20267          11481
 9 3175041004       JAKARTA TIMUR   KRAMAT JATI BATU AMP…   41389          30601
10 3171071002       JAKARTA PUSAT   TANAH ABANG BENDUNGA…   19008          11660
# ℹ 258 more rows
# ℹ 15 more variables: `JUMLAH\r\nDOSIS 1` <dbl>, `JUMLAH\r\nDOSIS 2` <dbl>,
#   `TOTAL VAKSIN\r\nDIBERIKAN` <dbl>, `LANSIA\r\nDOSIS 1` <dbl>,
#   `LANSIA\r\nDOSIS 2` <dbl>, `LANSIA TOTAL \r\nVAKSIN DIBERIKAN` <dbl>,
#   `PELAYAN PUBLIK\r\nDOSIS 1` <dbl>, `PELAYAN PUBLIK\r\nDOSIS 2` <dbl>,
#   `PELAYAN PUBLIK TOTAL\r\nVAKSIN DIBERIKAN` <dbl>,
#   `GOTONG ROYONG\r\nDOSIS 1` <dbl>, `GOTONG ROYONG\r\nDOSIS 2` <dbl>, …

The dataset contains many columns, telling us about the specific breakdown of vaccinatation rates by targeted population (SASARAN), yet to be vaccinated (BELUM VAKSIN), breakdowns of vaccination and doses taken by different user groups and total number of vaccinations.

Since we are only interested in visualising and analysing the monthly vaccination rate in Jarkata, we only require the targeted population (SASARAN) and yet to be vaccinated (BELUM VAKSIN) columns and derive the formula:

Note

Percentage of the cumulative vaccination rate = (targeted / total population)

Let us also check what City Areas (WILAYAH KOTA) the dataset contains:

unique(vac_202106$`WILAYAH KOTA`)

[1] NA                   "JAKARTA UTARA"      "JAKARTA BARAT"     
[4] "JAKARTA TIMUR"      "JAKARTA SELATAN"    "JAKARTA PUSAT"     
[7] "KAB.ADM.KEP.SERIBU"

Since we do not want the outer islands, we should remove values with City Area KAB.ADM.KEP.SERIBU.

Also, we can see there is an NA value, attributed to the first row which contains the Total count, which we want to remove.

Let’s check how many rows we should expect in our tidied dataframe per month:

vac_202106 %>% filter(vac_202106$`WILAYAH KOTA` != "KAB.ADM.KEP.SERIBU")

# A tibble: 261 × 21
   `KODE KELURAHAN` `WILAYAH KOTA`  KECAMATAN   KELURAHAN SASARAN `BELUM VAKSIN`
   <chr>            <chr>           <chr>       <chr>       <dbl>          <dbl>
 1 3172051003       JAKARTA UTARA   PADEMANGAN  ANCOL       20393          13394
 2 3173041007       JAKARTA BARAT   TAMBORA     ANGKE       25785          16743
 3 3175041005       JAKARTA TIMUR   KRAMAT JATI BALE KAM…   25158          19068
 4 3175031003       JAKARTA TIMUR   JATINEGARA  BALI MES…    8683           5816
 5 3175101006       JAKARTA TIMUR   CIPAYUNG    BAMBU AP…   22768          15575
 6 3174031002       JAKARTA SELATAN MAMPANG PR… BANGKA      18930          12655
 7 3175051002       JAKARTA TIMUR   PASAR REBO  BARU        20267          11481
 8 3175041004       JAKARTA TIMUR   KRAMAT JATI BATU AMP…   41389          30601
 9 3171071002       JAKARTA PUSAT   TANAH ABANG BENDUNGA…   19008          11660
10 3175031002       JAKARTA TIMUR   JATINEGARA  BIDARA C…   32331          23675
# ℹ 251 more rows
# ℹ 15 more variables: `JUMLAH\r\nDOSIS 1` <dbl>, `JUMLAH\r\nDOSIS 2` <dbl>,
#   `TOTAL VAKSIN\r\nDIBERIKAN` <dbl>, `LANSIA\r\nDOSIS 1` <dbl>,
#   `LANSIA\r\nDOSIS 2` <dbl>, `LANSIA TOTAL \r\nVAKSIN DIBERIKAN` <dbl>,
#   `PELAYAN PUBLIK\r\nDOSIS 1` <dbl>, `PELAYAN PUBLIK\r\nDOSIS 2` <dbl>,
#   `PELAYAN PUBLIK TOTAL\r\nVAKSIN DIBERIKAN` <dbl>,
#   `GOTONG ROYONG\r\nDOSIS 1` <dbl>, `GOTONG ROYONG\r\nDOSIS 2` <dbl>, …

Note

The code block above tells us we are expecting 261 rows per month, a total of 3132 rows for our combined vaccination dataset over 12 months

The creates a function proc_data() which processes the imported dataset to do the following:

Remove the “Total” row
Remove the outer islands records
Drop unnecessary columns not required for processing
Recode Bahasa Indonesia column names to English for easy processing
Add a period column which indicates the month
Calculate vaccination rate (VaccinationRate) utilising formula stated above
Merge the geometry from jkt into the new dataframe

proc_data <- function(df, date){
    df <- filter(df, df$KELURAHAN != "TOTAL")
    df <- filter(df, df$`WILAYAH KOTA` != "KAB.ADM.KEP.SERIBU")
    new_df <- left_join(jkt, df,
                          by = c("SubDistrictCode" = "KODE KELURAHAN"))
    
    # recode column names to english
    new_df <- new_df %>% rename(
                      "CityArea" = "WILAYAH KOTA",
                      "TargetPop" = "SASARAN",
                      "YetToBeVac" = "BELUM VAKSIN")
    
    new_df <- new_df[,c("OBJECT_ID",
                   "SubDistrictCode",
                   "SubDistrict",
                   "Code",
                   "Province",
                   "City",
                   "District",
                   "SubDistrictName",
                   "TotalPop",
                   "CityArea",
                   "TargetPop",
                   "YetToBeVac"
                   )]
    
    new_df$period = as.Date(date, "%Y-%m-%d") 
    
    new_df <- new_df %>%
      mutate(VaccinationRate = (new_df$TargetPop - new_df$YetToBeVac) / new_df$TargetPop)
    
    return (new_df)
}

The code chunk below assists us in the vaccination processing the raw vaccination data to create an sf object with geometry using the proc_data() function we have created earlier above.

proc2106 <- proc_data(vac_202106, "2021-06-01")
proc2107 <- proc_data(vac_202107, "2021-07-01")
proc2108 <- proc_data(vac_202108, "2021-08-01")
proc2109 <- proc_data(vac_202109, "2021-09-01")
proc2110 <- proc_data(vac_202110, "2021-10-01")
proc2111 <- proc_data(vac_202111, "2021-11-01")
proc2112 <- proc_data(vac_202112, "2021-12-01")
proc2201 <- proc_data(vac_202201, "2022-01-01")
proc2202 <- proc_data(vac_202202, "2022-02-01")
proc2203 <- proc_data(vac_202203, "2022-03-01")
proc2204 <- proc_data(vac_202204, "2022-04-01")
proc2205 <- proc_data(vac_202205, "2022-05-01")

Using rbind() we merge all sf objects of each month into a singular sf object.

combined_jkt_vac <- rbind(proc2106, proc2107, proc2108, proc2109, proc2110, proc2111, proc2112, proc2201, proc2202, proc2203, proc2204, proc2205)

Let us inspect the combined_jkt_vac sf object now.

glimpse(combined_jkt_vac)

Rows: 3,132
Columns: 15
$ OBJECT_ID       <dbl> 25477, 25478, 25397, 25400, 25390, 25391, 25394, 25386…
$ SubDistrictCode <chr> "3173031006", "3173031007", "3171031003", "3171031006"…
$ SubDistrict     <chr> "KEAGUNGAN", "GLODOK", "HARAPAN MULIA", "CEMPAKA BARU"…
$ Code            <dbl> 317303, 317303, 317103, 317103, 317102, 317102, 317102…
$ Province        <chr> "DKI JAKARTA", "DKI JAKARTA", "DKI JAKARTA", "DKI JAKA…
$ City            <chr> "JAKARTA BARAT", "JAKARTA BARAT", "JAKARTA PUSAT", "JA…
$ District        <chr> "TAMAN SARI", "TAMAN SARI", "KEMAYORAN", "KEMAYORAN", …
$ SubDistrictName <chr> "KEAGUNGAN", "GLODOK", "HARAPAN MULIA", "CEMPAKA BARU"…
$ TotalPop        <dbl> 21609, 9069, 29085, 41913, 15793, 33383, 35906, 21828,…
$ CityArea        <chr> "JAKARTA BARAT", "JAKARTA BARAT", "JAKARTA PUSAT", "JA…
$ TargetPop       <dbl> 15262, 7192, 19987, 27330, 11656, 23056, 24940, 16084,…
$ YetToBeVac      <dbl> 10407, 3622, 13778, 18618, 6162, 15233, 16405, 10609, …
$ period          <date> 2021-06-01, 2021-06-01, 2021-06-01, 2021-06-01, 2021-…
$ VaccinationRate <dbl> 0.3181103, 0.4963849, 0.3106519, 0.3187706, 0.4713452,…
$ geometry        <MULTIPOLYGON [m]> MULTIPOLYGON (((-764674.9 8..., MULTIPOLY…

Checking back with our previous calculations, we have 3132 rows here, hence, the data preparation has been done correctly!

4 Chloropleth Mapping and Analysis

We will now plot a choropleth map of vaccination rates with the help of tmap and tmap_animations (Reference: https://r.geocompx.org/adv-map.html)

Since the combined_jkt_vac sf object contains vaccination data from all time periods, we will use tm_facets() to split them by the column period.

vacrate_anim <-
  tm_shape(combined_jkt_vac) + tm_fill("VaccinationRate",
            palette = "Greens") +
    tm_borders(lwd = 0.1) +
  tm_facets(along = "period", free.coords = FALSE)

By calling the vacrate_anim variable stored above, we can display the static maps below. The number ontop of the map denotes the YYYYMM time period of the Vaccination Rates by Subdistricts. (202205 means Vaccination Rates in May 2022 by Subdistrict)

vacrate_anim

=============

=======

=======

======

=======

=======

======

=======

=======

======

=======

Utilising the tmap_animations() function, we can export a gif animation that we can insert into the quarto doc for better understanding of spatial point patterns between the different time periods.

tmap_animation(vacrate_anim, filename = "Take-Home_Ex02/vacrate.gif", delay = 100, width = 1280, height = 720, scale = 2)

Creating frames

=============

=======

=======

======

=======

=======

======

=======

=======

======

=======


Creating animation
Animation saved to C:\renjie-teo\IS415-GAA\exercises\Take-Home_Ex02\vacrate.gif

Here, we have generated an animation of the Vaccination rates from June 2021 to May 2022 using the tmap_animation to export it as a gif with the help of gifski.

The data is coded in the form of YYYYMM (Year and Month)

Legend values are in terms of proportion (0.2 means 20% vaccination rate)

Note

Vaccination Rate = Vaccinated Populaton / Target Population

From the animation above, we can see that there is a rather steady and rapid increase in vaccination rate for most states between the months of June 2021 to September 2021 where vaccinations increased from “0.2 to 0.3” range and “0.3 to 0.4” range to “0.7 to 0.8” range.

Thereafter, vaccination rates slowly increased to the “0.8 to 0.9” range around neighbouring regions who has previous achieved highest rates of vaccinations.

Majority of the subdistricts has reached “0.8 to 0.9” vaccination range by around Janurary to February 2022.

5 Creating a Time Series Cube

5.1 Gathering Requisite Data

We will create another function to come up with the data format required for the spacetime cube:

The creates a function proc_timeseries_raw_data() which processes the imported dataset to do the following:

Remove the “Total” row
Remove the outer islands records
Drop unnecessary columns not required for processing
Recode Bahasa Indonesia column names to English for easy processing
Add a period column which indicates the month
Calculate vaccination rate (VaccinationRate) utilising formula stated above

proc_timeseries_raw_data <- function(df, date){
    df <- filter(df, df$KELURAHAN != "TOTAL")
    df <- filter(df, df$`WILAYAH KOTA` != "KAB.ADM.KEP.SERIBU")
    
    # recode column names to english
    new_df <- df %>% rename(
                      "SubDistrict" = "KELURAHAN",
                      "SubDistrictCode" = "KODE KELURAHAN",
                      "CityArea" = "WILAYAH KOTA",
                      "TargetPop" = "SASARAN",
                      "YetToBeVac" = "BELUM VAKSIN")
    
    new_df <- new_df[,c(
                   "SubDistrictCode",
                   "SubDistrict",
                   "CityArea",
                   "TargetPop",
                   "YetToBeVac"
                   )]
    
    new_df$period = as.Date(date, "%Y-%m-%d") 
    
    new_df <- new_df %>%
      mutate(VaccinationRate = (new_df$TargetPop - new_df$YetToBeVac) / new_df$TargetPop)
    
    return (new_df)
}

The code chunk below uses the proc_timeseries_raw_data() function to process the original imported vaccination data into data usable for the time series cube. We will use rbind() to combine all the tables together

jkt_vac <-
rbind(proc_timeseries_raw_data(vac_202106, "2021-06-01"),
      proc_timeseries_raw_data(vac_202107, "2021-07-01"),
      proc_timeseries_raw_data(vac_202108, "2021-08-01"),
      proc_timeseries_raw_data(vac_202109, "2021-09-01"),
      proc_timeseries_raw_data(vac_202110, "2021-10-01"),
      proc_timeseries_raw_data(vac_202111, "2021-11-01"),
      proc_timeseries_raw_data(vac_202112, "2021-12-01"),
      proc_timeseries_raw_data(vac_202201, "2022-01-01"),
      proc_timeseries_raw_data(vac_202202, "2022-02-01"),
      proc_timeseries_raw_data(vac_202203, "2022-03-01"),
      proc_timeseries_raw_data(vac_202204, "2022-04-01"),
      proc_timeseries_raw_data(vac_202205, "2022-05-01")
      )

We will also reload the map and perform the required transformations that were done above. The only difference in jkt_geo is that we will just drop the two polygons which has no City, District or Subdistrict filled in instead of merging to PADEMANGAN TIMUR:

DANAU SUNTER (OBJECT_ID 25645)
DANAU SUNTER DLL (OBJECT_ID 25646)

This is done as there would be errors in computing the weights later on using the originally merged geometry.

jkt_geo <- st_read(dsn = "Take-Home_Ex02/data/geospatial", layer = "BATAS_DESA_DESEMBER_2019_DUKCAPIL_DKI_JAKARTA") %>% st_transform(crs = 23837)

Reading layer `BATAS_DESA_DESEMBER_2019_DUKCAPIL_DKI_JAKARTA' from data source 
  `C:\renjie-teo\IS415-GAA\exercises\Take-Home_Ex02\data\geospatial' 
  using driver `ESRI Shapefile'
Simple feature collection with 269 features and 161 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 106.3831 ymin: -6.370815 xmax: 106.9728 ymax: -5.184322
Geodetic CRS:  WGS 84

jkt_geo  <- jkt_geo[0:9] 
jkt_geo <- jkt_geo %>% rename("SubDistrictCode" = "KODE_DESA",
                      "SubDistrict" = "DESA",
                      "Code" = "KODE",
                       "Province" = "PROVINSI",
                      "City" = "KAB_KOTA",
                      "District" = "KECAMATAN",
                      "SubDistrictName" = "DESA_KELUR",
                      "TotalPop" = "JUMLAH_PEN") 
jkt_geo <- jkt_geo %>% 
  filter(jkt_geo$City != "KEPULAUAN SERIBU")

5.2 Creating Spatio-Temporal Time Series Cube

Using the code below, we will use spacetime() of sfdep package to create a spatio-temporal cube

jkt_st <- spacetime(.data = jkt_vac, .geometry = jkt_geo,
                    .loc_col = "SubDistrictCode",
                    .time_col = "period")

Next, we will check if the spacetime cube has been created correctly with the code below:

is_spacetime_cube(jkt_st)

[1] TRUE

The TRUE return means that the jkt_st space time cube has been created successfully.

6 Local Gi* Analysis

6.1 Computing Gi*

Next, we will compute the local Gi* statistics

6.1.1 Deriving Spatial Weights

Using the code chunk below, we will be able to identify neighbours and derive inverse distance weights, which will be necessary to compute the local GI* statistics.

jkt_nb <- jkt_st %>%
  activate("geometry") %>%
  mutate(nb = include_self(st_contiguity(geometry)),
         wt = st_inverse_distance(nb, geometry,
                                  scale = 1,
                                  alpha = 1),
         .before = 1) %>%
  set_nbs("nb") %>%
  set_wts("wt")

Now, let us check the variable jkt_nb

head(jkt_nb)

# A tibble: 6 × 9
  SubDistrictCode SubDistrict   CityArea      TargetPop YetToBeVac period    
  <chr>           <chr>         <chr>             <dbl>      <dbl> <date>    
1 3173031006      KEAGUNGAN     JAKARTA BARAT     15262      10407 2021-06-01
2 3173031007      GLODOK        JAKARTA BARAT      7192       3622 2021-06-01
3 3171031003      HARAPAN MULIA JAKARTA PUSAT     19987      13778 2021-06-01
4 3171031006      CEMPAKA BARU  JAKARTA PUSAT     27330      18618 2021-06-01
5 3171021001      PASAR BARU    JAKARTA PUSAT     11656       6162 2021-06-01
6 3171021002      KARANG ANYAR  JAKARTA PUSAT     23056      15233 2021-06-01
# ℹ 3 more variables: VaccinationRate <dbl>, nb <list>, wt <list>

We can see that the neighbour (nb) and weights (wt) has been calculated.

6.1.2 Computing Gi* Values

Utilising the code chunk below, we use local_gstar_perm() of sfdep package and group by period to manually calculate the local GI* statistic for each subdistrict. After which, we can use unnest() to unnest the gi_star column of the new dataframe.

gistars <- jkt_nb %>%
  group_by(period) %>%
  mutate(gi_star = local_gstar_perm(
    VaccinationRate, nb, wt)) %>%
  tidyr::unnest(gi_star)

With the statistic, we can merge our geometry to be able to plot and see the statistical trends of the local GI* statistic

gistar_map <- left_join(jkt_geo, gistars,
                          by = c("SubDistrictCode" = "SubDistrictCode"))

We create the tmap plot for gi_star statistic. Note we use p_sim to used the simulated values and used fixed style and breaks to plot the values that were P < 0.05 which are areas that are significant.

gistar_map <- gistar_map %>% mutate(`P-Value` = case_when(p_sim < 0.05 ~ '< 0.05',  p_sim >= 0.05 ~ 'Not-Significant'))

gistar_tmap <-
  tm_shape(gistar_map) + 
    tm_fill("P-Value") +
    tm_borders(lwd = 0.1) +
    tm_facets(along = "period", free.coords = FALSE)

Now, we plot the tmap plot of areas where the Gi* statistic is significant.

gistar_tmap

=============

=======

=======

======

=======

=======

======

=======

=======

======

=======

What it means is that for values in yellow, since the P Value is <0.05 which is significant, we know that the area is significantly either associated with higher or lower vaccination rate values than the surrounding areas.

The Central Jarkata region consistently remained as significantly different from its surrounding regions throughout the 12 months. From about October 2021 onwards until May 2022, we can see most of Jarkata Selatan’s subdistricts are significant. Part of Jarkata Timur’s subdistricts, particularly those in the middle and south of the city area are significant too.

Now, we will conduct a Emerging Hot Spot Analysis to conclusively tell if the area that is significant is a cold or hot spot.

7 Emerging Hot Spot Analysis (EHSA)

7.1 Mann-Kendall Test

We utilise the group_by and MannKendall() function to perform the Mann-Kendall test to identify if the Vaccination Rates are increasing or decreasing over time.

ehsa <- gistars %>%
  group_by(SubDistrict) %>%
  summarise(mk = list(
    unclass(
      Kendall::MannKendall(gi_star)))) %>%
  tidyr::unnest_wider(mk)

7.1.0.1 Selecting 3 Districts to Evaluate the Temporal Trends Revealed

Hypothesis:

H0: No monotonic trend in series

H1: A trend exists, it can be positive, negative, or non-null

KAPUK MUARA (3172011003)

plotdata <- gistars %>% 
  ungroup() %>% 
  filter(SubDistrictCode == "3172011003") |> 
  select(SubDistrict, period, gi_star)

ggplot(data = plotdata, 
       aes(x = period, 
           y = gi_star)) +
  geom_line() +
  theme_light()

Firstly, we can see theres a sharp upward and then gradual downward trend.

ehsa %>% filter(SubDistrict == "KAPUK MUARA")

# A tibble: 1 × 6
  SubDistrict    tau        sl     S     D  varS
  <chr>        <dbl>     <dbl> <dbl> <dbl> <dbl>
1 KAPUK MUARA -0.909 0.0000521   -60  66.0  213.

Next, since the p_value < 0.05, we can reject the null hypothesis that there is no monotonic trend. Hence, we can conclude that we can say that the vaccination rate for Kapuk Muara is significantly a osciliating hotspot trend.

HALIM PERDANA KUSUMAH (3175081004)

plotdata <- gistars %>% 
  ungroup() %>% 
  filter(SubDistrictCode == "3175081004") |> 
  select(SubDistrict, period, gi_star)

ggplot(data = plotdata, 
       aes(x = period, 
           y = gi_star)) +
  geom_line() +
  theme_light()

Firstly, we can see theres mostly a general upward trend most of the time. Meaning that the vaccination rates are increasing.

ehsa %>% filter(SubDistrict == "HALIM PERDANA KUSUMAH")

# A tibble: 1 × 6
  SubDistrict             tau        sl     S     D  varS
  <chr>                 <dbl>     <dbl> <dbl> <dbl> <dbl>
1 HALIM PERDANA KUSUMAH 0.939 0.0000287    62  66.0  213.

Next, since the p_value < 0.05, we can reject the null hypothesis that there is no monotonic trend. Hence, we can conclude that we can say that the vaccination rate for Halim Perdana Kusuma is significantly a osciliating hotspot trend.

KEBON MELATI (3171071005)

plotdata <- gistars %>% 
  ungroup() %>% 
  filter(SubDistrictCode == "3171071005") |> 
  select(SubDistrict, period, gi_star)

ggplot(data = plotdata, 
       aes(x = period, 
           y = gi_star)) +
  geom_line() +
  theme_light()

Firstly, we can see theres mostly a downward trend of vaccination most of the time.

ehsa %>% filter(SubDistrict == "KEBON MELATI")

# A tibble: 1 × 6
  SubDistrict     tau        sl     S     D  varS
  <chr>         <dbl>     <dbl> <dbl> <dbl> <dbl>
1 KEBON MELATI -0.909 0.0000521   -60  66.0  213.

Next, since the p-value is < 0.05, we reject the null hypothesis that there is no monotonic trend. Hence, we can conclude that we can say that the vaccination rate for Kebon Melati is significantly a osciliating hotspot trend.

7.1.1 Arranging to Show Significant Emerging Hot/Cold Spots

emerging <- ehsa %>%
  arrange(sl, abs(tau)) %>%
  slice(1:5)

7.1.2 Performing Emerging Hotspot Analysis

Utilising the emerging_hotspot_analysis() function of sfdep, it tasks a space time object jkt_st which we created before and the variable name which we are interested in VaccinationRate.

By default we leave k = 1 which is for time lag. nsim in this case is number of simulations to be performed, the more simulations the more stable the result.

ehsa <- emerging_hotspot_analysis(
  x = jkt_st, 
  .var = "VaccinationRate", 
  k = 1, 
  nsim = 99
)

7.1.3 Visualising the Distribution of EHSA Classes

Using the code chunk below, we use ggplot2 functions to see the distribution of EHSA classes in a bar chart.

ggplot(data = ehsa,
       aes(x = classification)) +
  geom_bar()

Here, we can see most subdistricts are an oscilating hotspot meaning that statistically, many of these hotspots are a statistically significant cold spot during a prior month and less than 90% of the months, it has been a statistically significant hot spot. We can verify this as in the previous chloropleth map plotted, the vaccination rates changed at different rates, hence, a subdistrict’s which is initially lower could become higher than other subdistricts surrounding it int he next month.

Secondly, many of the subdistricts were sporadic cold spots, which means that they have these subdistricts have a history of being on and off-again cold spots, less than 90% of the months it has been statistically cold spots and it has never been a significant hot spot.

By 90% of months, given that there are 12 months, 90% would mean requiring to be a significant hot or cold spot for at least 11 out of the 12 months.

7.1.4 Visualising EHSA

Now, we will join the EHSA values with the geometry of Jarkata subdistrict map to see the distribution.

jkt_ehsa <- left_join(jkt_geo, ehsa,
                          by = c("SubDistrictCode" = "location"))

Next, we will plot the map, we utilise mutate to filter non-significant values out as non-significant. Significant values with P-value < 0.05 will have its its hot or cold spot pattern indicated.

ehsa_sig <- jkt_ehsa
ehsa_sig <- ehsa_sig %>% mutate(classification = case_when(
    `p_value` >= 0.05 ~ "not-Significant", 
    TRUE ~ classification 
))
tmap_mode("plot")
tm_shape(jkt_ehsa) +
  tm_polygons() +
  tm_borders(alpha = 0.5) +
tm_shape(ehsa_sig) +
  tm_fill("classification") + 
  tm_borders(alpha = 0.4)

From the EHSA Map, we can see that:

There are three states where no pattern is detected. These means that they do not fall into any of the hot spot or cold spot categories.
There are no subdistrict that is persistently hot or cold as compared to its surrounding subdistricts. This means that no one subdistrict has had a consistently higher vaccination rate than its neighbours
Oscillating Hotspots, coldspots and sporadic coldspots subdistricts are interspersed between each other. These cause their patterns to change as one subdistrict affects the other.
There are subdistricts where it is not significant to tell the EHSA patterns, these regions are commonly located around neighbours that are all some variant of coldspot or hotspot, hence, their vaccination rates might be similar to that of its neighbours.

8 References

Special Thanks to:

Prof Kam Tin Seong for his slides and materials
https://r.geocompx.org/adv-map.html
https://sparkbyexamples.com/r-programming/replace-values-based-on-condition-in-r/