1: Geospatial Analytics for Public Good

Author

Magdalene Chan

Published

November 25, 2023

Modified

December 3, 2023

Objectives

As city-wide urban infrastructures become increasingly digital, datasets from technologies like GPS and RFID on vehicles offer opportunities to track movement patterns over space and time. For instance, smart cards and GPS devices on public buses collect routes and ridership data, containing valuable structure and patterns for understanding human movement and behavior within cities. Despite their potential, the practical use of these extensive location-aware datasets often remains limited to basic tracking and mapping within GIS applications due to the lack of comprehensive spatial and spatio-temporal analysis functions in conventional GIS tools.

Exploratory Spatial Data Analysis (ESDA) holds tremendous potential to address such complex problems. In this study, appropriate Local Indicators of Spatial Association (GLISA) and Emerging Hot Spot Analysis (EHSA) will be applied to undercover the spatial and spatio-temporal mobility patterns of public bus passengers in Singapore.

Tasks

The following tasks will be undertaken in this exercise:

Geovisualisation and Analysis

Compute the passenger trips generated by origin at a hexagon level of 250m for the following time periods:
1. Weekday morning peak – 6am to 9am (inclusive)
2. Weekday evening peak – 5pm to 8pm (inclusive)
3. Weekend/holiday morning peak – 11am to 2pm (inclusive)
4. Weekend/holiday evening peak – 4pm to 7pm (inclusive)
Display the geographical distribution of the passenger trips using appropriate geovisualisation methods.
Describe the spatial patterns revealed by the geovisualisation (not more than 200 words per visual).

Emerging Hot Spot Analysis

With reference to the passenger trips generated by origin at the hexagon level for the four time periods given above:

Perform Mann-Kendall Test by using the spatio-temporal local G_i* values,
Prepare EHSA maps of the G_i* values of the passenger trips by origin at the hexagon level. The maps should only display the significant (i.e. p-value < 0.05).
With reference to the EHSA maps and data visualisation prepared, describe the spatial patterns revealed (not more than 250 words per cluster).

Getting Started

The code chunk below uses p_load() of pacman package to check if the required packages have been installed on the computer. If they are, the packages will be launched. The following packages will be used:

sf package is used for importing, managing, and processing geospatial data.
sfdep package is used to create spatial weights matrix and LISA objects using the sf class to represent spatial data.
tmap package is used for thematic mapping.
plotly package is used to create interactive graphs and charts.
tidyverse package is used for aspatial data wrangling.
knitr package is used for dynamic report generation in R.

Code

pacman::p_load(sf, sfdep, tmap, plotly, tidyverse, knitr)

Import Data

The data sets used are:

Bus Stop Location (Last updated Jul 2023) from LTADataMall retrieved on 18 Nov 2023
Passenger Volume by Origin Destination Bus Stops for August 2023 from LTADataMall retrieved on 18 Nov 2023

Import Geospatial Data: Bus Stop Locations

The code chunk below uses the st_read() function of sf package to import BusStop shapefile into R as a simple feature data frame called BusStop. As BusStop uses svy21 projected coordinate system, the crs is set to 3414.

Code

BusStop <- st_read(dsn = "data/geospatial", 
                layer = "BusStop") %>%
  st_transform(crs=3414)

Reading layer `BusStop' from data source 
  `C:\magdalenecjw\ISSS624 Geospatial\Take_Home_Exercise\Ex1\data\geospatial' 
  using driver `ESRI Shapefile'
Simple feature collection with 5161 features and 3 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 3970.122 ymin: 26482.1 xmax: 48284.56 ymax: 52983.82
Projected CRS: SVY21

Code

# Examine the structure of the data frame
str(BusStop)

Classes 'sf' and 'data.frame':  5161 obs. of  4 variables:
 $ BUS_STOP_N: chr  "22069" "32071" "44331" "96081" ...
 $ BUS_ROOF_N: chr  "B06" "B23" "B01" "B05" ...
 $ LOC_DESC  : chr  "OPP CEVA LOGISTICS" "AFT TRACK 13" "BLK 239" "GRACE INDEPENDENT CH" ...
 $ geometry  :sfc_POINT of length 5161; first list element:  'XY' num  13576 32884
 - attr(*, "sf_column")= chr "geometry"
 - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA
  ..- attr(*, "names")= chr [1:3] "BUS_STOP_N" "BUS_ROOF_N" "LOC_DESC"

There are a total of 5161 features in the BusStop shapefile. Notably, BUS_STOP_N is listed as a character variable. As this variable will be used as the identifier to link to the aspatial data, it should be transformed to a factor so that R treats it as a grouping variable.

Code

# Apply as.factor() to the column
BusStop$BUS_STOP_N <- as.factor(BusStop$BUS_STOP_N)

# Re-examine the structure of the data frame
str(BusStop)

Classes 'sf' and 'data.frame':  5161 obs. of  4 variables:
 $ BUS_STOP_N: Factor w/ 5145 levels "01012","01013",..: 1008 1724 2118 4972 431 3787 1160 2941 1610 4983 ...
 $ BUS_ROOF_N: chr  "B06" "B23" "B01" "B05" ...
 $ LOC_DESC  : chr  "OPP CEVA LOGISTICS" "AFT TRACK 13" "BLK 239" "GRACE INDEPENDENT CH" ...
 $ geometry  :sfc_POINT of length 5161; first list element:  'XY' num  13576 32884
 - attr(*, "sf_column")= chr "geometry"
 - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA
  ..- attr(*, "names")= chr [1:3] "BUS_STOP_N" "BUS_ROOF_N" "LOC_DESC"

Based on the output above, BUS_STOP_N is now a factor of 5145 levels. However, there are a total of 5161 observations. As such, the duplicate records should be checked. The code chunk below shows a list of duplicated bus stop codes.

Code

duplicate <- BusStop %>%
  group_by(BUS_STOP_N) %>%
  filter(n() > 1) %>%
  arrange(BUS_STOP_N)

kable(duplicate)

BUS_STOP_N	BUS_ROOF_N	LOC_DESC	geometry
11009	B04	Ghim Moh Ter	POINT (23101.34 32594.17)
11009	B04-TMNL	GHIM MOH TER	POINT (23100.58 32604.36)
22501	B02	Blk 662A	POINT (13489.09 35536.4)
22501	B02	BLK 662A	POINT (13488.02 35537.88)
43709	B06	BLK 644	POINT (18963.42 36762.8)
43709	B06	BLK 644	POINT (18952.02 36751.83)
47201	UNK	NA	POINT (22616.75 47793.68)
47201	NIL	W’LANDS NTH STN	POINT (22632.92 47934)
51071	B21	MACRITCHIE RESERVOIR	POINT (28311.27 36036.92)
51071	B21	MACRITCHIE RESERVOIR	POINT (28282.54 36033.93)
52059	B03	OPP BLK 65	POINT (30770.3 34460.06)
52059	B09	BLK 219	POINT (30565.45 36133.15)
53041	B05	Upp Thomson Road	POINT (28105.8 37246.76)
53041	B07	Upp Thomson Road	POINT (27956.34 37379.29)
58031	UNK	OPP CANBERRA DR	POINT (27089.69 47570.9)
58031	UNK	OPP CANBERRA DR	POINT (27111.07 47517.77)
62251	B03	Bef Blk 471B	POINT (35500.54 39943.41)
62251	B03	BEF BLK 471B	POINT (35500.36 39943.34)
67421	B01	CHENG LIM STN EXIT B	POINT (34548.54 42052.15)
67421	NIL	CHENG LIM STN EXIT B	POINT (34741.77 42004.21)
68091	B01	AFT BAKER ST	POINT (32164.11 42695.98)
68091	B08	AFT BAKER ST	POINT (32038.84 43298.68)
68099	B02	BEF BAKER ST	POINT (32154.9 42742.82)
68099	B07	BEF BAKER ST	POINT (32004.05 43320.34)
77329	B01	BEF PASIR RIS ST 53	POINT (40765.35 39452.18)
77329	B03	Pasir Ris Central	POINT (40728.15 39438.15)
82221	B01	BLK 3A	POINT (35323.6 33257.05)
82221	B01	Blk 3A	POINT (35308.74 33335.17)
96319	NA	Yusen Logistics	POINT (42187.23 34995.78)
96319	NIL	YUSEN LOGISTICS	POINT (42187.23 34995.78)
97079	B14	OPP ST. JOHN’S CRES	POINT (44144.57 38980.25)
97079	B14	OPP ST. JOHN’S CRES	POINT (44055.75 38908.5)

The code chunk below uses the distinct() function to keep only the unique rows based on the BUS_STOP_N while preserving all other fields (.keep_all = TRUE). By default, distinct() keeps the first occurrence of each unique combination of values in the specified columns.

Code

BusStop <- BusStop %>%
  distinct(BUS_STOP_N, .keep_all = TRUE)

# Re-examine the structure of the data frame
str(BusStop)

Classes 'sf' and 'data.frame':  5145 obs. of  4 variables:
 $ BUS_STOP_N: Factor w/ 5145 levels "01012","01013",..: 1008 1724 2118 4972 431 3787 1160 2941 1610 4983 ...
 $ BUS_ROOF_N: chr  "B06" "B23" "B01" "B05" ...
 $ LOC_DESC  : chr  "OPP CEVA LOGISTICS" "AFT TRACK 13" "BLK 239" "GRACE INDEPENDENT CH" ...
 $ geometry  :sfc_POINT of length 5145; first list element:  'XY' num  13576 32884
 - attr(*, "sf_column")= chr "geometry"
 - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA
  ..- attr(*, "names")= chr [1:3] "BUS_STOP_N" "BUS_ROOF_N" "LOC_DESC"

There are now a total of 5145 observations, aligned with the number of factor levels.

Import Passenger Volume by Origin-Destination Bus Stops

The code chunk below uses the read_csv() function of readr package (imported with the tidyverse package) to import the csv files into R.

Code

odbus <- read_csv("data/aspatial/origin_destination_bus_202308.csv")

# Examine the structure of the data frame
str(odbus)

spc_tbl_ [5,709,512 × 7] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ YEAR_MONTH         : chr [1:5709512] "2023-08" "2023-08" "2023-08" "2023-08" ...
 $ DAY_TYPE           : chr [1:5709512] "WEEKDAY" "WEEKENDS/HOLIDAY" "WEEKENDS/HOLIDAY" "WEEKDAY" ...
 $ TIME_PER_HOUR      : num [1:5709512] 16 16 14 14 17 17 17 17 7 17 ...
 $ PT_TYPE            : chr [1:5709512] "BUS" "BUS" "BUS" "BUS" ...
 $ ORIGIN_PT_CODE     : chr [1:5709512] "04168" "04168" "80119" "80119" ...
 $ DESTINATION_PT_CODE: chr [1:5709512] "10051" "10051" "90079" "90079" ...
 $ TOTAL_TRIPS        : num [1:5709512] 7 2 3 10 5 4 3 22 3 3 ...
 - attr(*, "spec")=
  .. cols(
  ..   YEAR_MONTH = col_character(),
  ..   DAY_TYPE = col_character(),
  ..   TIME_PER_HOUR = col_double(),
  ..   PT_TYPE = col_character(),
  ..   ORIGIN_PT_CODE = col_character(),
  ..   DESTINATION_PT_CODE = col_character(),
  ..   TOTAL_TRIPS = col_double()
  .. )
 - attr(*, "problems")=<externalptr>

Based on the data frame structure seen above, ORIGIN_PT_CODE and DESTINATION_PT_CODE are listed as character variables. These variables are equivalent to BUS_STOP_N of BusStop sf data frame and should be transformed to factors so that R treats them as grouping variables.

Code

# Columns to convert to factors
columns_to_convert <- c("ORIGIN_PT_CODE", "DESTINATION_PT_CODE")

# Apply as.factor() to the adjusted columns
odbus[columns_to_convert] <- lapply(odbus[columns_to_convert], as.factor)

# Re-examine the structure of the data frame
str(odbus)

spc_tbl_ [5,709,512 × 7] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ YEAR_MONTH         : chr [1:5709512] "2023-08" "2023-08" "2023-08" "2023-08" ...
 $ DAY_TYPE           : chr [1:5709512] "WEEKDAY" "WEEKENDS/HOLIDAY" "WEEKENDS/HOLIDAY" "WEEKDAY" ...
 $ TIME_PER_HOUR      : num [1:5709512] 16 16 14 14 17 17 17 17 7 17 ...
 $ PT_TYPE            : chr [1:5709512] "BUS" "BUS" "BUS" "BUS" ...
 $ ORIGIN_PT_CODE     : Factor w/ 5067 levels "01012","01013",..: 104 104 4422 4422 2008 2008 832 832 779 355 ...
 $ DESTINATION_PT_CODE: Factor w/ 5071 levels "01012","01013",..: 239 239 4736 4736 691 691 807 807 234 107 ...
 $ TOTAL_TRIPS        : num [1:5709512] 7 2 3 10 5 4 3 22 3 3 ...
 - attr(*, "spec")=
  .. cols(
  ..   YEAR_MONTH = col_character(),
  ..   DAY_TYPE = col_character(),
  ..   TIME_PER_HOUR = col_double(),
  ..   PT_TYPE = col_character(),
  ..   ORIGIN_PT_CODE = col_character(),
  ..   DESTINATION_PT_CODE = col_character(),
  ..   TOTAL_TRIPS = col_double()
  .. )
 - attr(*, "problems")=<externalptr>

Based on the output above, ORIGIN_PT_CODE and DESTINATION_PT_CODE are now factors.

Extract Commuting Flow data

The code chunk below extracts commuting flows for the four target time periods. For completeness, a list of all possible Bus Stops x Day Time x Time combinations is created. Following which, the commuting flow for each Bus Stops x Day Time x Time combination within the odbus data frame is computed and missing combinations are imputed with a value of 0. Lastly the resultant data frame is pivoted wider to form a data frame where each row represents one bus stop code, similar to the BusStop sf data frame.

Code

# Create a list of possible bus stop code, day type and time per hour for checking
BusStopList <- expand.grid(ORIGIN_PT_CODE = unique(BusStop$BUS_STOP_N),
                          DAY_TYPE = c("WEEKDAY", "WEEKENDS/HOLIDAY"),
                          TIME_PER_HOUR = unique(odbus$TIME_PER_HOUR))

# Commute commuting flow by bus stop code, day type and time per hour
odbus_recoded <- odbus %>%
  group_by(ORIGIN_PT_CODE, DAY_TYPE, TIME_PER_HOUR) %>%
  summarise(TRIPS = sum(TOTAL_TRIPS)) 

# Identify missing Bus Stops and impute a 0 value
odbus_missing <- BusStopList %>%
  anti_join(odbus_recoded, by = c("ORIGIN_PT_CODE", "DAY_TYPE", "TIME_PER_HOUR")) %>%
  mutate(TRIPS = 0)

# Combine the above two data frames
odbus_long <- rbind(odbus_recoded, odbus_missing) %>%
    mutate(peak_period = case_when(
    DAY_TYPE == "WEEKDAY" & TIME_PER_HOUR >= 6 & TIME_PER_HOUR <=9 ~ 
      "weekday_morn",
    DAY_TYPE == "WEEKDAY" & TIME_PER_HOUR >= 17 & TIME_PER_HOUR <=20 ~ 
      "weekday_evening",
    DAY_TYPE == "WEEKENDS/HOLIDAY" & TIME_PER_HOUR >= 11 & TIME_PER_HOUR <=14 ~ 
      "weekend_morn",
    DAY_TYPE == "WEEKENDS/HOLIDAY" & TIME_PER_HOUR >= 16 & TIME_PER_HOUR <=19 ~ 
      "weekend_evening",
    TRUE ~ "non_peak")) 

# Pivot wider to form data frame with one bus stop code per row
odbus_wide_peak <- odbus_long %>%
  group_by(ORIGIN_PT_CODE, peak_period) %>%
  summarise(TOTAL_TRIPS = sum(TRIPS)) %>%
  pivot_wider(names_from = peak_period,
              values_from = TOTAL_TRIPS)

odbus_wide <- odbus_long %>%
  filter(!(peak_period == "non_peak")) %>%
  group_by(ORIGIN_PT_CODE, DAY_TYPE, TIME_PER_HOUR) %>%
  summarise(TOTAL_TRIPS = sum(TRIPS)) %>%
  mutate(Day_Time = paste(DAY_TYPE, TIME_PER_HOUR)) %>%
  ungroup %>%
  select(-c(DAY_TYPE, TIME_PER_HOUR)) %>%
  pivot_wider(names_from = Day_Time,
              values_from = TOTAL_TRIPS) %>%
  left_join(odbus_wide_peak)

Append Commuting Flow data with Bus Stop Locations

left_join() of dplyr is then used to join the geographical data and commuting flow data table using the Bus Stop Code as the common identifier. Left join is done to ensure that the geospatial properties (geometry column) of the BusStop sf data frame is retained and bus stops not found within the geospatial data set are dropped.

Code

odBusStop <- left_join(BusStop, odbus_wide, by = c("BUS_STOP_N" = "ORIGIN_PT_CODE"))

str(odBusStop)

Classes 'sf' and 'data.frame':  5145 obs. of  25 variables:
 $ BUS_STOP_N         : Factor w/ 5199 levels "01012","01013",..: 1008 1724 2118 4972 431 3787 1160 2941 1610 4983 ...
 $ BUS_ROOF_N         : chr  "B06" "B23" "B01" "B05" ...
 $ LOC_DESC           : chr  "OPP CEVA LOGISTICS" "AFT TRACK 13" "BLK 239" "GRACE INDEPENDENT CH" ...
 $ WEEKDAY 6          : num  0 0 497 67 46 ...
 $ WEEKDAY 7          : num  9 0 623 128 36 ...
 $ WEEKDAY 8          : num  1 0 631 62 60 ...
 $ WEEKDAY 9          : num  5 0 373 50 48 ...
 $ WEEKDAY 17         : num  19 0 654 254 47 ...
 $ WEEKDAY 18         : num  17 0 485 104 60 ...
 $ WEEKDAY 19         : num  6 0 382 33 58 70 149 284 841 221 ...
 $ WEEKDAY 20         : num  1 0 221 8 30 42 18 140 610 91 ...
 $ WEEKENDS/HOLIDAY 11: num  0 0 178 27 12 57 3 415 502 234 ...
 $ WEEKENDS/HOLIDAY 12: num  2 0 201 39 32 44 3 447 551 246 ...
 $ WEEKENDS/HOLIDAY 13: num  1 0 183 17 24 27 4 320 560 206 ...
 $ WEEKENDS/HOLIDAY 14: num  3 0 138 27 13 30 0 254 357 172 ...
 $ WEEKENDS/HOLIDAY 16: num  6 1 130 21 21 34 17 131 465 232 ...
 $ WEEKENDS/HOLIDAY 17: num  4 0 151 23 10 41 37 133 541 171 ...
 $ WEEKENDS/HOLIDAY 18: num  1 0 135 29 11 57 15 142 499 167 ...
 $ WEEKENDS/HOLIDAY 19: num  1 0 130 11 19 41 15 91 368 140 ...
 $ non_peak           : num  163 6 3813 935 412 ...
 $ weekday_evening    : num  43 0 1742 399 195 ...
 $ weekday_morn       : num  15 0 2124 307 190 ...
 $ weekend_evening    : num  12 1 546 84 61 ...
 $ weekend_morn       : num  6 0 700 110 81 ...
 $ geometry           :sfc_POINT of length 5145; first list element:  'XY' num  13576 32884
 - attr(*, "sf_column")= chr "geometry"
 - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA ...
  ..- attr(*, "names")= chr [1:24] "BUS_STOP_N" "BUS_ROOF_N" "LOC_DESC" "WEEKDAY 6" ...

There are a total of 5145 observations, which matches the number of features in the BusStop sf data frame.

Geovisualisation and Analysis

Create Spatial Grids

Spatial grids are commonly used in spatial analysis to divide the study area into equal-sized, regular polygons that can tessellate the whole study area. After that, a variable is summarized within each polygon. For this exercise, a hexagon layer of 250m (where 250m is the perpendicular distance between the hexagon centre and its edges) will be created.

In the code chunk below, st_make_grid is used to create the hexagon grids using the svy21 projected coordinate system. In st_make_grid, cellsize argument refers to the cell size in the units that the crs of the spatial data is using. The cellsize can be defined as the distance between opposite edges. Since the data is set in SVY21 projected coordinate system, which uses metres as the unit, the value is set as c(500,500) to create a hexagon layer of 250m. The resulting data frame is then converted into a sf data frame and an index is added for each hexagon grid.

Code

hex_grid <- st_make_grid(odBusStop, cellsize = c(500, 500), 
                         crs = 3413, what = "polygons", square = FALSE) %>%
  st_sf() %>%
  mutate(index = row_number())

In the code chunk below, st_join and st_intersects() is used to return a list of bus stops lying within each hexagon grid. A left join is done in this process (using the argument left = TRUE) to ensure that the geospatial properties (geometry column) of the hex_grid sf data frame is retained. After filtering out hexagon grids that do not contain any bus stops, the number of bus stops and the total number of trips at each time period is then computed for each hexagon grid.

Code

hex_grid_count <- st_join(hex_grid, odBusStop, join = st_intersects, left = TRUE) %>%
  filter(!(is.na(BUS_STOP_N))) %>%
  group_by(index) %>%
  summarise(
    count = n(),
    bus_stop_codes = paste(BUS_STOP_N, collapse = ", "),
    bus_stop_names = paste(LOC_DESC, collapse = ", "),
    across(starts_with("week"), sum)
  ) %>%
  ungroup()

The hex_grid_count sf data frame can now be used for plotting the geographical distribution of peak hour bus stops and commuting flow using tmap.

Visualisation of Geographical Distribution of Commuting Flow

tmap package can be used to help visualise the geographical distribution of the commuting flow over the four target time periods. Such geovisualisations can help identify areas with high commuting flow, which can be further used for other operational needs such as traffic management, bus route planning and crowd control management.

Continuous Scale

For efficiency of plotting, only the columns that will be used for this section will be selected.

Code

hex_grid_count2 <- hex_grid_count %>%
  select(index, count, bus_stop_codes, bus_stop_names, geometry,
         weekday_evening, weekday_morn, weekend_evening, weekend_morn)

The following code chunks will review the geographical distribution of the weekdays and weekends peak hour commuting flows using a continuous scale fill.

Geographical Distribution of Weekday Peak Hour Commuting Flows
Geographical Distribution of Weekend/PH Peak Hour Commuting Flows

The geographical distribution of the weekdays peak hour commuting flows is plotted side by side using tm_facets() function. The free.scales argument is set to FALSE to show a consistent scale across both time periods.

Code

tmap_mode("view")

tm_shape(hex_grid_count2) +
  tm_fill(col = c("weekday_morn", "weekday_evening"),
          palette = "Blues",
          style = "cont", 
          popup.vars = c("Bus Stops" = "bus_stop_codes", 
                         "Weekday Morning" = "weekday_morn", 
                         "Weekday Evening" = "weekday_evening")) + 
  tm_facets(ncol = 1, free.scales = FALSE)

Insights

The visualisation above shows a heavily right-skewed distribution across all the weekday peak hour time periods. Most bus stops have less than 100k commuter trips at each of these time periods. Given the current geographical distribution, it is challenging to identify the areas with high commuting flow. However, notable hexagon grids that can be distinctively identified as having high commuting flow >300k are:

Weekday Mornings: 1251, 2411
Weekday Evenings: 1251, 2411, 3239, 4349, 4539

The code chunk below displays the hexagon grids with significantly high commuting flow >300k in greater detail.

Code

weekday_300k <- hex_grid_count2 %>%
  filter(weekday_morn >= 300000 | weekday_evening >= 300000) %>%
  select(index, bus_stop_names, weekday_morn, weekday_evening)

kable(weekday_300k)

index	bus_stop_names	weekday_morn	weekday_evening	geometry
1251	BLK 664C, Blk 662A, BLK 662D, BOON LAY MRT STN, BOON LAY INT, OPP BLK 662C, OPP BOON LAY MRT STN	396701	551636	POLYGON ((13720.12 35286.69…
2411	BLK 515, BEF W’LANDS STN EXIT 7, WOODLANDS REG INT, W’LANDS CIVIC CTR, BLK 894C, W’LANDS STN EXIT 4, W’LANDS STN EXIT 5, OPP W’LANDS CIVIC CTR	313192	474311	POLYGON ((22970.12 46112.01…
3239	BLK 322, BLK 422, AFT ANG MO KIO INT, AFT ANG MO KIO STN EXIT A, ANG MO KIO INT, ANG MO KIO STN	212340	325858	POLYGON ((29720.12 38750.79…
4349	BEDOK STN EXIT B, PANASONIC, OPP PANASONIC, BEDOK BUS INT, BEDOK STN EXIT A	187052	370328	POLYGON ((38720.12 33554.64…
4539	TAMPINES INT, TAMPINES INT, UOB TAMPINES CTR, OPP CENTURY SQ, OPP OUR TAMPINES HUB, OUR TAMPINES HUB, Aft Tampines Stn Exit E, Tampines Stn Exit D	202366	446207	POLYGON ((40220.12 37018.74…

The geographical distribution of the weekends / holidays peak hour commuting flows is plotted side by side using tm_facets() function. The free.scales argument is set to FALSE to show a consistent scale across both time periods.

Code

tmap_mode("view")

tm_shape(hex_grid_count2) +
  tm_fill(col = c("weekend_morn", "weekend_evening"),
          palette = "Purples",
          style = "cont", 
          popup.vars = c("Bus Stops" = "bus_stop_codes", 
                         "Weekend/PH Morning" = "weekend_morn", 
                         "Weekend/PH Evening" = "weekend_evening")) + 
  tm_facets(ncol = 1, free.scales = FALSE)

Insights

Similarly, the visualisation above shows a heavily right-skewed distribution across all the weekend peak hour time periods. However, the continuous scale suggests that weekend peak hour commuting flow volume is approximately one-fifth of weekday peak hour commuting flow volume. Most bus stops have less than 20k commuter trips at each of these time periods. Again, it is challenging to identify the areas with high commuting flow but notable hexagon grids that can be distinctively identified as having high commuting flow >80k are:

Weekend/PH Mornings: 1251, 2411, 4349, 4539
Weekend/PH Evenings: 1251, 2411, 3239, 3578, 4349, 4539

The code chunk below displays the hexagon grids with significantly more commuting flow >80k in greater detail.

Code

weekend_80k <- hex_grid_count2 %>%
  filter(weekend_morn >= 80000 | weekend_evening >= 80000) %>%
  select(index, bus_stop_names, weekend_morn, weekend_evening)

kable(weekend_80k)

index	bus_stop_names	weekend_morn	weekend_evening	geometry
1251	BLK 664C, Blk 662A, BLK 662D, BOON LAY MRT STN, BOON LAY INT, OPP BLK 662C, OPP BOON LAY MRT STN	108551	149047	POLYGON ((13720.12 35286.69…
2411	BLK 515, BEF W’LANDS STN EXIT 7, WOODLANDS REG INT, W’LANDS CIVIC CTR, BLK 894C, W’LANDS STN EXIT 4, W’LANDS STN EXIT 5, OPP W’LANDS CIVIC CTR	99396	123274	POLYGON ((22970.12 46112.01…
3239	BLK 322, BLK 422, AFT ANG MO KIO INT, AFT ANG MO KIO STN EXIT A, ANG MO KIO INT, ANG MO KIO STN	79535	94824	POLYGON ((29720.12 38750.79…
3578	BLK 421, SERANGOON INT, S’GOON STN EXIT A / BLK 413, S’GOON STN EXIT C / BLK 201, S’GOON STN EXIT E, S’GOON STN EXIT H, BLK 409, S’GOON STN EXIT D / BLK 416	57084	80507	POLYGON ((32470.12 36585.73…
4349	BEDOK STN EXIT B, PANASONIC, OPP PANASONIC, BEDOK BUS INT, BEDOK STN EXIT A	101960	118967	POLYGON ((38720.12 33554.64…
4539	TAMPINES INT, TAMPINES INT, UOB TAMPINES CTR, OPP CENTURY SQ, OPP OUR TAMPINES HUB, OUR TAMPINES HUB, Aft Tampines Stn Exit E, Tampines Stn Exit D	95639	124569	POLYGON ((40220.12 37018.74…

However, due to the heavily right-skewed distribution, not much can be glimpsed from the current geovisualisation using continuous scale fill. A data classification method will be used to replot the maps.

Jenks Natural Breaks

The Jenks optimization method, also called the Jenks natural breaks classification method, is a data clustering method designed to determine the best arrangement of values into different classes by seeking to reduce the variance within classes and maximize the variance between classes. For this exercise, the number of breaks is defined at n=6.

The following code chunks will review the geographical distribution of the weekdays and weekends peak hour commuting flows using a continuous scale fill.

Geographical Distribution of Weekday Peak Hour Commuting Flows
Geographical Distribution of Weekend/PH Peak Hour Commuting Flows

Code

tmap_mode("view")

tm_shape(hex_grid_count2) +
  tm_fill(col = c("weekday_morn", "weekday_evening"),
          palette = "Blues",
          n = 6, style = "jenks", 
          popup.vars = c("Bus Stops" = "bus_stop_codes", 
                         "Weekday Morning" = "weekday_morn", 
                         "Weekday Evening" = "weekday_evening")) + 
  tm_facets(ncol = 1, free.scales = FALSE)

Code

weekday_156k <- hex_grid_count2 %>%
  filter(weekday_morn >= 156000 | weekday_evening >= 156000) %>%
  select(index, bus_stop_names, weekday_morn, weekday_evening)

kable(weekday_156k)

index	bus_stop_names	weekday_morn	weekday_evening	geometry
1251	BLK 664C, Blk 662A, BLK 662D, BOON LAY MRT STN, BOON LAY INT, OPP BLK 662C, OPP BOON LAY MRT STN	396701	551636	POLYGON ((13720.12 35286.69…
1438	LAKESIDE STN, OPP BLK 203, BLK 517, OPP LAKESIDE STN	94194	172864	POLYGON ((15220.12 36152.72…
1815	CHOA CHU KANG INT, LOT 1 / CHOA CHU KANG STN, OPP NORTHVALE CONDO, OPP CHOA CHU KANG STN	156028	289879	POLYGON ((18220.12 40482.84…
1904	BLK 644, BLK 231, BLK 503, BLK 225, BT BATOK INT, BLK 240	137323	164006	POLYGON ((18970.12 36585.73…
2054	BLK 431, CLEMENTI STN, CLEMENTI STN	202467	184460	POLYGON ((20220.12 32688.62…
2062	BT PANJANG STN EXIT A/ LRT, OPP BT PANJANG STN, AFT BT PANJANG STN, BT PANJANG INT, THE LINEAR, BT PANJANG STN EXIT B, OPP BLK 180, BLK 180	95420	230072	POLYGON ((20220.12 39616.82…
2411	BLK 515, BEF W’LANDS STN EXIT 7, WOODLANDS REG INT, W’LANDS CIVIC CTR, BLK 894C, W’LANDS STN EXIT 4, W’LANDS STN EXIT 5, OPP W’LANDS CIVIC CTR	313192	474311	POLYGON ((22970.12 46112.01…
3204	BLK 84B, OPP TOA PAYOH STN, BLK 138B, BLK 79C, TOA PAYOH BUS INT, TOA PAYOH BUS INT, TOA PAYOH STN	166638	248170	POLYGON ((29470.12 34853.68…
3239	BLK 322, BLK 422, AFT ANG MO KIO INT, AFT ANG MO KIO STN EXIT A, ANG MO KIO INT, ANG MO KIO STN	212340	325858	POLYGON ((29720.12 38750.79…
3578	BLK 421, SERANGOON INT, S’GOON STN EXIT A / BLK 413, S’GOON STN EXIT C / BLK 201, S’GOON STN EXIT E, S’GOON STN EXIT H, BLK 409, S’GOON STN EXIT D / BLK 416	111428	299786	POLYGON ((32470.12 36585.73…
3859	NAUNG RESIDENCE, THE MIDTOWN, HOUGANG CTRL INT, BLK 302, HOUGANG CTRL INT, BLK 327	106496	183771	POLYGON ((34720.12 38750.79…
3924	BLK 248A, SENGKANG INT, COMPASSVALE INT, BLK 240, Sengkang Stn, BLK 259C, BLK 250A	77090	211180	POLYGON ((35220.12 41348.87…
3957	PUNGGOL STN / WATERWAY PT, PUNGGOL BUS INTERCHANGE, BEF BLK 264, OPP BLK 272C, BLK 272C	88871	179509	POLYGON ((35470.12 42647.91…
4018	BLK 102C, BEF PUNGGOL RD, AFT PUNGGOL RD, OPP BLK 103A	207583	146562	POLYGON ((35970.12 41781.88…
4349	BEDOK STN EXIT B, PANASONIC, OPP PANASONIC, BEDOK BUS INT, BEDOK STN EXIT A	187052	370328	POLYGON ((38720.12 33554.64…
4539	TAMPINES INT, TAMPINES INT, UOB TAMPINES CTR, OPP CENTURY SQ, OPP OUR TAMPINES HUB, OUR TAMPINES HUB, Aft Tampines Stn Exit E, Tampines Stn Exit D	202366	446207	POLYGON ((40220.12 37018.74…
4635	OPP PASIR RIS STN EXIT A, PASIR RIS INT, PASIR RIS INT, AFT PASIR RIS ST 53, AFT PASIR RIS DR 3, BLK 586, BEF PASIR RIS ST 53, AFT PASIR RIS STN EXIT A, BEF PASIR RIS STN	88685	220510	POLYGON ((40970.12 39183.81…

Insights

The geographical distribution of weekday peak hour commuting flow is much clearer in this geovisualisation using Jenks Natural Breaks. Areas with high commuting flow could be defined as the last two classes (commuting flow >156k) and the code chunk above displays these hexagon grids in greater detail. It can also be observed that all of the highest commuting flow areas are in the housing estates.

Code

tmap_mode("view")

tm_shape(hex_grid_count2) +
  tm_fill(col = c("weekend_morn", "weekend_evening"),
          palette = "Purples",
          n = 6, style = "jenks", 
          popup.vars = c("Bus Stops" = "bus_stop_codes", 
                         "Weekend/PH Morning" = "weekend_morn", 
                         "Weekend/PH Evening" = "weekend_evening")) + 
  tm_facets(ncol = 1, free.scales = FALSE)

The code chunk below displays the hexagon grids that fall into the last two classes (>31k) in greater detail.

Code

weekend_31k <- hex_grid_count2 %>%
  filter(weekend_morn >= 31000 | weekend_evening >= 31000) %>%
  select(index, bus_stop_names, weekend_morn, weekend_evening)

kable(weekend_31k)

index	bus_stop_names	weekend_morn	weekend_evening	geometry
1251	BLK 664C, Blk 662A, BLK 662D, BOON LAY MRT STN, BOON LAY INT, OPP BLK 662C, OPP BOON LAY MRT STN	108551	149047	POLYGON ((13720.12 35286.69…
1438	LAKESIDE STN, OPP BLK 203, BLK 517, OPP LAKESIDE STN	23964	37003	POLYGON ((15220.12 36152.72…
1778	J GATEWAY, JURONG EAST TEMP INT, OPP JEM	25234	37976	POLYGON ((17970.12 34853.68…
1815	CHOA CHU KANG INT, LOT 1 / CHOA CHU KANG STN, OPP NORTHVALE CONDO, OPP CHOA CHU KANG STN	62447	78360	POLYGON ((18220.12 40482.84…
1904	BLK 644, BLK 231, BLK 503, BLK 225, BT BATOK INT, BLK 240	39157	43066	POLYGON ((18970.12 36585.73…
2024	OPP REGENT PK, NAN HUA PR SCH, CLEMENTI INT, REGENT PK	37828	40308	POLYGON ((19970.12 33121.63…
2054	BLK 431, CLEMENTI STN, CLEMENTI STN	44441	52125	POLYGON ((20220.12 32688.62…
2062	BT PANJANG STN EXIT A/ LRT, OPP BT PANJANG STN, AFT BT PANJANG STN, BT PANJANG INT, THE LINEAR, BT PANJANG STN EXIT B, OPP BLK 180, BLK 180	48824	64095	POLYGON ((20220.12 39616.82…
2102	W’LANDS CHECKPT	42251	38354	POLYGON ((20470.12 46978.04…
2133	W’LANDS CHECKPT	28240	38286	POLYGON ((20720.12 47411.05…
2135	JOHOR BAHRU CHECKPT, JOHOR BAHRU CHECKPT	31773	38431	POLYGON ((20720.12 49143.1,…
2411	BLK 515, BEF W’LANDS STN EXIT 7, WOODLANDS REG INT, W’LANDS CIVIC CTR, BLK 894C, W’LANDS STN EXIT 4, W’LANDS STN EXIT 5, OPP W’LANDS CIVIC CTR	99396	123274	POLYGON ((22970.12 46112.01…
2854	HARBOURFRONT STN EXIT A, OPP VIVOCITY, HARBOURFRONT/VIVOCITY	24091	41675	POLYGON ((26720.12 27492.46…
2950	OPP TIONG BAHRU STN, TIONG BAHRU STN, CTRL GREEN CONDO	37980	35154	POLYGON ((27470.12 29657.53…
3014	OPP NGEE ANN CITY, BEF ORCHARD STN EXIT B, ORCHARD STN / LUCKY PLAZA, ORCHARD STN / TANGS PLAZA, OPP ORCHARD STN / ION	35047	54816	POLYGON ((27970.12 31389.58…
3166	OPP HONG LIM CPLX, CHINATOWN STN EXIT E, MAXWELL RD FC, NEW BRIDGE CTR, OPP SRI MARIAMMAN TP, CHINATOWN STN EXIT C	43599	45553	POLYGON ((29220.12 29224.51…
3204	BLK 84B, OPP TOA PAYOH STN, BLK 138B, BLK 79C, TOA PAYOH BUS INT, TOA PAYOH BUS INT, TOA PAYOH STN	71600	78369	POLYGON ((29470.12 34853.68…
3239	BLK 322, BLK 422, AFT ANG MO KIO INT, AFT ANG MO KIO STN EXIT A, ANG MO KIO INT, ANG MO KIO STN	79535	94824	POLYGON ((29720.12 38750.79…
3292	ST JOSEPH’S CH, AFT BUGIS STN EXIT C, OPP BUGIS STN EXIT C, BRAS BASAH CPLX, HOTEL GRAND PACIFIC, BEF BENCOOLEN STN EXIT A, AFT WATERLOO ST, BEF WATERLOO ST	29286	38048	POLYGON ((30220.12 30956.57…
3578	BLK 421, SERANGOON INT, S’GOON STN EXIT A / BLK 413, S’GOON STN EXIT C / BLK 201, S’GOON STN EXIT E, S’GOON STN EXIT H, BLK 409, S’GOON STN EXIT D / BLK 416	57084	80507	POLYGON ((32470.12 36585.73…
3822	SIMS VILLE, BLK 1015, PAYA LEBAR STN EXIT D, OPP BLK 1015, PAYA LEBAR STN EXIT C, PAYA LEBAR STN EXIT B	28803	34103	POLYGON ((34470.12 33121.63…
3859	NAUNG RESIDENCE, THE MIDTOWN, HOUGANG CTRL INT, BLK 302, HOUGANG CTRL INT, BLK 327	46128	50760	POLYGON ((34720.12 38750.79…
3924	BLK 248A, SENGKANG INT, COMPASSVALE INT, BLK 240, Sengkang Stn, BLK 259C, BLK 250A	44740	53323	POLYGON ((35220.12 41348.87…
3957	PUNGGOL STN / WATERWAY PT, PUNGGOL BUS INTERCHANGE, BEF BLK 264, OPP BLK 272C, BLK 272C	41764	49806	POLYGON ((35470.12 42647.91…
4006	OPP PARKWAY PARADE, PARKWAY PARADE, OPP 112 KATONG	31559	37707	POLYGON ((35970.12 31389.58…
4018	BLK 102C, BEF PUNGGOL RD, AFT PUNGGOL RD, OPP BLK 103A	39349	41052	POLYGON ((35970.12 41781.88…
4349	BEDOK STN EXIT B, PANASONIC, OPP PANASONIC, BEDOK BUS INT, BEDOK STN EXIT A	101960	118967	POLYGON ((38720.12 33554.64…
4539	TAMPINES INT, TAMPINES INT, UOB TAMPINES CTR, OPP CENTURY SQ, OPP OUR TAMPINES HUB, OUR TAMPINES HUB, Aft Tampines Stn Exit E, Tampines Stn Exit D	95639	124569	POLYGON ((40220.12 37018.74…
4635	OPP PASIR RIS STN EXIT A, PASIR RIS INT, PASIR RIS INT, AFT PASIR RIS ST 53, AFT PASIR RIS DR 3, BLK 586, BEF PASIR RIS ST 53, AFT PASIR RIS STN EXIT A, BEF PASIR RIS STN	40921	54305	POLYGON ((40970.12 39183.81…

Insights

For the weekend peak hour commuting flow, areas with high commuting flow could be defined as the last two classes (commuting flow >31k) and the code chunk above displays these hexagon grids in greater detail. Using Jenks Natural Breaks for data classification also reveals more high commuting flow areas that are not just in the major housing estates, but also in the town area and the area surrounding Woodlands Checkpoint.

Emerging Hot Spot Analysis

Given that each target time period spans across four whole hours, it would also be useful to identify time trends across the four hours among these high commuting flow areas. This could help guide better operational decisions such as staffing decisions. Emerging Hot Spot Analysis (EHSA) could be performed to uncover this as it aims to classify each location into one of seventeen categories based on ESRI’s emerging hot spot classification criteria.

To perform EHSA, time series cubes are needed.

Time Series Cube

Spatio-temporal data are typically presented in long formats, where a row identifies a unique location and time observation represented by a column dedicated to time and another to locations. However, such data are traditionally not linked to the geographies as they contain only an identifier of the location, but not the spatial representation that makes the data meaningful for geospatial analysis.

The spacetime class in sfdep package can be used to create lattice data that is a representation of spatio-temporal data for a set of regions containing the geometries over a number of different time-periods. There are two possible layouts: spatio-temporal full grid and sparse grids.

Given a number of spatial features n, and time periods m, a spatio-temporal full grid contains n×m rows. Each location has a recorded observation for each of the time periods in m. When there are missing observations for some locations or time periods and they are entirely omitted from the data set, that is a spatio-temporal sparse grid.

For this analysis, time series cubes will be created for each of the four target time periods respectively using the earlier created hex_grid_count.

Create Time Series Cube

In the code chunks below, relevant columns that correspond to the target time periods will be selected from hex_grid_count and joined with hex_grid to create the n number of spatial features. The corresponding sf data frame will then be pivoted longer to create a time series data containing the m time periods. Once the data frame has nxm rows, apply the spacetime() function to create the time series cube.

Code

# Select relevant columns from hex_grid_count
weekday_morn_hex <- hex_grid_count %>%
  select(index, count, bus_stop_codes, bus_stop_names, geometry,
         `WEEKDAY 6`, `WEEKDAY 7`, `WEEKDAY 8`, `WEEKDAY 9`)

# Join with hex_grid and pivot longer to create time series data
weekday_morn_full <- st_join(hex_grid, weekday_morn_hex, join = st_within) %>%
  select(-c(index.y)) %>%
  rename(index = index.x) %>%
  rename_with(~gsub("WEEKDAY ", "", .), matches("^WEEKDAY \\d+$")) %>%
  pivot_longer(cols = c(`6`, `7`, `8`, `9`), 
               names_to = "TIME_PER_HOUR", values_to = "TRIPS",
               names_transform = as.integer, values_transform = as.integer) %>%
  mutate(across(where(is.integer), ~ifelse(is.na(.), 0, .))) %>%
  select(index, TIME_PER_HOUR, TRIPS) %>%
  st_set_geometry(NULL)

# Apply spacetime() function
weekday_morn_cube <- spacetime(weekday_morn_full, hex_grid,
                               .loc_col = "index", .time_col = "TIME_PER_HOUR")

# Verify that resultant output is a spacetime cube
is_spacetime_cube(weekday_morn_cube)

[1] TRUE

The TRUE return confirms that weekday_morn_cube object is indeed an time-space cube.

Code

# Select relevant columns from hex_grid_count
weekday_evening_hex <- hex_grid_count %>%
  select(index, count, bus_stop_codes, bus_stop_names, geometry,
         `WEEKDAY 17`, `WEEKDAY 18`, `WEEKDAY 19`, `WEEKDAY 20`)

# Join with hex_grid and pivot longer to create time series data
weekday_evening_full <- st_join(hex_grid, weekday_evening_hex, join = st_within) %>%
  select(-c(index.y)) %>%
  rename(index = index.x) %>%
  rename_with(~gsub("WEEKDAY ", "", .), matches("^WEEKDAY \\d+$")) %>%
  pivot_longer(cols = c(`17`, `18`, `19`, `20`), 
               names_to = "TIME_PER_HOUR", values_to = "TRIPS",
               names_transform = as.integer, values_transform = as.integer) %>%
  mutate(across(where(is.integer), ~ifelse(is.na(.), 0, .))) %>%
  select(index, TIME_PER_HOUR, TRIPS) %>%
  st_set_geometry(NULL)

# Apply spacetime() function
weekday_evening_cube <- spacetime(weekday_evening_full, hex_grid,
                               .loc_col = "index", .time_col = "TIME_PER_HOUR")

# Verify that resultant output is a spacetime cube
is_spacetime_cube(weekday_evening_cube)

[1] TRUE

The TRUE return confirms that weekday_evening_cube object is indeed an time-space cube.

Code

# Select relevant columns from hex_grid_count
weekend_morn_hex <- hex_grid_count %>%
  select(index, count, bus_stop_codes, bus_stop_names, geometry,
         `WEEKENDS/HOLIDAY 11`, `WEEKENDS/HOLIDAY 12`, 
         `WEEKENDS/HOLIDAY 13`, `WEEKENDS/HOLIDAY 14`)

# Join with hex_grid and pivot longer to create time series data
weekend_morn_full <- st_join(hex_grid, weekend_morn_hex, join = st_within) %>%
  select(-c(index.y)) %>%
  rename(index = index.x) %>%
  rename_with(~gsub("WEEKENDS/HOLIDAY ", "", .), matches("^WEEKENDS/HOLIDAY \\d+$")) %>%
  pivot_longer(cols = c(`11`, `12`, `13`, `14`), 
               names_to = "TIME_PER_HOUR", values_to = "TRIPS",
               names_transform = as.integer, values_transform = as.integer) %>%
  mutate(across(where(is.integer), ~ifelse(is.na(.), 0, .))) %>%
  select(index, TIME_PER_HOUR, TRIPS) %>%
  st_set_geometry(NULL)

# Apply spacetime() function
weekend_morn_cube <- spacetime(weekend_morn_full, hex_grid,
                               .loc_col = "index", .time_col = "TIME_PER_HOUR")

# Verify that resultant output is a spacetime cube
is_spacetime_cube(weekend_morn_cube)

[1] TRUE

The TRUE return confirms that weekend_morn_cube object is indeed an time-space cube.

Code

# Select relevant columns from hex_grid_count
weekend_evening_hex <- hex_grid_count %>%
  select(index, count, bus_stop_codes, bus_stop_names, geometry,
         `WEEKENDS/HOLIDAY 16`, `WEEKENDS/HOLIDAY 17`, 
         `WEEKENDS/HOLIDAY 18`, `WEEKENDS/HOLIDAY 19`)

# Join with hex_grid and pivot longer to create time series data
weekend_evening_full <- st_join(hex_grid, weekend_evening_hex, join = st_within) %>%
  select(-c(index.y)) %>%
  rename(index = index.x) %>%
  rename_with(~gsub("WEEKENDS/HOLIDAY ", "", .), matches("^WEEKENDS/HOLIDAY \\d+$")) %>%
  pivot_longer(cols = c(`16`, `17`, `18`, `19`), 
               names_to = "TIME_PER_HOUR", values_to = "TRIPS",
               names_transform = as.integer, values_transform = as.integer) %>%
  mutate(across(where(is.integer), ~ifelse(is.na(.), 0, .))) %>%
  select(index, TIME_PER_HOUR, TRIPS) %>%
  st_set_geometry(NULL)

# Apply spacetime() function
weekend_evening_cube <- spacetime(weekend_evening_full, hex_grid,
                               .loc_col = "index", .time_col = "TIME_PER_HOUR")

# Verify that resultant output is a spacetime cube
is_spacetime_cube(weekend_evening_cube)

[1] TRUE

The TRUE return confirms that weekend_evening_cube object is indeed an time-space cube.

Computation of Local G_i* statistics

To compute the local G_i* statistics, spatial weights must first be derived. The code chunks below will be used to identify neighbors and to derive an inverse distance weights for each target time period. activate() is used to first activate the geometry context, followed by mutate() to create two new columns nb and wt. Then activate the data context again and copy over the nb and wt columns to each time-slice using set_nbs() and set_wts().

Code

weekday_morn_nb <- weekday_morn_cube %>%
  activate("geometry") %>%
  mutate(nb = include_self(st_contiguity(geometry)),
         wt = st_inverse_distance(nb, geometry,
                                  scale = 1,
                                  alpha = 1),
         .before = 1) %>%
  set_nbs("nb") %>%
  set_wts("wt")

Code

weekday_evening_nb <- weekday_evening_cube %>%
  activate("geometry") %>%
  mutate(nb = include_self(st_contiguity(geometry)),
         wt = st_inverse_distance(nb, geometry,
                                  scale = 1,
                                  alpha = 1),
         .before = 1) %>%
  set_nbs("nb") %>%
  set_wts("wt")

Code

weekend_morn_nb <- weekend_morn_cube %>%
  activate("geometry") %>%
  mutate(nb = include_self(st_contiguity(geometry)),
         wt = st_inverse_distance(nb, geometry,
                                  scale = 1,
                                  alpha = 1),
         .before = 1) %>%
  set_nbs("nb") %>%
  set_wts("wt")

Code

weekend_evening_nb <- weekend_evening_cube %>%
  activate("geometry") %>%
  mutate(nb = include_self(st_contiguity(geometry)),
         wt = st_inverse_distance(nb, geometry,
                                  scale = 1,
                                  alpha = 1),
         .before = 1) %>%
  set_nbs("nb") %>%
  set_wts("wt")

In the code chunks below, the new columns are used to manually calculate the local G_i* for each location. This is done by grouping by TIME_PER_HOUR and using local_gstar_perm() of sfdep package. After which, unnest() is used to unnest the gi_star column of the newly created output.

Code

gi_weekday_morn <- weekday_morn_nb %>%
  group_by(TIME_PER_HOUR) %>%
  mutate(gi_star = local_gstar_perm(TRIPS, nb, wt)) %>%
  tidyr::unnest(gi_star)

Code

gi_weekday_evening <- weekday_evening_nb %>%
  group_by(TIME_PER_HOUR) %>%
  mutate(gi_star = local_gstar_perm(TRIPS, nb, wt)) %>%
  tidyr::unnest(gi_star)

Code

gi_weekend_morn <- weekend_morn_nb %>%
  group_by(TIME_PER_HOUR) %>%
  mutate(gi_star = local_gstar_perm(TRIPS, nb, wt)) %>%
  tidyr::unnest(gi_star)

Code

gi_weekend_evening <- weekend_evening_nb %>%
  group_by(TIME_PER_HOUR) %>%
  mutate(gi_star = local_gstar_perm(TRIPS, nb, wt)) %>%
  tidyr::unnest(gi_star)

Mann-Kendall Test

With the above G_i* statistics, it is possible to evaluate each location for a trend using the Mann-Kendall test.

The code chunk below performs the Mann-Kendall Test for the following high commuting flow hexagon grids:

 [1] 1251 1438 1815 1904 2054 2062 2411 3204 3239 3578 3859 3924 3957 4018 4349
[16] 4539 4635

Code

plot_weekdaymorn <- gi_weekday_morn %>% 
  ungroup() %>% 
  filter(index %in% weekday_156k$index) |> 
  select(index, TIME_PER_HOUR, gi_star)

# Plot result using ggplot2 functions and ggplotly() of plotly package
p1 <- ggplot(data = plot_weekdaymorn, 
       aes(x = TIME_PER_HOUR, 
           y = gi_star,
           color = as.factor(index))) +
  geom_line() +
  labs(color = "Hexagon Grid") + 
  theme_light()

ggplotly(p1)

The Mann-Kendall test is done for each hexagon grid in the code chunk below.

Code

weekdaymorn_ehsa <- gi_weekday_morn %>%
  group_by(index) %>%
  summarise(mk = list(
    unclass(
      Kendall::MannKendall(gi_star)))) %>%
  tidyr::unnest_wider(mk)

# Save out the model
write_rds(weekdaymorn_ehsa, "data/rds/weekdaymorn_ehsa.rds")

# A tibble: 17 × 7
   index    tau     sl     S     D  varS                                geometry
   <int>  <dbl>  <dbl> <dbl> <dbl> <dbl>                           <POLYGON [m]>
 1  3859  1.00  0.0894     6  6.00  8.67 ((34720.12 38750.79, 34470.12 38895.13…
 2  1438 -0.667 0.308     -4  6.00  8.67 ((15220.12 36152.72, 14970.12 36297.06…
 3  2054  0.667 0.308      4  6.00  8.67 ((20220.12 32688.62, 19970.12 32832.95…
 4  2062 -0.667 0.308     -4  6.00  8.67 ((20220.12 39616.82, 19970.12 39761.16…
 5  2411 -0.667 0.308     -4  6.00  8.67 ((22970.12 46112.01, 22720.12 46256.35…
 6  3204  0.667 0.308      4  6.00  8.67 ((29470.12 34853.68, 29220.12 34998.02…
 7  3578  0.667 0.308      4  6.00  8.67 ((32470.12 36585.73, 32220.12 36730.07…
 8  3924 -0.667 0.308     -4  6.00  8.67 ((35220.12 41348.87, 34970.12 41493.21…
 9  3957 -0.667 0.308     -4  6.00  8.67 ((35470.12 42647.91, 35220.12 42792.25…
10  4018 -0.667 0.308     -4  6.00  8.67 ((35970.12 41781.88, 35720.12 41926.22…
11  4635 -0.667 0.308     -4  6.00  8.67 ((40970.12 39183.81, 40720.12 39328.14…
12  1815 -0.333 0.734     -2  6.00  8.67 ((18220.12 40482.84, 17970.12 40627.18…
13  1904 -0.333 0.734     -2  6.00  8.67 ((18970.12 36585.73, 18720.12 36730.07…
14  3239  0.333 0.734      2  6.00  8.67 ((29720.12 38750.79, 29470.12 38895.13…
15  4539  0.333 0.734      2  6.00  8.67 ((40220.12 37018.74, 39970.12 37163.08…
16  1251  0     1          0  6.00  8.67 ((13720.12 35286.69, 13470.12 35431.03…
17  4349  0     1          0  6.00  8.67 ((38720.12 33554.64, 38470.12 33698.98…

In the above result, sl is the p-value.

Insights

The chart and the table above suggests a slight upward trend across the 4 hours of the weekday morning peak period for hexagon grid 3859.

At an alpha value of 0.05 (CI 95%), p-value > alpha value, suggesting that this trend is insignificant.
At alpha value of 0.1 (CI 90%), p-value < alpha value, suggesting that this trend is significant.

For all the other hexagon grids, there is no significant upward or downward trend in commuting flow across the 4 hours of the weekday morning peak period.

The code chunk below performs the Mann-Kendall Test for the following high commuting flow hexagon grids:

 [1] 1251 1438 1815 1904 2054 2062 2411 3204 3239 3578 3859 3924 3957 4018 4349
[16] 4539 4635

Code

plot_weekdayevening <- gi_weekday_evening %>% 
  ungroup() %>% 
  filter(index %in% weekday_156k$index) |> 
  select(index, TIME_PER_HOUR, gi_star)

# Plot result using ggplot2 functions and ggplotly() of plotly package
p2 <- ggplot(data = plot_weekdayevening, 
       aes(x = TIME_PER_HOUR, 
           y = gi_star,
           color = as.factor(index))) +
  geom_line() +
  labs(color = "Hexagon Grid") + 
  theme_light()

ggplotly(p2)

The Mann-Kendall test is done for each hexagon grid in the code chunk below.

Code

weekdayevening_ehsa <- gi_weekday_evening %>%
  group_by(index) %>%
  summarise(mk = list(
    unclass(
      Kendall::MannKendall(gi_star)))) %>%
  tidyr::unnest_wider(mk)

# Save out the model
write_rds(weekdayevening_ehsa, "data/rds/weekdayevening_ehsa.rds")

# A tibble: 17 × 7
   index    tau     sl     S     D  varS                                geometry
   <int>  <dbl>  <dbl> <dbl> <dbl> <dbl>                           <POLYGON [m]>
 1  1904 -1.00  0.0894    -6  6.00  8.67 ((18970.12 36585.73, 18720.12 36730.07…
 2  4018 -1.00  0.0894    -6  6.00  8.67 ((35970.12 41781.88, 35720.12 41926.22…
 3  1251  0.667 0.308      4  6.00  8.67 ((13720.12 35286.69, 13470.12 35431.03…
 4  1438 -0.667 0.308     -4  6.00  8.67 ((15220.12 36152.72, 14970.12 36297.06…
 5  2062  0.667 0.308      4  6.00  8.67 ((20220.12 39616.82, 19970.12 39761.16…
 6  3204 -0.667 0.308     -4  6.00  8.67 ((29470.12 34853.68, 29220.12 34998.02…
 7  3239  0.667 0.308      4  6.00  8.67 ((29720.12 38750.79, 29470.12 38895.13…
 8  3578  0.667 0.308      4  6.00  8.67 ((32470.12 36585.73, 32220.12 36730.07…
 9  4349  0.667 0.308      4  6.00  8.67 ((38720.12 33554.64, 38470.12 33698.98…
10  4539  0.667 0.308      4  6.00  8.67 ((40220.12 37018.74, 39970.12 37163.08…
11  4635  0.667 0.308      4  6.00  8.67 ((40970.12 39183.81, 40720.12 39328.14…
12  2054 -0.333 0.734     -2  6.00  8.67 ((20220.12 32688.62, 19970.12 32832.95…
13  3859  0.333 0.734      2  6.00  8.67 ((34720.12 38750.79, 34470.12 38895.13…
14  3957  0.333 0.734      2  6.00  8.67 ((35470.12 42647.91, 35220.12 42792.25…
15  1815  0     1          0  6.00  8.67 ((18220.12 40482.84, 17970.12 40627.18…
16  2411  0     1          0  6.00  8.67 ((22970.12 46112.01, 22720.12 46256.35…
17  3924  0     1          0  6.00  8.67 ((35220.12 41348.87, 34970.12 41493.21…

In the above result, sl is the p-value.

Insights

The chart and the table above suggests a slight downward trend across the 4 hours of the weekday evening peak period for hexagon grids 1904 and 4018.

At an alpha value of 0.05 (CI 95%), p-value > alpha value, suggesting that this trend is insignificant.
At alpha value of 0.1 (CI 90%), p-value < alpha value, suggesting that this trend is significant.

For all the other hexagon grids, there is no significant upward or downward trend in commuting flow across the 4 hours of the weekday evening peak period.

The code chunk below performs the Mann-Kendall Test for the following high commuting flow hexagon grids:

 [1] 1251 1438 1778 1815 1904 2024 2054 2062 2102 2133 2135 2411 2854 2950 3014
[16] 3166 3204 3239 3292 3578 3822 3859 3924 3957 4006 4018 4349 4539 4635

Code

plot_weekendmorn <- gi_weekend_morn %>% 
  ungroup() %>% 
  filter(index %in% weekend_31k$index) |> 
  select(index, TIME_PER_HOUR, gi_star)

# Plot result using ggplot2 functions and ggplotly() of plotly package
p3 <- ggplot(data = plot_weekendmorn, 
       aes(x = TIME_PER_HOUR, 
           y = gi_star,
           color = as.factor(index))) +
  geom_line() +
  labs(color = "Hexagon Grid") + 
  theme_light()

ggplotly(p3)

The Mann-Kendall test is done for each hexagon grid in the code chunk below.

Code

weekendmorn_ehsa <- gi_weekend_morn %>%
  group_by(index) %>%
  summarise(mk = list(
    unclass(
      Kendall::MannKendall(gi_star)))) %>%
  tidyr::unnest_wider(mk)

# Save out the model
write_rds(weekendmorn_ehsa, "data/rds/weekendmorn_ehsa.rds")

# A tibble: 29 × 7
   index   tau     sl     S     D  varS                                 geometry
   <int> <dbl>  <dbl> <dbl> <dbl> <dbl>                            <POLYGON [m]>
 1  1438 -1.00 0.0894    -6  6.00  8.67 ((15220.12 36152.72, 14970.12 36297.06,…
 2  2102 -1.00 0.0894    -6  6.00  8.67 ((20470.12 46978.04, 20220.12 47122.37,…
 3  2133 -1.00 0.0894    -6  6.00  8.67 ((20720.12 47411.05, 20470.12 47555.39,…
 4  2950 -1.00 0.0894    -6  6.00  8.67 ((27470.12 29657.53, 27220.12 29801.86,…
 5  3859 -1.00 0.0894    -6  6.00  8.67 ((34720.12 38750.79, 34470.12 38895.13,…
 6  4018 -1.00 0.0894    -6  6.00  8.67 ((35970.12 41781.88, 35720.12 41926.22,…
 7  1251  1.00 0.0894     6  6.00  8.67 ((13720.12 35286.69, 13470.12 35431.03,…
 8  1778  1.00 0.0894     6  6.00  8.67 ((17970.12 34853.68, 17720.12 34998.02,…
 9  2411  1.00 0.0894     6  6.00  8.67 ((22970.12 46112.01, 22720.12 46256.35,…
10  2854  1.00 0.0894     6  6.00  8.67 ((26720.12 27492.46, 26470.12 27636.8, …
# … with 19 more rows

In the above result, sl is the p-value.

Insights

The chart and the table above suggests a slight downward trend across the 4 hours of the weekend morning peak period for hexagon grids 1438, 2102, 2133, 2950, 3859 and 4018.

At an alpha value of 0.05 (CI 95%), p-value > alpha value, suggesting that this trend is insignificant.
At alpha value of 0.1 (CI 90%), p-value < alpha value, suggesting that this trend is significant.

Conversely, there is a slight upward trend across the 4 hours of the weekend morning peak period for hexagon grids 1251, 1778, 2411, 2854, 3014, 3166, 3292, 3578, 3924, 3957, 4349, 4539 and 4635.

At an alpha value of 0.05 (CI 95%), p-value > alpha value, suggesting that this trend is insignificant.
At alpha value of 0.1 (CI 90%), p-value < alpha value, suggesting that this trend is significant.

For all the other hexagon grids, there is no significant upward or downward trend in commuting flow across the 4 hours of the weekend morning peak period.

The code chunk below performs the Mann-Kendall Test for the following high commuting flow hexagon grids:

 [1] 1251 1438 1778 1815 1904 2024 2054 2062 2102 2133 2135 2411 2854 2950 3014
[16] 3166 3204 3239 3292 3578 3822 3859 3924 3957 4006 4018 4349 4539 4635

Code

plot_weekendevening <- gi_weekend_evening %>% 
  ungroup() %>% 
  filter(index %in% weekend_31k$index) |> 
  select(index, TIME_PER_HOUR, gi_star)

# Plot result using ggplot2 functions and ggplotly() of plotly package
p4 <- ggplot(data = plot_weekendevening, 
       aes(x = TIME_PER_HOUR, 
           y = gi_star,
           color = as.factor(index))) +
  geom_line() +
  labs(color = "Hexagon Grid") + 
  theme_light()

ggplotly(p4)

The Mann-Kendall test is done for each hexagon grid in the code chunk below.

Code

weekendevening_ehsa <- gi_weekend_evening %>%
  group_by(index) %>%
  summarise(mk = list(
    unclass(
      Kendall::MannKendall(gi_star)))) %>%
  tidyr::unnest_wider(mk)

# Save out the model
write_rds(weekendevening_ehsa, "data/rds/weekendevening_ehsa.rds")

# A tibble: 29 × 7
   index   tau     sl     S     D  varS                                 geometry
   <int> <dbl>  <dbl> <dbl> <dbl> <dbl>                            <POLYGON [m]>
 1  2950 -1.00 0.0894    -6  6.00  8.67 ((27470.12 29657.53, 27220.12 29801.86,…
 2  3166 -1.00 0.0894    -6  6.00  8.67 ((29220.12 29224.51, 28970.12 29368.85,…
 3  3292 -1.00 0.0894    -6  6.00  8.67 ((30220.12 30956.57, 29970.12 31100.9, …
 4  3924 -1.00 0.0894    -6  6.00  8.67 ((35220.12 41348.87, 34970.12 41493.21,…
 5  4018 -1.00 0.0894    -6  6.00  8.67 ((35970.12 41781.88, 35720.12 41926.22,…
 6  1251  1.00 0.0894     6  6.00  8.67 ((13720.12 35286.69, 13470.12 35431.03,…
 7  1778  1.00 0.0894     6  6.00  8.67 ((17970.12 34853.68, 17720.12 34998.02,…
 8  2062  1.00 0.0894     6  6.00  8.67 ((20220.12 39616.82, 19970.12 39761.16,…
 9  2411  1.00 0.0894     6  6.00  8.67 ((22970.12 46112.01, 22720.12 46256.35,…
10  2854  1.00 0.0894     6  6.00  8.67 ((26720.12 27492.46, 26470.12 27636.8, …
# … with 19 more rows

In the above result, sl is the p-value.

Insights

The chart and the table above suggests a slight downward trend across the 4 hours of the weekend evening peak period for hexagon grids 2950, 3166, 3292, 3924 and 4018.

At an alpha value of 0.05 (CI 95%), p-value > alpha value, suggesting that this trend is insignificant.
At alpha value of 0.1 (CI 90%), p-value < alpha value, suggesting that this trend is significant.

Conversely, there is a slight upward trend across the 4 hours of the weekend morning peak period for hexagon grids 1251, 1778, 2062, 2411, 2854 and 4635.

At an alpha value of 0.05 (CI 95%), p-value > alpha value, suggesting that this trend is insignificant.
At alpha value of 0.1 (CI 90%), p-value < alpha value, suggesting that this trend is significant.

For all the other hexagon grids, there is no significant upward or downward trend in commuting flow across the 4 hours of the weekend morning peak period.

Perform Emerging Hot Spot Analysis (EHSA)

In the code chunks below, emerging_hotspot_analysis() of sfdep package is used to perform Emerging Hot Spot Analysis on a spacetime object e.g. weekday_morn_cube. The .var argument takes the name of a variable of interest (e.g. TRIPS), the k argument is used to specify the number of time lags which is set to 1 by default, and the nsim argument maps the number of simulations to be performed.

For visualisation, ggplot2 functions are used to reveal the distribution of EHSA classes as a bar chart. To visualise the geographic distribution of the EHSA classes, hex_grid sf data frame is joined to the EHSA data frames. Then, tmap functions will be used to plot categorical choropleth maps.

Code

# Creating a custom color palette for choropleth map
my_palette <- c("no pattern detected" = "grey85",
                "new hotspot" = "mistyrose1",
                "sporadic hotspot" = "lightpink",
                "consecutive hotspot" = "indianred",
                "persistent hotspot" = "darkred")

Code

wkd_morn_ehsa <- emerging_hotspot_analysis(
  x = weekday_evening_cube, 
  .var = "TRIPS", 
  k = 1, 
  nsim = 99
)

# Save out the model
write_rds(wkd_morn_ehsa, "data/rds/wkd_morn_ehsa.rds")

Code

# Visualise the distribution of classes
v1 <- ggplot(data = wkd_morn_ehsa,
             aes(x = classification)) +
  geom_bar() + 
  theme(axis.text = element_text(size = 6))

ggplotly(v1)

Insights

More than half of the hexagon grids had no pattern detected - likely due to the large number of 0s in the data frame. The next largest class of hexagon grids are “sporadic cold spots”, which refers to locations that are on-again then off-again cold spots.

Choropleth Map for Weekday Morning Hotspots

The following choropleth map is charted at an alpha value of 0.1 (CI 90%). - Based on the Mann-Kendall Test performed earlier, there are no areas with a significant trend observed at alpha value of 0.05 (CI 95%).

Code

wkd_morn_ehsa_grid <- hex_grid %>%
  left_join(wkd_morn_ehsa,
            by = c("index" = "location"))

# Plot the choropleth map for significant values
morn_ehsa_sig <- wkd_morn_ehsa_grid  %>%
  # Remove coldspot areas since most of these areas have low bus stop density
  filter(p_value < 0.1)

tmap_mode("view")

tm_shape(morn_ehsa_sig) +
  tm_fill("classification", palette = my_palette) + 
  tm_borders(alpha = 0.4)

Insights

Based on the map, there are clusters of hot spots in several areas of Singapore during weekday morning peak hours. Among the grids identified earlier as high commuting flow regions, most are also hot spot areas. However, there are also clusters which were previously not identified as a high commuting flow region but should also be reviewed when doing further analysis.

Code

wkd_evening_ehsa <- emerging_hotspot_analysis(
  x = weekday_evening_cube, 
  .var = "TRIPS", 
  k = 1, 
  nsim = 99
)

# Save out the model
write_rds(wkd_evening_ehsa, "data/rds/wkd_evening_ehsa.rds")

Code

# Visualise the distribution of classes
v2 <- ggplot(data = wkd_evening_ehsa,
             aes(x = classification)) +
  geom_bar() + 
  theme(axis.text = element_text(size = 6))

ggplotly(v2)

Insights

More than half of the hexagon grids had no pattern detected - likely due to the large number of 0s in the data frame. The next largest class of hexagon grids are “sporadic cold spots”. Among the “hot spots” classes, more areas are “sporadic hot spots”.

Choropleth Map for Weekday Evening

Code

wkd_evening_ehsa_grid <- hex_grid %>%
  left_join(wkd_evening_ehsa,
            by = c("index" = "location"))

# Plot the choropleth map for significant values
ehsa_sig <- wkd_evening_ehsa_grid  %>%
  # alpha value set at 0.1 based on Mann-Kendall Test performed earlier
  filter(p_value < 0.1) 

tm_shape(ehsa_sig) +
  tm_fill("classification", palette = my_palette) + 
  tm_borders(alpha = 0.4)

Insights

Based on the map, there are clusters of hot spots in several areas of Singapore during weekday evening peak hours. Among the grids identified earlier as high commuting flow regions, most are also hot spot areas. However, there are also clusters which were previously not identified as a high commuting flow region but should also be reviewed when doing further analysis.

Code

wke_morn_ehsa <- emerging_hotspot_analysis(
  x = weekend_morn_cube, 
  .var = "TRIPS", 
  k = 1, 
  nsim = 99
)

# Save out the model
write_rds(wke_morn_ehsa, "data/rds/wke_morn_ehsa.rds")

Code

# Visualise the distribution of classes
v3 <- ggplot(data = wke_morn_ehsa,
             aes(x = classification)) +
  geom_bar() + 
  theme(axis.text = element_text(size = 6))

ggplotly(v3)

Insights

Choropleth Map for Weekend/PH Morning

Code

wke_morn_ehsa_grid <- hex_grid %>%
  left_join(wke_morn_ehsa,
            by = c("index" = "location"))

# Plot the choropleth map for significant values
wkemorn_ehsa_sig <- wke_morn_ehsa_grid  %>%
  # alpha value set at 0.1 based on Mann-Kendall Test performed earlier
  filter(p_value < 0.1) 

tm_shape(wkemorn_ehsa_sig) +
  tm_fill("classification", palette = my_palette) + 
  tm_borders(alpha = 0.4)

Insights

Based on the map, there are clusters of hot spots in several areas of Singapore during weekend morning peak hours. However, not all grids identified earlier as high commuting flow regions are also hot spot areas. The biggest cluster of hot spot area is in the Downtown region, though some larger housing estates such as Tampines, Boon Lay, Punggol/Sengkang are also hot spot areas. This could be due to the timings used for the weekend morning peak hour definition being around lunch time and considerably later than the weekday morning peak hours.

Code

wke_evening_ehsa <- emerging_hotspot_analysis(
  x = weekend_evening_cube, 
  .var = "TRIPS", 
  k = 1, 
  nsim = 99
)

# Save out the model
write_rds(wke_evening_ehsa, "data/rds/wke_evening_ehsa.rds")

Code

# Visualise the distribution of classes
v4 <- ggplot(data = wke_evening_ehsa,
             aes(x = classification)) +
  geom_bar() + 
  theme(axis.text = element_text(size = 6))

ggplotly(v4)

Insights

Choropleth Map for Weekend/PH Evening

Code

wke_evening_ehsa_grid <- hex_grid %>%
  left_join(wke_evening_ehsa,
            by = c("index" = "location"))

# Plot the choropleth map for significant values
evening_ehsa_sig <- wke_evening_ehsa_grid  %>%
  # alpha value set at 0.1 based on Mann-Kendall Test performed earlier
  filter(p_value < 0.1) 

tm_shape(evening_ehsa_sig) +
  tm_fill("classification", palette = my_palette) + 
  tm_borders(alpha = 0.4)

Insights

Based on the map, there are clusters of hot spots in several areas of Singapore during weekend/PH evening peak hours. However, not all grids identified earlier as high commuting flow regions are also hot spot areas. Most of the clusters are found in the housing estates and around Woodlands Checkpoint. The Downtown area, while still a high commuting flow region, is no longer a hot spot area by evening time.

Conclusion

The geographical distribution of weekday and weekend peak hour commuting flow is heavily right-skewed, and can hence, be better identified in geovisualisation using Jenks Natural Breaks data classification. On weekdays, all of the highest commuting flow areas are in the housing estates, while on weekends / PH, high commuting flow areas that are not just in the major housing estates, but also in the town area and the area surrounding Woodlands Checkpoint.

There are clusters of hot spots in several areas of Singapore during all four peak hour time periods. Among the grids identified earlier as high commuting flow regions for weekday mornings and evenings, most are also hot spot areas. However, there are also clusters which were previously not identified as a high commuting flow region but should also be reviewed when doing further analysis.

For weekends/PH, not all grids identified as high commuting flow regions are also hot spot areas. During the weekend / PH morning peak, the biggest cluster of hot spot area is in the Downtown region, though some larger housing estates such as Tampines, Boon Lay, Punggol/Sengkang are also hot spot areas. This could be due to the timings used for the weekend morning peak hour definition being around lunch time and considerably later than the weekday morning peak hours. Conversely, during the weekend / PH evening peak, most clusters are found in the housing estates and around Woodlands Checkpoint. The Downtown area, while still a high commuting flow region, is no longer a hot spot area by evening time.

Objectives

Tasks

Geovisualisation and Analysis

Emerging Hot Spot Analysis

Getting Started

Import Data

Import Geospatial Data: Bus Stop Locations

Import Passenger Volume by Origin-Destination Bus Stops

Extract Commuting Flow data

Append Commuting Flow data with Bus Stop Locations

Geovisualisation and Analysis

Create Spatial Grids

Visualisation of Geographical Distribution of Commuting Flow

Continuous Scale

Jenks Natural Breaks

Emerging Hot Spot Analysis

Time Series Cube

Create Time Series Cube

Computation of Local Gi* statistics

Mann-Kendall Test

Perform Emerging Hot Spot Analysis (EHSA)

Choropleth Map for Weekday Morning Hotspots

Choropleth Map for Weekday Evening

Choropleth Map for Weekend/PH Morning

Choropleth Map for Weekend/PH Evening

Conclusion

Computation of Local G_i* statistics