
Analysis of the risk of theft from vehicle crime in Kyoto, Japan using environmental indicators of streetscapes


With the advent of spatial analysis, the importance of analyzing crime patterns based on location has become more apparent. Previous studies have advanced our understanding of the factors associated with crime concentration in street networks. However, it has recently become possible to assess the factors associated with crime at even finer spatial scales of streetscapes, such as the existence of greenery or walls, owing to the availability of streetscape image data and progress in machine learning-based image analysis. Such place-scale environments can be both crime-producing and crime-preventing, depending on the composition of the streetscape environment. In this study, we attempted to assess the risk of crime occurrence through place-scale indicators using streetscape images and their interaction terms through binomial logistic regression modeling of the place-scale crime risk of theft from vehicles in the central part of Kyoto City, Japan. The results suggest that the effects of specific streetscape components on the risk of crime occurrence are certainly dependent on other components. For example, the association of the crime occurrence risk with the occupancy rate of vegetation in a streetscape image is positive when there are few buildings and walls, and vice versa. The findings of this study show the importance of considering the complex composition of visible streetscape components in assessing the place-scale risk of crime occurrence.


Understanding crime patterns is important for designing safe living environments and implementing effective crime prevention activities. With the advent of new data analysis techniques, perspectives on crime patterns have shifted from the macro- or meso-level (e.g., cities, neighborhoods) to the micro-level (e.g., street segments, intersections, microgrids, proximity to specific facilities) (Weisburd et al., 2009; Eck & Guerette, 2012; Groff et al., 2010). A well-known outcome at the micro-level scale is the discovery of crime hotspots. Law enforcement agencies can apply this knowledge to incorporate hotspot patrols into practical crime prevention efforts (Ariel et al., 2020; Braga et al., 2019). Although several studies have used points of interest (POIs) and street network centralities as micro-scale indicators, finding theoretically relevant data at this spatial scale is still challenging. Note that a POI is a specific entity or facility, such as a convenience store or parking lot, with a well-defined location denoted primarily by geographical coordinates.

Several environmental criminological theories, such as defensible space (Newman, 1972), crime prevention through environmental design (CPTED; Jeffery, 1971), and routine activity theory (Cohen & Felson, 1979), suggest that crime risks are determined by the visible geographic elements of places. Sherman et al. (1989) defined place as “a fixed physical environment that can be seen completely and simultaneously, at least on its surface, by one’s naked eyes.” Considering these points, large-scale photographic images composed of visible features in streetscapes, such as Google Street View (GSV), are expected to be a useful data source for environmental criminology research at the spatial scale of places (hereafter called place-scale) (Vandeviver, 2014).

Using GSV, Langton & Steenbeek (2017) evaluated surveillability scores, such as the number of windows and the front door visibility from the street; accessibility, such as a burglar alarm; and the existence of amenities, such as plant pots and benches in public spaces, and found that buildings with higher surveillability scores were at a lower risk of being victims of residential burglary. He et al. (2017) also used GSV to audit physical incivility indicators (litter, graffiti, etc.), defensible space indicators (walls, fences, hedges, etc.), and territorial functioning indicators (trees, shrubs, gardens, etc.) to quantitatively analyze the association between the incidence of violent crime and the built environment. They analyzed violent crime and socio-economic indicators using Poisson regression with eigenvector-based spatial filtering and used GSV to audit the built environment of sites selected from the regression results. Consequently, they showed that sites where crime is over-estimated tend to have low physical incivility indicators but high defensible space and territorial functioning indicators, and vice versa.

However, such an approach to systematic social observation using GSV images involves the time-consuming and human-dependent task of extracting built-environment indicators from GSV images using the naked eye. To overcome these problems, this study applied machine learning to automatically obtain streetscape indicators that might help explain crime patterns from the collected GSV images. Computer vision technologies with deep learning at their core have developed dramatically during the last decade. This has made it possible to mechanically extract the features of streetscapes from a large number of streetscape images such as GSV. We also have easy access to pre-trained semantic segmentation models, making these automated processes relatively easy to implement. Several studies have used automated approaches to detect major streetscape components in GSV images (Amiruzzaman et al., 2021; Deng et al., 2022; Hipp et al., 2021; Khorshidi et al., 2021; Zhou et al., 2021). Classes detected by semantic segmentation are often denoted as “segments,” but in this paper, we denote them as “components” to distinguish them from “street segments.” Hipp et al. (2021) extracted streetscape components from GSV images and compared the rates of components in an image with the occurrence of various crimes (aggravated assault, robbery, burglary, motor vehicle theft, and larceny). Zhou et al. (2021) found that the locations where drug activity occurs are likely to have a higher percentage of traffic signs, roads, and building components in a streetscape image, suggesting that the composition of streetscape components in street-view images can be a useful indicator of whether a location is well-managed. Hence, the automated approach to extracting streetscape components from GSV through deep learning offers new possibilities for understanding the risk of place-scale crime occurrence.

Note, however, that whether a streetscape component promotes or deters crime is not easily generalized and might be context dependent. For example, consider the effect of the presence of vegetation on crime risks. Vegetation may create opportunities for crime because it may be used by offenders to hide themselves or stolen goods (Michael et al., 2001). According to CPTED, removing trees is recommended in cases where the trees block the lines of sight and may attenuate natural surveillance (Crowe, 2000). On the other hand, some studies have shown that plants significantly reduce the risk of crime in urban centers and residential areas (Du & Law, 2016; Kuo & Sullivan, 2001).

Well-managed vegetation in cities can be thought of as improving territoriality and enhancing natural surveillance by increasing opportunities for people to pay attention and stay. Thus, the effect of vegetation on crime can be both positive and negative, depending on the surrounding conditions (Donovan & Prestemon, 2012; Troy et al., 2016). Similarly, it has been suggested that walls and fences be removed or made permeable if they interfere with natural surveillance, while they may also serve as physical barriers to reinforce territoriality, contributing to crime risk reduction (Crowe, 2000; Newman, 1972). This indicates that the same streetscape components may produce different affordances depending on other components as well as crime type and modus operandi. Such complex effects can be captured as the interaction effects of streetscape components on crime risks in statistical modeling.

Previous studies using automated techniques to obtain streetscape indicators via GSV and machine learning have not investigated such complex relationships among streetscape indicators, i.e., their interaction effects. Therefore, we highlight the interaction effects of streetscape components in the modeling of the risk of crime occurrence in this study. Specifically, we examined the risk of theft from vehicle (TFV) occurrence in the city of Kyoto, Japan using streetscape indices detected by a pre-trained machine learning (semantic segmentation) model to explore how the interactions of streetscape components contribute to the effective modeling of the TFV risk on streets. In the statistical modeling, we employed commonly used micro-scale indicators, network centrality measures, and proximity measures to POIs as control variables to assess how streetscape components improve such a spatial analysis of crime risk on streets.


Crime occurrence data

The study area is located in the central district of Kyoto City, Japan. The city was founded in 794 and maintains the old Japanese townscape with narrow streets in a grid pattern. Currently, it is one of the 20 largest cities in Japan, with a population of over one million. The target area for this analysis was the three central wards of Kyoto (Fig. 1). According to the 2015 census in Japan, the population, areal size, and population density of the target area are 277,122 persons, 21.22 km², and 13,059 persons per km², respectively.

Fig. 1

Street network and density of theft from vehicles in the study area. a Street network and road length of each road component. b Kernel density estimation of the theft from vehicles

The analysis used crime occurrence data of all TFV (n = 500) recorded by the Kyoto Prefectural Police Department between January 1, 2015, and December 31, 2018, in the target area. As the analysis unit, we created points at 10-m intervals on the street network data of the area, which are based on ArcGIS Geo Suite Road Network 2021 from ESRI Japan Inc. We generated the interval points on the streets using the QChainage plugin in QGIS. Further, we snapped the TFV occurrence to the nearest 10-m interval points on the street network. The street points nearest to the location where the TFV occurred were considered to be the locations where the TFV occurred. The crime occurrence locations were registered by indicating the locations on a zoomable digital map of the Kyoto Prefectural Police GIS. Although there is a certain margin of error for the indication, the size is quite small compared to the 10 m analysis unit. Furthermore, as all registrations were verified at headquarters, we considered these TFV location data sufficiently accurate for our analysis.

We focused on the TFVs that occurred on the street to assess the association between TFV risk and streetscape characteristics. Therefore, a building polygon layer by the Geospatial Information Authority of Japan was overlaid on the target area to filter out the TFVs occurring inside buildings, such as multistory parking lots (n = 333). In addition, we excluded TFVs that occurred over 20 m away from the points on the street because, even if the assigned point is outside a building, an occurrence far away from the street, such as in a large parking lot, is not considered to have occurred on the street. Ultimately, 278 TFV incidents satisfied these conditions (Fig. 1). The mean and median of the distances between all crime locations and the point where the GSV was taken were 8.23 and 7.73 m, respectively.
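The snapping and distance filter described above can be sketched as follows. This is a minimal illustration, not the authors' GIS workflow: it assumes planar (projected) coordinates in metres, and the function name `snap_to_nearest_point` and the toy coordinates are hypothetical.

```python
import math

def snap_to_nearest_point(incident, street_points, max_dist_m=20.0):
    """Return (index, distance) of the nearest street point.

    Returns None when the nearest point is more than max_dist_m away,
    mirroring the exclusion of incidents far from the street.
    Coordinates are assumed to be planar x, y values in metres.
    """
    best_idx, best_dist = None, math.inf
    for idx, (px, py) in enumerate(street_points):
        dist = math.hypot(incident[0] - px, incident[1] - py)
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    if best_dist > max_dist_m:
        return None
    return best_idx, best_dist

# Hypothetical street points every 10 m along a straight east-west street.
street_points = [(float(x), 0.0) for x in range(0, 101, 10)]
# Hypothetical incident locations; the second lies 30 m off the street.
incidents = [(12.0, 5.0), (47.0, 30.0), (88.0, -3.0)]
snapped = [snap_to_nearest_point(p, street_points) for p in incidents]
```

In this toy case the second incident is dropped because its nearest street point is farther than 20 m, while the other two are assigned to the points at 10 m and 90 m.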

Streetscape indicators

We used the percentage of each component in a GSV image as a streetscape index. Using the Street View Static API provided by Google, we obtained two street-view images of 640 × 640 pixels and classified each image pixel via the PSPNet semantic segmentation model (Zhao, 2019; Zhao et al., 2017). The acquisition points of the GSV images were assigned to the nearest points on the street, within 10 m. The percentage of each component in an image was defined as the ratio of the pixels of each component class to the total pixels of the image (Fig. 2). We used the PSPNet semantic segmentation model trained on the Cityscapes dataset (Cordts et al., 2016), consisting of 19 classes (road, sidewalk, building, wall, fence, pole, traffic light, traffic sign, vegetation, terrain, sky, person, rider, car, truck, bus, train, motorcycle, and bicycle).
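As a minimal sketch of how such an index could be computed, the code below turns a per-pixel label map into component percentages and averages the forward and backward views. It assumes that the segmentation output is a 2D grid of integer class ids numbered in the order the 19 classes are listed above; the function names and the tiny 2 × 4 label maps are hypothetical stand-ins for real 640 × 640 outputs.

```python
from collections import Counter

# Assumption: class ids 0..18 follow the order in which the classes are listed in the text.
CITYSCAPES_CLASSES = [
    "road", "sidewalk", "building", "wall", "fence", "pole",
    "traffic light", "traffic sign", "vegetation", "terrain", "sky",
    "person", "rider", "car", "truck", "bus", "train", "motorcycle", "bicycle",
]

def component_percentages(label_map):
    """Percentage of pixels assigned to each class in one label map."""
    counts = Counter(label for row in label_map for label in row)
    total = sum(counts.values())
    return {
        name: 100.0 * counts.get(class_id, 0) / total
        for class_id, name in enumerate(CITYSCAPES_CLASSES)
    }

def average_percentages(forward, backward):
    """Average the forward-view and backward-view percentages per class."""
    return {name: (forward[name] + backward[name]) / 2.0 for name in CITYSCAPES_CLASSES}

# Toy label maps (hypothetical): 0 = road, 2 = building, 8 = vegetation, 10 = sky.
forward_map = [[0, 0, 2, 10], [0, 0, 2, 8]]
backward_map = [[0, 2, 2, 10], [0, 2, 8, 8]]
forward = component_percentages(forward_map)
backward = component_percentages(backward_map)
averaged = average_percentages(forward, backward)
```

Here the averaged vegetation share is 18.75% (12.5% forward, 25% backward), and the shares of all 19 classes sum to 100%.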

Fig. 2

Example of semantic segmentation result and percentage of each component (Source: Authors’ photos). a Forward of road. b Backward of road. c Average of (a) and (b)

The streetscape characteristics often discussed in environmental criminology include security features, windows, graffiti, and potted plants. Ideally, to automatically detect these features, a new image segmentation model that has learned those components should be built. However, building such a model in a short time is difficult owing to the huge computation and annotation costs involved. In this study, we therefore used an accessible model trained on the 19 classes of the Cityscapes dataset and used the resulting streetscape indicators in our modeling and exploratory interpretation of the results. The Cityscapes dataset is composed of streetscape images captured with a camera mounted behind the windshield of a car. We inspected some of the side-view GSV image pairs and found that a few images were too close to buildings or walls, resulting in streetscapes not being captured. Therefore, we used GSV images in the forward and backward street directions, which correspond to the in-car viewpoint of the Cityscapes images (Cordts et al., 2016). In this study, we adopted PSPNet with an input size of 713 × 713 pixels and ResNet101 as the feature extractor, trained on the 2,975 training images of the Cityscapes dataset; this model attained a relatively high accuracy of 79.63% mIoU on a test set of 500 images (Zhao, 2019). Table 1 lists the streetscape indices obtained from the street points used in this study.

Table 1 Summary statistics of extracted components from Google Street View images

Street network indicators

Street network centrality has been used as a micro-scale indicator of crime occurrence (Yamamura et al., 2019). Using the Urban Network Analysis Toolbox (City Form Lab, 2016), we computed the network centralities of streets to characterize street morphological features. In this study, three indices were used as control variables: betweenness centrality, straightness centrality, and closeness centrality. As shown in Additional file 1: Fig. S1, the street network in this study was prepared by considering intersections as nodes and road segments as edges. Subsequently, nodes were added at the centroid of each street segment; the network has two types of nodes, intersections and street segment centroids. To calculate each centrality, we counted only the nodes of the street segment centroid (hereafter referred to as centroid node), following earlier studies (Kim & Hipp, 2020; Yamamura et al., 2019):

$$Betweenness{\left[i\right]}^{r}=\sum\limits_{j,k\in G-\{i\},d\left[j,k\right]\le r}\frac{{n}_{jk}\left[i\right]}{{n}_{jk}},$$
$$Closeness{\left[i\right]}^{r}=\frac{1}{\sum\limits_{j\in G-\{i\},d\left[i,j\right]\le r}\left(d\left[i,j\right]\right)},$$
$$Straightness{\left[i\right]}^{r}=\sum\limits_{j\in G-\{i\},d\left[i,j\right]\le r}\frac{\delta \left[i,j\right]}{d\left[i,j\right]},$$

where \(i\), \(j,\) and \(k\) are indices of centroid nodes, \(r\) is the preset maximum network distance from centroid node \(i\), \(G\) is the graph, \({n}_{jk}\left[i\right]\) is the number of shortest paths between \(j\) and \(k\) that pass through \(i\), \({n}_{jk}\) is the number of shortest paths between \(j\) and \(k\), \(d\left[i,j\right]\) is the shortest road distance between \(i\) and \(j\), and \(\delta \left[i,j\right]\) is the Euclidean distance between \(i\) and \(j\). The “graph” is the entire street network, and the “paths” are the routes from one centroid node to another on the graph. In this study, each centrality was calculated for two values of \(r\): 100 m (1-min walking distance) and 400 m (5-min walking distance). In addition, each indicator was calculated on a network that includes the streets outside the target area to suppress the edge effect.

These indices were standardized using the following definitions:

$$Betweenness{\left[i\right]}_{norm}^{r}=\frac{Betweenness{\left[i\right]}^{r}}{Reach{\left[i\right]}^{r}\times \left(Reach{\left[i\right]}^{r}-1\right)},$$
$$Closeness{\left[i\right]}_{norm}^{r}=Closeness{\left[i\right]}^{r} \times Reach{\left[i\right]}^{r},$$


$$Reach{\left[i\right]}^{r}=\sum\limits_{j\in G-\{i\},d\left[i,j\right]\le r}1.$$
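The reach, closeness, and straightness definitions above can be illustrated with a small radius-limited shortest-path search. This is a sketch under stated assumptions, not the Urban Network Analysis Toolbox implementation: the adjacency-list format, the function names, and the L-shaped toy network are hypothetical, and betweenness is omitted for brevity.

```python
import heapq
import math

def network_distances(adj, source, radius):
    """Shortest road distances from `source` to every node within `radius` (Dijkstra)."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        for v, length in adj[u]:
            nd = d + length
            if nd <= radius and nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def centralities(adj, coords, centroids, i, radius):
    """Reach, closeness, normalized closeness, and straightness of centroid node i."""
    dist = network_distances(adj, i, radius)
    # Only centroid nodes other than i within the radius are counted.
    others = [j for j in centroids if j != i and j in dist]
    reach = len(others)
    total = sum(dist[j] for j in others)
    closeness = 1.0 / total if total > 0 else 0.0
    straightness = sum(
        math.hypot(coords[i][0] - coords[j][0], coords[i][1] - coords[j][1]) / dist[j]
        for j in others
    )
    return {
        "reach": reach,
        "closeness": closeness,
        "closeness_norm": closeness * reach,
        "straightness": straightness,
    }

# Hypothetical L-shaped network: intersections A, B, C and segment centroids c1, c2.
coords = {"A": (0.0, 0.0), "c1": (50.0, 0.0), "B": (100.0, 0.0),
          "c2": (100.0, 50.0), "C": (100.0, 100.0)}
adj = {
    "A": [("c1", 50.0)],
    "c1": [("A", 50.0), ("B", 50.0)],
    "B": [("c1", 50.0), ("c2", 50.0)],
    "c2": [("B", 50.0), ("C", 50.0)],
    "C": [("c2", 50.0)],
}
result = centralities(adj, coords, ["c1", "c2"], "c1", radius=100.0)
```

For centroid c1 with r = 100 m, the only other centroid within reach is c2, at a road distance of 100 m and a Euclidean distance of about 70.7 m, so the straightness term is about 0.71; the bend at B is what pulls it below 1.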

Betweenness is the number of times the target street is traversed when traveling along the shortest paths between the other centroids. It is assumed to represent the popularity of a street for walking around and, therefore, the potential number of people moving from one place to another. Potential offenders may expect that streets with high betweenness provide a high probability of encountering appropriate targets for criminal activities (Kelsay & Haberman, 2020). Closeness is the reciprocal of the sum of the shortest distances from the target centroid to the other centroids. This centrality is interpreted as the ease of moving to other streets, representing the number of escape routes for criminal offenders (Mahfoud et al., 2020; Yamamura et al., 2019). Straightness is the sum of the ratios of the Euclidean distance to the road distance from the target centroid to the other centroids. Straightness is related to the linearity of the street, with sinuous streets favoring crime and straight streets hindering crime owing to the possibility of visual supervision by capable guardians (Davies & Johnson, 2015).

In addition, we used the street length and width, which are classic attributes of a street, as control variables. Street length is defined as a continuous variable (in meters); street width is defined as a categorical variable with three levels (3–5.5 m, 5.5–13 m, and 13 m or more; an approximate road width of 5.5 m corresponds to two lanes, and 13 m corresponds to four lanes) following ArcGIS Geo Suite Road Network 2021 from ESRI Japan Inc. See Additional file 1: Fig. S2 for the computed results of each centrality in our study area.

Surrounding facility indicators

Ohyama & Amemiya (2018) used a negative binomial regression model for risk terrain modeling with the number of TFV incidents as the dependent variable and the proximities of parking lots, convenience stores, department stores/supermarkets, family/fast food restaurants, coffee shop chain stores, parks, and total street length as major independent variables. They confirmed the significant association of the risk of TFV occurrence with proximity to the surrounding facilities. Following this, we used similar proximity measures of the following facilities as control variables in this study: transportation-related facilities (stations, bus stops, parking lots, gas stations), food and beverage service facilities (family/fast food restaurants), shopping facilities (convenience stores, supermarkets), educational facilities (nursery schools/kindergartens, schools (elementary, junior high, and high schools), universities/colleges/vocational schools), downtown facilities (snack bars/pubs/clubs, pachinko parlors), and other facilities (parks/green spaces). For each facility type, we created a dichotomous variable indicating whether a facility of that type was located within a 100-m radius of a street point (no: 0, yes: 1) (see Additional file 1: Table S1). Information on each facility was collected from NAVITIME, a Japanese navigation service from NAVITIME JAPAN Co., Ltd.
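The dichotomized proximity variables can be sketched as below. The example assumes planar coordinates in metres and straight-line (Euclidean) distance; the function name `facility_within_radius`, the facility keys, and the coordinates are hypothetical.

```python
import math

def facility_within_radius(point, facilities, radius_m=100.0):
    """Return 1 if any facility lies within radius_m of the street point, else 0."""
    return int(any(
        math.hypot(point[0] - fx, point[1] - fy) <= radius_m
        for fx, fy in facilities
    ))

# Hypothetical street point and facility locations (planar coordinates in metres).
street_point = (0.0, 0.0)
pois = {
    "convenience_store": [(60.0, 70.0), (300.0, 0.0)],
    "gas_station": [(150.0, 20.0)],
    "park": [],
}
indicators = {kind: facility_within_radius(street_point, locs) for kind, locs in pois.items()}
```

In this toy case only the convenience store indicator equals 1, because one store lies about 92 m from the street point.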

Statistical analysis

The assigned street points of the crime occurrence location were considered positive examples (n = 261), and negative examples were randomly sampled from other street points (twice as large as the positive examples, n = 522). We sampled the negative examples to be separated at least 50 m from other points to avoid overlapping of the same locations.
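One way to draw such negative examples is greedy rejection sampling, sketched below. It assumes that "other points" means both the positive examples and the negatives already drawn, which the text does not state explicitly; the function name `sample_negatives`, the seed, and the toy street are hypothetical.

```python
import math
import random

def sample_negatives(candidates, positives, n_samples, min_sep_m=50.0, seed=0):
    """Randomly pick up to n_samples candidates, each at least min_sep_m from all kept points."""
    rng = random.Random(seed)
    pool = list(candidates)
    rng.shuffle(pool)
    kept = list(positives)  # assumption: negatives must also be separated from positives
    negatives = []
    for pt in pool:
        if len(negatives) == n_samples:
            break
        if all(math.hypot(pt[0] - q[0], pt[1] - q[1]) >= min_sep_m for q in kept):
            kept.append(pt)
            negatives.append(pt)
    return negatives

# Hypothetical 2-km street with candidate points every 10 m and two positive points.
candidates = [(float(x), 0.0) for x in range(0, 2001, 10)]
positives = [(500.0, 0.0), (1200.0, 0.0)]
negatives = sample_negatives(candidates, positives, n_samples=2 * len(positives))
```

The call requests twice as many negatives as positives, matching the 1:2 ratio described above.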

We then fitted a binomial logistic regression model to the combined dataset, including positive and negative examples (n = 783), to predict the probability of TFV occurrence on a street point. We applied the stepwise method based on the Akaike Information Criterion (AIC) to select variables. We considered two models and compared their performances: one without streetscape indicators (Model A) and one with streetscape indicators (Model B). In Model B, the interaction terms between streetscape indices were included to account for the complexity of crime risk regulated by multiple aspects of street environments. Each continuous variable was standardized. Note that each interaction term satisfies the principle of marginality (Venables & Ripley, 1997); that is, the model includes the main effect term of each variable constituting the interaction term.

In logistic regression modeling, the effect of a one-unit change in an explanatory variable on the predicted probability depends on the values of the explanatory variables. Furthermore, in models that include interaction terms, the relationship between an explanatory variable and the target variable cannot be intuitively interpreted using only the estimated coefficients. Therefore, the average marginal effects (AME) were computed to interpret the estimated effects of each streetscape component on the crime occurrence risk. The AME is the marginal effect, i.e., the effect on the predicted value of a unit change in an explanatory variable, averaged over all samples (Leeper, 2017). In this study, we used the margins package of R (version 0.3.26) to calculate the AME.
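To make the idea concrete, the sketch below computes an AME analytically for a logistic model with a single interaction term. It is not the margins package: the coefficient values, variable names, and sample rows are illustrative assumptions rather than the fitted estimates, and the derivative is taken in closed form instead of numerically.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def average_marginal_effect(rows, b0, b_veg, b_bld, b_int):
    """AME of veg when logit(p) = b0 + b_veg*veg + b_bld*bld + b_int*veg*bld.

    For each sample, dp/dveg = p * (1 - p) * (b_veg + b_int * bld);
    the AME is the mean of these per-sample marginal effects.
    """
    effects = []
    for veg, bld in rows:
        p = logistic(b0 + b_veg * veg + b_bld * bld + b_int * veg * bld)
        effects.append(p * (1.0 - p) * (b_veg + b_int * bld))
    return sum(effects) / len(effects)

# Illustrative coefficients only (not the values estimated in the study).
coef = dict(b0=-0.7, b_veg=-0.3, b_bld=0.1, b_int=-0.3)
# Standardized (veg, bld) samples: one subset with few buildings, one with many.
few_buildings = [(-1.0, -2.0), (0.0, -2.0), (1.0, -2.0)]
many_buildings = [(-1.0, 1.0), (0.0, 1.0), (1.0, 1.0)]
ame_few = average_marginal_effect(few_buildings, **coef)
ame_many = average_marginal_effect(many_buildings, **coef)
```

With these illustrative values, the AME of vegetation is positive in the subset with few buildings and negative in the subset with many buildings, which shows how a negative interaction coefficient can flip the sign of the effect.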


Tables 2 and 3 present the results of each binomial logistic regression model. Table 4 presents the confusion matrices of each model. The Area Under the Curve (AUC) of Models A and B was 0.66 and 0.70, and their AIC was 964.1 and 943.7, respectively. Thus, we showed that the model that considered streetscape indicators attained a better performance. Note that the maximum and mean values of the Variance Inflation Factor (VIF) were 3.86 and 1.74 for Model A and 3.78 and 1.77 for Model B, indicating that the effect of multicollinearity was reasonably low for each model.
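For readers less familiar with the two comparison metrics, the following sketch shows how AUC and AIC can be computed from predicted scores and a log-likelihood. The scores and log-likelihood used here are made-up toy numbers, not values from Models A or B, and the function names are hypothetical.

```python
def auc(scores_pos, scores_neg):
    """Probability that a random positive scores higher than a random negative (ties count 0.5)."""
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L); lower is better."""
    return 2 * n_params - 2 * log_likelihood

# Toy predicted probabilities for positive and negative examples (hypothetical).
toy_auc = auc([0.8, 0.6, 0.4], [0.7, 0.3, 0.2, 0.4])
toy_aic = aic(log_likelihood=-100.0, n_params=5)
```

A higher AUC means the model ranks crime locations above non-crime locations more often, while the AIC penalizes additional parameters, so a model with more interaction terms has a lower AIC only if its fit improves enough.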

Table 2 Results of binomial logistic regression without streetscape indicators (Model A)
Table 3 Results of binomial logistic regression with streetscape indicators and their interaction terms (Model B)
Table 4 Confusion matrices and evaluation metrics for the estimated model

Focusing on the main effect variables of Model B that were statistically significant (p < 0.05), the results indicated that the risk of TFV tended to increase on streets near convenience stores [0.64; 95% CI (0.28, 1.00)] and gas stations [1.14; 95% CI (0.35, 1.97)]. In particular, convenience stores have often been interpreted as crime generators or attractors (Brantingham & Brantingham, 1995; Ohyama & Amemiya, 2018). This is because convenience stores are open until late at night, are frequented by many people, and probably have a high concentration of vehicles parked in the vicinity. Proximity to gas stations was also associated with an elevated TFV risk in this study. This is similar to previous studies that reported proximity to gas stations as a risk factor for crimes such as robbery (Bernasco & Block, 2011; Barnum et al., 2017). Street length [− 0.30; 95% CI (− 0.59, − 0.01)] and straightness (100 m) [− 0.19; 95% CI (− 0.37, 0.00)] were negatively associated with TFV risk.

Most of the statistically significant (p < 0.05) variables are interaction terms with streetscape indicators, including (building) [hereinafter, the standardized value of the percentage of a component in a streetscape image is presented in parentheses, such as “(component name)”], (wall), (fence), (sidewalk), (vegetation), and (road).

For example, (vegetation) had a statistically significant interaction effect with (building), (wall), and (sidewalk) [regression coefficients of (vegetation): − 0.32; 95% CI (− 0.65, − 0.01), (vegetation) × (building): − 0.29; 95% CI (− 0.48, − 0.11), (vegetation) × (wall): − 0.34; 95% CI (− 0.62, − 0.12), (vegetation) × (sidewalk): − 0.14; 95% CI (− 0.30, − 0.01)]. Focusing on the interaction term between streetscape indicators, we attempted to interpret (vegetation) × (building) and (vegetation) × (wall). The AME of (vegetation) was negative over the entire range, and as expected, the TFV risk decreased with increasing (vegetation) (Fig. 3a).

Fig. 3

Estimated effects of vegetation on the occurrence risk of theft from vehicle. a Average marginal effect and estimated probability of the occurrence of theft from vehicle for vegetation. b When there is no wall (− 0.37). c When the percentage of wall in a streetscape image equals its third quartile value (0.85)

The effect of (vegetation) on the TFV risk depended on the values of (building) and (wall). For example, when there are no walls [(wall) = − 0.37], the AME of (vegetation) is positive when there are few (building) but negative when there are many (building) (Fig. 3b). However, when there are many walls [(wall) = 0.85], the AME of (vegetation) becomes more negative as (building) increases. This indicates that in a streetscape with a relatively large number of walls and buildings, the amount of vegetation tends to be associated with lower TFV risk. For example, the sample of streetscapes with few (wall) and (building) has a low predicted TFV risk of 0.29 in the case of few (vegetation) (Fig. 4a left), and a high value of 0.83 in the case of much (vegetation) (Fig. 4a right). In a streetscape with more (wall) and (building), the TFV risk becomes as high as 0.56 in the sample with few (vegetation) (Fig. 4b left) and as low as 0.01 in the sample with much (vegetation) (Fig. 4b right).
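As a rough check on this pattern, the sign of the marginal effect can be read from the log-odds slope implied by the reported coefficients. The sketch below assumes that (vegetation) enters Model B only through its main effect and the three reported interaction terms, and that (sidewalk) is held at its mean (0 after standardization); any other vegetation terms retained by the stepwise selection would shift the numbers. The symbols \(V\), \(B\), \(W\), and \(S\) are shorthand introduced here for (vegetation), (building), (wall), and (sidewalk).

$$\frac{\partial\, \mathrm{logit}(p)}{\partial V}=-0.32-0.29\,B-0.34\,W-0.14\,S,$$
$$W=-0.37,\ S=0:\quad -0.32+0.126-0.29\,B=-0.194-0.29\,B,$$
$$W=0.85,\ S=0:\quad -0.32-0.289-0.29\,B=-0.609-0.29\,B.$$

Because \(\partial p/\partial V=p\left(1-p\right)\,\partial\, \mathrm{logit}(p)/\partial V\) and \(p\left(1-p\right)>0\), the sign of the effect on the probability matches the sign of these slopes. Under the stated assumptions, the slope with no walls is positive only when \(B<-0.67\) (few buildings), whereas with many walls it stays negative unless \(B<-2.1\) and becomes more negative as \(B\) increases.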

Fig. 4

Sample streetscape images for demonstrating the relationship between the amount of vegetation and the risk of theft from vehicle (TFV) (Source: Authors’ photos). a Streetscape with no walls and few buildings. b Streetscape with many walls and buildings. c Conceptual diagram of the relationship between streetscape compositions and the TFV risk

As another example, (building) also had statistically significant interaction terms with (vegetation), (road), and (sidewalk) [regression coefficients of (building): 0.03; 95% CI (− 0.27, 0.33), (building) × (vegetation): − 0.29; 95% CI (− 0.48, − 0.11), (building) × (road): 0.24; 95% CI (0.08, 0.41), (building) × (sidewalk): − 0.30; 95% CI (− 0.52, − 0.09)]. The confidence interval for the AME contains zero for the entire range of (building), which makes it unclear whether (building) is positively or negatively related to the TFV risk (Fig. 5a). However, the AME of (building) for the case with few (sidewalk) [(sidewalk) = − 0.74] was found to be significantly positive only for the case with more (road) (Fig. 5b). When there are many (sidewalk) [(sidewalk) = 0.49], the range of (road) for which (building) has a positive AME becomes narrower, and its confidence interval includes zero. This result suggests that in a streetscape with many roads and few sidewalks, a higher occupancy of building components increases the TFV risk. Figure 6 shows samples of streetscape images with few sidewalks and many roads. The TFV risk is as low as 0.14 for the sample with few sidewalks and buildings, whereas the risk value is as high as 0.56 for the sample with few sidewalks and many buildings.

Fig. 5

Estimated effects of building on the occurrence risk of theft from vehicle. a Average marginal effect and estimated probability of the occurrence of theft from vehicle for building. b When the percentage of sidewalk in a streetscape image equals its first quartile value (− 0.74). c When the percentage of sidewalk equals its third quartile value (0.49)

Fig. 6

Sample streetscape images with few sidewalks and many roads. (Source: Authors’ photos). The figure on the left shows a streetscape with a low building component percentage, that on the right shows a streetscape with a high building component percentage


This study considered the use of interaction terms to clarify the complex relationships between geographical environments regulating TFV risk at the place-scale. First, we compared the models with and without streetscape indicators. The results confirmed that the model with the streetscape indicators and their interaction terms provides a better performance: lower AIC, higher F-measure, and higher AUC. This indicates that streetscape indicators contribute to investigating the risk of crime at the place-scale. As expected, significant interaction terms related to streetscape indicators appeared, indicating that streetscape indicators become more meaningful in combination. Among such streetscape indicators, the percentages of vegetation and buildings are included in the significant interaction terms in our statistical modeling result; thus, we focus on discussing their interactions.

The estimated negative relationship between (vegetation) and the risk of crime occurrence is consistent with previous studies (Kuo & Sullivan, 2001; Wolfe & Mennis, 2012). However, our study also clarifies that the effect of (vegetation) on the TFV risk depends on (wall) and (building) in a streetscape image. According to CPTED, vegetation and walls can sometimes interfere with natural surveillance (Crowe, 2000; Reynald, 2009), but what about buildings? We calculated Pearson's correlation coefficients among streetscape indicators and found relatively strong negative correlations between (building) and (road) (r = − 0.65) and between (building) and (sky) (r = − 0.71) (see Additional file 1: Table S3). This result indicates that buildings may have a blocking effect on visibility and that higher percentages of walls and buildings in images are related to lower visibility of the streetscape in our study area. Although we would expect to have more “eyes on the street” (Jacobs, 1961) enhancing natural surveillance, streetscapes with many walls and buildings may result in limited visibility, attenuating natural surveillance. In such an environment, the presence of vegetation may create a sense of managed space, enhancing the territoriality of the neighborhood and making offenders aware of the presence of capable guardianship, resulting in reduced crime risk (Fig. 4b and c). However, in an environment with few buildings and walls, excessive vegetation may limit visibility, reducing natural surveillance and increasing crime risk (Fig. 4a and c). Thus, the interaction terms allow us to integrate the opposing arguments about whether vegetation increases or decreases the risk of crime. Note that we have offered only one possible interpretation of the building components in streetscape images. Buildings having many windows may engender natural surveillance by residents (Newman, 1972). Therefore, our interpretations should be verified by future studies assessing the relationship between visibility and the components of streetscape images.

(Building) alone does not appear to be associated with the risk of crime occurrence. However, our results show a more complex association with the TFV risk through the interaction terms of (building), (sidewalk), and (road). In street environments with few sidewalks, the risk of crime is higher where there are more buildings. A street environment with few sidewalks is considered a street not designed for large numbers of pedestrians. On the other hand, in environments with many sidewalks, one no longer finds a relationship between buildings and crime risk. If, as in the previous discussion, (building) is related to visibility, then natural surveillance may decrease with more buildings in environments where few pedestrians are present (Fig. 6).

This study shows for the first time the importance of using interactions between streetscape indicators for comprehending the place-scale risk of crime occurrence. Nagata et al. (2020) successfully analyzed streetscape walkability in detail using streetscape indicators obtained by semantic segmentation of GSV and their interaction terms. Similarly, in environmental criminology, the interaction terms between streetscape indicators may provide clues to understanding the visible landscape compositions associated with the risk of crime occurrence at the place-scale.

This study has several limitations. First, GSV images show streetscapes as seen from the roadway because they are acquired from cameras mounted on cars. However, because TFV offenders approach their targets on foot, the streetscape should, strictly speaking, be analyzed as seen from the sidewalk. Second, our analysis does not account for differences in streetscapes by year or season because the year and month in which the GSV images were taken are not uniform. Further studies are needed to examine how these limitations of GSV affect the analysis results. In addition, the streetscape indicators defined in this study capture only one limited feature, the occupancy rate of each class obtained by semantic segmentation. In actual streetscapes, it may be necessary to handle additional information, such as the relative location of components (e.g., buildings and vegetation being adjacent to each other), depth, and brightness. Furthermore, we used the 19 general classes defined by the Cityscapes dataset as streetscape indicators, which is still insufficient for a detailed analysis following the theories of environmental criminology.
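To make the occupancy-rate limitation concrete, the sketch below computes the share of pixels assigned to each class in a segmented image. It is a minimal illustration under stated assumptions rather than the pipeline used in this study: the class IDs follow the commonly used Cityscapes train-ID convention (assumed here and worth checking against the label definitions of the model in use), and the small mask is a toy stand-in for a real segmentation output.

```python
from collections import Counter

# Assumed Cityscapes train IDs for a few of the 19 classes.
# Verify against the label definitions of the segmentation model in use.
CLASS_IDS = {
    "road": 0,
    "sidewalk": 1,
    "building": 2,
    "wall": 3,
    "vegetation": 8,
    "sky": 10,
}


def occupancy_rates(label_mask, class_ids=CLASS_IDS):
    """Return the share of pixels assigned to each class in a 2-D label mask."""
    pixels = [label for row in label_mask for label in row]
    counts = Counter(pixels)
    return {name: counts[cid] / len(pixels) for name, cid in class_ids.items()}


# Toy 4 x 5 mask standing in for a segmented streetscape image.
mask = [
    [10, 10, 10, 10, 10],
    [2, 2, 8, 8, 10],
    [2, 2, 8, 3, 3],
    [0, 0, 0, 1, 1],
]

rates = occupancy_rates(mask)
for name, rate in rates.items():
    print(f"{name}: {rate:.2f}")
```

Because the function only counts pixels, any rearrangement of the same labels yields identical rates. This is exactly the information about location and adjacency that an occupancy-based indicator cannot express.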

Future studies should consider more theoretically motivated components of streetscapes associated with the risk of crime occurrence, such as security cameras, windows of buildings, and graffiti. In the future, it will be necessary to consider how to handle such information and assess the risk value using more realistic street environmental indicators.

Prior work includes methods to obtain feature vectors from the middle layer of a convolutional neural network trained on an image classification task (Kang & Kang, 2017; Zhang et al., 2020) and object detection to identify the location and number of objects (Dakin et al., 2020). We can select or combine other such methods according to the study purpose. In addition, instance segmentation and panoptic segmentation, which are extensions of semantic segmentation, would allow us to acquire more features. New technologies, such as monocular depth estimation techniques, are also being developed. It is anticipated that various features of streetscapes will be obtained using automated approaches and used in crime research in the future.


This study assessed the risk of TFV crime using place-scale geographic environmental indicators in a Japanese city. The indicators, composed of streetscape indicators obtained from streetscape images with deep learning, clarified the relationships between place-scale environments and the risk of crime occurrence on the street. Some of the results were consistent with previous studies and may be explained by the classic theories of environmental criminology. Through the interaction terms of the environmental indicators, we captured the complex and interdependent relationships between environmental factors and the crime occurrence risk at the place-scale; for example, we observed that the association of vegetation with the risk of TFV can be either positive or negative depending on the number of buildings and walls. Further research is needed to unravel the complexity of the contribution of streetscape composition to crime risk in different geographical and criminological contexts.

Availability of data and materials

The data used in this study are restricted from public disclosure.



AME: Average marginal effect
CPTED: Crime prevention through environmental design
GSV: Google Street View
TFV: Theft from vehicle
AIC: Akaike information criterion
AUC: Area under the curve
POI: Point of interest
VIF: Variance inflation factor




We thank the Kyoto Prefectural Police Department for providing the dataset and important advice on the analysis in this study.


No external funding was used to support this work.

Author information

Authors and Affiliations



HMA designed the study, analyzed the data, discussed the results, and wrote the manuscript. TN advised on the design of the study and interpretation of the analysis results, and contributed to the revision of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Hiroki M. Adachi.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Fig. S1. Illustration of street network. Fig. S2. Distributions of street network centralities in the study area. Fig. S3. Pearson correlation coefficients between all streetscape indicators. Table S1. Relationship between sample points on the street and proximity to surrounding facilities.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article


Cite this article

Adachi, H.M., Nakaya, T. Analysis of the risk of theft from vehicle crime in Kyoto, Japan using environmental indicators of streetscapes. Crime Sci 11, 13 (2022).
