Table 5 Topic stability comparison

From: Unsupervised identification of crime problems from police free-text data

| Topic | Topic keywords | Median % overlap with closest topic |
|---|---|---|
| 1 | Make, enter, remove, unseen, approach, insecure, direction | 64.3 |
| 2 | Police, foot, male, witness, disturbed, occupant, hears | 42.9 |
| 3 | Door, front, makes, enters, insecure, bed, approaches | 42.9 |
| 4 | Door, force, open, bodily, bodily_force, wooden, garage | 71.4 |
| 5 | Lock, handle, euro, entry, snap, profile, attack | 85.7 |
| 6 | Area, premises, attacked, residential, dwelling | 71.4 |
| 7 | Attacked, premises, dwelling, quiet, detached, offender, sac | 42.9 |
| 8 | Address, home, key, leaves, returns, whilst, aggrieved | 57.1 |
| 9 | Vehicle, keys, house, car, locked, secure, driveway | 71.4 |
| 10 | Entry, gain, make, direction, unseen, approach, times | 71.4 |
| 11 | House, terraced, front, attacked, back, mid, terrace | 85.7 |
| 12 | Damage, entry, causing, gained, implement, jemmy, frame | 57.1 |
| 13 | Made, removed, entry, times, direction, means, approached | 57.1 |
| 14 | Room, living, kitchen, living_room, bedroom, left, front | 85.7 |
| 15 | Detached, semi, semi_detached, side, house, attacked, residential | 71.4 |
| 16 | Floor, flat, ground, escape, good, good_escape, premises | 57.1 |
| 17 | Door, locus, front, times, dates, leave, date | 42.9 |
| 18 | Window, person, open, kitchen, rear, climb, bedroom | 71.4 |
| 19 | Search, untidy, items, egress, tidy, rooms, removing | 85.7 |
| 20 | Rear, window, smash, glass, alarm, large, reach | 57.1 |
| 21 | Rear, garden, patio, doors, gate, access, side | 71.4 |

  1. Using the working model as a reference, 9 runs of LDA were performed, and topic stability was assessed by counting the topic keywords shared with the most similar topic in each run. This count was expressed as a percentage, and the median over the 9 runs was taken to indicate how stable a topic was, that is, how often the words of the topic occurred together.
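
The stability measure described in the note can be illustrated with a minimal Python sketch. It assumes each topic is represented by its list of top keywords and that "most similar" means the topic in a rerun sharing the most keywords with the reference topic; the source does not spell out the matching rule, so this is an assumption. The function names and the three toy reruns are hypothetical and are not data from the paper; only the reference keywords are taken from Topic 5 in the table.

```python
from statistics import median


def overlap_percent(reference, candidate):
    """Percentage of reference keywords that also appear in candidate."""
    ref = set(reference)
    return 100.0 * len(ref & set(candidate)) / len(ref)


def closest_overlap(reference, run_topics):
    """Overlap with the most similar topic in one rerun.

    Assumption: 'most similar' is the topic sharing the most keywords.
    """
    return max(overlap_percent(reference, topic) for topic in run_topics)


def median_stability(reference, runs):
    """Median of the closest-topic overlap across all reruns."""
    return median(closest_overlap(reference, topics) for topics in runs)


# Reference keywords copied from Topic 5 of the table.
reference = ["lock", "handle", "euro", "entry", "snap", "profile", "attack"]

# Hypothetical reruns (illustrative only, not results from the paper).
toy_runs = [
    [
        ["lock", "handle", "euro", "entry", "snap", "profile", "door"],
        ["rear", "garden", "patio", "doors", "gate", "access", "side"],
    ],
    [
        ["lock", "handle", "euro", "cylinder", "snap", "door", "front"],
        ["window", "person", "open", "kitchen", "rear", "climb", "bedroom"],
    ],
    [
        ["lock", "handle", "euro", "entry", "snap", "profile", "attack"],
    ],
]

stability = median_stability(reference, toy_runs)
print(round(stability, 1))  # 85.7
```

With seven keywords per topic, each shared keyword contributes 1/7 (about 14.3 percentage points), which is why the table values fall on multiples of that step.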