CHICAGO CRIMES ANALYSIS

Predicting crimes in Chicago and creating an alert system. We're using the Chicago Crime Dataset from 2012-2017 for this analysis and predicting crime counts with FB Prophet. There is also a crime alert system that alerts users to nearby crimes based on a given user location.
(Team Project at UC Irvine)

For my other articles, visit my BLOG

Contact me on Twitter or LinkedIn

%load_ext watermark

%watermark -v -u -n -t -z -a 'Samira Kumar' -p numpy,pandas,scipy,matplotlib,sklearn,fbprophet,bokeh,geopy,geoviews,holoviews

Samira Kumar 
last updated: Fri Dec 14 2018 13:19:47 PST

CPython 2.7.15
IPython 5.8.0

numpy 1.15.3
pandas 0.23.4
scipy 1.1.0
matplotlib 2.2.3
sklearn 0.19.2
fbprophet 0.3
bokeh 1.0.2
geopy 1.17.0
geoviews 1.5.1
holoviews 1.10.4

INDEX:

1: VISUALISATIONS
2: CRIME ALERT
3: FB PROPHET TO PREDICT CRIMES IN CHICAGO

VISUALISATIONS

Dataset: https://www.kaggle.com/umeshnarayanappa/exploring-chicago-crimes-2012-2016/data

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import geopandas as gpd
import holoviews as hv
import geoviews as gv
import geoviews.tile_sources as gts
from bokeh.io import output_file, save, show

# NA values and outliers (some locations were outside Chicago) were dropped beforehand to produce this cleaned dataset
df = pd.read_csv('cleaned_file.csv')
df.head()

Visualisation of crimes in each police beat and ward

Shapefiles can be downloaded from here: https://data.cityofchicago.org/browse?tags=shapefiles

beats_df=df.groupby(['Beat']).size().reset_index(name='Total_Crimes')
beats_df=beats_df.rename(columns={'Beat':'beat_num'})
beats_df.head()

%%opts Polygons (cmap='YlOrRd')

hv.extension('bokeh')
geometries = gpd.read_file('geo_export_3b3b25c2-a600-40c3-a663-2f7ad8dc2b9c.shp')

geometries['beat_num']=geometries['beat_num'].apply(int)
gdf = gpd.GeoDataFrame(pd.merge(beats_df, geometries))

plot_opts = dict(tools=['hover'], width=750, height=700, color_index='Total_Crimes',
                 colorbar=True, toolbar='above', xaxis=None, yaxis=None)
plot=gts.CartoLight *gv.Polygons(gdf, vdims=['beat_num', 'Total_Crimes'], label='Chicago Crime Police Beat Map').opts(plot=plot_opts,style=dict(alpha=0.7))
# gv.renderer('bokeh').save(plot, 'beat_crimes')
plot

%%opts Polygons (cmap='YlOrRd')

ward_df=df.groupby(['Ward']).size().reset_index(name='Total_Crimes')
ward_df=ward_df.rename(columns={'Ward':'ward'})

hv.extension('bokeh')
wards_shape = gpd.read_file('Boundaries - Wards (2015-)/geo_export_7fe30167-754d-4ed5-947f-c515456d9762.shp')

wards_shape['ward']=wards_shape['ward'].apply(int)
gdf = gpd.GeoDataFrame(pd.merge(ward_df, wards_shape))

plot_opts = dict(tools=['hover'], width=750, height=700, color_index='Total_Crimes',
                 colorbar=True, toolbar='above', xaxis=None, yaxis=None)
plot=gts.CartoLight *gv.Polygons(gdf, vdims=['ward', 'Total_Crimes'], label='Chicago Crime Ward Map').opts(plot=plot_opts,style=dict(alpha=0.7))
# gv.renderer('bokeh').save(plot, 'ward_crimes')
plot

CRIME ALERT

Creating the cluster

To create a crime alert, we cluster a sample of 500000 crimes into 200 clusters with K-means on longitude and latitude. The number of clusters can be any value; more clusters give smaller regions and therefore more localized alerts, at the cost of longer fitting time.

from sklearn.cluster import KMeans
from sklearn import metrics
from sklearn.metrics import pairwise_distances
data=df.sample(500000).copy()
ml = KMeans(n_clusters=200, init='k-means++')
ml.fit(data[['Longitude', 'Latitude']])

KMeans(algorithm='auto', copy_x=True, init='k-means++', max_iter=300,
    n_clusters=200, n_init=10, n_jobs=1, precompute_distances='auto',
    random_state=None, tol=0.0001, verbose=0)

cluster = ml.cluster_centers_
cluster[:10]

array([[-87.58564446,  41.76842148],
       [-87.70954061,  41.88284319],
       [-87.66222958,  41.76711367],
       [-87.66692053,  41.9410608 ],
       [-87.78625868,  41.93248014],
       [-87.62523552,  41.85573192],
       [-87.62398212,  41.72210565],
       [-87.72503318,  41.80416549],
       [-87.76813019,  41.8947028 ],
       [-87.68737204,  42.01373099]])
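
The trade-off behind the choice of 200 clusters can be illustrated on a small scale. The following is only a sketch: it is a pure-Python version of Lloyd's algorithm standing in for scikit-learn's KMeans, and the toy points, the function names and the evenly spaced initialization are all assumptions made for illustration, not part of the analysis above. It shows the within-cluster sum of squared distances (what scikit-learn calls inertia) shrinking as the number of clusters grows, which is why more clusters produce tighter alert regions.

```python
# Pure-Python Lloyd's algorithm on hypothetical toy points (not the Chicago data).

def squared_distance(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans(points, k, iterations=20):
    """Return (centers, inertia) after a fixed number of Lloyd iterations."""
    # Deterministic start: k evenly spaced points from the input list.
    step = len(points) // k
    centers = [points[i * step] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins the group of its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centers[i]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group
        # (an empty group keeps its previous center).
        new_centers = []
        for group, old_center in zip(groups, centers):
            if group:
                mean_x = sum(x for x, _ in group) / float(len(group))
                mean_y = sum(y for _, y in group) / float(len(group))
                new_centers.append((mean_x, mean_y))
            else:
                new_centers.append(old_center)
        centers = new_centers
    inertia = sum(min(squared_distance(p, c) for c in centers) for p in points)
    return centers, inertia


# Four tight, well-separated blobs of five points each.
offsets = [(0.0, 0.0), (0.5, 0.0), (-0.5, 0.0), (0.0, 0.5), (0.0, -0.5)]
blob_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
toy_points = [(cx + dx, cy + dy) for cx, cy in blob_centers for dx, dy in offsets]

_, inertia_k2 = kmeans(toy_points, k=2)
_, inertia_k4 = kmeans(toy_points, k=4)
print(inertia_k2, inertia_k4)
```

With two clusters each center has to cover two blobs, so the inertia is large; with four clusters every blob gets its own center and the inertia drops sharply. On real data the drop flattens out as clusters are added, so the number of clusters remains a judgment call.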

Total crimes for each cluster

X = data[['Longitude','Latitude']].values
predictions = ml.fit_predict(X)
kclustered = pd.concat([data.reset_index(), 
                       pd.DataFrame({'Cluster':predictions})], 
                      axis=1)
kclustered.drop('index', axis=1, inplace=True)
centers = ml.cluster_centers_
kcenters=pd.DataFrame(centers)
kcenters=kcenters.rename(columns={0:'Longitude',1:'Latitude'})
kcenters['Total Crimes']=kclustered.groupby('Cluster')['ID'].count().reset_index()['ID']
kcenters

Using a geocoder, we find the address of each cluster center. This is only used for labelling the cluster centers on the Folium map

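from geopy.geocoders import Nominatim
# Nominatim's usage policy asks for an identifying user agent; the name here is a placeholder
geolocator = Nominatim(user_agent='chicago_crime_alert', timeout=3)

address = []
for index, row in kcenters.iterrows():
    # reverse() expects (latitude, longitude)
    rev_location = geolocator.reverse((row.Latitude, row.Longitude))
    address.append(rev_location.address)
kcenters['Address'] = address
kcenters.head()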

Plotting the cluster centers on Folium

import folium

m = folium.Map(location=[41.8781,-87.64], zoom_start=11)

for _, center in kcenters.iterrows():
    folium.Circle(
        location=[center['Latitude'], center['Longitude']],
        popup=(
            "<b>Location:</b> {loc}<br><br>"
            "<b>Crimes: </b> {crime}<br>"
        ).format(loc=str(center['Address']), crime=str(center['Total Crimes'])),
        # circle radius (in meters) scales with the cluster's crime count
        radius=float(center['Total Crimes']) / 15,
        color='red',
        fill=True,
        fill_color='red',
        fill_opacity=0.5
    ).add_to(m)
folium.TileLayer('cartodbpositron').add_to(m)
m.save('clustered_200.html')
m

data['cluster'] = ml.predict(data[['Longitude','Latitude']])
data[['ID','Latitude','Longitude','Block','cluster']].sample(10)

We predict on the same data to find the cluster of each crime. The clusters and their centers can be drawn as a Voronoi plot, as below

from scipy.spatial import Voronoi

def voronoi_polygons_2d(vor, radius=None):
    """
    Reconstruct infinite voronoi regions in a 2D diagram to finite
    regions.

    Input_args:
    vor : Voronoi
        Input diagram
    radius : float, optional
        Distance to 'points at infinity'.

    :returns:
    regions : list of tuples
        Indices of vertices in each revised Voronoi regions.
    vertices : list of tuples
        Coordinates for revised Voronoi vertices. Same as coordinates
        of input vertices, with 'points at infinity' appended to the
        end.

    """
    if vor.points.shape[1] != 2:
        raise ValueError("Requires 2D input")

    new_regions = []
    new_vertices = vor.vertices.tolist()

    center = vor.points.mean(axis=0)
    if radius is None:
        radius = vor.points.ptp().max()*2

    # Construct a map containing all ridges for a given point
    all_ridges = {}
    for (p1, p2), (v1, v2) in zip(vor.ridge_points, vor.ridge_vertices):
        all_ridges.setdefault(p1, []).append((p2, v1, v2))
        all_ridges.setdefault(p2, []).append((p1, v1, v2))

    # Reconstruct infinite regions
    for p1, region in enumerate(vor.point_region):
        vertices = vor.regions[region]

        if all([v >= 0 for v in vertices]):
            # finite region
            new_regions.append(vertices)
            continue

        # reconstruct a non-finite region
        ridges = all_ridges[p1]
        new_region = [v for v in vertices if v >= 0]

        for p2, v1, v2 in ridges:
            if v2 < 0:
                v1, v2 = v2, v1
            if v1 >= 0:
                # finite ridge: already in the region
                continue

            # Compute the missing endpoint of an infinite ridge

            t = vor.points[p2] - vor.points[p1] # tangent
            t /= np.linalg.norm(t)
            n = np.array([-t[1], t[0]])  # normal

            midpoint = vor.points[[p1, p2]].mean(axis=0)
            direction = np.sign(np.dot(midpoint - center, n)) * n
            far_point = vor.vertices[v2] + direction * radius

            new_region.append(len(new_vertices))
            new_vertices.append(far_point.tolist())

        # sort region counterclockwise
        vs = np.asarray([new_vertices[v] for v in new_region])
        c = vs.mean(axis=0)
        angles = np.arctan2(vs[:,1] - c[1], vs[:,0] - c[0])
        new_region = np.array(new_region)[np.argsort(angles)]

        # finish
        new_regions.append(new_region.tolist())

    return new_regions, np.asarray(new_vertices)

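# use the fitted cluster centers as the Voronoi input points
# (fit_predict refit the model above, so the earlier `cluster` array is stale)
points = ml.cluster_centers_

# compute Voronoi tessellation
vor = Voronoi(points)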

# compute regions
regions, vertices = voronoi_polygons_2d(vor)

# prepare figure
plt.style.use('seaborn-white')
fig = plt.figure()
fig.set_size_inches(20,20)

#geomap
# centroids
plt.plot(points[:,0], points[:,1], 'wo',markersize=10)

# colorize
for region in regions:
    polygon = vertices[region]
    plt.fill(*zip(*polygon), alpha=0.4)
    
plt.scatter(data['Longitude'],data['Latitude'],c='red')
plt.xlim(vor.min_bound[0] - 0.1, vor.max_bound[0] + 0.1)
plt.ylim(vor.min_bound[1] - 0.1, vor.max_bound[1] + 0.1)
plt.show()

global_df = data.groupby(['cluster', 'Block']).size().reset_index()
global_df.columns = ['cluster', 'Block', 'count']
global_df.head()

For each cluster, we find the block with the highest number of crimes. This gives 200 blocks, one high-crime block per cluster.

# Sort by count (descending) and drop duplicate clusters, which keeps only the highest-crime block for each cluster
topcrimes_df = global_df.sort_values('count', ascending=False).drop_duplicates(['cluster'])
topcrimes_df.to_csv('topcrimes_df.csv')
topcrimes_df.head(20)
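
The "highest-crime block per cluster" step can also be written without pandas, which may make the logic easier to follow. This is a minimal sketch on a handful of made-up (cluster, block) pairs; the data and the variable names are assumptions for illustration, and the block names are only borrowed from the outputs in this notebook.

```python
from collections import Counter

# Hypothetical (cluster, block) pairs, one entry per recorded crime.
crimes = [
    (0, '001XX N STATE ST'), (0, '001XX N STATE ST'), (0, '002XX E ERIE ST'),
    (1, '076XX S CICERO AVE'), (1, '076XX S CICERO AVE'), (1, '076XX S CICERO AVE'),
    (1, '046XX W NORTH AVE'),
]

# Counterpart of groupby(['cluster', 'Block']).size(): crimes per (cluster, block) pair.
pair_counts = Counter(crimes)

# Counterpart of sort + drop_duplicates(['cluster']): keep the largest count per cluster.
top_block = {}
for (cluster_id, block), count in pair_counts.items():
    if cluster_id not in top_block or count > top_block[cluster_id][1]:
        top_block[cluster_id] = (block, count)

print(top_block)
```

Each cluster ends up with exactly one entry, which is the property the alert function relies on when it looks up a block by cluster number.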

Inspired by this post: https://github.com/modqhx/geolocation_ml_Analysis/blob/master/.ipynb_checkpoints/Recommendation_Spatial_ML-checkpoint.ipynb

Creating an alert for the nearby high-crime block. For each new longitude and latitude, the cluster is predicted and the highest-crime block in that cluster is returned. A Google Maps location for the block is also given as an HTML link.

from IPython.core.display import display, HTML
import requests, json

def get_crime_url(location):
    # location is (longitude, latitude); Google Maps expects latitude first
    url = "http://maps.google.com/maps?q={},{}".format(location[1], location[0])
    return url


def crime_alert_closest(lon, lat):
    cluster = ml.predict(np.array([lon, lat]).reshape(1, -1))[0]
    # topcrimes_df holds one row per cluster: its highest-crime block
    top_row = topcrimes_df[topcrimes_df['cluster'] == cluster].iloc[0]
    crime_block = str(top_row['Block'])
    count = top_row['count']
    # approximate the block's position by the mean of its recorded crime coordinates
    location = df[df['Block'] == crime_block][['Longitude', 'Latitude']].mean().values
    url = get_crime_url(location)
    crime_html = '<a href="{}">{}</a>'.format(url, crime_block)
    msg = "The highest-crime block in your cluster is {} with {} crimes recorded in the sample".format(crime_html, count)
    return display(HTML(msg))

crime_alert_closest( -87.6, 41.8)

crime_alert_closest(-88.627, 39.7)

crime_alert_closest(-87.62923881,  41.88393292)
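
What the cluster prediction step does for a new coordinate can be sketched in a few lines: K-means assigns a point to whichever center is nearest in squared Euclidean distance. The sketch below is an illustration only. The helper name is hypothetical, and it uses just the first three centers from the `cluster[:10]` output above, so the indices it returns are not the cluster numbers of the fitted 200-cluster model.

```python
def nearest_center(point, centers):
    """Index of the center with the smallest squared Euclidean distance to point."""
    def squared_distance(center):
        return (point[0] - center[0]) ** 2 + (point[1] - center[1]) ** 2
    return min(range(len(centers)), key=lambda i: squared_distance(centers[i]))


# First three centers from the cluster[:10] output, as (longitude, latitude).
centers = [
    (-87.58564446, 41.76842148),
    (-87.70954061, 41.88284319),
    (-87.66222958, 41.76711367),
]

inside_chicago = nearest_center((-87.6, 41.8), centers)
far_from_chicago = nearest_center((-88.627, 39.7), centers)
print(inside_chicago, far_from_chicago)
```

Note that the second point, which lies roughly two degrees of latitude south of the city, still receives a cluster index. Nearest-center assignment always returns some cluster, so an alert for a location far outside Chicago should be read with that in mind.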

Creating an alert for a nearby crime. If the crime and the user's location fall in the same cluster, an alert is raised. A geocoder looks up the address of the crime, which is sent to the user as HTML

from geopy.geocoders import Nominatim


def crime_alert(crime_lon, crime_lat, person_lon, person_lat):
    cluster_crime = ml.predict(np.array([crime_lon, crime_lat]).reshape(1, -1))[0]
    cluster_person = ml.predict(np.array([person_lon, person_lat]).reshape(1, -1))[0]

    if cluster_crime != cluster_person:
        return display(HTML('No crimes around you now'))

    # Reverse-geocode only when an alert is needed; reverse() expects (latitude, longitude).
    # The user agent name is a placeholder.
    geolocator = Nominatim(user_agent='chicago_crime_alert')
    address = geolocator.reverse((crime_lat, crime_lon)).address

    # get_crime_url (defined earlier) takes (longitude, latitude)
    url = get_crime_url((crime_lon, crime_lat))
    crime_html = '<a href="{}">{}</a>'.format(url, address)
    msg = "There is a crime near your location, at {}".format(crime_html)
    return display(HTML(msg))

crime_alert(-87.627877,  41.931080,-85.62923881,  41.88393292)

crime_alert(-87.75,  41.88393292,-87.739,  41.892)

crime_alert(-87.789,  41.97,-87.79,  41.975)
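
Sharing a cluster says nothing about how far apart the crime and the person actually are, since cluster regions vary in size and every coordinate is assigned to some cluster. One possible refinement, sketched here as an assumption rather than as part of the alert system above, is to add a great-circle distance cap using the haversine formula. The function names and the 1 km default are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometers between two (longitude, latitude) points."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_alert_range(crime_lon, crime_lat, person_lon, person_lat, max_km=1.0):
    """True when the crime lies within max_km of the person (max_km is a made-up default)."""
    return haversine_km(crime_lon, crime_lat, person_lon, person_lat) <= max_km


# Coordinates taken from the first and third crime_alert calls above.
distance = haversine_km(-87.627877, 41.931080, -85.62923881, 41.88393292)
close_pair = within_alert_range(-87.789, 41.97, -87.79, 41.975)
print(round(distance, 1), close_pair)
```

Such a check could be combined with the cluster comparison, for example by alerting only when both conditions hold, so that the cluster test stays a cheap first filter.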

FB PROPHET TO PREDICT CRIMES IN CHICAGO

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

clean_df = pd.read_csv('cleaned_file.csv')
df_date_group = clean_df.groupby('dates').size().reset_index(name='Freq')
df_date_group['dates'] = pd.to_datetime(df_date_group['dates'])

# Keep only 2012-2016 for training, leaving 2017 onward to be forecast
df_date_group = df_date_group[df_date_group['dates'] < '2017-01-01']
df_date_group.tail()

Building the model and visualizing the predicted crimes for 2017-2018

from fbprophet import Prophet
crime_model = Prophet(interval_width=0.95)
crime_data = df_date_group.rename(columns={'dates': 'ds', 'Freq': 'y'})
crime_model.fit(crime_data)

crime_forecast = crime_model.make_future_dataframe(periods=365, freq='D')
crime_forecast = crime_model.predict(crime_forecast)
# Pass an axis to Prophet; calling plt.figure() first only creates an empty figure
fig, ax = plt.subplots(figsize=(20, 6))
crime_model.plot(crime_forecast, ax=ax, xlabel='Date', ylabel='Crimes')
ax.set_title('Crimes');

INFO:fbprophet.forecaster:Disabling daily seasonality. Run prophet with daily_seasonality=True to override this.

Plotting the trend

crime_model.plot_components(crime_forecast);

Actual vs Predicted Crimes 2017-2018

Actual crimes in 2017 - 267817

crime_forecast[crime_forecast['ds']>='2017-01-01']['yhat'].sum()

262607.9291577991

Predicted Crimes in 2017 - 262608
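
To put the two 2017 totals side by side, the gap can be expressed as a percentage of the actual count. The short calculation below uses only the two figures reported above; the variable names are illustrative.

```python
actual_2017 = 267817                 # actual crimes in 2017, as stated above
predicted_2017 = 262607.9291577991   # sum of yhat for 2017 from the forecast

# Positive error means the model under-predicted.
error = actual_2017 - predicted_2017
percent_error = 100.0 * error / actual_2017
print(error, percent_error)
```

The forecast undershoots the actual 2017 total by a little over 5200 crimes, which is just under 2% of the actual count.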

Plotting the above graph in Bokeh to make it interactive

from bokeh.plotting import figure, show, output_notebook
from bokeh.models import ColumnDataSource, RangeTool
from bokeh.layouts import column

output_notebook()

source_prophet = ColumnDataSource(data=dict(date=crime_forecast.ds,y=crime_forecast.yhat))
source_original = ColumnDataSource(data=dict(date=df_date_group.dates.dt.date,y=df_date_group.Freq))

prophet_forecast = figure(title='Total Crimes over the years (Predicted value in Red)',width=950, height=450, 
            tools='save,wheel_zoom,pan,reset,box_zoom',x_axis_type='datetime',sizing_mode="scale_width",x_range=(crime_forecast.ds.min(), crime_forecast.ds.max()))


prophet_forecast.scatter('date','y',source=source_original,line_width=2,
                         color='black',fill_alpha=0.5,size=2,legend='Actual Crimes')

prophet_forecast.line(crime_forecast.iloc[0:1827].ds,crime_forecast.iloc[0:1827].yhat,
                      line_width=2,color='blue',legend='Actual Value Trend')
prophet_forecast.line(crime_forecast.iloc[-365:].ds,crime_forecast.iloc[-365:].yhat,
                      line_width=2,color='red',line_alpha=0.3,legend='Predicted Value')
prophet_forecast.line(crime_forecast.ds,crime_forecast.yhat_lower,
                      line_width=2,color='lightblue',line_alpha=0.3,legend='95% Confidence Interval')
prophet_forecast.line(crime_forecast.ds,crime_forecast.yhat_upper,
                      line_width=2,color='lightblue',line_alpha=0.3)

prophet_forecast.xaxis.axis_label = "Year"
prophet_forecast.yaxis.axis_label = "Total Crimes"
prophet_forecast.xgrid.grid_line_color = None
prophet_forecast.ygrid.grid_line_color = None

prophet_train = figure(title="Date Selection",
                plot_height=100, plot_width=950, y_range=prophet_forecast.y_range,
                x_axis_type="datetime", y_axis_type=None,
                tools="", toolbar_location=None, background_fill_color="#efefef")


range_tool_prophet = RangeTool(x_range=prophet_forecast.x_range)
range_tool_prophet.overlay.fill_color = "navy"
range_tool_prophet.overlay.fill_alpha = 0.2

prophet_train.scatter('date', 'y', source=source_original,size=1)
prophet_train.line('date', 'y', source=source_prophet)


prophet_train.ygrid.grid_line_color = None
prophet_train.add_tools(range_tool_prophet)
prophet_train.toolbar.active_multi = range_tool_prophet

show(column(prophet_forecast,prophet_train))
# output_file("FBProphet_Output.html", title="FBProphet Output")
# save((prophet_forecast,prophet_train))

Predicting the crimes from Jan-01-2017 to Dec-02-2018

from fbprophet import Prophet
crime_model = Prophet(interval_width=0.95)
crime_data = df_date_group.rename(columns={'dates': 'ds', 'Freq': 'y'})
crime_model.fit(crime_data)

crime_forecast = crime_model.make_future_dataframe(periods=730, freq='D')
crime_forecast = crime_model.predict(crime_forecast)
# Pass an axis to Prophet; calling plt.figure() first only creates an empty figure
fig, ax = plt.subplots(figsize=(20, 6))
crime_model.plot(crime_forecast, ax=ax, xlabel='Date', ylabel='Crimes')
ax.set_title('Crimes');

INFO:fbprophet.forecaster:Disabling daily seasonality. Run prophet with daily_seasonality=True to override this.


crime_forecast[(crime_forecast['ds']>='2017-01-01')&(crime_forecast['ds']<='2018-12-02')]['yhat'].sum()

505532.35361045913
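
The forecast horizon and the summed window can be sanity-checked with plain date arithmetic. This sketch assumes, as the training table suggests, that the last training day is 2016-12-31; the variable names are illustrative. It confirms how far 730 daily periods reach and how many days the Jan-01-2017 to Dec-02-2018 filter covers.

```python
from datetime import date, timedelta

last_training_day = date(2016, 12, 31)                     # assumed end of the training data
horizon_end = last_training_day + timedelta(days=730)      # reach of periods=730, freq='D'

window_start = date(2017, 1, 1)
window_end = date(2018, 12, 2)
days_in_window = (window_end - window_start).days + 1      # both end dates included

predicted_total = 505532.35361045913                       # summed yhat reported above
mean_per_day = predicted_total / days_in_window
print(horizon_end, days_in_window, mean_per_day)
```

The 730-day horizon runs to the end of 2018, so the date filter is what trims the sum to the 701 days up to Dec 2. The resulting average of roughly 721 predicted crimes per day gives a quick plausibility check against the daily counts in the training data.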

	Unnamed: 0	ID	Case Number	Date	Block	IUCR	Primary Type	Description	Location Description	Arrest	...	Updated On	Latitude	Longitude	Location	time_crime	time_hour	minutes	dates	day_of_week
0	3	10508693	HZ250496	2016-05-03 23:40:00	013XX S SAWYER AVE	0486	BATTERY	DOMESTIC BATTERY SIMPLE	APARTMENT	True	...	05/10/2016 03:56:50 PM	41.864073	-87.706819	(41.864073157, -87.706818608)	0 days 11:40:00.000000000	23	40	2016-05-03	1
1	89	10508695	HZ250409	2016-05-03 21:40:00	061XX S DREXEL AVE	0486	BATTERY	DOMESTIC BATTERY SIMPLE	RESIDENCE	False	...	05/10/2016 03:56:50 PM	41.782922	-87.604363	(41.782921527, -87.60436317)	0 days 09:40:00.000000000	21	40	2016-05-03	1
2	197	10508697	HZ250503	2016-05-03 23:31:00	053XX W CHICAGO AVE	0470	PUBLIC PEACE VIOLATION	RECKLESS CONDUCT	STREET	False	...	05/10/2016 03:56:50 PM	41.894908	-87.758372	(41.894908283, -87.758371958)	0 days 11:31:00.000000000	23	31	2016-05-03	1
3	673	10508698	HZ250424	2016-05-03 22:10:00	049XX W FULTON ST	0460	BATTERY	SIMPLE	SIDEWALK	False	...	05/10/2016 03:56:50 PM	41.885687	-87.749516	(41.885686845, -87.749515983)	0 days 10:10:00.000000000	22	10	2016-05-03	1
4	911	10508699	HZ250455	2016-05-03 22:00:00	003XX N LOTUS AVE	0820	THEFT	$500 AND UNDER	RESIDENCE	False	...	05/10/2016 03:56:50 PM	41.886297	-87.761751	(41.886297242, -87.761750709)	0 days 10:00:00.000000000	22	0	2016-05-03	1

	beat_num	Total_Crimes
0	111	8089
1	112	7467
2	113	4813
3	114	4021
4	121	3716

	Longitude	Latitude	Total Crimes
0	-87.562543	41.757362	5633
1	-87.723997	41.874345	3391
2	-87.647856	41.737336	3153
3	-87.626566	41.898832	5280
4	-87.661300	41.989576	2874
5	-87.689190	41.788970	2284
6	-87.787579	41.936611	1495
7	-87.604033	41.814649	3050
8	-87.719462	41.922614	2309
9	-87.644212	41.691239	1845
10	-87.660230	41.866823	1954
11	-87.613065	41.771176	2440
12	-87.767568	41.778903	971
13	-87.901201	41.976360	2377
14	-87.651969	41.946508	5646
15	-87.671957	41.764680	1935
16	-87.768894	41.896212	2872
17	-87.705489	41.966385	1889
18	-87.536134	41.688851	581
19	-87.663652	41.781110	3070
20	-87.697404	41.849916	2782
21	-87.772455	41.976608	1316
22	-87.626050	41.736157	1810
23	-87.638456	41.812819	1794
24	-87.619734	41.705476	2568
25	-87.626894	41.849017	3692
26	-87.760932	41.924089	2801
27	-87.668072	41.925246	1987
28	-87.571464	41.723318	1974
29	-87.697984	41.742965	1241
...	...	...	...
170	-87.792075	41.980644	901
171	-87.759794	41.968912	1760
172	-87.604044	41.704924	1596
173	-87.709427	41.843848	2593
174	-87.779456	41.924424	2054
175	-87.608273	41.760469	2724
176	-87.683026	41.870676	1741
177	-87.639767	41.866448	1681
178	-87.631570	41.793930	1741
179	-87.669610	41.809749	2468
180	-87.644542	41.936753	3003
181	-87.698248	41.868886	3148
182	-87.729729	41.913196	2945
183	-87.735547	41.879182	4266
184	-87.763710	41.795112	922
185	-87.651172	41.706862	2078
186	-87.629293	41.888985	5121
187	-87.780305	41.996226	696
188	-87.658627	41.684744	2071
189	-87.684192	41.996241	2208
190	-87.612480	41.748742	2111
191	-87.679131	41.911361	3091
192	-87.706255	41.757211	1331
193	-87.603158	41.779609	3281
194	-87.648339	41.766753	3625
195	-87.686446	42.013764	2195
196	-87.811133	41.980889	739
197	-87.727662	41.853131	2391
198	-87.622806	41.676020	2847
199	-87.649648	41.891030	1628

	Longitude	Latitude	Total Crimes	Address
0	-87.562543	41.757362	5633	2532-2546, East 76th Street, South Shore HIsto...
1	-87.723997	41.874345	3391	3922, West Congress Parkway, West Garfield Par...
2	-87.647856	41.737336	3153	8614, South Sangamon Street, Chester Highlands...
3	-87.626566	41.898832	5280	America-Fore Building, 844, North Rush Street,...
4	-87.661300	41.989576	2874	5917, North Magnolia Avenue, Edgewater Glen, E...

	ID	Latitude	Longitude	Block	cluster
1059816	10130901	41.916053	-87.719181	019XX N LAWNDALE AVE	8
1181829	10343488	41.883165	-87.769861	001XX N MENARD AVE	56
790069	9587080	41.754593	-87.741529	076XX S CICERO AVE	49
432678	8993045	41.747393	-87.585439	081XX S STONY ISLAND AVE	41
1241362	10446682	41.894166	-87.621850	002XX E ERIE ST	101
1336621	10634007	41.839811	-87.617142	030XX S DR MARTIN LUTHER KING JR DR	62
585300	9238321	41.961703	-87.698484	044XX N CALIFORNIA AVE	17
571046	9215466	41.750802	-87.599109	079XX S DOBSON AVE	72
754525	9522175	41.912324	-87.749758	049XX W ST PAUL AVE	147
1321870	10609016	41.744094	-87.594707	012XX E 83RD ST	72

Actual crimes in 2018 up to Dec-02 - 239221

Actual crimes between Jan-01-2017 and Dec-02-2018 - 507038

Predicted crimes between Jan-01-2017 and Dec-02-2018 - 505532

Feedback is appreciated :)

	Block	count
0	008XX E 75TH ST	1
1	022XX E 75TH ST	22
2	022XX E 76TH ST	2
3	022XX E 77TH ST	5
4	022XX E 78TH ST	3

	cluster	Block	count
5958	39	001XX N STATE ST	1025
2001	13	0000X W TERMINAL ST	937
477	3	008XX N MICHIGAN AVE	850
7734	49	076XX S CICERO AVE	765
19549	124	064XX S DR MARTIN LUTHER KING JR DR	491
25265	161	083XX S STEWART AVE	436
26232	168	051XX W MADISON ST	411
23166	147	046XX W NORTH AVE	400
27664	177	011XX S CANAL ST	382
16071	103	040XX W LAKE ST	363
2071	14	009XX W BELMONT AVE	329
20082	127	012XX S WABASH AVE	327
28405	183	042XX W MADISON ST	322
22727	144	011XX W WILSON AVE	310
13230	84	038XX W ROOSEVELT RD	298
28850	186	0000X W HUBBARD ST	295
3223	22	001XX W 87TH ST	287
13783	88	066XX S HALSTED ST	281
14195	91	071XX S JEFFERY BLVD	275
18156	116	075XX S STONY ISLAND AVE	267

	dates	Freq
1822	2016-12-27	623
1823	2016-12-28	686
1824	2016-12-29	614
1825	2016-12-30	704
1826	2016-12-31	672
