By Erik Neemann on Mar 21, 2022
Introducing Open Source Places
This article will take 5 minutes to read
There’s an exciting new data layer that has been recently added to the State Geographic Information Datasource (SGID): Open Source Places. This is a brand-new offering that is created from open source data and is intended to represent places of interest in the state of Utah. These may include businesses, restaurants, places of worship, airports, parks, schools, event centers, apartment complexes, hotels, car dealerships, etc., etc., …almost anything that you can find in OpenStreetMap (OSM). There are over 23,000 features in the data, and the best part of all? You can make updates and contribute directly to this layer through OpenStreetMap! (more on that later)
UGRC’s creation of the Utah Open Source Places dataset does not imply endorsement by, or affiliation with, the OpenStreetMap Foundation
What kind of points of interest are in the data?
All kinds! There are businesses, restaurants, churches, schools, banks, hotels, viewpoints, attractions, and more. Take a look at this chart showing the counts of the 25 most popular types of locations.
How can it be used?
We think there will be numerous potential use cases for such a large and varied dataset, for state and local agencies, citizens, and even researchers. The data could be used to perform different analyses on certain business types or even specific businesses. How many Beans and Brews locations are within a half-mile of Starbucks? The businesses and landmarks could be used as a reference for 911 call-takers, first responders, or emergency managers. This can be particularly helpful because we don’t always know the address of our emergencies, but we usually know a point of interest or place name. The data could even be used to quality control other datasets by looking for features that we may be missing. Are we missing any schools, fire stations, liquor stores, or cemeteries from our authoritative datasets?
How can I participate?
It’s pretty easy, really. All you need to do is create a free OSM account, login, start editing or creating places (iD is the default editor), and save your updates. Then, the next time we run our python script to update this layer, your changes will be included and published in the SGID. In addition to the default, web-based editor on openstreetmap.org (iD), there are several desktop and mobile (Android and iOS) editors that can be used, including, JOSM, Vespucci, and GoMap!!. A larger list of editors is maintained on the OSM Wiki. If creating an OSM account and using an editor are too cumbersome, you can also submit businesses anonymously with a simple form entry and pin-location at OnOSM. Your submissions through OnOSM will be reviewed and added by an OSM user.
You can get involved in the OSM community by checking out upcoming events on the OSM Calendar. The calendar will list events across the US, including those right here in Utah. The OSM Slack community is also open to anyone, with a specific
#local-utah channel for users and events in the state.
How does it differ from other SGID data?
- It’s not authoritative - due to the nature of crowdsourced data, we can’t guarantee that it’s perfect or accurate
- It has a different license - instead of our usual CC BY 4.0 license, it carries the ODbL 1.0 license
- Any user can contribute to it - the data can be quickly improved and grown by OSM users
OpenStreetMap® is open data, licensed under the Open Data Commons Open Database License (ODbL) by the OpenStreetMap Foundation (OSMF).
How is it similar to other SGID data?
- It may include locations that are represented in other, authoritative, SGID datasets
- Examples: schools, parks, liquor stores, libraries, fire stations, etc.
- It’s available in a variety of ways and data formats
- ArcGIS Online services
- Downloads in your favorite data format (shapefile, file geodatabase, GeoJSON, KML, CSV)
What else should I know?
- The Open Source Places data isn’t perfect, so you may find some typos, missing locations, locations that no longer exist, etc.
- OSM data tends to have a European ‘flavour’ (see what I did there?). Be mindful that many categories or attributes may be spelled in a British/European English manner. Business or location names shouldn’t be affected, but attribute examples you might encounter include:
- centre (instead of center, ex: sports_centre, arts_centre, garden_centre)
- theatre (instead of theater)
- caravan_site (instead of rv_park or trailer_park)
- archaeological (instead of archeological)
- chemist (instead of pharmacy)
- tyres (instead of tires)
- There are two types of addresses provided in the data:
- The address contained in the original OSM data (
osm_addr). These were input by OSM contributors, but only a small percentage of the points come with an address (~24%).
- The address of the nearest UGRC address point (
ugrc_addr). These are not official addresses of the businesses or points of interest, but provide a very close address for a greater percentage of points (~59%). They are the nearest address (within 25m) in the UGRC Address Points dataset. The
addr_distfield provides the distance (in meters) from the actual point to the nearby address point.
- The address contained in the original OSM data (
- The attributes can be used to filter the data down to your desired types of points
categoryfield represents a general classification for a location. Examples include: restaurant, fast_food, viewpoint, fire_station, etc.
- The other fields provide additional details about the location, where available
cuisine- type of food available, may have multiple, separated by a ‘;’ (american, italian, indian, burger, pizza, sandwich, ice_cream, etc.)
amenity- often similar to the category field (restaurant, post_office, place_of_worship, social_facility)
shop- type of shop (supermarket, toys, hardware, convenience, etc.)
- Some attributes might be duplicated among these fields
- Additional information is provided in the other attributes, where available (
- Several locations simply have a category of ‘building’ - these are prime for user-contributed improvements in OpenStreetMap! Stayed tuned for a future project to improve the
categoryand attributes of these features…
What are the details on how this data is created?
The Open Source Places layer is created by a Python script that pulls statewide OSM data from a nightly archive provided by Geofabrik. The archive data contains nearly 20 shapefiles, some that are relevant and some that aren’t. So, the Open Source Places layer is built by filtering the data in those shapefiles down to a single point dataset with specific categories and attributes. Additional de-duplication and spatial filtering are done along the way, to ensure the data is as clean as possible. Then, additional fields are created and assigned from UGRC’s SGID data (county, city, zip, nearby address, etc.) via point-in-polygon analysis. Finally, additional attributes (OSM address, open hours, cuisine, etc.) are pulled from the Overpass API and joined to the filtered data using the ‘osm_id’ field as the join key. Additional information can be found on the Open Source Places data page.
If you really want to get into the nitty gritty details, come to my 2022 UGIC Conference presentation or keep your eyes peeled 👀 for a future blog post.
Let us know what you think and reach out to Erik Neemann from UGRC at ENeemann@utah.gov or 385-258-4299 or on Twitter (@maputah).