Data Policy
Introduction
Figure 1: Heatmap of people movement within Melbourne Botanical Gardens
Data Sourcing
The mobile location data that propella.ai sources originates from mobile devices that are using location services within the applications (“apps”) that are running on the device, where the user has “opted-in” for that location data to be shared with the app owner. Note that none of the mobile location data is sourced from telecommunications providers like Telstra and Optus.
- First Party Location SDK (software development kit): the Provider has a direct relationship with publishers who have installed the Provider’s SDK, which collects background location information.
- Second-Party Location SDK: The Provider has partnered with other location-collecting SDK providers in order to increase the available SDK-sourced footprint.
- Owned and Operated Apps: The Provider is also an app publisher and collects location data from location-enabled users of their six social media apps.
- Bid Stream Data: The Provider participates in mobile advertising exchanges, allowing them to collect location data in the process of displaying ads in over 100,000 mobile apps.
Data Privacy
Our Provider (a US-based company) complies with the California Consumer Privacy Act (CCPA), which became effective on January 1, 2020. The CCPA requires businesses to update the information they provide consumers about their collection practices and provide consumers choices about the sale and use of Personal Information. The Provider has committed to maintaining compliance under the California Privacy Rights Act (CPRA), which amends and expands the CCPA and took effect on January 1, 2023.
Data Anonymisation
The mobile location data that is collected is anonymised (“de-identified”) by the Provider before it is sourced by propella.ai. No sensitive data such as names, home addresses, social security information or credit card information is ever collected. Email addresses, mobile phone numbers and other identifying information tied to mobile device IDs are also not collected.
Figure 2: Unique identifiers for the mobile devices, termed "ifas"
- Common Evening Location (CEL) - an approximate home location for the device. This assessment is made using factors such as frequency of visitation, days of week and times of day for visitation, length of stays, and others. This location is not recorded down to a street address level. The CEL for the device is recorded as the mesh block (a statistical area that is defined by the ABS, typically 2 - 4 residential blocks), not as a location point. The mesh block is the most granular statistical area that the ABS uses to publish recorded populations. The mesh block is also the level at which Roy Morgan's Helix Personas are geo-projected - we use the mesh block mapping for devices to assign a Helix Persona consumer profile to the device to support psychographic analysis.
- Common Day Location (CDL) - a probabilistic assessment of where the device owner works, using similar factors to the CEL.
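As an illustration of how a Common Evening Location could be derived, the sketch below picks the mesh block in which a device is seen on the most distinct dates during an overnight window. It is a simplified stand-in, not the Provider's method: the 19:00 to 07:00 window, the `min_nights` threshold and the function names are all assumptions made for the example.

```python
from datetime import datetime

# Assumed overnight window (19:00 to 06:59 local time); not a documented value.
EVENING_START_HOUR = 19
EVENING_END_HOUR = 7


def is_evening(ts: datetime) -> bool:
    """True if the timestamp falls inside the assumed overnight window."""
    return ts.hour >= EVENING_START_HOUR or ts.hour < EVENING_END_HOUR


def common_evening_mesh_block(events, min_nights=3):
    """Return the mesh block seen on the most distinct evening dates, or None.

    `events` is an iterable of (timestamp, mesh_block_code) pairs for one device.
    `min_nights` is a hypothetical minimum number of dates needed to trust the result.
    """
    dates_by_block = {}
    for ts, block in events:
        if is_evening(ts):
            # Count distinct calendar dates so one long stay is not over-weighted.
            dates_by_block.setdefault(block, set()).add(ts.date())
    if not dates_by_block:
        return None
    block, dates = max(dates_by_block.items(), key=lambda item: len(item[1]))
    return block if len(dates) >= min_nights else None
```

Because only a mesh block code is returned, the result stays at the statistical-area level described above rather than resolving to a point or street address.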
Location Data Quality
Mobile location data sourced from consumer GPS devices can be incredibly valuable in generating insights on people movement, but only after quality challenges with the data are addressed.
Figure 3: The Provider's "Path to Quality" process illustrated
Identifying Likely Workers within an Office Building
propella.ai has developed a proprietary system for geo-fencing a commercial office building and classifying the mobile devices identified within the geo-fenced area (the building envelope) as either (a) workers, or (b) visitors to the building.
The first step is to identify the building outline (called the "polygon") for the subject office building that is being analysed - refer to Figure 4 for an example. This is the two-dimensional area that will be analysed, to identify all mobile devices detected within the area over the date and time range specified, using a geo-spatial process called intersection. Note that we can only geo-fence a two-dimensional area; the vertical plane cannot be controlled, so devices on different floors of the building cannot be distinguished.
Figure 4: Example of a building polygon
Our platform processes billions of mobile event points from millions of devices across Australia to identify all mobile devices that were detected within the geo-fenced boundary, and collects the total number of mobile location data events for each device over the specified date range. This mobile location data set (visualised in Figure 5) includes the device ID (IFA), the latitude and longitude of the event, and the date and time of the event.
Figure 5: Visualisation of a sample of mobile events for the geo-fenced office building
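The intersection step can be sketched with a standard ray-casting point-in-polygon test. This is a minimal illustration only: the function names and the `(device_id, lat, lon, timestamp)` event layout are assumptions, and a production platform processing billions of points would more likely use a spatial index or geospatial library.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True if (lon, lat) lies inside the polygon.

    `polygon` is a list of (lon, lat) vertices; the last vertex joins the first.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def events_per_device(events, polygon):
    """Count the events that fall inside the polygon for each device ID.

    `events` is an iterable of (device_id, lat, lon, timestamp) tuples
    (a hypothetical layout chosen for this example).
    """
    counts = {}
    for device_id, lat, lon, _timestamp in events:
        if point_in_polygon(lon, lat, polygon):
            counts[device_id] = counts.get(device_id, 0) + 1
    return counts
```

The test is purely two-dimensional, which is why the geo-fence cannot separate devices by floor.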
Once the cohort of mobile devices has been identified (through intersection with the building polygon), the next step is to classify these devices as being associated with either workers or visitors (or, for some mixed-use developments, residents). We do this probabilistically, using an artificial intelligence algorithm that uses a number of attributes of location visitation (dimensions) to determine the likelihood that the device owner is a worker (or alternatively, a visitor). These dimensions include frequency of visitation, days of week, time of day and length of visitation, among other factors.
Our worker classification algorithm determines which devices identified within the geo-fenced area are most likely to be associated with workers (as opposed to visitors), taking into account temporal factors (how often the device was observed in the area, and at what times of day and on which days of the week) and locational accuracy (including the horizontal error associated with GPS signals from devices in built-up zones like office buildings). Devices classified as workers are seen so often in the building polygon area that there is a high probability that they are associated with workers.
Figure 6: Visualisation of mobile events within an office building for workers (blue dots) and visitors (magenta dots)
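To make the temporal dimensions concrete, the sketch below scores a device from two of the factors named above: how many weekdays it was seen on, and how many of its events fall in business hours. This is a hand-weighted heuristic for illustration only, not propella.ai's proprietary algorithm; the business-hours window, the equal weights and the 0.6 threshold are all assumptions.

```python
from datetime import datetime


def worker_score(timestamps, period_weekdays):
    """Score in [0, 1]; higher means the visit pattern looks more like a worker.

    `timestamps` are the in-polygon event times (datetime objects) for one device.
    `period_weekdays` is the number of Monday-to-Friday days in the analysis period.
    """
    if not timestamps or period_weekdays <= 0:
        return 0.0
    # Frequency: share of weekdays in the period on which the device was seen.
    weekday_dates = {ts.date() for ts in timestamps if ts.weekday() < 5}
    frequency = min(len(weekday_dates) / period_weekdays, 1.0)
    # Timing: share of events on a weekday between 08:00 and 17:59 (assumed hours).
    in_hours = sum(1 for ts in timestamps if ts.weekday() < 5 and 8 <= ts.hour < 18)
    timing = in_hours / len(timestamps)
    # Equal weights are an arbitrary illustrative choice.
    return 0.5 * frequency + 0.5 * timing


def classify(timestamps, period_weekdays, threshold=0.6):
    """Label a device 'worker' or 'visitor' using the assumed threshold."""
    score = worker_score(timestamps, period_weekdays)
    return "worker" if score >= threshold else "visitor"
```

A device seen every weekday during business hours scores 1.0, while a device seen once on a weekend scores 0.0. A real classifier would also need to account for length of stay and GPS horizontal error, which this sketch ignores.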
Note that our worker algorithm relies on a sufficient sample size of devices and mobile location data events to be able to accurately predict the worker cohort. Our platform also relies on a reasonable sample size to be able to generate meaningful insights about those workers using the worker cohort data set. We typically find that office buildings smaller than 10,000 sqm of net lettable area (NLA) may not yield a sufficient sample of devices for us to conduct a reliable worker analysis exercise. This is also dependent on the date range used for the analysis - the longer the time period, the greater the sample size and the greater the likelihood of reliable worker insights.
Human Movement Data (HMD)
At propella.ai, we utilise Near’s Human Movement Data (HMD) to provide estimates for Return to Office in buildings in Australia's largest cities. The Near HMD represents a non-uniform sample of the total population. As a result, changes in the data volume captured within a specific building can stem from either variations in the HMD sample size or fluctuations in the actual number of people present in an office building.
To address this challenge, propella.ai has implemented advanced normalisation techniques that effectively account for fluctuations in the HMD sample using external data sources. While the exact nature of this normalisation process remains proprietary, we are confident in the accuracy of the Worker Activity estimates at the city aggregate level. These estimates provide reliable insight into overall “Return to Office” trends.
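The actual normalisation is proprietary and draws on external data sources, so it cannot be reproduced here. The general idea of separating sample-size changes from real changes in occupancy can, however, be sketched with a simple ratio: divide the devices seen in a building by the devices active in the whole sample for the same period, then index the result to a baseline period. The function name, inputs and period labels below are hypothetical.

```python
def activity_index(building_devices, panel_devices, baseline_period):
    """Index building activity to a baseline after adjusting for sample size.

    `building_devices[p]` is the number of worker devices seen in the building in period p.
    `panel_devices[p]` is the number of devices active in the whole sample in period p.
    Returns {period: index}, where the baseline period equals 100.
    """
    rates = {}
    for period, count in building_devices.items():
        panel = panel_devices.get(period, 0)
        if panel > 0:
            # Share of the active sample observed in the building.
            rates[period] = count / panel
    base = rates.get(baseline_period)
    if not base:
        raise ValueError("baseline period has no usable data")
    return {period: 100.0 * rate / base for period, rate in rates.items()}


# Illustrative numbers only: the building count falls to 45% of baseline,
# but the sample itself shrank, so the adjusted index is about 75.
index = activity_index(
    {"baseline": 200, "later": 90},
    {"baseline": 1000, "later": 600},
    "baseline",
)
```

In this made-up example the raw device count alone would overstate the decline, which is the distortion the normalisation step is meant to remove.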
Our Worker Activity data was validated during the period when the Property Council of Australia (PCA) published its monthly occupancy survey results. Over that period, propella.ai’s Return to Office (Worker Activity) estimates were consistently aligned with this validation source. Regrettably, PCA occupancy data is no longer available; we suspect this is because the survey was labour intensive and costly to maintain. Consequently, many commercial property clients in the industry now rely heavily on our Worker Activity trends as a source of truth.