For each participant, two sets of data were collected: environmental conditions and the participant's physiological responses. Examples of both data sets, collected through sensors, are given here.

Example data: environmental conditions

The Smart Cities Board sensor recorded the sound level and the amount of dust in the environment. Examples of the sound and dust signals collected during a participant's walk are shown below:

Fig Signal for sound
Fig Signal for dust

The HOBO U12 Logger sensor recorded temperature, humidity, and illuminance in the environment. An example of each signal collected during a participant's walk is shown below:

Fig Signal for temperature
Fig Signal for humidity
Fig Signal for illuminance

In addition, each participant's field of view (Isovist) was computed from the GPS positions recorded during the walk. The Isovist describes how narrow or wide the view (space) experienced by a participant was along the walk. An example of an Isovist measure is illustrated through an animation as follows:

Example data: Physiological sensor data

The wearable device (Empatica E4) recorded participants' physiological responses. Some participants' physiological response data turned out to be corrupt or erroneous. Hence, the physiological response signals were categorized into two sets: error-free signals and erroneous signals. Examples of four types of error-free signals and four types of erroneous signals are shown here:

Considered physiological signals:
Fig Type 1
Fig Type 2
Fig Type 3
Fig Type 4

Discarded physiological signals:
Fig Type 1: step function like signal
Fig Type 2: step function with major sensor loss
Fig Type 3: major sensor loss
Fig Type 4: insignificant observations

Based on the selection of error-free physiological response signals, data from a total of 20 participants (environmental conditions and physiological responses) were passed on to the data preprocessing and subsequent data mining steps. In other words, the data of 10 participants with erroneous physiological response signals were discarded.


Data preprocessing involved two steps: smoothing and filtering of the physiological response signals, and quantification of the environmental condition signals and physiological response signals.

Signal smoothing

Physiological data are known to be sensitive to artifacts. Thus, the signals were smoothed using a stationary wavelet transform (SWT) and then filtered based on information such as abrupt changes in the signal, sensor loss in the signal, and participants' waiting time at the start of the walk.
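The smoothing step can be sketched as follows. This is a minimal illustration, not the project's implementation: the Haar wavelet, the number of levels, the soft threshold, and all function names are assumptions, since the text does not state which wavelet or threshold was used. The sketch decomposes the signal with an undecimated (stationary) transform, shrinks the detail coefficients, and reconstructs the signal.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_swt(signal, levels):
    """Undecimated (stationary) Haar transform with periodic boundaries.

    Returns the final approximation and a list of detail bands,
    finest band first. Every band has the same length as the input.
    """
    n = len(signal)
    approx = list(signal)
    details = []
    for level in range(levels):
        step = 2 ** level  # the filter is dilated per level, nothing is downsampled
        nxt = [(approx[i] + approx[(i + step) % n]) / SQRT2 for i in range(n)]
        det = [(approx[i] - approx[(i + step) % n]) / SQRT2 for i in range(n)]
        details.append(det)
        approx = nxt
    return approx, details


def haar_iswt(approx, details):
    """Invert haar_swt by averaging the two redundant reconstructions."""
    n = len(approx)
    current = list(approx)
    for level in range(len(details) - 1, -1, -1):
        step = 2 ** level
        det = details[level]
        current = [
            ((current[i] + det[i])
             + (current[(i - step) % n] - det[(i - step) % n])) / (2.0 * SQRT2)
            for i in range(n)
        ]
    return current


def soft_threshold(value, threshold):
    """Shrink a coefficient towards zero by `threshold`."""
    magnitude = abs(value) - threshold
    return math.copysign(magnitude, value) if magnitude > 0 else 0.0


def swt_smooth(signal, levels=3, threshold=0.1):
    """Smooth a signal by shrinking its stationary-wavelet detail bands.

    `levels` and `threshold` are illustrative defaults (assumptions).
    """
    approx, details = haar_swt(signal, levels)
    shrunk = [[soft_threshold(d, threshold) for d in band] for band in details]
    return haar_iswt(approx, shrunk)
```

With a threshold of zero the reconstruction returns the input unchanged; larger thresholds remove more of the fine-scale fluctuation.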

Signal smoothing process: Fig Stationary Wavelet Transform (SWT)
Signal filtering process: Fig Filtering of SWT-based smoothed signal
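The following sketch shows how the three filtering cues (abrupt change, sensor loss, waiting time) might be applied to a 1 Hz signal. The thresholds `max_jump` and `loss_floor`, the waiting time, and the function names are hypothetical; the text does not specify how each cue was detected.

```python
def flag_artifacts(values, max_jump, loss_floor):
    """Mark samples that look like sensor loss or follow an abrupt jump.

    A sample is flagged when it is at or below `loss_floor` (assumed
    sign of sensor loss) or differs from the previous sample by more
    than `max_jump` (assumed sign of an abrupt change).
    """
    flags = []
    previous = None
    for value in values:
        lost = value <= loss_floor
        jumped = previous is not None and abs(value - previous) > max_jump
        flags.append(lost or jumped)
        previous = value
    return flags


def drop_waiting_time(timestamps, values, walk_start, wait_seconds):
    """Discard samples recorded before walk_start + wait_seconds."""
    cutoff = walk_start + wait_seconds
    kept = [(t, v) for t, v in zip(timestamps, values) if t >= cutoff]
    return [t for t, _ in kept], [v for _, v in kept]
```

Flagged samples could then be removed or interpolated, depending on how much of the signal they cover.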

Signal quantification

The signal from each sensor spanned a time range from A to B, where A and B are UNIX timestamps that varied from participant to participant. The start and end of the participant's walk were marked on each signal. From each participant's physiological response data, only the signal fragment between the start and end marks was used for data analysis.

Fig Original signal length with start and end marking
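Cropping a signal to the marked walk can be sketched as below, assuming the samples are sorted by UNIX timestamp and the start and end marks are themselves UNIX timestamps. The function name `crop_to_walk` is hypothetical.

```python
from bisect import bisect_left, bisect_right


def crop_to_walk(timestamps, values, walk_start, walk_end):
    """Keep the samples whose timestamp lies in [walk_start, walk_end].

    `timestamps` must be sorted in ascending order.
    """
    lo = bisect_left(timestamps, walk_start)
    hi = bisect_right(timestamps, walk_end)
    return timestamps[lo:hi], values[lo:hi]
```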

Signal quantification was the major step in data preprocessing. A conceptual diagram is shown here to illustrate the signal quantification process. A participant walked from a start point to an end point in a city neighborhood, and the continuous signals were quantified for every \(t\) seconds of the walk.

Fig Signal quantification concept

The environmental sensor value \(e_i\) for a \(t\)-second window was the average of the signal over that window. The participant's physiological response \(r_i\) for a \(t\)-second window was computed using the continuous decomposition analysis method provided by a tool called Ledalab.

Fig Time window marking as per the quantification concept
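The averaging of an environmental signal into \(e_i\) values can be sketched as follows, assuming UNIX timestamps and consecutive, non-overlapping windows of \(t\) seconds that begin at the walk's start mark. The function name and the handling of empty windows are assumptions. The physiological value \(r_i\) is not reproduced here because it comes from Ledalab's continuous decomposition analysis.

```python
from statistics import fmean


def window_means(timestamps, values, walk_start, walk_end, t):
    """Average a signal over consecutive t-second windows.

    Returns a list of (window_start, mean) pairs; windows without
    any sample are skipped (an assumption of this sketch).
    """
    buckets = {}
    for ts, value in zip(timestamps, values):
        if walk_start <= ts < walk_end:
            index = int((ts - walk_start) // t)
            buckets.setdefault(index, []).append(value)
    return [(walk_start + i * t, fmean(vals)) for i, vals in sorted(buckets.items())]
```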

Information Fusion

Data in this study were collected from various sensors, each operating at a different sampling frequency. Therefore, one step was to align all sensor signals to a 1 Hz frequency (shown by mark "\(\mathrm{A}\)" in Fig).
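One way to bring signals with different sampling rates onto a common 1 Hz grid is sketched below. The strategy shown (averaging the samples within each second and repeating the last value across gaps) is an assumption for illustration; the text does not describe the resampling method used, and the function name is hypothetical.

```python
from statistics import fmean


def resample_to_1hz(timestamps, values):
    """Resample a signal onto whole seconds.

    Samples that share a second are averaged; seconds without a
    sample repeat the most recent value (forward fill).
    """
    per_second = {}
    for ts, value in zip(timestamps, values):
        per_second.setdefault(int(ts), []).append(value)
    if not per_second:
        return [], []
    first, last = min(per_second), max(per_second)
    seconds, resampled = [], []
    current = None
    for second in range(first, last + 1):
        if second in per_second:
            current = fmean(per_second[second])
        seconds.append(second)
        resampled.append(current)
    return seconds, resampled
```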

Signals from environmental conditions and physiological responses underwent their respective quantification processes. Note that the physiological responses were quantified first, and the time window size and UNIX timestamp information were then passed to the environmental condition signals for their quantification. After both sets of signals were quantified, they were paired as event-response as shown by mark "\(\mathrm{B}\)" in Fig.

Once the data of each participant had been prepared by pairing event and response, they were merged to obtain a compiled data set as shown by mark "\(\mathrm{C}\)" in Fig. The compiled data set thus contained the environmental conditions (sound, dust, temperature, humidity, illuminance, and Isovist) as the input and the participant's physiological response as the output.

Fig Information fusion and knowledge discovery framework
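Stages B and C can be sketched as a join on the window start time followed by concatenation across participants. The dictionary layout, field names such as `nscr`, and both function names are hypothetical.

```python
def pair_events_with_responses(participant_id, env_windows, responses):
    """Join environmental window means with responses on window start.

    env_windows: {window_start: {"sound": ..., "dust": ..., ...}}
    responses:   {window_start: nscr}
    The response windows drive the join, mirroring the order in which
    the signals were quantified.
    """
    rows = []
    for window_start in sorted(responses):
        if window_start in env_windows:
            row = {"participant_id": participant_id, "timestamp": window_start}
            row.update(env_windows[window_start])
            row["nscr"] = responses[window_start]
            rows.append(row)
    return rows


def compile_dataset(per_participant_rows):
    """Concatenate the paired rows of all participants (stage C)."""
    return [row for rows in per_participant_rows for row in rows]
```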

Class label: The quantified physiological response (nSCR) for a time window ranged from 0 to 14 for the original signal and from 0 to 12 for the smoothed signal. Hence, to formulate the dataset as a classification problem, a window was labeled 0 when its nSCR value was 0 and 1 when its nSCR value was greater than 0. The distribution and range of nSCR values are illustrated in Fig below.

nSCR histogram: Fig Physiological response histogram
nSCR box plot: Fig Physiological response range
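The binary labeling rule for nSCR values can be written as a short sketch; the function names are hypothetical.

```python
from collections import Counter


def binary_label(nscr):
    """Map a window's nSCR value to a class: 0 for none, 1 for any."""
    return 1 if nscr > 0 else 0


def label_counts(nscr_values):
    """Count how many windows fall into each class."""
    return Counter(binary_label(v) for v in nscr_values)
```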

Compiled dataset: As per the information fusion stage "\(\mathrm{C}\)", a comprehensive compiled dataset was obtained:

Type | Attributes/Features | Definition
Input | Participant ID | Participant identifier used for combined datasets.
Input | Timestamp | Used to match samples at an instance in time.
Input | Longitude, Latitude | Geographic location associated with each sample, i.e., the location where a time window begins.
Input | Noise, dust, environment temperature, relative humidity, illuminance | Calculated mean of noise, dust, environment temperature, relative humidity, and illuminance for a time window along the walk.
Input | Isovist area, Isovist perimeter, Isovist compactness, Isovist occlusivity | Isovist characteristics (Benedikt, 1979) calculated from the field of view for each participant at the corresponding Timestamp, Longitude and Latitude.
Output | Binary/Multiclass/phasic driver | Quantified response of each participant's individual perception (phasic nSCR) (Section 3.2.3).
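For orientation, two of the Isovist attributes listed above can be sketched as below, assuming the field of view is available as a simple polygon of (x, y) vertices in a projected, metric coordinate system. This is an illustration only: the function names are hypothetical, and compactness and occlusivity are left out because the text does not give their definitions.

```python
from math import hypot


def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(vertices):
    """Sum of edge lengths, closing the polygon back to its first vertex."""
    n = len(vertices)
    return sum(
        hypot(vertices[(i + 1) % n][0] - vertices[i][0],
              vertices[(i + 1) % n][1] - vertices[i][1])
        for i in range(n)
    )
```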

Correlation and Geo-reference analysis framework: To compute the correlation between the environmental conditions and the physiological arousal experienced across all participants, participants' event-referenced means were computed.

Similarly, to investigate the average physiological response across all participants at a geographical location, geo-referenced means were computed.

The framework for fusing information to compute event-referenced and geo-referenced means was designed as follows:

Fig Correlation and Geo-referencing Framework
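The two kinds of means and the correlation can be sketched as follows. The grouping keys are assumptions: event-referenced means are taken here to average values that share a window index across participants, and geo-referenced means to average values that fall in the same grid cell. The field names, the `cell_size`, and the function names are hypothetical.

```python
from collections import defaultdict
from math import sqrt
from statistics import fmean


def grouped_mean(rows, key, field):
    """Average `field` over all rows that share the same key(row)."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row[field])
    return {k: fmean(v) for k, v in groups.items()}


def grid_cell(row, cell_size=0.0005):
    """Hypothetical geo key: snap longitude/latitude to a regular grid."""
    return (round(row["longitude"] / cell_size),
            round(row["latitude"] / cell_size))


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences.

    Undefined (division by zero) when either sequence is constant.
    """
    mx, my = fmean(xs), fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)
```

For example, `grouped_mean(rows, lambda r: r["window"], "sound")` would give event-referenced sound means under these assumptions, and `grouped_mean(rows, grid_cell, "nscr")` would give geo-referenced response means.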


Research paper:

Ojha VK, Griego D, Kuliga S, Bielik M, Buš P, Schaeben C, Treyer L, Standfest M, Schneider S, König R, Donath D, Schmitt G (2018) Machine learning approaches to understand the influence of urban environments on human’s physiological response, Information Sciences, Elsevier (pdf).


Download the raw and processed data directly. The data are provided in ZIP folders. The subprojects of ESUM are as follows:

ESUM SNF Project data

Bus Stop study data (Commuting Experience data)

Virtual Reality Experiment data


Download the code from GitHub. The code is mainly for data preprocessing and visualization and is mostly written in Python. The subprojects of ESUM are as follows:

ESUM SNF project data processing codes

ESUM+ commuting experience (Bus Stop study) data processing codes

360 Virtual Reality data processing codes

Block Geometry data processing codes

ESUM: Data Visualization Tool


ESUM complete picture (pdf)
Machine learning and pattern analysis for ESUM (pdf)
MAPS: Visual representation of results (pdf)
Talk delivered in ITA, ETH Zurich, Switzerland (pdf)
Falling Wall competition Paris, France (presentation won 3rd place): (pdf)
Information Visualization: People's commuting experience in cities (pdf)


Silvennoinen, H. (2018) Non-Spatial and Spatial Statistics for Analysing Human's Perception of the Built Environment, ETH Zurich, Switzerland (pdf, code)

Stolbovoy, V. (2018) Convolutional Neural Network and Visual Feature Extraction for Evaluation of the Urban Environment, ETH Zurich, Switzerland (pdf, code)

Schaeben, C. (2017) People's Perception of Urban and Architectural Features, ETH Zurich, Switzerland (pdf)

This website has been archived and is no longer maintained.