Uncertainty in Applications
- System goal
- Correctness
- Efficiency
- Scalability
- Difficulties
- Uncertainty
- Imprecision
- Noise
- Missing values
- Examples
- Sampling accuracy
- Monitored values keep changing, so the values stored in the database may be obsolete
- Uncertainty in satellite images
- Measurement errors
- GPS, indoor positioning, etc.
- Uncertainty in repeated measurements
- Text extraction
- Data integration
- Uncertain graphs
- Criminal databases
Problems in Managing Uncertain Data
- Modeling data uncertainty
- Capturing & representation
- Simple vs. complicated models
- Probabilistic queries
- More difficult & expensive
- Data quality & cleaning
- Measurement of the quality of data & quality of queries
- Which data to clean & how to clean it
Uncertainty Models
- Types
- Attribute uncertainty (value uncertainty): I know it exists, but not sure about the value
- Discrete attribute
- Binary attribute
- Continuous attribute
- Tuple uncertainty (existential uncertainty): I know its value, but not sure if it exists at all
- Sources of uncertainty
- Data staleness
- Location-based tracking systems
- Location estimation: last reported location + uncertainty model
- Measurement errors
- Value estimation: measured value + error estimation
- Error modeled as e.g. Gaussian distribution
- Dead reckoning: value bounded within $$[v-d, v+d]$$
- Trade-off between data uncertainty & update frequency
- Repeated measurements
- Value estimation: distribution of a large number of measurements
- Data integration
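The trade-off between data uncertainty and update frequency under dead reckoning can be illustrated with a minimal sketch. It assumes a simple threshold policy: the source reports a new value only when the current reading drifts more than d from the last reported value v, so the database can bound the true value within [v-d, v+d]. The function name `dead_reckoning_updates` and the sample readings are hypothetical, chosen only for illustration.

```python
def dead_reckoning_updates(readings, d):
    """Return the (index, value) pairs a source would report under threshold d.

    Assumed policy: the first reading is always reported; afterwards a reading
    is reported only if it differs from the last reported value by more than d.
    """
    if not readings:
        return []
    last = readings[0]
    updates = [(0, last)]
    for i, x in enumerate(readings[1:], start=1):
        if abs(x - last) > d:
            updates.append((i, x))
            last = x
    return updates


# Hypothetical temperature readings from one sensor.
readings = [20.0, 20.2, 20.9, 21.1, 21.2, 22.8, 22.9, 21.0]

for d in (0.5, 2.0):
    n = len(dead_reckoning_updates(readings, d))
    print(f"d={d}: {n} updates, stored value uncertain within +/-{d}")
```

A larger d means fewer updates but a wider uncertainty interval for every stored value; a smaller d tightens the interval at the cost of more messages.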
Attribute Uncertainty
- $$f_i(x)$$: continuous, uniform, Gaussian, histogram, discrete, etc.
Uncertainty pdf/pmf
The choice depends on the application. If no information is known, assume e.g. a uniform pdf, or derive the pdf via time-series analysis of past data.
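As a rough illustration of how the choice of pdf affects a probabilistic query, the sketch below computes the probability that an uncertain attribute falls in a query range [a, b], once under a uniform pdf and once under a Gaussian pdf. The function names and the numbers are hypothetical; only `statistics.NormalDist` from the Python standard library is used.

```python
from statistics import NormalDist


def uniform_range_prob(lo, hi, a, b):
    """P(a <= X <= b) for X uniformly distributed on [lo, hi]."""
    if hi <= lo:
        raise ValueError("uncertainty interval must have positive width")
    overlap = max(0.0, min(hi, b) - max(lo, a))
    return overlap / (hi - lo)


def gaussian_range_prob(mean, sigma, a, b):
    """P(a <= X <= b) for X ~ Normal(mean, sigma)."""
    dist = NormalDist(mu=mean, sigma=sigma)
    return dist.cdf(b) - dist.cdf(a)


# Same measured value (20), two assumed uncertainty models, same query range.
p_uniform = uniform_range_prob(18.0, 22.0, 20.0, 25.0)
p_gauss = gaussian_range_prob(20.0, 1.0, 19.0, 21.0)
print(p_uniform, round(p_gauss, 4))
```

The query answer is a probability rather than a yes/no result, and it changes with the assumed pdf, which is why the model should follow from the application where possible.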
Tuple Uncertainty
X-Tuple Model
When data from different sources is integrated, the values reported for the same target may differ.
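A minimal sketch of this situation, assuming the commonly used definition of an x-tuple as a set of mutually exclusive alternatives whose probabilities sum to at most 1 (any remaining mass is the probability that the tuple does not exist). The class names `Alternative` and `XTuple`, the method names, and the example probabilities are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alternative:
    value: dict   # one candidate version of the tuple
    prob: float   # probability that this version is the true one


@dataclass
class XTuple:
    alternatives: list

    def __post_init__(self):
        total = sum(a.prob for a in self.alternatives)
        if any(a.prob < 0 for a in self.alternatives) or total > 1 + 1e-9:
            raise ValueError("probabilities must be >= 0 and sum to at most 1")

    def existence_prob(self):
        """Probability that the tuple exists in any of its versions."""
        return sum(a.prob for a in self.alternatives)

    def prob_where(self, predicate):
        """Probability that the true version satisfies the predicate."""
        return sum(a.prob for a in self.alternatives if predicate(a.value))


# Two sources disagree on the age of the same person (made-up confidences).
person = XTuple([
    Alternative({"name": "Alice", "age": 34}, 0.6),  # from source A
    Alternative({"name": "Alice", "age": 43}, 0.3),  # from source B
])

print(person.existence_prob())
print(person.prob_where(lambda t: t["age"] < 40))
```

Because the alternatives are mutually exclusive, probabilities of matching alternatives can simply be added; the mass not assigned to any alternative captures tuple (existential) uncertainty.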