Data Discovery & Classification

Aurva automatically scans connected data sources and classifies sensitive fields, columns, and files using built-in detectors for 30+ sensitive data types.

How It Works

1. Connect: add a data source through the Connector Setup flow.
2. Scan: Aurva's scanner reads metadata and samples data; it does not make a full copy of the data.
3. Classify: detectors assign sensitivity labels (PII, PHI, PCI, Aadhaar, etc.) to the fields they match.
4. Surface: findings appear on the Data Assets page with confidence scores.
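
The classify step can be pictured with a small sketch. This is not Aurva's detector code: the regular expressions, the `DETECTORS` table, and the `classify_column` helper are hypothetical, and the confidence score is assumed to be the fraction of sampled values that match a pattern.

```python
import re

# Simplified, illustrative patterns (assumptions); production detectors are stricter.
DETECTORS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PAN": re.compile(r"[A-Z]{5}[0-9]{4}[A-Z]"),
    "AADHAAR": re.compile(r"[2-9][0-9]{3}\s?[0-9]{4}\s?[0-9]{4}"),
}


def classify_column(sample_values):
    """Return {label: confidence} for every label that matches a sampled value.

    Confidence is the fraction of non-empty sampled values that fully
    match the detector's pattern.
    """
    values = [str(v).strip() for v in sample_values
              if v is not None and str(v).strip()]
    if not values:
        return {}
    scores = {}
    for label, pattern in DETECTORS.items():
        hits = sum(1 for v in values if pattern.fullmatch(v))
        if hits:
            scores[label] = hits / len(values)
    return scores


# Two of the three non-empty samples look like email addresses.
findings = classify_column(["a@example.com", "b@example.org", "n/a", None])
```

Here `findings` contains a single `EMAIL` label with a confidence of about 0.67, which is the kind of score a reviewer would then see next to the column.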

Sensitive Data Types

Category	Examples
Personal	Name, email, phone, date of birth, gender
Financial	PAN, account number, IFSC, credit card (PCI)
Health	Medical record number, diagnosis code (PHI/HIPAA)
Identity	Aadhaar, passport number, driving licence
Credentials	API keys, passwords, tokens
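
Detectors for numeric identifiers commonly pair a pattern match with a checksum to cut false positives. As an illustration only (the source does not say how Aurva validates card numbers), the sketch below applies the standard Luhn check to a candidate credit card number; `luhn_valid` is a hypothetical helper name.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    # Typical payment card numbers are 13 to 19 digits long.
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


is_card = luhn_valid("4111 1111 1111 1111")      # well-known test number
is_not_card = luhn_valid("4111 1111 1111 1112")  # last digit altered
```

A value that matches a card-number pattern but fails the checksum is more likely an order ID or similar numeric field than a real card number.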

Scan Modes

Mode	Description
Discovery	Scans schema and metadata only; fast, with no data sampling
Classification	Samples data rows to assign sensitivity labels; more thorough
Full	Runs Discovery and Classification in one pass
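
The difference between the modes can be sketched against an in-memory SQLite database. This is an illustration of metadata-only discovery versus bounded row sampling, not Aurva's scanner; the `discover` and `sample_rows` functions and the `customers` table are hypothetical.

```python
import sqlite3


def discover(conn):
    """Discovery: read table and column names from catalog metadata only."""
    schema = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [col[1] for col in cols]  # col[1] is the column name
    return schema


def sample_rows(conn, table, limit=100):
    """Classification input: read a bounded sample, never the whole table."""
    return conn.execute(f'SELECT * FROM "{table}" LIMIT ?', (limit,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, email TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(i, f"user{i}@example.com") for i in range(500)],
)

schema = discover(conn)                          # no row data is read
rows = sample_rows(conn, "customers", limit=5)   # 5 of the 500 rows
```

A Full scan in this picture would call `discover` first and then `sample_rows` for each table it found.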

Triggering a Scan

From the Data Assets page, select an asset and click Actions → Scan Now for an immediate scan, or Actions → Update Auto Scan to configure a recurring schedule (daily or weekly).

Enable auto-scan for all production and regulated datasets, and set the scan window to off-peak hours to limit the performance impact on the data source.
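
To make the schedule options concrete, the sketch below computes when a daily or weekly scan would next run inside an off-peak window. The `next_scan` helper and the 02:00 default are assumptions for illustration; in Aurva the schedule is configured through Actions → Update Auto Scan.

```python
from datetime import datetime, timedelta


def next_scan(now, hour=2, weekday=None):
    """Return the next run time at `hour`:00.

    weekday=None means a daily schedule; otherwise 0=Monday ... 6=Sunday
    selects the day for a weekly schedule.
    """
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)  # today's window has already passed
    if weekday is not None:
        # Move forward to the requested weekday (0 days if already on it).
        candidate += timedelta(days=(weekday - candidate.weekday()) % 7)
    return candidate


now = datetime(2024, 1, 10, 14, 30)   # a Wednesday afternoon
daily = next_scan(now)                # 2024-01-11 02:00
weekly = next_scan(now, weekday=6)    # Sunday 2024-01-14 02:00
```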

Reviewing Results

After a scan completes, open the asset's Data Tab to review:

Plain Sensitive Data: columns or fields that contain readable (unmasked) sensitive values
Obfuscated Sensitive Data: masked or tokenized fields, listed so you can confirm the classification
Overall Schema: the full database/schema/table tree with sensitivity indicators