You can't fix what you haven't measured
Most CRM cleanup efforts fail because they start with cleaning instead of auditing. Someone runs a deduplication tool, merges a few hundred records, declares the CRM "clean," and goes back to their day job. Three months later, the same problems are back, because nobody identified the root causes or established a baseline to measure against.
A data audit is Phase 2 of the CRM data hygiene framework, after defining your governance standards and before cleaning anything. It gives you a quantified baseline, a prioritized list of problems, and the ammunition to build a business case for investment.
Here's the checklist, organized by CRM object.
Account data audit
Completeness
- What percentage of accounts have a populated industry field?
- What percentage have employee count filled in?
- What percentage have a valid website URL?
- What percentage have billing address (city, state, country at minimum)?
- What percentage have an account owner assigned?
- What percentage have annual revenue populated (or estimated)?
Target: 90%+ completion on mandatory fields. If you're below 70% on any critical field, that field is unreliable for segmentation, territory design, or reporting.
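The completeness questions above all reduce to the same calculation: populated records divided by total records, per field. Here is a minimal sketch, assuming the accounts have been exported as a list of dictionaries. The field names (`industry`, `employee_count`, `website`) and the sample rows are hypothetical, not tied to any particular CRM's schema.

```python
def is_populated(value):
    """Treat None and blank or whitespace-only strings as missing."""
    if value is None:
        return False
    if isinstance(value, str):
        return value.strip() != ""
    return True


def completion_rates(records, fields):
    """Return {field: percent of records with that field populated}."""
    if not records:
        return {field: 0.0 for field in fields}
    return {
        field: round(
            100 * sum(is_populated(r.get(field)) for r in records) / len(records), 1
        )
        for field in fields
    }


# Made-up rows standing in for a CRM export (assumption).
accounts = [
    {"industry": "Software", "employee_count": 250, "website": "https://example.com"},
    {"industry": "", "employee_count": None, "website": "https://example.org"},
    {"industry": "Retail", "employee_count": 40, "website": None},
    {"industry": "Finance", "employee_count": None, "website": "  "},
]

rates = completion_rates(accounts, ["industry", "employee_count", "website"])

# Fields under the 70% floor are the ones to treat as unreliable.
unreliable = [field for field, pct in rates.items() if pct < 70]
print(rates)
print(unreliable)
```

Note that an employee count of 0 counts as populated here. Whether the value is believable is a question for the accuracy check, not the completeness check.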
Accuracy
- Pull a random sample of 50 accounts. Manually verify industry, employee count, and website against LinkedIn or the company's site. What percentage is accurate?
- How many accounts have company names that are variations of the same company? (IBM vs. I.B.M. vs. International Business Machines)
- How many accounts show an employee count of 0 or 1 for companies that clearly have more?
Target: 85%+ accuracy on spot-checked records. Below 70% means your enrichment data is stale or was never validated.
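The spot check only works if the sample is actually random, not the first 50 rows of a report. A small sketch of drawing the sample and scoring the manual review afterward, assuming account IDs are available as a list and that a reviewer records one true/false verdict per sampled record. The function names are illustrative.

```python
import random


def draw_sample(account_ids, size=50, seed=None):
    """Pick up to `size` distinct account IDs uniformly at random."""
    ids = list(account_ids)
    rng = random.Random(seed)  # pass a seed to make the draw repeatable
    return rng.sample(ids, min(size, len(ids)))


def accuracy_rate(verdicts):
    """verdicts: one bool per reviewed record, True if it matched the source."""
    if not verdicts:
        return 0.0
    return round(100 * sum(verdicts) / len(verdicts), 1)


# Hypothetical ID range; in practice this comes from the CRM export.
sample = draw_sample(range(1, 501), size=50, seed=7)

# Example review outcome: 43 of 50 records verified as accurate.
verdicts = [True] * 43 + [False] * 7
print(accuracy_rate(verdicts))  # 86.0
```

Recording the seed alongside the results lets someone else reproduce the same sample when the number is questioned.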
Duplication
- Run your CRM's built-in duplicate detection on accounts. How many potential duplicates are flagged?
- Check for fuzzy duplicates that native tools miss: slightly different company names, different domains for the same company, parent/subsidiary relationships
- What percentage of total accounts are potential duplicates?
Target: Below 5% duplication rate. Above 10% means your duplicate prevention controls are failing.
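One inexpensive way to surface name-variation duplicates is to normalize company names before grouping them. The sketch below is a simple heuristic, not what any CRM's built-in matcher does: the suffix list is an assumption, and the duplication rate is defined here as surplus records (everything beyond the first record in each group) over total accounts.

```python
import re
from collections import defaultdict

# Assumed list of legal suffixes to ignore; extend for your own data.
LEGAL_SUFFIXES = {"inc", "llc", "ltd", "corp", "corporation", "co", "gmbh"}


def normalize_name(name):
    """Lowercase, drop periods, strip punctuation and legal suffixes."""
    cleaned = name.lower().replace(".", "")      # "I.B.M." -> "ibm"
    cleaned = re.sub(r"[^a-z0-9]+", " ", cleaned)  # other punctuation -> space
    tokens = [t for t in cleaned.split() if t not in LEGAL_SUFFIXES]
    return " ".join(tokens)


def duplicate_groups(accounts):
    """Group account IDs whose normalized names collide."""
    groups = defaultdict(list)
    for account in accounts:
        groups[normalize_name(account["name"])].append(account["id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


def duplication_rate(accounts):
    """Percent of accounts that are surplus copies within a duplicate group."""
    if not accounts:
        return 0.0
    surplus = sum(len(ids) - 1 for ids in duplicate_groups(accounts).values())
    return round(100 * surplus / len(accounts), 1)


# Made-up example rows.
accounts = [
    {"id": 1, "name": "IBM"},
    {"id": 2, "name": "I.B.M."},
    {"id": 3, "name": "Acme, Inc."},
    {"id": 4, "name": "ACME Corp"},
    {"id": 5, "name": "Globex"},
]

print(duplicate_groups(accounts))  # {'ibm': [1, 2], 'acme': [3, 4]}
print(duplication_rate(accounts))  # 40.0
```

Normalization alone will not connect "IBM" to "International Business Machines", nor will it catch different domains or parent/subsidiary relationships. Those still need an alias list, domain matching, or manual review.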
Staleness
- How many accounts have zero activity (no tasks, emails, meetings, or opportunities) in the last 12 months?
- How many accounts have no associated contacts at all?
- How many accounts have an owner who no longer works at your company?
Target: Stale accounts should be less than 20% of your total. If a third of your CRM is accounts nobody has touched in a year, you're inflating your TAM and cluttering reports.
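The zero-activity question can be computed from a single "last activity" date per account. A minimal sketch, assuming each exported account carries a hypothetical `last_activity` field that is either a date or empty. Accounts with no recorded activity at all are counted as stale.

```python
from datetime import date, timedelta


def stale_share(accounts, today, max_age_days=365):
    """Percent of accounts with no activity inside the lookback window."""
    if not accounts:
        return 0.0
    cutoff = today - timedelta(days=max_age_days)
    stale = [
        a for a in accounts
        if a["last_activity"] is None or a["last_activity"] < cutoff
    ]
    return round(100 * len(stale) / len(accounts), 1)


# Made-up rows; `today` is passed in so the result is reproducible.
accounts = [
    {"id": "A1", "last_activity": date(2024, 5, 1)},
    {"id": "A2", "last_activity": date(2022, 11, 15)},
    {"id": "A3", "last_activity": None},
    {"id": "A4", "last_activity": date(2024, 1, 10)},
]

share = stale_share(accounts, today=date(2024, 6, 30))
print(share)  # 50.0, well above the 20% target
```

The other two staleness questions (accounts with no contacts, accounts owned by departed employees) are simple counts against the contact and user tables and are not covered by this sketch.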
Contact data audit
Completeness
- What percentage of contacts have a valid email address?
- What percentage have a phone number?
- What percentage have a job title?
- What percentage have an associated account? (Orphaned contacts with no account are nearly useless)
- What percentage have a lead source populated?
Target: 95%+ for email (it's the primary communication channel), 80%+ for title and account association.
Validity
- Run an email verification tool against your contact database. What percentage of emails are valid, invalid, or risky?
- How many contacts have generic emails (info@, hello@, sales@) as their primary email?
- How many contacts show a title that indicates they've likely moved (e.g., "Former," "Ex-," or a title that doesn't match their current LinkedIn)?
Target: Below 5% invalid email rate. Above 10% means your deliverability is at risk. Run verification before any large email campaign.
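Two of the validity questions are easy to script once the data is exported: flagging role-based addresses and summarizing whatever statuses the verification tool returns. The sketch below assumes the tool labels each address `valid`, `invalid`, or `risky`, as in the checklist. The list of generic local parts is an illustrative assumption, and nothing here checks real deliverability.

```python
from collections import Counter

# Assumed set of role-based mailbox names; extend as needed.
GENERIC_LOCAL_PARTS = {"info", "hello", "sales", "support", "admin", "contact"}


def is_generic(email):
    """True for role-based addresses such as info@ or sales@."""
    local, _, domain = email.strip().lower().partition("@")
    return bool(domain) and local in GENERIC_LOCAL_PARTS


def status_rates(statuses):
    """Turn a list of verification statuses into percentages."""
    if not statuses:
        return {}
    counts = Counter(statuses)
    return {s: round(100 * n / len(statuses), 1) for s, n in counts.items()}


emails = ["info@example.com", "Jane.Doe@example.com", "SALES@example.org"]
generic = [e for e in emails if is_generic(e)]
print(generic)  # ['info@example.com', 'SALES@example.org']

# Made-up verification results for 20 contacts.
results = ["valid"] * 17 + ["invalid"] * 2 + ["risky"]
print(status_rates(results))  # {'valid': 85.0, 'invalid': 10.0, 'risky': 5.0}
```

In this made-up example the invalid rate is 10%, double the 5% target, so a cleanup would be due before the next large campaign.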
Duplication
- Run duplicate detection on contacts. How many share the same email address?
- How many share the same name and company but have different records?
- Are duplicates concentrated in specific segments or time periods? (Bulk imports often create duplicate clusters)
Target: Below 3% duplication rate on contacts. Contact duplicates are especially damaging because they split activity history and create confusion for reps.
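Exact-email duplicates and their clustering by creation month can both be checked in one pass. A small sketch, assuming each exported contact has hypothetical `id`, `email`, and `created` fields, with `created` as an ISO date string (`YYYY-MM-DD`).

```python
from collections import Counter, defaultdict


def duplicate_email_groups(contacts):
    """Group contacts that share an email, ignoring case and whitespace."""
    groups = defaultdict(list)
    for contact in contacts:
        email = (contact.get("email") or "").strip().lower()
        if email:  # contacts without an email cannot be matched this way
            groups[email].append(contact)
    return {email: group for email, group in groups.items() if len(group) > 1}


def duplicates_by_month(groups):
    """Count duplicated records per creation month (YYYY-MM)."""
    counts = Counter()
    for group in groups.values():
        for contact in group:
            counts[contact["created"][:7]] += 1
    return counts


# Made-up rows standing in for a contact export.
contacts = [
    {"id": "C1", "email": "jane@example.com", "created": "2024-03-12"},
    {"id": "C2", "email": "Jane@Example.com ", "created": "2024-03-12"},
    {"id": "C3", "email": "raj@example.com", "created": "2023-11-02"},
    {"id": "C4", "email": "raj@example.com", "created": "2024-03-12"},
    {"id": "C5", "email": "li@example.com", "created": "2024-01-20"},
]

groups = duplicate_email_groups(contacts)
by_month = duplicates_by_month(groups)
print(sorted(groups))           # ['jane@example.com', 'raj@example.com']
print(by_month.most_common(1))  # [('2024-03', 3)]
```

A month that dominates the count, like March 2024 in this made-up data, is a hint to look for a bulk import on or around that date.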
Engagement freshness
- How many contacts have no activity (inbound or outbound) in 12+ months?
- How many contacts have bounced emails in the last 90 days?
- How many contacts are associated with closed-lost opportunities and have no other engagement?
Target: Archive contacts with no activity in 18+ months and bounced emails. Keep the records for compliance, but remove them from active lists and reporting.
Opportunity data audit
Completeness
- What percentage of opportunities have a populated close date?
- What percentage have an amount/value?
- What percentage have a stage assigned?
- What percentage have an associated contact (not just an account)?
- What percentage have a lead source or campaign source populated?
Target: 95%+ for close date, amount, and stage. These are the minimum fields needed for pipeline reporting and forecasting. Below 90% means your forecast is unreliable.
Accuracy
- How many open opportunities have a close date in the past? (These should have been won, lost, or pushed. A past close date means nobody updated the record.)
- How many open opportunities have been in the same stage for 90+ days? (Potentially stale deals inflating pipeline)
- How many opportunities have an amount of $0 or a suspiciously round number that suggests a placeholder?
Target: Zero open opportunities with past close dates. If more than 10% of your open pipeline carries a past close date, your pipeline metrics are meaningless.
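The three accuracy checks above can be run as one pass over open opportunities. A minimal sketch, assuming hypothetical fields `is_open`, `close_date`, `stage_entered`, and `amount`, all populated. The "suspiciously round" rule (any multiple of 10,000) is an assumed heuristic and will flag some genuine deals, so treat its output as a review list, not a verdict.

```python
from datetime import date


def audit_open_opportunities(opps, today, max_days_in_stage=90, round_step=10_000):
    """Return opportunity IDs flagged by each accuracy check."""
    flags = {"past_close_date": [], "stuck_in_stage": [], "placeholder_amount": []}
    for opp in opps:
        if not opp["is_open"]:
            continue  # closed deals are out of scope for these checks
        if opp["close_date"] < today:
            flags["past_close_date"].append(opp["id"])
        if (today - opp["stage_entered"]).days >= max_days_in_stage:
            flags["stuck_in_stage"].append(opp["id"])
        # Zero, or an exact multiple of round_step, is treated as a placeholder.
        if opp["amount"] == 0 or opp["amount"] % round_step == 0:
            flags["placeholder_amount"].append(opp["id"])
    return flags


# Made-up opportunities; `today` is fixed so the result is reproducible.
opps = [
    {"id": "O1", "is_open": True, "close_date": date(2024, 5, 15),
     "stage_entered": date(2024, 2, 1), "amount": 18_500},
    {"id": "O2", "is_open": True, "close_date": date(2024, 8, 1),
     "stage_entered": date(2024, 6, 1), "amount": 50_000},
    {"id": "O3", "is_open": False, "close_date": date(2024, 3, 1),
     "stage_entered": date(2024, 1, 5), "amount": 0},
    {"id": "O4", "is_open": True, "close_date": date(2024, 9, 15),
     "stage_entered": date(2024, 5, 20), "amount": 0},
]

flags = audit_open_opportunities(opps, today=date(2024, 6, 30))
print(flags)
```

Here `O1` is flagged for both a past close date and 150 days in the same stage, while `O2` and `O4` land on the placeholder-amount review list.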
Pipeline health
- What percentage of pipeline value comes from opportunities with no activity in 30+ days?
- Are there opportunities with no associated contacts? (Who is the rep talking to?)
- How many opportunities were created by reps who have since left the company and haven't been reassigned?
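The first pipeline health question is weighted by value, not by record count: a single large idle deal can distort the pipeline more than a dozen small ones. A sketch of that calculation, assuming the input is already limited to open opportunities and that each has hypothetical `amount` and `last_activity` fields.

```python
from datetime import date, timedelta


def inactive_pipeline_share(open_opps, today, max_idle_days=30):
    """Percent of pipeline value with no activity in max_idle_days or more."""
    total = sum(o["amount"] for o in open_opps)
    if total == 0:
        return 0.0
    cutoff = today - timedelta(days=max_idle_days)
    idle = sum(
        o["amount"] for o in open_opps
        if o["last_activity"] is None or o["last_activity"] <= cutoff
    )
    return round(100 * idle / total, 1)


# Made-up open pipeline.
open_opps = [
    {"id": "O1", "amount": 40_000, "last_activity": date(2024, 6, 20)},
    {"id": "O2", "amount": 25_000, "last_activity": date(2024, 4, 2)},
    {"id": "O3", "amount": 35_000, "last_activity": None},
]

idle_share = inactive_pipeline_share(open_opps, today=date(2024, 6, 30))
print(idle_share)  # 60.0
```

Opportunities with no recorded activity at all are counted as idle, which is a deliberate choice in this sketch.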
Running the audit
Step 1: Pull the data
Export or run reports for each object against the checklist items above. Most CRMs can generate completion reports natively. For accuracy and validity checks, you'll need to sample and verify manually or use a data quality tool.
Step 2: Score each category
For each object (Accounts, Contacts, Opportunities), calculate:
| Metric | Score |
|---|---|
| Field completion rate | ___ % |
| Accuracy (spot-check) | ___ % |
| Duplication rate | ___ % |
| Staleness / decay rate | ___ % |
| Overall data health | average of the four, counting duplication and staleness as 100% minus the rate |
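Completion and accuracy are better when high, while duplication and staleness are better when low, so a plain average of the four numbers would reward dirty data. One reasonable reading of the overall score, offered here as an assumption and not a fixed formula, is to invert the two "lower is better" rates before averaging:

```python
def overall_health(completion, accuracy, duplication, staleness):
    """Average four percentages after inverting the lower-is-better ones."""
    components = [
        completion,
        accuracy,
        100 - duplication,  # 4% duplicates -> 96% unique
        100 - staleness,    # 22% stale -> 78% fresh
    ]
    return round(sum(components) / len(components), 1)


# Made-up scores for one object, e.g. Contacts.
score = overall_health(completion=88, accuracy=82, duplication=4, staleness=22)
print(score)  # 86.0
```

An unweighted average is the simplest choice. If one dimension matters more for your business, such as contact email accuracy for an outbound-heavy team, a weighted average is an easy extension.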
Step 3: Prioritize by impact
Not all data problems are equal. Prioritize based on revenue impact:
Fix first: Opportunity data issues. These directly affect your pipeline and forecast. Past close dates, missing amounts, and stale deals are immediate wins.
Fix second: Contact data issues. These affect routing accuracy, marketing deliverability, and sales outreach. Invalid emails and duplicates have the broadest impact.
Fix third: Account data issues. These affect segmentation, territory design, and reporting. Important but less urgent than pipeline and contact accuracy.
Step 4: Set targets and schedule
Based on your baseline, set 90-day improvement targets for each metric. Then schedule the Phase 3 cleanup focused on the highest-priority issues.
The quarterly re-audit
The initial audit gives you the baseline. Quarterly re-audits show the trend. Build a simple dashboard that tracks your core metrics over time:
- Is field completion improving or declining?
- Is the duplication rate growing despite cleanup? That points to a prevention problem.
- Is the stale record ratio shrinking? That points to a better maintenance cadence.
- Is email validity improving? That suggests enrichment is working.
Trends matter more than absolute numbers. A team that moves field completion from 65% to 82% over two quarters has fundamentally changed their data quality, even if they're not yet at the 90% target.
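The dashboard logic can start as a simple comparison of two audit snapshots. A minimal sketch, assuming each audit is stored as a dictionary of percentages. The metric names are hypothetical, and the only subtlety is that a rising number is good for some metrics and bad for others.

```python
# Metrics where a falling value is the improvement (assumed names).
LOWER_IS_BETTER = {"duplication_rate", "stale_ratio"}


def trend(previous, current):
    """Return {metric: (change in points, direction)} between two audits."""
    report = {}
    for metric, now in current.items():
        delta = round(now - previous[metric], 1)
        if delta == 0:
            direction = "flat"
        else:
            improved = delta < 0 if metric in LOWER_IS_BETTER else delta > 0
            direction = "improving" if improved else "declining"
        report[metric] = (delta, direction)
    return report


# Made-up snapshots two quarters apart.
q1 = {"field_completion": 65.0, "duplication_rate": 6.0,
      "stale_ratio": 24.0, "email_validity": 91.0}
q3 = {"field_completion": 82.0, "duplication_rate": 7.5,
      "stale_ratio": 24.0, "email_validity": 94.0}

for metric, (delta, direction) in trend(q1, q3).items():
    print(f"{metric}: {delta:+.1f} pts ({direction})")
```

In this made-up data, field completion gains 17 points while the duplication rate creeps up by 1.5, which is exactly the "cleanup is working, prevention is not" pattern the questions above are meant to expose.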
Build this audit into your quarterly RevOps review cadence. Data quality should be a standing agenda item, not a special project.
The bottom line
A data audit takes one to two days the first time and a few hours each quarter after that. The ROI is immediate: you'll discover pipeline that doesn't exist, contacts that can't be reached, and accounts that are duplicated or miscategorized. Every one of those findings represents revenue at risk or efficiency lost.
Don't skip Phase 2. The companies that jump straight to cleaning without auditing first end up repeating the cleanup every six months because they never identified the root causes. Measure first, clean second, maintain always. And once you know your baseline, you'll have the ammunition to quantify the cost of dirty data and build the business case for investment.