Healthcare Data Mining for Fraud Detection: Identify and Reduce False Positives to Build Stronger Models

Posted by IntegrityM | Consulting


One of the key goals of healthcare data mining is to reduce false positives. In healthcare fraud data mining, a false positive occurs when your model flags a provider that is not engaging in fraudulent activity and has legitimate reasons for seemingly aberrant billing.

Pursuing false positives can be expensive and time-consuming. Even if a bad lead is discovered in the first stages of the investigative process, valuable time and resources have already been lost. Good models will reduce false positives, but even the very best models will not eliminate them. Savvy healthcare data analysts can keep resources from being drained on dead-end cases in two ways: by designing models that cut down on false positives on the front end of an analysis, and by identifying any bad leads that made it through the model on the back end.


Identify False Positives in Healthcare Fraud Detection Data Mining

One way to identify false positives is to make allowances for healthcare providers that are likely to have a legitimate reason to exhibit the pattern(s) identified by your model.  If your model does not account for these exceptions, these providers will be false positives.

For example, one model that has seen some success in identifying healthcare fraud calculates the distance beneficiaries travel to see providers. It flags providers for whom a certain percentage of beneficiaries travel more than a certain distance, a pattern that can indicate billing for services not rendered.
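To make the distance-traveled idea concrete, here is a minimal sketch in Python. The claim records, the 50-mile distance cutoff, and the 50% share cutoff are all hypothetical values chosen for illustration. A real model would draw them from your own claims data and tuning.

```python
from collections import defaultdict

# Hypothetical claim records: (provider_id, beneficiary_id, miles_traveled).
claims = [
    ("P1", "B1", 5.0),
    ("P1", "B2", 8.0),
    ("P1", "B3", 120.0),
    ("P2", "B4", 150.0),
    ("P2", "B5", 90.0),
    ("P2", "B6", 4.0),
    ("P2", "B7", 200.0),
]


def flag_long_distance_providers(claims, distance_cutoff=50.0, share_cutoff=0.5):
    """Return {provider_id: share} for providers whose share of
    long-distance beneficiaries exceeds share_cutoff.

    Both cutoffs are illustrative assumptions, not recommended values.
    """
    # Keep the longest trip per (provider, beneficiary) pair so that a
    # beneficiary with many claims is counted only once.
    longest = {}
    for provider, beneficiary, miles in claims:
        key = (provider, beneficiary)
        longest[key] = max(miles, longest.get(key, 0.0))

    total = defaultdict(int)  # beneficiaries seen per provider
    far = defaultdict(int)    # beneficiaries beyond the distance cutoff
    for (provider, _beneficiary), miles in longest.items():
        total[provider] += 1
        if miles > distance_cutoff:
            far[provider] += 1

    shares = {provider: far[provider] / total[provider] for provider in total}
    return {p: share for p, share in shares.items() if share > share_cutoff}


flagged = flag_long_distance_providers(claims)
print(flagged)
```

In this toy data, one of the three beneficiaries seen by P1 travels far, compared with three of the four seen by P2, so only P2 crosses the 50% cutoff.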

However, several factors may explain why beneficiaries are traveling long distances to see a provider. A short list would include:

  • Provider specialty
  • Beneficiary diagnosis
  • Geography

Acknowledging these factors on the front end will allow you to fine-tune your model to select fewer providers with legitimate reasons to be outliers.

Healthcare Data Mining False Positive Examples

Example #1: Beneficiaries are more likely to travel long distances to see an oncologist than a general practitioner.  You can adjust your model to exclude oncologists, or to increase the thresholds for oncologists.  You could also adjust your model to calculate the percentage of beneficiaries with a non-cancer diagnosis that travel a certain distance.  In this way, experience and fine-tuning can lead to some extremely strong models.
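The two adjustments described in Example #1 can be sketched roughly as follows. Everything here is an assumption made for illustration: the visit records, the specialty labels, the cutoff values, and the simplification that each beneficiary appears once per provider.

```python
from typing import NamedTuple


class Visit(NamedTuple):
    provider_id: str
    specialty: str
    beneficiary_id: str
    has_cancer_dx: bool
    miles: float


# Hypothetical cutoffs: oncologists get a higher share threshold.
SHARE_CUTOFFS = {"general_practice": 0.30, "oncology": 0.80}
DEFAULT_SHARE_CUTOFF = 0.30
DISTANCE_CUTOFF_MILES = 50.0

# Hypothetical data, one row per beneficiary per provider.
visits = [
    Visit("GP1", "general_practice", "B1", False, 75.0),
    Visit("GP1", "general_practice", "B2", False, 10.0),
    Visit("ONC1", "oncology", "B3", True, 120.0),
    Visit("ONC1", "oncology", "B4", True, 95.0),
    Visit("ONC1", "oncology", "B5", False, 12.0),
    Visit("ONC2", "oncology", "B6", False, 140.0),
    Visit("ONC2", "oncology", "B7", False, 110.0),
    Visit("ONC2", "oncology", "B8", True, 60.0),
]


def group_by_provider(rows):
    grouped = {}
    for row in rows:
        grouped.setdefault(row.provider_id, []).append(row)
    return grouped


def long_distance_share(rows):
    """Share of rows whose travel distance exceeds the distance cutoff."""
    if not rows:
        return 0.0
    return sum(r.miles > DISTANCE_CUTOFF_MILES for r in rows) / len(rows)


def flag_with_specialty_cutoffs(rows):
    """Option 1: compare each provider to a specialty-specific cutoff."""
    flagged = []
    for provider, group in group_by_provider(rows).items():
        cutoff = SHARE_CUTOFFS.get(group[0].specialty, DEFAULT_SHARE_CUTOFF)
        if long_distance_share(group) > cutoff:
            flagged.append(provider)
    return sorted(flagged)


def flag_non_cancer_travel(rows, cutoff=DEFAULT_SHARE_CUTOFF):
    """Option 2: measure travel only among non-cancer beneficiaries."""
    non_cancer = [r for r in rows if not r.has_cancer_dx]
    flagged = []
    for provider, group in group_by_provider(non_cancer).items():
        if long_distance_share(group) > cutoff:
            flagged.append(provider)
    return sorted(flagged)


print(flag_with_specialty_cutoffs(visits))
print(flag_non_cancer_travel(visits))
```

In this toy data both options leave ONC1 alone, because its long-distance patients all carry a cancer diagnosis, while ONC2 still stands out because its non-cancer patients are the ones traveling far.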

However, we cannot account for every potential false positive, so no model will be perfect.  The costs of pursuing a false positive are high, so we cannot assume that our models don’t produce them.  The healthcare data analyst’s job is not done:  looking for patterns among your outliers may allow you to identify some that don’t fit.  These are your potential false positives.

Example #2: Returning to the example of the distance traveled model, you may have identified a provider that, based on specialty codes, appeared to be a general practitioner.  A closer look at procedure codes may reveal that the provider actually specializes in internal medicine and additional searches could show that he or she heads one of the largest practices in the area.  At this point, what once was a strong lead now appears to be readily explainable.
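Part of the back-end check in Example #2 could be pre-screened in code before an analyst takes a closer look. The sketch below is only an illustration: the procedure codes are placeholders, and the mapping from procedure code to specialty is a hypothetical lookup that a real team would have to build from its own reference data.

```python
from collections import Counter

# Hypothetical lookup: which specialty typically bills each procedure code.
PROCEDURE_SPECIALTY = {
    "PROC_A": "general_practice",
    "PROC_B": "internal_medicine",
    "PROC_C": "internal_medicine",
}

# Hypothetical providers already flagged by the distance-traveled model.
flagged_providers = {
    "P10": {
        "listed_specialty": "general_practice",
        "procedures": ["PROC_B", "PROC_C", "PROC_B", "PROC_A"],
    },
    "P11": {
        "listed_specialty": "general_practice",
        "procedures": ["PROC_A", "PROC_A", "PROC_B"],
    },
}


def implied_specialty(procedures):
    """Return the specialty most often associated with the billed codes,
    or None if no code appears in the lookup."""
    counts = Counter(
        PROCEDURE_SPECIALTY[code]
        for code in procedures
        if code in PROCEDURE_SPECIALTY
    )
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def specialty_mismatches(flagged):
    """Return {provider_id: implied_specialty} where billing suggests a
    different specialty than the one listed. These are candidates for a
    closer manual look, not automatic exclusions."""
    mismatches = {}
    for provider_id, info in flagged.items():
        implied = implied_specialty(info["procedures"])
        if implied is not None and implied != info["listed_specialty"]:
            mismatches[provider_id] = implied
    return mismatches


print(specialty_mismatches(flagged_providers))
```

A mismatch only tells the analyst where to look first. Deciding whether the lead is explainable still takes the kind of human review described above.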

The human factor is still necessary in data mining for healthcare fraud, but strong models are necessary to narrow results down to the best possible leads.

Increase the Efficiency of Your Healthcare Fraud Detection Data Mining

Time is money, and we’ve all felt the sense of urgency that comes from wanting to post results for a project as quickly as possible. In the case of healthcare fraud detection data mining, you want to find as many fraudulent leads as possible in the shortest amount of time. That said, sound planning and preparedness beat “spray and pray” tactics every time.

Understanding how to identify false positives, and making allowances for them in your model, is the first step in improving your process. Though it may feel like it is taking a lot of time to get everything set up just right, this will actually save you time in the long run by drastically reducing the number of false positives you’ll have to weed out on the back end.

At IntegrityM, we offer a range of Government Program Integrity and compliance services, including Medicaid fraud investigation and data analysis. Contact us today to learn how we can help your organization.

Copyright © 2019 Integrity Management Services, Inc. All Rights Reserved.

