Efficiently Evaluating “Big Data” for Medicare Fraud Detection

Posted by IntegrityM | Medical Claim Review, Medicare

With more than 2 billion Medicare claims available for analysis since 2006, few industries fit the term “Big Data” better than health care. The opportunities for meaningful analysis of this data are nearly limitless. However, combing through Medicare claims data efficiently can be tricky, since thousands of analysts may be accessing the same data simultaneously. To deliver meaningful results quickly for that many users, most organizations rely on sophisticated database management systems (DBMS) such as Oracle, Teradata, and Microsoft SQL Server. While these DBMS have optimization logic under the hood to process queries efficiently, there are several tactics an analyst should employ for every research endeavor.

Best “Big Data” Evaluation Techniques for Medicare Fraud Detection

Get to Know the Data. No matter how sophisticated the programmer, no meaningful output can be generated in the Medicare fraud detection industry without business intelligence. Take the time to research the Medicare claims tables and views you propose to use. Understanding the data informs your query development tactics and reduces the overall time to complete an analysis. A simple way to start is to focus on a single Medicare provider or beneficiary over a short period (a year or less). A great deal of business intelligence about table layout and population can be obtained from a very small sample of the data you plan to use in your analyses.
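The sampling idea above can be sketched as a small profiling query. This is a minimal illustration using SQLite as a stand-in DBMS; the table and column names (`claims`, `provider_id`, `claim_date`, `hcpcs_code`, `paid_amount`) are hypothetical, not actual CMS field names.

```python
import sqlite3

# Stand-in database with a tiny, made-up claims table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE claims (
    provider_id TEXT, beneficiary_id TEXT, claim_date TEXT,
    hcpcs_code TEXT, paid_amount REAL
);
INSERT INTO claims VALUES
    ('P001', 'B100', '2018-03-01', '99213', 75.00),
    ('P001', 'B101', '2018-06-15', '99214', 110.00),
    ('P002', 'B102', '2019-01-10', '99213', 75.00);
""")

# Profile a single provider over a single year: a small slice is
# enough to learn how the table is laid out and populated.
sample = conn.execute("""
    SELECT hcpcs_code, COUNT(*) AS n_claims, SUM(paid_amount) AS total_paid
    FROM claims
    WHERE provider_id = 'P001'
      AND claim_date BETWEEN '2018-01-01' AND '2018-12-31'
    GROUP BY hcpcs_code
    ORDER BY hcpcs_code
""").fetchall()
print(sample)
```

A query like this runs in seconds even on large tables because it touches only one provider and one year, yet it reveals which code fields are populated and how payments are distributed.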

Targeted Analytics. To develop a good Medicare fraud analysis study at the national scale, it is critical to define a specific research question or a direct Medicare vulnerability to evaluate. A dataset as large as Medicare claims is too big for a non-specific or multilayered analysis. Focus on answering succinct research questions, and let the answer to each question drive your next analysis.


Start at 35,000 feet and work your way down. A high-level initial approach to identifying suspect billing behavior is an efficient method to employ in a DBMS. Performing summary functions in database, such as identifying high-utilization providers (number of procedures performed, dollars billed, dollars paid, etc.) at the aggregate level, has long been a best practice for Medicare claims data analysis. Once you have a select group of metrics, such as high-volume providers, perform a deeper dive at the claim level to evaluate specific questions. Since DBMS are strongest at summary analysis, do as much as possible “in database” before continuing in statistical software such as R, SAS, or Stata; these applications work best with no more than a few million records at a time.
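The two-step workflow above can be sketched as an aggregate query followed by a claim-level extract. Again this is an illustration only, using SQLite as a stand-in DBMS with hypothetical table and column names.

```python
import sqlite3

# Stand-in database with a made-up claims table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE claims (
    provider_id TEXT, hcpcs_code TEXT, paid_amount REAL
);
INSERT INTO claims VALUES
    ('P001', '99213', 75.0), ('P001', '99214', 110.0),
    ('P001', '99215', 160.0), ('P002', '99213', 75.0);
""")

# Step 1: the 35,000-foot view -- aggregate in database to rank
# providers by total dollars paid.
summary = conn.execute("""
    SELECT provider_id, COUNT(*) AS n_procedures, SUM(paid_amount) AS total_paid
    FROM claims
    GROUP BY provider_id
    ORDER BY total_paid DESC
""").fetchall()
top_provider = summary[0][0]

# Step 2: the deep dive -- pull claim-level detail only for the
# outlier, keeping the extract small enough to hand off to
# R, SAS, or Stata for further analysis.
detail = conn.execute(
    "SELECT hcpcs_code, paid_amount FROM claims WHERE provider_id = ?",
    (top_provider,),
).fetchall()
```

The point of the pattern is that the expensive full-table pass happens once, in database, while the record-level extract is limited to the handful of providers the summary flagged.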

Formalize Your SQL. To take advantage of the optimizations built into a DBMS, there are several formalities you can rely on to ensure your queries run smoothly:

  • Apply static query filters in the WHERE clause.
    • Avoid placing static filters inside a join’s ON condition; reserve the ON clause for join keys.
  • For large “in-list” filters on non-indexed fields, create lookup tables in database that can be joined with Medicare claims to limit your results.
    • Large in-lists (upwards of 50 unique values) can force full table scans and be time consuming.
    • Instead, create a database table (permanent or temporary) that holds all of the unique values you wish to filter on and join with Medicare claims data.
  • Always apply a primary index (which can include more than one field) that best represents the uniqueness of each record in a user-defined database table to improve query performance.
    • DBMS can search for items much faster if you filter on the primary index of an object (table or view).
    • Table skewness occurs when a primary index is improperly assigned, which negatively affects query performance.
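The lookup-table and indexing points above can be sketched as follows. SQLite is used here as a stand-in (Teradata or Oracle syntax differs slightly, and SQLite indexes are not distribution-style primary indexes), and all table and column names are hypothetical.

```python
import sqlite3

# Stand-in database with a made-up claims table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE claims (claim_id INTEGER, provider_id TEXT, hcpcs_code TEXT);
INSERT INTO claims VALUES
    (1, 'P001', '99213'), (2, 'P001', '99499'),
    (3, 'P002', '99214'), (4, 'P003', 'G0008');

-- Instead of WHERE hcpcs_code IN ('99213', '99214', ... 50+ values),
-- load the values into a (temporary) lookup table and join on it.
CREATE TEMP TABLE code_lookup (hcpcs_code TEXT PRIMARY KEY);
INSERT INTO code_lookup VALUES ('99213'), ('99214');

-- Index the join field so the DBMS can seek rather than scan.
CREATE INDEX idx_claims_code ON claims (hcpcs_code);
""")

# Filter claims by joining to the lookup table.
matched = conn.execute("""
    SELECT c.claim_id, c.hcpcs_code
    FROM claims c
    JOIN code_lookup k ON c.hcpcs_code = k.hcpcs_code
    ORDER BY c.claim_id
""").fetchall()
```

On a table of billions of rows, the join against an indexed lookup table lets the optimizer use an index seek, whereas a long literal IN-list on an unindexed field typically forces a full scan.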

Contact IntegrityM’s team of efficient data evaluation experts

IntegrityM can provide the support needed to enhance your Medicare claims and fraud data analysis. We have experience evaluating “Big Data” and providing meaningful results. For additional information on how we can assist your organization with Medicare fraud detection or claims data analysis, contact us online or call 703-683-9600.


Copyright © 2019 Integrity Management Services, Inc. All Rights Reserved.

GLȲD(Σ)TM and the GLȲD(Σ)TM Logo are registered trademarks of Integrity Management Services, Inc. in the United States and other countries.
