The opioid crisis in the United States is fuelled by the malicious diversion of prescription drugs from clinical settings. There are two types of bad actors in this situation. The first is the external bad actor: someone who exists outside the clinical setting and funnels these drugs into the community by purchasing or stealing them from clinics and/or clinicians. The second is the internal bad actor: someone who works inside the clinical setting and uses their access credentials to maliciously disburse these drugs into the greater population. One common form of this disbursement is the misdirection of opioid prescriptions or treatments within clinics. Unfortunately, this increasingly common fraudulent behavior is hard to detect, and it contributes to the rise in overdose deaths in the United States every year.
A common fraud scheme is for a medical professional working in a hospital to falsely assign opioid drugs to a patient’s chart, but never deliver those drugs to the patient, instead pocketing the opioids for use or disbursement. Using their clinic’s internal record system, it’s possible for an employee with the proper medical credentials to change patient records to account for missing opioids.
Fortunately, the HIPAA Security Rule, at 45 C.F.R. § 164.312(b), requires covered entities to implement audit controls that record and examine activity in systems containing electronic protected health information. One effect of this requirement is that patients can request a list of the health care professionals who’ve changed their medical records. Another effect is that most medical facilities (in our experience) have some logging system streaming data in real time about access and changes to medical records. These logs include the ID of the medical professional, their position (lab tech, nurse, pharmacist, medical coder, etc.), a timestamp, the patient ID, and the log deltas.
Traditionally, companies have employed standard anomaly detection techniques and temporal modeling to attempt to classify various types of fraudulent behavior from these logs. These techniques work with varying degrees of accuracy, but as with all fraud, criminal techniques are constantly evolving. Expero employs a bleeding-edge architecture for internal bad actor anomaly detection, which can be used in concert with these traditional techniques. The secret sauce is as follows.
The information known a priori in this analysis is:
We extract the information on each of these patients and clinicians by selecting data for each clinician from a graph database. Because the patient-clinician relationships within one medical facility can be devilishly intertwined, storing this information in a graph database is ideal. When we perform the extraction, we simply issue the query, “return all the patients within X hops of each clinician.” In this article, we examine only the case where X equals 1, though as you can imagine, when you increase the X distance you can uncover extremely complex and subtle interactions in the medical facility.
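The "within X hops" extraction can be pictured with a small in-memory sketch. This is not the graph database query itself; it is a breadth-first search over a hypothetical adjacency dictionary, and the names `graph`, `patients_within_hops`, and the toy IDs are assumptions made for illustration.

```python
from collections import deque

def patients_within_hops(graph, clinician, max_hops, patient_ids):
    """Return the patients reachable from `clinician` in at most `max_hops` edges."""
    seen = {clinician}
    queue = deque([(clinician, 0)])
    found = set()
    while queue:
        node, dist = queue.popleft()
        if node in patient_ids:
            found.add(node)
        if dist == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return found

# Hypothetical toy graph: an edge links a clinician to a patient whose record they touched.
graph = {
    "nurse_1": ["patient_a", "patient_b"],
    "pharm_1": ["patient_b", "patient_c"],
    "patient_a": ["nurse_1"],
    "patient_b": ["nurse_1", "pharm_1"],
    "patient_c": ["pharm_1"],
}
patients = {"patient_a", "patient_b", "patient_c"}

one_hop = patients_within_hops(graph, "nurse_1", 1, patients)
three_hops = patients_within_hops(graph, "nurse_1", 3, patients)
```

With X equal to 1 the nurse is linked only to the patients they touched directly. At three hops, a patient seen by a colleague who shares one of the nurse's patients also appears, which is the kind of indirect relationship a larger X exposes.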
The type of analysis we perform in these cases is a derivative of the field of sequential recommender systems. In fact, one may call it a perversion of that field, because it has little to do with recommendations. The analysis starts in the same way: we use a deep learning library to instantiate a high-dimensional latent vector embedding layer for each of the patients and each of the medical professionals.
We train our model against the cross interactions between medical professionals and patients. One (naive) way of building up the cross interaction information is by counting: every time medical professional X interacts with patient record Y, add one to the X/Y intersection point. This corresponds directly to the implicit ratings matrices used in the recommender system world.
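As a rough sketch of the counting and training steps, the snippet below builds the interaction counts from a handful of made-up log events and fits one latent vector per clinician and per patient. It makes several assumptions: plain Python with hand-written stochastic gradient descent stands in for a deep learning library, the loss is squared error on the raw counts, and the event data, `DIM`, and `LEARNING_RATE` are hypothetical.

```python
import random
from collections import Counter

# Hypothetical audit-log events, reduced to (clinician_id, patient_id) pairs.
events = [
    ("nurse_1", "patient_a"), ("nurse_1", "patient_a"), ("nurse_1", "patient_b"),
    ("pharm_1", "patient_b"), ("pharm_1", "patient_c"), ("pharm_1", "patient_c"),
]

# Naive counting: add one to the clinician/patient cell for every interaction.
counts = Counter(events)

DIM = 4  # latent dimension, kept tiny for illustration
rng = random.Random(0)

def init_vectors(ids):
    """Give each ID a small random latent vector (a stand-in for an embedding layer)."""
    return {i: [rng.uniform(-0.1, 0.1) for _ in range(DIM)] for i in ids}

clinician_vecs = init_vectors(sorted({c for c, _ in events}))
patient_vecs = init_vectors(sorted({p for _, p in events}))

def predict(clinician, patient):
    """Predicted interaction count: dot product of the two latent vectors."""
    return sum(a * b for a, b in zip(clinician_vecs[clinician], patient_vecs[patient]))

def squared_error():
    return sum((n - predict(c, p)) ** 2 for (c, p), n in counts.items())

initial_error = squared_error()
LEARNING_RATE = 0.05
for _ in range(500):
    for (c, p), n in counts.items():
        err = n - predict(c, p)
        cv, pv = clinician_vecs[c], patient_vecs[p]
        for k in range(DIM):
            old_cv_k = cv[k]  # keep the pre-update value for the patient step
            cv[k] += LEARNING_RATE * err * pv[k]
            pv[k] += LEARNING_RATE * err * old_cv_k
final_error = squared_error()
```

After training, `predict` can be called on any clinician/patient pair, including pairs that never appeared in the log.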
Once we’ve trained against the interaction matrix, we can perform several interesting analysis types. First, we can extract the latent vectors describing the medical professionals and use cosine similarity (or similar techniques) to sort them against their colleagues. This works in an identical way for patients. Using cosine similarity we can find the most similar patients/clinicians, and find the least similar patients/clinicians.
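The similarity ranking can be sketched in a few lines. The latent vectors here are made-up two-dimensional values rather than trained output, and the helper names are illustrative only.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank_by_similarity(target, vectors):
    """Sort every other ID from most to least similar to `target`."""
    others = [i for i in vectors if i != target]
    return sorted(
        others,
        key=lambda i: cosine_similarity(vectors[target], vectors[i]),
        reverse=True,
    )

# Hypothetical latent vectors for three clinicians.
clinician_vecs = {
    "nurse_1": [1.0, 0.0],
    "nurse_2": [0.9, 0.1],
    "pharm_1": [0.0, 1.0],
}
ranking = rank_by_similarity("nurse_1", clinician_vecs)
```

The first entry of `ranking` is the most similar colleague and the last entry is the least similar. The same function works unchanged on patient vectors.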
More importantly, we can fill in the rest of the interaction matrix.
Filling in the interaction matrix is equivalent to predicting the number of times each clinician will interact with each patient’s medical record. This is extremely powerful, because once we have this information, we can cross-check it against observed data as it streams in in real time. When our comparison algorithm finds that the number of a clinician’s interactions with a patient’s record exceeds the prediction by more than some threshold, we use our UI to notify an auditor or regulator that this clinician is behaving anomalously, maybe even fraudulently. The threshold depends on the types of the patients. (Remember that side information?) The regulator can then decide whether to investigate further.
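A minimal sketch of the cross-check might look like the following. It assumes the model's predictions are already available as a dictionary, that "higher than some threshold" means the observed count exceeds the predicted count by more than a per-patient-type tolerance, and that an alert fires once per clinician/patient pair. All names and numbers are hypothetical.

```python
from collections import Counter

# Hypothetical model output: predicted interaction counts per (clinician, patient).
predicted = {
    ("nurse_1", "patient_a"): 2.0,
    ("nurse_1", "patient_b"): 1.0,
    ("pharm_1", "patient_b"): 1.0,
}
# Hypothetical side information: patient types and a tolerance for each type.
patient_type = {"patient_a": "post_surgical", "patient_b": "outpatient"}
threshold_by_type = {"post_surgical": 5.0, "outpatient": 2.0}
DEFAULT_THRESHOLD = 3.0  # used when a patient's type is unknown

def flag_anomalies(event_stream):
    """Yield an alert the first time a pair exceeds its prediction by more than its threshold."""
    observed = Counter()
    alerted = set()
    for clinician, patient in event_stream:
        pair = (clinician, patient)
        observed[pair] += 1
        expected = predicted.get(pair, 0.0)
        limit = threshold_by_type.get(patient_type.get(patient), DEFAULT_THRESHOLD)
        if pair not in alerted and observed[pair] - expected > limit:
            alerted.add(pair)
            yield {
                "clinician": clinician,
                "patient": patient,
                "observed": observed[pair],
                "expected": expected,
            }

# Simulated stream of record accesses.
stream = (
    [("nurse_1", "patient_a")] * 6
    + [("nurse_1", "patient_b")] * 4
    + [("pharm_1", "patient_b")] * 2
)
alerts = list(flag_anomalies(stream))
```

Here only the nurse's fourth access to the outpatient's record raises an alert. Six accesses to the post-surgical patient stay within that type's larger tolerance, which is why the patient-type side information matters.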
Questions about this blog post? Send us an email - email@example.com