
Automated data processing for a large pharmaceutical company

In the next two steps, you’ll look at two case studies that explore how companies from very different industries use big data. When you finish reading both examples, we’ll ask you some questions about each and how they compare.

Background

A global pharmaceutical company headquartered in the US has billions of dollars in revenues and offices in more than 10 countries. It distributes diabetic, psychiatric, and cancer-related products in more than 100 countries.

Challenge

This major pharmaceutical company was struggling to perform timely analysis on big data streaming from its customers across the world. One of the major goals of the analysis was to identify the most talked-about issues and topics. Annually, the company receives approximately 1.5 million inquiries, complaints, and comments from a variety of sources, including call centers, conferences, online chat dialogs, and focus groups. These data are gathered from primary care providers (PCPs), researchers, scientists, and pharmacists, and the topics range across all of the company’s product domains.

Using manual annotation, a dedicated team at the company spent hundreds of hours sifting through complex, “dirty” natural language data that contain multiple issues and references per record. These data were collected in six different languages and were isolated in a variety of non-interacting departments at the global level. Additionally, due to resource restrictions, the company was processing only a fraction of the total data, which gave it a skewed perspective on issues, developments, and patterns relating to its products and locations.

The company approached Megaputer Intelligence (henceforth Megaputer) with a series of requirements ranging from accurate categorisation of data from disparate sources to near real-time analysis and report generation.

Solution

Utilising its proprietary text and data analytics software, PolyAnalyst, and the knowledge of its domain-specific analysts, Megaputer developed an integrated system to fulfill the needs of the pharmaceutical company.

Data streamlining. The data the company provides to Megaputer is fed into the solution and cleansed so that it is suitable for analysis. The solution checks the spelling and grammar of the text data. In addition, many of the structured fields in the datasets are missing or incorrect; the solution handles this by mining the accompanying free text to extract and populate the missing values.
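As a rough illustration of how a missing structured field might be populated from the accompanying free text, consider the minimal Python sketch below. It is not Megaputer's implementation: the product names, the record layout, and the function name `fill_missing_product` are all hypothetical and invented for this example.

```python
import re

# Hypothetical product names, invented for illustration only.
KNOWN_PRODUCTS = ["GlucoFix", "CalmaPro"]

def fill_missing_product(record):
    """Return a copy of the record with 'product' filled from the free text if it is empty."""
    filled = dict(record)
    if not filled.get("product"):
        for name in KNOWN_PRODUCTS:
            # Whole-word, case-insensitive match against the free-text comment.
            pattern = r"\b" + re.escape(name) + r"\b"
            if re.search(pattern, filled.get("text", ""), re.IGNORECASE):
                filled["product"] = name
                break
    return filled

record = {"product": "", "text": "Caller asked whether glucofix can be taken with food."}
print(fill_missing_product(record)["product"])
```

A field that is already populated is left untouched, and a record whose text mentions no known product keeps its empty field rather than receiving a guess.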

Methodological optimisation. Data from different sources is integrated to conform to a consistent set of labeling paradigms, jointly developed by analysts at the pharmaceutical company and Megaputer. This process identifies inconsistencies and redundancies in the data, from which Megaputer creates suggestions for the improvement of the company’s data collection methodology.

Sensitive data anonymisation. An additional domain-specific step in data cleansing is anonymisation. To meet compliance standards such as HIPAA, the automatic Entity Anonymisation features of PolyAnalyst are employed to protect the medical information passed through Megaputer’s systems.
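The general idea of masking identifying details before analysis can be sketched in a few lines of Python. This is only a simplified, pattern-based stand-in and does not represent PolyAnalyst's Entity Anonymisation features; the two regular expressions and the placeholder tokens are assumptions chosen for illustration, and real anonymisation would also need to cover names, addresses, dates, and other identifiers.

```python
import re

# Simplified patterns, assumed for illustration; they do not cover every format.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def anonymise(text):
    """Replace e-mail addresses and phone-like number sequences with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)

print(anonymise("Contact jane.doe@example.com or 555-123-4567 about the dosage."))
```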

Automatic categorisation. For each treatment area covered by the pharmaceutical company, Megaputer’s solution includes a detailed pattern- and semantics-based classification. Megaputer works hand-in-hand with various teams at the pharmaceutical company to further customise the topics and subtopics captured in the solution’s taxonomies. Each of the treatment areas is explored in all of the languages currently supported by PolyAnalyst: English, Spanish, German, French, Italian, and Japanese, with plans to extend to Chinese, Portuguese, and more.
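A pattern-based classifier of the kind described above can be approximated with a small Python sketch. The taxonomy below is hypothetical and far simpler than a production taxonomy, and the sketch covers only the pattern-matching side in English, not semantic analysis or the other languages. Note that a record can receive more than one topic, since natural language comments often raise several issues at once.

```python
import re

# Hypothetical topics and patterns, invented for illustration only.
TAXONOMY = {
    "side effects": [r"\bnausea\b", r"\bdizz(?:y|iness)\b"],
    "dosage": [r"\bdos(?:e|age)\b", r"\bhow (?:much|often)\b"],
    "storage": [r"\brefrigerat\w*", r"\bstor(?:e|age)\b"],
}

def categorise(text):
    """Return every topic whose patterns match the text (non-exclusive)."""
    lowered = text.lower()
    return [
        topic
        for topic, patterns in TAXONOMY.items()
        if any(re.search(p, lowered) for p in patterns)
    ]

print(categorise("Patient felt dizzy after doubling the dose."))
```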

Flexible report generation. The solution automatically pulls data from the company’s servers each week and generates a series of reports. These reports are created automatically at weekly, monthly, and yearly intervals. All reports generated by Megaputer’s solution show aggregate trends, and the scope can be reduced, altered, or expanded to include different customer bases, locations, languages, and more. Additionally, all levels of reports support drill-down functionality, from which the end user can examine the original data records in context.
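To make the idea of scoped, aggregate reporting concrete, the following Python sketch counts categorised records per ISO week and topic, with an optional country filter. The sample records, field names, and the function `weekly_counts` are assumptions made for this example and are not taken from Megaputer's solution.

```python
from collections import Counter
from datetime import date

# Hypothetical categorised records, invented for illustration only.
records = [
    {"date": date(2023, 1, 2), "country": "US", "topic": "dosage"},
    {"date": date(2023, 1, 4), "country": "DE", "topic": "dosage"},
    {"date": date(2023, 1, 10), "country": "US", "topic": "side effects"},
]

def weekly_counts(rows, country=None):
    """Count records per (ISO year, ISO week, topic), optionally for one country."""
    counts = Counter()
    for row in rows:
        if country is not None and row["country"] != country:
            continue  # narrow the report scope to the requested country
        year, week, _ = row["date"].isocalendar()
        counts[(year, week, row["topic"])] += 1
    return counts

print(weekly_counts(records))
print(weekly_counts(records, country="US"))
```

The same grouping could be keyed by month or year instead of week, and keeping the original rows alongside the counts is what would allow a user to drill down from an aggregate figure to the underlying records.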

Results

The pharmaceutical company’s previous methods of data processing were time-consuming and did not permit the company to accurately visualise the patterns in the data. Megaputer’s automated solution is aimed at maximising the efficiency of the company’s data analytics. Specifically, the solution cleans, regularises, anonymises, and categorises the company’s data in such a way that its management can monitor developments in near real time.

In particular, the solution provides the pharmaceutical company with the following advantages:

Scalability. The solution was constructed to handle big data, and its modularity permits data analysis at all levels. Additional languages, locations, and topics can be seamlessly integrated into the existing solution without compromising the clarity of the existing analysis.

Interactivity & intuitiveness. Web-based dashboards allow streaming updates and presentations to decision-makers without the need for additional dedicated software. These dashboards have interactive filters that permit the user to explore the differences between subgroups, and contain infographics which let the user directly view the underlying data. The categorisation derived from the raw data follows a non-exclusive hierarchical structure and has two main functions. First, it permits flexibility in dealing with natural language data, and second, it ensures the classification is both logical and intuitive. For example, common side effects can be inspected by cause, prevention, or other related subtopics.

Granularity & accuracy. Megaputer’s solution is data-driven. The automated system contains drill-down capability at every step of the analysis process while still being able to present the user with global, regional, or product-based trends. In the previous solution, the analysts’ biases influenced the accuracy of the analysis, whereas Megaputer’s solution allows the data to speak for itself. Because the analysis is automated, the results are uniform and reproducible.

Quick turnaround. The system greatly reduces the time and effort needed to perform analysis and generate reports. After implementation of the solution, the time required to produce a report decreased from one week to half a day. Thus, users can view the reports from complex analysis by the next business day.

Benefits

Megaputer’s solution helps this major pharmaceutical company successfully address multiple challenges. Below are the key benefits of the implementation:

Discovery of local and global trends. The solution gives management at the company a clear and accurate picture of near-real-time local and global trends affecting its products and services. Management can react to and manage any emerging issues quickly and proactively, thus containing the damage and protecting and enhancing the company’s brand image.

Product insights. With the solution, the pharmaceutical company has the ability to quickly identify common side effects which result in additional care. This empowers the company with the necessary data to improve the product and customer experience.

Increased customer satisfaction. Using the analysis of questions, complaints, and suggestions of customers, the company can now customise its business practices in response to product evaluation and the specific needs of client groups. Additionally, the company has been able to provide customised training to customer service representatives so that they are now better equipped to navigate common question or complaint areas.

Market understanding. The company uses Megaputer’s solution to research and learn from product launches and trends in different regions. This helps management come up with a better strategy to launch products in new markets and geographical areas.

Better return on investment. The accuracy of PolyAnalyst’s analysis helps the pharmaceutical company to zero in on the root causes of problems, thus helping management focus attention on a few crucial areas rather than many areas. By reducing resources wasted on inaccurate results, Megaputer’s solution frees up substantial portions of market and research budgets. Moreover, data analysis automation saves time and money by decreasing the time spent on each project.


Now that you’ve read the case study, think about the usefulness of automated data processing for a large pharmaceutical company. What were the differences in requirements, data collection, processing, and analysis for the pharmaceutical industry? Could the data analytics applied in this case study also be tailored to the needs of your business? Share your thoughts in the comments area below.


This article is from the free online course:

Digital Leadership: Creating Value Through Technology

University of Reading
