One of the largest US payment processing companies improved its data quality, built a reliable big data reservoir, and uncovered millions of dollars owed to it in unpaid balances.
Business objective & project requirements
Due to its strategy of growth by acquisition, the financial services company was composed of several different payment networks (the acquired companies), each with its own technical stack and sales processes. The processor aimed to reach a stage where all of its payment networks could access the same data and use standard practices, sales strategies, and technology.
- Move from data lake to reservoir: Transition from data being stored, maintained, and used separately by each payment network to data being collected, accessible, and usable in a single place.
- Work with data instantly: Perform analytics and detect patterns in real time.
- Detect payment fraud: Although fraud is also handled by banks and payment technology companies like Visa and Mastercard, the company wanted to expand its product offering with services such as fraud detection.
Project phases
- Proof of concept and early work in big data and reference data: Ataccama gave meaning to the raw data in Hadoop by defining reference values such as payment and error codes, saving the company $1 million at the start of the engagement.
- Standardization: Data was not consistently organized across the various platforms, so Ataccama worked to standardize all company data (a simplified sketch of this step follows the list).
- Enrichment: Ataccama helped the company enrich its data with information that was not originally available, such as addresses, postal codes, and transaction types.
- Distribution and business user empowerment: Ataccama gave all company analysts access to the standardized and enriched data in Hadoop. We also enabled them to build small databases on demand, which proved especially beneficial to their marketing department.
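
Taken together, the reference data, standardization, and enrichment phases amount to applying lookup rules record by record. Below is a minimal Python sketch of that idea; the networks, code tables, and field names are hypothetical examples, and the actual solution was built with Ataccama's data quality tooling on Hadoop rather than hand-written dictionaries.

```python
# A minimal sketch of rule-based standardization and enrichment.
# The reference tables and field names below are hypothetical, not the
# processor's actual codes or Ataccama's implementation.

# Reference data: map each network's proprietary codes to one shared vocabulary.
PAYMENT_CODES = {
    "NETWORK_A": {"01": "PURCHASE", "02": "REFUND", "09": "CHARGEBACK"},
    "NETWORK_B": {"P": "PURCHASE", "R": "REFUND", "C": "CHARGEBACK"},
}

# Enrichment lookup: derive a transaction type that no source system recorded.
TXN_TYPE_BY_MCC = {"5411": "GROCERY", "5812": "RESTAURANT", "4111": "TRANSIT"}


def standardize(record: dict) -> dict:
    """Replace a network-specific payment code with the shared reference value."""
    codes = PAYMENT_CODES.get(record["network"], {})
    record["payment_code_std"] = codes.get(record["payment_code"], "UNKNOWN")
    return record


def enrich(record: dict) -> dict:
    """Add a transaction type derived from the merchant category code (MCC)."""
    record["txn_type"] = TXN_TYPE_BY_MCC.get(record.get("mcc"), "OTHER")
    return record


raw = {"network": "NETWORK_B", "payment_code": "R", "mcc": "5411"}
print(enrich(standardize(raw)))
# {'network': 'NETWORK_B', 'payment_code': 'R',
#  'payment_code_std': 'REFUND', 'txn_type': 'GROCERY'}
```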
Solution & benefits
Ataccama migrated data from the formerly independent companies into a single Hadoop-based location and implemented a data quality solution so that the data could be better analyzed, understood, and utilized.
Hadoop Implementation: In the first stage, Ataccama helped the company migrate its data from the individual companies to a Hadoop cluster.
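
As a rough illustration of what such a consolidation looks like, the sketch below copies per-network extracts into one shared HDFS directory layout using the standard HDFS shell commands. The paths and network names are hypothetical; the case study does not describe the actual migration mechanics.

```python
# A minimal sketch of consolidating per-network extracts into one HDFS layout.
# Paths and network names are hypothetical.
import subprocess

NETWORKS = ["network_a", "network_b", "network_c"]

for network in NETWORKS:
    target = f"/data/reservoir/raw/{network}"
    # `hdfs dfs -mkdir -p` and `hdfs dfs -put` are standard HDFS shell commands.
    subprocess.run(["hdfs", "dfs", "-mkdir", "-p", target], check=True)
    subprocess.run(
        ["hdfs", "dfs", "-put", f"/exports/{network}/transactions/", target],
        check=True,
    )
```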
Data Profiling and Cleansing: Migrating data to a single location on Hadoop did not, by itself, make that data any better understood. Ataccama made the company’s data more useful by implementing a comprehensive data quality solution: profiling, so the organization could understand the contents of its data lake, and cleansing, to improve and standardize the data.
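
Profiling boils down to summarizing each column so analysts can see what actually sits in the lake. Below is a minimal, self-contained Python sketch of that idea using hypothetical sample records; a production profiler such as Ataccama's covers far more (value patterns, frequency distributions, cross-column rules).

```python
# A minimal sketch of column profiling: per-column null rates, distinct
# counts, and top values. The sample records are hypothetical.
from collections import Counter

records = [
    {"payment_code": "01", "zip": "60601", "amount": "19.99"},
    {"payment_code": "XX", "zip": None, "amount": "250.00"},
    {"payment_code": "01", "zip": "6060", "amount": None},
]

for column in records[0]:
    values = [r[column] for r in records]
    non_null = [v for v in values if v is not None]
    print(f"{column}:")
    print(f"  null rate:  {1 - len(non_null) / len(values):.0%}")
    print(f"  distinct:   {len(set(non_null))}")
    # Value frequencies hint at bad codes (e.g. 'XX') and truncated ZIPs.
    print(f"  top values: {Counter(non_null).most_common(2)}")
```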
Stats
- Roughly 250 GB of data is compressed each day, the equivalent of approximately 5,000 transactions per second.
- The company’s Hadoop cluster grew from its original 12 nodes to 32 nodes over the course of the engagement.
- Company systems now include data dating back to 2012, accumulating at roughly 1 terabyte every 4 days.
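
These figures are mutually consistent, as a quick back-of-envelope check shows (assuming decimal units, 1 TB = 1,000 GB):

```python
# Back-of-envelope check that the published figures line up with each other.
GB_PER_DAY = 250
SECONDS_PER_DAY = 24 * 60 * 60                     # 86,400

print(GB_PER_DAY * 4 / 1000)                       # 1.0 TB every 4 days
print(GB_PER_DAY * 1e9 / SECONDS_PER_DAY / 1e6)    # ~2.9 MB/s sustained
# At ~5,000 transactions per second, that is roughly 580 bytes per transaction.
print(GB_PER_DAY * 1e9 / SECONDS_PER_DAY / 5000)   # ~578.7 bytes
```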