Unlocking Real-Time Insights: How a Financial Institution Adopted Cloud Spanner

Introduction

Digital transformation is rapidly changing the financial services industry, and banks need to have a robust and efficient data management system in place to stay competitive. However, as data volumes continue to grow, managing and processing this data in real-time can be challenging. In this blog post, we will explore how a leading bank was able to modernize its data management system, leveraging Google Cloud technologies, and overcome the challenges it faced with its previous data storage and search system. We will also discuss how Kasna, a Google Cloud Premium partner, helped the bank achieve its goals.

Background

The bank’s previous data storage and search system relied on Apache Cassandra and Apache Solr. The system could store and search the bank’s data, but it could not display fresh data in real time because denormalized data was loaded into Cassandra by daily snapshot jobs. The denormalized data contained the necessary information, but the approach limited the system’s scalability. In addition, when a large company rebranded (for example, Caltex rebranding to Ampol), every related transaction stored in Cassandra had to be updated, which was time-consuming and inefficient. These issues highlighted the system’s limitations in managing real-time data and supporting complex data models.

 

Recognizing the need for a more scalable, efficient, and flexible data management solution, the bank turned to Kasna, a Google Cloud Premium partner with expertise in cloud solutions and data management, to design and implement a new solution that could address these challenges and meet the bank’s evolving needs.

Challenges with Previous Data Storage and Search System

The bank faced several challenges with its previous data storage and search system. One major issue was the system’s limited scalability due to the daily snapshot jobs used to load denormalized data into Cassandra. Additionally, the system’s ability to support complex data models and provide advanced search capabilities was inadequate, making it difficult for the bank to extract valuable insights from its data. The process of updating every related transaction stored in Cassandra was also inefficient and time-consuming, leading to delays in providing timely and accurate information to customers.

 

To overcome these challenges, Kasna designed and implemented a new solution that enabled the bank to manage real-time data more efficiently, support complex data models, and provide advanced search capabilities. The new solution provided the scalability needed to handle the bank’s evolving needs, enabling the institution to keep up with the demands of an ever-changing industry.

Solution Design

The new system is designed to address the challenges of the previous data storage and search system, while also providing several benefits. The diagram below illustrates the components and data flow of the new event-driven system:

Spanner: We use Google Cloud Spanner as our transactional database to store the normalized data. Keeping the data normalized in Spanner makes complex relationships between data easier to manage and lets us read and update data in real time more efficiently. Additionally, we can leverage Spanner’s change streams to stream database changes to other storage systems as they happen, making it easier to integrate our database with other systems in our architecture.
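To make the normalization point concrete, here is a minimal Python sketch. It does not use Spanner at all: plain dictionaries stand in for two normalized tables, and the table and column names (merchants, transactions, merchant_id) are hypothetical. It only illustrates that when a merchant name is stored once and referenced by key, a rename writes a single row and every transaction sees the new name at read time.

```python
# Hypothetical in-memory stand-ins for two normalized tables.
# Names and values are illustrative only, not the bank's schema.
merchants = {"m-1": {"name": "Caltex"}}
transactions = {
    "t-1": {"merchant_id": "m-1", "amount": 54.20},
    "t-2": {"merchant_id": "m-1", "amount": 18.75},
    "t-3": {"merchant_id": "m-1", "amount": 92.10},
}


def rename_merchant(merchant_id, new_name):
    """Rename a merchant and return the number of rows written."""
    merchants[merchant_id]["name"] = new_name
    return 1  # only the merchant row changes; transactions are untouched


def merchant_name_for(transaction_id):
    """Resolve the merchant name through the foreign key at read time."""
    merchant_id = transactions[transaction_id]["merchant_id"]
    return merchants[merchant_id]["name"]


rows_written = rename_merchant("m-1", "Ampol")
```

In a denormalized store, the same rename would have to rewrite every transaction that carries a copy of the merchant name.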

 

Dataflow: We use Apache Beam on Google Cloud Dataflow to denormalize the data, perform any necessary transformations, and bulk index the data into Elasticsearch. Here’s a high-level overview of the event-driven data processing flow:

  • First, we use SpannerIO to read the change streams directly from Spanner. Any updates or inserts to the data are captured as events in near real time.
  • Next, we use a custom ParDo function to denormalize the data by joining it with other relevant data, such as reference data.
  • After denormalizing the data, we use another custom ParDo function to transform the data into a format suitable for indexing into Elasticsearch.
  • Finally, we use the ElasticsearchIO connector to bulk index the transformed data into Elasticsearch.
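The denormalize and transform steps above can be sketched in plain Python. This is not the bank's pipeline code and it does not use Apache Beam: the two functions stand in for the bodies of the custom ParDo functions, the change event is a simplified, assumed shape rather than the full Spanner change stream record, and all field names and the index name are hypothetical. The output follows the newline-delimited JSON layout of the Elasticsearch bulk API, where each document is preceded by an action line.

```python
import json

# Hypothetical reference data. In a real pipeline this might be looked up
# from Spanner or supplied to the ParDo as a side input.
MERCHANTS = {"m-1": {"name": "Ampol", "category": "Fuel"}}


def denormalize(change, merchants):
    """Join a changed transaction row with merchant reference data."""
    row = dict(change["new_values"])
    merchant = merchants.get(row["merchant_id"], {})
    row["transaction_id"] = change["keys"]["transaction_id"]
    row["merchant_name"] = merchant.get("name")
    row["merchant_category"] = merchant.get("category")
    return row


def bulk_body(docs, index="transactions"):
    """Build a bulk request body: one action line plus one source line per doc."""
    lines = []
    for doc in docs:
        action = {"index": {"_index": index, "_id": doc["transaction_id"]}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Simplified change event (assumed shape, for illustration only).
change = {
    "table": "transactions",
    "mod_type": "INSERT",
    "keys": {"transaction_id": "t-1"},
    "new_values": {"merchant_id": "m-1", "amount": 54.2},
}

doc = denormalize(change, MERCHANTS)
body = bulk_body([doc])
```

Batching many documents into one body is what lets the final step index in bulk instead of issuing one request per record.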

 

Elasticsearch: We use Elasticsearch to store and search the denormalized data. The data is indexed in a denormalized form to enable fast and efficient search queries.

 

With this event-driven design, we are able to handle large updates to the data, such as renaming a large number of merchants in past transactions, by leveraging the bulk indexing capabilities of Elasticsearch. This approach allows us to update many records efficiently, without incurring the overhead of individual updates. Additionally, the use of change streams allows us to stream data to other storage systems like BigQuery, enabling real-time analytics and reporting on the transaction data.
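As a rough illustration of the rename case, the Python sketch below turns a single merchant change event into one bulk request body containing a partial update for every affected transaction. The event shape, the lookup from merchant to transaction IDs, and the field and index names are all assumptions made for this example; only the action-line and doc-line layout follows the Elasticsearch bulk API.

```python
import json

# Hypothetical lookup: merchant ID -> IDs of transactions that reference it.
TRANSACTIONS_BY_MERCHANT = {"m-1": ["t-1", "t-2", "t-3"]}


def rename_to_bulk_updates(merchant_change, tx_index, index="transactions"):
    """Turn one merchant rename into a bulk body of partial updates."""
    merchant_id = merchant_change["keys"]["merchant_id"]
    new_name = merchant_change["new_values"]["name"]
    lines = []
    for tx_id in tx_index.get(merchant_id, []):
        # Action line, followed by the partial document to merge.
        lines.append(json.dumps({"update": {"_index": index, "_id": tx_id}}))
        lines.append(json.dumps({"doc": {"merchant_name": new_name}}))
    if not lines:
        return ""  # nothing references this merchant, so nothing to send
    return "\n".join(lines) + "\n"


# Simplified change event for the rename (assumed shape).
rename = {
    "table": "merchants",
    "mod_type": "UPDATE",
    "keys": {"merchant_id": "m-1"},
    "new_values": {"name": "Ampol"},
}

body = rename_to_bulk_updates(rename, TRANSACTIONS_BY_MERCHANT)
```

One write in the normalized store fans out to many search documents, but they travel in a single bulk request rather than as individual updates.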

Benefits of the New System

Real-time data insights

The combination of Spanner change streams and Dataflow has enabled the bank to stream real-time data updates to Elasticsearch in an event-driven manner. This allows for real-time insights and search capabilities on the data, providing near-instant information to the bank as well as users. This real-time data processing also enables faster and more accurate decision-making.

Scalability

Both Cloud Spanner and Dataflow are designed to scale horizontally, which allows the bank to handle any increase in data volume or throughput. This scalability ensures that the system remains performant even as the organization grows and the amount of data it needs to manage increases.

Cost savings

With the adoption of Cloud Spanner, the bank has been able to reduce its operational costs. By leveraging the capabilities of the platform, the bank no longer needs to manage its own database infrastructure, which reduces both operational costs and the need for a dedicated team to manage the infrastructure.

Reliability

Cloud Spanner provides high availability and strong consistency guarantees, ensuring that the bank’s data is always available and up to date. Additionally, Dataflow is designed to automatically recover from failures and handle backpressure, further increasing the reliability of the system.

Security

Google Cloud Platform offers robust security features, including encryption at rest and in transit, IAM, and VPC Service Controls. These features ensure that the bank’s data is protected from unauthorized access.

Simplified management

Cloud Spanner simplifies management by providing a single system that can handle a wider range of data types and use cases, reducing management complexity. This is particularly beneficial when dealing with relational data or other use cases that may not be well-suited for Cassandra. With Spanner, the bank can handle all of its data needs with a single system, making it easier to manage and reducing the potential for errors.

Flexibility

With denormalized data stored in Elasticsearch, the bank can easily perform complex queries and aggregations on the data, enabling a wide range of use cases. Additionally, with the ability to stream data changes to other systems, such as BigQuery, the bank can perform even more complex analysis and gain additional insights into its data. This flexibility allows the bank to derive more value from its data and make better-informed decisions.

Comparison of the Old and New System

The table below provides a comparison of the old system (Cassandra + Solr) and the new system (Spanner + Dataflow + Elasticsearch):

| Aspect | Old System (Cassandra + Solr) | New System (Spanner + Dataflow + Elasticsearch) |
| --- | --- | --- |
| Data model | Requires denormalization | Supports complex data models with normalization in Spanner and denormalization in Elasticsearch |
| Real-time updates | Not natively supported and challenging due to denormalized data and limited data pre-aggregation | Cloud Spanner provides real-time data updates through change streams, which are then streamed to Dataflow for processing and denormalization. The denormalized data is then indexed and searchable in Elasticsearch |
| Search capabilities | Limited and difficult to scale | Advanced search capabilities including real-time search powered by Elasticsearch |
| Scalability | Challenging to scale as the volume of data grows | Can handle the growing volume of data with horizontal scaling in both Dataflow and Spanner |
| Cost Savings | N/A | Reduced operational costs due to leveraging cloud native service |
| Reliability | Limited availability and consistency guarantees | High availability and strong consistency guarantees |
| Security | Limited security features | Strong security features including encryption at rest and transit, IAM, and VPC Service Controls |
| Management Complexity | Management complexity increases for relational data and other use cases that may not be well-suited for Cassandra | Reduced management complexity and a wider range of data types and use cases supported with a single system |

By migrating to the new system, the bank has been able to improve its data management capabilities while also realizing significant benefits in terms of scalability, cost savings, reliability, security, and simplified management.

Conclusion

In conclusion, the modernized data management system designed and implemented by Kasna has transformed the bank’s data management capabilities, providing benefits such as real-time data analytics, scalability, cost savings, reliability, and security. By leveraging Cloud Spanner and Dataflow, the bank can now store and analyze large amounts of data in real time, stream the data downstream into other data stores such as Elasticsearch, and extract valuable insights.

 

The improved data management and analytics capabilities directly impact the bank’s digital experience. With real-time analytics and search capabilities, customers can access their account balances and transactions in real-time, advisers can monitor and manage their clients’ portfolios, and users can receive personalized recommendations based on their financial data. These real-time capabilities allow for faster, more informed decision-making, enhancing the overall customer experience and increasing customer satisfaction.

 

In summary, Kasna’s solution not only enables the bank to manage its data more efficiently but also offers a better digital experience to its customers. This improved experience can have a significant positive impact on the bank’s business, leading to increased retention and growth. With the adoption of this modernized system, the bank is well-positioned to stay ahead of the competition and meet the ever-changing needs of its customers.
