Big Data Implementation in Fintech

1. LinkAja implemented big data technologies to address scalability issues and enable data-driven decisions.
2. They moved all data infrastructure to the cloud for scalability and built a data lake for unified data access.
3. The big data group was formed to build an AI/ML platform, implement data governance, and empower business users with self-service analytics.

Big Data Implementation at Fintech

Muhammad Saipul Rohman


Data Engineer
A glimpse of LinkAja
LinkAja 2020 3
… enabled by the support we receive from key stakeholders and SOEs

Big Data at LinkAja
Oct 2019: before the fun begins...

• Scalability issues
• Vendor-driven development
• Information retrieval is challenging
• No extended data products
Where were we?

Scalability issues
1. Monolithic architecture

2. Architecture & network modification was a challenge

3. Hard to scale servers/engines

4. Infrastructure was not so easy to govern

5. No clear logging & monitoring system

6. Scattered databases
Now, talking about the team

Vendor-driven development
1. No internal resources; too much dependency on vendors

2. Communication is a challenge

3. No clear data strategy


Talking about a classic problem

Information retrieval is challenging
1. Data is not very consistent

2. Metrics definitions are scattered all over the organization

3. Difficult to extend information influence through visualization

4. Democratizing data.. Are you kidding me?


Limited room for innovation

No extended data products

1. Cannot build AI/ML products

2. Cannot do UI/UX & product experimentation


Moving All Data Infrastructure to Cloud

Scalability issues
Compute engine & networking

Storage and data warehouse

Logging, monitoring, & IAM


Why we exist

• The need for data unification, layering, & democratization
• Talent pool demand to build big data capabilities, such as data engineering, analytics & data governance, and AI/ML
• Critical business activities require big data technology, i.e. business decisions, product development, & crucial operational work
• Requirement to advocate high-quality and secure data

What are our missions

1. Building a single source of truth of company data lake, that is scalable, reliable, and with high availability
2. Instil the organisation as a whole with fit-for-purpose data governance principles
3. Empowering business with accessible and secure data platform to extract the benefit from data driven culture
4. Building Artificial Intelligence/Machine Learning products that produce high business value
What we do and how we do it

Getting the basics perfect
• We focus our access management on our cloud assets
• Our cloud assets are great, but we are the ones optimizing them
• We are the experts in data, from storing it, accessing it, and everything in between. So partner with us when using data

We try to do less of these…
• instant, in-the-moment requests
• ad-hoc data retrieval
• working on requests with short-term thinking

…instead, we do these
• structured, planned work
• data enablement and evangelism
• solution building and educating on the most efficient and sustainable outcome
Big Data Group

Pillars: Data Engineering, Data Platform & Core BI, AI/ML Platform

Data Solution Engineering
Data pipeline development, cross data sources integration, and data product enablement to support business decision & operational optimization

Data Infrastructure Engineering
Build scalable & high performing data infrastructure that enhances the efficiency and productivity of the data environment

Core BI & Reporting
Proactive data enablement by anticipating business needs of self-service data access

Product Analytics
Partnering with Product Owners and Product Managers to empower the business in making product development data-driven decisions

Data Governance
Raise awareness, build frameworks and enforce activities of data quality and data security in all data domains across the organisation

AI/ML Engineering
Bringing best-in-class of self-service AI/ML platform to support business in developing AI/ML model

AI/ML Scientist
Proactively empowering AI/ML activities in providing business and product insights through simulation, intelligent pattern, and unstructured data insight
How did Big Data start?
Evolution of Big Data
The 5 Vs of Big Data
Types of Data Sources
Common roles in Data Teams
Big Data Architecture

• Lambda Architecture
• Kappa Architecture
Lambda Architecture
Example Lambda Architecture
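
As a rough illustration of the Lambda idea (a batch layer and a speed layer whose results are merged at query time), here is a minimal sketch in plain Python. The event shape, the function names, and the in-memory views are assumptions made for this example; they are not taken from the slides or from LinkAja's implementation.

```python
from collections import Counter

# Hypothetical event shape: (user_id, amount). Not taken from the slides.
Event = tuple[str, int]


def build_batch_view(historical_events: list[Event]) -> Counter:
    """Batch layer: recompute totals from the full, immutable history."""
    view: Counter = Counter()
    for user_id, amount in historical_events:
        view[user_id] += amount
    return view


class SpeedLayer:
    """Speed layer: incrementally aggregates events not yet covered by a batch run."""

    def __init__(self) -> None:
        self.view: Counter = Counter()

    def ingest(self, event: Event) -> None:
        user_id, amount = event
        self.view[user_id] += amount


def query_total(user_id: str, batch_view: Counter, speed_view: Counter) -> int:
    """Serving layer: merge the batch view with the real-time view."""
    return batch_view[user_id] + speed_view[user_id]


history = [("alice", 100), ("bob", 50), ("alice", 25)]
batch_view = build_batch_view(history)

speed = SpeedLayer()
speed.ingest(("alice", 10))  # arrives after the last batch run

print(query_total("alice", batch_view, speed.view))
```

The trade-off this sketch makes visible is that the same aggregation logic lives in two places (batch and speed), which is the usual maintenance cost of a Lambda design.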
Kappa Architecture

Data Sources → Streaming/Real-time Ingestion → Streaming/Real-time Processing → Analytical Data Store → Analytics & Reporting
Example Kappa Architecture
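
A Kappa design keeps a single streaming path and handles reprocessing by replaying the event log through new logic. The sketch below illustrates that idea in plain Python; the event fields (`merchant`, `amount`, `fee`) and the function `run_stream_job` are hypothetical names chosen for the example, and a Python list stands in for a durable log.

```python
from collections import defaultdict
from typing import Callable, Iterable

# Hypothetical event shape; the slides do not define one.
Event = dict


def run_stream_job(
    log: Iterable[Event], amount_of: Callable[[Event], int]
) -> dict[str, int]:
    """Single processing path: fold every event from the log into the analytical store."""
    store: dict[str, int] = defaultdict(int)
    for event in log:
        store[event["merchant"]] += amount_of(event)
    return dict(store)


# A list stands in for the replayable event log.
event_log = [
    {"merchant": "coffee", "amount": 20, "fee": 1},
    {"merchant": "parking", "amount": 5, "fee": 0},
    {"merchant": "coffee", "amount": 30, "fee": 2},
]

# Version 1 of the job: gross amounts.
gross_store = run_stream_job(event_log, lambda e: e["amount"])

# Version 2: the logic changed, so the same log is replayed from the start.
net_store = run_stream_job(event_log, lambda e: e["amount"] - e["fee"])

print(gross_store)
print(net_store)
```

Unlike the Lambda approach, there is no separate batch code path here: a change in logic is deployed as a new run over the same log.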
Data Ingestion

Sources → Data lake → Data mart → Self-service playground

Consumers: BI, Reporting, AI/ML
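
The layering above can be sketched as follows: raw records land in the lake untouched, and a data mart is derived from them for downstream use. This is only an illustrative sketch; the record fields (`date`, `amount`), the function names, and the in-memory list used as a "lake" are all assumptions, not the actual LinkAja pipeline.

```python
from collections import defaultdict


def load_to_lake(lake: list[dict], raw_records: list[dict], source: str) -> None:
    """Lake layer: store records as-is, tagged with the source they came from."""
    for record in raw_records:
        lake.append({"source": source, "payload": record})


def build_daily_mart(lake: list[dict]) -> dict[str, int]:
    """Mart layer: derive a cleaned, aggregated table (total amount per date)."""
    totals: dict[str, int] = defaultdict(int)
    for row in lake:
        payload = row["payload"]
        # Skip records that lack the fields the mart needs.
        if "date" not in payload or "amount" not in payload:
            continue
        totals[payload["date"]] += int(payload["amount"])
    return dict(totals)


lake: list[dict] = []
load_to_lake(lake, [{"date": "2020-01-01", "amount": "100"}], source="database")
load_to_lake(lake, [{"date": "2020-01-01", "amount": 50}, {"note": "malformed"}], source="api")

daily_mart = build_daily_mart(lake)
print(daily_mart)
```

Keeping the lake raw means the mart can be rebuilt with different rules later without re-ingesting from the sources.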
Democratization is nonsense without a clear governance framework

We need to ensure the data is of high quality
1. Plant Airflow sensors in the ingestion process
2. Looker dashboard for ingestion processes

We need to ensure the data is secure
1. Looker dashboard for all individual & service accounts access
2. Platform security assessment
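
To make the "sensor in the ingestion process" idea concrete, here is a small sketch of the kind of check such a sensor might evaluate: it returns True only when a load has enough rows and is recent enough, similar in spirit to how an Airflow sensor polls a condition until it holds. The function name, thresholds, and inputs are hypothetical; the slides do not describe the actual checks, and this sketch deliberately avoids importing Airflow.

```python
from datetime import datetime, timedelta


def ingestion_is_healthy(
    row_count: int,
    last_loaded_at: datetime,
    now: datetime,
    min_rows: int = 1,
    max_lag: timedelta = timedelta(hours=1),
) -> bool:
    """Return True when the latest load has enough rows and is fresh enough.

    Thresholds are illustrative defaults, not values from the presentation.
    """
    has_enough_rows = row_count >= min_rows
    is_fresh = (now - last_loaded_at) <= max_lag
    return has_enough_rows and is_fresh


now = datetime(2020, 6, 1, 12, 0)

# A recent load with data passes the check.
print(ingestion_is_healthy(120, now - timedelta(minutes=10), now))

# An empty load or a stale load fails, which would block downstream tasks.
print(ingestion_is_healthy(0, now - timedelta(minutes=10), now))
print(ingestion_is_healthy(120, now - timedelta(hours=3), now))
```

The boolean results of checks like this could also be logged per run and surfaced in a dashboard for ingestion processes.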


AI is useless if you have a scrappy foundation

Primary: Data engineering
Secondary: Reporting, Dashboard, & Analytics
Tertiary: AI/ML
Big Data Use Cases at LinkAja
Tech Stacks and Production Components
Data Engineering Technology Ecosystems

Data Sources: Databases, Excel, Files, API, .....
Data Pipelines: Streaming Infrastructure (Connect), Batching Infrastructure
Persistence Layer: Google BigQuery, Google Cloud Storage
Monitoring: Airflow, GCP Console Dashboard
Core Business Intelligence

Roles
• Data evangelist
• Data linkage and data access consultant
• Data-driven business advocate

Activities
• Visualisation: Business Intelligence dashboards
• Data enablement: Business-focused enablement
• Core reporting

Self-service platform
• Business data dictionary
• Operational data dashboard
• Query optimization how-to guide

Government projects
• Kartu Prakerja disbursement
• Pegadaian Emas integration with LinkAja

Financial regulators partner
• Bank Indonesia reporting
• Data pipeline documentation

Product Analytics and Experimentation

Analytics dashboards
• Customer segmentation dashboards
• Product usage dashboards
• Performance dashboards

Product cataloguing
• Product Requirement Document with data in mind
• Product categorization based on data

Data-focused product development
• Data points enablement during product development
• Event tracking analysis

Data Governance

Focus areas
• Data quality and security
• Data governance policy implementation
• Best in class tools
• Awareness-building
• Business-focussed requirements
• Data governance framework

Data quality centre of excellence
• Prioritisation of data of high importance and value
• Data quality tools and processes improvements
• Data ingestion flow metrics (work in progress!)

Adoption of data governance tools
• Maximum utilization of Google Cloud Platform built-in tools
• Bespoke data catalog tool (work in progress!)

Simplification through clear framework
• Optimising the number of user access groups
• Bespoke data catalog tool (work in progress!)

AI/ML Engineering
• Building AI/ML Platform
• Making AI/ML platform adoption easier
• Automation of AI/ML production

AI/ML Scientist

Starting AI Guild as internal community

eKYC improvement
• Image quality metrics
• ID card detection

NLP Project
• Gender prediction
• Syariah recommendation
