Fuzzy Computing Applications For Anti Mo
Abstract: Fuzzy computing (FC) has made a great impact in capturing human domain knowledge and modeling non-linear mapping of input-output space. In this paper, we describe the design and implementation of FC systems for detection of money laundering behaviors in financial transactions and monitoring of distributed storage system load. Our objective is to demonstrate the power of FC for real-world applications which are characterized by imprecise, uncertain data, and incomplete domain knowledge. For both applications, we designed fuzzy rules based on experts' domain knowledge, depending on money laundering scenarios in transactions or the "health" of a distributed storage system. In addition, we developed a generic fuzzy inference engine and contributed it to the open source community.

I. INTRODUCTION

A. Motivation
There is a wide variety of industrial and financial problems which require the analysis of uncertain and incomplete information. To make matters worse, data used for the analysis are often imprecise. These problems present a great opportunity for the application of fuzzy computing technologies.
In a distributed storage system, it is imperative to understand usage patterns such as memory and CPU, so as to better balance load and avoid bottlenecks. This can be accomplished by using a tool which measures the state of the system and reports its overall "healthiness". Such a tool ought to have a high level of sophistication which incorporates real-time monitoring and decision-making. In a financial institution, we can still find labor-intensive tasks such as review of suspicious account activities based on alerts triggered by a rule-based system with hard-coded cutoffs. Due to the complexity of these tasks, artificial intelligence (AI), and in particular fuzzy computing, has been called upon in support of monitoring of a distributed storage system and detection of money laundering transactions. This paper focuses on the use of fuzzy computing in these two areas: monitoring and detection. To begin, we give a brief overview of fuzzy computing in the next section.

B. Fuzzy Computing
The works of Post, Kleene and Lukasiewicz were among the first treatments of imprecision and vagueness in multiple-valued logic systems, as opposed to classical Boolean logic [1], [2]. In 1937, Max Black proposed the use of a consistency profile to represent vague concepts [3]. Most importantly, Zadeh offered a complete theory of fuzzy sets and fuzzy logic in 1965 [4], which enabled us to represent and manipulate ill-defined concepts. In addition, Zadeh defined fuzzy logic's four facets [5], which provided us a language with syntax and semantics for computation. In particular, fuzzy logic allows us to use linguistic variables to model dynamic systems by a set of fuzzy rules. Each rule consists of a set of linguistic variables. These variables take fuzzy values, which are characterized by fuzzy membership functions. In addition, there is a reasoning mechanism, the fuzzy inference engine, which operates on the fuzzy rules based on the generalized modus ponens [6]. A comprehensive review of fuzzy logic and fuzzy computing can be found in [7].

C. Paper Structure
In the next section we focus on FC and its applications in monitoring and detection. After a brief discussion of the problem of monitoring and detection in Section II, we illustrate two applications of FC techniques. The first application, described in Section III, consists of the adaptation of fuzzy rules to the monitoring of distributed storage systems. The second application, illustrated in Section IV, covers the use of fuzzy logic inference to detect anomalies that indicate money laundering. In the last section, Section V, we summarize the advantages of using FC and discuss some potential extensions of these technologies.

II. FUZZY COMPUTING APPLICATIONS FOR MONITORING AND DETECTION

In this paper, we study two fuzzy computing applications in the areas of monitoring and detection: fuzzy load monitoring for a distributed storage system and fuzzy anti-money laundering for a financial institution. In general, monitoring is the first step to understanding any complex real-world system. If certain undesired cases or scenarios are discovered during the monitoring process, the next logical step is to detect the recurring patterns, and subsequently to try to isolate the issues. Finally, a control strategy can be formalized to manage the problems.
Table I
FUZZY RULES FOR LOAD MONITORING

Memory\CPU   LOW   AVG   HIGH
LOW          HVG   HG    HB
AVG          HG    HG    HB
HIGH         HB    HB    HB
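
As a rough illustration of how a rule table like Table I can be evaluated, the sketch below fires each rule with the minimum of its two antecedent degrees and combines rules that share a health label with the maximum. These min/max operators are a common choice and are an assumption here, not necessarily what the deployed engine used. The function name `infer_health` and the sample membership degrees are hypothetical.

```python
# Rule table transcribed from Table I: (memory label, CPU label) -> health label.
RULES = {
    ("LOW", "LOW"): "HVG", ("LOW", "AVG"): "HG", ("LOW", "HIGH"): "HB",
    ("AVG", "LOW"): "HG", ("AVG", "AVG"): "HG", ("AVG", "HIGH"): "HB",
    ("HIGH", "LOW"): "HB", ("HIGH", "AVG"): "HB", ("HIGH", "HIGH"): "HB",
}


def infer_health(memory_degrees, cpu_degrees):
    """Return the firing strength of each health label.

    Both arguments map LOW/AVG/HIGH to a membership degree in [0, 1].
    AND is taken as min; rules with the same consequent are merged with max
    (assumed operators, see the text above).
    """
    strengths = {}
    for (mem_label, cpu_label), health in RULES.items():
        fired = min(memory_degrees.get(mem_label, 0.0),
                    cpu_degrees.get(cpu_label, 0.0))
        strengths[health] = max(strengths.get(health, 0.0), fired)
    return strengths


# Hypothetical membership degrees for one measurement of a master.
memory = {"LOW": 0.0, "AVG": 0.7, "HIGH": 0.3}
cpu = {"LOW": 0.6, "AVG": 0.4, "HIGH": 0.0}
print(infer_health(memory, cpu))
```

A defuzzification step would still be needed to turn these strengths into a single health value; it is omitted because it depends on the output membership functions, which are not given here.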
• We increased the size of the window and switched from a rectangular window to a Gaussian window because the rectangular window was too sensitive to high-frequency noise.
• We tweaked some of our membership functions over time by reviewing our false positives and false negatives, each time asking the on-call engineer which thresholds had led him to decide which data pointed to the root cause of the problem with the GFS masters.

F. Conclusions on Fuzzy Load Monitoring
We successfully designed, implemented and deployed a fuzzy inference engine for monitoring the masters of a distributed file system. Future work involves measuring the nodes (chunkservers [8]) of the file system, and using the health of the master and the nodes to automatically determine the available capacity and availability of the GFS cluster.

IV. FUZZY DETECTION FOR ANTI MONEY LAUNDERING

A. Problem Description
Anti-money laundering (AML) was implemented in the U.S. by the Bank Secrecy Act of 1970. AML refers to the legal controls that require financial institutions and other regulated entities to prevent or report money laundering activities [11]. According to Wikipedia [12], AML is a term "mainly used in the financial and legal industries to describe the legal controls that require financial institutions and other regulated entities to prevent or report money laundering activities." Under U.S. law, money laundering includes all financial transactions generating an asset or a value as the result of an illegal act, for example tax evasion and false accounting. All financial institutions are required to identify transactions of a suspicious nature and report them to the financial intelligence unit in their country [13]. One popular approach [14], [15] is to apply scenarios and risk factors to transactions to detect potentially suspicious activity. Transactional events that meet the rule parameters become alerts. Finally, alerts are subject to additional workflow processes, such as suppression, risk scoring and routing.

of "deliciousness", so that it can develop an inference engine based on these linguistic terms later on.
We began the fuzzy AML work by assuming a hypothetical financial corporation, XCorp, whose business involves servicing clients for sending and receiving money via the Internet. A significant body of knowledge on money laundering (ML) was amassed via knowledge engineering. In practice, this was done through systematic identification of suspicious transactions by customer service representatives, labor-intensive case studies by analysts, and finally, generalization of ML indicators by both parties. The main data sources were as follows:
• From existing fraud models, which isolate fraudsters who might be ML risks as well
• From customer service representatives who view accounts and are in contact with customers
• From analysts who conduct web searches looking for questionable websites accepting/offering XCorp payments
• From customers or third parties who become aware of suspicious activity, such as when customers report account takeover
The following case study highlights a real-world example:
• A suspect using a UK postal address developed a pattern of creating XCorp accounts, receiving funds, and sending funds to several French accounts; the funds were then withdrawn to bank accounts. Receiving and sending accounts were closed after the transactions were made. 25 accounts were identified, with a total transaction amount of $32K. Sending and recipient accounts shared IPs and machine fingerprints.
From the case study, a number of ML indicators were generalized:
• Multiple accounts controlled by one party
• Many account activities within a short time window, such as opening - sending - closing and opening - receiving - withdrawing - closing
• No viable business reasons
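
The paper does not say how accounts were tied to a single controlling party. As a minimal sketch, assuming each account record carries the sets of IP addresses and machine fingerprints seen on it, accounts that share any such value can be grouped with a union-find structure. The function name `link_accounts` and the record keys `ips` and `fingerprints` are hypothetical.

```python
from collections import defaultdict


def link_accounts(accounts):
    """Group account ids that share an IP address or a machine fingerprint.

    `accounts` maps an account id to a dict with the hypothetical keys
    "ips" and "fingerprints", each holding a set of strings.
    Returns a list of sorted groups, largest group first.
    """
    parent = {acct: acct for acct in accounts}

    def find(acct):
        # Walk up to the group representative, shortening the path on the way.
        while parent[acct] != acct:
            parent[acct] = parent[parent[acct]]
            acct = parent[acct]
        return acct

    # Remember the first account seen with each attribute value and merge
    # every later account that shares that value into the same group.
    first_seen = {}
    for acct, attrs in accounts.items():
        for kind in ("ips", "fingerprints"):
            for value in attrs.get(kind, ()):
                key = (kind, value)
                if key in first_seen:
                    parent[find(acct)] = find(first_seen[key])
                else:
                    first_seen[key] = acct

    groups = defaultdict(list)
    for acct in accounts:
        groups[find(acct)].append(acct)
    return sorted((sorted(g) for g in groups.values()),
                  key=lambda g: (-len(g), g))


# Made-up records: A1 and A2 share an IP, A2 and A3 share a fingerprint.
sample = {
    "A1": {"ips": {"10.0.0.1"}, "fingerprints": {"fp-x"}},
    "A2": {"ips": {"10.0.0.1"}, "fingerprints": {"fp-y"}},
    "A3": {"ips": {"10.0.0.9"}, "fingerprints": {"fp-y"}},
    "A4": {"ips": {"10.0.0.7"}, "fingerprints": {"fp-z"}},
}
print(link_accounts(sample))
```

Large groups found this way would only be candidates for the "multiple accounts controlled by one party" indicator; shared IPs can also have innocent explanations, so a human review step is still assumed.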
The idea was to come up with a score from amount received and match score. For instance:
• AML1 score is very high, if amount received is big and match score is high
• AML1 score is very low, if amount received is small and match score is low
Therefore, the AML1 score is a real number bounded by [0,1], which represents the possibility that the account is an ML violation.
Note that the match score represents the degree of match between $ received and $ withdrawn in a time window. In essence, the score is a function of three parameters: amount received, amount withdrawn and time window. For instance:
• Match score is high, if $ withdrawn is within [80%, 120%] of $ received
• Match score is high, if $ withdrawn immediately after $ received
• Match score is zero, if $ withdrawn more than 3 days after $ received
• Match score is moderate, if $ withdrawn within 1-3 days after $ received
In essence, match score is a real number bounded by [0,1]. The higher the score, the closer the match of the two dollar amounts within 1-3 days. Let's define a difference measure as \( \left| \frac{a-b}{b} \right| \), and label it as "absolute % change," where a and b are the $ withdrawn and received in a time window, respectively. Three levels of absolute % change scores were calculated as there were three different time windows: 1-, 2- and 3-day. In addition, a "match factor" was defined and it took integer values in {1,2,3}, representing three levels of time windows.
• Match.score = 0.9^(match.factor-1) × (1 - abs.%.change/0.2), if abs.%.change <= 0.2
• Match.score = 0, if abs.%.change > 0.2
Essentially, the score got a 10% and a 19% discount if $ was withdrawn in 2 and 3 days, respectively. Refer to figure 2 for details.

Figure 2. Absolute change percentage vs match score

Table II
AML SCORE FOR EACH OF THE 9 FUZZY RULES

Amount Rcvd\Match score   S    M    B
L                         M    H    VH
M                         L    M    H
S                         VL   L    M

were constructed for the inference, as shown in Table II. For instance:
• If amt received is big and match score is high, then AML1 is very high
• If amt received is medium and match score is medium, then AML1 is medium
• If amt received is small and match score is low, then AML1 is very low

2) Fuzzy membership functions: As described in the last section, there were three fuzzy sets: amt received, match score and AML1, where amt received and match score were the antecedents and AML1 was the consequent. Their membership functions were defined as shown in figures 3, 4 and 5, respectively.

E. Experimental Results
We tested the fuzzy AML1 scoring on some sampled accounts and their transactions. Of the 710 accounts scored, the median AML1 was 0.55. The 73 accounts whose AML1 score was > 0.8 were routed to an AML queue for human review. The queue would be worked in descending order of AML1 scores.
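
The match score definition can be sketched in Python as below. The exponent reading of the formula, 0.9 raised to (match factor - 1), is inferred from the stated 10% and 19% discounts for 2-day and 3-day windows. The function names `abs_pct_change` and `match_score` are hypothetical and not taken from the authors' engine.

```python
def abs_pct_change(withdrawn, received):
    """Absolute % change |(a - b) / b|, with a = $ withdrawn and b = $ received."""
    if received <= 0:
        raise ValueError("received amount must be positive")
    return abs(withdrawn - received) / received


def match_score(withdrawn, received, match_factor):
    """Match score in [0, 1] for one time window.

    match_factor is 1, 2 or 3 for the 1-, 2- and 3-day windows. Each extra
    day multiplies the score by 0.9 (the assumed reading of the formula).
    """
    if match_factor not in (1, 2, 3):
        raise ValueError("match_factor must be 1, 2 or 3")
    change = abs_pct_change(withdrawn, received)
    if change > 0.2:
        return 0.0
    return 0.9 ** (match_factor - 1) * (1 - change / 0.2)


# The two accounts listed in Table III (amounts and match factors from the table).
print(round(match_score(35306, 35306, 1), 2))  # 1.0
print(round(match_score(25326, 25713, 2), 2))  # 0.83
```

With these inputs the sketch reproduces the final match scores reported in Table III, which supports the exponent reading but does not confirm it.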
Figure 4. Match score membership functions

Figure 5. AML1 membership functions

For demonstration, two suspicious accounts are described as follows.
For row 1 in Table III, $ received > $10K (Big) and final match score = 1 (Large), so the AML1 score = 1 (Very Large). In this example, the initial MatchScore = 1 since the absolute % change = 0 ($ received = $ withdrawn). In addition, MatchFactor = 1 since all the withdrawals were logged within a day after $ received. Therefore there is no discount on the initial MatchScore, and the final MatchScore is the same as the initial one.
For row 2 in Table III, $ received > $10K (Big) and final match score = 0.83 (Large), so the AML1 score = 0.92 (Large). In this example, the absolute % change = 1.5% ($ received > $ withdrawn), hence the initial MatchScore = 0.92. However, all the withdrawals were logged within 2 days after $ received, hence MatchFactor = 2. This implies a 10% discount on the initial MatchScore. As a result, the final MatchScore = 0.83 (= 0.92 × 0.9).

Table III
EXPERIMENTAL RESULTS FOR AML

Act   $ recv    #wtxn   $ wtx     match   %change   MScore1   MScore2   AML1Score
1     $35,306   2       $35,306   1       0.0%      1.00      1.00      1.00
8     $25,713   3       $25,326   2       1.5%      0.92      0.83      0.92

F. Conclusions on Fuzzy Detection
We have presented an approach that uses fuzzy computing to detect money laundering (ML) patterns in complex financial transactions. We showed the process of knowledge engineering for intelligence gathering and understanding of patterns of ML.

V. FINAL REMARKS
Fuzzy computing (FC) is having an impact on many industrial and financial operations, from monitoring and predictive modeling to diagnostics and control [16]. It provides us with alternative approaches to traditional knowledge-driven reasoning systems and it overcomes their main flaw, the rigidity of the rule structure. We have demonstrated two successful real-world deployments of FC applications in monitoring and detection. In particular, we described how to monitor the healthiness of a distributed file system and how to detect suspicious transactions in money laundering. Both systems leverage the tolerance for imprecision, uncertainty and incompleteness, which are the hallmarks of the problems to be solved. In addition, we developed a generic fuzzy inference engine and contributed it to the open source community [17]. In the future, we expect that the combination of fuzzy computing with advances in probabilistic reasoning, voice recognition, text processing, computer vision, etc., will further improve and expand our problem-solving capability for a large spectrum of industrial and financial problems.

REFERENCES
[1] N. Rescher, Many-valued Logic, McGraw-Hill, New York, NY, 1969.
[2] J. Lukasiewicz, Elementy Logiki Matematycznej [Elements of Mathematical Logic], Warsaw, Poland: Panstwowe Wydawnictwo Naukowe, 1929.
[3] M. Black, Vagueness: an Exercise in Logical Analysis, Phil. Sci., vol. 4, pp. 427-455, 1937.
[4] L.A. Zadeh, Fuzzy sets, Information and Control, vol. 8, pp. 338-353, 1965.
[5] L.A. Zadeh, Foreword, in Handbook of Fuzzy Computation, E.H. Ruspini, P.P. Bonissone, and W. Pedrycz, Eds., Bristol, UK: Institute of Physics, 1998.
[6] Y-M. Pok and J-X. Xu, Why is Fuzzy Control Robust, in Proc. Third IEEE Intl. Conf. on Fuzzy Systems (FUZZ-IEEE'94), pp. 1018-1022, Orlando, FL, 1994.
[7] E.H. Ruspini, P.P. Bonissone, and W. Pedrycz, Handbook of Fuzzy Computation, Bristol, UK: Institute of Physics, 1998.
[8] S. Ghemawat, H. Gobioff, and S.-T. Leung, The Google file system, in Proc. of the 19th ACM SOSP (Dec. 2003), pp. 29-43.
[9] Wikipedia, Network monitoring, http://en.wikipedia.org/wiki/Network_monitoring.
[10] Google, Protocol Buffers - Google's data interchange format, http://code.google.com/p/protobuf/.
[11] Paul Allan Schott, Reference Guide to Anti-Money Laundering and Combating the Financing of Terrorism, World Bank, 2006.
[12] Wikipedia, Anti Money Laundering, http://en.wikipedia.org/wiki/Anti-money_laundering
[13] Jackie Harvey, An evaluation of money laundering policies, Journal of Money Laundering Control, Vol. 8, Iss. 4, pp. 339-345, 2005.
[14] SAS Anti Money Laundering, http://www.sas.com/industry/financial-services/banking/anti-money-laundering
[15] J. Kingdon, AI fights money laundering, IEEE Intelligent Systems, Vol. 19, Issue 3, May-Jun 2004, pp. 87-89.
[16] Bonissone et al., Hybrid Soft Computing Systems: Industrial and Commercial Applications, Proceedings of the IEEE, 1999.
[17] GFuzzy, http://code.google.com/p/gfuzzy/