IN THE PROPERTY and casualty (P&C)
insurance business, nothing's more
important, or challenging, than knowing
your customers and the items they want
to insure. P&C insurers figure the pricing
of their policies based on the calculated
risk of the property being insured -- a type of car, for
instance -- and the owner. Normal insurance practices try
to identify various groups and their associated risk, such
as the infamous male drivers under age 25 who drive
sports cars: high risk.
But since insurance companies are also subject to
competitive pressures, it's not as simple, once the risk is
calculated, as "spreading the risk." If a company
overcharges low-risk customers in an effort to balance
higher-risk ones, it may lose those low-risk -- and highly
profitable -- customers to competitors who more accurately
price their policies. And if a company undercharges
high-risk customers, it may end up attracting more
high-risk customers away from competitors, at once
damaging its bottom line and improving the profitability
of the competition!
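
The pricing argument above can be sketched with a small calculation. The group sizes, expected losses, and the flat premium below are hypothetical numbers chosen for illustration, not figures from Farmers or IBM, and the function name is likewise an assumption. The sketch assumes a competitor that prices each group exactly at its expected loss and customers who always take the cheaper offer.

```python
def flat_rate_outcome(groups, flat_premium):
    """Return the groups a flat-rate insurer keeps and its resulting profit.

    groups maps a group name to (number of customers, expected loss per
    customer). All values are hypothetical illustration data.
    """
    retained = {}
    for name, (count, expected_loss) in groups.items():
        # Assumption: the competitor charges exactly the expected loss.
        competitor_price = expected_loss
        # A customer stays only if the flat premium is not more expensive.
        if flat_premium <= competitor_price:
            retained[name] = (count, expected_loss)
    profit = sum(count * (flat_premium - loss)
                 for count, loss in retained.values())
    return retained, profit


groups = {"low_risk": (1000, 400.0), "high_risk": (1000, 1200.0)}
flat_premium = 800.0  # average of the two expected losses

# With no competition, the flat rate breaks even across both groups.
break_even = sum(count * (flat_premium - loss)
                 for count, loss in groups.values())

retained, profit = flat_rate_outcome(groups, flat_premium)
print(break_even, list(retained), profit)
```

Under these assumptions the flat rate breaks even only while both groups stay. Once the low-risk customers leave for the accurately priced competitor, only the underpriced high-risk group remains and the book loses money.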
Fortunately, insurers collect reams of client data that can
help identify various risk groups, not so much to avoid
certain groups, but to fairly and profitably price policies for
various types of risk. Unfortunately, identifying risk
groups has been largely a speculative process: develop a
hypothesis, then check to see if the data supports it.
Deep Computing's data mining applications have changed
all that.
FARMERS INSURANCE GROUP decided to give data mining a
try and called in IBM. The insurance company had plenty
of data to mine -- 35 million records from over 2.4 million
policies spread over 7 different databases, approximately
30 gigabytes in all. And that was only from its auto
business in one state.
The IBM team set about determining the best approach to
mining this data. By combining its own expertise with the
domain knowledge of Farmers' actuaries and market
analysts, the team was able to focus the mining attempts
on a specific set of policy types in a certain region (known
as books of business in industry parlance). They
developed the necessary algorithms to search through the
data and confirm known sets of "rules" (such as "male
drivers under age 25 who drive a sports car have a claim
frequency of 25% and an average claim amount of
US$3200"). As part of the process, they also hoped to
discover previously unidentified rules, and hence, risk
groups.
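
A rule of this kind can be checked against policy records with a few lines of code. The sketch below is only an illustration: the record layout, field names, and toy data are assumptions, not the schema or the algorithms the IBM team used. It takes claim frequency to mean claims per policy in the segment, and it multiplies frequency by average claim amount to get the expected loss per policy, which for the rule quoted above is 0.25 x US$3200 = US$800.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    # Hypothetical record layout, for illustration only.
    sex: str
    age: int
    car_type: str
    claims: list = field(default_factory=list)  # claim amounts in US$


def segment_stats(policies, predicate):
    """Return (claim frequency, average claim, expected loss per policy)."""
    segment = [p for p in policies if predicate(p)]
    n_claims = sum(len(p.claims) for p in segment)
    total_paid = sum(sum(p.claims) for p in segment)
    frequency = n_claims / len(segment) if segment else 0.0
    average_claim = total_paid / n_claims if n_claims else 0.0
    return frequency, average_claim, frequency * average_claim


def young_male_sports(p):
    return p.sex == "M" and p.age < 25 and p.car_type == "sports"


# Toy data: four policies in the segment, one of them with a single claim.
policies = [
    Policy("M", 22, "sports", [3200.0]),
    Policy("M", 23, "sports"),
    Policy("M", 19, "sports"),
    Policy("M", 24, "sports"),
    Policy("F", 40, "sedan", [900.0]),  # outside the segment
]

frequency, average_claim, expected_loss = segment_stats(policies, young_male_sports)
print(frequency, average_claim, expected_loss)
```

On this toy data the segment reproduces the quoted rule: a claim frequency of 0.25, an average claim of 3200, and an expected loss of 800 per policy.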
One particular challenge the group faced was in designing
algorithms that allowed for simultaneous modeling of both
claim frequency and average claim amounts. To mine for
either statistic separately, and then combine the results in
figuring the cost to insure, would lead to erroneous
conclusions since each initial result might be based on
different sub-populations of customers -- the equivalent of
polling the hungriest group of people in a room, then the
thirstiest, and trying to place a lunch order for the
combined group based on that information. Only by mining
for risk groups while modeling both features
simultaneously could accurate and verifiable results be
guaranteed.
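
The pitfall described above can be shown on toy numbers. The three segments and their figures below are hypothetical, and the sketch is not the UPA algorithm; it only illustrates why the highest claim frequency and the highest average claim, found separately, may belong to different sub-populations and should not be multiplied together.

```python
# Hypothetical toy data: segment -> (policies, claims, total paid in US$)
segments = {
    "A": (100, 30, 30_000.0),
    "B": (100, 5, 25_000.0),
    "C": (100, 10, 20_000.0),
}


def frequency(name):
    policies, claims, _ = segments[name]
    return claims / policies


def severity(name):
    _, claims, paid = segments[name]
    return paid / claims


def loss_per_policy(name):
    # Joint view: frequency and severity taken from the same segment.
    policies, _, paid = segments[name]
    return paid / policies


# Separate mining: each statistic picks out a different segment.
most_frequent = max(segments, key=frequency)
most_severe = max(segments, key=severity)
naive_cost = frequency(most_frequent) * severity(most_severe)

# Joint mining: rank segments by the cost each one actually produces.
costliest = max(segments, key=loss_per_policy)

print(most_frequent, most_severe, naive_cost)
print(costliest, loss_per_policy(costliest))
```

Here segment A has the highest frequency and segment B the highest average claim, so combining the two separate results suggests a cost of about US$1500 per policy that no real segment incurs. Ranking by loss per policy within each segment gives A at US$300 as the costliest.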
The Underwriting Profitability Analysis (UPA) solution
developed by the team, which was run at IBM Research in
Yorktown and also on Farmers' RS/6000 machine at the
company's headquarters, provided some incredible
"nuggets" of knowledge: the company had far more market
segments and sub-segments than previously known,
including a few that were counterintuitive. Farmers found
that covering a certain type of "high-risk" sports car was
not always so risky -- in fact, it could be quite profitable,
provided the owner had at least one other vehicle.
In all, some 43 individual pieces of essential business
information were found, with one of those pieces alone
capable of generating over $2 million in a combination of
higher revenues and lower claims. By extending data
mining to other "books of business," Farmers and other
e-business insurers will be able to turn the data already
being collected on clients into real competitive advantage.
FUTURE APPLICATIONS: It is difficult to predict specifically
what "nuggets" will be discovered by data mining, except
that they will almost always be pieces of insight and
knowledge previously unobtainable. Current efforts
include helping customers develop advanced -- and more
focused -- marketing, intelligent customer relationship
management, and analysis of Web usage data to make
Web surfing more enjoyable and useful. Data mining of
network access patterns will also likely reveal patterns of
illicit use, enabling early "hacker detection."
Since the fully networked world of pervasive computing
devices will produce a wealth of data several orders
of magnitude greater than is captured today, one thing is
certain: Deep Computing's data mining will be an essential
part of any successful business strategy.