A demonstration of using Indian Buffet Process for feature finding. Following A Nonparametric Bayesian Method for Inferring Features From Similarity Judgments. Navarro and Griffiths. NeurIPS 2006.
{-# LANGUAGE ScopedTypeVariables #-}module AdditiveClusteringDemo whereimport LazyPPL
import LazyPPL.Distributions
import LazyPPL.Distributions.IBP
import LazyPPL.Distributions.Memoizationimport Data.List
import Control.Monad
import Data.Monoidimport Data.Array
import Data.Maybe
import Graphics.MatplotlibAdditive clustering is a method for inferring features over a set of of objects using similarity data. The nature of the features is unknown, and the number of possible features is unknown, and inferred using the Indian Buffet Process. Each feature has an associated “saliency” weight indicating how much it contributes to the similarity coefficient. We also infer the saliency weights.
data Country = China | Cuba | Germany | Indonesia | Iraq | Italy | Jamaica | Japan | Libya | Nigeria | Philippines | Russia | Spain | US | Vietnam | Zimbabwe deriving (Eq, Ord, Show, Enum, Bounded, Ix)countries :: [Country]
countries = [China .. Zimbabwe]similarityData :: Array (Country,Country) DoublesimilarityData = array ((minBound,minBound),(maxBound,maxBound)) [((i,j),if i > j then (countriesDataset !! (fromEnum i) !! (fromEnum j)) else if i == j then 1 else (countriesDataset !! (fromEnum j) !! (fromEnum i))) | i <- countries , j <- countries]
where
countriesDataset :: [[Double]] =
[[ ],
[0.11],
[0.06, 0.04],
[0.43, 0.06, 0.03],
[0.06, 0.32, 0.04, 0.14],
[0.02, 0.09, 0.70, 0.02, 0.03],
[0.02, 0.59, 0.02, 0.14, 0.04, 0.10],
[0.69, 0.01, 0.26, 0.35, 0.03, 0.06, 0.03],
[0.03, 0.32, 0.01, 0.04, 0.70, 0.04, 0.11, 0.01],
[0.01, 0.12, 0.01, 0.04, 0.20, 0.03, 0.31, 0.01, 0.45],
[0.42, 0.12, 0.01, 0.87, 0.09, 0.02, 0.17, 0.31, 0.05, 0.04],
[0.51, 0.35, 0.55, 0.01, 0.13, 0.22, 0.02, 0.17, 0.05, 0.02, 0.03],
[0.02, 0.37, 0.58, 0.03, 0.04, 0.90, 0.20, 0.04, 0.04, 0.03, 0.04, 0.15],
[0.30, 0.11, 0.42, 0.03, 0.06, 0.20, 0.12, 0.46, 0.02, 0.04, 0.01, 0.43, 0.20],
[0.60, 0.12, 0.03, 0.55, 0.12, 0.01, 0.05, 0.45, 0.10, 0.03, 0.57, 0.08, 0.02, 0.12],
[0.01, 0.08, 0.01, 0.11, 0.15, 0.02, 0.29, 0.01, 0.31, 0.83, 0.08, 0.01, 0.02, 0.01, 0.03]]Data set taken from Representing Stimulus Similarity, PhD thesis, D. J. Navarro. Department of Psychology University of Adelaide. December, 2002
The Indian Buffet Process provides abstract types Restaurant and
Dish,
and the following interface:
newRestaurant :: Double -> Prob Restaurant
opens a new restaurant;
newCustomer :: Restaurant -> Prob [Dish]
assigns a finite set of dishes to a new customer.
Our additive clustering models has three parameters
(alpha, lambda1 and lambda2) and
produces a mapping from countries to lists of features. We use an Indian
Buffet Process abstraction, so we imagine that each country arrives as a
customer at a buffet, and takes some dishes, which are the features. We
can then assess which countries took similar dishes to see which
countries are similar.
additiveClustering :: Double -> Double -> Double
-> Array (Country,Country) Double -> Meas (Array Country [Dish])(Recall that in Haskell, the type (Array a b)
comprises a-indexed arrays of
things of type b.)
additiveClustering alpha lambda1 lambda2 similarityData = do
(restaurant :: Restaurant) <- sample $ newRestaurant alphaEach dish is randomly assigned a weight, which is how important it is to the similarities between countries.
(weights :: Dish -> Double) <- sample $ memoize (\d -> gamma lambda1 lambda2)Each country is assigned a list of dishes, by regarding it as a new customer in the restaurant.
(features :: Array Country [Dish])
<- listArray (minBound,maxBound) <$>
(replicateM (fromEnum (maxBound :: Country) + 1) $
sample $ newCustomer restaurant)We define a function that calculates the similarity of two countries given the dishes that they share.
let similarity :: Country -> Country -> Double
similarity i j = sum [weights a | a <- features!j, a `elem` features!i]We score based on how closely this similarity matches the experimental data.
mapM_ (\(i, j) -> score $ normalPdf (similarityData!(i,j)) 0.1 (similarity i j)) [ (i, j) | i <- countries, j <- countries, j < i ]We return the feature assignment.
return featuresWe sample from this using Metropolis Hastings simulation, and plot some MAP features. The columns represent the found features, with a coloured box if the country has that feature.
test = do
samples <- mhirreducible 0.2 0.01 $ additiveClustering 2 6 0.3 similarityData
return $ plotFeatureMatrix $ maxap $ take 100000 samplesThis is a highly multi-modal distribution. For the three runs, we typically have different numbers of features, and they are different. But in each case they admit a human interpretation, for example geographic, history, development status.
maxap samples =
let maxw = (maximum $ map snd samples) in
fromJust $ Data.List.lookup maxw $ map (\(z,w) -> (w,z)) samples plotFeatureMatrix :: (Ix x,Enum x,Bounded x,Eq y) => Array x [y] -> Matplotlib
plotFeatureMatrix features = do
let ys = nub $ concat $ elems features
let matrix = [[ if elem y (features ! x) then fromJust (elemIndex y ys) else 0 | y <- ys] | x <- [minBound..maxBound] ]
matShow matrix @@ [o2 "cmap" $ raw "twilight"] % yticks (map fromEnum countries) % ytickLabels (map (raw . show) countries) combinePlots filename fs = do
putStrLn $ "Generating " ++ filename ++ "..."
file filename $ foldl (\p i -> p % setSubplot i % fs !! i) (subplots @@ [o2 "ncols" (length fs), o2 "sharey" True, o2 "gridspec_kw" $ lit "{'width_ratios': [1, 1, 1]}"]) [0..(length fs -1)]
putStrLn "Done."main = do { ps <- replicateM 3 test ; combinePlots "images/additiveclustering-matrix.svg" ps }