Tanimoto Coefficient

Posted by marco
Wed, 02 Jul 2008 16:40:00 GMT

A small Ruby snippet to calculate the Tanimoto coefficient (aka extended Jaccard coefficient) of two real vectors.

#!/usr/bin/ruby

class Array
   def sum
      inject( 0 ) { |sum,x| sum+x }
   end
   def sum_square
      inject( 0 ) { |sum,x| sum+x*x }
   end
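   def *(other) # dot product; note that this overrides the built-in Array#*
      # The parentheses matter: `other.is_a? Array || ...` parses as other.is_a?(Array || ...)
      return nil if !other.is_a?(Array) || size != other.size
      ret = []
      each_with_index { |x, i| ret << x * other[i] }
      ret.sum
   end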
end

# Tanimoto coefficient: T(a, b) = (a . b) / (|a|^2 + |b|^2 - a . b)
def tanimoto(a, b)
   dot = (a*b)
   den = a.sum_square + b.sum_square - dot
   dot.to_f/den.to_f
end

# puts tanimoto([1,2,2],[3,3,1])
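
For comparison, here is a minimal sketch of the same calculation that does not reopen Array. The name tanimoto_similarity is made up for this example and is not part of the snippet above; raising on mismatched lengths and returning nil for two all-zero vectors are assumptions about how those cases should be handled.

```ruby
# Hypothetical helper (not from the original post): Tanimoto coefficient
# T(a, b) = (a . b) / (|a|^2 + |b|^2 - a . b), without monkey-patching Array.
def tanimoto_similarity(a, b)
  raise ArgumentError, "vectors must have the same length" unless a.size == b.size

  dot  = a.zip(b).inject(0) { |acc, (x, y)| acc + x * y } # a . b
  sq_a = a.inject(0) { |acc, x| acc + x * x }             # |a|^2
  sq_b = b.inject(0) { |acc, y| acc + y * y }             # |b|^2
  den  = sq_a + sq_b - dot

  # Assumption: the coefficient is left undefined (nil) when both vectors are all zeros.
  return nil if den.zero?

  dot.to_f / den
end

puts tanimoto_similarity([1, 2, 2], [3, 3, 1])
```

For the vectors in the commented-out call above, the dot product is 11 and the squared norms are 9 and 19, so the result is 11 / (9 + 19 - 11) = 11/17, roughly 0.647.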


Pearson Correlation Coefficient

Posted by marco
Tue, 01 Jul 2008 12:26:00 GMT

A small Ruby snippet to calculate the Pearson product-moment correlation coefficient:

#!/usr/bin/ruby

class Array 
   def sum
      inject( 0 ) { |sum,x| sum+x }
   end
   def sum_square
      inject( 0 ) { |sum,x| sum+x*x }
   end
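   def mean
      sum.to_f / size # to_f avoids integer division
   end
   def *(other) # element-wise product; note that this overrides the built-in Array#*
      # The parentheses matter: `other.is_a? Array || ...` parses as other.is_a?(Array || ...)
      return nil if !other.is_a?(Array) || size != other.size
      ret = []
      each_with_index { |x, i| ret << x * other[i] }
      ret
   end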
end


x = Array.new(100) {|i| rand(10) }
y = Array.new(100) {|i| rand(20) }


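sumx = x.sum
sumy = y.sum
# r = (n*sum(xy) - sum(x)*sum(y)) / (sqrt(n*sum(x^2) - sum(x)^2) * sqrt(n*sum(y^2) - sum(y)^2))
num = (x.size * (x*y).sum) - (sumx * sumy)
den = Math.sqrt((x.size * x.sum_square) - sumx * sumx) * Math.sqrt((y.size * y.sum_square) - sumy * sumy)
r = num / den # den is a Float, so this is floating-point division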

puts "X= [#{x.join(', ')}]"
puts "Y= [#{y.join(', ')}]"
puts "r= #{r}"
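
The script above runs on random data, so its output changes on every run. As a rough sketch, the same formula can be wrapped in a function and checked against inputs whose correlation is known. The name pearson_r is made up for this example, and returning nil when one vector is constant is an assumption; like the snippet above, it is written with integer data in mind.

```ruby
# Hypothetical helper (not from the original post): Pearson r via the
# computational formula used above, without monkey-patching Array.
def pearson_r(x, y)
  raise ArgumentError, "vectors must have the same length" unless x.size == y.size

  n      = x.size
  sum_x  = x.inject(0) { |acc, v| acc + v }
  sum_y  = y.inject(0) { |acc, v| acc + v }
  sum_xy = x.zip(y).inject(0) { |acc, (a, b)| acc + a * b }
  sum_x2 = x.inject(0) { |acc, v| acc + v * v }
  sum_y2 = y.inject(0) { |acc, v| acc + v * v }

  num = n * sum_xy - sum_x * sum_y
  den = Math.sqrt((n * sum_x2 - sum_x**2) * (n * sum_y2 - sum_y**2))

  # Assumption: r is left undefined (nil) when either vector has zero variance.
  return nil if den.zero?

  num / den
end

puts pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]) # perfectly correlated
puts pearson_r([1, 2, 3, 4, 5], [10, 8, 6, 4, 2]) # perfectly anti-correlated
```

The two calls should print values very close to 1 and -1 respectively, which is a quick sanity check that the numerator and denominator are wired up correctly.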

Explore Data

Posted by marco
Mon, 12 Feb 2007 16:58:00 GMT

Swivel: a brilliant way to find unusual correlations.