I found this code on GitHub and i try to understand the behavior of functions.I tried to compare this code with the formulas (from p. 6 of this paper):
I cannot find where this formulas are implemented into the code.Can anyone help me to explain the similarities between the formulas and the code?
class FCM() :
def __init__(self, n_clusters=17, max_iter=100, m=2, error=1e-6):
super().__init__()
self.u, self.centers = None, None
self.n_clusters = n_clusters
self.max_iter = max_iter
self.m = m
self.error = error
def fit(self, X):
N = X.shape[0]
C = self.n_clusters
centers = []
u = np.random.dirichlet(np.ones(C), size=N)
iteration = 0
while iteration < self.max_iter:
u2 = u.copy()
centers = self.next_centers(X, u)
u = self.next_u(X, centers)
iteration += 1
# Stopping rule
if norm(u - u2) < self.error:
break
self.u = u
self.centers = centers
return centers
def next_centers(self, X, u):
um = u ** self.m
return (X.T @ um / np.sum(um, axis=0)).transpose() #Vi
def next_u(self, X, centers):
return self._predict(X, centers)
def _predict(self, X, centers):
power = float(2 / (self.m - 1))
temp = cdist(X, centers) ** power
denominator_ = temp.reshape((X.shape[0], 1, -1)).repeat(temp.shape[-1], axis=1)
denominator_ = temp[:, :, np.newaxis] / denominator_
return 1 / denominator_.sum(2)
def predict(self, X):
if len(X.shape) == 1:
X = np.expand_dims(X, axis=0)
u = self._predict(X, self.centers)
return np.argmax(u, axis=-1)
img2 = ret.reshape(x * y, z)
algorithm = FCM()
cluster_centers = algorithm.fit(img2)
output = algorithm.predict(img2)
img = cluster_centers[output].astype(np.int16).reshape(x, y, 3)

I believe the most relevant parts are the following:
fitdescribes the general steps of the algorithm:norm(u - u2) < self.error)._predict:powercdistfrom scipy.spatial.distance. A single call to this function is used to compute the distances between all combinations of a data point xj and a cluster center vi. Each of the results is raised to thepowerand the result is stored in atemparray._predictjuggle around the entries of thetemparray in such a way that, without loops, it computes the divisions between each "distance" and the appropriate sum of "distances" (actually, "powerof distance(s)", but it is easier to ignore the exponent at first). For this, the code uses a few tricks available in numpy, such asreshape,repeat, and indexing.next_centers:umcontaining them-th power of each uij (for later use in both the numerator and denominator)np.sum(um, axis=0)to create an array whosei-th entry is the sum which is used in the denominator when computing thei-th clusteri-th cluster, the numerator of (5) will be computed in thei-th column ofX.T @ um(which becomes a row once we.transpose()the resulting matrix)