Hi, I ran a clustering test on crawled pages (more than 25K docs; a personal data set) and then ran clusterdump:
The cluster dumper output shows 25 elements of the form "VL-xxxxx {}":
How should I interpret this output? In short: I am looking for the document ids that belong to a particular cluster. What is the meaning of:
Does 0:0.017 mean that "0" is the id of a document belonging to this cluster? I have already read on the Mahout wiki pages what CL, n, c and r mean, but can someone please explain them in more detail or point me to a resource that does? Sorry if I am asking stupid questions, but I am a newbie with Apache Mahout and am using it for clustering as part of my course assignment.
posted via StackOverflow
I think you need to read the source code, which you can download from http://mahout.apache.org.