A Review On Knowledge Sharing in Collaborative Environment
ABSTRACT: The ultimate goal of individuals seeking help from the web is to acquire a particular set of information about a domain through a collaborative environment. In an organisation, employees or higher authorities may need to work with or purchase business intelligence software, and many of them will have researched such products online beforehand. The fine-grained knowledge acquired through this surfing can be shared with other employees, so that they learn about the software from what others have already studied. We perform an analysis of individuals' web surfing history to obtain this fine-grained knowledge. The fine-grained knowledge is mined in the following two stages:
1. A non-parametric generative model is used to group sets of web surfing data into tasks.
2. A novel discriminative Support Vector Machine (SVM) is applied to mine fine-grained knowledge in every task.
To find the proper individuals who can share the knowledge, the classic expert search technique is applied to the mined results. To evaluate the fine-grained knowledge mining scheme, participants' web surfing data is gathered from their browsers. When the mining scheme is combined with expert search, the accuracy grows notably in comparison with applying the classic expert search method directly on web surfing data.
KEYWORDS: Advisor Search, Fine-Grained Knowledge Sharing, Text Mining, Graphical Models, Non-Parametric Generative Models, Collaborative Environment, Client-Server Model.
I. INTRODUCTION
PROJECT IDEA
Acquiring information from the web and from colleagues or friends is a daily routine for many people. In a collaborative environment, it is common that individuals try to acquire similar information on the web in order to gain specific knowledge in one area. For instance, in a company several departments might progressively need to purchase business intelligence (BI) software, and employees from these departments may have independently studied different BI tools and their features online. In a research lab, people are often focused on tasks which require similar background knowledge. In these cases, consulting the right individual can be much more productive than studying on one's own, since people can give digested information, insights and live interaction, in contrast to the web.
In the first situation, it is more valuable for an employee to get advice on the choice of BI tools and explanations of their features from experienced colleagues; in the second situation, one researcher could get suggestions on model design and good learning materials from another researcher. Most people in collaborative environments would be happy to share experiences with and offer suggestions to others on specific issues. On the other hand, finding the right individual is challenging because of the variety of information needs. In this paper, we explore how to enable such a knowledge sharing mechanism by analysing user data.
PROBLEM STATEMENT
We study fine-grained knowledge sharing in collaborative environments. We propose to analyse individuals' web surfing data to summarise the fine-grained knowledge acquired by them.
Topic models offer a natural foundation for such analysis: in addition to providing quantitative, predictive models of a sequential corpus, dynamic topic models provide a qualitative window into the contents of a large document collection.
Expert search aims at retrieving people who have expertise on a given query topic. Early approaches involved building a knowledge base containing descriptions of people's skills within an organization. Expert search became an active research area with the start of the TREC enterprise track. Balog et al. proposed a language model framework for expert search. Their Model 2 is a document-centric approach which first computes the relevance of documents to a query and then accumulates, for each candidate, the relevance scores of the documents associated with that candidate. This process was formulated as a generative probabilistic model. Balog et al. showed that Model 2 performed better, and it became one of the most prominent methods for expert search. Other methods have been proposed for enterprise expert search, but the nature of these methods is still to accumulate the relevance scores of associated documents to candidates. Expert retrieval in other scenarios has also been studied, e.g. online question answering communities and academic communities.
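To make the accumulation idea concrete, the following is a minimal Python sketch of a Model 2 style scorer. The relevance values and the document-candidate associations are toy assumptions made for illustration, not Balog et al.'s actual implementation; the point is only that a candidate's score is the sum of the relevance scores of the documents associated with that candidate.

from collections import defaultdict

def model2_scores(query_relevance, doc_candidates):
    # Document-centric expert scoring: each candidate accumulates the
    # relevance scores of all documents associated with them.
    scores = defaultdict(float)
    for doc, relevance in query_relevance.items():
        for candidate in doc_candidates.get(doc, []):
            scores[candidate] += relevance
    return dict(scores)

# Toy illustration (hypothetical relevance values for one query).
query_relevance = {"d1": 0.9, "d2": 0.4, "d3": 0.3, "d4": 0.3}
doc_candidates = {"d1": ["alice"], "d2": ["bob"], "d3": ["bob"], "d4": ["bob"]}
print(model2_scores(query_relevance, doc_candidates))
# {'alice': 0.9, 'bob': 1.0} -- bob outranks alice despite weaker documents.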
Disadvantages of this method:
1) An analyst might need to tackle a data mining problem using non-parametric graphical models which she is not familiar with, but which have previously been studied by another analyst; traditional expert search gives no direct way to find that analyst.
2) We may have to go through repetitive data, which consumes time and may even compromise the quality of the information.
The proposed advisor search problem is different from traditional expert search. (1) Advisor search is dedicated to retrieving the people who most likely possess the desired piece of fine-grained knowledge, while traditional expert search does not explicitly take this goal. (2) The critical difference lies in the data, i.e. sessions are significantly different from documents in enterprise repositories. A person typically generates multiple sessions for a micro-aspect of a task; e.g. a person could spend many sessions learning about Java multithreading skills. In other words, the uniqueness of sessions is that they contain semantic structures which reflect people's knowledge acquisition process. If we treat sessions as documents in an enterprise repository and apply the traditional expert search methods, we could get an incorrect ranking: due to the accumulation nature of traditional methods, a candidate who generated many marginally relevant sessions (same task but other micro-aspects) will be ranked higher than one who generated fewer but highly relevant sessions for the query "Java multi-thread programming". Therefore, it is important to recognise the semantic structures and summarise the session data into micro-aspects so that we can find the desired advisor accurately. In this paper we develop non-parametric generative models to mine micro-aspects and show the superiority of our search scheme over the simple idea of applying traditional expert search methods on session data directly, as the sketch below illustrates.
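The following minimal sketch contrasts the two schemes under assumed toy data: sessions are first summarised into micro-aspects, and a candidate is ranked by the relevance of their best-matching micro-aspect rather than by a raw sum over sessions, so many marginally relevant sessions no longer drown out a few highly relevant ones. The micro-aspect labels and relevance values below are hypothetical placeholders for the output of the paper's generative models.

def advisor_scores(sessions, query_aspect):
    # Rank candidates by their best session for the queried micro-aspect
    # instead of accumulating relevance over all their sessions.
    best = {}
    for person, aspect, relevance in sessions:
        if aspect == query_aspect:
            best[person] = max(best.get(person, 0.0), relevance)
    return best

# (person, micro-aspect, session relevance) -- hypothetical values.
sessions = [
    ("carol", "java-multithreading", 0.9),
    ("carol", "java-multithreading", 0.8),
    ("dave", "java-collections", 0.4),
    ("dave", "java-io", 0.4),
    ("dave", "java-basics", 0.4),
]
print(advisor_scores(sessions, "java-multithreading"))  # {'carol': 0.9}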
Advantages of the proposed system:
1) Web surfing data is grouped into tasks by a non-parametric generative model.
2) A novel discriminative infinite Hidden Markov Model is developed to mine fine-grained aspects in every task.
k-means is one of the simplest unsupervised learning algorithms that solve the well-known clustering problem. The procedure follows a simple and easy way to partition a given data set into a certain number of clusters (assume k clusters) fixed a priori. The main idea is to define k centres, one for each cluster; a minimal code sketch follows the steps below.
Algorithm steps are:
Let X = {x1, x2, x3, ..., xn} be the set of data points and V = {v1, v2, ..., vk} be the set of centres.
1) Randomly select k cluster centres.
2) Calculate the distance between each data point and every cluster centre.
3) Assign each data point to the cluster whose centre is at minimum distance among all the cluster centres.
4) Recalculate each cluster centre as the mean of the data points assigned to it.
5) Repeat steps 2 to 4 until the cluster assignments no longer change.
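As a concrete illustration of these steps, here is a minimal NumPy sketch (not the implementation used in the reviewed work); for simplicity it assumes every cluster keeps at least one point.

import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    # Alternate between assigning points to the nearest centre (steps 2-3)
    # and moving each centre to the mean of its points (step 4).
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        dists = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Assumes no cluster becomes empty; production code must handle that.
        new_centres = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centres, centres):  # step 5: stop on convergence
            break
        centres = new_centres
    return labels, centres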
Support vector machines (SVMs, also called support vector networks) are supervised learning models with associated learning algorithms that analyse data for classification and regression analysis. An SVM is a non-probabilistic binary linear classifier. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other.
A support vector machine constructs a hyperplane or set of hyperplanes in a high- or infinite-dimensional space, which can be used for classification, regression, or other tasks. A good separation is achieved by the hyperplane that has the largest distance to the nearest training data point of any class (the so-called functional margin).
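A minimal scikit-learn sketch of the maximum-margin idea, on toy two-dimensional data (an illustrative assumption, not the data used in the reviewed work):

from sklearn.svm import SVC

# Toy 2-D training data: two linearly separable classes.
X_train = [[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1]]
y_train = [0, 0, 1, 1]

# A linear-kernel SVM fits the maximum-margin separating hyperplane.
clf = SVC(kernel="linear")
clf.fit(X_train, y_train)
print(clf.predict([[0.1, 0.2], [0.8, 0.9]]))  # -> [0 1]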
Mining of fine-grained knowledge is done by first clustering the session data and then classifying the clustered data using an SVM, as sketched below.
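A hedged end-to-end sketch of this cluster-then-classify pipeline follows. The TF-IDF features and the session texts are assumptions made for illustration; the SVM here simply learns to assign new sessions to the clusters found by k-means.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
from sklearn.svm import SVC

# Hypothetical browsing-session texts standing in for real surfing history.
sessions = [
    "java thread synchronization tutorial",
    "java concurrency locks examples",
    "BI tool pricing comparison dashboard",
    "business intelligence reporting features",
]

# Step 1: represent sessions as TF-IDF vectors and cluster them into tasks.
vec = TfidfVectorizer()
X = vec.fit_transform(sessions)
tasks = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Step 2: train an SVM on the cluster labels so that unseen sessions
# can be assigned to a task.
clf = SVC(kernel="linear").fit(X, tasks)
print(clf.predict(vec.transform(["java multithreading guide"])))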
V. REFERENCES
(1) K. Balog, L. Azzopardi, and M. de Rijke, "Formal models for expert finding in enterprise corpora," in Proc. 29th Annu. Int. ACM SIGIR Conf. Res. Develop. Inf. Retrieval, 2006, pp. 43–50.
(2) M. J. Beal, Z. Ghahramani, and C. E. Rasmussen, "The infinite hidden Markov model," in Proc. Adv. Neural Inf. Process. Syst., 2002, pp. 577–584.
(3) M. Belkin and P. Niyogi, "Laplacian eigenmaps and spectral techniques for embedding and clustering," in Proc. Adv. Neural Inf. Process. Syst., 2001, pp. 585–591.
(4) D. Blei and M. Jordan, "Variational inference for Dirichlet process mixtures," Bayesian Anal., vol. 1, no. 1, pp. 121–143, 2006.
(5) D. M. Blei, T. L. Griffiths, M. I. Jordan, and J. B. Tenenbaum, "Hierarchical topic models and the nested Chinese restaurant process," in Proc. Adv. Neural Inf. Process. Syst., 2003, pp. 17–24.
(6) D. M. Blei and J. D. Lafferty, "Dynamic topic models," in Proc. Int. Conf. Mach. Learn., 2006, pp. 113–120.
(7) D. M. Blei, A. Y. Ng, and M. I. Jordan, "Latent Dirichlet allocation," J. Mach. Learn. Res., vol. 3, pp. 993–1022, 2003.
(8) P. R. Carlile, "Working knowledge: How organizations manage what they know," Human Resource Planning, vol. 21, no. 4, pp. 58–60, 1998.
(9) N. Craswell, A. P. de Vries, and I. Soboroff, "Overview of the TREC 2005 enterprise track," in Proc. 14th Text REtrieval Conf., 2005, pp. 199–205.
(10) H. Deng, I. King, and M. R. Lyu, "Formal models for expert finding on DBLP bibliography data," in Proc. IEEE 8th Int. Conf. Data Mining, 2008, pp. 163–172.