Neural Nets Mixed Real-valued and Categorical Input Features
I can answer question #2; I'm not really prepared on RFs, so I will leave that answer to more skilled people. As far as point #2 goes, if you transform each of your categorical inputs into a k-vector (with k = number of classes), you are just introducing k new inputs, each scaled in the range [0, 1]; so if your real-valued input features are themselves scaled in that range, you're pretty much…
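A hedged sketch of the k-vector idea with scikit-learn; the color column and toy numbers below are invented for illustration:

```python
# One-hot encode a categorical input into a k-vector and scale a
# real-valued input into [0, 1] so both live on comparable ranges.
import numpy as np
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

X_cat = np.array([["red"], ["green"], ["blue"], ["red"]])  # k = 3 classes
X_num = np.array([[10.0], [20.0], [30.0], [40.0]])         # real-valued feature

X_onehot = OneHotEncoder().fit_transform(X_cat).toarray()  # 3 columns in [0, 1]
X_scaled = MinMaxScaler().fit_transform(X_num)             # also in [0, 1]

X = np.hstack([X_onehot, X_scaled])                        # combined inputs
print(X.shape)  # (4, 4)
```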

Categories: Machine Learning

Vowpal Wabbit training and testing data formats
The bar symbol (|) must also be present in the format used for predictions:

| price:.23 sqft:.25 age:.05 2006
| price:.18 sqft:.15 age:.35 1976
| price:.53 sqft:.32 age:.87 1924

If you don't include the correct labels, vw cannot compute the test loss, of course. To get the predictions, use vw -d test_set.vw -t -p predictions.txt. The training set in the tutorial (with only three examples) is too small to train…
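Labeled lines in this format can be emitted from plain Python. The helper name and the toy label/feature values below are made up; only the "label | name:value …" layout comes from the format above:

```python
# Build a Vowpal Wabbit-style input line from a label and a feature dict.
def to_vw_line(label, features):
    feats = " ".join(f"{name}:{value}" for name, value in features.items())
    return f"{label} | {feats}"

line = to_vw_line(0.23, {"price": 0.23, "sqft": 0.25, "age": 0.05})
print(line)  # 0.23 | price:0.23 sqft:0.25 age:0.05
```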


Can I use a Naive Bayesian Classifier with enumerated data?
Yes. In Bayesian classification you just need to determine the class-specific distribution on its support, which you can easily do from the data. Then you can compute the posterior distribution for each class and do a MAP estimate. For documents, the distribution is defined for each word of a dictionary given the document class (spam or not spam). For details, refer to Andrew Ng's notes on…
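As a sketch of the idea, scikit-learn's CategoricalNB fits exactly this kind of per-class categorical distribution; the tiny weather-style dataset below is invented:

```python
# Naive Bayes over enumerated (integer-coded) categorical features.
import numpy as np
from sklearn.naive_bayes import CategoricalNB

# columns: outlook (0=sunny, 1=rain), windy (0/1); label: play yes/no
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [0, 0], [1, 1]])
y = np.array([1, 1, 1, 0, 1, 0])

clf = CategoricalNB().fit(X, y)      # class-specific distributions from data
print(clf.predict([[0, 0]]))         # MAP estimate for a new instance
print(clf.predict_proba([[1, 1]]))   # full posterior over the classes
```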


Ranking algorithm with missing values and bias
In this case, two imputation methods can be used: as everyone would try first, fill with the most likely value, i.e. the mean; or predict the missing value based on the other attributes, which is called imputation by regression. Actually, I think the second method is better for this dataset, where users mostly rank more than one product. Also, if you have other datasets covering the same users, you may use them too.
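Both strategies can be sketched with scikit-learn: SimpleImputer for mean filling and IterativeImputer for imputation by regression (it predicts each missing value from the other attributes). The toy matrix is invented:

```python
# Mean imputation vs. regression-based imputation on one missing value.
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer, SimpleImputer

X = np.array([[1.0, 2.0], [3.0, np.nan], [5.0, 6.0], [7.0, 8.0]])

mean_filled = SimpleImputer(strategy="mean").fit_transform(X)
regressed = IterativeImputer(random_state=0).fit_transform(X)

print(mean_filled[1, 1])  # (2 + 6 + 8) / 3, the column mean
```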


Getting filename when using TextDirectoryLoader - weka
In Weka, those text files and classes become an Instance, and the filenames are not saved in the Instance class. Instead, you can get the text content of the file that was classified:

double pred = 0d;
Instance current = getInstance();
pred = classifier.classifyInstance(current);
System.out.println(" Text: " + current.attribute(0)); // change the index according to your dataset
System.out.println(" Predicted: " + pred);


Plotting the Kohonen map - Understanding the visualization
The SOM is an unsupervised clustering algorithm. As such, it represents similar samples closer together on the feature map (that is, similar samples will fire nodes that are closer together). So let's assume you have 10000 samples with 10 features each, and a 2D SOM of 20x20x10 (400 nodes with 10 features). After training, you have therefore clustered the 10000 samples into 400 nodes. Further, you can try to identify…


Why would my neural network give different values for the same input?
The XOR function is not trivial for a neural network. With very few training samples you will be able to build the OR or AND function. However, for XOR you may need more training data, more neurons, or more layers. If you are just testing your learning system, I suggest starting with a simple function like OR. If that works, then give it more training data and try to adjust the hyper-parameters…
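A minimal sketch of the point with scikit-learn's MLPClassifier: XOR needs a hidden layer, unlike OR/AND. The hidden-layer size and iteration count below are illustrative guesses, not tuned values:

```python
# XOR is not linearly separable, so a hidden layer is required.
import numpy as np
from sklearn.neural_network import MLPClassifier

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])  # XOR truth table

clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=5000,
                    random_state=0).fit(X, y)
print(clf.predict(X))
```

Whether it actually learns XOR depends on the initialization and hyper-parameters, which is exactly the sensitivity the answer describes.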


How does Support Vector Machine compare to Logistic Regression?
Why does the SVM sometimes perform better than LR, and sometimes not? You could pose this question for any two statistical methods x and y: there will always be certain cases where one performs better than the other. This behaviour is often summarized by the phrase "there is no free lunch". Now, your particular question about support vector machines and logistic regression is very broad, such that…


Measuring success of Restricted Boltzmann Machine
An RBM is an unsupervised learning paradigm, and it is therefore difficult to assess whether one is better than another. Nevertheless, RBMs are usually used for pre-training of more recent and exciting networks such as DBNs. So my suggestion would be to train as many RBMs as you want to compare (unsupervised learning) and then feed each into a feedforward layer for learning (supervised learning). From…
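The suggested workflow can be sketched as an sklearn Pipeline (BernoulliRBM feeding a logistic-regression layer); the binary data, sizes, and target below are invented, and the supervised score is what lets you compare RBM variants:

```python
# Unsupervised RBM pre-training followed by a supervised classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline

rng = np.random.RandomState(0)
X = (rng.rand(100, 16) > 0.5).astype(float)  # binary inputs for the RBM
y = (X[:, 0] + X[:, 1] > 1).astype(int)      # toy supervised target

model = Pipeline([
    ("rbm", BernoulliRBM(n_components=8, n_iter=10, random_state=0)),
    ("clf", LogisticRegression()),
]).fit(X, y)
print(model.score(X, y))  # compare this score across different RBMs
```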


Classification algorithm used as Regression algorithm
In general, no. Classification is not directly convertible to regression (the opposite direction is much easier). You could obviously create some finite set of "buckets" of values and treat them as labels, but in general I have never seen this perform better than even the simplest regressors. Why would you want to do something like this? Why not use regressors for regression tasks?
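The "buckets" idea can be sketched as follows; the bucket count and toy data are invented, and the point is that the classifier can only ever return a bucket index while the regressor can return any value:

```python
# Discretize a continuous target into buckets and classify, vs. regress.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

rng = np.random.RandomState(0)
X = rng.rand(200, 1)
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=200)  # continuous target

edges = np.linspace(y.min(), y.max(), 5)      # 4 buckets
y_bucket = np.digitize(y, edges[1:-1])        # bucket index as a label

clf = DecisionTreeClassifier(random_state=0).fit(X, y_bucket)
reg = DecisionTreeRegressor(random_state=0).fit(X, y)
print(set(clf.predict(X)) <= set(range(4)))   # only bucket indices come back
```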


How are vector operations possible on a matrix that does not fit in memory?
Yes, the famous Hadoop is an open source computing platform, which can be used for operations on pretty big matrices (and not only for that). For examples, please read this page.


How to predict several unlabelled attributes at once using WEKA and NaiveBayes?
I can't tell if there's something wrong with your ARFF file. However, here's one idea: you can add a NominalToBinary unsupervised attribute filter to make sure that the attributes slot1-slot96 are recognized as binary.


How to classify text with scikit's SVM?
As others have pointed out, your matrix is just a list of feature vectors for the documents in your corpus. Use these vectors as features for classification: you just need classification labels y, and then you can use SVC().fit(X, y). But the way you have asked this makes me think that maybe you don't have any classification labels. In that case, I think you want to be doing clustering…
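A minimal sketch of SVC().fit(X, y) for text, assuming you do have labels; the documents and labels below are invented:

```python
# Vectorize a small corpus and fit an SVM classifier on the result.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVC

docs = ["the cat sat", "dogs bark loudly", "cats purr", "the dog runs"]
y = [0, 1, 0, 1]  # made-up labels: 0 = cat, 1 = dog

X = TfidfVectorizer().fit_transform(docs)  # document-feature matrix
clf = SVC().fit(X, y)
print(clf.predict(X))
```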


Stanford NER classification with additional classes
The major hassle in training the model on additional classes is the training data. Models require highly accurate training data, such as: I bought a <START:product> Mac Book Pro <END> in September and synced it with my <START:device> IPhone <END>. Observe that IPhone could be annotated as either device or product. If you can generate or annotate at least 15,000 sentences…


Vowpal Wabbit Logistic Regression
Predictions are in the range [-50, +50] (theoretically any real number, but Vowpal Wabbit truncates it to [-50, +50]). To convert them to {-1, +1}, use --binary: positive predictions are simply mapped to +1, negative ones to -1. To convert them to [0, 1], use --link=logistic, which applies the logistic function 1/(1 + exp(-x)). You should also use --loss_function=logistic if you want to interpret the numbers as probabilities.
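The two mappings described above can be reproduced in plain Python (the function names here are made up; they mirror what --binary and --link=logistic do to a raw score):

```python
# Convert a raw score to the {-1, +1} and [0, 1] output forms.
import math

def to_binary(raw):          # sign mapping, like --binary
    return 1 if raw > 0 else -1

def to_probability(raw):     # logistic link, like --link=logistic
    return 1.0 / (1.0 + math.exp(-raw))

print(to_binary(-0.3))       # -1
print(to_probability(0.0))   # 0.5
```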


Using RapidMiner to train a model from multiple files
It is possible to use the Update Model operator to update a previously created model with new example set data. Not all model operators can be used this way: Naive Bayes and k-NN do work, as does Weka's W-IBk. It would also be possible to create a process within RapidMiner that splits the files into smaller pieces, reads them one by one, and creates a model from these.


Liblinear vs Pegasos
Both LIBLINEAR and Pegasos are linear classification techniques that were specifically developed to deal with large sparse data with huge numbers of instances and features, and they are only faster than the traditional SVM on this kind of data. I have never used Pegasos, but I can assure you that LIBLINEAR is very fast on such data, and its authors say that "it is competitive with or even…


Why do we use gradient descent in linear regression?
The example you gave is one-dimensional, which is not usually the case in machine learning, where you have multiple input features. In that case, you need to invert a matrix to use the simple closed-form approach, which can be hard or ill-conditioned. Usually the problem is formulated as a least-squares problem, which is slightly easier. There are standard least-squares solvers which could be used instead of…
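The contrast can be sketched with NumPy: a standard least-squares solver versus the naive normal-equation inverse, which is the step that breaks down when the matrix is ill-conditioned. The data and coefficients are invented:

```python
# Standard least-squares solver vs. explicit normal-equation inverse.
import numpy as np

rng = np.random.RandomState(0)
X = np.hstack([np.ones((50, 1)), rng.rand(50, 2)])  # bias + 2 features
true_w = np.array([1.0, 2.0, -3.0])
y = X @ true_w                                       # noiseless toy target

w_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)      # robust solver
w_normal = np.linalg.inv(X.T @ X) @ X.T @ y          # fails if X'X is singular

print(np.allclose(w_lstsq, true_w))  # True
```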


Clustering before classification in Weka
One way that you could add cluster information to your data is the following (in the Weka Explorer):
1. Load your favourite dataset.
2. Choose your cluster model (in my case, I used SimpleKMeans).
3. Modify the parameters of the clusterer as required.
4. Use the training set for the cluster mode.
5. Start the clustering process.
Once the clusters have been generated, right-click on the result list and select '…


Classify a new instance that has a new value in some features with an existing model
As a short-term solution, perhaps you could set the value of the attribute to 0 or 1 (within the range of the original dataset), depending on the value of the attribute. A longer-term solution would be to include such cases in future training of the neural network. Such values may cause the values of other instances to be skewed to the left or right, so some attention may be required for…


Confusion regarding difference of machine learning and statistical learning algorithms
The authors seem to distinguish probabilistic vs. non-probabilistic models, that is, models that produce a distribution p(output | data) vs. those that just produce an output output = f(data). The description of the non-probabilistic algorithms is a bit odd to my taste, though. The difference between a (linear) support vector machine, a perceptron, and logistic regression from the model and algorithm…


Multi-label classification involving range of numbers as labels
You can preprocess your data with OneHotEncoder to convert your single 1-to-100 feature into 100 binary features corresponding to each value of the interval [1..100]. Then you'll have 100 labels and can learn a multiclass classifier. Though, I suggest using regression instead.


In DBSCAN, how to determine border points?
This largely depends on the implementation; the best way is to just play with the implementation yourself. In the original DBSCAN paper, the core point condition is given as N_Eps >= MinPts, where N_Eps is the Epsilon-neighborhood of a given data point, excluding that point itself. Following your example, if MinPts = 4 and N_Eps = 3 (or 4 including the point itself, as you say), then they don't form…
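The implementation dependence is easy to see in practice. In scikit-learn's DBSCAN, for example, min_samples counts the point itself, which is exactly the kind of detail that differs between implementations (the 1-D toy points below are invented):

```python
# Inspect which points DBSCAN treats as core points and which as noise.
import numpy as np
from sklearn.cluster import DBSCAN

X = np.array([[0.0], [0.1], [0.2], [0.3], [10.0]])
db = DBSCAN(eps=0.15, min_samples=3).fit(X)

print(db.core_sample_indices_)  # core points (neighborhood incl. self >= 3)
print(db.labels_)               # -1 marks noise; others are border or core
```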


Text Feature Representation As Vectors for SVM
It doesn't matter which classifier you use (SVM or not); the feature generation for text is the same. I suggest you take a look at this: Binary Feature Extraction. Also, this library would make your life much easier: http://cogcomp.cs.illinois.edu/page/software_view/LBJ. A tutorial is here: http://cogcomp.cs.illinois.edu/page/tutorial.201310


Applying machine learning to analyze mixed language
Natural language processing is a large and diverse field. You can think about your example in a number of ways. The first is character sets and symbol encoding: most non-Romance languages have characters outside the standard 26-letter alphabet, so if you see characters from both inside and outside the core character ranges of a language, that works around needing a lot of dictionaries. The second is…
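The character-range idea can be sketched in stdlib Python as a crude first signal that mixed scripts are present (the helper name and threshold-free design are assumptions, not a standard technique name):

```python
# Fraction of alphabetic characters falling outside basic ASCII letters.
def non_ascii_ratio(text):
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return 0.0
    outside = sum(1 for c in letters if ord(c) > 127)
    return outside / len(letters)

print(non_ascii_ratio("hello world"))    # 0.0
print(non_ascii_ratio("hello мир") > 0)  # True: Cyrillic chars detected
```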


Feature scaling (normalization) for clustering algorithms (as Kmeans & EM)
K-means and EM are for numeric data only. It does not make much sense to apply them to name/date/price-typed data. As the name indicates, the algorithm needs to compute means. How would you compute a mean in your "name" column? You can hack something together for the date, but not for the name. Wrong tool for your job.


Machine Learning - Feature selection and training data
Simply put, feature selection essentially says (for example): "Of the 5 attributes of the input vector, only features 1, 3, 4 are useful. Features 2, 5 are junk. Don't use them at all." This goes for both the training and the test patterns, since they come from the same distribution. So you drop features 2 and 5 from both the training and test patterns, and then you train and test your classifier in…
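Dropping the same columns from both sets is simple NumPy indexing (features 2 and 5 are columns 1 and 4 in 0-based indexing; the toy matrices are invented):

```python
# Keep only the useful features in both training and test patterns.
import numpy as np

keep = [0, 2, 3]                       # features 1, 3, 4 as 0-based columns
X_train = np.arange(20).reshape(4, 5)  # toy 5-feature training patterns
X_test = np.arange(10).reshape(2, 5)   # toy 5-feature test patterns

X_train_sel = X_train[:, keep]
X_test_sel = X_test[:, keep]           # same columns dropped from the test set
print(X_train_sel.shape, X_test_sel.shape)  # (4, 3) (2, 3)
```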


Multilabel model scores better than the same model with binary-labels in scikit-learn
It seems that when you binarize the labels, the random forest can predict multiple labels at once, while it predicts only the most probable label in the initial case. The F1 score is sensitive to that. UPD: I was wrong; I've tested it, and in my case it always returns only one label, but the score is still bad. UPD2: I'm not as wrong as I thought. sum(sum(prediction2)) turns out to be less than len(prediction), so…


How to predict a continuous dependent variable that expresses target class probabilities?
You might be able to approximate this using sample weighting: assign a sample to the class with the highest probability, but weight that sample by the probability of it actually belonging to that class. Many of the scikit-learn estimators allow for this. Example: X = [1, 2, 3, 4] -> class 0 with probability .7 would become X = [1, 2, 3, 4], y = [0] with a sample weight of .7. You might also normalize so t…
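The trick can be sketched with any estimator whose fit accepts sample_weight; the data and probabilities below are invented:

```python
# Hard-assign each sample to its most probable class, but pass the
# probability as its sample weight.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[1, 2, 3, 4], [4, 3, 2, 1], [2, 2, 2, 2]])
proba = np.array([[0.7, 0.3], [0.2, 0.8], [0.6, 0.4]])  # class probabilities

y = proba.argmax(axis=1)   # most probable class per sample
w = proba.max(axis=1)      # that probability becomes the weight

clf = LogisticRegression().fit(X, y, sample_weight=w)
print(clf.predict(X))
```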


Implementations of Hierarchical Reinforcement Learning
Your actual state is the robot's position and orientation in the world; using these sensor readings is an approximation, since it is likely to render many states indistinguishable. Now, if you go down this road, you could use linear function approximation. Then this is just 24 binary features (12 0|1 plus 6 x 2 near|far|very_far). This is such a small number that you could even use all pairs of features…


SVM as a type of instance-based learning?
I think the best option would be to ask Prof. Domingos directly. SVMs do indeed employ a hyperplane; both are binary classifiers, after all. However, comparing the SVM with the formulation of LR: unlike LR, the SVM is not probabilistic. HTH, although surely one could argue that all ML is instance-based.


What's the low dimensional?
There was a question about the difference between PCA and SVD on the math section. You can check it out here: http://math.stackexchange.com/questions/3869/what-is-the-intuitive-relationship-between-svd-and-pca


How does Prolog work as an intelligent language?
Prolog's power derives from logical variables, coupled with an embedded search algorithm and expressive, uniform data-structuring facilities. It implements a relational data model, but on a more structured value domain than SQL; in a sense, I think of it as the precursor of the 'NoSQL' languages. So we can code, with care, relations among complex data structures, like those that early NLP research…


Back-propagation algorithm converging too quickly to poor results
There are many parameters that need to be tuned to get a multi-layer neural net to work. Based on my experience, my first suggestions are:
1. Give it a small set of synthesized data and run a baby project to see if the framework works.
2. Use a more convex cost function. No function guarantees convexity, but there are many functions that are more convex than RMS.
3. Try scaling your…


Naive Bayes classifier in Data Mining
Evaluation is (usually) not part of the classifier; it's something you do separately to evaluate whether you did a good job or not. If you classify your test data using naive Bayes, you can perform exactly the same kind of evaluation as with other classifiers!


How do we get/define filters in convolutional neural networks?
You can follow this tutorial: http://ufldl.stanford.edu/wiki/index.php/UFLDL_Tutorial. It is like a lecture on both auto-encoders and some simple aspects of CNNs (convolution and pooling). When you complete the tutorial, you will have both an auto-encoder implementation and a stacked auto-encoder (in your words, a deep auto-encoder) implementation ready. The tutorial covers exactly what you asked for:…


How is the desired output of a neural network represented so as to be compared with the actual output?
The main idea is that you don't create one single output for everything and ask it "what digit is this?". You create one output for each digit, and you ask each one "is this digit x?". So the desired output must be encoded as a 1-by-n vector, where n is the number of classes: all values will be 0, and the value corresponding to the desired class will be 1. In your case, for example, create a 1-by-…
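For the 10-digit case, the encoding above can be sketched in a few lines (the helper name is made up):

```python
# Encode a desired class as a 1-by-n one-hot target vector.
import numpy as np

def target_vector(digit, n_classes=10):
    t = np.zeros(n_classes)
    t[digit] = 1.0   # 1 at the desired class, 0 everywhere else
    return t

print(target_vector(3))  # [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
```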


Implementation advice on semi-supervised automated tagging
For this exact problem I wrote a PhD thesis, which I called Generative AI. Since you are probably not going to read the thesis, here is the general algorithm for this kind of problem: 1) normalize the data: make certain that the range is between 0 and 1, or -1 and 1, if you have numbers; if you have words/names, use only lowercase (or only uppercase); if you have both, split the data into numbers…
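Step 1 of that recipe (scale numbers into [0, 1], lowercase text) can be sketched in stdlib Python; the function names are invented for illustration:

```python
# Min-max scale a numeric column and lowercase a text column.
def normalize_numbers(values):
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]      # constant column: no spread
    return [(v - lo) / (hi - lo) for v in values]

def normalize_words(words):
    return [w.lower() for w in words]

print(normalize_numbers([10, 20, 30]))  # [0.0, 0.5, 1.0]
print(normalize_words(["Foo", "BAR"]))  # ['foo', 'bar']
```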


Not able to Compute cost for 1 variable in Cost Function
Is it possible that you made an error when calling computeCost? You mention you are running the script from computeCost.m. (I think it would be best if you described which code part is in which file and how you call the functions.) The rule is: if your function name is "computeCost", the function should be implemented (from function to endfunction) in a file called "computeCost.m". (There are some exceptions…




