Extract HOG from a single pixel using VLFEAT
Computer Vision System Toolbox for MATLAB has a function detectMSERFeatures for detecting MSER regions, and a function extractHOGFeatures for computing the HOG descriptors at given locations.

Categories : Matlab

kmeans prediction using R
kmeans works on numerical matrices only. As @thelatemail pointed out, column 5 of iris isn't numeric. You could use cl_predict from the clue package instead of predict.kmeans.
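For comparison, the same idea is built in on the Python side (a scikit-learn sketch; the original answer is about R's clue package): KMeans exposes predict() directly, and only the numeric columns are clustered.

```python
# Sketch: scikit-learn's KMeans.predict plays the role of clue::cl_predict.
# Only the numeric columns of iris are clustered; species labels are left out.
from sklearn.datasets import load_iris
from sklearn.cluster import KMeans

iris = load_iris()
X = iris.data  # numeric columns only, like iris[, 1:4] in R

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# Assign (new) observations to the learned clusters.
new_labels = km.predict(X[:5])
print(new_labels)
```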

Categories : R

Scipy: Kmeans and vq: how to use them?
Everything depended on the way the whitening function is built. The default is essentially: def whiten(obs): std_dev = std(obs, axis=0); return obs / std_dev. That made no sense in my case, because the standard deviation is computed per column, and my first column was always 0 since there is no data at that time. I fixed it by writing my own whitening function, which divides the numpy array by the standard deviation of the whole data set rather than per column: def whiten(obs): std_dev = std(obs); return obs / std_dev. Way better results: the quantization error went from 1000+ to 0.003 with the same iterations!
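A runnable sketch of the two variants (whiten_global and whiten_per_column are names I made up; SciPy's own scipy.cluster.vq.whiten behaves like the per-column version):

```python
# Per-column whitening divides each column by its own standard deviation,
# so a constant column yields 0/0 = nan. Whitening by the std of the whole
# array (the fix described above) stays finite.
import numpy as np

def whiten_per_column(obs):
    return obs / np.std(obs, axis=0)

def whiten_global(obs):
    return obs / np.std(obs)

obs = np.array([[0.0, 1.0],
                [0.0, 3.0],
                [0.0, 5.0]])  # first column is constant -> per-column std is 0

global_white = whiten_global(obs)   # finite everywhere
per_col = whiten_per_column(obs)    # first column becomes nan (0/0)

print(np.isfinite(global_white).all(), np.isnan(per_col[:, 0]).all())
```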

Categories : Python

kmeans with big data
Do you really need 5000 clusters? k-means performance scales with the number of clusters, so you're hurting yourself quite a bit with such a high number; if you can stand to reduce it, that will help a lot. Are you sure you need all 512 dimensions? If you can trim out or combine some of those dimensions, that could also help. Have you tried running PCA on your data? Maybe you could try running k-means on just the top 10 components, or something like that. Does it have to be k-means? You could try other algorithms, such as hierarchical clustering or self-organizing maps, and see whether those perform faster. I'd recommend taking a sample of your data (maybe N=100K) and speed-testing a few clustering algorithms on that. Revolution R is definitely supposed to be
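Sketching that advice in Python (all sizes here are made up, and scikit-learn's MiniBatchKMeans stands in for a faster k-means variant; the original answer is about R):

```python
# Sample the data, reduce the dimensionality with PCA, then cluster the
# sample with a mini-batch k-means. Illustrative sizes only.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import MiniBatchKMeans

rng = np.random.default_rng(0)
data = rng.normal(size=(10_000, 64))   # stand-in for the real 512-d data

# Speed-test on a sample first, as suggested above.
sample = data[rng.choice(len(data), size=2_000, replace=False)]

reduced = PCA(n_components=10).fit_transform(sample)  # top 10 components

km = MiniBatchKMeans(n_clusters=50, n_init=3, random_state=0).fit(reduced)
print(reduced.shape, km.cluster_centers_.shape)
```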

Categories : R

Kmeans with a huge array
Agreed with the guys above. The real question is whether you really have a laptop/desktop with that much memory. If the answer is yes, you can simply write a C program to do the work. Otherwise, you may have to figure out a distributed solution, such as Mahout on Hadoop. Another option is to take a sample of all the data somehow and do the clustering on the sample, if that is acceptable for your requirements.

Categories : C++

Clustering Baseline Comparison, KMeans
I would first check the UCI repository for data sets: http://archive.ics.uci.edu/ml/datasets.html?format=&task=clu&att=&area=&numAtt=&numIns=&type=&sort=nameUp&view=table I believe there are some in there with the labels. There are text clustering data sets that are frequently used in papers as baselines, such as 20newsgroups: http://qwone.com/~jason/20Newsgroups/ Another great method (one that my thesis chair always advocated) is to construct your own small example data set. The best way to go about this is to start small, try something with only two or three variables that you can represent graphically, and then label the clusters yourself. The added benefit of a small, homebrew data set is that you know the answers and it is great for debugging.
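The "small, homebrew data set" idea can be sketched in Python with scikit-learn's make_blobs, which hands you the true labels for free (a sketch, not part of the original answer):

```python
# Generate a tiny two-variable problem with known labels, so the right
# answer is obvious and easy to plot, then score a clusterer against it.
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

X, true_labels = make_blobs(n_samples=60, centers=3, n_features=2,
                            cluster_std=0.5, random_state=0)

pred = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Because the labels are known, the clustering can be scored directly,
# which is exactly what makes a homebrew data set great for debugging.
score = adjusted_rand_score(true_labels, pred)
print(score)
```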

Categories : Machine Learning

kmeans clustering with limited memory
If you can't decrease the sustained memory usage of your operations, you should look at this answer for advice on increasing your memory allotment inside apps, or change to another provider. For $20/month this is a simple request on a Rackspace server, although by definition it's closer to the metal and requires more setup.

Categories : Python

R data.table and kmeans clustering
The centers element is a matrix (it will contain as many columns as there are columns in the x argument to kmeans). If you want to find the clusters considering xcord and ycord in the same clustering run, you will need to pass a matrix to kmeans. You will then have to coerce back to data.table afterwards; this keeps the names sensible. For example:

fx <- x[, data.table(kmeans(cbind(xcord, ycord), centers = 2)$centers), by = id]
fx
#    id    xcord     ycord
# 1:  a 2.666667  3.333333
# 2:  a 8.500000  9.500000
# 3:  b 7.500000 20.000000
# 4:  b 1.000000  2.500000
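A rough Python analogue of the data.table idiom above, using pandas and scikit-learn (the numbers are invented; the column names follow the R example):

```python
# Run k-means on (xcord, ycord) separately within each id group and
# collect the cluster centers into one table, like the by=id idiom.
import pandas as pd
from sklearn.cluster import KMeans

df = pd.DataFrame({
    "id": ["a"] * 4 + ["b"] * 4,
    "xcord": [2.0, 3.0, 8.0, 9.0, 7.0, 8.0, 1.0, 1.0],
    "ycord": [3.0, 4.0, 9.0, 10.0, 20.0, 20.0, 2.0, 3.0],
})

rows = []
for gid, g in df.groupby("id"):
    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(g[["xcord", "ycord"]])
    for cx, cy in km.cluster_centers_:
        rows.append({"id": gid, "xcord": cx, "ycord": cy})

centers = pd.DataFrame(rows)   # two centers per id
print(centers)
```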

Categories : R

Why am I not getting points around clusters in this kmeans implementation?
You don't see multiple points because your data are discrete, categorical observations; k-means is really only suitable for grouping continuous observations. Your data can only appear at three points on the plot you've shown, and three points don't make a nice "cloud" of data. This suggests to me that k-means is probably not appropriate for your specific problem. Incidentally, when I run the code above, I get a plot different from the one you've shown us. Perhaps this is more like what you are expecting? The green data point belongs to (is "around") the upper-right cluster centre indicated by a black asterisk.

Categories : R

How to take key and value in a CSV file for Kmeans clustering in Mahout
K-means operates on vector spaces. It absolutely needs to be able to compute means. But what is the mean value of {Pepsi, Coke, Pepsi, Limca}? Sorry, you are trying to use a hammer, but you don't have a nail! If you want to group data by their drink, this is not a clustering task. Maybe try a Hadoop-based SQL system, because apparently you want to perform a classic SQL operation: GROUP BY Drinks. Oh, and your question is off-topic for stackoverflow: you are using Hadoop, but you are not posing a programming question!
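For illustration only, here is that GROUP BY as a pandas sketch (the column names are invented); the point is that grouping by a categorical value is a lookup, not clustering:

```python
# Grouping rows by a categorical "drink" value -- the SQL GROUP BY the
# answer recommends -- needs no notion of a mean at all.
import pandas as pd

df = pd.DataFrame({
    "customer": ["c1", "c2", "c3", "c4"],
    "drink": ["Pepsi", "Coke", "Pepsi", "Limca"],
})

groups = df.groupby("drink")["customer"].apply(list)
print(groups)
```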

Categories : Machine Learning

Get ordered kmeans cluster labels
K-means is a randomized algorithm. It is actually correct behavior that the labels are not consistent across runs, or not in "ascending" order. But you can of course remap the labels as you like, you know... You seem to be using 1-dimensional data, and then k-means is actually not the best choice for you. In contrast to 2- and higher-dimensional data, 1-dimensional data can be sorted efficiently. If your data is 1-dimensional, use an algorithm that exploits this; there are much better algorithms for 1-dimensional data than for multivariate data.
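One way to remap the labels as suggested, sketched in Python with scikit-learn (the original question is about R): relabel clusters by the sort order of their centers, so cluster 0 is always the smallest.

```python
# Relabel k-means clusters so labels follow ascending center values,
# making the labeling consistent regardless of the random run.
import numpy as np
from sklearn.cluster import KMeans

x = np.array([1.0, 1.2, 5.0, 5.1, 9.0, 9.2]).reshape(-1, 1)
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(x)

order = np.argsort(km.cluster_centers_.ravel())  # clusters sorted by center
remap = np.empty_like(order)
remap[order] = np.arange(len(order))             # old label -> new label
ordered_labels = remap[km.labels_]

print(ordered_labels)
```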

Categories : R

How To Fight Randomness Caused By KMeans Clustering
There are many possible ways of making clustering repeatable:
- The most basic way of dealing with k-means randomness is simply running it multiple times and selecting the best result (the one that minimizes the within-cluster distances / maximizes the between-cluster distance).
- Use some fixed initialization for your data instead of randomization. There are many heuristics for seeding k-means.
- Or at least minimize the variance by using an algorithm like k-means++.
- Use a modification of k-means that guarantees a global minimum of a regularized function, i.e. convex k-means.
- Use a different clustering method that is deterministic, i.e. Data Nets.
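The first three remedies map directly onto scikit-learn parameters (a sketch; the parameter values are illustrative): n_init reruns k-means and keeps the best inertia, init="k-means++" gives careful seeding, and a fixed random_state makes runs repeatable.

```python
# Two fits with identical seeds and k-means++ initialization produce
# identical centers, demonstrating repeatable clustering.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

km_a = KMeans(n_clusters=4, init="k-means++", n_init=20, random_state=42).fit(X)
km_b = KMeans(n_clusters=4, init="k-means++", n_init=20, random_state=42).fit(X)

print(np.allclose(km_a.cluster_centers_, km_b.cluster_centers_))
```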

Categories : Machine Learning

adding labels to 2D scatter plot (kmeans clustering)
This can be done simply using ggplot2. I will use the mtcars data since I don't have access to the data set you are currently using; the idea should be pretty clear anyway.

library(ggplot2)
pca <- prcomp(mtcars)   # note: cor= is a princomp() argument, not prcomp()
dat.loadings <- pca$x[,1:2]
cl <- kmeans(dat.loadings, centers=3)
pca1 <- pca$x[,1]
pca2 <- pca$x[,2]
#plot(pca1, pca2, xlab="PCA-1", ylab="PCA-2", col=cl$cluster)
mydf <- data.frame(ID=names(pca1), PCA1=pca1, PCA2=pca2, Cluster=factor(cl$cluster))
ggplot(mydf, aes(x=PCA1, y=PCA2, label=ID, color=Cluster)) +
  geom_point() +
  geom_text(size = 4, colour = "black", vjust = -1)

This gives you a name label for each data point.

Categories : R

Cluster data with output centers of Kmeans function
It seems that you're interested in performing some type of cluster assignment using the results of running K-means on an initial data set, right? You could just assign the new observation to the closest mean. Unfortunately, with K-means you don't know anything about the shapes or sizes of the clusters. For example, consider a scenario where a new vector is equidistant (or roughly equidistant) from two means. What do you do in this scenario? Do you make a hard assignment to one of the clusters? In this situation it's probably better to actually look at the original data that comprises each of the clusters, and do some type of K-Nearest-Neighbor assignment (http://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm). For example, it may turn out that while the new vector is roughly equidistant
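Both options can be sketched in Python (numpy/scikit-learn; the data and the new point are invented): a hard assignment to the closest mean, and a k-nearest-neighbour assignment over the original, already-labelled points.

```python
# Compare nearest-center assignment with a k-NN assignment over the
# original points, for one new observation.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(3, 0.3, (50, 2))])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

new_point = np.array([[2.9, 3.1]])

# Hard assignment: index of the closest cluster center.
dists = np.linalg.norm(km.cluster_centers_ - new_point, axis=1)
hard = int(np.argmin(dists))

# Softer alternative: k-NN over the original points labelled by k-means.
knn = KNeighborsClassifier(n_neighbors=5).fit(X, km.labels_)
soft = int(knn.predict(new_point)[0])

print(hard, soft)
```

On well-separated data like this, the two assignments agree; they differ mainly near cluster boundaries, which is exactly the equidistant case the answer worries about.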

Categories : Opencv

opencv kmeans doesn't classify data in some classes
It turned out that sometimes some clusters have fewer than K members, so at the next level the function returns an error. I still haven't figured out why a cluster is sometimes empty, though.

Categories : Opencv

How can I distribute processing of minibatch kmeans (scikit-learn)?
I don't think this is possible. You could implement something with OpenMP inside the minibatch processing, but I'm not aware of any parallel minibatch k-means procedures; parallelizing stochastic gradient descent procedures is somewhat hairy. By the way, the n_jobs parameter in KMeans only distributes the different random initializations, as far as I know.

Categories : Python

Kmeans matlab "Empty cluster created at iteration 1" error
It is simply telling you that during the assign-recompute iterations, a cluster became empty (lost all assigned points). This is usually caused by an inadequate cluster initialization, or by the data having fewer inherent clusters than you specified. Try changing the initialization method using the start option. kmeans provides four possible techniques to initialize clusters:
- sample: sample K points randomly from the data as initial clusters
- uniform: select K points uniformly across the range of the data
- cluster: perform preliminary clustering on a small subset
- manual: manually specify initial clusters
You can also try different values of the emptyaction option, which tells MATLAB what to do when a cluster becomes empty. Ultimately, I think you need to reduce the number of clusters, i.

Categories : Matlab

How to fix kmeans error in r : 'more cluster centers than distinct data points'
The fix for this is to use:

cells = c(read.csv("c:\data-files\kmeans\cells.csv", header = FALSE))
rnames = c(read.csv("c:\data-files\kmeans\rnames.csv", header = FALSE))
cnames = c(read.csv("c:\data-files\kmeans\cnames.csv", header = FALSE))

instead of:

cells = c(read.csv("c:\data-files\kmeans\cells.csv", header = TRUE))
rnames = c(read.csv("c:\data-files\kmeans\rnames.csv", header = TRUE))
cnames = c(read.csv("c:\data-files\kmeans\cnames.csv", header = TRUE))

Categories : R

Show pictures which represent kmeans cluster center (Scikit learn)
If you cluster SIFT descriptors, your cluster means will look like SIFT descriptors, not like images. I believe you were thinking of EigenFaces, but that has little to do with k-means.

Categories : Python

How to apply a Kmeans clustering algorithm on an image database using matlab?
First, you could use cell arrays to ease coding:

Cluster = cell(3,1);
Cluster{1} = [9 2 3];
Cluster{2} = [];
Cluster{3} = [4 8];

If you're using centroids, you'll have to make sure your images are the same size, or extract features for all of them.

Categories : Image

Getting empty cluster result with vectordump command in mahout kmeans algorithm
Just one line is not enough. As a workaround, I created a dummy file in the Input_files folder with dummy words in it; then it ran. Please let me know if you find a better solution.

Categories : Linux

opencv flann module: knn-search for hierarchical kmeans tree giving weird result
knnSearch is looking for the k nearest neighbours in the index (it does not give the cluster ID!). You build your index using cluster_data, and then you try to match cluster_data against itself. In this situation, it is not surprising that the closest neighbour to each descriptor is itself... EDIT: If you want to get the centers, have a look at this (from the source of the FLANN library):

/**
 * Chooses the initial centers using the algorithm proposed in the KMeans++ paper:
 * Arthur, David; Vassilvitskii, Sergei - k-means++: The Advantages of Careful Seeding
 */
template <typename Distance>
class KMeansppCenterChooser : public CenterChooser<Distance>
{
...

Categories : Opencv

C MSB to LSB explanation
To answer your second question: u means to treat the hex constant as unsigned (if there is a need to expand it to a longer width), and l means to treat it as a long. I'm still working on your first question.

Categories : C

C code explanation
Have you tried Googling for "paste your shellcode here"? The first result returned (well, second, now that this question ranks first, LOL) is Corelan Team's Exploit writing tutorial part 9: Introduction to Win32 shellcoding, where it's all explained. In a nutshell, it's merely a small utility C application to test shellcode, used later on in following parts of the tutorial for this same purpose. The rest is explained in the tutorial.

Categories : C

Explanation of IplImage* img
img is the name of the variable; it might as well be blahblahblah. IplImage is the type of the variable: it's just a struct that contains the image data itself plus some info (size, color depth, etc.) about the image:

typedef struct _IplImage
{
    int nSize;
    int ID;
    int nChannels;
    int alphaChannel;
    int depth;
    char colorModel[4];
    char channelSeq[4];
    int dataOrder;
    int origin;
    int align;
    int width;
    int height;
    struct _IplROI* roi;
    struct _IplImage* maskROI;
    void* imageId;
    struct _IplTileInfo* tileInfo;
    int imageSize;
    char* imageData;
    int ...

Categories : Opencv

Basic C / C++ explanation
The answer here is that the exact timing varies. When I run this code on my machine, it comes up with 1000 for the first loop sometimes, and 1000 for the second loop at other times. It's just "luck" when the timer ticks over. If you have a more accurate timer, it may show differences based on how long it takes to read the timer, or some such.

$ ./a.out
k = 1000000
k = 1000000
Simple: 0
Double: 1000
$ ./a.out
k = 1000000
k = 1000000
Simple: 1000
Double: 0
$ ./a.out
k = 1000000
k = 1000000
Simple: 1000
Double: 0
$ ./a.out
k = 1000000
k = 1000000
Simple: 1000
Double: 0

It is easy to see that BOTH loops are optimized out:

main:
.LFB1474:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    pushq   %rbx
    .cfi_def_cfa_offset 24
    .cfi_offset 3, -24
    subq    $8, %rsp
    .cfi_def_cf

Categories : C++

SEO friendly url explanation?
Yes, that will work as expected, though mod_rewrite's behavior with regard to existing query strings can be surprising. As an example, a rewrite rule can convert the URL you supplied, index.php?news=1, to /news/1; the page name will then be accessible through $_GET['news'].

Categories : PHP

Explanation of this C Program
+ + a is parsed as the unary + operator applied twice, so the value remains unchanged. +(+a) is what the compiler saw, which is just 0 in this case

Categories : C

SOLR df and qf explanation
df is the default field and will only take effect if qf is not defined. I guess you are not using the dismax parser and are using the default settings in solrconfig.xml; qf then won't take effect anyway, and the df field, which is text, would not return values. df=description searches on that field and hence returns values. Try passing defType=edismax as a parameter.

Categories : Search

Wordnet SQL Explanation
To properly understand the meaning of the various terms in Wordnet, you should read the extensive documentation. For synonyms, you'll primarily need the synsets table. The actual database tables in the project you've downloaded are described on the project's schema page.

Categories : Mysql

C# Code explanation
Basically, that is logically equivalent to (but terser than):

bool dir;
int tmp = br.ReadInt32();
if (tmp < 0) {
    dir = true;
} else {
    dir = false;
}

It:
- makes the call to ReadInt32() (which results in an int)
- tests whether the result of that is < 0 (which results in either true or false)
- assigns that result (true or false) to dir

So basically, it will set dir to true if and only if the call to ReadInt32() gives a negative number.

Categories : C#

Enum Example Explanation
It makes more sense if you space it out a bit more:

enum CoffeSize {
    BIG(8),
    HUGE(10),
    OVERWHELMING(20) {
        public String getLidCode() {
            return "B";
        }
    };
    // rest of enum code here
}

Only OVERWHELMING is overriding getLidCode(). You would access it via this approach (using your code):

System.out.println(p.size.getLidCode());

The reason: p.size is of type CoffeSize, which has the method getLidCode(), so this prints the value of the overridden method, as expected.

Categories : Java

Explanation of ldd output
The most important part of that output is linux-vdso. VDSO stands for Virtual Dynamic Shared Object (http://en.wikipedia.org/wiki/VDSO); it's a way to export kernel-space routines to userspace. The main reason for it is to reduce system-call overhead. Typically, when a system call happens, it requires some expensive operations, like switching mode from user to kernel and copying data from userspace to kernelspace. The VDSO reduces this kind of overhead: just by reading that vdso memory region, the result can be extracted, i.e. it's possible to gettimeofday() without doing a real system call! Note that not all system calls have VDSO support; only calls like getcpu(), gettimeofday(), and time() do, and it is an extremely fast way to get these things done. Also, the memory address linux-vdso.so.1

Categories : Linux

JavaScript Regex Explanation
I would perform this kind of validation like so:

var d = document.createElement('div');
d.innerHTML = 'whatever </p>';
if (d.getElementsByTagName('*').length) {
    alert("You have typed some HTML");
}

Categories : Javascript

FaceRec tutorial explanation
subspaceProject() will basically give you a dimensionality reduction: projection = (images[0] - mean) * evs. Subtracting the mean ensures that the images approximate a subspace. Presumably evs is the truncated right singular vectors. And for subspaceReconstruct(): reconstruction = projection * transpose(evs) + mean. The reconstruction is just the reverse of the projection, except that since evs is truncated, it cannot be perfect. See PCA.
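The project/reconstruct pair can be checked numerically; here is a numpy sketch (the shapes and data are invented, and evs is taken from an SVD of the mean-centered data, as the answer presumes):

```python
# Project one "image" onto a truncated PCA subspace, then reconstruct.
# Because evs keeps only the top components, the round trip is close
# but not exact, as the answer explains.
import numpy as np

rng = np.random.default_rng(0)
images = rng.normal(size=(20, 8))   # 20 flattened "images", 8 pixels each

mean = images.mean(axis=0)
# Truncated right singular vectors (top 4 components).
_, _, vt = np.linalg.svd(images - mean, full_matrices=False)
evs = vt[:4].T                       # shape (8, 4)

projection = (images[0] - mean) @ evs          # subspaceProject
reconstruction = projection @ evs.T + mean     # subspaceReconstruct

err = np.linalg.norm(reconstruction - images[0])
print(projection.shape, err > 0)
```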

Categories : Opencv

cvResize Explanation, OpenCV
OpenCV documentation is blurry, and if you are going through the source code, I assume you are pretty angry at it, just like me :) Well, it takes a while, lots of experimentation, and reading online; slowly you fill the gaps. Assuming you are using the C interface of OpenCV, you should probably go through this book to learn about cvResize. You are not going to learn all the options for cvResize unless you understand what interpolation, bilinear, and cubic mean. These are not hard topics, but you must know a little theory and then test the code by writing programs yourself.

Categories : C++

Geocoder Class Explanation
First you need to get a LocationManager instance and the name of the provider you want to use to do a location look-up. For example:

String provider = LocationManager.NETWORK_PROVIDER;
LocationManager locationManager = (LocationManager) getSystemService(Context.LOCATION_SERVICE);

Next, you call requestLocationUpdates(...) like so:

locationManager.requestLocationUpdates(provider, 1000, 0, this);

This can be invoked on the main thread since (I think) the system will do the location lookup using the specified provider on a background thread. When a location is found, the Android system will invoke the onLocationChanged(...) callback, which you need to override. Since you're trying to do an address lookup, you would put in:

Geocoder geocoder = new Geocoder(this);
geocoder.getFromLocati

Categories : Android

Need explanation on Exception code
If a subclass method declares that it can throw a checked exception that the parent doesn't, it breaks the Liskov substitution principle, which is one of the cornerstones of object-oriented programming. Consider this bit of code, with Child.msg declared to throw a checked exception:

void doMsg(Parent p) {
    p.msg();
}

The program's semantics break if you pass in a Child object, because the checked exception is now neither being caught nor thrown: the exception is no longer "checked." Since unchecked exceptions can be thrown anywhere, declaring that you throw one serves no purpose other than documentation; therefore it can be allowed safely.

Categories : Java

Python: CrashingPython explanation
It makes a pointer to memory that's likely to be unwritable, and writes to it. The numerical value of a is very small, and very low memory addresses are typically not writable, causing a crash when you try to write to them. Should the initial write succeed, it keeps trying successive addresses until it finds one that isn't writable. Not all memory addresses are writable, so it's bound to crash eventually. (Why it doesn't simply start at address zero I don't know - that's a bit odd. Perhaps ctypes explicitly protects against that?)

Categories : Python

javascript, some function explanation?
var someName;

This is the variable declaration.

if (!someName) someName = {};

When the variable is undefined or otherwise falsy, create a new empty object in it.

someName.UI = function() { var player = 1; };

Create the member UI on the object, holding a function which will create a local variable with a value of 1.

someName.UI();

This is the call to that function.

Categories : Javascript



© Copyright 2017 w3hello.com Publishing Limited. All rights reserved.