oracle sql navigator data-mining text-mining
You can use REGEXP_SUBSTR:

    SQL> SELECT txt, regex, regexp_substr(txt, regex, 1, 1, 'c', '3') result
         FROM (SELECT q'¤@A'123'HEY'345''@B'K¤' txt,
                      q'¤^([^']*|('[^']*')*?)*?'([^']*[@#][^']*)'¤' regex
               FROM dual);

    TXT                  REGEX                                    RESULT
    -------------------- ---------------------------------------- ---------
    @A'123'HEY'345''@B'K ^([^']*|('[^']*')*?)*?'([^']*[@#][^']*)' @B

Categories : C#

How to use a regular expression inside TermDocumentMatrix for text mining?
I'm not sure that you can put a regex in the dictionary function, as it only accepts a character vector or a term-document matrix. The workaround I'd suggest is to use a regex to subset the terms in the term-document matrix, then do word counts:

    # What I would do instead
    tdm <- TermDocumentMatrix(crude, control = list(removePunctuation = TRUE))

    # subset the tdm according to the criteria
    # this is where you can use regex
    crit <- grep("cru", tdm$dimnames$Terms)

    # have a look to see what you got
    inspect(tdm[crit])

    A term-document matrix (2 terms, 20 documents)

    Non-/sparse entries: 10/30
    Sparsity           : 75%
    Maximal term length: 7
    Weighting          : term frequency (tf)

         Docs
    Terms 127 144 191 194 211 236 237 242 246 248 273 349 352 353 368
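The subset-then-count idea is not specific to R. As a rough sketch in Python, using only the standard library: the tiny `term_document_matrix` helper, the `subset_terms` name, and the sample documents are all made up for illustration and are not part of the tm package.

```python
import re
from collections import Counter

def term_document_matrix(docs):
    """Map each term to a list of per-document counts (a toy stand-in for a TDM)."""
    counts = [Counter(re.findall(r"[a-z]+", doc.lower())) for doc in docs]
    terms = sorted(set().union(*counts))
    return {term: [c[term] for c in counts] for term in terms}

def subset_terms(tdm, pattern):
    """Keep only the rows whose term matches the regular expression."""
    rx = re.compile(pattern)
    return {term: row for term, row in tdm.items() if rx.search(term)}

# Hypothetical sample documents, not the 'crude' corpus from the answer.
docs = ["Crude oil prices rose", "crude exports fell", "Oil demand is flat"]
tdm = term_document_matrix(docs)

crude_rows = subset_terms(tdm, "cru")          # regex applied to the terms
totals = {term: sum(row) for term, row in crude_rows.items()}
```

The regex is applied to the term names after the matrix is built, which mirrors the `grep` over `tdm$dimnames$Terms` in the answer.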

Categories : Regex

How to convert a termDocumentMatrix which I have got from text mining in R into excel or CSV file?
I assume you have a vector of strings whose elements are separated by commas, with a different number of elements in each:

    Names <- c("aaron, matt, patrick", "jiah, ron, melissa, john, patrick")

    ## get the max number of elements
    mm <- max(unlist(lapply(strsplit(Names, ','), length)))

    ## set all rows to the same length
    lapply(strsplit(Names, ','), function(x) { length(x) <- mm; x })

    ## create a data frame with the data well formatted
    res <- do.call(rbind, lapply(strsplit(Names, ','), function(x) { length(x) <- mm; x }))

    ## save the file
    write.csv(res, 'output.csv')

I think you can also use rbind.fill from the plyr package, but you have to coerce each row to a data.frame (at a certain cost).
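For comparison, here is a minimal Python sketch of the same padding step using only the standard library. It is an analogue, not a translation: the `pad_rows` helper is a hypothetical name, short rows are padded with empty strings where R would use NA, and surrounding whitespace is stripped.

```python
import csv
import io

def pad_rows(names):
    """Split each comma-separated string and pad the rows to equal width."""
    rows = [[item.strip() for item in line.split(",")] for line in names]
    width = max(len(row) for row in rows)
    return [row + [""] * (width - len(row)) for row in rows]

names = ["aaron, matt, patrick", "jiah, ron, melissa, john, patrick"]
rows = pad_rows(names)

# Write to an in-memory buffer here; open("output.csv", "w", newline="")
# would be used in the same way for a real file.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(rows)
csv_text = buffer.getvalue()
```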

Categories : R

Pairing qualitative user data with text-mining results
... as ben mentioned:

    vec <- as.character(x[, "place of comments"])
    Corpus(VectorSource(vec))

Perhaps some customer ID as metadata would be nice... hth

Categories : R

Data Mining: grouping based on two text values (IDs) and one numeric (ratio)
Sounds like a classic matrix factorization task to me, with a weighted matrix instead of a binary one. So some fast algorithms may not be applicable, because they support binary matrices only. Don't ask for sources on Stack Overflow: asking for off-site resources (tools, libraries, ...) is off-topic.

Categories : Python

Topic mining algorithm in c/c++
If you wish to count the occurrences of each word in an array then you can do no better than O(n) (i.e. one pass over the array). However, if you try to store the word counts in a two-dimensional array then you must also do a lookup each time to see if the word is already there, and this can quickly become O(n^2). The trick is to use a hash table to do your lookup. As you step through your word list you increment the right entry in the hash table. Each lookup should be O(1), so it ought to be efficient as long as there are sufficiently many words to offset the complexity of the hashing algorithm and memory usage (i.e. don't bother if you're dealing with fewer than 10 words, say). Then, when you're done, you just iterate over the entries in the hash table to find the maximum. In fact, I wo
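The hash-table approach described above can be sketched in a few lines. This is an illustration in Python rather than C/C++ (a `dict` plays the role of the hash table), and the function name and sample sentence are made up for the example.

```python
def most_frequent_word(words):
    """One pass to count with a hash table, one pass over the table for the max."""
    counts = {}
    for word in words:
        # Average O(1) lookup and update per word.
        counts[word] = counts.get(word, 0) + 1

    best_word, best_count = None, 0
    for word, count in counts.items():
        if count > best_count:
            best_word, best_count = word, count
    return best_word, best_count

words = "the cat and the hat and the bat".split()
top = most_frequent_word(words)
```

In C++ the same structure would typically use `std::unordered_map<std::string, int>` in place of the dictionary.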

Categories : C++

Web usage mining with rapidminer
This was asked in the rapidforum: yes, the RapidMiner 4.6 Community Edition together with its text mining plugin is suitable for web usage mining. The RapidMiner 4.6 operator LogFileSource lets you import web server log files directly. RapidMiner supports aggregations of web usage statistics, automated extraction of web page visitor sessions, search robot filtering, mash-ups with web services to map IP addresses to countries, cities, and map coordinates, automated clustering of visits and/or click paths, frequent path item set mining and association rule generation, 2D and 3D visualization of web usage statistics, click path sequence analysis, personalized product recommendations for cross-selling, and many other things.

Categories : Misc

Data Mining from HTML
Typically, whenever you're retrieving data from a database in order to display information in the UI, it's best to avoid copy-and-paste "inheritance". Instead you might want to look into template-based data binding. Which specific approach to use depends on the technology you're using. In the above case it looks like it would make sense to bind your dropdown to a data source.

Categories : PHP

Data mining library for hadoop
Why not use Spark? It's a very efficient open source cluster computing system, both fast to run and fast to write. For distributed data mining, Spark is a very good tool. Hope this helps!

Categories : Hadoop

Sequential Pattern - Data Mining
In general, and keep in mind, this is inherently opinion based, data mining refers to the process of taking data that is in a relatively unusable format and converting it into a format that is more usable. For instance, if I have a huge .txt dump of unstructured text and I then extract relevant portions (according to some formal definition of relevant) and place it into a .bson store or something similar, that would be data mining, regardless of exactly how I do the extraction. However, since your data is already in a SQL database, I wouldn't consider this data mining. I would consider it SQL development, though again, this is largely opinion-based. A SQL database is already a highly useful way of storing data, so accessing that data isn't introducing a level of functionality that wasn'

Categories : SQL

Use data mining in SQL Server 2008 R2
First you should get SQL Server Data Tools which runs in Visual Studio. You will need Analysis Services installed; if you don't have it just run the SQL Server installer again and look for the option to install it. After that you can take a look at this post I wrote a few months ago: http://www.sqlservercentral.com/Forums/Topic480010-147-1.aspx I wrote it specifically targeting the Neural Network models, but it contains details on several background steps you will need to do. Finally - since you're using an evaluation version, you may want to just go for SQL Server 2012 (that's what I use, so I know it works).

Categories : Sql Server

stratum-mining-proxy error - Can't decode message
I am a little curious: I don't know for a fact, but I was under the impression that the mining proxy was for BTC, not LTC. Anyway, I believe I got a similar message when I first installed it as well. To fix it, or rather to actually get it running, I had to use the Git installation method instead of installing manually.

Installation on Linux using Git

This is an advanced option for experienced users, but it gives you the easiest way of updating the proxy.

1. git clone git://github.com/slush0/stratum-mining-proxy.git
2. cd stratum-mining-proxy
3. sudo apt-get install python-dev # Development package of Python is necessary
4. sudo python distribute_setup.py # This will upgrade the setuptools package
5. sudo python setup.py develop # This will install required dependencies (namely Twisted and Stratum

Categories : Python

Scala - replace xml element with specific text
For real?

    scala> val x = <foo>Hi</foo>
    x: scala.xml.Elem = <foo>Hi</foo>

    scala> x match { case <foo>{what}</foo> => <foo>{System.nanoTime}</foo> }
    res1: scala.xml.Elem = <foo>213370280150006</foo>

Update with linked example:

    import scala.xml._
    import System.{ nanoTime => now }

    object Test extends App {
      val InputXml : Node =
        <root>
          <subnode>
            <version>1</version>
          </subnode>
          <contents>
            <version>1</version>
          </contents>
        </root>

      def substitution = now // whatever you like

      def updateVersion(node: Node): Node = node match {
        case <root>{ ch @ _* }</root> => <root>{ ch.map(updateVersion )}</root>
        case

Categories : Xml

can gcc do loop optimizations (strip-mining/blocking) on unknown iteration count?
Definitely read the GCC 4.5.0 Optimize Options docs (search for -floop-strip-mine, about 1/3 of the way down the page). Also, make sure GCC is getting the --with-ppl and --with-cloog options (as noted in the docs about using Graphite in -floop-strip-mine). Without those, GCC probably won't even try to perform strip mining on your code.

Based on the behavior description and pseudocode examples in the docs, which show loops with finite strip lengths and iteration counts, I'd say that GCC probably does not do strip mining on unknown iteration counts. From the docs:

Pseudocode original loop:

    DO I = 1, N
      A(I) = A(I) + C
    ENDDO

Pseudocode strip-mined loop:

    DO II = 1, N, 51
      DO I = II, min (II + 50, N)
        A(I) = A(I) + C
      ENDDO
    ENDDO
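To make the transformation itself concrete, here is a small Python sketch of the two loops from the docs' pseudocode. It only illustrates what strip mining does to a loop; it says nothing about when GCC applies it. The function names are made up, and the strip length of 51 is taken from the pseudocode.

```python
def add_constant(a, c):
    """Original loop: one pass over all elements."""
    return [x + c for x in a]

def add_constant_strip_mined(a, c, strip=51):
    """Strip-mined loop: an outer loop over strips, an inner loop within a strip."""
    out = list(a)
    n = len(out)  # n is only known at run time here
    for start in range(0, n, strip):
        # min() clamps the last strip, exactly like min(II + 50, N) above.
        for i in range(start, min(start + strip, n)):
            out[i] += c
    return out

data = list(range(120))  # 120 is not a multiple of 51, so the last strip is short
```

Both versions compute the same result; the `min` in the inner bound is what lets the strip-mined form cope with a length that is not a multiple of the strip size.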

Categories : C

updating line in large text file using scala
If you want to use text files, consider a fixed length/record size for each line/record. This way you can use a RandomAccessFile to seek to the exact position of each line by number: you just seek to line * LineSize, and then update it. It will not really help if you have to insert a new line. Other limitations are: the file size will grow (because of the fixed record length), and there will always be one record which is too big. As for the initial conversion: get the maximum line length of the current file, then add 10%, for example. Now you have to convert the file once: read a line from the text file, and convert it into a fixed-size record. You could use a special character like | to separate the fields. If possible, use something like ;, so you get a .csv file. I suggest padding t
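The seek-to-line * LineSize idea can be sketched as follows. This is a Python illustration rather than Scala with RandomAccessFile, and everything specific in it is an assumption: the 16-byte record size, space padding, ASCII content, and the helper names.

```python
import os
import tempfile

RECORD_SIZE = 16  # hypothetical: bytes per record, including the trailing newline

def encode(line):
    """Pad a line with spaces to exactly RECORD_SIZE bytes."""
    data = line.encode("ascii")
    if len(data) > RECORD_SIZE - 1:
        raise ValueError("line does not fit into one record")
    return data.ljust(RECORD_SIZE - 1, b" ") + b"\n"

def write_records(path, lines):
    with open(path, "wb") as f:
        for line in lines:
            f.write(encode(line))

def update_record(path, index, line):
    """Overwrite record number `index` in place, without touching the rest."""
    with open(path, "r+b") as f:
        f.seek(index * RECORD_SIZE)
        f.write(encode(line))

def read_record(path, index):
    with open(path, "rb") as f:
        f.seek(index * RECORD_SIZE)
        return f.read(RECORD_SIZE).decode("ascii").rstrip()

path = os.path.join(tempfile.mkdtemp(), "records.txt")
write_records(path, ["alpha;1", "beta;2", "gamma;3"])
update_record(path, 1, "delta;42")
```

Because every record has the same size, the update is a single seek and write; the cost is the wasted padding and the hard limit on line length that the answer mentions.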

Categories : Scala

Create web services for Data Mining using Business Intelligence Development Studio (BIDS)
I assume here that you know how to write a web service. You should use ADOMD.NET to fetch your cube data.

Refer: ADOMD.NET Client Programming
Example: Displaying a grid using ADOMD.NET and MDX

Code:

    AdomdConnection conn = new AdomdConnection(strConn);
    conn.Open();
    AdomdCommand cmd = new AdomdCommand(MDX_QUERY, conn);
    CellSet cst = cmd.ExecuteCellSet();

Categories : C#

Will Twitter's rate limits allow me to do the data mining necessary to construct a complete social network graph of about 600K users?
I'll answer these questions in reverse order, starting with David Marx first: Well, I do have access to a pretty robust computer research center with a ton of storage capacity, so that should not be an issue. I don't know if the software can handle it, however. Chances are that I will have to scale down the project, which is OK. The idea for me is to start out with a bigger idea, figure out how big it can be, and then pare down accordingly. Following up on Anony-Mousse's question now: Part of my problem is that I am not sure I am interpreting the Twitter rate limits correctly. I'm not sure if it's 15 requests per 15 minutes, or 30 requests per 15 minutes. And I think 1 request will get 5000 followers/friends, so you could presumably collect 75,000 friends or followers every 15 minut
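The arithmetic behind that estimate is easy to write down. The sketch below is a back-of-the-envelope calculation only. It assumes 15 requests per 15-minute window (the answer is unsure whether it is 15 or 30), a single credential, at most 5000 IDs per request as quoted above, and that the followers and friends lists each cost at least one request per user from the same budget. None of these figures are checked against Twitter's documentation here.

```python
REQUESTS_PER_WINDOW = 15   # assumption; the answer is unsure whether it is 15 or 30
WINDOW_MINUTES = 15
IDS_PER_REQUEST = 5000     # figure quoted in the answer
USERS = 600_000

# IDs that can be collected in one window if every request returns a full page.
ids_per_window = REQUESTS_PER_WINDOW * IDS_PER_REQUEST

def crawl_days(users, lists_per_user=2):
    """Lower bound on crawl time: one request per user per list."""
    requests = users * lists_per_user
    windows = -(-requests // REQUESTS_PER_WINDOW)  # ceiling division
    return windows * WINDOW_MINUTES / (60 * 24)

days = crawl_days(USERS)
```

Under these assumptions the binding constraint is the number of users rather than the number of IDs, since every user costs at least one request no matter how few followers they have.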

Categories : Twitter

Scala: How to transform a POJO like object into a SQL insert statement using Scala reflection
In your case, you can use the following code:

    val o = new MyDataObj
    val attributes = o.getClass.getDeclaredMethods.filter {
      _.getReturnType != Void.TYPE
    }.map { method =>
      (method.getName, method.getReturnType, method.invoke(o))
    }

Here I use getDeclaredMethods to get the methods declared in MyDataObj. Note that getDeclaredMethods does not return methods inherited from a parent class. For MyDataObj, getDeclaredMethods will return the following methods:

    public double MyDataObj.c()
    public boolean MyDataObj.b()
    public java.lang.String MyDataObj.d()
    public int MyDataObj.a()
    public void MyDataObj.c_$eq(double)
    public void MyDataObj.d_$eq(java.lang.String)
    public void MyDataObj.b_$eq(boolean)
    public void MyDataObj.a_$eq(int)

So I add a filter to exclude the irrelevant methods.

Categories : SQL

"scala.runtime in compiler mirror not found" but working when started with -Xbootclasspath/p:scala-library.jar
The easy way to configure the settings with familiar keystrokes:

    import scala.tools.nsc.Global
    import scala.tools.nsc.Settings

    def main(args: Array[String]) {
      val s = new Settings
      s processArgumentString "-usejavacp"
      val g = new Global(s)
      val r = new g.Run
    }

That works for your scenario. Even easier:

    java -Dscala.usejavacp=true -jar ./scall.jar

Bonus info, I happened to come across the enabling commit message:

Went ahead and implemented classpaths as described in email to scala-internals on the theory that at this point I must know what I'm doing.

** PUBLIC SERVICE ANNOUNCEMENT **

If your code of whatever kind stopped working with this commit (most likely the error is something like "object scala not found") you can get it working again

Categories : Scala

"error: can't find main class scala.tools.nsc.MainGenericRunner" when running scala in windows
Those weird variables are called parameter extensions. They allow you to interpret a variable as a path to a file/directory and directly resolve things from that path. For example, if %1 is a path to a file dir123456\file.txt, %~f1 is the fully qualified path to file.txt, %~p1 is the path to the containing directory dir123456, %~s1 is the path in short name format dir123~1\file.txt, and many others... Also, %0 is always set to the path of the currently running script. So: %~fs0 is the fully qualified path, in short name format, to the current script; %%~dpsi is the manual expansion of the FOR variable %%i to a drive letter (d option) followed by the path to the containing folder (p option), in short format (s option). Now, this weird looking block of code is a workaround for KB83343

Categories : Windows

Any specific rules for converting MySQL data to Prolog rules for exploratory mining?
You could maybe use nth1/3 and the "univ" operator, doing something like this:

    fieldnames(t, [id,this,that]).

    get_field(Field, Tuple, Value) :-
        Tuple =.. [Table|Fields],
        fieldnames(Table, Names),
        nth1(Idx, Names, Field),
        nth1(Idx, Fields, Value).

You'd need to create fieldnames/2 records for each table structure, and you'd have to pass the table structure along to this query. It wouldn't be terrifically efficient, but it would work.

    ?- get_field(this, t(testId, testThis, testThat), Value).
    Value = testThis

You could then build your accessors on top of this pretty easily:

    findThisById(X, This) :- get_field(this, X, This).

Edit: Boris points out rightly that arg/3 will do this with even less work:

    get_field(Field, Tuple, Value) :-
        functor(Tuple, Table, _),
        f

Categories : Mysql

scala Duration: "This class is not meant as a general purpose representation of time, it is optimized for the needs of scala.concurrent."
Time can be represented in various ways depending on your needs. I personally have used:

- Long: a lot of tools take it directly
- Updated: java.time.* thanks to @Vladimir Matveev. The package is designed by the author of Joda Time (Stephen Colebourne). He says it is designed better.
- Joda Time
- java.util.Date
- Separate hierarchy of classes:

      trait Time
      case class ExactTime(timeMs:Long) extends Time
      case object Now extends Time
      case object ASAP extends Time
      case class RelativeTime(origin:Time, deltaMs:Long) extends Time

- Ordered time representation:

      case class History[T](events:List[T])

- Model time. Once I had a global object Timer with var currentTime:Long:

      object Timer {
        private var currentTimeValue:Long
        def currentTimeMs = currentTimeValue
        def currentTimeMs_=(newTime:Long) { ...

Categories : Scala

Why can I start the Scala compiler with "java -cp scala-library.jar;. Hello World"?
Short version: you're not using the Java compiler, you're using the Java runtime. Long version: there's a big difference between javac and java. javac is the Java compiler, which takes in Java source code and outputs JVM bytecode. java is the Java runtime, which takes in JVM bytecode and runs it. But one of the great things about the JVM is that you can generate bytecode for it any which way. Scala generates JVM bytecode without any Java source code.

Categories : Java

Why use template engine in playframework 2 (scala) if we may stay with pure scala
Actually, you should put this question to the dev team; however, consider a few points:

- You don't need to use Play's templating engine at all. You can easily return any string with the Ok() method, so according to your link you can just do something like Ok(theDate("John Doe").toString())
- Play uses an approach which is very typical for other MVC web frameworks, where views are HTML-based files, because... it's a web-dedicated framework. I can't see anything wrong with this; sometimes I work with other languages/frameworks and can see that the only difference in views between them is just language-specific syntax. That's the goal!
- Also, don't forget that Play is a bilingual system; someone could ask 'why not use some Java lib for processing the views?'
- The built-in Scala XML literals a

Categories : Scala

scala for each loop got error when convert java to scala
    override def saveOrUpdateAll(entities: Collection[T]) {
      import scala.collection.JavaConverters._
      val session: Session = getSession()
      for (entity <- entities.asScala) {
        session.saveOrUpdate(entity)
      }
    }

There is no Java-style for-each loop in Scala. You should wrap your collection using JavaConverters and use a for-comprehension here. JavaConverters wraps the Collection using Wrappers.JCollectionWrapper without memory overhead.

Categories : Java

Scala import statement at top and inside scala class
The difference is: in Option 1 the import is visible in the complete scope, i.e. any class/trait/function in com.somePackage can be used anywhere inside or outside MyClass. In Option 2 it can only be used inside MyClass and not outside it, because the scope of the import is limited to MyClass.

Categories : Scala

Scala Macros: Making a Map out of fields of a class in Scala
Note that this can be done much more elegantly without the toString / c.parse business:

    import scala.language.experimental.macros

    abstract class Model {
      def toMap[T]: Map[String, Any] = macro Macros.toMap_impl[T]
    }

    object Macros {
      import scala.reflect.macros.Context

      def toMap_impl[T: c.WeakTypeTag](c: Context) = {
        import c.universe._

        val mapApply = Select(reify(Map).tree, newTermName("apply"))

        val pairs = weakTypeOf[T].declarations.collect {
          case m: MethodSymbol if m.isCaseAccessor =>
            val name = c.literal(m.name.decoded)
            val value = c.Expr(Select(c.resetAllAttrs(c.prefix.tree), m.name))
            reify(name.splice -> value.splice).tree
        }

        c.Expr[Map[String, Any]](Apply(mapApply, pairs.toList))
      }
    }

Note also that you need the c.r

Categories : Scala

Mining pdf Data with python through clipboard - Python Scripting the OS
I have settled on using pyPdf. It has a simple method that just extracts the text from the PDF. I have written simple functions to find the relevant information I need in this text, splitting the text into a list for easy data identification. I have also written a loop to pick up the relevant files using a glob search and feed them into the parser.

    import pyPdf

    pdf = pyPdf.PdfFileReader(open(filename, "rb"))
    data = ''
    for page in pdf.pages:
        data += page.extractText()
    data2 = data.split(' ')

Categories : Python

How to use Scala Virtualized in a Maven project with Scala-2.10?
The groupId of scala-library and scala-compiler is hard-coded into the plugin. The information (version, ...) for scala-compiler is computed from the scala-library dependency. You can open a ticket and ask for support for other groupIds (maybe a configurable one rather than a hard-coded one). You can also fork, make the change, and submit a patch / pull request. UPDATE: scala-maven-plugin 3.1.6 includes the patch from evantill (thanks), so you can override the default scalaOrganization.

Categories : Scala

Convert Java Object (that is really a Scala Map) to Scala Map
You are probably trying to cast across class loaders. You can't do this: each class loader maintains its own hierarchy (for those classes not passed off to a common parent loader). Try calling getClass.getClassLoader on both your returned map and on a freshly-created one.

Incidentally, Map$Map2 is just an implementation detail, a subclass of Map for dealing with two-element maps. It casts just fine normally:

    scala> val m = Map(1->"one", 2 -> "two"): Object
    m: Object = Map(1 -> one, 2 -> two)

    scala> m.getClass
    res0: Class[_ <: Object] = class scala.collection.immutable.Map$Map2

    scala> m.asInstanceOf[scala.collection.immutable.Map[Int,String]]
    res1: scala.collection.immutable.Map[Int,String] = Map(1 -> one, 2 -> two)

Categories : Scala

Scala: Use Java Constructor with Subclasses in Scala
Why are you using classOf[FXTaskStackElement].asInstanceOf[Class[_]] instead of just classOf[FXTaskStackElement]? Since your second argument is a Class[_], there is no suitable SUBCLASS.

Categories : Java

I have to compile hello.scala which has a import command "org.scala._"
Include the package name when using the scala command, and run it against the class file instead of the source code:

    scala org.scala.Hello

As in Java, Scala naming conventions say that class names start with an uppercase letter, e.g. Hello.

Categories : Java

Scala Yield behaves differently than Scala Map
    var characterMap = (for (_ <- 0 until sizeX*sizeY) yield Nil).toArray[List[ActorRef]]

is translated into:

    var characterMap = (0 until sizeX*sizeY).map(_ => Nil).toArray[List[ActorRef]]

and this should work. Nit-picking: they are not equivalent, they are the same; for is just syntactic sugar. By the way, you might want to consider:

    Array.fill[List[ActorRef]](sizeX*sizeY)(Nil)

Categories : Arrays

Scala: Use multiple constructors from Java in Scala
Now, with the help of a colleague, I have solved the problem. Instead of classOf[Button] I have to use classOf[Button].asInstanceOf[Class[_]]. With this it works fine.

Categories : Java

class java.lang.RuntimeException/Scala class file does not contain Scala annotation
Looks like the library is configured to compile against Scala 2.9.1. Major versions of Scala are not binary compatible. I put the necessary SBT changes here: https://github.com/mpartel/prestashop-scala-client/commit/e9a1df40bfe35518aaebac899e438b9b6fa6d728

Categories : Scala

Scala: How to pattern match scala.Long and java.lang.Long
It should work straight away:

    object LongTest {
      def test(value: Any): Boolean = value match {
        case l: Long => true
        case _ => false
      }

      def run() {
        println(test(1L))
        println(test(new java.lang.Long(1L)))
      }
    }

    LongTest.run() // true and true

It wasn't obvious to me that you want to match classes instead of instances. I'm not sure I understand what you actually want. Like this?

    object LongTest {
      def test(clazz: Class[_]): Boolean =
        clazz == classOf[Long] || clazz == classOf[java.lang.Long]

      def run() {
        println(test(1L.getClass))
        println(test(new java.lang.Long(1L).getClass))
      }
    }

    LongTest.run() // true and true

Or as a pattern match:

    def test(clazz: Class[_]): Boolean = clazz match {
      case q if q == classOf[Long] || q == classOf[j

Categories : Scala

Adding "new line"s to text in text supporting objects (i.e Buttons, Rich Text boxs) both in and out of code
"\n" is the escape sequence for a line break. So code like:

    label.Text = "This is a \nbutton";

should put the word "button" on a new line. Edit: If you want to do it using the Properties window in the designer, click on the arrow on the far right of the Text property field and it will open a small box. If you type multiple lines in it as you would normally (i.e. actually pressing Enter, not using \n), the component will treat them as new lines and put the new lines in for you.

Categories : C#

Ignoring leading whitespace in the "lazy-text" quasiquoter in Haskell Text.Shakespeare.Text
It doesn't appear to; however, it isn't hard to add the feature yourself. lt is just a QuasiQuoter, which is the data type:

    QuasiQuoter { quoteExp  :: String -> Q Exp
                , quotePat  :: String -> Q Pat
                , quoteType :: String -> Q Type
                , quoteDec  :: String -> Q [Dec]
                }

Each field takes a String and returns the appropriate Template Haskell type (depending on the context it is used in). It is a simple matter to transform a string so it works as you described with a regex:

    stripWhiteSpaceBeforeBackslash :: String -> String
    stripWhiteSpaceBeforeBackslash str = subRegex (mkRegex "^[[:space:]]*\\\\") str ""

Also, a function that transforms a QuasiQuoter with a string transform function is simple:

    transformQuasiQuoter :: (String -> String) -> QuasiQuoter -> QuasiQuoter
    transfo

Categories : Haskell

Give a block of text, some keyshorts, and the target text, how to get the minimal key press to convert the text?
It's probably NP-hard. This paper gives a hardness proof for an editor where insertions, deletions, and substring moves all have constant cost.

Shapira, Dana and Storer, James A. (2002). Edit Distance with Move Operations. In: Apostolico, Alberto and Takeda, Masayuki (eds.), Combinatorial Pattern Matching. Lecture Notes in Computer Science, vol. 2373, pp. 85-98. Springer Berlin Heidelberg. ISBN 978-3-540-43862-5. doi:10.1007/3-540-45452-7_9. http://dx.doi.org/10.1007/3-540-45452-7_9
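For contrast, the classic edit distance without move operations (insertions, deletions, and substitutions only, each with cost 1) is solvable in polynomial time with dynamic programming. The sketch below shows that simpler baseline; it is not the algorithm from the paper and does not model keyboard shortcuts or substring moves.

```python
def edit_distance(source, target):
    """Levenshtein distance using two rows of the dynamic-programming table."""
    prev = list(range(len(target) + 1))  # distance from "" to each target prefix
    for i, s in enumerate(source, start=1):
        curr = [i]  # distance from source[:i] to ""
        for j, t in enumerate(target, start=1):
            cost = 0 if s == t else 1
            curr.append(min(prev[j] + 1,          # delete s
                            curr[j - 1] + 1,      # insert t
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

distance = edit_distance("kitten", "sitting")
```

This runs in O(len(source) * len(target)) time. Once block moves are allowed at constant cost, the problem becomes the one the cited paper addresses.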

Categories : Algorithm

Change encoding of text file (shell archive or script for antique kernel text to ASCII text, with CRLF, LF line terminators)
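Unix uses a single character (LF) for line termination. If you want to convert your file from CRLF to single-character termination, you can do the following:

    sed -e 's/<CTRL-V><CTRL-M>//' filename

where <CTRL-V> is the Control key pressed with V and <CTRL-M> is the Control key pressed with M (do not include the < and > characters in the command). Typing CTRL-V first makes the shell insert the following CTRL-M as a literal carriage return.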
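If typing control characters into a shell is awkward, the same conversion can be scripted. The following Python sketch is an alternative to the sed command, not what the answer used; the function names are made up, and it replaces only CRLF pairs, leaving any lone carriage returns alone.

```python
def crlf_to_lf(data: bytes) -> bytes:
    """Replace every CRLF pair with a single LF."""
    return data.replace(b"\r\n", b"\n")

def convert_file(src, dst):
    """Read src in binary mode and write the converted bytes to dst."""
    with open(src, "rb") as f:
        data = f.read()
    with open(dst, "wb") as f:
        f.write(crlf_to_lf(data))

sample = b"first\r\nsecond\nthird\r\n"
converted = crlf_to_lf(sample)
```

Working in binary mode avoids any newline translation by Python itself, so the output contains exactly the bytes shown.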

Categories : Bash



© Copyright 2017 w3hello.com Publishing Limited. All rights reserved.