Nokogiri+Paperclip = XML parse with img from url
This is a code snippet from my parser. I guess where you went wrong is using open(url) instead of open(URI.parse(url)):

    require 'open-uri' # needed so open() can fetch remote URLs

    picture = Picture.new(
      realty_id: realty.id,
      position:  position,
      hashcode:  realty.hashcode
    )
    # picture.image = URI.parse(url)
    # edit: added open() as this worked for Savroff
    picture.image = open(URI.parse(url))
    picture.save!

Additionally, it would be a good idea to check that the image really exists:

    require 'net/http'

    picture_array.each do |url|
      # checks if the link works
      res = Net::HTTP.get_response(URI.parse(url))
      # if so, it will add the picture link to the verified array
      # (good codes will be between 200 and 399)
      if res.code.to_i >= 200 && res.code.to_i < 400
        verified_array << url
      end
    end
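Putting the two snippets together, a minimal sketch that attaches only the verified links (verified_array, realty, and ordering pictures by array index are assumptions carried over from the code above):

    verified_array.each_with_index do |url, position|
      picture = Picture.new(
        realty_id: realty.id,
        position:  position,
        hashcode:  realty.hashcode
      )
      picture.image = open(URI.parse(url))
      picture.save!
    end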

Categories : Ruby On Rails

How do I parse HTML using Nokogiri?
Since you need to find the cells relative to the city link, you should find their common ancestor, in this case their tr. Using XPath, you can locate the specific row through the text of its city cell:

    # This is the table that contains all of the city data
    data_table = extrair.at_css('.table_padrao')

    # This is the specific row that contains the specified city
    row = data_table.xpath('//tr[td/a[@class="linkpadrao" and text()="Serra Talhada"]]')

    # This is the data in the specific row
    data = row.css(".color_line").map { |e| e.text }
    #=> ["9", "2,973", "0,016", "2,939", "3,000", "0,572", "2,401", "0,024", "2,378", "2,426"]
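If you need the same data for other cities, the lookup can be wrapped in a small helper. A sketch assuming the same table structure and the extrair document from the answer; the method name is hypothetical:

    # Hypothetical helper: returns the .color_line cell texts for one city
    def city_data(doc, city)
      table = doc.at_css('.table_padrao')
      # note: interpolating straight into XPath is fine for a sketch,
      # but beware of quoting if city names can contain double quotes
      row = table.xpath(%Q{//tr[td/a[@class="linkpadrao" and text()="#{city}"]]})
      row.css('.color_line').map(&:text)
    end

    city_data(extrair, 'Serra Talhada')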

Categories : Ruby

How do I use Nokogiri to parse row by row using CSS selectors?
I don't know what you're expecting this to do:

    rows [1..10].each do |row|
      puts "this isn't working :("
    end

but I'm pretty sure it won't do what you're expecting. That's actually interpreted as this:

    rows[1..10].each { ... }

and since rows (which is a Nokogiri::XML::NodeSet) only has one entry, trying to extract a subset starting at 1 gives you an empty NodeSet; that means you're effectively just saying this:

    some_empty_node_set.each { ... }

and that does nothing useful. However, if you look at the first entry in rows, you'll find the href you're looking for:

    rows[0]['href'] # "http://servico-informatica.vivanuncios.com/..."

You could also look at rows.attr('href') or rows.first['href'], depending on taste and what fits your needs.

Categories : HTML

How do I use Nokogiri to parse an XML file?
Here I will try to clear up the questions and confusions you are having:

    require 'nokogiri'

    doc = Nokogiri::XML.parse <<-XML
    <Collection version="2.0" id="74j5hc4je3b9">
      <Name>A Funfair in Bangkok</Name>
      <PermaLink>Funfair in Bangkok</PermaLink>
      <PermaLinkIsName>True</PermaLinkIsName>
      <Description>A small funfair near On Nut in Bangkok.</Description>
      <Date>2009-08-03T00:00:00</Date>
      <IsHidden>False</IsHidden>
      <Items>
        <Item filename="AGC_1998.jpg">
          <Title>Funfair in Bangkok</Title>
          <Caption>A small funfair near On Nut in Bangkok.</Caption>
          <Authors>Anthony Bouch</Authors>
          <Copyright>Copyright © Anthony Bouch</Copyright>
        </Item>
      </Items>
    </Collection>
    XML
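The explanation was cut off here. As a minimal sketch of how you would read values back out of this doc (the selectors simply mirror the sample XML above):

    # Text of a single element
    puts doc.at_xpath('//Name').text   #=> "A Funfair in Bangkok"

    # Attributes are read with []
    item = doc.at_xpath('//Item')
    puts item['filename']              #=> "AGC_1998.jpg"

    # Iterate over every Item in the collection
    doc.xpath('//Items/Item').each do |i|
      puts i.at_xpath('Title').text
    end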

Categories : Ruby

Rails - Upload and parse multiple CSV files
Well, as far as I can see, this behavior shouldn't actually be happening. I would suggest another way of writing this code, since it's looking a bit polluted:

    def import
      # with x being the max number of files uploaded at the same time
      (1..x).each do |i|
        RatingSet.multi_uploader(params["file_#{i}".to_sym])
      end
      redirect_to "/multi_uploader", :flash => { :notice => "Successfully Uploaded." }
    end

Either way, this alone shouldn't solve your problem; to be honest, I can't tell why it is happening. UPDATE: Have you considered using:

    <%= f.file_field :file, :multiple => true %>

instead of multiple file tags?
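With a single multi-file field, the controller can iterate over the uploads in one pass. A minimal sketch, assuming the form is built with form_for on a RatingSet (so the files arrive as an array under params[:rating_set][:file]) and that RatingSet.multi_uploader accepts one uploaded file, both carried over from the answer above:

    def import
      # :multiple => true delivers an array of uploaded files
      # instead of numbered file_1, file_2, ... params
      Array(params[:rating_set][:file]).each do |uploaded|
        RatingSet.multi_uploader(uploaded)
      end
      redirect_to "/multi_uploader", :flash => { :notice => "Successfully Uploaded." }
    end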

Categories : Ruby On Rails

How do I parse XML with Nokogiri css selectors, using loops?
Nokogiri's NodeSet and Node support very similar APIs, with the key semantic difference that a NodeSet's methods tend to operate on all the contained nodes in turn. For example, while a single node's children gets that node's children, a NodeSet's children gets all contained nodes' children (ordered as they occur in the document). So, to print all the titles and authors of all your items, you could do this:

    require 'nokogiri'

    doc = Nokogiri::XML(File.open("sample.xml"))
    coll = doc.css("Collection")

    coll.css("Items").children.each do |item|
      title   = item.css("Title")[0]
      authors = item.css("Authors")[0]
      puts title.content   if title
      puts authors.content if authors
    end

You can get at any level of the tree in this way. Another example is a depth-first search printing every node in the tree.
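The answer breaks off before showing that second example; a minimal sketch of such a depth-first walk using Nokogiri's built-in traverse (printing element names is just an illustrative choice):

    # traverse visits the node and every descendant, depth-first
    doc.traverse do |node|
      puts node.name if node.element?
    end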

Categories : Ruby

Parse namespaced xml with ruby nokogiri
You should use the Nokogiri namespace syntax: http://nokogiri.org/tutorials/searching_a_xml_html_document.html#namespaces. First, make sure you have namespaces you can use:

    ns = {
      'xmlns' => 'http://schemas.dmtf.org/ovf/environment/1',
      'oe'    => 'http://schemas.dmtf.org/ovf/environment/1'
    }

(I'm defining both even though they are the same in this example.) You might also look into using the namespaces already available in doc.collect_namespaces. Then you can just do:

    value = properties.at('./xmlns:Property[@oe:key="mykey"]/@oe:value', ns).content

Note that I am using ./ here because, for this specific search, Nokogiri interprets the XPath as CSS without it. You may wish to use .// instead.
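As a small follow-up sketch of the collect_namespaces route mentioned above; note it only works cleanly when every namespace in the document has a unique prefix:

    # Build the prefix => URI hash from the declarations
    # already present in the document itself
    ns = doc.collect_namespaces
    value = properties.at('./xmlns:Property[@oe:key="mykey"]/@oe:value', ns).content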

Categories : Ruby

how to parse XML file remotely from FTP with nokogiri gem, without downloading
If you simply want to avoid using a temporary local file, it is possible to fetch the file contents directly as a String, and process them in memory, by supplying nil as the local file name:

    files.each do |file|
      xml_string = ftp.getbinaryfile(file, nil)
      doc = Nokogiri::XML(xml_string)
      # some operations with doc
    end

This still does an FTP fetch of the contents, and the XML parsing happens at the client. It is not really possible to avoid fetching the data in some form or other, and if FTP is the only protocol you have available, then that means copying the data over the network using an FTP get. It is possible, but far more complicated, to add capabilities to your FTP (or other net-based) server and return the data in some other form; that could include Nokogiri parsing done on the server.
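For context, a minimal sketch of the full FTP session around that loop; the host, credentials, directory, and file glob are assumptions:

    require 'net/ftp'
    require 'nokogiri'

    Net::FTP.open('ftp.example.com') do |ftp|
      ftp.login('user', 'password')
      ftp.chdir('/exports')
      files = ftp.nlst('*.xml')

      files.each do |file|
        # nil as the local file name returns the contents as a String
        xml_string = ftp.getbinaryfile(file, nil)
        doc = Nokogiri::XML(xml_string)
        # some operations with doc
      end
    end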

Categories : Ruby

How to parse inner_html inside for loop using XPath with nokogiri
You could fix it by using name = Nokogiri::HTML(row_html).xpath("/td[1]").text instead of name = row_html.xpath("/td[1]").text, although there would be a better technique if you shared the full HTML you have. Nokogiri::HTML(row_html) gives you an instance of the class Nokogiri::HTML::Document, and #xpath, #css and #search are all instance methods of Nokogiri::HTML::Document. Assuming your inner_html produces the HTML table you provided, you can think of it as below. I did test the code, and hope it gives you the result:

    require "nokogiri"

    doc = Nokogiri::HTML(<<-eohl)
    <table>
      <tbody>
        <tr>
          <th>First Name</th>
          <th>Last Name</th>
        </tr>
      </tbody>
    </table>
    eohl
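The rest of the tested code was cut off; a minimal sketch of the kind of extraction it was presumably heading toward (the row structure is assumed from the fragment above):

    # Collect the cell texts of each table row
    doc.xpath('//table/tbody/tr').each do |row|
      puts row.xpath('./th | ./td').map(&:text).inspect
      #=> ["First Name", "Last Name"]
    end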

Categories : Ruby

use nokogiri 1.5.9 with rails
According to the changelog:

    1.5.0 beta1 / 2010/05/22
    Ruby 1.8.6 is deprecated. Nokogiri will install, but official support is ended.

So, you'll probably need to use nokogiri version 1.4.7.
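If you are stuck on such an old Ruby, one way to keep Bundler from pulling a newer release is to pin the version in your Gemfile; a minimal sketch, with 1.4.7 being the version suggested above:

    # Gemfile
    gem 'nokogiri', '1.4.7'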

Categories : Ruby On Rails

How do I merge two XML files into one using Nokogiri?
Based on your samples and the desired output, it appears you just want to replace the mat value in XML2 with the mat value from XML1:

    require 'nokogiri'

    xml1 = Nokogiri::XML('<?xml version="1.0"?>
    <formX xmlns="sdu:x">
      <identify>
        <mat>8</mat>
      </identify>
    </formX>')

    xml2 = Nokogiri::XML('<?xml version="1.0"?>
    <formX xmlns="sdu:x">
      <identify>
        <mat>9999</mat>
        <name>John Smith</name>
      </identify>
    </formX>')

    xml2.at('mat').content = xml1.at('mat').content
    puts xml2.to_xml

Which outputs:

    <?xml version="1.0"?>
    <formX xmlns="sdu:x">
      <identify>
        <mat>8</mat>
        <name>John Smith</name>
      </identify>
    </formX>

This isn't a general-purpose merge, but it produces the output you asked for.

Categories : Ruby

Problems getting data from XML using Nokogiri and Rails
Your XML element name contains a colon, but it is not in a namespace (otherwise the prefix and URI would show up in the dump of the node). Using element names with colons without using namespaces is valid, but can cause problems (like this case), so it should generally be avoided. Your best solution, if possible, would be to either rename the elements in your XML to avoid the : character, or to properly use namespaces in your documents. If you can't do that, then you'll need to be able to select such element names using XPath. A colon in the element name part of an XPath node test is always taken to indicate a namespace, which means you can't directly specify a name with a colon that isn't in a namespace. A way around this is to select all nodes and use an XPath function in a predicate.
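The answer stops just before showing that predicate; a minimal sketch of the usual workaround, matching the literal node name with the XPath name() function. Here foo:bar is a hypothetical element name, and Nokogiri's default recovering parser is assumed, since strict parsing would reject the undeclared prefix:

    require 'nokogiri'

    # An element whose name contains a colon but is not in any namespace
    doc = Nokogiri::XML('<root><foo:bar>42</foo:bar></root>')

    # //foo:bar would be read as namespace prefix + local name and fail,
    # so match on the literal element name instead:
    puts doc.at_xpath('//*[name()="foo:bar"]').text   #=> "42"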

Categories : Ruby On Rails

Parsing many files at once with Nokogiri/Rspec
As @theTinMan pointed out, Nokogiri only handles one file at a time. If you want to parse all files in a folder, you will have to read the folder (again, as pointed out by @theTinMan) and spawn a process or thread for each. Of course, you need to understand how fork works, or what a thread is, first.

Example using processes

OK, let's use a process, since Ruby doesn't have real threads:

    files = Dir.glob("files/**")

    files.each do |file|
      # Here the program becomes two:
      # one executes the block, the other continues the loop
      fork do
        puts File.open(file).read
      end
    end

    # We need to wait for all processes to get to this point
    # before continuing, because if the main program dies before
    # its children, they are killed immediately.
    Process.waitall

    puts "All done. Closing."

and the output:

    $ l

Categories : Ruby

NokoGiri Error trying to use MiniTest-Rails-capybara
You need to upgrade libxml2 and libxslt. If you are using Brew you can do the following:

    $ brew install libxml2 libxslt
    $ brew link libxml2 libxslt

See the Nokogiri documentation for more information.

Categories : Ruby

Ruby Rails and Webscraping (Nokogiri) and Nitrous.io
Add nokogiri to your Gemfile, run bundle install, and start using it. There is no need to require the gem itself; after bundle install you can use it straight away in controller actions, e.g.:

    def index
      require 'open-uri' # needed so open() can fetch URLs
      p = Nokogiri::HTML(open('http://domain.com'))
      #=> p now holds the parsed contents of the page
    end

Categories : Ruby On Rails

Unable to install nokogiri using rvm, receiving "nokogiri requires Ruby version >= 1.9.2"
Just try gem install nokogiri, as you have RVM installed. See "Installing Nokogiri" for the other things to install with it, like below:

    # nokogiri requirements
    sudo apt-get install libxslt-dev libxml2-dev
    gem install nokogiri

See my answer here for the part:

    ERROR: While executing gem ... (Errno::EACCES)
    Permission denied

Categories : Ruby

How to move a large number of files on disk to HDFS sequence files
Seems like a few lines of code with Camel, i.e. from("file:/..").to("hdfs:..") plus some init and project setup. I'm not sure you could do it with fewer lines of code using any other method. If the HDFS options in Camel are enough for configuration and flexibility, then I guess this approach is the best. It should take you just a matter of hours (or even minutes) to have some test cases up and running.

Categories : Apache

Pig: Splitting a large file into multiple smaller files
If the split is not related to the data, why even use Pig or MapReduce at all? As an alternative, you could just use the standard split program to split your data, if I didn't misunderstand. For example:

    cat part-* | split -d -l 1000 - result-

Categories : Hadoop

Nokogiri get xpath from Nokogiri::XML::Element
rc is not an element - it's an array of matching elements:

    results = rc.map do |node|
      Nokogiri::CSS.xpath_for node.css_path
    end
    p results

Or, if you know there is only one matching element:

    xpath = Nokogiri::CSS.xpath_for rc[0].css_path

Note that xpath_for returns an array, so you will need to extract the first element of the array:

    xpath.first

Categories : Ruby On Rails

Can ANTLR4 java parser handle very large files or can it stream files
You could do it like this:

    // inputFile is the path to your input file
    InputStream is = new FileInputStream(inputFile);
    ANTLRInputStream input = new ANTLRInputStream(is);
    GeneratedLexer lex = new GeneratedLexer(input);
    lex.setTokenFactory(new CommonTokenFactory(true));
    TokenStream tokens = new UnbufferedTokenStream<CommonToken>(lex);
    GeneratedParser parser = new GeneratedParser(tokens);
    parser.setBuildParseTree(false); // !!
    parser.top_level_rule();

And if the file is quite big, forget about a listener or visitor - I would create the objects directly in the grammar. Just put them all in some structure (i.e. HashMap, Vector, ...) and retrieve them as needed. This way, creating the parse tree (and this is what really takes a lot of memory) is avoided.

Categories : Java

Large zip files served appear on client as 0 byte files
Downloading 350 MB in 60 seconds (which is your time limit) works out to 5.8 MB per second. Sounds challenging to me. Just raise your time limit:

    set_time_limit(60*30); // also max_execution_time in php.ini

Categories : PHP

Parse.com: How large is the local cache?
It sounds like you are looking to use your Parse data offline. I'm not certain of the size of the Parse cache or the details of how it works, but I have some experience using Parse and ran into issues with this very topic. It's important to note that Parse's documentation describes that they "cache the result of a query on disk." This is very different from caching all of your PFObjects on disk. Essentially, the cache allows you to re-run a query that has already been run and get the same results as the first time. For example, you can't cache the results of a query that returns all objects of a certain type and then later run another query that pulls a subset of those objects from the cache; you have to run the exact same query to get the cached result. This can be quite limiting.

Categories : IOS

How to efficiently parse large bz2 xml file in C
Don't do bzip2 decompression in your program; just read uncompressed XML from stdin and parse it with libxml2 (or equivalent). Then call your program like this, and enjoy the beauty of Unix pipes:

    bzip2 -d < planet.osm.bzip2 | yourtool

Categories : C

How do I parse some of the data from a large xml file?
If the file is really large, use ElementTree or SAX. If the file is not that large (i.e. it fits into memory), minidom might be easier to work with. Each line seems to be a simple string of comma-separated numbers, so you can simply do line.split(',').

Categories : Python

How to parse large SOAP response
If this is Java, you could try using dom4j, which has a nice way of reading XML via XPath expressions. Additionally, dom4j provides an event-based model for processing XML documents. Using this event-based model allows you to prune the XML tree when parts of the document have been successfully processed, avoiding having to keep the entire document in memory. Suppose you need to process a very large XML file that is generated externally by some database process and looks something like the following (where N is a very large number):

    <ROWSET>
      <ROW id="1"> ... </ROW>
      <ROW id="2"> ... </ROW>
      ...
      <ROW id="N"> ... </ROW>
    </ROWSET>

To process each <ROW> individually, you can register an event handler for the row path and detach each row element once it has been handled.

Categories : Java

Nokogiri and Mechanize help (clicking links found by Nokogiri via Mechanize)
If your goal is simply to make it to the next page and then scrape some info off of it, then all you really care about are:

- the page content (for scraping your data)
- the URL of the next page you need to visit

Getting at the page content could be done using Mechanize OR something else, like OpenURI (which is part of Ruby's standard library). As a side note, Mechanize uses Nokogiri behind the scenes; when you start to dig into elements on the parsed page, you will see they come back as Nokogiri-related objects. Anyway, if this were my project I'd probably go the route of using OpenURI to get the page's content and then Nokogiri to search it. I like the idea of using the Ruby standard library instead of requiring an additional dependency. Here is an example using OpenURI:
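The example itself was cut off at this point. As a minimal sketch of the approach described above (the URL and the CSS selector for the next-page link are assumptions, not from the original answer):

    require 'nokogiri'
    require 'open-uri'

    # Fetch and parse the page with OpenURI + Nokogiri
    page = Nokogiri::HTML(open('http://example.com/listings'))

    # Scrape whatever data you need from the current page
    page.css('.listing-title').each { |t| puts t.text }

    # Find the link to the next page and follow it the same way
    # (assumes the href is an absolute URL)
    next_link = page.at_css('a.next')
    next_page = Nokogiri::HTML(open(next_link['href'])) if next_link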

Categories : Ruby On Rails

Parse large image from SOAP/XML in android
I have searched a lot on the internet for an alternative solution for you, but it seems that memory constraints always get in the way on a phone. A 1500 x 1000 px image really is too huge for an Android device to eat. I faced the same problem in one of my projects. The only way is to parse that string on the server side, store the image temporarily, and pass the image path in the response. This is the best approach I can see right now. I am still searching for a solution; if I find one, I will let you know. Your approach above will slow down the phone and also kill the useful app services running in the background. :(

Categories : Android

Using task queue to parse large xml data
When a line matches "</product>", the handler appends to self.masterList and increments self.count (eventually to 50). But if the next line is not "</product>", the count will still be 50 and another task will be added to the queue. Instead, use the length of self.masterList, because it is reset after being added to the queue:

    if len(self.masterList) >= 50:
        logging.debug('Starting queue of items up to %s with 50 items', len(self.masterList))
        taskqueue.add(url='/adddata', params={'data': self.masterList})
        self.masterList = []

and remove all references to self.count.

Categories : Python

How can I parse a large delimited text file in node
You should take a look at sax. It is developed by isaacs! I haven't tested this code, but I would start by writing something along these lines:

    var Promise = Promise || require('es6-promise').Promise
      , thr = require('through2')
      , createReadStream = require('fs').createReadStream
      , createUnzip = require('zlib').createUnzip
      , createParser = require('sax').createStream
      ;

    function processXml (filename) {
      return new Promise(function (resolve, reject) {
        var unzip = createUnzip()
          , xmlParser = createParser()
          ;

        xmlParser.on('opentag', function (node) {
          // do stuff with the node
        })
        xmlParser.on('attribute', function (node) {
          // do more stuff with attr
        })
        // instead of rejecting, you may handle the error instead
        xmlParser.on('error', reject)
        xmlParser.on('end', resolve)

        // pipe the compressed file through the unzipper into the parser
        createReadStream(filename).pipe(unzip).pipe(xmlParser)
      })
    }

Categories : Node Js

make DotLess parse CSS files instead of LESS files
I've never tried it, but I imagine you could change the type piece in your HTTP handlers: instead of path=".LESS", change it to path=".CSS". I'm not sure whether that would work or break something, but it would be interesting to see.

Categories : CSS

gdx parse layout files to object files
There currently isn't any way to serialize your layout information with libgdx. You will likely need to write your own wrappers and somehow store this layout information, as I don't know of anyone who has done this. If you are using an old version of the library which still has the tablelayout files, I'm not sure how easy it will be, considering the amount of changes made to libgdx since that time. The alternative is to re-write your files using the Java API, which you will likely need to do going forward. I am not sure how you are still using the tablelayout UI files; they have been deprecated for over a year now and are not included with any of the latest releases of libgdx. If you want a more detailed explanation of what happened to the layout files, you can look at this post.

Categories : Java

Perl export/parse data from large block of text
Your data are similar to XML. If you fix the format (e.g. by changing <first name> to <first_name>), you can use a proper XML parser to do the hard work. For example, this is how to get the expected output in XML::XSH2, a wrapper around XML::LibXML:

    open data.xml ;
    echo xsh:join(',', //id) ;
    for //first_name
        echo :s (.) " " following-sibling::last_name[1] ", " following-sibling::height[1] ;

Categories : Regex

Parse a large string literal in JS with regex into object array
Description

This regex will parse the text into a roman numeral and a body; the body can then be split on the newlines:

    ^\s+([CDMLXVI]{1,12})(?: | |$).*?(?:^.*?)(^.*?)(?=^\s+([MLXVI]{1,12})(?: | |$)|)

Capture Groups

    Group 0 gets the entire matching section
    Group 1 gets the roman numeral
    Group 2 gets the body of the section, not including the roman numeral

Javascript Code Example: sample text pulled from your link

    VII

    Lo! in the orient when the gracious light
    Lifts up his burning head, each under eye
    Doth homage to his new-appearing sight,

    VIII

    Music to hear, why hear'st thou music sadly?
    Sweets with sweets war not, joy delights in joy:
    Why lov'st thou that which thou receiv'st not gladly,
    Or else receiv'st with pleasure thine annoy?

    IX

    Is it for fear to wet a wid

Categories : Javascript

Alternatives to using PHP to fetch and parse large number of remote XMLs?
Keep in mind the bottleneck may very well not be PHP (it most definitely is not, in my opinion). Are you running your cURL calls serially (one at a time)? There are tools available to make multiple cURL calls at once through multiple processes (pcntl_fork and curl_multi_exec). Keep in mind how many calls you can make at one time to this service, though. If your application requires some communication or shared memory between these calls, then you may want to look into another language (Go is well equipped for these types of tasks).

Categories : PHP

node.js is there any proper way to parse JSON with large numbers? (long, bigint, int64)
Not with the built-in JSON.parse. You'll need to parse it manually and treat the values as strings (if you want to do arithmetic with them, there is bignumber.js). You can use Douglas Crockford's JSON.js library as a base for your parser. EDIT: I created a package for you :)

    var JSONbig = require('json-bigint');

    var json = '{ "value" : 9223372036854775807, "v2": 123 }';
    console.log('Input:', json);
    console.log('');

    console.log('node.js built-in JSON:');
    var r = JSON.parse(json);
    console.log('JSON.parse(input).value : ', r.value.toString());
    console.log('JSON.stringify(JSON.parse(input)):', JSON.stringify(r));

    console.log('big number JSON:');
    var r1 = JSONbig.parse(json);
    console.log('JSON.parse(input).value : ', r1.value.toString());
    console.log('JSON.stringify(JSON.parse(input)):', JSONbig.stringify(r1));

Categories : Node Js

Rails & Heroku: Running a script through rails runner using local files
To run a local script on Heroku:

    irbify.rb script.rb | heroku run rails console --app=my_app

irbify.rb is a silly tiny script I wrote to convert a script into a single eval statement. Regarding passing data: you can serialize it in some form and put it inside the script. Hope it helps someone. UPDATE: this does not work well for anything beyond trivial datasets.

Categories : Ruby On Rails

Git with large files
You really, really, really do not want large binary files checked into your Git repository. Each update you add will cumulatively grow the overall size of your repository, meaning that down the road your Git repo will take longer and longer to clone and use up more and more disk space, because Git stores the entire history of the branch locally: when someone checks out the branch, they don't just have to download the latest version of the database; they'll also have to download every previous version. If you need to provide large binary files, upload them to some server separately, and then check in a text file with a URL where the developer can download the large binary file. FTP is actually one of the better options, since it's specifically designed for transferring binary files.

Categories : GIT

Any way to speed up d3.csv on large files?
No. D3 doesn't provide any functionality to handle compressed files. You could use a third-party library such as JSZip, but then you won't be able to use d3.csv directly.

Categories : Javascript

Download large files with PHP
You could configure Apache to set the proper headers in your .htaccess file, and then link directly to the file instead of the PHP page; this will also reduce server load. Of course, if the PHP script performs functions other than just setting headers (such as authentication), then this is not an option, and you will have to pass the file through PHP in chunks, as @N.B. mentions in his comment.

Categories : PHP

RoR - Large file uploads in rails
You might consider using MiniProfiler to get a better sense of where the time is being spent. Large file uploads need to be handled in the background: any controller or database access should simply mark that the file was uploaded, and then queue a background job to move it around, along with any other operations that may need to happen. http://mattgrande.com/2009/08/11/delayedjob/ That article has the gist of it; every implementation is going to be different.
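As a minimal sketch of the pattern the answer describes, using the delayed_job gem from the linked article; the Upload model, its fields, and the process_file! method are all hypothetical:

    class UploadsController < ApplicationController
      def create
        # Record only that the file arrived; do no heavy work here
        upload = Upload.create!(file: params[:file], status: 'pending')

        # Queue the expensive work for a background worker;
        # .delay comes from the delayed_job gem
        upload.delay.process_file!

        redirect_to upload, notice: 'Upload received; processing in background.'
      end
    end

    class Upload < ActiveRecord::Base
      # Runs in the DelayedJob worker, not in the request cycle
      def process_file!
        # move the file, resize it, scan it, etc.
        self.status = 'processed'
        save!
      end
    end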

Categories : Javascript


