Making query to indexed SOLR documents - using highlighting
You can decide which fields get returned by supplying the fl parameter, e.g. &fl=id,name.
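
As a sketch, a highlighting request that also restricts the returned fields might be built like this; the core name, endpoint, and field names are illustrative assumptions, not from the original answer:

```python
from urllib.parse import urlencode

# Hypothetical core and field names -- adjust to your schema.
params = {
    "q": "name:apple",
    "fl": "id,name",   # only return these stored fields
    "hl": "true",      # enable highlighting
    "hl.fl": "name",   # highlight matches in this field
    "wt": "json",
}
url = "http://localhost:8983/solr/mycore/select?" + urlencode(params)
print(url)
```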

Categories : Solr

How to get the file name using FileListEntityProcessor and LineEntityProcessor
From the documentation of FileListEntityProcessor: The implicit fields generated by the FileListEntityProcessor are fileDir, file, fileAbsolutePath, fileSize, fileLastModified and these are available for use within the entity [...]. You can move these values into differently named fields by referencing them: <field column="file" name="filenamefield" />

Categories : Solr

How to do arithmetic query in SOLR search
How do you define length? Chars or words? You could do both of them, but I think words is a bit trickier. To do chars, just use an UpdateRequestProcessor (https://wiki.apache.org/solr/UpdateRequestProcessor) and fill in a predefined length field for the document. You could do the same for words, but you'd have to analyze the document, get the word count, then fill it in, which could prove to be expensive time-wise, especially for large documents.
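
A minimal client-side sketch of the same idea: fill in length fields before the document is sent to Solr (the field names content, char_count, and word_count are hypothetical):

```python
# Compute length fields client-side before indexing -- an alternative
# to having a custom UpdateRequestProcessor do it inside Solr.
def add_length_fields(doc):
    text = doc.get("content", "")
    doc["char_count"] = len(text)           # length in characters
    doc["word_count"] = len(text.split())   # naive whitespace word count
    return doc

doc = add_length_fields({"id": "1", "content": "hello solr world"})
print(doc["char_count"], doc["word_count"])  # 16 3
```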

Categories : Solr

Is it possible to use Lucene or Solr for image retrieval?
Based on your post, you could store the features as keyword fields in Lucene or Elasticsearch (Solr has a strict schema definition, and I don't think it would fit your needs very well, as the feature matrix is usually sparsely populated in my understanding), and have a unique ID field from the image hash. Then you can just search for feature values (feature1:value1 AND feature2:value2) and see what matches.

Categories : Solr

Solr index retrieve raw values after applied analysers to a field
You can use the Luke Request Handler to inspect a Lucene index on a lower level than the regular Solr interface. You can include a uniqueKey to retrieve information for a specific document. I'm not familiar with any SolrJ integration for the Luke handler, but the Almighty Search Engine might be able to help.

Categories : Solr

Normalizing SOLR records for sharding: _version_ issues
It should be easier to modify the generated CSV. Try adding the id to the CSV directly, with a method call before the first upload: FileUtils.copyInputStreamToFile(csvInputstream, csvFile); // <- a call to a function that reopens the csv file and adds the mandatory id to each line filesToUpload.add(csvFile);

Categories : Solr

how to get data for particular period in apache solr
Depending on the field types, the second question should be a date interval filter (as long as it's a date field, although it would probably work with string fields as well): timestamp:[2014-11-05 TO 2014-11-09] You might have to tweak those values to get the correct cut-off (but I think 11-09 should be 2014-11-09 00:00:00, meaning that only entries from earlier than that date will be included).

Categories : Solr

Solr search from beginning of string
Problem solved by using EdgeNGramFilterFactory instead of NGramFilterFactory:
<fieldType name="text_start_end" class="solr.TextField" omitNorms="false">
  <analyzer>
    <tokenizer class="solr.ClassicTokenizerFactory"/>
    <filter class="solr.PositionFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" ... />
  </analyzer>
</fieldType>
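
To see why the switch helps for matching from the beginning of a string, here is a rough illustration of what an edge n-gram filter emits; the gram sizes are arbitrary, not taken from the answer above:

```python
# Edge n-grams only grow from the start of the token, so queries match
# prefixes; plain n-grams would also match substrings anywhere inside.
def edge_ngrams(token, min_gram=2, max_gram=5):
    return [token[:n] for n in range(min_gram, min(max_gram, len(token)) + 1)]

print(edge_ngrams("solr"))  # ['so', 'sol', 'solr']
```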

Categories : Solr

How to query for minimum should match in solr with edismax
Refer to this link to understand the concept of mm in Solr: http://opensourceconnections.com/blog/2013/04/15/querying-more-fields-more-results-stop-wording-and-solrs-mm-min-should-match-argument/ For example, with q=fox brown white and mm=2, results are returned based on at least 2 matching terms, i.e. a result may contain "brown fox" or "black white fox".

Categories : Solr

Solr facet counts for specific field values
Solr allows you to create a query-based facet count by using the facet.query parameter. When creating a filter query (fq) that's based on a facet field value, I now create a corresponding facet query: facet.query={!ex=g}genre:western and add it to the rest of my parameters: q=*:* fq={!tag=g}genre:western facet=on facet.field={!ex=g}genre facet.query={!ex=g}genre:western facet.mincount=1

Categories : Solr

Do I need Solr or ElasticSearch to create a REST API for a Lucene DB?
I'm not totally following your question, but I would say, from experience, that the Solr API is really nice to work with. I've also used the Elasticsearch API and again it's great. If you can pick one and stay with it, either should perfectly meet your needs. However, if there's a chance you might change the underlying engine, then that's when your own API will pay dividends.

Categories : Solr

Prevent Solr query injections when using SolrJ
Please consider using the built-in SolrJ ClientUtils utility class, by means of which you can escape user input: String escapedUserInputText = ClientUtils.escapeQueryChars(userInputText); For more details, take a look at the query parser syntax page.
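
For illustration, a rough Python sketch of what ClientUtils.escapeQueryChars does (backslash-escaping the Lucene query syntax special characters and whitespace); this is a reimplementation for clarity, not the SolrJ code itself:

```python
# Characters treated as query syntax by Lucene's parser.
SPECIAL = set('\\+-!():^[]"{}~*?|&;/')

def escape_query_chars(s):
    out = []
    for ch in s:
        if ch in SPECIAL or ch.isspace():
            out.append('\\')  # prefix the character with a backslash
        out.append(ch)
    return ''.join(out)

print(escape_query_chars('a:b (c)'))  # a\:b\ \(c\)
```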

Categories : Solr

Solr Wildcard search on int field
See this answer. You have to add a directive in your schema.xml, at {solr_home}/example/solr/your_collection/conf/schema.xml, as shown in that answer. Copy all your fields into it to make them searchable with a wildcard query.

Categories : Solr

Convert unstructured data(text) into structured format using java
You can use NLP tools like GATE (https://gate.ac.uk/), Apache OpenNLP (https://opennlp.apache.org/), Minorthird (http://sourceforge.net/projects/minorthird/), etc. You can write a JAPE grammar in GATE which creates annotations based on the words present in the text. For example, you can annotate dimensions, measurements, proportions, etc. as dimension and then look for numbers in the next sentence.

Categories : Solr

Changes to Solr schema.xml do not update after stopping and restarting Solr
Changing the schema of a core doesn't change the documents you already have there, which is why they look the same even after you restart the Solr service. You need to re-upload the documents with the new fields specified (if they are required fields) after you make a schema change to get these new fields for existing documents.

Categories : Solr

How to index partial field content in Solr
Depending on how well-formed the input is, you can apply a copyField directive to a field defined with a PatternReplaceCharFilter as the first filter, together with a regular expression removing everything that isn't enclosed within the tags. While parsing HTML with regular expressions usually is a bad idea, it would probably work "good enough" in this case. You can also apply an UpdateProcessor chain to do the same at index time.

Categories : Solr

using a Solr function in the query itself
Try this: fq={!frange l=0 u=2.2}sum(Field1,Field2) Here l is the lower bound and u is the upper bound. See Function Range Query on the Solr cwiki.

Categories : Solr

Solr query logical operator Field AND with () return empty set
The NOT operator in Solr Query Syntax is -. So you need to use something like -field_a:a AND field_b:b

Categories : Solr

Running a weekly update on a live Solr environment
Depending on the size of the data, you can probably just keep the Solr core running while doing the update. First issue a delete, then index the data and finally commit the changes. The new index state won't be seen before the commit is issued, which allows you to serve the old data while waiting for the indexing to complete. Another option is to use the core admin to switch cores as you mentioned.

Categories : Solr

Solr and spelling exclude short words in result
In the field type definition for the field you use for suggestions, you can use LengthFilterFactory: set the min to 3, and you won't see the short words indexed at all. See: https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.LengthFilterFactory For the second part of the question, you could add an EdgeNGramFilterFactory with minGramSize="2" to your field definition.
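
As a sketch of what a length filter with min=3 does to the token stream (the max value here is an arbitrary placeholder):

```python
# Tokens outside the [min_len, max_len] range are dropped at index
# time, so short words never reach the suggestion index.
def length_filter(tokens, min_len=3, max_len=255):
    return [t for t in tokens if min_len <= len(t) <= max_len]

print(length_filter(["a", "an", "the", "solr", "spellcheck"]))
# ['the', 'solr', 'spellcheck']
```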

Categories : Solr

Is there any way to get a thumbsuck idea of how much space your database will need for a search engine?
You've already done an estimate, but your estimate is probably way off. Almost no modern web page is only 1KB in size (MSN.com is 319KB, 58.8KB gzipped), and 1B web pages is, depending on who you're asking, a measurable amount of the relevant pages on the internet today. And keep in mind, you probably don't just want to store the actual page content, but also index it, which adds further overhead.
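
A back-of-the-envelope version of such an estimate; the average page size and the index overhead factor below are illustrative assumptions, not measurements:

```python
pages = 1_000_000_000      # 1B pages
avg_page_kb = 60           # assumed average stored (gzipped) page size
index_overhead = 0.5       # assume the index adds ~50% on top of raw content

raw_tb = pages * avg_page_kb / 1024**3   # KB -> TB
total_tb = raw_tb * (1 + index_overhead)
print(f"raw: {raw_tb:.1f} TB, with index: {total_tb:.1f} TB")
```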

Categories : Solr

Solr multiple words search
Try using the syntax for queries such as KeywordDescription:value1 AND KeywordDescription:value2, so: KeywordDescription:luxury AND KeywordDescription:2008 Some links that might help you: Query syntax, Query syntax 2. Hope it helps.

Categories : Solr

Why does Solr cloud separate write and read operations?
All writes go to the leader then the leader forwards the update request to the replicas in the same transaction. Writes go to the leader so documents can be merged with the latest version and it also makes orchestration a bit simpler. As for reads, queries are load balanced across all nodes (leaders + replicas).

Categories : Solr

Solr BlockJoin Indexing for Solr 4.10.1
To be able to use child="true" in DIH, apply the patch from https://issues.apache.org/jira/browse/SOLR-5147 (I think it's the same DIH patch as in SOLR-3076). The patch itself seems to be incompatible in negligible details with the current trunk.

Categories : Solr

Solr Text field and String field - different search behaviour
TextFields usually have a tokenizer and text analysis attached, meaning that the indexed content is broken into separate tokens, so there is no need for an exact match: each word/token can be matched separately to decide if the whole document should be included in the response. StrFields cannot have any tokenization or analysis/filters applied, and will only give results for exact matches.
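
A deliberately simplified illustration of the difference (this is not Solr's actual analysis chain, just the matching idea):

```python
# A "text" field matches on individual analyzed tokens; a "string"
# field only matches the exact stored value.
def text_match(query, value):
    return query.lower() in value.lower().split()

def string_match(query, value):
    return query == value

stored = "Quick Brown Fox"
print(text_match("brown", stored), string_match("brown", stored))
# True False
```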

Categories : Solr

Is Lucene's inverted index stored in memory?
You can use RamDirectoryFactory but this is considered inefficient, especially for larger indexes - mainly because of GC overhead. You're better off tuning various Solr caches which are designed to achieve the same result.

Categories : Solr

How does Solr do ranking of indexed documents
Solr is built on top of Lucene, so it uses the same mechanism for scoring documents. This link should be helpful: http://lucene.apache.org/core/3_5_0/scoring.html You can control the ranking of documents at index time by specifying a boost parameter; see SolrInputDocument.setDocumentBoost() and SolrInputField.setBoost(). The higher the value you specify, the higher the document will rank in the search results.

Categories : Solr

solr fq without specifying field
It is most likely set in your solrconfig.xml as the default field. To be sure which field the fq is searching by default in your environment, run the query with the debug=true option; in the debug response you will see an entry "parsed_filter_queries", which shows the actual field on which it is filtering.

Categories : Solr

Max length of a Solr uniqueKey
Cassandra does have a limit of 64KB for keys. Generally in Solr, "text" should not be used for the key since it is tokenized; use a "string" field instead. As the Cassandra FAQ wiki notes, a hash is a better choice when using long text values as keys: http://wiki.apache.org/cassandra/FAQ#max_key_size Ultimately, it comes down to how you wish to query the Solr documents.
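
A sketch of the hashing approach: derive a fixed-size uniqueKey from the long text instead of using the text itself as the key (SHA-1 is an arbitrary choice here):

```python
import hashlib

def make_key(long_text):
    # Fixed-size 40-character hex digest, regardless of input length.
    return hashlib.sha1(long_text.encode("utf-8")).hexdigest()

key = make_key("some very long body of text " * 1000)
print(len(key))  # 40
```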

Categories : Solr

CKAN/Jetty/Solr: ERROR 500: org/apache/tomcat/util/descriptor/LocalResolver
CKAN doesn't support Ubuntu 14.04 yet; there are a number of open issues: https://github.com/ckan/ckan/labels/14.04 Install on Ubuntu 12.04 instead. Having said that, there is a pull request with working source install instructions for 14.04: https://github.com/ckan/ckan/pull/2020

Categories : Solr

Solr Tokenizer that splits on case change
The WordDelimiterFilterFactory is what you want to use. It allows you to split on case change (as well as things like intra-word delimiters and numbers, depending on the arguments you use). See the docs here: https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.WordDelimiterFilterFactory In your case, you should use splitOnCaseChange="1" to get what you want.
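
As a rough illustration of what splitOnCaseChange="1" does to a token (the real WordDelimiterFilter handles many more cases, such as digits and intra-word delimiters):

```python
import re

def split_on_case_change(token):
    # Lower-to-upper boundaries start a new part; runs of capitals
    # and runs of digits are kept together.
    return re.findall(r'[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+', token)

print(split_on_case_change("fooBarBaz"))  # ['foo', 'Bar', 'Baz']
```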

Categories : Solr

Testing SOLR to Elasticsearch data transfer
First, make sure that you are using the same semantics: for example, the same filters, tokenizers, and stemmers. Also, Apache Solr 4.3.0 is built on Apache Lucene 4.3.0, while Elasticsearch 1.3.2 is built on Apache Lucene 4.9.0. This might not be the issue, I don't know to be honest, but if I were you, I would check the release notes of Apache Lucene > 4.3.0 and see what has changed.

Categories : Solr

Solr (sunspot) not finding partial word match when suffix included
Can you post the logs/actual Solr queries that are generated by the following two calls? Ingredient.search_by_partial_name('baki') # => [] Ingredient.search_by_partial_name('bak') # => [<Ingredient "baking powder">, ...] It'd help to see that information, to see exactly what's being fed to Solr and therefore what Solr is trying to do.

Categories : Solr

Solr find all ids that start with certain path
Sounds like you are using a generic text tokenizer definition. You may want to look at the PathHierarchyTokenizer instead; it's designed to split at the path prefixes, and then you will not need the * at the end.
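
A sketch of the tokens a path hierarchy tokenizer emits, which is why a plain term query can match any ancestor prefix without a trailing wildcard:

```python
def path_tokens(path, sep="/"):
    # One token per ancestor prefix of the path.
    parts = [p for p in path.split(sep) if p]
    return [sep + sep.join(parts[:i + 1]) for i in range(len(parts))]

print(path_tokens("/a/b/c"))  # ['/a', '/a/b', '/a/b/c']
```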

Categories : Solr

Find a document with the date-time before current date-time in Solr
[* TO NOW] should actually work, but will wreak havoc with your caches. Try using [* TO NOW/DAY+1DAY] instead (which should give you midnight of the current day instead, and it won't change each second, so the performance will be far better).

Categories : Solr

How to get the top 1 document of each type from a search on an index having multiple types?
Try this, using a top_hits aggregation:
GET /machines/_search?search_type=count
{
  "query": { "match_all": {} },
  "aggs": {
    "top-types": {
      "terms": { "field": "_type" },
      "aggs": {
        "top_docs": {
          "top_hits": {
            "sort": [ { "_score": { "order": "desc" } } ],
            "size": 1
          }
        }
      }
    }
  }
}
Replace the match_all with your own query.

Categories : Solr

How to remove menus from HTML during crawl or indexing with Nutch and Solr
Add this in nutch-site.xml:
<!-- tika properties to use BoilerPipe, according to Marcus Jelsma -->
<property>
  <name>tika.use_boilerpipe</name>
  <value>true</value>
</property>
<property>
  <name>tika.boilerpipe.extractor</name>
  <value>ArticleExtractor</value>
</property>
This won't exactly remove the header and menu markup by itself, but BoilerPipe's ArticleExtractor keeps only the main article content when parsing.

Categories : Solr

Solr Multivalued fields to string
To get information out of Solr in the format 4A,R3,UK you can simply change the output format in Solr for your individual query: Add wt=csv (wt is "writer type/response format") to your query. Documentation: http://wiki.apache.org/solr/CSVResponseWriter

Categories : Solr

Differences between Verity and Solr
There is a difference between how the Verity and Solr search engines work. Verity is a classic search engine, whereas Solr is modern, more robust, and faster. Raymond Camden has explained it well in his blog. For the difference in results, you have to choose a proper search syntax in Solr that returns the desired result; Solr supports multiple query syntaxes for finding matching results.

Categories : Solr




© Copyright 2018 w3hello.com Publishing Limited. All rights reserved.