Commit Graph

  • 5df553c152 - added a json writer for solr (yes there was one using xslt but this one writes the same way as yacysearch.json) - using the new json solr result to change the ajax search in IndexControlURLs to the new solr search Michael Peter Christen 2012-09-10 14:30:44 +02:00
  • 4634f0e626 fix for images_withalt Michael Peter Christen 2012-09-10 12:30:03 +02:00
  • e65cecc419 - updated lucene libraries to 3.6.1 - added lucene-grouping which enables faceted search; try this: http://localhost:8090/solr/select?q=*:*&start=0&rows=3&facet=true&facet.field=host_s Michael Peter Christen 2012-09-10 10:12:38 +02:00
  • 1754fbb6d9 Merge remote-tracking branch 'reger/master' Michael Peter Christen 2012-09-10 08:10:53 +02:00
  • 4d29f59a27 removed warnings Michael Peter Christen 2012-09-10 07:15:52 +02:00
  • 8c099d2106 Merge remote-tracking branch 'origin/master' Michael Peter Christen 2012-09-10 07:05:20 +02:00
  • 59bd478ed1 Added more sophisticated RDF output for YMarks, including the folder structure (b:Topic) and support for multiple tags (dc:subject) and folders (b:hasTopic) via rdf:Bag container. apfelmaennchen 2012-09-09 22:56:24 +02:00
  • d31a632951 - added dmoz RDF dump importer - added indexing to Tables columns to support larger bookmark collections - added RDF output (HTTP) for public bookmarks at /YMarks.rdf - YMarkRDF also provides a Jena RDF Model as "internal" API - various other changes/fixes for YMarks (mainly backend) apfelmaennchen 2012-09-09 09:53:58 +02:00
  • 40d8086bf7 keep input order of translation entries within one file section. This allows, on translation conflicts (translation of words contained in another sentence), putting the shorter key at the end of the translation list. reger 2012-09-09 06:15:25 +02:00
  • 10b911eed4 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git Michael Peter Christen 2012-09-07 22:07:02 +02:00
  • be67c70a47 added Solr fields: inboundlinks_text_chars_val inboundlinks_text_words_val inboundlinks_alttag_txt outboundlinks_text_chars_val outboundlinks_text_words_val outboundlinks_alttag_txt Michael Peter Christen 2012-09-07 22:06:51 +02:00
  • d73fff0e0e added solr field images_withalt_i orbiter 2012-09-07 21:33:45 +02:00
  • 66ac4076c2 added disjunction '|' option to site parameter in GSA API orbiter 2012-09-06 22:35:55 +02:00
  • a975bcffcb clear fulltext-cache and stop crawling if running out of memory sixcooler 2012-09-06 22:10:03 +02:00
  • e78fe3f477 also do a clearcache on the solr-connector-caches sixcooler 2012-09-06 22:07:07 +02:00
  • 9ee2e09983 statistics for solr-cache sixcooler 2012-09-06 22:02:29 +02:00
  • d8425e6809 added collections to crawl monitor Michael Peter Christen 2012-09-04 14:47:53 +02:00
  • ee23fc7a32 added h1..h6 counter fields Michael Peter Christen 2012-09-04 14:11:11 +02:00
  • 4b36a2c3b4 small style changes Michael Peter Christen 2012-09-04 11:23:41 +02:00
  • 8ca842b137 added new button design to more buttons Michael Peter Christen 2012-09-03 16:04:57 +02:00
  • 04709e91d7 add nice submit buttons to pdblue skin Michael Peter Christen 2012-09-03 15:27:47 +02:00
  • ef6de52ab5 dependency is java6 only Michael Peter Christen 2012-09-03 15:26:47 +02:00
  • b2b516cc3e added a collection attribute to crawls and searches: - a solr field collection_sxt can be used to store a set of crawl tags - when this field is activated, a crawl tag can be assigned when crawls are started - the content of the collection field can be comma-separated, all of them are assigned to the documents when they are indexed as result of such a crawl start - a search result can be drilled down to a specific collection; this is currently only available in the solr interface and also in the gsa interface using the 'site' option - this adds a mandatory field for gsa queries (the google api demands that field all the time) Michael Peter Christen 2012-09-03 15:26:08 +02:00
  • 174530a9e0 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git Michael Peter Christen 2012-09-03 00:46:17 +02:00
  • 43f3a932fd removed jquery.slider as it is already included as part of jquery-ui package apfelmaennchen 2012-09-01 14:17:20 +02:00
  • a01eb1b7fe removed unused jquery plugin slider as it is part of jquery-ui package apfelmaennchen 2012-09-01 10:25:22 +02:00
  • 4815713ec7 added synchronization to solr server requests since lucene is not thread-safe. We experienced problems as described in http://stackoverflow.com/questions/5327978/lockobtainfailedexception-updating-lucene-search-index-using-solr Michael Peter Christen 2012-08-31 15:16:33 +02:00
  • f75b3f8a47 added more patches to work without RWI data structure Michael Peter Christen 2012-08-31 14:35:56 +02:00
  • a427a68bac removed many warnings Michael Peter Christen 2012-08-31 14:07:33 +02:00
  • c72c435517 - moved the gsa search interface from /gsa/searchresult? to /gsa/search? - fixed the NB field data Michael Peter Christen 2012-08-31 14:00:53 +02:00
  • 31d4d38804 - extended the solr interface by a references-by-word-count method - reduced danger that a non-existing RWI database causes NPEs - added Solr queries to did-you-mean: this makes it possible that our did-you-mean algorithm works together with only Solr and without RWIs Michael Peter Christen 2012-08-31 13:03:00 +02:00
  • 528d6763fa - added new solr fields: title_count_i, title_chars_val, title_words_val description_count_i, description_chars_val, description_words_val - added many asserts to ensure data type correctness from YaCy to Solr and vice versa - made many fixes according to new findings from these asserts (!) Michael Peter Christen 2012-08-31 10:30:43 +02:00
  • 3142e675e8 fixed problems with GSA api: - better FS attribute - highlighting of searched words in title Michael Peter Christen 2012-08-29 16:48:53 +02:00
  • 3b19fe7b52 - fixed num parameter in GSA api - changed FS attribute in GSA api Michael Peter Christen 2012-08-29 16:28:32 +02:00
  • 2ddc33646a added new field for solr: url_paths_sxt url_parameter_i url_parameter_key_sxt url_parameter_value_sxt url_chars_i Michael Peter Christen 2012-08-29 16:11:23 +02:00
  • 75d5e3475d Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git Michael Peter Christen 2012-08-29 10:13:51 +02:00
  • a2841261bd content control: apply filter if enabled to crawls cominch 2012-08-29 09:52:14 +02:00
  • dc468dad01 add content control features for custom filter lists cominch 2012-08-29 09:04:28 +02:00
  • 316b5fe116 - added a solr type definition verifier - fixed type definition found by the verifier - added multivalue-string fields for solr with extension 'sxt' - added multivalue-integer fields for solr with extension 'val' - renamed some solr attributes from txt to sxt - changed solr query line to an explicit AND/OR structure - added a country code second level domain list to Domains class; with parser - added a host string parser to get domain class name, country-code second-level domain and subdomain out of it - removed old coordinate attributes Michael Peter Christen 2012-08-28 16:58:06 +02:00
  • a3d5959981 Merge commit '65d49df865f60511d22d86fb15c33a082176e7ab' orbiter 2012-08-27 16:56:22 +02:00
  • 4521d63c92 added boosts to solr search queries Michael Peter Christen 2012-08-27 15:25:25 +02:00
  • 4c79ddb91e switched off some solr logging Michael Peter Christen 2012-08-27 14:41:47 +02:00
  • e8acd542b5 - added faceted drill-down for host and geolocation to solr queries - added a new geolocation field to index schema, the old values are migrated if possible Michael Peter Christen 2012-08-27 14:41:33 +02:00
  • f00168ecc5 added gsa result attribute 'has' Michael Peter Christen 2012-08-27 12:15:42 +02:00
  • 65d49df865 security fix: clear automatic password only if adminAccountForLocalhost=false to prevent remote access to protected pages after restart. if adminAccountForLocalhost=true leave automatic password unchanged so access from local host is granted but remote access is prevented from the 1st second. reger 2012-08-26 22:28:14 +02:00
  • 52a62af184 Merge branch 'master' of git://gitorious.org/yacy/rc1.git reger 2012-08-26 22:12:59 +02:00
  • 2094df2e4e - correct length computation for BStringObject (bugfix suggested by apfelmaennchen) - using ASCII for string conversion for Strings generated from Integer orbiter 2012-08-26 17:46:40 +02:00
  • 2d2be546fe fix path to env/grafics to display api icon on meta data page reger 2012-08-26 04:36:52 +02:00
  • 6d03433cda - added hack to prevent stream servlet paths from being parsed wrongly if the path contains a dot. - added also warnings if documents are requested which do not exist. orbiter 2012-08-25 19:08:42 +02:00
  • a1227879a9 release 1.1 Release_1.1 orbiter 2012-08-24 23:59:10 +02:00
  • 7ac259477f added a direct access to solr search api to enhance the visibility of the embedded solr orbiter 2012-08-24 23:04:19 +02:00
  • 67f2866cd0 small fixes orbiter 2012-08-24 21:44:22 +02:00
  • ce156a01ba Merge commit 'c2341a175fdd755a34965ff63c7ea437b380352d' orbiter 2012-08-24 18:24:24 +02:00
  • c2341a175f Fixed a bug that prevented Yacy from indexing files with non ASCII filenames in FTP servers. David Rubio 2012-08-24 17:45:14 +02:00
  • 3ebc4264c5 fixed concurrent query orbiter 2012-08-24 14:15:40 +02:00
  • 29171e2f6c fixed generation of ontologies from index enumerations orbiter 2012-08-24 14:13:42 +02:00
  • 7cd302de3e omit xml parsing when using the embedded solr server orbiter 2012-08-24 12:18:30 +02:00
  • 787e1c6836 added the QueryResponse query(SolrParams params) method to the SolrServerConnector which is necessary to use facets in solr search. orbiter 2012-08-23 11:53:54 +02:00
  • 01a63ef595 redesign of YaCySchema and SolrDoc handling orbiter 2012-08-23 09:51:45 +02:00
  • 479bfca571 refactoring orbiter 2012-08-23 09:30:11 +02:00
  • 48a82bc705 log queries anonymous from gsa+solr requests Michael Peter Christen 2012-08-22 23:50:40 +02:00
  • ab6ec4ec52 added snippet computation to solr/rss and gsa result writer Michael Peter Christen 2012-08-22 17:37:34 +02:00
  • 4716546ef5 - reduced memory usage in index transmission using a transformation of Node to Row objects - removed peerDeparture in solr remote search in case that peer does not answer (this may be normal because it is allowed to switch this off) Michael Peter Christen 2012-08-22 16:30:33 +02:00
  • af764c106c re-activated audio and video search because they obviously work (!) Michael Peter Christen 2012-08-22 01:56:13 +02:00
  • 06b0081fdc fix for NPE during host navigation computation Michael Peter Christen 2012-08-22 01:55:39 +02:00
  • feb99bc291 fixed GSA format Michael Peter Christen 2012-08-22 00:48:37 +02:00
  • 653645c1cf corrected solr query syntax Michael Peter Christen 2012-08-22 00:48:03 +02:00
  • 08ae142a3d - enhanced caching after search queries to solr - reduced caching after short memory Michael Peter Christen 2012-08-22 00:31:14 +02:00
  • 716ea0cfe2 sorted the solr schema into mandatory and optional fields; reduced number of used field to reduce solr index size orbiter 2012-08-21 23:52:56 +02:00
  • 9b8c8c0f47 fix from gaston in http://forum.yacy-websuche.de/viewtopic.php?p=26909#p26909 orbiter 2012-08-21 21:03:26 +02:00
  • acb9f04e80 removed unused classes orbiter 2012-08-21 18:18:30 +02:00
  • 0ad52ac4c3 gsa bugfix for date parser Michael Peter Christen 2012-08-21 02:39:28 +02:00
  • 3ce4c2f937 fixes for gsa result format Michael Peter Christen 2012-08-21 01:57:46 +02:00
  • 2d5fdfeb65 added authorization-based maximum results limitation to solr and gsa search Michael Peter Christen 2012-08-20 17:10:48 +02:00
  • 67d235fae9 added gzip encoding to solr2solr http interface, client side (server already works) Michael Peter Christen 2012-08-20 16:53:21 +02:00
  • a049761e0c fixed double-check Michael Peter Christen 2012-08-20 14:16:37 +02:00
  • 6fc5400f91 added a tooltip for search navigation to mention that search pages can be navigated using the TAB key Michael Peter Christen 2012-08-20 13:02:29 +02:00
  • f42a57cd7d gsa format update Michael Peter Christen 2012-08-20 12:50:51 +02:00
  • b3aad6cc35 bugfix for remote search when search is done to solr Michael Peter Christen 2012-08-20 12:21:36 +02:00
  • ff3eaa21b0 added remote search to solr on YaCy peers! - when doing a remote search, node peers are selected for solr queries - the solr query is done concurrently to the standard YaCy rwi search - the solr search result is fed into the same data structure that prepares the rwi search result - the same remote search that is done to several outside peers is done to the local solr index - the search process works now also without any 'old' RWI data using solr Michael Peter Christen 2012-08-20 12:16:11 +02:00
  • a06123aec6 more abstraction and less parameter overhead for remote search Michael Peter Christen 2012-08-20 01:29:15 +02:00
  • f00733186b code simplifications Michael Peter Christen 2012-08-19 13:17:03 +02:00
  • 755f5e76cf removed strange assert statements and simplified code in metadata transformation Michael Peter Christen 2012-08-19 08:44:39 +02:00
  • db0d438709 fix for http://bugs.yacy.net/view.php?id=206 Michael Peter Christen 2012-08-19 08:43:56 +02:00
  • 404b0aab09 refactoring in remote search and stub for remote node peer selection orbiter 2012-08-18 23:59:25 +02:00
  • d7ea45f698 - get nice text_t values from metadata conversions that are stored into solr as fulltext search index. - added slow migration from old metadata to solr index entries: each entry from the old metadata is removed from that data structure and written into solr. orbiter 2012-08-18 19:36:21 +02:00
  • 99ef57f103 reduced sleep times orbiter 2012-08-18 17:48:20 +02:00
  • 780f8974e7 added remaining iteration methods for solr in fulltext class orbiter 2012-08-18 15:39:14 +02:00
  • acd2dc3575 hack to remove StringBuilder overhead in query construction orbiter 2012-08-18 14:22:00 +02:00
  • db6863db77 reduced solr cache sizes to check if that solves memory problems a bit orbiter 2012-08-18 13:45:37 +02:00
  • 6f01542aaa explicit double-check in transferURL orbiter 2012-08-18 13:18:51 +02:00
  • ee01c12e56 fixes for putDocument and putMetadata orbiter 2012-08-18 13:05:27 +02:00
  • cc47a0876e reverted bf55f69176 to have a fall-back option in case that memory problems as reported in http://forum.yacy-websuche.de/viewtopic.php?p=26901#p26901 for full-solr installation are too strong and we have to work with a 'small memory footprint' peer system. orbiter 2012-08-18 10:28:40 +02:00
  • 0904afe8fb added concurrent iterator methods to the solr connectors Michael Peter Christen 2012-08-17 18:22:56 +02:00
  • d54b80327a refactoring Michael Peter Christen 2012-08-17 17:28:27 +02:00
  • f9fc5cfaba better check for bad urls in url transmission Michael Peter Christen 2012-08-17 17:17:00 +02:00
  • d39463a85c added deleteByQuery to solr connectors Michael Peter Christen 2012-08-17 17:05:46 +02:00
  • 0cab06c47c refactoring Michael Peter Christen 2012-08-17 15:52:33 +02:00
  • bf55f69176 removed write methods to old metadata file type; all metadata now goes to solr Michael Peter Christen 2012-08-17 15:46:26 +02:00
  • 40c0856489 refactoring Michael Peter Christen 2012-08-17 15:33:02 +02:00