Commit Graph

  • 39f8eb60c3 tried to prevent calls to bad-hack getSize() method and reduced overhead of that method a bit. orbiter 2012-08-10 18:10:25 +02:00
  • 9b88433f45 patch from hint in http://forum.yacy-websuche.de/viewtopic.php?p=26858#p26858 from gaston orbiter 2012-08-10 15:44:37 +02:00
  • e816b88b55 changed behaviour of metadata storage: if any solr is attached, the metadata is written to solr instead of the metadata-db, even if the metadata-db is enabled. This prevents metadata from being written to two storage systems at the same time. It is also the next step in migrating the current metadata-db to solr. orbiter 2012-08-10 15:39:10 +02:00
  • 2571e0d47a removed unused classes orbiter 2012-08-10 14:47:44 +02:00
  • f9c0e6e950 - Implemented and integrated the URIMetadataNode object which is a metadata representation from the solr index. This shall replace metadata from the built-in database in the future. - added the Solr-driven metadata into the search index of YaCy which makes it now possible to run YaCy without the old metadata index. This is a major step forward to a full migration to Solr. Michael Peter Christen 2012-08-10 13:26:51 +02:00
  • b2b480fff2 more abstraction of the YaCySchema -> Opensearch matching process Michael Peter Christen 2012-08-10 09:48:15 +02:00
  • aa0ef98ffa Merge branch 'master' of git://gitorious.org/~chalker/yacy/chalkers-yacy-rc1 Michael Peter Christen 2012-08-10 09:47:15 +02:00
  • 73f6d69d03 more abstraction for solr query params parsing Michael Peter Christen 2012-08-10 07:58:45 +02:00
  • 24462e9baa set the title every time, it is possible that it has changed Michael Peter Christen 2012-08-10 07:51:57 +02:00
  • dcc72799c4 better abstraction for result writers using controlled vocabularies and URIRefs Michael Peter Christen 2012-08-10 07:45:43 +02:00
  • 136fcb1ad9 refactoring Michael Peter Christen 2012-08-10 06:47:13 +02:00
  • a12f693ec9 added two response writers for the embedded solr interface: an rss/opensearch writer and an enhanced solr xml writer. The enhanced solr writer has less configuration overhead than the original writer and should be slightly faster. The rss/opensearch writer is at this time slightly incomplete compared with the already existing rss search result from YaCy, and snippets are also missing at this time. To test the new interface, open for example: http://localhost:8090/solr/select?wt=rss&q=olympia The wt codes for the new result writers are: wt=rss for opensearch, wt=exml for the enhanced solr xml writer. Additionally, the SRU search parameters have been added to the solr interface, which can now also be used for a normal solr/xml search. Michael Peter Christen 2012-08-09 18:06:48 +02:00
  • 792ecf2444 Fix an error in Russian translation: "can not" => "can". Сковорода Никита Андреевич 2012-08-08 11:35:45 +04:00
  • bca4a16603 replaced the multivalue generic string field name suffix _ss by _txt because _ss is not part of the standard solr example schema. Michael Peter Christen 2012-08-06 17:58:09 +02:00
  • 67edfd991c Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git orbiter 2012-08-05 15:49:48 +02:00
  • d9173ba7ed added more solr fields to integrate values from URIMetadataRow. All writings to the Metadata-DB are now also done to solr. This includes metadata transfer during search and rwi transfer. orbiter 2012-08-05 15:49:27 +02:00
  • 70b10e8316 added the JSON response writer to solr interface, add &wt=json to the servlet GET properties to use this format Michael Peter Christen 2012-08-01 00:14:56 +02:00
  • 3276508d1b Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git Michael Peter Christen 2012-07-31 23:49:56 +02:00
  • 3ce04cecf3 bad hack to prevent a bug appearing in solr Michael Peter Christen 2012-07-31 23:49:07 +02:00
  • f32aa9a49c prevent merge of blobs that can't be handled in memory sixcooler 2012-07-31 23:23:16 +02:00
  • bbd242afb4 fix for a NPE Michael Peter Christen 2012-07-30 14:51:01 +02:00
  • 8d944f6517 nowrap from gaston in forum http://forum.yacy-websuche.de/viewtopic.php?p=26815#p26815 Michael Peter Christen 2012-07-30 12:39:47 +02:00
  • 24d9db1613 snippet retrieval loading processes may use a smaller minimum load time value than crawling processes. This speeds up the search result preparation dramatically. Michael Peter Christen 2012-07-30 10:38:23 +02:00
  • ef488a15f7 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git Michael Peter Christen 2012-07-27 12:14:24 +02:00
  • 1687737771 Abstraction of HandleMap and HandleSet Michael Peter Christen 2012-07-27 12:13:53 +02:00
  • 76b037a20a check content domain fix: search image/media should not show pages containing image/media search text should show all/text but image/media sixcooler 2012-07-27 04:11:52 +02:00
  • 9cd409682f close augmented stream if filled from cache to get its content use augmented stream if proxyAugmentation is set only sixcooler 2012-07-26 18:09:40 +02:00
  • e432bb9cd9 better calculation of possible saving in HeapReader index data structure Michael Peter Christen 2012-07-26 10:05:06 +02:00
  • 9549984c65 documentation/comments Michael Peter Christen 2012-07-25 21:34:23 +02:00
  • beb6425f0c Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git Michael Peter Christen 2012-07-25 21:18:30 +02:00
  • 83c93e9209 no translation of queue-links sixcooler 2012-07-25 15:35:13 +02:00
  • 3bcd9d622b cleaned up classes and methods which are either superfluous at this time or will be superfluous or subject of complete redesign after the migration to solr. Removing these things now will make the transition to solr more simple. Michael Peter Christen 2012-07-25 14:31:54 +02:00
  • 6f1ddb2519 Moved solr index-add method to the same method where the YaCy index is written. Also done some code-cleanup. Michael Peter Christen 2012-07-25 01:53:47 +02:00
  • 315d83cfa0 cleanup Michael Peter Christen 2012-07-24 22:16:56 +02:00
  • 1f41d9c6f5 bugfix for a NPE Michael Peter Christen 2012-07-24 17:29:32 +02:00
  • 76202f068e extended abstraction of local and remote solr index using one front-end for index administration and querying. Michael Peter Christen 2012-07-24 17:23:29 +02:00
  • d3f243e2e1 fixed node type calculation for principal peers Michael Peter Christen 2012-07-23 23:40:50 +02:00
  • 7ec7341f60 added user-authentication protection to solr search (same as implemented for yacysearch) Michael Peter Christen 2012-07-23 21:43:14 +02:00
  • e2a97ef8f6 better explain how to access the embedded solr Michael Peter Christen 2012-07-23 21:31:12 +02:00
  • 826967513b changed options in IndexFederated_p to switch on/off parts of the index individually. The settings are experimental and the values of the settings will be overwritten when an index migration from urldb to solr starts. Michael Peter Christen 2012-07-23 16:28:39 +02:00
  • cba4ab862e fix for http://bugs.yacy.net/view.php?id=202 Michael Peter Christen 2012-07-23 00:36:18 +02:00
  • b76836db7b Merge branch 'master' of git://gitorious.org/~reger/yacy/bbyacy-rc1 Michael Peter Christen 2012-07-23 00:35:14 +02:00
  • 36c9875b6e removed localized number formatting from num-results_totalcount response (this is only used in xml and json where localized format is not valid) reger 2012-07-23 00:00:40 +02:00
  • 0640a6f7e6 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git Michael Peter Christen 2012-07-22 21:50:44 +02:00
  • 69e743d9e3 - more abstraction for the RWI index as preparation for solr integration - added options in search index to switch parts of the index on or off orbiter 2012-07-22 13:18:45 +02:00
  • 6cc5d1094e Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git orbiter 2012-07-21 13:34:57 +02:00
  • 05a3ffd03a patches to ensure that solr connectors are active only if they have a solr object assigned and vice versa orbiter 2012-07-20 11:47:50 +02:00
  • 5a3c829872 embedded solr is only initiated if it is activated with IndexFederated_p.html orbiter 2012-07-20 11:40:33 +02:00
  • 161005ceaa Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git Michael Peter Christen 2012-07-20 09:04:14 +02:00
  • bf4968d748 source change in classpath Michael Peter Christen 2012-07-20 09:04:02 +02:00
  • 3a350a2f83 partial html fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=4454 Lotus 2012-07-20 08:53:12 +02:00
  • 49ee31f837 added classpath for htroot/solr orbiter 2012-07-20 00:59:58 +02:00
  • 97b7bcf2a6 added a solr search index - by default, a (empty) solr storage instance is created at SEGMENTS/solr_36 - the index is written if in /IndexFederated_p.html the flag "embedded solr search index" is switched on - a standard solr query interface is available now with a new servlet at http://127.0.0.1:8090/solr/select Michael Peter Christen 2012-07-19 11:34:05 +02:00
  • f0a079ac9f allow larger log entries Michael Peter Christen 2012-07-14 16:28:14 +02:00
  • 9b48c9fe2e removed a crawler overhead (terminated loop which searches greatest stack that has zero-waiting urls). This should cause a slightly faster crawl for crawl stacks with many different domains in the crawl queue. Michael Peter Christen 2012-07-14 13:11:04 +02:00
  • 784a4abb18 enhancement in internal data organization which should generate less synchronizations in database access Michael Peter Christen 2012-07-14 13:09:44 +02:00
  • f78ce93a80 collection of speed and memory saving hacks Michael Peter Christen 2012-07-13 21:15:38 +02:00
  • c00a3cf74d less usage of generic logger to avoid logger generation overhead orbiter 2012-07-12 19:54:54 +02:00
  • a196f24f60 prevent enqueueing of non-loggable logging entries orbiter 2012-07-12 19:42:42 +02:00
  • 482afed07c reduced logging overhead (a bit) orbiter 2012-07-12 19:23:40 +02:00
  • e76159040b Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git orbiter 2012-07-12 11:14:04 +02:00
  • bbfa497a3c replaced more size() > 0 by !isEmpty() orbiter 2012-07-12 11:12:21 +02:00
  • 58e7d1952f reduction of logging to prevent too much IO caused by logging Michael Peter Christen 2012-07-12 02:08:11 +02:00
  • 83da68c4c1 fixed a memory leak inside the logger which appeared if the log was written faster than the logger is able to print it to its output stream. A very large collection of unwritten log outputs had been seen during heavy crawling. The new ArrayBlockingQueue is limited to prevent this case. Michael Peter Christen 2012-07-12 01:23:04 +02:00
  • e3aa05b9dd added creation of subpath pattern when crawl start is 'from file' Michael Peter Christen 2012-07-11 23:18:57 +02:00
  • 0cbda0b2b8 - replaced all length() == 0 and size() == 0 with isEmpty() - replaced some length() > 0 and size() > 0 with !isEmpty() - cannot be done automatically - implemented some isEmpty() methods orbiter 2012-07-10 22:59:03 +02:00
  • 28b30231c3 fix for url matcher of multiple amp& in an url, see: http://forum.yacy-websuche.de/viewtopic.php?f=8&t=4439&p=26650#p26650 orbiter 2012-07-10 17:39:56 +02:00
  • aef9dd0350 - removed cleaning of blacklist cache on startup - added cleaning of blacklist cache if cache is modified in interface - extended cache saving to all cache types - moved cache location to DATA/LISTS - fixed static file path which was relative to the application path but should be relative to data path - which is different in debian and mac implementations Roland 'Quix0r' Haeder 2012-07-10 13:08:16 +02:00
  • c7afa8bc48 using SwitchboardConstants for solr attributes orbiter 2012-07-10 12:01:20 +02:00
  • a99ef68422 bump to httpclient-4.2.1 sixcooler 2012-07-09 18:58:33 +02:00
  • c6d8950651 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git orbiter 2012-07-09 14:33:11 +02:00
  • 5f3b8dc040 fix for RSS reader orbiter 2012-07-09 14:32:35 +02:00
  • 62202e2d71 refactoring of query attribute variable names for better consistency with (next) stored query words orbiter 2012-07-09 11:14:50 +02:00
  • 2160f9a819 Release 1.04 Release_1.04 Michael Peter Christen 2012-07-09 00:13:59 +02:00
  • 1addbc792c use less memory for md5 cache Michael Peter Christen 2012-07-08 22:05:04 +02:00
  • f32de94723 more logging Michael Peter Christen 2012-07-08 22:04:36 +02:00
  • d09d9f2364 filter old peers from bootstrap (now stronger: 60 minutes instead of 240). Michael Peter Christen 2012-07-08 21:25:22 +02:00
  • 434ee90c59 added classification for control file types which shall not be loaded but placed onto the noload-queue Michael Peter Christen 2012-07-08 21:17:33 +02:00
  • 1517a3b7b9 added webm mime-type Michael Peter Christen 2012-07-08 17:59:20 +02:00
  • a90bcb48f6 added webm Michael Peter Christen 2012-07-08 17:58:05 +02:00
  • 801972fe6f fix for url camel case parser and sentence reader Michael Peter Christen 2012-07-08 16:48:09 +02:00
  • fbc1a2030d fix for sitemap importer: can now also import very large sitemaps within small memory configurations Michael Peter Christen 2012-07-08 16:11:50 +02:00
  • 92731e5287 fix for sevenzip parser Michael Peter Christen 2012-07-08 16:11:19 +02:00
  • 45641b0c23 catch and log a warning in RasterPlotter Michael Peter Christen 2012-07-06 09:21:12 +02:00
  • 8efc1c1078 - fixed a memory leak (or bad usage) during parsing/snippet fetch - more logging for errors Michael Peter Christen 2012-07-06 09:05:41 +02:00
  • c3db015410 prevent loading of content from the cache when retrieval with IFFRESH is used and cache is stale. Should speed up snippet generation when cache strategy is IFFRESH. Michael Peter Christen 2012-07-06 08:29:41 +02:00
  • 91f14ea38e fix to solr configuration (case where the external solr was not online) Michael Peter Christen 2012-07-06 01:29:13 +02:00
  • 2c5b68d932 more abstraction of error message sixcooler 2012-07-05 14:50:37 +02:00
  • 9758c521ab abstraction of error message Michael Peter Christen 2012-07-05 14:27:28 +02:00
  • ef0d09f103 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git Michael Peter Christen 2012-07-05 14:24:19 +02:00
  • b1e7c11fba fix for pattern matcher in html parser Michael Peter Christen 2012-07-05 14:24:03 +02:00
  • 8a6edc0031 fix for solr shutdown Michael Peter Christen 2012-07-05 14:23:43 +02:00
  • b8bcc06283 fix for urls beginning with "//" Michael Peter Christen 2012-07-05 14:23:29 +02:00
  • 9b6e4e46ca fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=4430 sixcooler 2012-07-05 14:06:00 +02:00
  • b0c408788b made class methods static where possible Michael Peter Christen 2012-07-05 12:38:41 +02:00
  • 5bd3c90907 - removed unnecessary semicolons - added default case for switch Michael Peter Christen 2012-07-05 11:18:31 +02:00
  • 132afaf687 removed inaccessible code Michael Peter Christen 2012-07-05 11:09:44 +02:00
  • 7c1ba99755 removed more unused method parameters Michael Peter Christen 2012-07-05 10:44:30 +02:00
  • 83701a1b4c removed unused ImageReference package Michael Peter Christen 2012-07-05 10:24:52 +02:00
  • 0301aba1e9 removed unused method parameters Michael Peter Christen 2012-07-05 10:23:07 +02:00
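
Several of the commits above (97b7bcf2a6, a12f693ec9, 70b10e8316) describe the embedded solr select servlet and the wt codes for its response writers. As a rough illustration only, the following Python sketch builds request URLs for those writers. The base URL and the wt values rss, exml and json are taken from the commit messages; the helper name build_select_url and the WRITERS table are hypothetical and not part of YaCy. The sketch only constructs the URL and does not contact a peer.

```python
from urllib.parse import urlencode

# Select servlet address as quoted in commit 97b7bcf2a6 (assumes a local peer on port 8090).
BASE_URL = "http://127.0.0.1:8090/solr/select"

# wt codes named in the commit messages; the descriptions paraphrase those messages.
WRITERS = {
    "rss": "rss/opensearch writer (a12f693ec9)",
    "exml": "enhanced solr xml writer (a12f693ec9)",
    "json": "JSON response writer (70b10e8316)",
}


def build_select_url(query, writer="json", base_url=BASE_URL):
    """Hypothetical helper: return a select URL for the given query and wt code."""
    if writer not in WRITERS:
        raise ValueError("unknown wt code: " + writer)
    # urlencode escapes the query string; dict order keeps wt before q.
    return base_url + "?" + urlencode({"wt": writer, "q": query})


if __name__ == "__main__":
    for wt in WRITERS:
        print(build_select_url("olympia", wt))
```

Opening one of the printed URLs in a browser corresponds to the manual test suggested in commit a12f693ec9. According to commit 7ec7341f60, the solr search is covered by the same user-authentication protection as yacysearch, so a protected peer may ask for credentials.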