Commit Graph

  • 552ef9f18e fix for bad ErrorCache.exists test (bug from latest commit) Michael Peter Christen 2013-12-12 10:38:32 +01:00
  • 09412ea3a4 counting search requests in solr interface Michael Peter Christen 2013-12-12 03:37:19 +01:00
  • 303f5694ba avoid usage of existsByQuery. If a document can be loaded by the ID before testing other fields from the existsByQuery request, then a document cache fills and queries after that one can be avoided. Michael Peter Christen 2013-12-12 03:36:30 +01:00
  • b43bbd3cc4 join DefaultServlet and Jetty8 implementation - removing Jetty 8 specific dependencies reger 2013-12-09 23:45:57 +01:00
  • 8ac48aac27 update Maven pom to latest version number - include newer dependency versions of several lib/jar for eval. reger 2013-12-09 23:43:58 +01:00
  • 089c5007ee move conditionalHeader to DefaultServlet - by removing Jetty specific implementation detail reger 2013-12-08 00:56:45 +01:00
  • 67e7dc0cc6 added more properties to seedlist servlet Michael Peter Christen 2013-12-06 14:30:47 +01:00
  • 79771c60c0 IPv6 fixes Michael Peter Christen 2013-12-06 14:30:08 +01:00
  • 4e3375d983 next development version Michael Peter Christen 2013-12-06 13:47:50 +01:00
  • 92d9c56f9f Merge origin/master into jetty reger 2013-12-05 22:53:29 +01:00
  • f722e450b3 changed start parameters which caused deadlocks in mac and windows versions Michael Peter Christen 2013-12-05 00:55:35 +01:00
  • ddc7a24853 intermediate release 1.66 Michael Peter Christen 2013-12-04 23:16:06 +01:00
  • 78eac85161 better calibration of caches and queue maximum sizes Michael Peter Christen 2013-12-04 23:15:10 +01:00
  • da380343c2 perform greedy learning heuristic only if load < 1.0 Michael Peter Christen 2013-12-04 22:44:51 +01:00
  • 81926c055d fixed bug with image search in yacyinteractive Michael Peter Christen 2013-12-04 18:44:23 +01:00
  • edda0699e4 changed default timeout for port scanner Michael Peter Christen 2013-12-04 18:13:43 +01:00
  • c8af19bd37 removed unnecessary check which causes a NPE when searching with empty search string Michael Peter Christen 2013-12-04 17:58:36 +01:00
  • e3c2f09de9 - reduce computation in case that specific postprocessing fields are not selected - de-select citation rank computation Michael Peter Christen 2013-12-04 17:48:12 +01:00
  • cfa08024c7 removed optimization before postprocessing because that may cause a time-out which causes postprocessing to fail. Michael Peter Christen 2013-12-04 16:04:29 +01:00
  • 6f3a923691 fixed urlmask which was not able to combine several constraints Michael Peter Christen 2013-12-04 13:48:01 +01:00
  • 9a27bf6e82 removed filter computation in Protocol class for remote searches because that is already done in the QueryParams class Michael Peter Christen 2013-12-04 13:09:15 +01:00
  • f1b5db2c45 - performance graph does not show peer ping in memory monitor any more - after a forced GC, the PerformanceMemory view switches to automatic update by default Michael Peter Christen 2013-12-04 12:59:30 +01:00
  • a125904a1c fixed a NPE in surrogate processing Michael Peter Christen 2013-12-04 01:56:38 +01:00
  • 0db8e34625 enhanced webgraph processing Michael Peter Christen 2013-12-04 01:54:45 +01:00
  • 9d8b32c63a fixed a division by zero Michael Peter Christen 2013-12-04 01:54:14 +01:00
  • ac067b5236 clean-up Jetty handler classes reger 2013-12-01 19:36:24 +01:00
  • 10a6346056 clean-up test cases to work with current source reger 2013-12-01 03:38:58 +01:00
  • b75e92aac3 add read queryparameter in gsaservlet reger 2013-11-30 06:29:57 +01:00
  • 957f6297fb Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git Michael Peter Christen 2013-11-30 01:46:03 +01:00
  • 1e94719084 fix NPE on mime detection of unknown file extension reger 2013-11-29 23:23:47 +01:00
  • effea4bca0 Merge origin/master into jetty reger 2013-11-29 22:39:52 +01:00
  • b49e90d2e9 remove reference to solrServlet from YaCy servlet select - reference is not used - solrServlet is used in Jetty branch and adjustments there conflict with unused solrServlet here. reger 2013-11-29 22:10:14 +01:00
  • 38e1e3a707 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git Michael Peter Christen 2013-11-29 02:46:38 +01:00
  • 2c2ebb0d92 tried some hardening in order not letting any Solr-Searchers open sixcooler 2013-11-29 02:40:12 +01:00
  • cca79d12ef setting of some default values to make it easy to start client development using the description at http://www.yacy-websuche.de/wiki/index.php/Dev:APIhello Michael Peter Christen 2013-11-29 01:28:48 +01:00
  • a16534cb0a tried to fix timeout and connection-lost problems when using an outside solr. Michael Peter Christen 2013-11-28 01:31:53 +01:00
  • c3dcbdc8d5 try to recover from an OOM during citation index reading and fail-over to second solr core in case of unrecoverable OOM. Michael Peter Christen 2013-11-28 01:10:25 +01:00
  • 9932c441c8 fixed a problem with Date fields parsing Solr results if a remote Solr is attached. Michael Peter Christen 2013-11-28 00:54:53 +01:00
  • 94db054aff memory-leak-fix: the DocListSearcher fires a query in its constructor and it is highly recommended to close every SolrRequest. Every request which is not closed leaves a Searcher with its caches which cannot be garbage-collected. sixcooler 2013-11-27 19:07:36 +01:00
  • 26bb1e37b7 implement core selection in SolrServlet - making initcore() obsolete reger 2013-11-27 02:51:02 +01:00
  • ae55d69ef6 include/exclude size NPE fix (recently added) Michael Peter Christen 2013-11-26 11:47:04 +01:00
  • 3d4b5e66ce disallow remote robots to crawl the HostBrowser servlet Michael Peter Christen 2013-11-26 07:06:25 +01:00
  • 234ca720f5 only admins should be able to force a commit Michael Peter Christen 2013-11-26 07:03:20 +01:00
  • 2c39b65409 fixes for searches containing stopwords. The fix was done using a reconstruction of the search word set access method to protect that words are deleted from the sets from the outside of the QueryGoal class. Michael Peter Christen 2013-11-26 02:24:47 +01:00
  • 5592ea57f0 hack to remove compiler warnings about deprecated classes. It would be better to remove the deprecated usage but to do this the Solr core must adopt the latest apache http core changes as well .. this is not our fault. Michael Peter Christen 2013-11-25 23:30:35 +01:00
  • 037cd0a57c using the BinaryResponseWriter which is supported within the YaCy solr servlet since YaCy 1.63. This is much more performant for the client than using the XMLResponseWriter because parsing of XML data is very CPU intensive. Older YaCy peers are still requested using the XMLResponseWriter but the majority of YaCy peers already respond with the binary writer. This makes remote searches much faster and less CPU intensive. orbiter 2013-11-25 21:31:40 +01:00
  • 61409788eb less word hash computations (removing some overhead because of MD5 calcs) using the clear word in a normalized form. orbiter 2013-11-25 15:20:54 +01:00
  • f23471c471 add check to prevent index entries containing url_file_ext_s with ";jsession=xyz" note: check could be implemented in MultiProtocolURL (but at this time didn't oversee possible implication) reger 2013-11-25 00:14:53 +01:00
  • 5c4a3d1c01 Merge origin/master into jetty reger 2013-11-24 21:00:39 +01:00
  • 444a9ae674 remove unused options and attributes from DefaultServlet cleanup obsolete class files reger 2013-11-24 20:11:39 +01:00
  • 8da75a4b0c fix contentType definition for Solr html response writer from xml to html (hint: value is currently not used, but is in SolrServlet) reger 2013-11-24 04:31:08 +01:00
  • caa20d63d9 fixed seedlist (hash was missing) Michael Peter Christen 2013-11-22 14:15:52 +01:00
  • ccf2f4e43b refactoring of seed attributes (introduced more constants) Michael Peter Christen 2013-11-22 14:15:31 +01:00
  • 1f0bfa8fec added test to Base64Order (runs successfully!) Michael Peter Christen 2013-11-22 10:38:42 +01:00
  • c927b428d3 fixed json Michael Peter Christen 2013-11-22 10:07:08 +01:00
  • 64048ff217 fix for XSS Michael Peter Christen 2013-11-22 09:53:32 +01:00
  • b7f1e5af51 added new servlet which generates the same file as the principal peers upload to a bootstrap position you can call it either with http://localhost:8090/yacy/seedlist.html or to generate json (or jsonp) with http://localhost:8090/yacy/seedlist.json http://localhost:8090/yacy/seedlist.json?callback=seedlist orbiter 2013-11-19 15:56:10 +01:00
  • 3e552550d1 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git orbiter 2013-11-18 22:48:00 +01:00
  • c2d720cdaf purge a lucene cache - possible memory leak fix orbiter 2013-11-18 22:47:35 +01:00
  • e4f49fb175 for searchresults with empty title use filename as title - to not store a title in index which isn't extracted from source the title is empty check only added to ResultEntry class reger 2013-11-18 19:41:31 +01:00
  • b1dc9a6f52 - disable Jetty servlet defaultUseCache (prevent double caching) - include short memory status check for class cache in DefaultServlet - remove obsolete Resource interface for Jetty8YaCyDefaultServlet reger 2013-11-18 03:15:45 +01:00
  • f111f30ace Merge origin/master into jetty reger 2013-11-17 00:18:25 +01:00
  • f4172cbb3d fix for another XSS bug Michael Peter Christen 2013-11-17 00:17:25 +01:00
  • 94293176a3 use writeOptionHeaders with ServletResponse parameter only reger 2013-11-17 00:02:08 +01:00
  • ff86cb683f fixed some XSS bugs reported by Marius from http://ctf365.com/ orbiter 2013-11-16 20:34:31 +01:00
  • da33ee0d77 also extended timeout for webgraph postprocessing orbiter 2013-11-16 18:30:06 +01:00
  • 74f9e40747 extended timeout during postprocessing of 30 minutes. orbiter 2013-11-16 18:29:08 +01:00
  • 19a051bec8 more monitoring for postprocessing and enhanced layout in Crawler monitor page orbiter 2013-11-16 18:23:14 +01:00
  • 9cf9727685 fix for wrong counter Michael Peter Christen 2013-11-16 11:33:35 +01:00
  • fceac8cffd more monitoring for postprocessing Michael Peter Christen 2013-11-16 08:23:42 +01:00
  • 6842783761 fixed and enhanced postprocessing Michael Peter Christen 2013-11-16 08:23:21 +01:00
  • 219d5934a4 fixed termination bug in Solr Connector Michael Peter Christen 2013-11-16 08:22:29 +01:00
  • bf1bdd52a6 prevent requesting of 0-facets (which actually exist) Michael Peter Christen 2013-11-15 15:41:41 +01:00
  • 9d5895f643 enhanced and fixed postprocessing Michael Peter Christen 2013-11-15 15:41:12 +01:00
  • f86fe90eda enhanced mass storage speed to remote solr servers Michael Peter Christen 2013-11-15 15:40:07 +01:00
  • 6ed9821209 fixed several problems in solr connectors Michael Peter Christen 2013-11-15 15:39:35 +01:00
  • 191fd3d7e7 added an optimization option to HandleSet mass data storage structure Michael Peter Christen 2013-11-15 15:38:00 +01:00
  • 94b565ea0d fixed keepalive min value Michael Peter Christen 2013-11-15 15:37:01 +01:00
  • 5ec5be5769 fixed logging for remote solr configuration Michael Peter Christen 2013-11-15 15:36:24 +01:00
  • b26787dc2d - DefaultServlet: remove static gzip option YaCy doesn't use pre-gzip'ed static html pages - ProxyServlet: remove not needed procedure - Server init: skip one overlapping servlet context reger 2013-11-14 01:37:51 +01:00
  • 24a052ecb9 removed debug code for existsByIds Michael Peter Christen 2013-11-13 13:41:18 +01:00
  • 087df05e24 added option to Config_Network_p.html to enable remote search while DHT-Receive is switched off. Michael Peter Christen 2013-11-13 13:38:01 +01:00
  • 1a4a69c226 set more logger to 'final static' Michael Peter Christen 2013-11-13 06:18:48 +01:00
  • c60947360d logger should be static Michael Peter Christen 2013-11-13 06:04:28 +01:00
  • 69b8d61c47 fix for search requests in GSA interface which contain 'funny' characters (like ':' etc.) Michael Peter Christen 2013-11-12 15:54:54 +01:00
  • b085cb522b replaced old existsByIds for embedded Solr with obviously much faster new selection method (including still existing debug code to test that this is in fact better) orbiter 2013-11-11 11:25:01 +01:00
  • 1a6158e338 make test directory available in Maven pom - exclude reference to old slf4j-log4j12 reger 2013-11-10 22:20:35 +01:00
  • b4fdb8c887 cleanup test directory from Jetty 9 implementation samples - current Jetty implementation advances so that it seems not beneficial to keep the code as it makes the test unusable and use of Jetty 9 is due to Java 1.7 dependency not in sight. reger 2013-11-10 22:01:31 +01:00
  • b29d262e70 implement Jetty8HttpServerImpl.generateSocketAddress (code 1:1 copied from serverCore) reger 2013-11-10 18:59:18 +01:00
  • 4234b0ed6c Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git orbiter 2013-11-10 18:50:43 +01:00
  • 909bbb49d8 added (partly commented) test code for url rewrite methods .. to be completed orbiter 2013-11-10 18:50:34 +01:00
  • 74c86a72a0 better default value for crawler user agent orbiter 2013-11-10 18:48:00 +01:00
  • 066a1ecf0a add highlight queryparams to solrservlet if missing - modify query params in Solr parameter map (instead of querystring) reger 2013-11-10 01:36:57 +01:00
  • 899e7e92b0 added debug code Michael Peter Christen 2013-11-09 02:37:12 +01:00
  • a5c1249ee2 reverted autowarming setting in solrconfig Michael Peter Christen 2013-11-09 01:43:44 +01:00
  • 4684330505 Merge origin/master into jetty reger 2013-11-07 21:44:14 +01:00
  • 1437c45383 merge rc1/master reger 2013-11-07 21:30:17 +01:00
  • 87a956e881 calculating and showing the number of files and the average size of a file in the HTCACHE in ConfigHTCache_p.html Michael Peter Christen 2013-11-07 12:13:12 +01:00
  • acc1f8a749 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git Michael Peter Christen 2013-11-07 12:01:37 +01:00
  • 81d9e23532 fixed another memory leak in the PDF parser: the class org.apache.pdfbox.pdmodel.font.PDFont occupies 8MB of space which cannot be cleaned if PDFont.clearResources is called. The attempt to clean the class cache therefore causes that the class is loaded and this cache is initialized with some rubbish. I tried to prevent to instantiate this class by usage of a hacked findLoadedClass call to the SystemClassLoader (which is protected ...). Now, without using the PDF parser at all, 8MB of RAM space is not occupied, however, when the first PDF arrives this space will be taken and never given back to GC. WAKE UP YOU LAZY PDFBOX HACKER AND FIX THIS SHIT! Michael Peter Christen 2013-11-07 11:57:01 +01:00