Commit Graph

  • bfcf9b7aa3 - added language detection using metadata from documents: html and odt documents provide this information - metadata and results from statistical analysis are compared and result is printed out as debug lines - added ranking profile for wanted language - added class with ISO 639 table, a list of all valid country codes that will be used for the language identification orbiter 2008-09-19 22:19:11 +00:00
  • 3768a1bd32 set encoding="UTF-8" for getpageinfo_p.xml apfelmaennchen 2008-09-19 14:29:10 +00:00
  • 5e8bd0f29c small fixes to getpageinfo_p.xml and htmlFilterContentScraper.java with respect to keyword extraction apfelmaennchen 2008-09-19 14:27:44 +00:00
  • 029e16b653 replaced some put(String, String) by putHTML(String, String) on serverObjects respond in htroot/ root didn't touch htroot/xml/ this should solve potential xss issues lotus 2008-09-19 11:45:11 +00:00
  • 5b2a57bfd0 - /xml/util/getpageinfo_p.xml added <desc> and <lang> tags - changed htmlFilterContentScraper.getKeywords() to split on either space or comma characters, not both apfelmaennchen 2008-09-18 21:01:23 +00:00
  • e1f67262f7 - added and removed some debugging output - fixed a bug with merge method - patched wrong output of language identification (not fixed, only patched!) orbiter 2008-09-18 14:12:15 +00:00
  • ce2a7ed116 integrated language detection classes into condenser environment orbiter 2008-09-18 13:12:33 +00:00
  • 2b13705839 fixed a mistake in indexing queue processing: documents had been parsed before it was checked if they should be indexed or not. parsing was not necessary for this check, so the check was moved in the queue in front of the document parsing orbiter 2008-09-18 11:36:09 +00:00
  • ea5de7436d added Sciencenet to the compare search orbiter 2008-09-18 10:56:18 +00:00
  • 21dbb39afa switched two balancer cases orbiter 2008-09-17 22:13:25 +00:00
  • 1bbf362cef update to the crawl balancer: better organization and better crawl delay prediction orbiter 2008-09-17 21:45:21 +00:00
  • 538991ff2f release 0.60 orbiter 2008-09-16 23:07:52 +00:00
  • ddcf285499 - fixed a bug in performance setting (did not work with german translation) - reduced maximum number of error url references to save some memory (this was actually a small memory leak) orbiter 2008-09-16 23:04:24 +00:00
  • 0cd0fee546 fixed bug with wrong proxy result enqueueing. See: http://forum.yacy-websuche.de/viewtopic.php?p=8130#p8130 - removed the online status property. This influenced the proxy behavior and created some complexity that was not needed because the online status was never used for what it was created for (offline browsing) - checked all proxy identification procedures during crawling and enhanced transparency and error checking - fixed a proxy identification routine that caused the wrong selection of the proxy result queue orbiter 2008-09-16 21:56:23 +00:00
  • e071f759d2 YaCy-UI: small optical changes apfelmaennchen 2008-09-16 21:39:14 +00:00
  • bbacf86fe8 - added /xml/bookmarks/posts/add_p.xml - security fix to /xml/bookmarks/posts/delete_p.xml - YaCy-UI: added 'add' and 'delete' bookmarks apfelmaennchen 2008-09-16 21:38:13 +00:00
  • cd1ac5bb90 - fixed security issue with /xml/util/ynetSearch.xml - hopefully fixed YaCy-UI local search with async=false for Ajax-request apfelmaennchen 2008-09-16 05:55:31 +00:00
  • c73cf05ddd tried to fix local search in yacy-ui orbiter 2008-09-15 21:56:53 +00:00
  • 2468155e48 YaCy-UI: update to German translation apfelmaennchen 2008-09-15 21:02:35 +00:00
  • 99ff478d63 YaCy-UI: small optical fix apfelmaennchen 2008-09-15 20:30:33 +00:00
  • c8b6fbe900 translation update orbiter 2008-09-15 19:14:57 +00:00
  • 7e24c51fd5 - removed alternative search page in main menu in favor of rich client search page - added necessary option to search request of yacy-ui to get snippets orbiter 2008-09-15 19:00:22 +00:00
  • 670244849d fix for http://forum.yacy-websuche.de/viewtopic.php?p=9835#p9835 orbiter 2008-09-15 18:29:37 +00:00
  • fd9233244e configurable free disk space via disk.free lotus 2008-09-15 17:33:06 +00:00
  • 7c5867a832 Major update to YaCy-UI apfelmaennchen 2008-09-15 17:18:07 +00:00
  • 25a62cdc3f small fixes orbiter 2008-09-15 15:11:59 +00:00
  • 73f233bb11 * set resource observer to 1000MB * transparent favicon lotus 2008-09-15 12:41:27 +00:00
  • 1be24158a2 small fix / rendering option orbiter 2008-09-15 10:22:05 +00:00
  • 105be67998 - some bugfixes to compare search - redesigned input boxes: smaller, more space for result page orbiter 2008-09-15 09:43:51 +00:00
  • 693fa2a157 - renamed Comparison to compare_yacy - added more search engines - some refactoring and added a list that is used to present the search engine list in a specific order - added simpleheader and no-header options - added the compare search to the simple header - added default compare search page selection storage - after re-start you get the same default search engines as you selected before orbiter 2008-09-15 09:17:05 +00:00
  • 78ad58cd42 the UseConcMarkSweepGC option caused a full-CPU-usage bug on Darwin (Mac OS X), but it was reported that the same option gives good performance on Solaris. Therefore it is now only used if YaCy runs on Solaris. Please run comparison tests with/without this option on Linux and change the start script if it is obvious that the option makes the Linux JVM work better. (with Java 1.5!) orbiter 2008-09-14 20:35:50 +00:00
  • 5fbccfd75e fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1366&p=9348#p9348 orbiter 2008-09-14 20:10:43 +00:00
  • a28faabfd2 fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1351&p=9242#p9242 orbiter 2008-09-14 20:03:59 +00:00
  • 44a4e2815a translation update // right, this was apparently missing daburna 2008-09-14 19:09:28 +00:00
  • c536212317 translation update // hope I did not break anything, since I was missing previous updates. if so, it will be repaired right away daburna 2008-09-14 19:07:50 +00:00
  • 7b63c66a08 - bugfix in bookmarksDB.Tag.hasPublicItems() - this annoying little bug prevented display of public items without admin login for /xml/bookmarks/... apfelmaennchen 2008-09-14 18:45:08 +00:00
  • 6216105ca5 small fix lotus 2008-09-14 18:12:26 +00:00
  • 5e5178b5e8 please use putHTML to avoid XSS lotus 2008-09-14 18:08:39 +00:00
  • b33a6cbb77 *) less disturbing elements in yacy frame low012 2008-09-14 17:58:08 +00:00
  • 98d902b972 * remember last searchwords in Comparison_p.html f1ori 2008-09-14 15:21:57 +00:00
  • bd45c5a2bc integrated the comparison page into the main menu orbiter 2008-09-14 10:40:01 +00:00
  • 5e0390a24c *) Ooooooooops! low012 2008-09-13 17:14:41 +00:00
  • dc56c35289 *) added page to compare results of 2 search engines low012 2008-09-13 16:50:01 +00:00
  • 1fb1665e71 increased dht interval to avoid peer selection failure (maybe too few peers available to fill the big gaps) orbiter 2008-09-12 13:38:27 +00:00
  • 880d1a83e2 do not change memory and some non-defined tasks with performance profiles lotus 2008-09-12 11:54:25 +00:00
  • 1eb813bd43 shifted index deletion-on-exit rule to the class where the errors are produced orbiter 2008-09-12 11:51:48 +00:00
  • ba76995d2c * fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1415 f1ori 2008-09-12 10:54:11 +00:00
  • bea6c13139 * with r5137 robotParser didn't work at all -> fix f1ori 2008-09-12 09:06:38 +00:00
  • 3ded1efe84 kelondroExceptionCounter didn't work lotus 2008-09-11 18:51:47 +00:00
  • ae677e1738 * fix problem in robotparser, see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1421&p=9742 f1ori 2008-09-11 18:12:17 +00:00
  • 383d89481e count errors before deleting collection.index lotus 2008-09-10 16:40:20 +00:00
  • 0bb4fbc403 delete corrupted collection.index on exit for rebuild on next start see http://forum.yacy-websuche.de/viewtopic.php?p=9725#p9725 lotus 2008-09-10 12:55:14 +00:00
  • b68d06a6e8 performance settings based on network's remote crawl speed removed some _pro values from config lotus 2008-09-10 12:52:17 +00:00
  • d60b2b198d proxy fixed 'not modified' http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1419 danielr 2008-09-10 11:06:22 +00:00
  • bd0318ba81 * YaCy only supports gzip-encoding, so remove any other encoding from request * fixes http://www.yacy-forum.org/viewtopic.php?f=2&t=163 f1ori 2008-09-09 14:04:52 +00:00
  • bb5c898441 enhancements to localsearch behavior orbiter 2008-09-09 10:24:42 +00:00
  • 42e2d195ac added hint from http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1294 orbiter 2008-09-08 22:37:58 +00:00
  • 39964e88fa fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1329#p9121 orbiter 2008-09-08 22:06:45 +00:00
  • 3f3673b6e5 extended balancer: - added automatic time delay in case that a large number of urls come from the same domain - added additional time delay in case that an url is a dynamic (CGI) url. This shall cause less IO on targets orbiter 2008-09-08 21:50:37 +00:00
  • 3c6e8d2015 set default ppm when network is switched orbiter 2008-09-08 18:20:05 +00:00
  • 20c2d3c248 fix for bad formatting in CrawlResults orbiter 2008-09-08 13:59:35 +00:00
  • 01d3b2bd36 ahem.. 6PPM, not 10. orbiter 2008-09-08 09:51:08 +00:00
  • 3288c19c1a reduce remote crawl PPM for fresh peers in freeworld to 6 PPM orbiter 2008-09-08 09:49:08 +00:00
  • b92105c8b0 do not change auto recrawl scheduler with performance profiles lotus 2008-09-07 13:59:24 +00:00
  • 5ce9a100bb fix(2) for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1416 lotus 2008-09-07 13:57:53 +00:00
  • cf29ca19d4 possible fix for POST character encoding http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1374 danielr 2008-09-07 13:10:46 +00:00
  • a2eeb6138c fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1416 danielr 2008-09-07 13:04:17 +00:00
  • d09ddabd09 corrected a design mistake (5-byte hashes not necessary) orbiter 2008-09-04 21:28:00 +00:00
  • c97d0fcee7 modified the domain list export function: - used the new superfast domain list generation from the domain statistics - better interactive behavior orbiter 2008-09-04 20:28:36 +00:00
  • 77ee0765a4 - added domain statistic generation to IndexControlURLs_p.html servlet - added 'delete all' button to all results of such a domain statistic output which causes that all urls to this domain are deleted - extended stack cleaner to clean also the statistics: they are not completely destroyed, only the smallest counting domains are removed orbiter 2008-09-04 19:41:57 +00:00
  • 44bc8311af translation fix lotus 2008-09-04 19:26:59 +00:00
  • e5c0b969d6 * save performance profile speed * fix for wrong javastart_priority after first start lotus 2008-09-04 19:12:22 +00:00
  • d7a16c1f30 * added shutdown on search page (this page is shown after clicking the tray icon) * shorter, less technical words for configuration-links lotus 2008-09-04 12:51:05 +00:00
  • 80a7bc93d6 - added statistical evaluation about domains that appear during crawling - added tables that show these statistics in CrawlResults web pages orbiter 2008-09-04 09:59:17 +00:00
  • 4a4f388ca5 re-design and simplification of crawl start menu layout orbiter 2008-09-04 07:56:29 +00:00
  • 4fbee21cea - added fetch-ahead again (had been removed in last commit) - reverted default query mode to verify=false orbiter 2008-09-03 23:50:13 +00:00
  • 423a89ebe8 * fix if yacy was installed to a path with whitespace * show nice dots when waiting for restart/update lotus 2008-09-03 18:49:02 +00:00
  • fc03b0437a fixed an error case where a second search after a first search with a different search word failed orbiter 2008-09-03 15:55:25 +00:00
  • eca171ba2e fix for case where javascript was not filtered by the html parser see http://forum.yacy-websuche.de/viewtopic.php?p=9667#p9667 orbiter 2008-09-03 14:41:20 +00:00
  • 992635c074 translation update daburna 2008-09-03 13:44:58 +00:00
  • e645bae29f display table in log lotus 2008-09-03 13:14:01 +00:00
  • ead39064c5 fixed problem with wrong result number calculation orbiter 2008-09-03 10:04:46 +00:00
  • 2437beb96c fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1360&p=9321#p9321 hermens 2008-09-03 07:39:03 +00:00
  • 7b12e77a63 fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1393&hilit=&p=9655#p9655 orbiter 2008-09-03 00:50:42 +00:00
  • 05dbba4bab added logging conditions to all fine and finest log line calls this will prevent an overhead for the generation of the log lines in case that they then are not printed orbiter 2008-09-03 00:30:21 +00:00
  • d3d41e2ee4 - fixed problem with searching with quotes (still not complete, but not as bad as before) - fixed parsing of crawl-delay statements when seconds were given with float numbers - enhanced performance of profiling (not too many loggings; not more than one per second) - removed some debug output - fixed wrong return type in logging - added a logging condition in httpd to prevent that logging statements are generated when they are not written (should be added everywhere!) - fixed wrong word distance computation in RWI management orbiter 2008-09-02 23:49:48 +00:00
  • 3a0e96b552 * only create one debian package for all architectures f1ori 2008-09-02 21:53:45 +00:00
  • 3fbfd5a78b * fix for non-changing offset on new search term * dht-heap doesn't have to be deleted (5097), we simply write a new one on exit * do not install YaCy in startup because a Windows-shutdown might corrupt something. Installing YaCy as a service would solve this. lotus 2008-09-02 15:09:31 +00:00
  • 219b93df6a - fixed internal error after receiving chunked POST - removed debug output - added info for "501 Unknown" messages danielr 2008-08-29 13:51:22 +00:00
  • c245c7a45e delete index.dhtin/out.heap if restore fails see http://forum.yacy-websuche.de/viewtopic.php?p=9613#p9613 lotus 2008-08-29 13:10:41 +00:00
  • cd19d0aee6 - added warnings for failed transferRWI (dht-in) - fixed parseMultipart (uncompress gzipped body) (dht-in) - fixed parseMultipart (using content-length only if uncompressed) - better gzipped POST (chunked instead of content-length) (dht-out) danielr 2008-08-29 09:42:39 +00:00
  • 89cf795a5c proper default priority on first start (Windows) lotus 2008-08-29 07:01:38 +00:00
  • 016f57d714 fixed a dead link orbiter 2008-08-28 21:45:58 +00:00
  • df4ff423c4 added additional properties to query id's to distinguish search events better orbiter 2008-08-28 21:15:59 +00:00
  • d6d9b0f14a fixed transferRWI.html 'Read timed out' danielr 2008-08-28 08:37:51 +00:00
  • e503158527 Proxy: fix for never ending loading after POST danielr 2008-08-27 20:46:34 +00:00
  • 73519cbdca fixed pid-file for linux start-script danielr 2008-08-27 19:18:38 +00:00
  • 1a1d57e449 Proxy: added binary passthrough for POST danielr 2008-08-27 08:07:18 +00:00
  • aa6ae77e5e - autoReCrawl: fix for filter settings apfelmaennchen 2008-08-26 21:51:05 +00:00
  • 8ae29bad57 - fix to previous change of Crawl Profile Names apfelmaennchen 2008-08-26 20:42:29 +00:00