Commit Graph

  • 3c3e6499ae added more logging for merge operation orbiter 2009-06-19 14:51:35 +00:00
  • a119860b82 moved IndexImportWikimedia into different menu position orbiter 2009-06-19 14:03:28 +00:00
  • 15180fc95e - patch for future computation in SplitTable - added same concurrent process for has() from SplitTable in ArrayStack orbiter 2009-06-18 23:24:23 +00:00
  • 9a5ec20b3c avoid merge during startup orbiter 2009-06-18 21:18:56 +00:00
  • bf6b92343c try to avoid stuck pdf parser lotus 2009-06-18 06:31:55 +00:00
  • c695c7f512 try to remove hung swf parser from queue lotus 2009-06-17 17:48:02 +00:00
  • fc69a76197 update to web structure picture: - allow bigger size - better instructions for api usage orbiter 2009-06-17 12:37:31 +00:00
  • ae015e8e98 refactoring of blob package classes orbiter 2009-06-17 09:58:15 +00:00
  • 8b8877c233 moved image collector orbiter 2009-06-16 21:48:09 +00:00
  • be1c7ddc64 refactoring of search classes -- moved Ranking Profile to search package orbiter 2009-06-16 21:45:40 +00:00
  • 1457bfce16 added updateYACY.sh to release build orbiter 2009-06-16 21:02:41 +00:00
  • fd31a3616a - more logging in server process - fix for bad ascii in comment orbiter 2009-06-16 15:10:59 +00:00
  • b5bc399cea added necessary synchronization for logging statistics (causes deadlock) orbiter 2009-06-16 10:37:13 +00:00
  • e377a1e9a1 fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1969 lulabad 2009-06-16 08:16:59 +00:00
  • 5a7fd6b4c8 just some comment lines orbiter 2009-06-16 06:38:26 +00:00
  • 31f60a3b3e when doing searches, also apply an online caution to DHT transmission and stop transmissions during heavy load caused by searching. This omits the many requests to the URL database that are needed for DHT transfer and it avoids collisions with URL retrieval needed for search results. orbiter 2009-06-15 22:03:27 +00:00
  • 17dc6d4be5 small fix for new Logger orbiter 2009-06-15 21:39:24 +00:00
  • ce1adf9955 serialized all logging using concurrency: high-performance search query situations as seen in yacy-metager integration showed a deadlock situation caused by synchronization effects inside of sun.java code. It appears that the logger is not completely safe against deadlock situations in concurrent calls of the logger. One possible solution would be an outside synchronization with 'synchronized' statements, but that would apply further blocking to all high-efficiency methods that call the logger. It is much better to do a non-blocking hand-over of logging lines and work off log entries with a concurrent log writer. This also disconnects IO operations from logging, since logging can itself cause IO when a log is written to a file. This commit not only moves the logger from kelondro to yacy.logging, it also inserts the concurrency methods to realize non-blocking logging. orbiter 2009-06-15 21:19:54 +00:00
  • aec3e7995a autoconfig.pac can be used to browse .yacy-domains only lotus 2009-06-15 19:48:11 +00:00
  • 4e825852d2 added stub for phpBB3 search integration guide orbiter 2009-06-15 11:52:57 +00:00
  • bc6dd8194b refactoring: moved search query class to new search package orbiter 2009-06-15 11:49:00 +00:00
  • a4805defdd added stub for new search process orbiter 2009-06-15 11:46:23 +00:00
  • b8e738a7be a collection of - small bug fixes - better/more comments - more asserts - fixed synchronization - test case enhancements - code cleanup - performance hacks orbiter 2009-06-14 22:09:08 +00:00
  • 39779e4796 DidYouMean: as I moved to only 8 consumer and 4 producer threads, I removed poison pills as it does not make sense anymore - threads are interrupted directly. Having a consumer thread per test case just didn't make sense either (see svn 6070) due to the massive overhead. apfelmaennchen 2009-06-14 16:31:31 +00:00
  • c3c4dd0933 DidYouMean - changed to much simpler LinkedBlockingQueue apfelmaennchen 2009-06-14 15:25:57 +00:00
  • 01ac1b5d7e - blocking queue implementation of DidYouMean - timeout is set to 500ms apfelmaennchen 2009-06-14 11:53:09 +00:00
  • b8bb1bb364 join with a timeout does not stop the corresponding thread after the time-out; it only stops the waiting. Here we additionally need a signal to the thread to stop after we have finished waiting. orbiter 2009-06-13 23:54:52 +00:00
  • b69f22e9ca mistake in last commit: computation of loops in ReversingTwoConsecutiveLetters orbiter 2009-06-13 23:37:51 +00:00
  • 3130334932 - start first with threads that run more loops - join first with threads that run fewer loops orbiter 2009-06-13 23:34:16 +00:00
  • 6cde7ebf16 DidYouMean - without I/O intensive sorting by count - but with multiple threads apfelmaennchen 2009-06-13 23:16:14 +00:00
  • f348190566 tried to insert a database dump import method to the phpBB3 import function. Reason: imports of large database dumps cannot be handled with phpMyAdmin, and this should be an easy way to load the database dumps into a MySQL database, from where they can be exported again with the phpBB3 content integration adapter. Completion or removal of this function stub will follow before next main release. orbiter 2009-06-13 23:03:40 +00:00
  • 945777aa80 replaced rwi term counting method by one that computes the maximum of the blobs that contribute to the RWI. An addition of the blob sizes is incorrect and does not reflect the real size. Truncating the size computation to the maximum of all blobs is also incorrect, but not as wrong as the sum of all blob sizes, which double-counts many rwi entries. orbiter 2009-06-13 22:59:54 +00:00
  • 303ccda69f small fix for "did you mean" apfelmaennchen 2009-06-13 11:11:30 +00:00
  • 7c4d1d471c hand-over of more specific object orbiter 2009-06-13 10:22:25 +00:00
  • 9150bc0f7d - don't show empty "did you mean" apfelmaennchen 2009-06-13 07:02:50 +00:00
  • 6c116be536 - set default &meanCount=5 apfelmaennchen 2009-06-13 06:49:17 +00:00
  • 09acfa66d1 - improved "did you mean" - added &meanCount= to query string - &meanCount=0 ==> no suggestion, no performance loss - sorting suggestions by sb.indexSegment.termIndex().count() apfelmaennchen 2009-06-13 06:20:05 +00:00
  • da6ce37f7b - fixed encoding problem - added limit to 10 suggestions apfelmaennchen 2009-06-12 21:36:26 +00:00
  • 54a48b4184 - added "did you mean" to search page - currently works for single word queries only! apfelmaennchen 2009-06-12 20:36:03 +00:00
  • 31360ba40c - Updated ConfigLiveSearch.html - added documentation for load_js and load_css apfelmaennchen 2009-06-12 05:57:08 +00:00
  • ab09d8ebb3 - small noscript fix - noscript is now functional but ugly apfelmaennchen 2009-06-11 22:10:02 +00:00
  • 55ef9ae12a small fix for last post apfelmaennchen 2009-06-11 21:42:34 +00:00
  • 36dc9b09ac - partial update to jquery-1.3.2 - partial update to jquery-ui-1.7.2 - yacyportalsearch fixed sidebar for navigators apfelmaennchen 2009-06-11 21:34:39 +00:00
  • 550312ac85 added new command script to do an auto-update from the command line. This will make it easy to do mass auto-updates in private yacy clusters orbiter 2009-06-11 11:31:26 +00:00
  • 0fc1168554 - reduced time-out for socket-connection communication from 20 seconds to 5 seconds. This is a test to find out if the time-out was a cause for problems in metager environments - turned a fine log entry in case of rejected connections on the server socket into a warning. (look for 'exceeding limit') orbiter 2009-06-11 10:20:31 +00:00
  • 28b86385cd patch for badly behaving swf parser orbiter 2009-06-11 09:54:48 +00:00
  • d58b395993 fix for http://forum.yacy-websuche.de/viewtopic.php?p=15693#p15693 orbiter 2009-06-11 09:38:25 +00:00
  • cffef67dc5 added a short info line about the latency monitor orbiter 2009-06-10 23:03:29 +00:00
  • 733385cdd7 enhanced database access times by removal of unnecessary synchronization. Also added more hacks that resulted from high-volume query testing orbiter 2009-06-10 23:02:42 +00:00
  • 5a7dec880e - some improvements for: http://forum.yacy-websuche.de/viewtopic.php?f=9&t=1904#p15668 - portalsearch: introduced yconf.load_js and yconf.load_css - yacysearch.html still having problems with focus after sidebar is loaded - yacysearchtrailer.json seems not to be valid json for ?nav=all apfelmaennchen 2009-06-10 22:11:31 +00:00
  • 5d7045387b added more word lists and a multi-access search test tool for high-performance query testing: run searchtestmulti.sh; then 10 concurrent processes fire 1000 requests each to the local peer. orbiter 2009-06-10 22:01:48 +00:00
  • 398e210fef removed synchronization in logging that causes deadlocks in high-performance environments orbiter 2009-06-10 19:17:30 +00:00
  • db3a06dd81 removed cookie handling in httpc: - no need to do cookie handling in proxy, this was switched off so far - no need for cookies in crawler, this was switched on (by mistake) This fix was needed for a case where a web server flooded the crawler with cookies and caused a complete blocking of the httpc. orbiter 2009-06-10 16:11:09 +00:00
  • 1c54ae4a63 some small changes in HandleMap Testing orbiter 2009-06-10 15:02:52 +00:00
  • b21e9149f5 another fix for navigation results, the json result format and searches with yacyinteractive orbiter 2009-06-10 12:41:15 +00:00
  • 15c5406b9c fixed yacyinteractive orbiter 2009-06-10 07:24:45 +00:00
  • 2c5554c912 small enhancements in search result computation speed orbiter 2009-06-09 15:22:23 +00:00
  • e0b3984805 added navigation keys for site and author facets to remote search interface orbiter 2009-06-09 09:07:52 +00:00
  • 27fa6a66ad - completed the author navigation - removed some unused variables orbiter 2009-06-08 23:30:12 +00:00
  • a9a8b8d161 - added display of author navigation (usage of that navigator not yet implemented) - added a synchronization in pdf parser which should help to avoid deadlocks that occur when displaying several search results pointing to pdf sources - fixed smaller bugs in navigation orbiter 2009-06-08 22:01:26 +00:00
  • c879783008 added steering of navigator computation: - by default the navigator computation is off for servlet yacysearch.html, but: - the servlet is called by default with an option to switch navigator results on. This prevents metasearch users from getting slow results caused by unnecessary computations orbiter 2009-06-07 22:51:15 +00:00
  • c079b18ee7 - refactoring of IntegerHandleIndex and LongHandleIndex: both classes have been merged into the new HandleMap class, which handles (key<byte[]>,n-byte-long) pairs with arbitrary key and value length. This will be useful to get a memory-enhanced/minimized database table indexing. - added an analysis method that counts bytes that could be saved in case the new HandleMap can be applied in the most efficient way. Look for the log messages beginning with "HeapReader saturation": in most cases we could save about 30% RAM! - removed the old FlexTable database structure. It was not used any more. - removed memory statistics in PerformanceMemory about flex tables and node caches (node caches were used by Tree Tables, which are also not used any more) - added a stub for a steering of navigation functions. That should help to switch off navigation computation in cases where it is not demanded by a client orbiter 2009-06-07 21:48:01 +00:00
  • bead0006da replaced tmp file extensions by prt orbiter 2009-06-06 18:09:58 +00:00
  • 3189f9cd39 fixed problem with DCEntry initialization orbiter 2009-06-06 18:00:50 +00:00
  • a704d82280 patch for problem with digest orbiter 2009-06-06 16:53:16 +00:00
  • 3029ef6eb3 fixed a bug that was recently inserted which caused that no idx and gap files were written. orbiter 2009-06-06 16:43:58 +00:00
  • b6e274f211 omit most of forced crawl delays by using a separate delay table which flushes delayed URLs at the correct time orbiter 2009-06-06 16:20:27 +00:00
  • d50be59088 - added an automatic re-construction of the domain stack after 10 minutes. This then adds urls to the domain stack that were left over because of stack size limitations when the domain stack was last created - changed the busy sleep time for the crawl thread to 30 milliseconds. This is sufficient to crawl with 2000 PPM. orbiter 2009-06-06 09:34:44 +00:00
  • 5fdba0fa51 - fixed a non-working selection rule in balancer - more security about crawl-delay, be more fail-safe - better logging in case of long forced crawl-delays orbiter 2009-06-06 08:46:59 +00:00
  • f5602404d5 another speed boost for the balancer orbiter 2009-06-06 02:37:04 +00:00
  • 95e8cbd1c3 new fully redesigned balancer and bugfixes regarding lost profile handles and killed crawls orbiter 2009-06-06 01:56:31 +00:00
  • c062385552 fix for http://forum.yacy-websuche.de/viewtopic.php?p=15555#p15555 orbiter 2009-06-05 18:18:16 +00:00
  • 42ae40b9f6 some bugfixes to database close() methods orbiter 2009-06-04 22:43:46 +00:00
  • a0c53abbe1 - wait until local results are computed during search, see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=2167&hilit=&p=15521#p15521 - show only x+1 pages in page navigator orbiter 2009-06-04 20:58:47 +00:00
  • 94f3d90af2 added a hint about regular expressions in crawl start orbiter 2009-06-04 20:03:26 +00:00
  • 9bfd22f65d fix for http://forum.yacy-websuche.de/viewtopic.php?p=15523#p15523 orbiter 2009-06-04 19:57:25 +00:00
  • 1c77db670f re-designed response format for navigation: - changed json and rss response templates orbiter 2009-06-04 10:54:49 +00:00
  • 15fad767c0 some refactoring of topic generation orbiter 2009-06-03 23:49:06 +00:00
  • f28f62fb21 added servlet for easy wiki content and search window integration orbiter 2009-06-03 22:22:20 +00:00
  • efe97f446a better proxy configuration in case of remote proxy lotus 2009-06-03 19:03:03 +00:00
  • cc49aedf12 - fixed problem with remote search NPE - more abstraction for search requests orbiter 2009-06-03 08:49:54 +00:00
  • 9e18abc2ac * fix charset detection, see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=2137 * why has this been uncommented??? f1ori 2009-06-02 20:54:13 +00:00
  • c38c852090 modified access method to get index entries out of an array of BLOBs: iterate them, then merge, rather than collect them and then merge. This should use less memory and may behave better in an environment with many queries. To ensure that too many queries will not cause total blocking, a time-out of one second was also added. After the time-out the index data that was collected so far is returned. orbiter 2009-06-02 16:53:45 +00:00
  • 55ff919b5d - yacysearchtrailer.html ... just an idea for a timeline apfelmaennchen 2009-06-02 16:47:39 +00:00
  • ab06a6edd2 renamed topwords to topics and enhanced computation methods of topics. Topics will now only be computed using the document title, not the document url, because the host navigator is now responsible for statistical effects of urls. orbiter 2009-06-02 15:20:10 +00:00
  • 61d9e131b4 better/new proxy auto config lotus 2009-06-02 12:18:29 +00:00
  • 0d44a6d503 - yacy portalsearch experiments with navigation in sidebar (topwords & domains) - not yet functional ... apfelmaennchen 2009-06-02 11:02:36 +00:00
  • 9f9a1b4ad8 - yacysearchtrailer.html small temporary work around for jquery-css display bug apfelmaennchen 2009-06-02 09:08:34 +00:00
  • b0e2d854e0 - fixed sidebar for yacysearch.html & yacysearchtrailer.html - @orbiter: please do not use <h2> or <h1> tags in the context of the sidebar!!! apfelmaennchen 2009-06-02 07:32:45 +00:00
  • a5d481eab1 enhanced navigation - fixed too early computation of navigation - moved navigation rendering to yacysearchtrailer - added more asserts orbiter 2009-06-01 22:45:28 +00:00
  • 3ca1f109c4 added more jquery themes orbiter 2009-06-01 21:49:18 +00:00
  • 3ea399ec91 fix for absolute paths for repository path orbiter 2009-06-01 10:54:41 +00:00
  • 6b92155eb6 corrected spelling lotus 2009-06-01 09:48:08 +00:00
  • 5eac607166 fixed configuration of repository path orbiter 2009-06-01 00:13:23 +00:00
  • daee735ad7 - fix for yacysearch.html - navigation/sidebar JavaScript is now also triggered by #(navigation)# instead of display=3 apfelmaennchen 2009-05-31 07:29:24 +00:00
  • 8fe69da2bb - some fixes for prev. post - better resizing and dragging apfelmaennchen 2009-05-30 17:06:47 +00:00
  • 0eb3bffe97 - added 'drawer' (sidebar) for future navigational items to yacyui-portalsearch.js - http://forum.yacy-websuche.de/viewtopic.php?f=9&t=1904#p15311 apfelmaennchen 2009-05-30 14:32:29 +00:00
  • 7639ec2f38 - fixed letter case bug for dc record creation - dc parser is now lazy against letter cases orbiter 2009-05-29 15:09:37 +00:00
  • 34af8b4877 - yacysearch.html compromise for positioning the sidebar - position is now fixed on the right top side - should scale down to window width 800px, smaller windows will cause distortions - see http://forum.yacy-websuche.de/viewtopic.php?f=9&t=1904#p15293 apfelmaennchen 2009-05-29 14:57:09 +00:00
  • 4522c13ee7 added option for a table prefix when importing phpbb3 orbiter 2009-05-29 14:29:02 +00:00
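
Note on commit b8bb1bb364 (join with a timeout): the behaviour described there can be reproduced with plain java.lang.Thread. The following is only a minimal sketch and not the code from that commit. The names JoinTimeoutSketch, Worker, joinThenStop and demo are hypothetical, and using interrupt() as the stop signal is an assumption, since the commit message does not say which signal was used.

```java
final class JoinTimeoutSketch {

    /** Hypothetical worker that keeps running until it is interrupted. */
    static final class Worker extends Thread {
        @Override
        public void run() {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    Thread.sleep(10); // stand-in for one unit of real work
                } catch (InterruptedException e) {
                    // sleep() clears the interrupt flag; restore it so the loop ends
                    Thread.currentThread().interrupt();
                }
            }
        }
    }

    /**
     * Waits at most timeoutMillis for t, then signals it to stop.
     * Returns true if t was still alive when the timed join returned.
     */
    static boolean joinThenStop(Thread t, long timeoutMillis) {
        try {
            t.join(timeoutMillis);            // only the waiting ends here
            boolean stillRunning = t.isAlive();
            if (stillRunning) {
                t.interrupt();                // the additional stop signal
                t.join();                     // wait for the actual termination
            }
            return stillRunning;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return t.isAlive();
        }
    }

    /** Returns {alive after the timed join, alive after the stop signal}. */
    static boolean[] demo() {
        Worker w = new Worker();
        w.start();
        boolean aliveAfterTimedJoin = joinThenStop(w, 50);
        return new boolean[] { aliveAfterTimedJoin, w.isAlive() };
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("alive after timed join: " + r[0]
                + ", alive after stop signal: " + r[1]);
    }
}
```

The timed join() only bounds how long the caller waits. The worker keeps running until it sees the interrupt, which is why the sketch joins a second time after signalling.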