Commit Graph

  • d8fca85c11 fixed search: allow dots in operators; added new operator "tld:", which was the former "site:"; "site:" uses fast site operator introduced in r5770 lotus 2009-04-28 17:12:31 +00:00
  • 8d6212233b fix for IODispatcher orbiter 2009-04-28 07:24:28 +00:00
  • f678472f46 fix for quote problem in json output orbiter 2009-04-27 22:27:02 +00:00
  • d079d6dfdb small changes in surrogate reader, wiki code and portal test orbiter 2009-04-27 20:30:43 +00:00
  • 07f09742bb set of small fixes and comments orbiter 2009-04-27 15:29:50 +00:00
  • d9e62508e5 fixed a NPE bug (from SVN 5888) orbiter 2009-04-27 12:52:12 +00:00
  • 06ed4ef7b3 * better picture handling borg-0300 2009-04-27 11:19:15 +00:00
  • 5a634cab23 removed generation of anchor link sets in document types that describe container formats. orbiter 2009-04-27 08:46:11 +00:00
  • 50e96ee894 - moved live search window in yacy to different location - changed div id for live search from 'yacy' to 'yacylivesearch' orbiter 2009-04-26 21:04:31 +00:00
  • dd489fcce0 fixed some json encoding problems in yacysearchitem apfelmaennchen 2009-04-26 19:17:36 +00:00
  • 0b4e4bbddd fixed css interference for previous posts apfelmaennchen 2009-04-26 18:46:05 +00:00
  • ed4fb3bf75 small fix to portal search apfelmaennchen 2009-04-26 18:30:21 +00:00
  • 753affca4b - request: http://forum.yacy-websuche.de/viewtopic.php?f=9&t=2041#p14336 - integrated portal search in yacy web interface apfelmaennchen 2009-04-26 18:22:28 +00:00
  • 4cf8b08eec Portal Search: - request: http://forum.yacy-websuche.de/viewtopic.php?f=15&t=1762&start=50#p14350 - window closes for empty query - example for fancy input field apfelmaennchen 2009-04-26 17:44:11 +00:00
  • 42d936288e small update for RichClient LogViewer apfelmaennchen 2009-04-26 17:20:08 +00:00
  • f1244264b8 *) hopefully fixed bug reported in http://forum.yacy-websuche.de/viewtopic.php?t=2057 low012 2009-04-26 16:18:14 +00:00
  • 2714ff034b avoid undefined in rssTerminal thanks to freq.9! http://forum.yacy-websuche.de/viewtopic.php?p=14288 lotus 2009-04-26 11:45:36 +00:00
  • 2e3186189b fix for mediawikiIndex surrogate producer + added concurrency orbiter 2009-04-25 21:52:21 +00:00
  • 6f5ea7b1a8 small fix for previous post apfelmaennchen 2009-04-25 21:28:08 +00:00
  • 2eabd989ce - added a log viewer to RichClient (alpha version, very slow) apfelmaennchen 2009-04-25 20:58:56 +00:00
  • 138a0747e3 added serverObjects.putJSON as JSON has very particular encoding requirements apfelmaennchen 2009-04-25 20:56:29 +00:00
  • 557c2a32a3 small fix for yacyui-portalsearch apfelmaennchen 2009-04-25 16:59:35 +00:00
  • b4539a61dd some more documentation for yacyui-portaltest.html apfelmaennchen 2009-04-25 15:14:10 +00:00
  • 64a63306b8 added portal-test explanation page to the customization submenu orbiter 2009-04-25 13:18:13 +00:00
  • 64ce9da60f - new yconf parameter global - see http://forum.yacy-websuche.de/posting.php?mode=quote&f=9&p=14207#pr14207 apfelmaennchen 2009-04-25 13:08:07 +00:00
  • 5ca306da9a fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=2054#p14306 apfelmaennchen 2009-04-25 12:24:40 +00:00
  • 9325198c42 hopefully a fix for http://forum.yacy-websuche.de/viewtopic.php?f=15&t=1762#p14305 apfelmaennchen 2009-04-25 08:15:27 +00:00
  • d977dd9a96 fix for surrogate loader orbiter 2009-04-24 22:54:40 +00:00
  • 675f350d18 YaCy Portal Search Widget - see http://localhost:8080/yacy/ui/yacyui-portaltest.html - two new parameters (logo and link) for yconf as requested at http://forum.yacy-websuche.de/viewtopic.php?f=15&t=1762#p14101 apfelmaennchen 2009-04-24 17:21:31 +00:00
  • 9cb68353da fix for bug in ProfilingGraph for ppm >> 10000 ppm (!) orbiter 2009-04-24 13:18:20 +00:00
  • 9e4db75aac reduced internal logging and reduced memory that internal logging can use orbiter 2009-04-24 12:09:04 +00:00
  • c10c257255 attempt to fix a deadlock situation where the IODispatcher did not work. I suspect the dispatcher thread has crashed and queues filled so no indexing process was able to write data. This fix tries to heal the problem, but I am unsure if it helps. To get a better view of the problem, some more log outputs have been inserted. Also added a new attribute indexer.threads to control the number of default threads for the indexer (default is 1) orbiter 2009-04-24 11:55:39 +00:00
  • 09987e93fd fixed some more bad handling of byte[] orbiter 2009-04-23 22:02:12 +00:00
  • 1bcc1450cb more explanatory error message in case of IOExceptions during html parsing orbiter 2009-04-23 21:18:01 +00:00
  • fe51f4d668 less synchronization may help to prevent deadlocks orbiter 2009-04-23 20:54:13 +00:00
  • 58802e4201 added missing success test in storeDocumentIndex, see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1922&hilit= orbiter 2009-04-23 20:38:56 +00:00
  • fbca4f8354 more stability on watchcrawler lotus 2009-04-23 18:42:15 +00:00
  • 171e62bee5 addition to the fix from last commit (which did not work) orbiter 2009-04-23 16:36:21 +00:00
  • 059949a0d1 tried to fix problem with snippet fetch for second search page when verify=false orbiter 2009-04-23 15:29:30 +00:00
  • b08991e278 moved some constants, rename of Tray class lotus 2009-04-23 13:18:59 +00:00
  • 54773ad4d4 added release keys orbiter 2009-04-22 22:46:42 +00:00
  • 138422990a - removed useCell option: the indexCell data structure is now the default index structure; old collection data is still migrated - added some debugging output to balancer to find a bug - removed unused classes for index collection handling - changed some default values for the process handling: more memory needed to prevent OOM orbiter 2009-04-22 22:39:12 +00:00
  • 1b9e532c87 some concurrency for wikipedia dump reader orbiter 2009-04-22 17:43:27 +00:00
  • dec495ac78 added dummy class for help page see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=2033&hilit=&p=14107#p14107 orbiter 2009-04-22 13:59:20 +00:00
  • 25d2160288 small fix lotus 2009-04-22 13:19:37 +00:00
  • daea87d436 do not accept dht from bad versions; delete bad hashes on receive lotus 2009-04-22 12:13:37 +00:00
  • 16baa7ad24 To translate a mediawiki dump into the YaCy surrogate format do the following: - download a wikipedia dump, i.e. dewiki-20090311-pages-articles.xml.bz2 from http://download.wikimedia.org/dewiki/20090311/ - move dewiki-20090311-pages-articles.xml.bz2 to DATA/HTCACHE/ - start the conversion; open a command shell, move to the yacy home directory and execute java -Xmx2000m -cp classes:lib/bzip2.jar de.anomic.tools.mediawikiIndex -convert DATA/HTCACHE/dewiki-20090311-pages-articles.xml.bz2 DATA/SURROGATES/in/ http://de.wikipedia.org/wiki/ orbiter 2009-04-21 22:12:19 +00:00
  • 0b2c98edc9 some more work on the wikipedia-dump exporter (not finished yet) orbiter 2009-04-21 15:19:32 +00:00
  • 5195c94838 two patches for performance enhancements of the index handover process from documents to the index cache: - one word prototype is generated for each document, that is re-used when a specific word is stored. - the index cache now uses ByteArray objects to reference the RWI instead of byte[]. This enhances access to the map that stores the cache. To dump the cache to the FS, the content must be sorted, but sorting takes less time than maintenance of a sorted map during caching. orbiter 2009-04-21 14:23:04 +00:00
  • 06c878ed11 moved update_key to correct position in file lulabad 2009-04-21 10:15:12 +00:00
  • 9416f5c26f more speed test cases: kelondro provides map functions that are more than 20% faster than standard java classes and use less than half of the memory of java classes: just start IndexTest (here with 1000000 test objects) orbiter 2009-04-21 09:29:08 +00:00
  • b53790abb1 more performance hacks: 10% more speed for Base64.compare() which is really often used in YaCy code orbiter 2009-04-21 07:39:21 +00:00
  • 8ffb9889e1 some fixes and performance hacks orbiter 2009-04-20 23:01:44 +00:00
  • dfb96ecb72 more fixes orbiter 2009-04-20 22:08:38 +00:00
  • 1b8d346b4c fixes in connection with transition to byte[] hashes orbiter 2009-04-20 21:54:00 +00:00
  • 0b0a46d35a * fix transferRWI as suggested by celle (thanks!) see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=2000#p14023 f1ori 2009-04-20 19:51:20 +00:00
  • 996572de95 quickfix orbiter 2009-04-20 16:11:35 +00:00
  • 380ed2dac0 performance and debugging additions orbiter 2009-04-20 15:01:43 +00:00
  • 635b0a9da7 code-split; allow cgi indexing lotus 2009-04-20 13:28:28 +00:00
  • e7559f3234 fix for http://forum.yacy-websuche.de/viewtopic.php?p=13977#p13977 orbiter 2009-04-20 10:06:55 +00:00
  • fa3adbbfc6 added domain checks to surrogate reader and RWI transfer receiver to prevent spamming using surrogates orbiter 2009-04-20 06:38:28 +00:00
  • 76af84d732 * add custom comparator to ScoreCluster for byte[] * fixes http://forum.yacy-websuche.de/viewtopic.php?f=6&t=2010 f1ori 2009-04-19 20:01:46 +00:00
  • 31c6934df2 *) fix for r5832 low012 2009-04-19 07:40:23 +00:00
  • ab0030d7a7 allow dht-out for remote-crawl processing peers on default settings lotus 2009-04-18 20:04:01 +00:00
  • 616a4d724f high-end favicon with 2 versions: * true color + alpha channel for modern browsers * 256 colors and non-transparent background for others lotus 2009-04-18 18:37:26 +00:00
  • d1116c049f *) added new method "contains()" to Blacklist interface *) implemented contains() in class AbstractBlacklist *) used new method in Blacklist_p to prevent double entries in blacklists low012 2009-04-18 16:27:17 +00:00
  • 08445e42f0 * don't throw exception, in case of bad charset in http-header f1ori 2009-04-18 15:38:29 +00:00
  • 2f860a2564 * convert byte[] hashes to string for log output f1ori 2009-04-18 14:35:18 +00:00
  • 94a6c83256 * rewrite code without using java 1.6 features f1ori 2009-04-17 15:54:44 +00:00
  • d93a2a6552 * ignore whitespaces so you can copy&paste signatures better f1ori 2009-04-17 14:52:42 +00:00
  • fadf311b97 added sign key for yacystats updates lulabad 2009-04-17 14:32:08 +00:00
  • fbcbcc5bdb export of yacy document objects as dublin core record in xml orbiter 2009-04-17 14:20:12 +00:00
  • d7cbf4cdd4 more performance hacks: less overhead in word hash computation orbiter 2009-04-17 13:47:06 +00:00
  • 29e96c1a60 bugfixes and performance hacks orbiter 2009-04-17 13:04:56 +00:00
  • 4e97a31009 corrections in dublin core syntax orbiter 2009-04-17 12:23:00 +00:00
  • 44daec7936 * introduce signatures to autoupdate; as long as no public keys are set for the update locations, no signatures are checked * wiki-article follows... f1ori 2009-04-17 09:58:06 +00:00
  • 538e375901 replaced old caching method for computed word hashes with a better method. The word hash computation is a new performance bottleneck (after the IO bottleneck was removed with the IndexCell data structure) and a better caching for word hashes was necessary. orbiter 2009-04-17 09:26:16 +00:00
  • 9e853e1977 partly reverting SVN 5818: identical comparator required for join operator orbiter 2009-04-17 08:18:01 +00:00
  • e16c25ddf7 (peak-) performance hacks orbiter 2009-04-16 22:45:39 +00:00
  • 63cd152969 fixes orbiter 2009-04-16 22:18:35 +00:00
  • 7dfe7e7cc6 fixed some problems with surrogate reader. This is now ready for testing. orbiter 2009-04-16 21:29:41 +00:00
  • 3a1364ed5c removed example lines from SurrogateReader sources; added additional example file orbiter 2009-04-16 21:05:34 +00:00
  • 9050a3c4c5 alpha version of surrogate reading and indexing. see the example file for an explanation. orbiter 2009-04-16 20:47:55 +00:00
  • 870066ab35 another fix orbiter 2009-04-16 20:23:20 +00:00
  • b15b059c0d fix for latest commit orbiter 2009-04-16 19:53:21 +00:00
  • c8624903c6 full redesign of index access data model: terms (words) are no longer retrieved by their word hash string, but by a byte[] containing the word hash. This has strong advantages when RWIs are sorted in the ReferenceContainer Cache and compared with the sun.java TreeMap method, which needed getBytes() and new String() transformations before. Many thousands of such conversions are now omitted every second, which increases the indexing speed by a factor of two. orbiter 2009-04-16 15:29:00 +00:00
  • dd6b5005ff * fix missing charset handling in getpageinfo_p f1ori 2009-04-16 12:31:28 +00:00
  • bd5f4c78d8 - added default profile for surrogate indexing - integrated surrogate indexing into indexing queue process orbiter 2009-04-16 08:01:38 +00:00
  • ad78e3a59f - less lines in rssTerminal - crawl more documents: if remote crawling is enabled, a remote crawl list is also loaded if a local crawl is running in case that the indexer is idle orbiter 2009-04-15 23:07:51 +00:00
  • bc80dc913a added new surrogate reader (surrogates are parsed documents in batches); this will open a new way to insert indexes into YaCy (instead of crawling) orbiter 2009-04-15 15:30:25 +00:00
  • 12d81e98eb - fixed bad search results when searching for empty string - simplified result handling and page composition in case that nothing was searched orbiter 2009-04-15 11:22:43 +00:00
  • 8a24350036 - fix for join method with new generalized RWI data structure (caused by latest commit) - added more functions to mediawiki parser orbiter 2009-04-15 10:26:24 +00:00
  • e58320a507 added more info in log for debugging orbiter 2009-04-15 07:37:36 +00:00
  • 89ec3acb3e - full abstraction of index content type: the kelondro full text index may now also contain indexes about other content than text, i.e. navigation indexes or reverse linking indexes. - during index joins all word positions are maintained: better ranking for word distance possible; exact phrase match can be implemented soundly orbiter 2009-04-15 06:34:27 +00:00
  • 7a48090fcf - fix for "uk" language - svn attributes added borg-0300 2009-04-14 11:40:44 +00:00
  • dc2af61bc9 allow up to 50 results from remote peers orbiter 2009-04-13 21:47:57 +00:00
  • c0e8ed5461 fixed problem with not http client orbiter 2009-04-13 21:21:47 +00:00
  • 6504b21cea *) fix for http://forum.yacy-websuche.de/viewtopic.php?t=1976 low012 2009-04-13 09:22:11 +00:00
  • 8862a2fed0 oops orbiter 2009-04-12 10:22:21 +00:00
  • de68948bc5 better handling of free memory computation and emergency cache flush for index cell orbiter 2009-04-12 09:24:32 +00:00