Commit Graph

  • 4bea3f9714 hack to reduce resource contention caused by massive UTF8 decodings which use java.nio resources: used an ASCII String <-> byte[] conversion wherever possible. Many Strings in YaCy are hashes which are pure ASCII (base64 hashes). The new ASCII String <-> byte[] conversion methods have less computation overhead than the UTF8 conversion. orbiter 2011-05-27 08:24:54 +00:00
  • 746e3c3b06 Replaced a widely-used Property Object in the httpd with HashMap<String, Object>, which is not synchronized like Properties. Synchronization is not needed here and adds an overhead to the httpd process, which is now removed. orbiter 2011-05-26 16:34:35 +00:00
  • cc239b18cd fix for IPv6 localhost proxy client orbiter 2011-05-26 16:24:11 +00:00
  • fcb7525047 * add .gitignore to svn, so it doesn't get lost on git synchronization f1ori 2011-05-26 16:13:07 +00:00
  • 14e1666b21 * fix replacing regexes in url proxy f1ori 2011-05-26 16:09:29 +00:00
  • e28bd0d038 fix for some possible causes of memory leaks orbiter 2011-05-26 14:35:32 +00:00
  • 09ba6814c0 - non-blocking word hash computation with dynamic digest object generation (this was important!) - (very) small performance enhancement in did-you-mean orbiter 2011-05-26 12:58:11 +00:00
  • 8d9b5dda3b disabled did-you-mean computation for json and rss search results where this info is not used orbiter 2011-05-26 12:35:24 +00:00
  • 10e2f588f8 - enhanced ybr ranking computation - many speed/performance hacks - added solr sharding and new sharding web interface - added option to switch off the yacy index when using solr - added new fail-url categories which are used to make a distinction which fail-urls to be sent to solr - refactoring/renaming of some method names to distinguish host/url hashes better - a large number of bug/npe fixes orbiter 2011-05-26 10:57:02 +00:00
  • bd55dcee50 - commented out experimental distributed ranking loading - less threads for blocking threads - disable all threads for DHT transmission for networks with zero peers orbiter 2011-05-24 21:08:01 +00:00
  • 98c4d25185 fix for endless loop in FTP crawling, see http://bugs.yacy.net/view.php?id=32 orbiter 2011-05-24 10:06:20 +00:00
  • d1dbbd956a always use a template method cache even if the template cache flag is set to false. This flag is only used to make dynamic updates to the template files, not to make dynamic updates to the rewrite methods (which is not possible without recompiling). low memory usage is guaranteed by the usage of soft references which are dropped before an OOM is thrown orbiter 2011-05-24 09:31:07 +00:00
  • 0d040ff6bb fix for bug 0000036: no crawling of https pages orbiter 2011-05-24 09:14:32 +00:00
  • 3ed4a09368 small features, some bug fixes and performance hacks orbiter 2011-05-23 21:08:04 +00:00
  • e55c254f7b enhanced logging orbiter 2011-05-22 20:12:13 +00:00
  • 3ec94d87c4 show dom counter only for active crawls where the dom counter is enabled within the crawl profile orbiter 2011-05-22 19:34:20 +00:00
  • e3ee43e6ed these YBR files are not needed any more orbiter 2011-05-18 14:27:24 +00:00
  • b45701d20f this is a re-implementation of the YaCy Block Rank feature. This time it works like this: - each peer provides its ranking information using the yacy/idx.json servlet - peers with more than 1 GB ram will load this information from all other peers, combine that into one ranking table and store it locally. This happens during the start-up of the peer concurrently. The new generated file with the ranking information is at DATA/INDEX/<network>/QUEUES/hostIndex.blob - this index is then computed to generate a new fresh ranking table. Peers which can calculate their own ranking table will do that every start-up to get latest feature updates until the feature is stable - I computed new ranking tables as part of the distribution and commit it here also - the YBR feature must be enabled manually by setting the YBR value in the ranking servlet to level 15. A default configuration for that is also in the commit but it does not affect your current installation, only fresh peers - a recursive block rank refinement is implemented but disabled at this point. it needs more testing orbiter 2011-05-18 14:26:28 +00:00
  • d27a0a67ff fix in log initialization according to hint from Dominic orbiter 2011-05-17 15:53:59 +00:00
  • 205cc75157 abstraction of surrogate main element (xmlns:geo was missing for wiki extracts) orbiter 2011-05-17 08:57:49 +00:00
  • 021840e5ba removed (almost) deadlocks and unnecessary CPU load orbiter 2011-05-17 00:00:01 +00:00
  • 3d879e0995 test file not needed orbiter 2011-05-16 23:19:11 +00:00
  • 123375bfba added a new yacy protocol servlet 'idx'. This returns an index to one of the data entities that is stored in YaCy. This servlet currently only serves for indexes to the web structure hosts. It can be tested by calling http://localhost:8090/yacy/idx.json?object=host This yacy protocol servlet is the first one that returns JSON code and that also shows index entries in a readable format. This will make the development of API applications much easier. This is also an example implementation for possible json versions of the other existing YaCy protocol interfaces. orbiter 2011-05-15 22:57:31 +00:00
  • d326f1486a added timeout setting to scanner interface orbiter 2011-05-14 11:30:41 +00:00
  • f0d5ddfa92 *) preventing potential NPE which occurred if user deleted DATA/RELEASE manually and opened ConfigureUpdate_p.java then low012 2011-05-14 09:23:19 +00:00
  • 5c981762c6 added bigrange option for network scan orbiter 2011-05-14 09:13:16 +00:00
  • c55787d07c *) revert of r7667 low012 2011-05-14 09:03:18 +00:00
  • bade61696f speed-up of network port scanner orbiter 2011-05-14 09:03:16 +00:00
  • b04382bc59 added topmenu as defined for search to wiki orbiter 2011-05-14 08:29:16 +00:00
  • 229df8b626 restart link after memory changed lotus 2011-05-13 17:24:03 +00:00
  • 1d8b0f74f4 one more fix for SVN 7713 orbiter 2011-05-13 15:31:24 +00:00
  • 0960261769 fix for svn 7713 orbiter 2011-05-13 15:20:57 +00:00
  • 7e368000c8 transparent progress bar apfelmaennchen 2011-05-13 13:40:23 +00:00
  • 5b579e21a3 code cleanup orbiter 2011-05-13 06:21:40 +00:00
  • fcd4b03892 show progress of search after display of results is finished orbiter 2011-05-13 06:20:00 +00:00
  • 8b63d7637d revert 7710, lotus 2011-05-09 20:06:05 +00:00
  • 965aac5ebb * proxy works almost Florian Richter 2011-05-09 18:20:34 +02:00
  • 440e3ba887 Windows Installer: - remove firewall-handling for WinXP (can only open for JRE not for special port) - Vista/Win 7: open port 1900 for communication with router (uPnP) pca 2011-05-09 06:16:35 +00:00
  • 039126cfaf better handling of on/off switched solr indexing orbiter 2011-05-08 22:47:20 +00:00
  • dc54915df4 fix for very bad compare orbiter 2011-05-08 08:45:58 +00:00
  • f123dbec79 fix in heuristics config lotus 2011-05-07 18:52:20 +00:00
  • 897b4e8b9c another hack to prevent black images orbiter 2011-05-07 07:45:02 +00:00
  • 9248a4eef4 reduce the effect of 'Bildersuche findet generierte HTML-Seiten als Bilder' (image search finds generated HTML pages as images) see http://bugs.yacy.net/view.php?id=9 orbiter 2011-05-07 07:37:46 +00:00
  • 0621a15f89 fix for wrong search result counter: added a counter for all filtered out entities see also http://bugs.yacy.net/view.php?id=5 orbiter 2011-05-06 23:04:27 +00:00
  • 61c9a791c4 YMarks: sidebar with tabs for tags and folders apfelmaennchen 2011-05-06 21:36:35 +00:00
  • 9c33b2fb58 fix for String Matcher in case that no snippet is returned (NPE) orbiter 2011-05-05 23:11:03 +00:00
  • 76f2817e00 a fix for the snippet computation and hopefully better snippets orbiter 2011-05-05 23:05:38 +00:00
  • deda54d684 - relaxed matching of string-search (this is now case-insensitive) - added transport of string-search pattern to remote search protocol - fixed a problem parsing snippets with a '-' inside orbiter 2011-05-05 22:37:06 +00:00
  • 8fd4e8ea98 proper jre version (without -s in filename) lotus 2011-05-05 20:03:27 +00:00
  • 15e3a57b4e removed unused functions in condenser orbiter 2011-05-05 09:23:10 +00:00
  • 6e42d4de88 - added full-String search function: find things that match exactly what is quoted in the query - re-structuring authentication methods to fix a problem with API steering orbiter 2011-05-05 00:25:14 +00:00
  • 8e10b82280 small fix for solr export orbiter 2011-05-03 22:21:45 +00:00
  • 8b8db2aaba YMarks: some small changes/fixes apfelmaennchen 2011-05-03 21:21:06 +00:00
  • 441035f1f4 YMarks: some improvements to flexigrid quick search on YMarks.html apfelmaennchen 2011-05-02 20:11:58 +00:00
  • 6fa439c82b - refactoring of robots - added option to crawler to send error-URLs to solr - changed solr scheme slightly (no multi-value fields where no multi values are) orbiter 2011-05-02 14:05:51 +00:00
  • 1ea0bc775c @apfelmaenchen: is this the expected, but forgotten change? Please correct if I'm wrong (this let me build Yacy again) sixcooler 2011-05-02 10:46:05 +00:00
  • e7c2ea193b YMark: - general improvements on importers, especially on auto tagging - added get_tags (needed for tag clouds etc.) - improved flexigrid support - added YMarks.html (not fully working) that will eventually replace Bookmarks.html apfelmaennchen 2011-05-01 21:42:48 +00:00
  • e3d19d0a90 fix in Document inboundlinks/outboundlinks sorting orbiter 2011-05-01 15:49:04 +00:00
  • 5e2d38ef19 Windows Installer: - fix for firewall Vista/Win7 - update to JRE 1.6 u25 - TODO: fix for firewall WinXP and setting for uPnP (Port 1900) pca 2011-04-30 19:32:07 +00:00
  • 4e8fa03514 added more attributes to html evaluation orbiter 2011-04-29 15:36:44 +00:00
  • 3b578a28ef some patches to prevent that empty or bad IP information is broadcasted - on client-side: fix bad IP reports from remote Peers by replacing their reported IP with their server IP if the reported IP is bad, broken or disallowed - on server-side: the same during a peer ping (here the ping'ed server acts also as client during the back-ping) and also when receiving a message or a search where the client sends also its seed. Here the IP is replaced by the client IP if the reported IP is broken or bad orbiter 2011-04-29 10:58:12 +00:00
  • 361841df16 another patch according to http://bugs.yacy.net/view.php?id=26#c36 orbiter 2011-04-29 02:26:50 +00:00
  • 37fede9d30 better logic for proper seed ip recognition and better error messages orbiter 2011-04-29 02:19:13 +00:00
  • 8b95a26866 better magic orbiter 2011-04-29 02:00:37 +00:00
  • 2700a58e5a added a magic to the peer ping that will be used in case that the contacting peer requests that its reported IP shall be used for a back-ping. The back-ping now also returns the same magic which will make it possible that the requested peer can verify that the back-pinged peer is actually the same peer. orbiter 2011-04-29 01:52:20 +00:00
  • 8879cc1db2 removed System.out.println orbiter 2011-04-28 14:08:02 +00:00
  • c493f101c0 added one more script file to release build script orbiter 2011-04-28 13:19:24 +00:00
  • 528da7c9ea removed unused class and added license header for new class orbiter 2011-04-28 13:14:30 +00:00
  • f6077b3cc0 added more attributes for html parser and enhanced data structures orbiter 2011-04-28 13:09:01 +00:00
  • 0b02083e97 * function for simple crawl of one url f1ori 2011-04-28 13:04:33 +00:00
  • d671de8c17 add ranking weight to json-search-results f1ori 2011-04-28 11:18:14 +00:00
  • 4eb9c1e7c3 not setting userAgent from Constructor as default for following calls sixcooler 2011-04-26 17:39:16 +00:00
  • d8e934c085 better abstraction of http client identification orbiter 2011-04-26 13:35:29 +00:00
  • a3e707283d not using HTTPConnector anymore sixcooler 2011-04-26 11:46:31 +00:00
  • 9f1f47ec67 added some comments to explain the isLocal patch orbiter 2011-04-21 21:59:56 +00:00
  • b77b8cac0c - enhanced html parser: recognized much more details in the content - added more properties to solr index - refactoring - more constants in switchboard - fix for some NPEs - recognition of more images - removed synchronization in HandleMap (obviously not necessary?) - added a nolocal configuration to remove excessive dns lookup (works only on allip - default off). Indexes produced with this setting are all flagged with 'local' and are (on purpose) not usable for freeworld because they will be rejected as being local. orbiter 2011-04-21 13:58:49 +00:00
  • bc84d2bc9d *) fixed typo in stop script *) added <u> </u> tags for underlined text in Wiki Code *) minor code changes low012 2011-04-20 22:54:29 +00:00
  • b2281f0b7d YMark: intermediate work towards flexigrid support apfelmaennchen 2011-04-20 22:33:01 +00:00
  • 06d50fd801 *) fixed stupid bug (introduced in r7663 by myself) which caused wrong parsing of Wiki pages low012 2011-04-20 17:27:59 +00:00
  • 60412d2bb3 YMark: - more refactoring >> YMarkEntry - integration of SurrogateReader as bookmark importer - various small bug fixes e.g. get_xbel.xml apfelmaennchen 2011-04-18 21:42:14 +00:00
  • 7c149e0f9d *) ./stopYACY.sh -f kills YaCy in case regular shutdown does not work low012 2011-04-18 19:09:54 +00:00
  • 3d5104d357 - fixed a bug in crawl start with file name (npe in new url) - added deletion of solr index in IndexControlRWIs - added asynchronous adding of large url lists (happens when crawls are started with file) - fixed npe in Image display - replaced language warning with fine logging - added a domain name cache in Domains that helps to speed up the isLocal property (less DNS lookups) - added a new storage class for this new cache: KeyList. The domain key list is stored in DATA/WORK/globalhosts.list - added concurrent solr updates and chunked transfers (50 documents until a commit is done) for high-speed feeding (> 40000 ppm) - fixed a bug in content scraper that chopped off large parts of crawl lists (using crawl start from file) orbiter 2011-04-18 16:11:16 +00:00
  • 08108f0ece fix for http://bugs.yacy.net/view.php?id=12 orbiter 2011-04-17 22:53:15 +00:00
  • fd3baa9025 fix for http://bugs.yacy.net/view.php?id=24 orbiter 2011-04-17 22:37:04 +00:00
  • 2e9694c9e9 *) removed recursion which hopefully prevents exception *) fixed bug in creation of table of content which caused double entries if a page was previewed more than once low012 2011-04-17 21:02:18 +00:00
  • a2e86daae9 YMark: more bug fixes apfelmaennchen 2011-04-16 22:09:50 +00:00
  • 62855f9567 YMark: code clean up and some small fixes apfelmaennchen 2011-04-16 21:19:42 +00:00
  • 667e912b19 YMark: - some improvements to firefox json bookmark importer - test import with: /api/ymarks/test_import.html - view ymarks with: /api/ymarks/test_treeview.html apfelmaennchen 2011-04-16 09:09:33 +00:00
  • 0abd99621c correct slip of click in classpath from last commit - I wonder there are 7658'is around apflemaenchen, please don't take this amiss sixcooler 2011-04-16 03:08:25 +00:00
  • a0e4960a4d YMark: - first attempt for a firefox json bookmark importer - added JSON library json-simple-1.1.jar apfelmaennchen 2011-04-15 20:58:58 +00:00
  • 958ff4778e enhanced location search: search is now done using verify=false (instead of verify=cacheonly) so that many more targets can be found. This showed a bug where no location information was used from the metadata (and other metadata information) if cache=false is requested. The bug was fixed. orbiter 2011-04-15 15:54:19 +00:00
  • 8d63f3b70f just cosmetics - keeping my baby clean :-) sixcooler 2011-04-15 00:48:39 +00:00
  • e402622584 removed httpclient-3.1 (this was added with last commit which was a mistake) the httpclient is required by solrj but no class from solrj is used which references to httpclient-3.1 Instead the YaCy http client library based on the apache http client 4.1 is used using a wrapper class which is in net.yacy.cora.services.federated.solr.SolrHTTPClient orbiter 2011-04-14 20:12:14 +00:00
  • 19fd13d3bc Added federated index storage to solr. YaCy supports now the storage to remote solr indexes. More federated storage (and search) methods may follow. orbiter 2011-04-14 20:05:04 +00:00
  • c17d102bd8 enhanced speed for OrderedScoreMap inc method and size comparison in concurrent environments orbiter 2011-04-13 22:04:23 +00:00
  • b788182954 some enhancements to scoring speed orbiter 2011-04-13 15:17:00 +00:00
  • 13724ddd43 * caching in proxy Florian Richter 2011-04-13 15:28:28 +02:00
  • 01690eab86 fix for mediawiki importer and wikicode parser orbiter 2011-04-13 13:22:27 +00:00
  • c5352e6872 added new SearchResult class (to be used later) orbiter 2011-04-13 06:16:31 +00:00
  • 4c013d9088 more UTF8 getBytes() performance hacks orbiter 2011-04-12 05:02:36 +00:00