Commit Graph

  • 5bfc02ccfb Repair publishThread hermens 2008-05-09 11:14:32 +00:00
  • f42c8cf69c updated terminal and dynamic webstructure applet: can now change when crawl is running orbiter 2008-05-09 00:01:47 +00:00
  • 906c144799 - design update to new terminal and rssTerminal - added terminal to main menu - removed transfer size limitation in server orbiter 2008-05-08 21:33:16 +00:00
  • 7ec01d444a fix for npe orbiter 2008-05-08 20:25:11 +00:00
  • ad0f905124 fix for npe in crawler orbiter 2008-05-08 20:16:19 +00:00
  • ae03a54d23 pdfParser: updated lib, fixed ClassNotFoundException: CMSError danielr 2008-05-08 16:55:45 +00:00
  • 0d3808bd9e minor refactoring danielr 2008-05-08 16:51:01 +00:00
  • 719f5defb1 updated some graphics at new terminal_p orbiter 2008-05-07 23:42:14 +00:00
  • 58830e9b28 added new terminal servlet using current visualization methods and a new one: a processing (processing.org) applet. the new servlet can be found at http://localhost:8080/terminal_p.html ..to be enhanced.. orbiter 2008-05-07 22:49:07 +00:00
  • 9bc56a9edc xss protection lotus 2008-05-07 16:37:13 +00:00
  • b32736762c enhanced rssTerminal - 3 lines possible - distinguishing of private and public data, if not authorized only public data is shown - now shows more events, including local searches in clear text if user is logged in - simplified peer events - better recognition of 'real' new peers - presentation of peer pings from other peers orbiter 2008-05-06 23:05:48 +00:00
  • fbb712c669 refactoring: moved importer classes to crawler and plasma package orbiter 2008-05-06 13:44:38 +00:00
  • ee81ff4ef4 added crawler target directory for build orbiter 2008-05-06 00:37:51 +00:00
  • 1689030ee8 refactoring: moved all crawler classes into their own package orbiter 2008-05-06 00:32:41 +00:00
  • fe4871ac02 removed empty package orbiter 2008-05-05 23:54:00 +00:00
  • 3082edfdbc oops orbiter 2008-05-05 23:19:49 +00:00
  • d2ba1fd2ab major step forward to network switching (target is an easy switch to intranet or other networks .. and back). This change is inspired by the need to see a network connected to the index it creates in an indexing team. It is not possible to divide the network and the index. Therefore all control files for the network were moved to the network within the INDEX/<network-name> subfolder. The remaining YACYDB is superfluous and can be deleted. The yacyDB and yacyNews data structures are now part of plasmaWordIndex. Therefore all methods using static access to yacySeedDB had to be rewritten. A special problem was the port forwarding methods, which had been tightly mixed with seed construction. It was not possible to move the port forwarding functions to the place, meaning and usage of plasmaWordIndex. Therefore the port forwarding has been deleted (I guess nobody used it and it can be simulated by methods outside of YaCy). The mySeed.txt is automatically moved to the current network position. A new effect is that every network will create a different local seed file, which is ok, since the seed identifies the peer only against the network (it is the purpose of the seed hash to give a peer a location within the DHT). No other functional change has been made. The next steps to enable network switching are: - shift of crawler tables from PLASMADB into the network (crawls are also network-specific) - possibly shift of plasmaWordIndex code into yacy package (index management is network-specific) - servlet to switch networks orbiter 2008-05-05 23:13:47 +00:00
  • d70a472460 added file for previous commit danielr 2008-05-05 05:19:01 +00:00
  • d32fe84472 added default User-Agent danielr 2008-05-04 17:26:19 +00:00
  • 8c5f062e0b corrected YaCy version in HTTP User-Agent danielr 2008-05-04 12:18:00 +00:00
  • d7b21bc90c re-added gzip POST for transferRWI/URL (HTTP/1.1 compliant) danielr 2008-05-04 10:53:04 +00:00
  • a5a1f19368 * allow to force login for xbel, needed for yacybar f1ori 2008-05-03 11:13:27 +00:00
  • 8d83febb95 *) BlacklistCleaner_p.java reports exception to log instead of System.err *) changes in formatting for better readability in BlacklistCleaner_p.java *) replaced test for necessary Java version (was 1.4.2, is 1.5 now) low012 2008-05-03 10:16:04 +00:00
  • d4bce6affd refactoring (initialized static fields, removed empty if/else, serialized some fields in serializable classes) danielr 2008-05-03 09:06:00 +00:00
  • 19ca452666 updated language file daburna 2008-05-02 20:19:05 +00:00
  • be2c9c07ff escape some unescaped characters in URLs (fixes problems with proxy) danielr 2008-05-02 08:19:47 +00:00
  • d0678f7ab9 refactoring as result of http://forum.yacy-websuche.de/viewtopic.php?f=6&t=959&p=7560#p7560 orbiter 2008-05-01 22:40:42 +00:00
  • 483e9a2066 - shifted tld recognition methods from yacyURL to serverDomains - changed isLocal Property in such a way that it is possible to see if a domain is in the internet (and not intranet) orbiter 2008-04-30 23:06:42 +00:00
  • a3df23659c re-implementation of charset checking orbiter 2008-04-30 13:23:05 +00:00
  • 75a1702133 - fix for ConcurrentModificationException during shutdown - fix for Ranking distribution problem (suma-lab peer does not exist any more) orbiter 2008-04-30 11:19:52 +00:00
  • 27ab0a5f89 fixed XSS problem in ConfigProperties orbiter 2008-04-29 22:47:00 +00:00
  • 32b5b057b9 - modified, simplified old kelondroHTCache object; I believe it should be replaced by something completely new - removed tree data type in kelondroHTCache - added new class kelondroHeap; may be the core for a storage object that will once replace the many-files strategy of kelondroHTCache - removed compatibility mode in indexRAMRI orbiter 2008-04-29 22:31:05 +00:00
  • d3715e02ae removed double/redundant servlet Config_p orbiter 2008-04-29 19:19:14 +00:00
  • ec84a52adb change for problem with NPE (seen as "PROXY Unknown Error while processing request") danielr 2008-04-29 16:06:54 +00:00
  • 5813cc149f fix for bad rssTerminal behavior orbiter 2008-04-28 20:34:37 +00:00
  • 88216c1f1f fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1103&hilit=&p=7362#p7362 orbiter 2008-04-26 22:59:20 +00:00
  • d0b893523e - protection against RAM overflow caused by new peer rss news - more XSS protection orbiter 2008-04-26 22:53:04 +00:00
  • 685794e7e7 fix for parser/encoding Exception see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1111&hilit=&sid=55a320b54e1e3bda9410e7c50b5147f1&p=7431#p7431 orbiter 2008-04-26 22:14:45 +00:00
  • cf042e6957 reverted change by mistake in yacyVersion orbiter 2008-04-26 01:08:59 +00:00
  • 9935e83c86 added new news window into the status page. At this moment it is just a test. The news inside the window are about peer arrivals and departures, remote search accesses and crawls orbiter 2008-04-26 01:00:10 +00:00
  • bac38cfa18 added very rudimentary peer news as rss feed. An example can be retrieved with http://localhost:8080/xml/feed.rss?channel=PEERNEWS to be extended and integrated in interface ... orbiter 2008-04-24 23:30:13 +00:00
  • 6495227ad6 the class rssReader is replaced by RSSReader, RSSFeed and RSSMessage orbiter 2008-04-24 21:45:43 +00:00
  • 724bbdf9b2 refactoring of RSS reader orbiter 2008-04-24 21:31:07 +00:00
  • b9a2a2d287 more search performance hacks orbiter 2008-04-24 15:09:06 +00:00
  • ff755fb858 small corrections and enhancements after search timing profiling; search should be a little bit faster now orbiter 2008-04-24 13:31:55 +00:00
  • 0702dd2507 added a profiling script to analyse search process timing orbiter 2008-04-24 13:28:18 +00:00
  • d0e2830e01 enhanced the thread dump to make it usable for scripted remote-debugging orbiter 2008-04-24 13:25:38 +00:00
  • e024e3b9cf added new default profiles to distinguish snippet fetch for local and global search; the difference is that a local search will not cause a re-indexing of loaded pages orbiter 2008-04-24 08:42:08 +00:00
  • 2c0c8f0f0c SRU compliance according to http://www.loc.gov/standards/sru/specs/search-retrieve.html The example given on this page can be used to retrieve opensearch-compatible rss pages with YaCy orbiter 2008-04-23 16:16:41 +00:00
  • 9b03310f8a I'm awake now :/ danielr 2008-04-23 07:50:21 +00:00
  • 7bd8601f04 delete old releases compatible with java 1.5 ;) danielr 2008-04-23 07:22:20 +00:00
  • e90282da1c added experimental javascript terminal for rss feeds (not used anywhere yet, expect the worst) .. possibly to be used as content for iframes within monitoring pages; not ready yet! orbiter 2008-04-22 23:09:24 +00:00
  • da386a1924 fixed deleteOldDownloads if there are no downloads danielr 2008-04-22 21:36:52 +00:00
  • 21418a22a3 removed DEBUG output danielr 2008-04-22 17:14:34 +00:00
  • 79a3edeeef deleting downloaded releases after x days (default 30) danielr 2008-04-22 16:53:53 +00:00
  • 763f9d4f5d serverCore: setting timeout for new connection before SSLDetect danielr 2008-04-22 09:03:16 +00:00
  • 1995faef8d - refactoring of Collage back-end: move to plasma package - renamed also the plasmaCrawlResults to have a consistent naming for url and image queues - added a double-check for the images - added additional queues for the images: all worse-quality images go there, so the queue can be used also if no sizes are given; no image is lost - added a cleanup for the stacks so they cannot flood the memory orbiter 2008-04-21 22:42:49 +00:00
  • d7e89c2aca fixed near-deadlock situation when deleting crawl profiles orbiter 2008-04-20 22:10:26 +00:00
  • 5e3ce46339 - better logging when rejecting a url because it is not in declared domain - more XSS attack protection orbiter 2008-04-20 21:36:25 +00:00
  • 6d1be66822 - longer refresh rate for reload of WatchCrawler page forwarding to indexing start (does not work in IE) - better names for search pages - Release 0.58 orbiter 2008-04-20 08:10:52 +00:00
  • 2149728227 - major rework on YaCy-UI - search results are retrieved from rss/xml, no other servlet needed - added double accordion sidebar menus apfelmaennchen 2008-04-19 11:31:41 +00:00
  • c270d02176 Reverting SVN 4716 orbiter 2008-04-19 09:58:36 +00:00
  • 48ffd61e6a changed "patched wrong" to warning, so it goes to the logfile danielr 2008-04-19 07:54:44 +00:00
  • 2f629d20a7 - tried to fix the '4217666-problem' - removed more unused code orbiter 2008-04-19 04:24:29 +00:00
  • 512f48e7d6 - removed unused methods - fixed xss attack on peer list in CrawlStartSimple orbiter 2008-04-19 03:33:07 +00:00
  • 14384e7a45 deactivated unnecessary and very CPU-intensive deletion check for blacklisted URLs in index receive orbiter 2008-04-19 03:02:44 +00:00
  • 701f769c66 * removed comma, which caused invalid xml f1ori 2008-04-18 15:07:36 +00:00
  • 3c76342619 - added servlet to configure the search page greeting line - added information output about the current network definition in the network servlet - better description and usage of profile entries in User Profile servlet regarding FOAF format - reformatting of menus at status page orbiter 2008-04-18 13:58:56 +00:00
  • b9602e891a * added CrawlProfileEditor_p.xml for monitoring in yacybar f1ori 2008-04-18 09:13:02 +00:00
  • d03940f2ec - included patch from http://forum.yacy-websuche.de/viewtopic.php?p=7193#p7193 - fixed problem with crawl profile editor after deletion of a crawl profile orbiter 2008-04-17 22:21:03 +00:00
  • d1ee231866 HTTPC close more unused connections danielr 2008-04-15 16:37:51 +00:00
  • 181796cffb - HTTPC: remove ConnectionInfo on exceptions, removed unnecessary code - FTPC: always close (GET) connections on errors danielr 2008-04-15 15:27:32 +00:00
  • 04c1226c80 added/fixed missing integrity-test else-case during deploy in case that we update with a tar file orbiter 2008-04-15 15:20:35 +00:00
  • 6155f0e634 last small changes until main release orbiter 2008-04-14 07:26:33 +00:00
  • 45ae3da7e7 another patch to prevent NPE in EcoTable orbiter 2008-04-14 05:33:32 +00:00
  • cb93ded5c6 applied configuration path patch orbiter 2008-04-14 04:10:51 +00:00
  • 96e39b297a reduced StackTraces (by connect timed out) danielr 2008-04-14 03:50:49 +00:00
  • 93376acdca fixed a bad chunkcache limit check which could have caused ArrayIndexOutOfBoundsExceptions orbiter 2008-04-14 03:49:02 +00:00
  • 1cab240198 patch for possible NPE in EcoTable iterator orbiter 2008-04-14 03:20:37 +00:00
  • 9a32a4c328 fixed concurrentModificationException during hello-process orbiter 2008-04-14 03:04:28 +00:00
  • 64c33e717f caught ConcurrentModificationException in ConnectionInfo.cleanUp so cleanUp is not interrupted danielr 2008-04-14 03:02:44 +00:00
  • 70826bb501 -small update for de.lng daburna 2008-04-14 01:14:44 +00:00
  • d8677ba611 fixed ConcurrentModificationException in HttpConnectionInfos danielr 2008-04-13 11:25:41 +00:00
  • c7021c14bb patch for ArrayIndexOutOfBoundsException in BMP parser (may occur in case of malformed BMPs) orbiter 2008-04-13 03:28:26 +00:00
  • 8dd35f74c8 fixed redirect problem (does not work for POST) see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1068&hilit= orbiter 2008-04-12 16:35:09 +00:00
  • 8313d58ae7 - integrated the collage into the Web Visualization menu - added a counter for the public and private queue on the page (testing..) - fixed wrong public/private categorization orbiter 2008-04-12 15:45:57 +00:00
  • c5d1d7faca undo wrongly committed files danielr 2008-04-12 15:22:57 +00:00
  • 2617f4dcdb Connections_p.html: better formatting and remove very old entries danielr 2008-04-12 15:19:18 +00:00
  • 82bf9ac1c8 - added Collage servlet from datengrab and modified it: * all images are queued * private/public is respected * inserted into switchboard * added collageQueue class that stores all the queued images orbiter 2008-04-12 13:24:21 +00:00
  • 959f448e5f - disabled redirects in proxy (so client sees real path) - added connection stats (only connections currently in use) - remove "old" connections (closed or idle for some time) - synchronized shared parts of proxyHandler danielr 2008-04-12 11:39:48 +00:00
  • 8fe39ebd74 -fixed file transmission with POST. The only usage was in ranking transmission, therefore: -fixed ranking transmission orbiter 2008-04-12 08:12:51 +00:00
  • 82a9861779 fix for last commit orbiter 2008-04-11 12:55:43 +00:00
  • 5d1fbb25e7 fix for bad deploy: - the name of downloaded release files is adapted if the httpc delivers uncompressed tar.gz files (the .gz is removed from the file name) - the deploy method is able to handle tar files (not only tar.gz files) orbiter 2008-04-11 12:37:17 +00:00
  • 202a3adb3e refactoring of HttpClient Writer processes orbiter 2008-04-10 22:47:05 +00:00
  • 8aa9fd8f24 HTTPC with only 1 retry danielr 2008-04-10 16:47:57 +00:00
  • 444dce7e81 more performance hacks orbiter 2008-04-10 15:28:58 +00:00
  • 2c2dcd12a2 - enhanced performance of Eco-Tables: less time-consuming size() - operations - will increase speed of indexing and collection.index creation orbiter 2008-04-10 13:24:55 +00:00
  • e356625b22 - refactoring of stream copy handling to support time-consuming operations - made usage of BufferedStreams explicit to distinguish the different copy methods in serverFileUtils (byte-by-byte and using an own buffer) - introduced another timeout setting (java internal property) - more restrictions to clients accessing a single host (a security setting to prevent DoS by mistake) orbiter 2008-04-10 09:53:07 +00:00
  • f01c50cf8d Proxy logging error (first step to resolution!?) danielr 2008-04-10 06:56:06 +00:00
  • c3342e1178 - removed class with only one static method - removed connection method with too long time-out orbiter 2008-04-09 23:35:20 +00:00