Commit Graph

  • ef7fe537c5 fixed a cache-bug in cachedFileRA orbiter 2009-01-06 10:51:56 +00:00
  • ca80930892 accept leading dots on filetype: and site: search lotus 2009-01-06 10:04:24 +00:00
  • 6c7e83909b - refactoring of data access methods to be prepared for new cell data structure - removed a memory overhead in collections, which prevents OOM exceptions in low-memory configurations orbiter 2009-01-06 09:38:08 +00:00
  • 5ce9d81955 * remove class-file from old location f1ori 2009-01-05 21:19:30 +00:00
  • 1af728ae09 *) regex for site operator changed as proposed by Lotus low012 2009-01-05 18:30:34 +00:00
  • c8451614f3 fix for overflow http://forum.yacy-websuche.de/viewtopic.php?p=11696#p11696 lotus 2009-01-05 18:28:27 +00:00
  • 9e58ae036d *) added site operator which can be used to only show results from a certain domain. example: "test site:edu" shows only documents which contain the word test and which come from an edu domain low012 2009-01-04 14:58:32 +00:00
  • 19e7c56f7f *) apply filter to dir list to only show .black files as blacklists low012 2009-01-04 10:14:19 +00:00
  • c4c4c223b9 fixed a problem with attribute flags on RWI entries that prevented proper selection of index-of constraint orbiter 2009-01-04 02:27:29 +00:00
  • 6072831235 no cr transmission for robinson peers see also: http://forum.yacy-websuche.de/viewtopic.php?p=10290#p10290 orbiter 2009-01-03 23:44:42 +00:00
  • 4bffe664ca *) moved entry field for new expressions to top of the list as requested in forum (http://forum.yacy-websuche.de/viewtopic.php?f=9&t=1678) *) added some Javascript to disable list selection on bottom of list in cases it is not needed (edit, delete) and only enable it if needed (move), if JS is turned off everything will work as usual low012 2009-01-03 10:18:48 +00:00
  • afe98bc11c *) added changes as proposed by Halborinda in http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1674 *) changed indentation low012 2009-01-03 08:24:08 +00:00
  • 07fc115e90 removed active profiling in kelondroRowSet orbiter 2009-01-02 12:33:06 +00:00
  • be4c458951 refactoring (implemented Iterable in kelondroRowCollection) orbiter 2009-01-02 11:38:20 +00:00
  • bb5c2cd12e *) ISINDEX parameters will not be put on commandline anymore to prevent possible security hazards (better safe than sorry). Parameters will have to be read from QUERY_STRING in ISINDEX case too which does not seem to be uncommon behaviour for web servers: http://vms.pdv-systeme.de/users/martinv/cgi_basics/cgi_basics.html#Datenuebergabe low012 2009-01-02 11:18:26 +00:00
  • b6bba18c37 replaced the storing procedure for the index ram cache with a method that generates BLOBHeap-compatible dumps; this is a migration step to support a new method to store the web index, which will also be based on the same data structure. also did a lot of refactoring for a better structuring of the BLOBHeap class. orbiter 2009-01-01 22:31:16 +00:00
  • db1cfae3e7 *) cleaning up after myself low012 2009-01-01 19:45:15 +00:00
  • f547f9a78c *) added CGI capabilities (run Perl scripts and other software via HTTP GET and POST) *) set cgi.allow to true in yacy.conf to enable CGI (CGI is disabled by default) *) edit cgi.suffixes in yacy.conf if necessary to use additional script types low012 2009-01-01 19:40:06 +00:00
  • bdc380cd84 * add lastModified to templateCache -> no outdated files from cache anymore... f1ori 2009-01-01 14:56:53 +00:00
  • 6792c2a07d * change mime type of xml documents from application/xml to text/xml -> for easier Javascript requests f1ori 2009-01-01 12:30:51 +00:00
  • cb1e887027 * move svnRevNr classes to libbuild f1ori 2008-12-31 19:58:22 +00:00
  • 025094675f * remove empty directory * add necessary dependency for pdfParser f1ori 2008-12-31 19:39:02 +00:00
  • c5691180cb * skip style-tags in HTML-files f1ori 2008-12-31 19:34:24 +00:00
  • 9d5d30f877 *) http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1672 low012 2008-12-31 16:50:10 +00:00
  • 5448aad328 removed unused code orbiter 2008-12-30 12:12:00 +00:00
  • 3567c58b18 added another filed information for BLOBHeap dumps: the gaps orbiter 2008-12-30 10:49:43 +00:00
  • abdd4aa414 added an index dump for blob heaps: this will increase the shutdown time by at most a few seconds, but will speed up the start-up orbiter 2008-12-29 21:36:27 +00:00
  • 28d2d28573 added support for filetype search (just use filetype:<type> in the search query) orbiter 2008-12-29 17:57:04 +00:00
  • 8c3205b62e fix for OOB Exception see http://forum.yacy-websuche.de/viewtopic.php?p=11598#p11598 orbiter 2008-12-29 17:36:53 +00:00
  • 78c568331e added test channel to /xml/feed.rss can be obtained with http://localhost:8080/xml/feed.rss?set=TEST returns always a single feed entry with a fresh date orbiter 2008-12-29 12:39:07 +00:00
  • e004da48d3 - added fast fingerprint computation for files (any). Will be used in new index dump method - refactoring orbiter 2008-12-29 12:22:13 +00:00
  • eab72424df *) Fixed small bug: When adding new elements to blacklist via import, the blacklist which the elements were added to was supposed to be displayed, which did not work correctly. low012 2008-12-28 09:58:02 +00:00
  • 0e56675596 *) cleaning up ;-) low012 2008-12-27 20:09:36 +00:00
  • cf69557ea2 *) blacklists can be exported as XML or plain text now *) blacklist import via file upload works now low012 2008-12-27 15:38:20 +00:00
  • 1594a15be9 *) explicit mentioning of blacklist in blacklist cleaner low012 2008-12-27 13:06:05 +00:00
  • 2d2ce24011 * remove all encoding-stuff from proxy encoding is handled by parsers or browser, proxy only passes through f1ori 2008-12-23 19:14:54 +00:00
  • 73c8a0839c * abort download, when proxy connection is closed f1ori 2008-12-23 11:30:24 +00:00
  • bb935fdbb0 less organization overhead for DNS caching and prefetching orbiter 2008-12-23 10:06:49 +00:00
  • 4907697cfa * make fileuploads through proxy bigger than 65500 bytes possible * remove gzip-encoding for files from cache f1ori 2008-12-22 23:04:00 +00:00
  • fc8189f3fb better self-healing of corrupted databases orbiter 2008-12-22 16:43:49 +00:00
  • 963da8c3f9 * updated tm-extractors to new version 1.0 f1ori 2008-12-21 14:51:03 +00:00
  • 51f1a1927c * remove saaj.jar and axis.jar and references to it (was for soap-stuff?) f1ori 2008-12-21 13:06:04 +00:00
  • 5a89266598 *) new parameters for future use (better blacklist handling for im- and export) low012 2008-12-19 19:33:08 +00:00
  • e34ac22fbd - added new monitoring servlet at http://localhost:8080/PerformanceConcurrency_p.html - used the new monitoring to do some fine-tuning of the indexing queue orbiter 2008-12-19 15:26:01 +00:00
  • 449e697436 fix for null-seed in seedfile http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1653 lotus 2008-12-19 12:10:01 +00:00
  • d376d81fc4 replaced busy thread control of crawl stacker by blocking threads orbiter 2008-12-18 23:18:34 +00:00
  • f29b48d9ff patch for IndexOutOfBoundsException orbiter 2008-12-18 22:05:26 +00:00
  • 0881190b19 * Robots.txt: don't interpret Crawl-Delays for other robots fixes: http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1647 f1ori 2008-12-18 15:35:41 +00:00
  • 243e73f53b removed unnecessary usage of kelondroBLOBTree orbiter 2008-12-18 00:18:37 +00:00
  • 8cb7170b75 - set status of kelondroTree, kelondroBLOBTree and kelondroFlexTable to deprecated - removed initialization and/or usage of kelondroFlexTable (should meanwhile not be used any more) orbiter 2008-12-18 00:08:17 +00:00
  • 7535fd7447 - refactoring of CrawlEntry and CrawlStacker - introduced blocking queues in CrawlStacker to make it ready for concurrency - added a second busy thread for the CrawlStacker The CrawlStacker is multithreaded. It shall be transformed into a BlockingThread in another step. The concurrency of the stacker will hopefully solve some problems with cases where DNS blocks. orbiter 2008-12-17 22:53:06 +00:00
  • 6569cbbec1 npe fix: http://forum.yacy-websuche.de/viewtopic.php?t=1646 (break to avoid bad side effects) lotus 2008-12-16 20:53:31 +00:00
  • 18513e2ee2 npe fix: http://forum.yacy-websuche.de/viewtopic.php?t=1646 lotus 2008-12-16 13:36:13 +00:00
  • 2802138787 - refactoring of CrawlStacker (to prepare it for new multi-Threading to remove DNS lookup bottleneck) - fix of shallBeOwnWord target computation heuristic orbiter 2008-12-15 00:02:58 +00:00
  • b1e211b258 no error-alert: http://forum.yacy-websuche.de/viewtopic.php?t=1639 lotus 2008-12-13 12:04:08 +00:00
  • 13cb0916ee changes to statistics and content of thread dump servlet (points now more directly to performance leaks without mentioning class calls inside of sun/java calls that cannot be changed anyway) orbiter 2008-12-11 20:13:14 +00:00
  • db6b3bf5a3 speed enhancement for integrated http server: - tuning hacks in template engine - bypassing the template engine if no servlet present orbiter 2008-12-11 20:10:37 +00:00
  • 7cd08bd5fb fix for NPE in BLOBCompressor orbiter 2008-12-11 13:33:24 +00:00
  • 5b94498643 fine-tuning of cache usage from SVN 5386 and a bug fix for overflow in available() method orbiter 2008-12-10 14:35:01 +00:00
  • 1779c3c507 - added a read cache to the RAFile interface to RandomAccessFile - added a write buffer to BLOBHeap - modified the BLOBBuffer (is now only to buffer non-compressed content) - added content compression to the HTCache The new read cache will decrease the start/initialization time of BLOB files, like the HTCache, RobotsTxt and other BLOBHeap structures. orbiter 2008-12-10 11:15:19 +00:00
  • e1acdb952c fix for problem with userDB and bookmarksDB which was caused by changes in kelondroRA in SVN 5376 orbiter 2008-12-08 00:17:45 +00:00
  • 2c682d649b - no stop shortcut (-> stop via tray) - store registry keys on current profile lotus 2008-12-07 19:37:49 +00:00
  • e918d64c23 show hand-cursor on labels lotus 2008-12-06 17:32:53 +00:00
  • 4a2dac659e more speed hacks: - modified and activated write buffer - increased cache flush factor - fixed a problem with deadlocking of indexing process orbiter 2008-12-05 13:55:48 +00:00
  • 07d7653de1 update to JRE 6u11 lotus 2008-12-05 11:23:01 +00:00
  • 1fb518a5b4 display <String> etc. lotus 2008-12-04 20:21:53 +00:00
  • 47292e696a more performance hacks orbiter 2008-12-04 12:54:16 +00:00
  • 759cef23dd fix for bug in kelondroAbstractRA.readFully orbiter 2008-12-03 23:32:07 +00:00
  • bd1dc9cd5d thread dump with statistics, a little bit of profiling orbiter 2008-12-03 23:26:25 +00:00
  • d39d420b39 performance hacks orbiter 2008-12-03 15:38:29 +00:00
  • 5280ad638d added basic performance page other performance settings can be found on advanced settings lotus 2008-12-03 14:10:01 +00:00
  • 1a51d9fcfd display proper values lotus 2008-12-02 17:57:30 +00:00
  • 0b4808ba3d added new interactive search feature: - while the user types search queries, the local database is searched - results are presented interactively orbiter 2008-12-02 15:24:25 +00:00
  • 74a3d86114 fixed an error response that might present classified information orbiter 2008-12-01 23:14:42 +00:00
  • c6525ab75f fix for NPE in seed handling orbiter 2008-12-01 23:08:27 +00:00
  • fea82b54ef more contrast on search snippets lotus 2008-11-26 19:57:13 +00:00
  • 1951d30a62 addendum to last commit handle words with length < 3 correctly lotus 2008-11-26 19:43:40 +00:00
  • 325ba7bfb8 only query words with length > 2 this is not complete, yet lotus 2008-11-26 16:41:38 +00:00
  • 489edb4473 improved pattern selection lotus 2008-11-26 10:06:38 +00:00
  • e423fa9846 *) added method to only get file names in directory listing which match a filter *) only files which end with .black will be listed as blacklists *) added a little bit of Javadoc low012 2008-11-25 20:26:06 +00:00
  • 577b53aee6 added more search engines lotus 2008-11-24 13:05:20 +00:00
  • 7f4d411c0d npe-fix lotus 2008-11-24 13:04:57 +00:00
  • 513179f404 changed interface to colletctionIndex and adopted all implementing classes: do not return a result of a double-check when adding entries with addUnique orbiter 2008-11-23 23:55:08 +00:00
  • 9d64693cfb reverting again the changes to new concurrent chunkIterator orbiter 2008-11-23 22:22:44 +00:00
  • 45ad1c3dd5 - re-activated concurrent iterator for EcoFiles - added javadoc for new concurrent initialization in kelondroBytesLongMap - switched default value for commons storage to false - version step orbiter 2008-11-23 18:25:40 +00:00
  • 2e2120046f speed enhancement for BLOBHeap opening process using concurrency of FileIO and content processing orbiter 2008-11-23 17:38:01 +00:00
  • 1545e5440a * index deletion: checkbox-confirmation * watch crawler: less load on exhausted peers; wait for data before reloading again lotus 2008-11-23 12:02:58 +00:00
  • fa26a8f25a fix for deadlock-like behavior in balancer orbiter 2008-11-22 11:25:01 +00:00
  • 1918a0173e added more exception handling during crawling orbiter 2008-11-22 00:40:18 +00:00
  • 10f5ec1040 reverted last commit (more testing needed) orbiter 2008-11-22 00:12:50 +00:00
  • 5af8923f37 * distribute forgotten jar-file in parser f1ori 2008-11-22 00:05:04 +00:00
  • b0f2003792 fast database initialization and fast start-up of yacy: - applied knowledge about concurrent file stream reading and index processing from the wikimedia reader to the EcoTable initialization process: the file reader is now concurrent to the index generation - also changed some initialization processes to avoid some pauses during initialization orbiter 2008-11-21 23:21:33 +00:00
  • ba5b274b8c #translation update: -blacklist -crawlstart ... daburna 2008-11-21 16:45:45 +00:00
  • 0ca4bc7b79 - added reader and visualization for mediawiki-export files: files exported from mediawiki using the xml schema according to http://www.mediawiki.org/xml/export-0.3/ can be processed to be viewed in a YaCy servlet. To access such a file, place it into DATA/HTCACHE/mediawiki/ i.e. the export from German Wikipedia would be: DATA/HTCACHE/mediawiki/wikipedia.de.xml This file can then be accessed using the URL http://localhost:8080/mediawiki_p.html?dump=wikipedia.de.xml&title=YaCy if this is done the first time, an index file is created (for this case: more than 4 million lines must be written, this takes about 15 minutes) Then try the same url again. orbiter 2008-11-20 18:31:52 +00:00
  • 2e63f03ca5 forgot copy&paste :/ danielr 2008-11-20 11:41:11 +00:00
  • cd8082b4e3 fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1111#p11166 danielr 2008-11-20 11:18:19 +00:00
  • 4f996a7651 fix for logparser pattern lotus 2008-11-17 16:23:17 +00:00
  • d18c18971e * dirlisting in UTF-8 encoding * fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1550&hilit=#p11108 f1ori 2008-11-15 20:49:03 +00:00
  • bb570716e6 added more testfiles lotus 2008-11-15 09:00:24 +00:00
  • 867d0f2f56 removed some unnecessary pause delays orbiter 2008-11-14 23:36:33 +00:00
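
Several commits in this range add query operators: `site:` (9e58ae036d), `filetype:` (28d2d28573), and acceptance of a leading dot on both (ca80930892). The sketch below illustrates how such operators could be separated from the plain search words. It is not YaCy's implementation (which is written in Java); the regular expression, the `parse_query` function, and the returned structure are all assumptions made for illustration.

```python
import re

# Hypothetical operator pattern (not taken from the YaCy source):
# "site:" or "filetype:", an optional leading dot, then a value
# that runs up to the next whitespace.
OPERATOR = re.compile(r"\b(site|filetype):\.?(\S+)", re.IGNORECASE)


def parse_query(query):
    """Split a raw query into plain search words and operator constraints."""
    constraints = {}
    for match in OPERATOR.finditer(query):
        # Normalise operator name and value to lower case.
        constraints[match.group(1).lower()] = match.group(2).lower()
    # Remove the operator tokens; what remains are the search words.
    words = OPERATOR.sub(" ", query).split()
    return words, constraints


if __name__ == "__main__":
    print(parse_query("test site:edu"))
    print(parse_query("report filetype:.pdf site:.edu"))
```

With this sketch, the example from commit 9e58ae036d, `test site:edu`, yields the single search word `test` and the constraint `site` = `edu`; a leading dot as in `site:.edu` is dropped before the value is stored.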