Commit Graph

  • bfb518cd47 some refactoring to get the LoaderDispatcher a little bit more independent from the switchboard orbiter 2010-03-20 10:28:03 +00:00
  • 36bd843ece fix for RFC5322 conformance as suggested by Quix0r in http://forum.yacy-websuche.de/viewtopic.php?p=19585#p19585 orbiter 2010-03-20 10:23:47 +00:00
  • c855fc48c6 only load robots.txt for http and https protocol orbiter 2010-03-20 10:15:11 +00:00
  • 0465f28f7f applied 'null in rss2.js' fix from Quix0r, see http://forum.yacy-websuche.de/viewtopic.php?p=19612#p19612 orbiter 2010-03-20 09:58:05 +00:00
  • 748abfcffa added patches to prevent yacy-protocol DoS settings orbiter 2010-03-19 15:31:15 +00:00
  • e820ed061a avoiding excessive DNS lookups to determine localhost orbiter 2010-03-19 14:28:25 +00:00
  • 11983bc936 redesigned some parts of the parser entry point: whenever the parser is entered, a whole set of possible parsers is computed from the given mime type and file extension, so all parsers whose registered mime type and extension acceptance match are considered. Several parsers may therefore be tried for the same file, which leads to success in cases where previously only the mime type was used to choose the parser and the host httpd reported the mime type wrongly. orbiter 2010-03-19 13:04:42 +00:00
  • de88200e11 added Byte Order Mark recognition to serverObjects. The BOM character FEFF may appear at the beginning of strings if some browsers prepend the characters %EF%BB%BF to input values. See http://en.wikipedia.org/wiki/Byte_order_mark orbiter 2010-03-19 10:58:40 +00:00
  • 89b4fff1c2 adapted ant script for new exif library orbiter 2010-03-12 12:36:38 +00:00
  • 24e5faee75 added exif parsing for jpg images orbiter 2010-03-12 12:23:38 +00:00
  • 82f76e1296 removed log line orbiter 2010-03-11 20:31:38 +00:00
  • 0f8004f9da enhanced html parser to recognize a href tags inside header tags orbiter 2010-03-11 17:52:07 +00:00
  • 3300930fc5 - (almost) fixed FTP crawler - integrated/fixed SMB crawler orbiter 2010-03-11 15:43:06 +00:00
  • 35d0057cb0 stopYACY.sh can now use curl orbiter 2010-03-11 00:12:53 +00:00
  • 61493a9a9f added more information about metadata in ViewFile.html orbiter 2010-03-11 00:11:14 +00:00
  • 1198b9989d bugfixes, more sorttable orbiter 2010-03-10 15:39:36 +00:00
  • 27b2998eb4 added searchtable function to more tables in the interface: you can now sort by any column in most tables in YaCy just by clicking on the headline column of the table orbiter 2010-03-10 10:05:41 +00:00
  • 9623d9e6d2 added a smb loader component for the YaCy crawler orbiter 2010-03-10 08:55:29 +00:00
  • c77fbd0390 added sorttable (http://www.kryogenix.org/code/browser/sorttable/) javascript library to make tables sortable orbiter 2010-03-09 23:40:16 +00:00
  • 3014e5f6f9 integrated live search in the IndexControlURLs input window for URLs: this searches for occurrences of the given word in URLs and presents them in a pop-up list below the input line; some bugfixes for the new robots table viewer orbiter 2010-03-09 15:44:11 +00:00
  • ae2f3f000f better handling of table copy abandon .. prevent memory leak orbiter 2010-03-09 13:32:15 +00:00
  • 0769517129 added a robots.txt monitor in the crawler monitor submenu orbiter 2010-03-09 11:31:15 +00:00
  • 48995e71c4 added soft-auth to general authentication scheme orbiter 2010-03-09 00:07:17 +00:00
  • 72f00dee59 removed never-used server access account function orbiter 2010-03-08 22:30:45 +00:00
  • 474bb4de82 ups orbiter 2010-03-07 23:32:18 +00:00
  • 8c88abf685 added follow-me link for twitter in status hints orbiter 2010-03-07 23:29:29 +00:00
  • 58d75a6bde allow more results for a single query at the same time if the client is not authorized. This is necessary for the search widget, where the default number of results is now set to 20 instead of 10 so that a scroll bar is shown, which is needed to trigger new searches for more results. orbiter 2010-03-07 22:49:20 +00:00
  • 57e1eae95e longer time-out for url fetching .. may help to show all the links that the statistics report for a search result orbiter 2010-03-07 22:23:08 +00:00
  • 9e639603e3 after frequent occurrences of 100% CPU usage and permanent blocking, I try to disable a function in a method that may cause the problem when calling an external library (apache http client 3.x). The thread dump that shows the problem is attached here. orbiter 2010-03-07 21:19:23 +00:00
  • 4144927d94 show less errors orbiter 2010-03-07 21:02:08 +00:00
  • 736df39c9c Updated German translation de.lng: mainly ViewFile.html additions and removed (De)Select All from Table_API_p.html section mikeworks 2010-03-07 16:31:49 +00:00
  • b88f5fbb4b slightly changed crawling policy orbiter 2010-03-07 01:46:08 +00:00
  • de01fe0e6d fix for bug in url parser orbiter 2010-03-07 01:33:18 +00:00
  • 7684a575c4 fix for deletion of error database each time when YaCy starts up orbiter 2010-03-07 00:33:39 +00:00
  • f561e340c6 show more results of single domains when not authorized fully (up to 100) orbiter 2010-03-07 00:12:58 +00:00
  • c4bdb1e7f2 added one more option in ViewFile to show an iframe like the one for the original web page content, but using the cache rather than the direct link to the content in the web. Upgraded the very old and no longer used CacheResource_p servlet to a new and working version. orbiter 2010-03-06 23:41:51 +00:00
  • c09a995930 better logging of double occurrences of urls in the crawler orbiter 2010-03-06 20:31:30 +00:00
  • 1bbe14d23f SVN 6716 unfortunately contained parts of the unfinished SMB integration. To fix compile errors, the remaining parts of the SMB implementation stub are added with this commit. This adds the jcifs smb library. orbiter 2010-03-05 21:46:22 +00:00
  • 884b262130 - added a new Wiki Namespace Navigator - some redesign of Navigator data structures orbiter 2010-03-05 21:25:49 +00:00
  • b0c6d0108b fix for select-all toggle in tables servlet orbiter 2010-03-05 16:15:59 +00:00
  • 617dfbbd06 allow 'authorization by encoded password' also if the requesting client is not from localhost but from the same host as yacy is running on. orbiter 2010-03-05 16:03:55 +00:00
  • 270fb38674 fixed some bugs in Table viewer; added 'select all' feature in Tables_p; enhanced ViewFile.html: has now an input field to load arbitrary resources from the web and analyze them (!!!); included the ViewFile servlet into the Index Administration menu; show in ViewFile if resource is in url-db and/or in Web cache; bugfixes to BEncodedHeap and Tables management orbiter 2010-03-05 15:41:15 +00:00
  • 38d7a28cd2 fix in viewfile needed when ViewFile is called only with 'url' parameter orbiter 2010-03-05 12:24:15 +00:00
  • 599c3766c4 added authentication to automated API call orbiter 2010-03-04 14:10:03 +00:00
  • 727dd9b193 - fixed a bug in robots.txt parser - moved storage of robots.txt entries to WorkTables, so it is now possible to browse the robots entries with the table browser orbiter 2010-03-04 11:58:07 +00:00
  • 54af9e6b49 - added parsing of robots meta-tag in html headers to detect a noindexing request - added evaluation and indexing prevention in case that a noindexing is given in a html file orbiter 2010-03-03 23:32:56 +00:00
  • f336ed568d fix for parameters on reload lotus 2010-03-03 21:14:12 +00:00
  • ddc21e7b73 adapted banner colors to current standard template lotus 2010-03-03 21:10:02 +00:00
  • e76f1e6cc0 Update property name bookmarks to display_bookmarks (changed in SVN 5554) to fix the broken Bookmark RSS feed - Fixes RSS part of http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1824 mikeworks 2010-02-28 11:22:01 +00:00
  • c52dec0c06 Updated German translation de.lng: Translated new entries in Network.html and Status_p.inc mikeworks 2010-02-27 03:44:25 +00:00
  • cc074c1a36 Renamed, removed and added license information for each jar archive in external lib folder mikeworks 2010-02-27 03:04:11 +00:00
  • 8a19be24de tell non-windows users about the tray-icon lotus 2010-02-26 19:44:23 +00:00
  • f5ec7ad077 replaced four old libraries with latest version orbiter 2010-02-26 14:14:50 +00:00
  • 6b89f681c5 added bing, dbpedia and wolfram alpha to the compare-search options orbiter 2010-02-26 10:34:36 +00:00
  • 475ffabfa1 Added License (Apache 1.1) information for Jakarta ORO library 2.0.7 (2.0.8 available) mikeworks 2010-02-26 04:26:04 +00:00
  • 46c4f8b68a better look-ahead into the crawl queue: show more on crawl monitor orbiter 2010-02-24 23:11:58 +00:00
  • 4b6efe3b48 more ergonomic default values for crawl start orbiter 2010-02-24 22:33:51 +00:00
  • 7b546415dc added svn6695 for windows lotus 2010-02-24 14:58:53 +00:00
  • f175f9a2d3 changed the way the number of search requests is counted: so far only search requests at the remote search interface had been counted. This was done to protect the privacy of searchers, because counting was not done and published at the own search interface. As a consequence, no search requests of robinson peers had been counted, because they cannot be counted at a remote peer. This change introduces a distinction between search requests done locally at the local search interface and search requests that are on the local interface but had been submitted from a remote IP without authentication. Now 3 counters are maintained: (1) partial count of remote searches; (2) total count of local searches on robinson peers from non-authenticated clients; (3) total count of local searches on robinson peers from localhost or authenticated clients. In the global statistic of search requests, the first two of the three counters are now added. Because we have a large number of robinson peers with a large number of remote non-authenticated requests, the statistic should show at least three times the number of search requests. orbiter 2010-02-24 13:53:55 +00:00
  • 84222e3b4f fix for auto-updater: delete old libraries before copy of new one orbiter 2010-02-24 13:46:50 +00:00
  • cd6de83905 next try for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=2703 (reverted 6692) sixcooler 2010-02-23 15:59:58 +00:00
  • bfe4693e9a fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=2703 sixcooler 2010-02-23 13:46:56 +00:00
  • 6fde481ef4 missing for last commit orbiter 2010-02-22 20:21:39 +00:00
  • b68deb407a - moved test data from /bin to /test/words - refactoring of stopYACY.sh by introduction of /bin/apicall which is able to call any api file with attached authorization orbiter 2010-02-22 20:14:16 +00:00
  • cd7f0bf75f wait a little bit longer for a init.d restart between stop and start orbiter 2010-02-22 15:42:48 +00:00
  • 50169759ca Replaced old pdfbox and fontbox LICENSE files with new ones (still Apache 2.0). Testing delete and adding files mikeworks 2010-02-22 07:09:57 +00:00
  • 1e2c011c98 updated the jsch lib from 0.1.21 to 0.1.42 orbiter 2010-02-21 23:43:50 +00:00
  • c2b505ae87 updated bouncy castle libraries orbiter 2010-02-21 23:31:40 +00:00
  • 681f4d185f replaced microsoft office document parser POI 3.5 with latest version 3.6 orbiter 2010-02-21 23:18:52 +00:00
  • e9cdddcd0f updated parser libraries fontbox and pdfbox with latest version of jar files orbiter 2010-02-21 23:05:38 +00:00
  • 93b7ddc27d fix for http://forum.yacy-websuche.de/viewtopic.php?p=19376#p19376 orbiter 2010-02-21 22:49:35 +00:00
  • 3df1060015 jre update for installer lotus 2010-02-20 08:59:17 +00:00
  • 7d569ce735 Updated German translation de.lng: Translated new buttons in Table_API_p.html; Updated French translation fr.lng: cleaned up syntax and added a few lines; Updated Italian translation it.lng: cleaned up syntax; Updated Slovakian translation sk.lng: cleaned up syntax mikeworks 2010-02-19 04:24:58 +00:00
  • 23bf81d225 new buttons to select/deselect all entries orbiter 2010-02-18 23:31:23 +00:00
  • 8030ed3319 self-healing for lost crawl profile handles orbiter 2010-02-18 21:55:45 +00:00
  • 881a1065ce version number step to 0.94 orbiter 2010-02-17 21:56:56 +00:00
  • e3e5e05ec2 fix for problem in ranking setting which was caused by the introduction of a toString() method in serverObjects; see also: http://forum.yacy-websuche.de/viewtopic.php?p=19310#p19310 orbiter 2010-02-17 21:31:08 +00:00
  • e3ccfb54aa fix for display problem in Firefox on MacOS X orbiter 2010-02-17 09:08:16 +00:00
  • 07fc76a45e Updated German translation: - Forgot kiosk mode in index.html mikeworks 2010-02-17 03:18:37 +00:00
  • 564927ce72 redesign of CrawlResult data structures because of OOM occurrences during URL deletion processes. orbiter 2010-02-16 23:06:04 +00:00
  • bab0438fee Updated German translation: - kiosk mode and some corrections mikeworks 2010-02-16 21:59:00 +00:00
  • 30c8185139 fix for sid check orbiter 2010-02-15 23:31:32 +00:00
  • ef62d017e5 integrated session id filtering for crawler orbiter 2010-02-15 23:15:17 +00:00
  • d8d9984913 added framework for session id filtering (not ready yet) orbiter 2010-02-15 22:30:41 +00:00
  • 2bc36de336 - fix for bug in svn 6669 - cleanup orbiter 2010-02-15 22:06:13 +00:00
  • d378ca4604 better handling of concurrency in seed orbiter 2010-02-15 15:57:35 +00:00
  • 6538043d89 fix for http://forum.yacy-websuche.de/viewtopic.php?p=19189#p19189 orbiter 2010-02-15 15:45:31 +00:00
  • 047f8718a7 added kiosk-mode button on standard search page and interactive search page see also: http://forum.yacy-websuche.de/viewtopic.php?p=19178#p19178 orbiter 2010-02-15 12:54:53 +00:00
  • ac492fa2a5 added a close button image orbiter 2010-02-15 10:40:33 +00:00
  • f1d56b2a68 Updated German translation: - Added h2 topic in CrawlResults.html and fixed one line mikeworks 2010-02-12 14:19:36 +00:00
  • 1a4cb51744 ups, unclosed HTML tag fixed.... suessthomas 2010-02-11 22:13:52 +00:00
  • 9e14958115 minor corrections and bug fixes suessthomas 2010-02-11 15:05:38 +00:00
  • 0e5df46c95 Updated German translation: Corrected one line in section for IndexImportOAIPMH_p.html that was not correctly escaped mikeworks 2010-02-11 01:58:32 +00:00
  • c167e1352e Updated German translation: - Added env/templates/submenuContentIntegration.template - Added IndexImportWikimedia_p.html - Added IndexImportOAIPMH_p.html - Added IndexImportOAIPMHList_p.html mikeworks 2010-02-11 01:50:03 +00:00
  • 01dbba9fc4 Updated German translation: - added API icon text in Network.html mikeworks 2010-02-10 23:51:01 +00:00
  • e071d71f19 fix for yacy-banner-network-values http://forum.yacy-websuche.de/viewtopic.php?f=6&t=2521 sixcooler 2010-02-09 18:22:36 +00:00
  • 945e0ba5a5 allow global search if res. observer disabled index transmission lotus 2010-02-09 17:14:16 +00:00
  • 8faeedd99a not a fix! for: http://forum.yacy-websuche.de/viewtopic.php?f=5&t=2679 lotus 2010-02-09 09:33:30 +00:00
  • 063b29060c Minor Changes suessthomas 2010-02-08 10:13:14 +00:00
  • 787b588c33 reverted a part of svn6636: - didn't work on blobs >2GB - should be obsolete since svn6651 http://forum.yacy-websuche.de/viewtopic.php?f=6&t=2652&sid=7fa98fd3edfc2a03f26394d545e3e3c1&p=19172#p19172 sixcooler 2010-02-07 19:32:46 +00:00
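
As an illustration of the Byte Order Mark handling described in commit de88200e11 above, here is a minimal sketch in Python. It is not YaCy's actual Java code in serverObjects. It assumes a form value arrives URL-encoded with a leading %EF%BB%BF, which decodes to the single character U+FEFF. The helper name strip_bom and the sample value are hypothetical and chosen only for this example.

```python
from urllib.parse import unquote

# U+FEFF is the byte order mark; in UTF-8 it is the byte sequence EF BB BF.
BOM = "\ufeff"


def strip_bom(value: str) -> str:
    """Return value without a single leading byte order mark, if one is present."""
    return value[1:] if value.startswith(BOM) else value


# Hypothetical input value as a browser might submit it.
raw = "%EF%BB%BFyacy"
decoded = unquote(raw)        # percent-decoding (UTF-8 by default) yields "\ufeffyacy"
cleaned = strip_bom(decoded)  # the leading BOM is removed

print(repr(decoded), repr(cleaned))
```

Only a leading mark is removed here; whether U+FEFF inside a string should also be dropped is a separate design decision that this sketch does not address.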