Commit Graph

  • 4bd927d513 the Semantic Web moves in! - added two new api files for document metadata: - added an XHTML+RDFa HTML file that shows the document metadata in a format suitable both for rendering and for metadata retrieval. This is a typical document format for a semantic web data structure. The RDF vocabulary used is Dublin Core - added an XML file that shows the same data as pure DC metadata - integrated the API into the existing IndexControlURLs interface orbiter 2009-01-13 22:04:38 +00:00
  • 7eade3f181 * fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1728 f1ori 2009-01-13 21:42:00 +00:00
  • d1bace5e4d enhanced cleanup function orbiter 2009-01-13 15:34:11 +00:00
  • cb76d9e0e4 more synchronized in BLOBHeap (will not fix problem with Runtime-Error as reported in forum) orbiter 2009-01-13 13:22:29 +00:00
  • 0afce376d7 * forgot libbuild/ directory in dist-Target f1ori 2009-01-13 10:01:00 +00:00
  • 613c49bc38 YaCy-UI: update to welcome text (change-log and bug tracker) for stable release apfelmaennchen 2009-01-13 05:59:39 +00:00
  • ff41da613e removed exception printout during load of snippets orbiter 2009-01-13 00:30:19 +00:00
  • 814a28775f removed thread dump writing in case of invocation target exception in httpd (looked bad, not serious) orbiter 2009-01-13 00:27:01 +00:00
  • bed38a5f8c fix for uncaught exception in RSSReader orbiter 2009-01-13 00:20:37 +00:00
  • 05c235de32 fix for npe orbiter 2009-01-13 00:10:42 +00:00
  • 7608944081 *) bugfix for REMOTE_HOST environment variable in CGI code (shows hostname of client instead of hostname of YaCy peer now) low012 2009-01-12 22:15:00 +00:00
  • a6b29cf72c reverted change of search event processing in SVN 5460. The new code did not work properly: it gave remote search requests too little time orbiter 2009-01-12 15:06:22 +00:00
  • c7c291bc6b allow simultaneous inurl: site: and filetype: search lotus 2009-01-12 14:59:27 +00:00
  • d162cce6b4 classpath update lotus 2009-01-12 13:39:04 +00:00
  • e54d588a15 "patterns as defined by match operator in java" as r5475 says lotus 2009-01-12 13:36:36 +00:00
  • 9ef77d57f5 added an access control to the search interface using white/blacklists: in the network configuration, you can configure a whitelist and a blacklist - blacklisted clients cannot search - whitelisted clients never get any search restrictions - for all other clients: apply DoS search restrictions. Please see the example configuration in yacy.network.freeworld.unit. By default, all clients from localhost get whitelisted. If you have your own YaCy network, please put all the IPs of your peers into the whitelist orbiter 2009-01-12 10:55:48 +00:00
  • ac89e8e84d removed unused search interface orbiter 2009-01-12 08:45:12 +00:00
  • 9b26dfec80 small fix to correct encoding of xml output apfelmaennchen 2009-01-11 22:50:53 +00:00
  • efe801173c better dht-in cache flush. see also: http://forum.yacy-websuche.de/viewtopic.php?p=11936#p11936 orbiter 2009-01-11 22:39:49 +00:00
  • 941ab78d9b better termination for blocking threads orbiter 2009-01-11 22:34:08 +00:00
  • 3dc208fad0 bugfix: bookmarks can now handle folder names like /news and /newspaper without getting confused... apfelmaennchen 2009-01-11 19:39:51 +00:00
  • e948df68ac longer timeout for queues during shutdown orbiter 2009-01-11 19:10:09 +00:00
  • 2b32248079 fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1516&p=10545#p10545 orbiter 2009-01-11 19:02:50 +00:00
  • cc207a979e - added new unified bookmark api to /xml/bookmarks/ the get_bookmarks api currently supports: .xml: posts, xbel, rss, flexigrid .json: posts, flexigrid .html: work in progress - YaCy-UI: support for new bookmark api apfelmaennchen 2009-01-11 12:29:19 +00:00
  • c1330f5743 *) added environment variable DOCUMENT_ROOT *) caught exception low012 2009-01-10 18:31:10 +00:00
  • f26b8fcb1b *) comment mode is 'moderated' instead of 'activated' by default now (to avoid spam being visible) low012 2009-01-10 12:58:35 +00:00
  • b2a8c653ee small fixes orbiter 2009-01-10 09:21:44 +00:00
  • f675d47f86 better protection against database failures orbiter 2009-01-10 09:19:46 +00:00
  • 37e1cb139d * some improvements of the initscript for debian f1ori 2009-01-09 20:53:08 +00:00
  • 8632eebf60 - added api icon to the web structure visualization - removed fixed horizontal menu - the api icon in the search results can only be seen when display=1 orbiter 2009-01-09 15:42:20 +00:00
  • 4f45605f04 small update for timing in search result processing orbiter 2009-01-09 15:28:45 +00:00
  • 66818a2f2e smaller api banner lotus 2009-01-09 10:27:04 +00:00
  • 9d119c6b61 migration of auto-update rules to new release strategy: next stable will be 0.7, development releases are 0.*x, experimental will be if x = 1, 2, 3 orbiter 2009-01-09 10:08:11 +00:00
  • 4d5b401f00 try to fix some performance problems with the internal index management: - ensuring that ordered indexes stay ordered during remove - no unnecessary ordering checks - better test logic in crawl stacker orbiter 2009-01-09 00:06:36 +00:00
  • 4641ecd6d9 inurl: search lotus 2009-01-08 18:59:29 +00:00
  • 299189f1a9 added the API icon to the bookmarks, the network page and the search page orbiter 2009-01-07 23:45:20 +00:00
  • a1bf687b3b added first API tooltip! - description of JSON search result in interactive search orbiter 2009-01-07 22:57:32 +00:00
  • 0d1bd78674 * full site: syntax support e.g. site:de.wikipedia.org possible if dots in query would work yet lotus 2009-01-07 21:05:07 +00:00
  • a0605325bb fixed a NullPointer Exception borg-0300 2009-01-07 15:44:09 +00:00
  • 9bed4de280 fix for the search bug introduced in SVN 5449 orbiter 2009-01-06 23:16:10 +00:00
  • d6a5c98080 api banner concept draft lotus 2009-01-06 20:51:22 +00:00
  • b2b7edae18 fixed interactive search - added dummy servlet class, because otherwise the template engine is not triggered. That is because the YaCy httpd works much faster as a normal file server without a scan of the served pages. Therefore each page with templates must now have a class file associated with it. - fixed json output format of yacysearch orbiter 2009-01-06 20:04:09 +00:00
  • 2be119f0df adjusted big peer to 28M links lotus 2009-01-06 18:20:06 +00:00
  • 4ec1aacde1 prevent indexing from windows indexing service lotus 2009-01-06 18:17:53 +00:00
  • c6880ce28b removed the permanent cache flush and replaced it with a periodic cache flush The cache is now flushed only for one second every ten seconds. During a crawl the cache fills up completely, and is only flushed if space is needed for more documents. orbiter 2009-01-06 13:51:59 +00:00
  • ef7fe537c5 fixed a cache-bug in cachedFileRA orbiter 2009-01-06 10:51:56 +00:00
  • ca80930892 accept leading dots on filetype: and site: search lotus 2009-01-06 10:04:24 +00:00
  • 6c7e83909b - refactoring of data access methods to be prepared for new cell data structure - removed a memory overhead in collections, which prevents OOM Exception in low memory configurations orbiter 2009-01-06 09:38:08 +00:00
  • 5ce9d81955 * remove class-file from old location f1ori 2009-01-05 21:19:30 +00:00
  • 1af728ae09 *) regex for site operator changed as proposed by Lotus low012 2009-01-05 18:30:34 +00:00
  • c8451614f3 fix for overflow http://forum.yacy-websuche.de/viewtopic.php?p=11696#p11696 lotus 2009-01-05 18:28:27 +00:00
  • 9e58ae036d *) added site operator which can be used to only show results from a certain domain. example: "test site:edu" shows only documents which contain the word test and which come from an edu domain low012 2009-01-04 14:58:32 +00:00
  • 19e7c56f7f *) apply filter to dir list to only show .black files as blacklists low012 2009-01-04 10:14:19 +00:00
  • c4c4c223b9 fixed a problem with attribute flags on RWI entries that prevented proper selection of index-of constraint orbiter 2009-01-04 02:27:29 +00:00
  • 6072831235 no cr transmission for robinson peers see also: http://forum.yacy-websuche.de/viewtopic.php?p=10290#p10290 orbiter 2009-01-03 23:44:42 +00:00
  • 4bffe664ca *) moved entry field for new expressions to top of the list as requested in forum (http://forum.yacy-websuche.de/viewtopic.php?f=9&t=1678) *) added some Javascript to disable list selection on bottom of list in cases it is not needed (edit, delete) and only enable it if needed (move), if JS is turned off everything will work as usual low012 2009-01-03 10:18:48 +00:00
  • afe98bc11c *) added changes as proposed by Halborinda in http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1674 *) changed indentation low012 2009-01-03 08:24:08 +00:00
  • 07fc115e90 removed active profiling in kelondroRowSet orbiter 2009-01-02 12:33:06 +00:00
  • be4c458951 refactoring (implemented Iterable in kelondroRowCollection) orbiter 2009-01-02 11:38:20 +00:00
  • bb5c2cd12e *) ISINDEX parameters will not be put on commandline anymore to prevent possible security hazards (better safe than sorry). Parameters will have to be read from QUERY_STRING in the ISINDEX case too, which does not seem to be uncommon behaviour for web servers: http://vms.pdv-systeme.de/users/martinv/cgi_basics/cgi_basics.html#Datenuebergabe low012 2009-01-02 11:18:26 +00:00
  • b6bba18c37 replaced the storing procedure for the index ram cache with a method that generates BLOBHeap-compatible dumps. This is a migration step to support a new method to store the web index, which will also be based on the same data structure. Also did a lot of refactoring for a better structuring of the BLOBHeap class. orbiter 2009-01-01 22:31:16 +00:00
  • db1cfae3e7 *) cleaning up after myself low012 2009-01-01 19:45:15 +00:00
  • f547f9a78c *) added CGI capabilities (run Perl scripts and other software via HTTP GET and POST) *) set cgi.allow to true in yacy.conf to enable CGI (CGI is disabled by default) *) edit cgi.suffixes in yacy.conf if necessary to use additional script types low012 2009-01-01 19:40:06 +00:00
  • bdc380cd84 * add lastModified to templateCache -> no outdated files from cache anymore... f1ori 2009-01-01 14:56:53 +00:00
  • 6792c2a07d * change mime type of xml documents from application/xml to text/xml -> for easier Javascript requests f1ori 2009-01-01 12:30:51 +00:00
  • cb1e887027 * move svnRevNr classes to libbuild f1ori 2008-12-31 19:58:22 +00:00
  • 025094675f * remove empty directory * add necessary dependency for pdfParser f1ori 2008-12-31 19:39:02 +00:00
  • c5691180cb * skip style-tags in HTML-files f1ori 2008-12-31 19:34:24 +00:00
  • 9d5d30f877 *) http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1672 low012 2008-12-31 16:50:10 +00:00
  • 5448aad328 removed unused code orbiter 2008-12-30 12:12:00 +00:00
  • 3567c58b18 added another filed information for BLOBHeap dumps: the gaps orbiter 2008-12-30 10:49:43 +00:00
  • abdd4aa414 added an index dump for blob heaps: this will increase the shutdown time by at most some seconds, but will speed up the start-up orbiter 2008-12-29 21:36:27 +00:00
  • 28d2d28573 added support for filetype search (just use filetype:<type> in the search query) orbiter 2008-12-29 17:57:04 +00:00
  • 8c3205b62e fix for OOB Exception see http://forum.yacy-websuche.de/viewtopic.php?p=11598#p11598 orbiter 2008-12-29 17:36:53 +00:00
  • 78c568331e added test channel to /xml/feed.rss can be obtained with http://localhost:8080/xml/feed.rss?set=TEST returns always a single feed entry with a fresh date orbiter 2008-12-29 12:39:07 +00:00
  • e004da48d3 - added fast fingerprint computation for files (any). Will be used in new index dump method - refactoring orbiter 2008-12-29 12:22:13 +00:00
  • eab72424df *) Fixed small bug: When adding new elements to blacklist via import, the blacklist which the elements were added to was supposed to be displayed, which did not work correctly. low012 2008-12-28 09:58:02 +00:00
  • 0e56675596 *) cleaning up ;-) low012 2008-12-27 20:09:36 +00:00
  • cf69557ea2 *) blacklists can be exported as XML or plain text now *) blacklist import via file upload works now low012 2008-12-27 15:38:20 +00:00
  • 1594a15be9 *) explicit mentioning of blacklist in blacklist cleaner low012 2008-12-27 13:06:05 +00:00
  • 2d2ce24011 * remove all encoding-stuff from proxy encoding is handled by parsers or browser, proxy only passes through f1ori 2008-12-23 19:14:54 +00:00
  • 73c8a0839c * abort download, when proxy connection is closed f1ori 2008-12-23 11:30:24 +00:00
  • bb935fdbb0 less organization overhead for DNS caching and prefetching orbiter 2008-12-23 10:06:49 +00:00
  • 4907697cfa * make fileuploads through proxy bigger than 65500 bytes possible * remove gzip-encoding for files from cache f1ori 2008-12-22 23:04:00 +00:00
  • fc8189f3fb better self-healing of corrupted databases orbiter 2008-12-22 16:43:49 +00:00
  • 963da8c3f9 * updated tm-extractors to new version 1.0 f1ori 2008-12-21 14:51:03 +00:00
  • 51f1a1927c * remove saaj.jar and axis.jar and references to it (was for soap-stuff?) f1ori 2008-12-21 13:06:04 +00:00
  • 5a89266598 *) new parameters for future use (better blacklist handling for im- and export) low012 2008-12-19 19:33:08 +00:00
  • e34ac22fbd - added new monitoring servlet at http://localhost:8080/PerformanceConcurrency_p.html - used the new monitoring to do some fine-tuning of the indexing queue orbiter 2008-12-19 15:26:01 +00:00
  • 449e697436 fix for null-seed in seedfile http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1653 lotus 2008-12-19 12:10:01 +00:00
  • d376d81fc4 replaced busy thread control of crawl stacker by blocking threads orbiter 2008-12-18 23:18:34 +00:00
  • f29b48d9ff patch for IndexOutOfBoundsException orbiter 2008-12-18 22:05:26 +00:00
  • 0881190b19 * Robots.txt: don't interpret Crawl-Delays for other robots fixes: http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1647 f1ori 2008-12-18 15:35:41 +00:00
  • 243e73f53b removed unnecessary usage of kelondroBLOBTree orbiter 2008-12-18 00:18:37 +00:00
  • 8cb7170b75 - set status of kelondroTree, kelondroBLOBTree and kelondroFlexTable to deprecated - removed initialization and/or usage of kelondroFlexTable (should meanwhile not be used any more) orbiter 2008-12-18 00:08:17 +00:00
  • 7535fd7447 - refactoring of CrawlEntry and CrawlStacker - introduced blocking queues in CrawlStacker to make it ready for concurrency - added a second busy thread for the CrawlStacker The CrawlStacker is multithreaded. It shall be transformed into a BlockingThread in another step. The concurrency of the stacker will hopefully solve some problems with cases where DNS blocks. orbiter 2008-12-17 22:53:06 +00:00
  • 6569cbbec1 npe fix: http://forum.yacy-websuche.de/viewtopic.php?t=1646 (break to avoid bad side effects) lotus 2008-12-16 20:53:31 +00:00
  • 18513e2ee2 npe fix: http://forum.yacy-websuche.de/viewtopic.php?t=1646 lotus 2008-12-16 13:36:13 +00:00
  • 2802138787 - refactoring of CrawlStacker (to prepare it for new multi-Threading to remove DNS lookup bottleneck) - fix of shallBeOwnWord target computation heuristic orbiter 2008-12-15 00:02:58 +00:00
  • b1e211b258 no error-alert: http://forum.yacy-websuche.de/viewtopic.php?t=1639 lotus 2008-12-13 12:04:08 +00:00
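
Several commits in this log add query modifiers to the search (site:, filetype:, inurl:), allow them to be combined, and accept leading dots on site: and filetype:. The following is a minimal sketch of how such a query could be split into plain words and modifiers. It is an illustration only, not YaCy's actual parser: the function name parse_query, the regular expression, and the returned structure are all assumptions made for this example.

```python
import re

# Hypothetical helper, NOT YaCy's real implementation: it only illustrates
# the modifier syntax described in the commit messages above.
MODIFIER = re.compile(r"^(site|filetype|inurl):(.+)$")


def parse_query(query):
    """Split a query string into plain search words and modifiers."""
    words = []
    modifiers = {}
    for token in query.split():
        match = MODIFIER.match(token)
        if match is None:
            # Not a modifier: keep it as an ordinary search word.
            words.append(token)
            continue
        name, value = match.group(1), match.group(2)
        if name in ("site", "filetype"):
            # Leading dots are accepted and ignored, e.g. "site:.edu".
            value = value.lstrip(".")
        if value:
            # Modifiers with an empty value after stripping are dropped.
            modifiers[name] = value.lower()
    return words, modifiers
```

With this sketch, the example query from the log, "test site:edu", yields the word list ["test"] and the modifier {"site": "edu"}; all three modifiers can appear in the same query.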