Commit Graph

  • 7feb549ce6 Small HTML-Fixes suessthomas 2010-07-04 22:16:58 +00:00
  • aa66da5135 corrected hint for debian installation update orbiter 2010-06-30 14:31:16 +00:00
  • 7188c54ddb patch to get dht access to developer peers orbiter 2010-06-30 08:42:29 +00:00
  • 25024d6ab2 fix for problem when accessing the metadata index. The index was not available for all peers with no RAM table copy. orbiter 2010-06-30 07:22:50 +00:00
  • 8e88fa4a62 *) fixed indentation (tab vs. spaces) *) added Android packages MIME type low012 2010-06-29 21:31:22 +00:00
  • b6fb239e74 redesign of parser interface: some file types are containers for several files. These containers had been parsed in such a way that the set of resulting parsed content was merged into one single document before parsing. Using this parser infrastructure it was not possible to parse document containers that contain individual files. An example is an rss file where the rss messages can be treated as individual documents with their own url reference. Another example is a surrogate file which was treated with a special operation outside of the parser infrastructure. This commit introduces a redesigned parser interface and a new abstract parser implementation. The new parser interface now has only one entry point and always returns a set of parsed documents. In case of single documents the parser method returns a set of one document. To be compliant with the new interface, the zip and tar parsers have also been completely redesigned. All parsers are now much simpler and cleaner in their structure. The switchboard operations have been extended to operate with sets of parsed files, not single parsed files. Additionally, parsing of jar manifest files has been added. orbiter 2010-06-29 19:20:45 +00:00
  • 59c894029b removed confusing double set button in ConfigHeuristics orbiter 2010-06-28 22:27:20 +00:00
  • d4851441b0 *) Added Android packages to parser in order to be able to create a decentralized search for direct downloads of Android apps. low012 2010-06-28 20:41:08 +00:00
  • 150cf42a1b migrated all my LGPL 3-licensed files to the LGPL 2.1 because LGPL 3 is not compatible with the GPL 2; see http://www.gnu.org/licenses/license-list.html for an explanation. Since (as far as I know) nobody else has ever contributed to these files I may be allowed to just apply an older license. You may consider this as a dual-licensing and may use and optionally replicate the older files under GPL 3. orbiter 2010-06-28 16:25:14 +00:00
  • 11b7853940 added a configuration page for search heuristics. currently you can switch on there: - a site-operation heuristic that loads all direct links from a portal page if the site-operator is used - a direct crawl for search results from scroogle for the given search terms The configuration page can be found directly beside the network configuration page orbiter 2010-06-27 21:38:16 +00:00
  • 5d00888c95 - added animated visualization for DHT-in and DHT-out in network graphic - found and fixed a possible memory leak in YaCy internal RSS feed system - some refactoring in RSS feed mechanisms to make this possible orbiter 2010-06-27 10:45:20 +00:00
  • bf25407fdd added peer hash to internal RSSFeed. The hash will be used to display news activities in the network graphic. orbiter 2010-06-26 23:10:57 +00:00
  • 1557e0f2d0 - some refactoring for internal RSSFeed (protocol of all actions as seen on status page) - added dht-out to internal RSSFeed (you can see now messages about distributed indexes on status page) orbiter 2010-06-26 22:39:27 +00:00
  • 5a4684f21f allow words with length >= 2 (you can't search for 'wm' with 3-letter words...) let's try that. If we run into a memory problem because of too many 2-letter words, then we must introduce whitelists for 2-letter words. orbiter 2010-06-26 16:31:26 +00:00
  • b5e190099d - updated pdfbox and fontbox to 1.1.0 - added license file to sbbi-upnplib orbiter 2010-06-26 10:58:07 +00:00
  • 37b8827a7a - removed the UPnP library sources from sbbi and added the jar library again. The library was included to get support for fedora releases, but after this time the fact that the sbbi cannot be part of fedora should be re-discussed. If this will still not be possible, then we may integrate the sbbi UPnP package using reflection. - cleaned up the code. The new eclipse helios provided new warnings for dead code. This change cleans up most of these warnings orbiter 2010-06-26 10:32:47 +00:00
  • dcd01698b4 added a 'transition feature' that shall lower the barrier to move from g**gle to yacy (yes!): orbiter 2010-06-25 16:44:57 +00:00
  • d5d48b8dc7 enhanced network animation (smooth loading, reload not all 4 animation phases at once) orbiter 2010-06-24 15:01:26 +00:00
  • 103c848af8 enhancements in image drawing speed orbiter 2010-06-24 13:20:45 +00:00
  • 3a9dc52ac2 added a fascinating new way to search _and_ start a web crawl at the same time: implemented a hint from dulcedo "use site: - operator as crawl start point". YaCy already was able to search using a site-constraint. This function is now extended with an instant crawling feature. When you now use the site-operator, then the landing page of the site and every page that is linked from this page are loaded, indexed and selected for the search result within that search request. When the remote server responds quickly enough, then this process can result in search results during the normal search result preparation .. just in some seconds. orbiter 2010-06-23 11:19:32 +00:00
  • 8e3cbbb6a9 more animation: update of network image every 10 seconds orbiter 2010-06-23 10:29:04 +00:00
  • 2b4f8f6c06 animated network graphic! orbiter 2010-06-23 10:05:08 +00:00
  • d7767e7589 IFFRESH is too strong, IFEXIST sufficient for cache policy when doing a link verification (this is as it was two commits before) orbiter 2010-06-22 19:16:26 +00:00
  • 777195e8d1 more abstraction for access of LoaderDispatcher and cache orbiter 2010-06-22 12:28:53 +00:00
  • 7bcfa033c9 more abstraction of the htcache when using the LoaderDispatcher: a cache access shall not be made directly to the cache any more, all loading attempts shall use the LoaderDispatcher. To control the usage of the cache, an enum instance from CrawlProfile.CacheStrategy shall be used. Some direct loading methods without the usage of a cache strategy have been removed. This affects also the verify-option of the yacysearch servlet. If there is a 'verify=false' now after this commit this does not necessarily mean that no snippets are generated. Instead, all snippets that can be retrieved using the cache only are presented. This still means that the search hit was not verified because the snippet was generated using the cache. If a cache-based generation of snippets is not possible, then the verify=false causes that the link is not rejected. orbiter 2010-06-21 14:54:54 +00:00
  • fd9f0714a3 added link verification, global search and navigation to opensearch description. Hint: the YaCy search can easily be integrated into the firefox search window: Just start a search, then open the pop-up menu inside the firefox search input window and select "add search engine" orbiter 2010-06-20 11:04:11 +00:00
  • 7e2d6fac12 patch for bad values during local search join orbiter 2010-06-20 00:31:00 +00:00
  • 2ddb952a5c added the (fixed and enhanced) secondary search process. The process was disabled for some time. The search process for more than one word should be enhanced now and produce much more results. orbiter 2010-06-20 00:11:12 +00:00
  • 58035ef784 fix in snippet loading orbiter 2010-06-18 19:36:11 +00:00
  • 986d4f34d9 added a consistency check for new queues orbiter 2010-06-18 18:59:42 +00:00
  • 73f03e05ee fixed a bug in snippet fetch strategy: cache only does not help if resource can only be found in web orbiter 2010-06-18 15:25:25 +00:00
  • fbf021bb50 redesign of index abstract processing - currently disabled until enough peers have fix in SVN 6928 orbiter 2010-06-18 09:44:21 +00:00
  • 090eae2cf5 fix for broken index abstract generation orbiter 2010-06-18 09:25:44 +00:00
  • e78fd21fca Added German translation in de.lng for new DictionaryLoader_p.html Geolocalization component loader page mikeworks 2010-06-18 06:36:18 +00:00
  • 87087f12fe - scanned remote search process and enhanced some data structure and synchronizations here and there - removed concurrency overhead for small number of index normalizations as it happens during remote search - removed 'load only parseable' constraint for snippet fetch because some resources may not have any url file extension and these had therefore not been parseable and searchable since they may become parseable after loading when their mime type is known - this partly fixes some problems with http://forum.yacy-websuche.de/viewtopic.php?p=20300#p20300 but more changes are necessary to get all expected search results orbiter 2010-06-17 11:59:40 +00:00
  • 7ddb70e7c6 new license for ai.greedy component: LGPL (nobody other than me modified that code) orbiter 2010-06-16 22:16:03 +00:00
  • b62fb38344 fix for case where no release provider responds during auto-update (caused NPE) orbiter 2010-06-16 18:43:45 +00:00
  • de4f30bb2e UTF-8 fix orbiter 2010-06-16 15:22:31 +00:00
  • 3a1cebb598 bugfixes orbiter 2010-06-16 15:11:21 +00:00
  • 989819a28c - reduced peer-ping time-out from 30 to 10 seconds - no re-try for the peer ping any more (it's a test, let's see what happens) orbiter 2010-06-16 08:30:13 +00:00
  • 51332b787d reverted SVN 6869 as discussed with dulcedo in car after LinuxTag: missing time-out may be cause of locks during DHT-out orbiter 2010-06-15 20:30:53 +00:00
  • 4eab6473d3 option to set more than 9999 MB RAM in input field :-) orbiter 2010-06-15 20:26:15 +00:00
  • b03caaa57a better handling of OOM situations orbiter 2010-06-15 19:44:05 +00:00
  • 56ff9d5fd4 - extended news size from 512 to 1024 characters - a new news db will be created (news1024.db), the old one (news.db) can be deleted - peers with too large news payload are not ignored any more (they may have been invisible because they had a too large news payload!) orbiter 2010-06-15 10:43:47 +00:00
  • 353a924760 - changed default memory to 500m - now xms is lower than xmx (let's see what happens) - removed default path for intranet crawl starts to avoid confusion as seen on linuxtag - added time-out to upnp request (I have a new router which may need that) orbiter 2010-06-14 21:36:40 +00:00
  • 0f3a3e34e1 Updated German translation de.lng and fixed typos in html files (english) mikeworks 2010-06-11 13:51:49 +00:00
  • 5251a18e65 de.lng: Added new Network.html German translations Network.html: shortened some <br /> tags to <br/> ConfigBasic.html fixed some typo cann for German translation file mikeworks 2010-06-11 00:17:23 +00:00
  • a33f39832e - small change in display of use cases - explain usage of ftp, smb and file search domains orbiter 2010-06-06 23:26:04 +00:00
  • c71d829bb5 more time-out properties for http connection manager orbiter 2010-06-01 23:37:43 +00:00
  • 1610c81dff fixes for embedded search / search widget orbiter 2010-06-01 22:09:17 +00:00
  • 60e71876ad - more abstraction (HashMap -> Map) - more concurrency-awareness (HashMap -> ConcurrentHashMap) orbiter 2010-06-01 13:02:11 +00:00
  • a83772c71b fixes and enhancements for balancer: - crawl lists for each domain now uses a HandleSet which should use less memory than LinkedLists - but: fill more entries into the domain lists (all available entries) - fixes to selection criteria (best domain selection) orbiter 2010-06-01 09:30:23 +00:00
  • 9cde05418f fixed url crawl list display orbiter 2010-05-31 00:27:00 +00:00
  • 2eea806005 fewer errors in image parser orbiter 2010-05-30 11:18:05 +00:00
  • 30b337fa9f fixes to balancer when crawling filesystem (problem was: host == null) orbiter 2010-05-30 11:17:38 +00:00
  • 844853243a fixed balancer time guessing orbiter 2010-05-30 10:28:42 +00:00
  • 2e679b1302 Small Fixes - no functional Changes suessthomas 2010-05-27 21:01:22 +00:00
  • 3f93a0cc8f redesign of remote proxy settings orbiter 2010-05-26 00:01:16 +00:00
  • 11639aef35 - added new protocol loader for 'file'-type URLs - it is now possible to crawl the local file system with an intranet peer - redesign of URL handling - refactoring: created LGPLed package cora: 'content retrieval api' which may be used externally by other applications without yacy core elements because it has no dependencies to other parts of yacy orbiter 2010-05-25 12:54:57 +00:00
  • 2fd795207c adaptive network peer counter table orbiter 2010-05-24 22:41:14 +00:00
  • 6950d8a33d fixes to SMB crawler orbiter 2010-05-23 01:17:44 +00:00
  • 0eafd94b22 corrected colors of network legend orbiter 2010-05-22 23:09:19 +00:00
  • 9977fb9cf5 more enhancements to Network servlet: own peer in overview orbiter 2010-05-21 23:50:39 +00:00
  • bfdb9f4e06 extended statistics on Network servlet page - added number of online peers at the last day and the last week - changed design of statistic table - network picture now shows exactly those peers that are counted in the statistic overview for one day orbiter 2010-05-21 23:27:32 +00:00
  • 431852f0a7 testing new 'search on map' image (slightly larger) orbiter 2010-05-21 13:12:47 +00:00
  • e40542579e fixes for wrong attribute name search->query (SRU) orbiter 2010-05-21 13:02:35 +00:00
  • 903ff21478 increased default time-out orbiter 2010-05-21 09:09:26 +00:00
  • 98c1d65415 - show up to 10 locations (maps) after search (instead of a max of 5) - order locations by (primary) population and (secondary) longitude (reverse ordering, both) - added population from GeoNames, OpenGeoDB does not have that information - changed default viewpoint of map to (30,15); shows more land and europe in the center orbiter 2010-05-21 08:18:04 +00:00
  • 9842fab6e4 - fixes to query parameter - replaced/removed search query attribute (was old style, new is 'query' according to SRU) orbiter 2010-05-20 22:05:04 +00:00
  • 6ec9ced4cd - fix for multi-word search for locations - changed description text to 'title' entity (subject is a list of keywords and was very messed) - added ViewFile in location pop-up orbiter 2010-05-20 15:07:57 +00:00
  • 7f35e1955e Added alt tag and width and height properties to earthsearch.png in yacysearchtrailer.html for HTML validity Added alt tag to page tabs in yacysearch.java for HTML validity Added new German translations for geo search phrase in de.lng mikeworks 2010-05-20 06:36:02 +00:00
  • 1defd580bc - added option to localization search to distinguish between a search for a location according to the search word only or for the relation between web search results and locations found in the metadata fields - used that to display two layers on map: cities and search result locations - added many marker graphics for the display of the markers on the map - some refactoring of the yacy news code plus bugfixes for latest move from Tree to Table data structure orbiter 2010-05-19 12:53:09 +00:00
  • ad823a4716 *) minor changes (only cosmetics, no functional changes) low012 2010-05-18 21:31:59 +00:00
  • dcac90d2f9 *) removed unnecessary import low012 2010-05-18 21:09:41 +00:00
  • 1e8c6cefae - added 'search on map' - Link to search result page - added default search option to location search - show default search in search window on location search page - added icon for location search orbiter 2010-05-18 14:48:54 +00:00
  • 227ebc6651 - added more map layers to the new location search: openstreetmap (mapnik, osmarender, cycle map) - cycle map is default because it looks best at 'world view' - added control elements to map - increased map size - added deletion of search results for each time when a new search is done - moved search box up and added yacy icon in such a way that the search page looks exactly the same as the standard search orbiter 2010-05-18 13:52:15 +00:00
  • bd0a9df895 fix for bad location double check orbiter 2010-05-18 11:54:30 +00:00
  • b7556893c6 removed terminate buttons for built-in crawl profiles in crawl profile editor orbiter 2010-05-18 07:08:01 +00:00
  • a5ec7db4ab *) Oops! low012 2010-05-17 19:44:02 +00:00
  • b02078b58c *) added visualization of GeoRSS search (very basic, but it's a start...) *) removed double code low012 2010-05-17 19:39:41 +00:00
  • bd2587954a ? orbiter 2010-05-16 01:05:45 +00:00
  • e43e61e502 added another geolocalization data source: GeoNames - added downloader option in DictionaryLoader - added generalization (interfaces and overarching localization) - more abstraction using the libraries orbiter 2010-05-15 23:49:30 +00:00
  • c9862e0ca9 *) removed unnecessary imports low012 2010-05-15 01:17:45 +00:00
  • 76aea981ec *) added W3C geo GeoRSS (see http://en.wikipedia.org/wiki/Georss) low012 2010-05-15 01:11:03 +00:00
  • 118d589eff replaced the very very old data structure 'Records' with a simple table to fix the problem from http://forum.yacy-websuche.de/viewtopic.php?p=20066#p20066 orbiter 2010-05-15 00:59:02 +00:00
  • 734298facd *) added missing namespace declarations in yacysearch.rss *) minor change in yacysearchtrailer.java low012 2010-05-15 00:09:53 +00:00
  • 2a8f70f0ca - fix for caching of OSM tiles. if you want that this fix applies to your peer, please delete the crawl profiles - fix for initial generation of crawl profiles (one more reason to remove your crawl profiles) - more String -> byte[] migration - more logging for cache store/hit orbiter 2010-05-14 23:50:07 +00:00
  • 2126c03a62 - removed download-limit that can be given for the crawler for non-crawler download tasks. This was necessary because the same procedure was used for other downloads like for the download of dictionary files where a limit is not useful. The limit still stays for the indexer - migrated the opengeodb downloader to a new version of the opengeodb-dump orbiter 2010-05-14 18:30:11 +00:00
  • 3661cb692c added dictionary loader servlet that can be used to get the geolocalization file: /DictionaryLoader_p.html Will also be used for more dictionary files in the future orbiter 2010-05-14 09:52:53 +00:00
  • 90fa8fd4d4 - support gpx file extension - non-blocking location search (time-out handling was wrong) orbiter 2010-05-12 08:49:20 +00:00
  • b0927d26e0 *) fix for "more options" link *) removed surplus code low012 2010-05-12 00:48:24 +00:00
  • 439b44be9e removed exit from computation in ReferenceContainerArray.get merge method; a warning is still given, but method computes at normal operation. see also: http://forum.yacy-websuche.de/viewtopic.php?p=20038#p20038 orbiter 2010-05-11 23:36:40 +00:00
  • 7b880d73d0 adjustments to granted query size orbiter 2010-05-11 23:28:43 +00:00
  • 4cd56d3966 - fix for http://forum.yacy-websuche.de/viewtopic.php?p=20036#p20036 - enhancement to kml search orbiter 2010-05-11 23:06:39 +00:00
  • 586bc4d920 - remove superfluous entries in remote search tracker handles - avoid concurrent access from same client this is a fix for http://forum.yacy-websuche.de/viewtopic.php?p=20045#p20045 orbiter 2010-05-11 22:26:18 +00:00
  • 789c6b26ce added a location search service: using the following servlet/example: http://localhost:8080/yacysearch_location.kml?query=berlin&maximumTime=2000&maximumRecords=100 orbiter 2010-05-11 12:58:05 +00:00
  • f23cbd2dab more bugfixes to date parser orbiter 2010-05-11 11:32:46 +00:00
  • cf43bdc87e This is a large bugfix and enhancement commit to support a better location detection for data - fixes to http file server session handling - fixes and enhancements to metadata date/time handling - added dc:publisher metadata field and updated all document parsers - fixed bug in metadata read procedure - enhanced dublin core and rss parser to understand more fields more properly - enhanced url selection in case that multiple urls are given in surrogates - fix for condenser; failure when last word does not end with termination symbol orbiter 2010-05-11 11:14:05 +00:00
  • 6eba2cb96b fix in bmp parser orbiter 2010-05-09 13:27:58 +00:00
  • c6d9a12a99 fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=2810 lotus 2010-05-09 11:21:11 +00:00