Commit Graph

  • 3c04dd11de removed dead link Michael Peter Christen 2013-05-29 13:42:38 +02:00
  • 1eb9626cca less logging Michael Peter Christen 2013-05-29 13:30:32 +02:00
  • 536fd1450e added new keys for update locations Michael Peter Christen 2013-05-29 13:10:32 +02:00
  • 281959a2d7 added option to re-boot the embedded solr during run-time. Added also API recording for this method so it can be repeated automatically. The index dump generation is now also available for API recording. Added some synchronization in backend which was necessary for this. Michael Peter Christen 2013-05-29 13:09:34 +02:00
  • 80a7989e8c fixed ClassCastException: [Ljava.lang.Object; cannot be cast to [Ljava.util.List; in robots.txt servlet Michael Peter Christen 2013-05-29 12:02:19 +02:00
  • da621e827e prevent NPE in case RWI is disabled orbiter 2013-05-28 16:26:38 +02:00
  • c2bcfd8afb Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git Michael Peter Christen 2013-05-28 11:39:10 +02:00
  • 67757b425a use a retry handler with retryCount=0 because we usually expect requests to fail if we access non-permanently available resources (peers, web pages) and want to fail fast without repeating the same request which is doomed to fail. The previous appearance of http client connection had a 1-2-4-8-second timeout scheme, which caused that connection attempts lasted for 16 seconds. Michael Peter Christen 2013-05-28 11:38:45 +02:00
  • 7300d81f40 include API Table deletion requests to the API recorder Michael Peter Christen 2013-05-28 11:35:56 +02:00
  • c2b1075dcf activating pollImmediately in case that DHT receive is off. This will cause a much faster search result when running in public robinson mode. Michael Peter Christen 2013-05-28 10:36:49 +02:00
  • d2ade87b49 fixed missing thisaddress in yacysearch.html which caused that the opensearch link was not working Michael Peter Christen 2013-05-28 10:33:41 +02:00
  • 179d032181 added a (badly formatted) delete button for process scheduler entries Michael Peter Christen 2013-05-27 16:15:58 +02:00
  • 888a985dc6 set a higher limit for table copy usage orbiter 2013-05-27 15:23:12 +02:00
  • 2b563debbf javadoc of new multiple-exist test Michael Peter Christen 2013-05-27 13:45:09 +02:00
  • c03f75ebc3 fix DHT url receive see http://bugs.yacy.net/view.php?id=242 reger 2013-05-26 03:24:32 +02:00
  • 8fb1b1e290 *) simplified banner creation code Marc Nause 2013-05-25 12:56:43 +02:00
  • cd0b5f31b4 *) updated links to description of regex Marc Nause 2013-05-25 11:08:06 +02:00
  • 8f2d3ce2f9 reduced locking situation in crawler: shifted synchronized location and reduced time-out of robots.txt load limit Michael Peter Christen 2013-05-20 22:05:28 +02:00
  • f93501e6e0 nice crawl name if crawl is started with file:// (was: null) Michael Peter Christen 2013-05-20 11:25:26 +02:00
  • b4f0cac102 added the reindexing job servlet to the submenu structure Michael Peter Christen 2013-05-20 11:02:21 +02:00
  • 97ab5b90e8 - odt & ooxml (office document) parser correction to add content to fulltext index - adjust Junit yacyVersionTest & ParserTest - update yacyVersion.combined2prettyVersion to the default 4-digit minor ver. reger 2013-05-20 01:50:09 +02:00
  • b68fbe7d21 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git Michael Peter Christen 2013-05-17 14:13:07 +02:00
  • 06d3063dc9 - no downcase when using collection modifier - removed warnings Michael Peter Christen 2013-05-17 14:11:10 +02:00
  • 8dbc80da70 redesign of index.exist-test: this shall now not be done using a single id to be tested, but with a collection of ids. This will cause only a single call to solr instead of many. The result is much better performance when testing the existence of many urls. The effect should cause much less IO during index transmission, both on sender and receiver side. Michael Peter Christen 2013-05-17 13:59:37 +02:00
  • 7f63d3747d more generic field selection for reindex option of documents with disabled fields using Luke request to compare config with actual fields in index reger 2013-05-15 23:16:32 +02:00
  • c91c67c3cd reject bad solr requests Michael Peter Christen 2013-05-15 22:42:05 +02:00
  • 44e363f37f refactoring of WorkflowProcessor, added process counter, update of process counter if a blocking thread dies. Added also a new column in PerformanceConcurrency_p servlet to show the actual number of concurrent processes. Michael Peter Christen 2013-05-13 13:28:07 +02:00
  • 4058369288 fixed query expressions for collection selection (added quotes) Michael Peter Christen 2013-05-13 13:27:01 +02:00
  • f2e36fbd06 enhanced deletion process for very large number of documents Michael Peter Christen 2013-05-13 13:26:24 +02:00
  • 79401cb938 added reindex option for documents with disabled or obsolete fields to Solr Schema Editor page (IndexSchema_p.html) this allows to remove obsolete fields from the index (according to current schema config) by selecting all documents containing disabled fields. reger 2013-05-13 04:06:57 +02:00
  • cf36c1614f prevent that concurrent deletion process causes wrong double-check in crawl start orbiter 2013-05-12 21:37:45 +02:00
  • aeff31cd44 fix for workflow processor (cause: latest redesign for less threads) orbiter 2013-05-12 21:36:20 +02:00
  • 77faeada4d small memory leak patch Michael Peter Christen 2013-05-11 11:19:06 +02:00
  • b24d1d18e4 removed synchronization and concurrency in Fulltext class, concurrent deletions are now handled in ConcurrentUpdateSolrConnector Michael Peter Christen 2013-05-11 10:53:12 +02:00
  • f965d04496 added new peer icons for Mentor peers and Mentee peers (not used yet) Michael Peter Christen 2013-05-10 17:33:02 +02:00
  • b9b446bca6 - added ssl configuration sign (a lock) to network statistic/table - fixed a bug in bitfield Michael Peter Christen 2013-05-10 17:32:21 +02:00
  • 7095446ad3 added checkbox (near port) to switch on ssl support (https access) to the admin interface. Michael Peter Christen 2013-05-10 13:49:46 +02:00
  • e6c8b545c2 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git Michael Peter Christen 2013-05-10 12:16:55 +02:00
  • a83c2fe833 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git orbiter 2013-05-10 12:02:40 +02:00
  • 4baa0d4a97 Added a default keystore for ssl encryption of the YaCy web interface. This will enable https-access to YaCy, but this feature is disabled by default using the new server.https=false attribute. This has two purposes: - make it easier for everyone to use https (just set server.https=true) - provide the basis for secure yacy-to-yacy communication in the future orbiter 2013-05-10 12:02:31 +02:00
  • 0aef60f66e Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git Michael Peter Christen 2013-05-10 06:03:24 +02:00
  • da191c839d reduce SolrConnectorLogging setting (from default ALL to INFO) reger 2013-05-10 05:54:07 +02:00
  • aaddb4809c Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git Michael Peter Christen 2013-05-10 04:57:15 +02:00
  • 038f956821 fix for sitemap detection: the sitemap url was not visible if it appeared after the declaration of robots allow/deny for the crawler because the sitemap parser terminated after the allow/deny rules had been found. Now the parser reads the robots.txt until the end to discover also sitemap rules at the end of the file. Michael Peter Christen 2013-05-10 04:56:58 +02:00
  • 4fc6837690 - fix monitor url of crawl job in PerformanceQueues_p.html - reduce logging of every index add (switch embeddedsolr.add from info to debug) reger 2013-05-10 04:38:13 +02:00
  • 442ed50be0 removed some unnecessary synchronizations Michael Peter Christen 2013-05-09 03:06:48 +02:00
  • 9bd2aee180 migrated to solr 4.3.0 Michael Peter Christen 2013-05-09 02:17:53 +02:00
  • ad050ec88d - upgraded httpclient, httpcore and httpmime - removed httpclient 3.1 which has been used by solrj < 4.x.x and is now not used any more - fixed some parts in YaCy which used methods from httpclient 3.1 Michael Peter Christen 2013-05-09 00:22:45 +02:00
  • 4b100f8b48 Merge branch 'master' of ssh://gitorious.org/yacy/rc1 Michael Peter Christen 2013-05-08 23:46:03 +02:00
  • 3abf516ca7 merged classpath Michael Peter Christen 2013-05-08 23:45:29 +02:00
  • a1c989002b fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=4652 generate dht data even if dht receive and dht transmission is switched off orbiter 2013-05-08 16:48:45 +02:00
  • 48e9a54e80 updated pdf parser orbiter 2013-05-08 15:17:06 +02:00
  • e26bdd4a52 fixes to deletion methods (removed unnecessary concurrency and added removal of crawl queue entries) Michael Peter Christen 2013-05-08 13:26:25 +02:00
  • f2c9b0b5f2 better robustness of Concurrent Solr Connector against update/deletion thread failure Michael Peter Christen 2013-05-08 12:41:24 +02:00
  • f7f3e28c5e prevent that the size of the index is computed too many times. Because the index size is now provided by solr, and the only way to do that is a match for [* TO *], a size computation is quite complex and time-consuming. Therefore this patch prevents that the method is called at all and if necessary puts a DOS-preventing barrier in front of it. Michael Peter Christen 2013-05-08 11:50:46 +02:00
  • cca19d94d4 re-declared some fields to be of type string rather than text which makes them more efficient and less large Michael Peter Christen 2013-05-06 16:45:54 +02:00
  • cc90f82dbb increased default proxy client timeout to one minute Michael Peter Christen 2013-05-06 14:58:18 +02:00
  • ed1d5bace6 draw the names of other peers which receive/send dht into the network graphic Michael Peter Christen 2013-05-06 14:27:39 +02:00
  • b528448332 enlarge network graph circle according to image height and reduce the image height in the Network servlet. Overall, the image is now larger but takes less space on the web page. Michael Peter Christen 2013-05-05 23:39:46 +02:00
  • 58d85b5b80 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git Michael Peter Christen 2013-05-05 21:05:47 +02:00
  • 24d2b4baee remove pre 1.0 migration statement which possibly overwrites user navigator setting reger 2013-05-05 05:00:42 +02:00
  • f1bb54943e typo Michael Peter Christen 2013-05-04 09:34:06 +02:00
  • d7fd346917 - added regular-expression based deletions - on-demand collection-list generation for collection-based deletions instead of a default collection-list presentation (this makes calling the interface much faster since the computation of collections lists for large indexes may take some seconds) Michael Peter Christen 2013-05-04 01:14:10 +02:00
  • 3841854c97 abstraction of catchall term Michael Peter Christen 2013-05-04 00:14:22 +02:00
  • ea85674be2 added the date to error documents Michael Peter Christen 2013-05-04 00:14:00 +02:00
  • 72003b109b Merge branch 'master' of git://gitorious.org/yacy/rc1.git reger 2013-05-03 03:56:25 +02:00
  • 4fec35a665 adjust Test case EmbeddedSolrConnector reger 2013-05-03 03:55:14 +02:00
  • 6fafed2180 fix for solr cache when a delete buffer is filled and a document, which is in the delete queue, is replaced with a new one. Michael Peter Christen 2013-05-03 02:03:30 +02:00
  • 20b767f35e preventing score computation in solr where applicable Michael Peter Christen 2013-05-03 02:02:35 +02:00
  • 7de5b9cfa0 fix for http://bugs.yacy.net/view.php?id=233 - check geolocation coordinates and accept only those, which are well-formed - the solr push process does not stop crawling any more if after 20 requests to Solr Solr does not accept the record. Instead, a severe log entry asks the user to create a bug request orbiter 2013-05-03 00:24:39 +02:00
  • e145afb8d6 fix for PerformanceMemory showing UNRESOLVED_PATTERN by removing solr-cache-stuff, which is not available anymore sixcooler 2013-05-02 15:47:21 +02:00
  • ee217dbdee remove sort order in all cases where not needed Michael Peter Christen 2013-04-30 11:44:56 +02:00
  • 70e981b333 prevent that long-running deletion tasks block a hard commit. Michael Peter Christen 2013-04-30 11:09:21 +02:00
  • bb4bf3d8fd infinity timeout bug protection patch Michael Peter Christen 2013-04-30 11:06:48 +02:00
  • 1b102d98d8 - added index deletion to index administration submenu - added index deletion processes to the process scheduler/recorder Michael Peter Christen 2013-04-30 02:11:28 +02:00
  • ee95e772cf Merge branch 'master' of git://gitorious.org/~saranshupscale/yacy/yacy-india-rc1 Michael Peter Christen 2013-04-30 00:20:42 +02:00
  • ab686900c1 New Hindi Translation Saransh Sharma 2013-04-30 03:33:21 +05:30
  • d1be4127e7 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git Michael Peter Christen 2013-04-29 19:31:40 +02:00
  • 0e2ee00fea added an index deletion servlet and some style changes for the 'dangerous' engage-button Michael Peter Christen 2013-04-29 19:30:53 +02:00
  • 1aac722cc6 added another solr connector, the ConcurrentUpdateSolrConnector which does not block when long-running updates to solr are made. This is realized using blocking queues which process all long-running tasks in the background. Also some bugfixes to existing connectors. Michael Peter Christen 2013-04-29 19:30:04 +02:00
  • 0af7803367 added more features to ScoreMap (pretty toString) Michael Peter Christen 2013-04-29 19:28:17 +02:00
  • f36a7da5f6 - re-introduced existById in solr connector. - introduced raw-queries for the re-introduced byId-Queries (they are hopefully faster than full edismax queries) - removed the cached solr connector (testing this) to rely only on the solr built-in search caches. That should save some RAM (also). We will see if this is usable. Michael Peter Christen 2013-04-28 21:20:14 +02:00
  • e4f7e5bcfe fixed bad css change Michael Peter Christen 2013-04-28 20:09:45 +02:00
  • 46fa800bc7 added httpstatus_i to automatically switched on fields (used in all search queries) reger 2013-04-27 03:11:44 +02:00
  • 3502b4c697 refactoring (renaming) of yacy-solr api Michael Peter Christen 2013-04-27 01:32:18 +02:00
  • 3a0fcfbeda Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git Michael Peter Christen 2013-04-26 10:50:08 +02:00
  • 25499eead5 - added a new field for the regular expression in crawl start - added the field in crawl profile - adapted logging and error management - adapted duplicate document detection - added a new rule to the indexing process to reject non-matching content - full redesign of the expert crawl start servlet The new filter field can now be seen in /CrawlStartExpert_p.html at Section "Document Filter", subsection item "Filter on Content of Document" Michael Peter Christen 2013-04-26 10:49:55 +02:00
  • 0a9b0992f3 RankingSolr_p: include warning if boost field not in local index reger 2013-04-26 02:26:38 +02:00
  • e1bfe9d07a - reduction of the concurrently running processes to make YaCy more adjusted to smaller and 1-core devices. - the workflow processor now starts no process at all. these are started as soon as parser/condenser/indexing queues are filled. - better abstraction orbiter 2013-04-25 11:33:17 +02:00
  • c091000165 added collection attribute also to the rss feed reader Michael Peter Christen 2013-04-24 01:14:35 +02:00
  • 43ca359e24 Merge branch 'master' of ssh://gitorious.org/yacy/rc1 Michael Peter Christen 2013-04-23 21:01:08 +02:00
  • 2d60dfb3e1 Merge branch 'master' of git://gitorious.org/~saranshupscale/yacy/yacy-india-rc1 Michael Peter Christen 2013-04-23 21:00:49 +02:00
  • f7571386a3 added a 'collection' property attribute in yacysearch.html which can be used to select between different collections as defined during a crawl start with the 'collection' attribute. This actually implements the ability to prepare search tenants which restrict their search results to a specific collection. The main use for this is to provide tenants to the yaml4 interface (at this time). orbiter 2013-04-23 20:42:54 +02:00
  • 04b61e08c8 More Translation Saransh Sharma 2013-04-23 19:31:17 +05:30
  • 3e79bd4b1f Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git orbiter 2013-04-23 12:15:46 +02:00
  • d571e739b6 increased row limitation for authorized users from 10000 to 100000000 in solr interface orbiter 2013-04-23 12:15:33 +02:00
  • d937c55204 extended limitation of dom export size from 100000 to 100000000 Michael Peter Christen 2013-04-22 22:33:13 +02:00
  • fc2095ac67 some extensions to raster plotter to transform a RGB picture to an indexed color scheme. This is needed for gif animations Michael Peter Christen 2013-04-22 14:33:04 +02:00
  • c1a2175fbc added transparency to gif image animation and the integration to the YaCy httpd for on-the-fly generated gifs (including animated gifs) Michael Peter Christen 2013-04-21 12:29:05 +02:00
  • a1fffe8e86 fixed default ranking values Michael Peter Christen 2013-04-21 12:27:27 +02:00
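
Commit 038f956821 above describes a robots.txt parser that used to stop once the allow/deny rules had been found and therefore missed Sitemap lines placed later in the file. The behaviour after the fix (read the file to the end) can be sketched roughly as follows. This is an illustrative Python sketch, not YaCy's Java code: the function name parse_robots is hypothetical, and the sketch deliberately ignores user-agent grouping and Allow rules.

```python
def parse_robots(text):
    """Collect Disallow paths and Sitemap URLs from a robots.txt body.

    Simplified on purpose: user-agent groups and Allow rules are ignored.
    """
    disallow, sitemaps = [], []
    for raw in text.splitlines():
        # Strip trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        # Split only on the first colon so URLs keep their "http:" part.
        key, value = line.split(":", 1)
        key, value = key.strip().lower(), value.strip()
        if key == "disallow" and value:
            disallow.append(value)
        elif key == "sitemap" and value:
            sitemaps.append(value)
        # No early exit: the loop always runs to the last line, so a
        # Sitemap entry after the rule block is still discovered.
    return disallow, sitemaps


rules, sitemaps = parse_robots(
    "User-agent: *\n"
    "Disallow: /private/\n"
    "\n"
    "Sitemap: http://example.org/sitemap.xml\n"
)
print(rules, sitemaps)
```

The point of the sketch is the missing early return: the rule lists are only handed back once every line has been inspected.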