Commit Graph

  • 75d5e3475d Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git Michael Peter Christen 2012-08-29 10:13:51 +02:00
  • a2841261bd content control: apply filter if enabled to crawls cominch 2012-08-29 09:52:14 +02:00
  • dc468dad01 add content control features for custom filter lists cominch 2012-08-29 09:04:28 +02:00
  • 316b5fe116 - added a solr type definition verifier - fixed type definition found by the verifier - added multivalue-string fields for solr with extension 'sxt' - added multivalue-integer fields for solr with extension 'val' - renamed some solr attributes from txt to sxt - changed solr query line to an explicit AND/OR structure - added a country code second level domain list to Domains class; with parser - added a host string parser to get domain class name, country-code second-level domain and subdomain out of it - removed old coordinate attributes Michael Peter Christen 2012-08-28 16:58:06 +02:00
  • a3d5959981 Merge commit '65d49df865f60511d22d86fb15c33a082176e7ab' orbiter 2012-08-27 16:56:22 +02:00
  • 4521d63c92 added boosts to solr search queries Michael Peter Christen 2012-08-27 15:25:25 +02:00
  • 4c79ddb91e switched off some solr logging Michael Peter Christen 2012-08-27 14:41:47 +02:00
  • e8acd542b5 - added faceted drill-down for host and geolocation to solr queries - added a new geolocation field to index schema, the old values are migrated if possible Michael Peter Christen 2012-08-27 14:41:33 +02:00
  • f00168ecc5 added gsa result attribute 'has' Michael Peter Christen 2012-08-27 12:15:42 +02:00
  • 65d49df865 security fix: clear automatic password only if adminAccountForLocalhost=false to prevent remote access to protected pages after restart. if adminAccountForLocalhost=true leave automatic password unchanged so access from local host is granted but remote access is prevented from the 1st second. reger 2012-08-26 22:28:14 +02:00
  • 52a62af184 Merge branch 'master' of git://gitorious.org/yacy/rc1.git reger 2012-08-26 22:12:59 +02:00
  • 2094df2e4e - correct length computation for BStringObject (bugfix suggested by apfelmaennchen) - using ASCII for string conversion for Strings generated from Integer orbiter 2012-08-26 17:46:40 +02:00
  • 2d2be546fe fix path to env/grafics to display api icon on meta data page reger 2012-08-26 04:36:52 +02:00
  • 6d03433cda - added hack to prevent stream servlet paths from being parsed wrongly if the path contains a dot. - also added warnings if documents are requested which do not exist. orbiter 2012-08-25 19:08:42 +02:00
  • a1227879a9 release 1.1 Release_1.1 orbiter 2012-08-24 23:59:10 +02:00
  • 7ac259477f added a direct access to solr search api to enhance the visibility of the embedded solr orbiter 2012-08-24 23:04:19 +02:00
  • 67f2866cd0 small fixes orbiter 2012-08-24 21:44:22 +02:00
  • ce156a01ba Merge commit 'c2341a175fdd755a34965ff63c7ea437b380352d' orbiter 2012-08-24 18:24:24 +02:00
  • c2341a175f Fixed a bug that prevented YaCy from indexing files with non-ASCII filenames on FTP servers. David Rubio 2012-08-24 17:45:14 +02:00
  • 3ebc4264c5 fixed concurrent query orbiter 2012-08-24 14:15:40 +02:00
  • 29171e2f6c fixed generation of ontologies from index enumerations orbiter 2012-08-24 14:13:42 +02:00
  • 7cd302de3e omit xml parsing when using the embedded solr server orbiter 2012-08-24 12:18:30 +02:00
  • 787e1c6836 added the QueryResponse query(SolrParams params) method to the SolrServerConnector which is necessary to use facets in solr search. orbiter 2012-08-23 11:53:54 +02:00
  • 01a63ef595 redesign of YaCySchema and SolrDoc handling orbiter 2012-08-23 09:51:45 +02:00
  • 479bfca571 refactoring orbiter 2012-08-23 09:30:11 +02:00
  • 48a82bc705 log queries anonymously from gsa+solr requests Michael Peter Christen 2012-08-22 23:50:40 +02:00
  • ab6ec4ec52 added snippet computation to solr/rss and gsa result writer Michael Peter Christen 2012-08-22 17:37:34 +02:00
  • 4716546ef5 - reduced memory usage in index transmission using a transformation of Node to Row objects - removed peerDeparture in solr remote search in case that peer does not answer (this may be normal because it is allowed to switch this off) Michael Peter Christen 2012-08-22 16:30:33 +02:00
  • af764c106c re-activated audio and video search because they obviously work (!) Michael Peter Christen 2012-08-22 01:56:13 +02:00
  • 06b0081fdc fix for NPE during host navigation computation Michael Peter Christen 2012-08-22 01:55:39 +02:00
  • feb99bc291 fixed GSA format Michael Peter Christen 2012-08-22 00:48:37 +02:00
  • 653645c1cf corrected solr query syntax Michael Peter Christen 2012-08-22 00:48:03 +02:00
  • 08ae142a3d - enhanced caching after search queries to solr - reduced caching after short memory Michael Peter Christen 2012-08-22 00:31:14 +02:00
  • 716ea0cfe2 sorted the solr schema into mandatory and optional fields; reduced number of used fields to reduce solr index size orbiter 2012-08-21 23:52:56 +02:00
  • 9b8c8c0f47 fix from gaston in http://forum.yacy-websuche.de/viewtopic.php?p=26909#p26909 orbiter 2012-08-21 21:03:26 +02:00
  • acb9f04e80 removed unused classes orbiter 2012-08-21 18:18:30 +02:00
  • 0ad52ac4c3 gsa bugfix for date parser Michael Peter Christen 2012-08-21 02:39:28 +02:00
  • 3ce4c2f937 fixes for gsa result format Michael Peter Christen 2012-08-21 01:57:46 +02:00
  • 2d5fdfeb65 added authorization-based maximum results limitation to solr and gsa search Michael Peter Christen 2012-08-20 17:10:48 +02:00
  • 67d235fae9 added gzip encoding to solr2sor http interface, client side (server already works) Michael Peter Christen 2012-08-20 16:53:21 +02:00
  • a049761e0c fixed double-check Michael Peter Christen 2012-08-20 14:16:37 +02:00
  • 6fc5400f91 added a tooltip for search navigation to mention that search pages can be navigated using the TAB key Michael Peter Christen 2012-08-20 13:02:29 +02:00
  • f42a57cd7d gsa format update Michael Peter Christen 2012-08-20 12:50:51 +02:00
  • b3aad6cc35 bugfix for remote search when search is done to solr Michael Peter Christen 2012-08-20 12:21:36 +02:00
  • ff3eaa21b0 added remote search to solr on YaCy peers! - when doing a remote search, node peers are selected for solr queries - the solr query is done concurrently to the standard YaCy rwi search - the solr search result is fed into the same data structure that prepares the rwi search result - the same remote search that is done to several outside peers is done to the local solr index - the search process works now also without any 'old' RWI data using solr Michael Peter Christen 2012-08-20 12:16:11 +02:00
  • a06123aec6 more abstraction and less parameter overhead for remote search Michael Peter Christen 2012-08-20 01:29:15 +02:00
  • f00733186b code simplifications Michael Peter Christen 2012-08-19 13:17:03 +02:00
  • 755f5e76cf removed strange assert statements and simplified code in metadata transformation Michael Peter Christen 2012-08-19 08:44:39 +02:00
  • db0d438709 fix for http://bugs.yacy.net/view.php?id=206 Michael Peter Christen 2012-08-19 08:43:56 +02:00
  • 404b0aab09 refactoring in remote search and stub for remote node peer selection orbiter 2012-08-18 23:59:25 +02:00
  • d7ea45f698 - get nice text_t values from metadata conversions that are stored into solr as fulltext search index. - added slow migration from old metadata to solr index entries: each entry from the old metadata is removed from that data structure and written into solr. orbiter 2012-08-18 19:36:21 +02:00
  • 99ef57f103 reduced sleep times orbiter 2012-08-18 17:48:20 +02:00
  • 780f8974e7 added remaining iteration methods for solr in fulltext class orbiter 2012-08-18 15:39:14 +02:00
  • acd2dc3575 hack to remove StringBuilder overhead in query construction orbiter 2012-08-18 14:22:00 +02:00
  • db6863db77 reduced solr cache sizes to check if that solves memory problems a bit orbiter 2012-08-18 13:45:37 +02:00
  • 6f01542aaa explicit double-check in transferURL orbiter 2012-08-18 13:18:51 +02:00
  • ee01c12e56 fixes for putDocument and putMetadata orbiter 2012-08-18 13:05:27 +02:00
  • cc47a0876e reverted bf55f69176 to have a fall-back option in case that memory problems as reported in http://forum.yacy-websuche.de/viewtopic.php?p=26901#p26901 for full-solr installation are too strong and we have to work with a 'small memory footprint' peer system. orbiter 2012-08-18 10:28:40 +02:00
  • 0904afe8fb added concurrent iterator methods to the solr connectors Michael Peter Christen 2012-08-17 18:22:56 +02:00
  • d54b80327a refactoring Michael Peter Christen 2012-08-17 17:28:27 +02:00
  • f9fc5cfaba better check for bad urls in url transmission Michael Peter Christen 2012-08-17 17:17:00 +02:00
  • d39463a85c added deleteByQuery to solr connectors Michael Peter Christen 2012-08-17 17:05:46 +02:00
  • 0cab06c47c refactoring Michael Peter Christen 2012-08-17 15:52:33 +02:00
  • bf55f69176 removed write methods to old metadata file type; all metadata now goes to solr Michael Peter Christen 2012-08-17 15:46:26 +02:00
  • 40c0856489 refactoring Michael Peter Christen 2012-08-17 15:33:02 +02:00
  • 2ccf1dba71 upgrade to solr 3.6.1 Michael Peter Christen 2012-08-17 15:11:21 +02:00
  • e651d3e320 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git Michael Peter Christen 2012-08-17 14:45:18 +02:00
  • 06a78eecb7 code simplification Michael Peter Christen 2012-08-17 14:43:32 +02:00
  • 54bea21c02 bugfix for solr connector, possibly a cause for http://forum.yacy-websuche.de/viewtopic.php?p=26893#p26893 Michael Peter Christen 2012-08-17 14:34:31 +02:00
  • 9bece5ac5f enhanced snippet fetch - removed a bug that caused documents to be parsed even if a solr text was available Michael Peter Christen 2012-08-17 14:22:07 +02:00
  • 8a91f4fa42 local robots.txt: disallow external crawlers to follow the URL proxy cominch 2012-08-17 11:47:39 +02:00
  • 18f989dfb1 - refactoring (load -> getMetadata) - added getDocument to retrieve Solr documents which shall replace getMetadata Michael Peter Christen 2012-08-17 01:34:38 +02:00
  • 395b78a0d8 using the solr search index to concurrently search within solr and the rwis during local search requests. Michael Peter Christen 2012-08-17 01:21:56 +02:00
  • 6197caf698 added clear-text search words in query params Michael Peter Christen 2012-08-16 23:05:37 +02:00
  • efafa79db5 - added a content-encoding: gzip to streamed http server responses - finish and close streamed http responses immediately - this applies only to the solr interface which should be much faster now! Michael Peter Christen 2012-08-16 22:35:19 +02:00
  • 23226676c6 FOR THE BRAVE.. this is a forced migration to solr which is now ready for production as a replacement of the metadata-db. This intermediate release 1.041 will switch on the previously optional solr index and the old metadata-db will still work as it did before. Solr+metadata are accessed in mixed mode, no migration is done yet. If this does not cause a catastrophe by the end of the weekend, we will do a YaCy 1.1 main release containing this as default. Michael Peter Christen 2012-08-16 18:17:47 +02:00
  • a1b2c9a67d doctype2mime fix, influences metadata conversion between old metadata and solr Michael Peter Christen 2012-08-16 17:49:35 +02:00
  • 7c31be1c80 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git Michael Peter Christen 2012-08-16 17:45:26 +02:00
  • 6456a1656a changed local robots.txt to prevent external crawlers from submitting random search queries cominch 2012-08-16 17:38:10 +02:00
  • a16206e38b more attempts to clean the index (cleaning is faster then) Michael Peter Christen 2012-08-16 17:24:25 +02:00
  • 703f427303 fixed some peer-ping connection details - larger time-out - removed too old seedlist - fixed a bug in connection test Michael Peter Christen 2012-08-16 17:11:54 +02:00
  • 597bb76e4f get the peer location more quickly Michael Peter Christen 2012-08-16 16:28:57 +02:00
  • 156d457aec fix for Index out of bounds exception in Network servlet orbiter 2012-08-16 07:47:52 +02:00
  • da93addec3 addon to e74d66e28c (removed htmlparser.jar): for Mac App orbiter 2012-08-16 07:28:38 +02:00
  • ae9cd7a118 fix xss bug #204 Lotus 2012-08-15 14:23:21 +02:00
  • 1641835fef replaced yacy xml encoding by solr xml encoding Michael Peter Christen 2012-08-14 13:29:11 +02:00
  • 89fe13e73d enhanced GSA and RSS output format: corrected date, added some missing fields, added xml encoding for utf8 Michael Peter Christen 2012-08-14 13:19:29 +02:00
  • ea49a8aa8c Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git Michael Peter Christen 2012-08-14 12:40:44 +02:00
  • d988ba50cf added a very rudimentary, incomplete, non-verified GSA response writer for solr. Try this: http://localhost:8090/gsa/searchresult?q=pdf&site=col1&num=10 Michael Peter Christen 2012-08-14 12:40:26 +02:00
  • aab0b680c3 - added xslt support for solr result formats. try e.g. http://localhost:8090/solr/select?q=*:*&start=0&rows=10&wt=xslt&tr=json.xsl - added servlet-side mime-type configuration for streamed servlets. this is used for the result formatters in solr result formats Michael Peter Christen 2012-08-14 11:12:50 +02:00
  • e74d66e28c augmented browsing: remove htmlparser library cominch 2012-08-14 10:09:46 +02:00
  • e2119f4e76 augmented browsing: replace htmlparser by jsoup, which is more stable and reliable cominch 2012-08-14 10:06:12 +02:00
  • ad62609ec7 added a possibility to define a custom network definition URL for remote management cominch 2012-08-13 16:57:53 +02:00
  • fb0f430685 Merge remote-tracking branch 'original yacy/master' cominch 2012-08-13 16:48:14 +02:00
  • 9448d9a8a2 oops Michael Peter Christen 2012-08-13 14:01:45 +02:00
  • e5ef840f40 - renamed DoubleSolrConnector to MirrorSolrConnector and added a hit/miss/document cache to the MirrorSolrConnector. - more abstraction to SolrDocument in Connector interface - bugfixes in Solr field reader Michael Peter Christen 2012-08-13 13:32:32 +02:00
  • 94a334f128 another fix to the Solr metadata reading process and to the shutdown process Michael Peter Christen 2012-08-13 11:13:53 +02:00
  • b51df6c7e8 - added coordinate storage in solr schema - fixed shutdown process - fixed some solr-to-metadata reading - added a large number of metadata attributes in ViewFile.html Michael Peter Christen 2012-08-13 10:40:04 +02:00
  • da851c6071 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git Michael Peter Christen 2012-08-11 01:21:18 +02:00
  • bd4f03bc85 removed unused class Michael Peter Christen 2012-08-11 01:05:40 +02:00