Commit Graph

  • 8c4ab9c76b Added an option to limit the size of remote solr documents put into the local index. See mantis #626. luc 2015-12-16 02:20:03 +01:00
  • 8827b86b2a Added useful debug JVM parameters as comment. luc 2015-12-16 02:09:33 +01:00
  • a2c08402af Merge branch 'master' of https://github.com/yacy/yacy_search_server luc 2015-12-15 23:30:30 +01:00
  • 288acceac3 fix test htmlParserTest, charset parameter + upd maven templating-plugin version reger 2015-12-15 02:09:43 +01:00
  • 55a4d15775 Added a note on deprecated default search field and operator. luc 2015-12-14 23:55:12 +01:00
  • 70595d05d0 Modified MemoryControl.main() test to properly end for better results displaying. luc 2015-12-14 23:49:28 +01:00
  • 1be67d9ab6 CachedSolrConnector was replaced by ConcurrentUpdateSolrConnector years ago - time to let it go. Commented out unused table of cache-objects sixcooler 2015-12-14 21:33:27 +01:00
  • 28b8bc290a fix use of NETWORK_SEARCHVERIFY for rwi verification was not used to set the searchevent parameter (done in SearchEventCache.getEvent) - remove unused corresponding QueryParams.filterfailurls param. reger 2015-12-13 20:01:49 +01:00
  • 020630efd8 remove unused network scanner parameter from queryparameter Search event is not using networkscanner (removed filterscannerfail param always init to false) reger 2015-12-13 02:50:08 +01:00
  • 967508a87d fix in error handling Michael Peter Christen 2015-12-09 01:21:00 +01:00
  • 7cda48a9d6 add hint to "default max results per page" limit on ConfigPortal (limit is applied in yacysearch & max. total results by sum result-stack size) - remove obsolete search.navigation prop (has moved to ConfigSearchPage_p) reger 2015-12-09 00:49:38 +01:00
  • b2fac989fd Merge pull request #32 from luccioman/master Michael Peter Christen 2015-12-08 12:19:05 +01:00
  • ad5586f8f6 Merge branch 'master' of https://github.com/yacy/yacy_search_server luc 2015-12-08 03:35:36 +01:00
  • 8ebefa4233 Fixed MediaWiki import : DCEntry conversion to SolrInputDocument was failing. Looks like it was broken since Commit b43811d38c luc 2015-12-08 03:34:03 +01:00
  • 7736ee5a42 Updated MediaWimporter main() : display usage in console and stop properly without calling System.exit luc 2015-12-08 03:30:51 +01:00
  • cdb8f3b10d make current ranking score value avail. to search interface / api Update the result score result field with the result queue ranking value to reflect the actual calculated/used score, for rwi & solr stack results. (calc. etc. is unchanged, it's just that result entry carries the latest val as api retrieves the number from it) reger 2015-12-08 03:17:32 +01:00
  • a622c9b656 upd to Bootstrap v3.3.6 reger 2015-12-07 23:03:59 +01:00
  • 92df100596 Merge branch 'master' of https://github.com/yacy/yacy_search_server luc 2015-12-07 21:59:05 +01:00
  • 27d11f8671 Fixed isSolrDump function : PushBackInputStream was not unread when returning false (for example with a WikiMedia dump). luc 2015-12-07 21:58:36 +01:00
  • 1043fe55a3 add missing bootstrap 3.3.4 glyphicons file see comment @luc http://mantis.tokeek.de/view.php?id=623#c1151 reger 2015-12-07 01:30:00 +01:00
  • 135a123a77 less logging in new language detection Michael Peter Christen 2015-12-03 00:39:15 +01:00
  • ef8cd80593 fix for npe Michael Peter Christen 2015-12-03 00:33:13 +01:00
  • 0347bfa71f Apply collection query constraint/modifier to rwi result stack. Collection is not available in pure rwi entries (but is in local solr metadata). But if the user wishes to filter by query constraint, rwi shall also adhere to this (even if only rwi entries with parsed or solr-received metadata may fit) reger 2015-12-02 22:57:59 +01:00
  • dac8332476 Merge pull request #29 from luccioman/master Michael Peter Christen 2015-12-01 14:39:59 +01:00
  • 29585e2c5b Corrected return type when licence is gone to be consistent with other error cases. luc 2015-12-01 09:55:47 +01:00
  • df77e90ed7 Merge branch 'master' of https://github.com/yacy/yacy_search_server luc 2015-12-01 09:13:16 +01:00
  • 2a67d2ba6f Corrected error management for unsupported image formats, parsing errors, and unavailable resources : avoid logging too many exceptions, as these errors easily occur when searching images. luc 2015-12-01 01:06:01 +01:00
  • 997f18f658 prevent exception on repeated ViewImage with same urlLicense reger 2015-12-01 00:06:50 +01:00
  • 3af492538d fixed typos Sergey Stepanov 2015-11-30 22:13:34 +03:00
  • 9cfa847c94 upd maven pom (add langdetect) reger 2015-11-30 18:57:16 +01:00
  • 5309b99e98 Merge branch 'master' of git@github.com:yacy/yacy_search_server.git Michael Peter Christen 2015-11-30 17:01:39 +01:00
  • d6e9834040 Merge branch 'master' of https://github.com/Scarfmonster/yacy_search_server Michael Peter Christen 2015-11-30 16:54:54 +01:00
  • c030b9bc5c Merge pull request #27 from Stepanov-Sergey/master Michael Peter Christen 2015-11-30 13:45:25 +01:00
  • cff152f26e Merge branch 'master' of git@github.com:yacy/yacy_search_server.git Michael Peter Christen 2015-11-30 13:44:36 +01:00
  • 7e785dac8e urlproxyheader must be in the default package because all classes in the htroot path must be in the default package Michael Peter Christen 2015-11-30 13:35:41 +01:00
  • d82d311995 Merge branch 'master' of https://github.com/luccioman/yacy_search_server Michael Peter Christen 2015-11-30 13:34:10 +01:00
  • d3ab43e743 fixed classpath Michael Peter Christen 2015-11-30 13:19:49 +01:00
  • 06994b5853 Merge pull request #23 from linkerlin/patch-1 Michael Peter Christen 2015-11-30 13:18:03 +01:00
  • de0f3c6ff1 added Russian synonyms Sergey Stepanov 2015-11-30 11:37:47 +03:00
  • b5371ea8c1 read/init crawl queue in a thread to speed-up YaCy start on large existing crawler queues reger 2015-11-29 05:19:39 +01:00
  • f05b34fc35 upd to slf4j-1.7.13 reger 2015-11-29 01:24:46 +01:00
  • 1160b13172 remove unused md5 from ViewFile servlet params reger 2015-11-28 23:09:15 +01:00
  • e163ea88f6 fix vsdParser (Visio) parser return statement (final block un-necessary throw) reger 2015-11-28 02:43:38 +01:00
  • b2c8bc0ae6 remove md5_s from default index fields: it is not assigned a value / not used. Due to the above, also excluded from transfer protocol. reger 2015-11-27 02:41:02 +01:00
  • e40ae0943b - No max dimensions specified : render raw image data when source and target image format are the same. - Corrected scaling condition. luc 2015-11-26 09:30:43 +01:00
  • 4c36b7bd14 Merge branch 'master' of https://github.com/yacy/yacy_search_server luc 2015-11-26 09:28:34 +01:00
  • 90686a75a2 fix flux factor (additional crawl delay by access count) calculation reger 2015-11-25 01:34:41 +01:00
  • d79fa7fbeb upd to Jetty v9.2.14.v20151106 reger 2015-11-24 21:35:58 +01:00
  • 4af27289e5 Merge branch 'master' of https://github.com/yacy/yacy_search_server luc 2015-11-23 09:01:25 +01:00
  • 297fdb60d3 throw exception if crawler hostqueue can't create hostpath directory. In rare cases hostname may not be a valid filesystem directory name, which can't be created (e.g. containing '*' char). Prevents the crawl queue from looping on this invalid entry by throwing a MalformedURLException. reger 2015-11-22 21:26:18 +01:00
  • 755efac17d Use same max file size when loading all resource bytes or opening stream content luc 2015-11-20 19:35:39 +01:00
  • 5eafce5577 Rendering performance improvement : use EncodedImage constructor with BufferedImage parameter to avoid re-rendering BufferedImage. luc 2015-11-20 15:02:58 +01:00
  • bc6c79fc12 Corrected scaling function for non RGB images. luc 2015-11-20 14:35:36 +01:00
  • 042b0e9658 Corrected IcedTea version. See http://mantis.tokeek.de/view.php?id=615 luc 2015-11-20 10:15:54 +01:00
  • 1565559df8 Refactoring : extracted write InputStream method. luc 2015-11-20 09:42:24 +01:00
  • f0478bb14d BMP and ICO image formats support : integrated /haraldk/TwelveMonkeys imageio-bmp-3.2 library. luc 2015-11-20 09:38:16 +01:00
  • b6ba941d33 Eclipse project configuration : added JavaScript nature and validation luc 2015-11-20 09:32:30 +01:00
  • 7f27683831 Fixed compilation error. luc 2015-11-20 09:29:02 +01:00
  • 07437986e7 Merge branch 'master' of https://github.com/yacy/yacy_search_server luc 2015-11-20 08:15:24 +01:00
  • 97cc03ef6a start using a template for urlproxy header. It is included as iframe /proxmsg/urlproxyheader.html to allow full servlet functionality and flexibility to display some index/meta data in future. reger 2015-11-20 01:49:56 +01:00
  • d08e421809 fix link to logo (yacysearch.xsl) reger 2015-11-19 21:08:00 +01:00
  • f01d49c37a Process large or local file images dealing directly with content InputStream. luc 2015-11-18 10:15:06 +01:00
  • 3c4c77099d If available, check content length before downloading. Check also content length is not over Integer.MAX_VALUE. luc 2015-11-18 10:11:38 +01:00
  • 5bbb2e1730 Ensure resource is closed when reading a full file InputStream luc 2015-11-18 10:08:06 +01:00
  • 6291a57300 Merge branch 'master' of https://github.com/yacy/yacy_search_server luc 2015-11-18 08:49:31 +01:00
  • 0d3c5b223e have psParser cleanup temp file reger 2015-11-17 23:45:29 +01:00
  • 7d0d19cb8e avoid File.deleteOnExit() on temp files. The JVM registers each file in a list, regardless of whether it has already been deleted, and never cleans up the list during runtime. This accumulates to a considerable amount of memory during large crawls and/or long uptime. To tackle this, all temp files are now created in a subdir of java.io.tmpdir and the jvm tmpdir property is set to this subdir, which is deleted by code on shutdown. Additionally let pdfParser use this tmp subdir too. reger 2015-11-17 22:27:07 +01:00
  • bfe51001e3 Merge branch 'master' of https://github.com/yacy/yacy_search_server luc 2015-11-17 08:30:32 +01:00
  • 02e4489a23 set tmpfile.deleteOnExit by default, to make sure files are removed on shutdown. reger 2015-11-16 21:37:45 +01:00
  • 2985baaa01 Exclude repetitive protocol part in tokenized url used as description if none is avail. from parser. reger 2015-11-16 01:06:20 +01:00
  • ca3d26a401 harmonize wordsintitle & CollectionSchema.title_words_val calculation, remove obsolete partial init of wordreference from urimetadata reger 2015-11-15 06:06:37 +01:00
  • 7bf03856d1 add link to quick select blacklist from title list reger 2015-11-15 00:39:38 +01:00
  • 440ce6d198 add German translation to re-crawl job reger 2015-11-15 00:34:22 +01:00
  • 5362a80f1c upd to httpcore 4.4.4 reger 2015-11-14 21:16:31 +01:00
  • e90593450c upd to TwelveMonkeys ImageIO 3.2 reger 2015-11-14 01:46:25 +01:00
  • b4dbff6a6a fix yacysearch.json "totalResults" element: "totalResults" is included twice (at begin & end), and only the element written after performing the search holds a number > 0. See http://mantis.tokeek.de/view.php?id=608 reger 2015-11-13 20:10:47 +01:00
  • 52a9040ae6 Sort out duplicate keywords (dc_subject) early in parsed documents - by directly using Set instead of List - remove unneeded String[] getter reger 2015-11-13 01:48:28 +01:00
  • 49331dc523 Merge branch 'master' of https://github.com/yacy/yacy_search_server luc 2015-11-12 08:21:56 +01:00
  • 0de6988604 Added links to more image test suites. luc 2015-11-12 08:21:37 +01:00
  • 47d70732f6 improve locale translator - skip empty lines - more robust file section detection (space independent) reger 2015-11-11 00:57:51 +01:00
  • 646afe9183 do not store subfield *_coordinate + make all num-fields being docvalues sixcooler 2015-11-10 20:45:33 +01:00
  • 194df613de not using 'location' as defaultfacetfield - since we removed it being default. sixcooler 2015-11-10 20:43:58 +01:00
  • d3b9349b6f simplification / speedup of GenerationMemoryStrategy sixcooler 2015-11-10 20:39:46 +01:00
  • f5a9948860 do not store subfield *_coordinate sixcooler 2015-11-10 20:32:42 +01:00
  • fca353e5eb set startuptype of most solr handlers to lazy sixcooler 2015-11-10 20:32:05 +01:00
  • 4a905ec134 fix to not let the AccessTracker-Log grow too much, but have enough data to monitor. (+gitignore-correction) sixcooler 2015-11-10 20:27:17 +01:00
  • 209f502f09 Merge branch 'master' of https://github.com/yacy/yacy_search_server sixcooler 2015-11-10 20:23:03 +01:00
  • 20e18d79f8 harmonize document title for archive parsers reger 2015-11-10 01:29:13 +01:00
  • d481653202 Merge branch 'master' of https://github.com/yacy/yacy_search_server sixcooler 2015-11-09 20:42:44 +01:00
  • 658d9e74d2 Create .travis.yml Linker Lin 2015-11-09 15:18:32 +08:00
  • f11b5e8309 Merge branch 'master' of https://github.com/yacy/yacy_search_server luc 2015-11-09 08:13:12 +01:00
  • 112ae013f4 update bzip and bzip parser process, to return one document for the file with combined parser results of the containing file and registers it with supplied url and mime of the archive. reger 2015-11-07 19:13:18 +01:00
  • e76a90837b update zip and tar parser process, to return one document for the file with combined parser results of the containing files. reger 2015-11-06 23:58:55 +01:00
  • bc610e5382 Merge branch 'master' of https://github.com/yacy/yacy_search_server sixcooler 2015-11-06 23:28:39 +01:00
  • 0e8b3d9a90 Refactoring : default favicon and image processing errors. - moved default favicon processing from ViewImage to yacysearchitem.html : when previewing ico image search results we don't want a default favicon to be displayed - throw an IOException ending in an HTTP 500 error when image processing fails, rather than returning a null result : behavior is more consistent across browsers (for example Chrome and Firefox), especially with new default favicon display system luc 2015-11-05 09:45:19 +01:00
  • 4e673ffc9a Ensure closing of InputStream even when an exception occurs. luc 2015-11-05 09:40:24 +01:00
  • 10696b53f7 Merge branch 'master' of https://github.com/yacy/yacy_search_server luc 2015-11-05 08:26:52 +01:00
  • 8532565c7d optimize order of parsers to try - start with a parser matching the remote supplied mime reger 2015-11-04 21:52:02 +01:00
  • 681889ae64 use current tar library for untar files - remove old source copy reger 2015-11-04 02:57:00 +01:00
  • 5d71fc70e3 fix tarParser early exit on looping content - adjust check of data available according to doc - return null on no recognized content (to not exit TextParser next parser try) - use commons.compress directly reger 2015-11-03 22:14:14 +01:00